Irrational PBRL preference benchmark

Building Kanzi for Android, issue (2): Gradle sync failed: NDK not configured. Download it with SDK manager. Preferred NDK version is '21.1.6352462'.

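Cause: the local network connection was poor while the NDK package was downloading, and the NDK version already installed locally does not match the one set in the Android project. Fix: change the NDK version from 21.3.6528147 to the 21.1.6352462 named in the error message ......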

configured NDK Preferred Download android · Updated 2024-01-11

gsm8k benchmark

using gsm8k-rft-llama7b-u13b_evaluation env: lm_evaluation using GSM8K-eval ......

benchmark gsm8k gsm8 gsm 8k · Updated 2024-01-11

humaneval benchmark

Use code-eval. Commands:
git clone https://github.com/abacaj/code-eval.git
cd code-eval
conda create -n human_eval python=3.10
conda activate human_eval
......

humaneval benchmark · Updated 2024-01-09

TDSQL (PostgreSQL edition) benchmark performance testing

1. Prepare the software packages. jdk: https://pan.baidu.com/s/1sbgLPROfd9e_valSfv0YAQ (extraction code: 4qps). benchmark: https://pan.baidu.com/s/1nAHER-BXpgG0LUnR8NbT7Q (extraction code: xcbu). 2. Installation. (1) jdk ......

PostgreSQL benchmark performance edition TDSQL · Updated 2024-01-05

Building benchmark with the OHOS SDK

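Set up the development environment by following the installation instructions for the OHOS IDE and SDK. Download the source code from GitHub by running: git clone --depth=1 https://github.com/google/benchmark.git Then enter the source directory and create a batch file named ohos_build.cmd with the following content: @ec ......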

benchmark OHOS SDK · Updated 2024-01-03

dataset format of benchmarks

Note: the datasets are classified into two types, generative (the answer is natural language, and its length and content are not in a fixed format) and sel ......

benchmarks dataset format of · Updated 2024-01-02

shared_preferences caching

Wrapper class: import 'dart:convert'; import 'package:shared_preferences/shared_preferences.dart'; class JSpUtil { JSpUtil._internal(); // private constructor, prevents external instantiation factory ......

shared_preferences cache preferences shared · Updated 2023-12-27

llama benchmarks

Introduction: Here we re-evaluate the llama2 benchmarks to verify its performance. Datasets: In this blog, we'll test the following datasets shown in the ima ......

benchmarks llama · Updated 2023-12-24

LandBench 1.0: a benchmark dataset and evaluation metrics for data-driven land surface variables prediction

Professor Li's paper on the baseline models for LandBench. Its descriptions of the variables and datasets are useful when writing papers. Title: “LandBench 1.0: a benchmark dataset and evaluation metrics for data-driven land surface variables ......

data-driven evaluation prediction LandBench benchmark · Updated 2023-12-22

The Server item is missing from the Eclipse preferences

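I recently reinstalled Eclipse, but when loading Tomcat I found that the Server item was missing under Window > Preferences. It turned out that the required plugin was not installed; after installing it, the Server item appeared. The steps are: 1. Select Help --> Install New Software 2. Click Add ......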

preferences eclipse server · Updated 2023-12-20

offline RL | Pessimistic Bootstrapping (PBRL): penalize uncertainty in the Q update to pull down OOD Q values

critic loss = ① TD-error on ID data + ② pseudo TD-error on OOD data; ① penalizes the uncertainty of the (s',a') reached by the transition, and ② penalizes the uncertainty of (s, a_ood). ......

Bootstrapping Pessimistic uncertainty offline value · Updated 2023-12-17
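
The two-term critic loss described in this entry can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes uncertainty is measured as the standard deviation across an ensemble of K critics, and the names pbrl_targets, pbrl_critic_loss, beta_in and beta_ood are hypothetical. In real training the targets would be treated as constants (no gradient flows through them).

```python
import numpy as np

def pbrl_targets(q_next, q_ood, rewards, gamma=0.99, beta_in=0.01, beta_ood=1.0):
    """Build pessimistic targets from an ensemble of K critics.

    q_next:  (K, B) ensemble Q-values at the next pair (s', a')
    q_ood:   (K, B) ensemble Q-values at (s, a_ood)
    rewards: (B,)   rewards of the in-distribution transitions
    """
    # Assumption: uncertainty = disagreement (std) across the ensemble.
    u_next = q_next.std(axis=0)
    u_ood = q_ood.std(axis=0)
    # (1) ID target: penalize the uncertainty of the next pair (s', a').
    id_target = rewards + gamma * (q_next - beta_in * u_next)
    # (2) OOD pseudo-target: penalize the uncertainty of (s, a_ood).
    ood_target = q_ood - beta_ood * u_ood
    return id_target, ood_target

def pbrl_critic_loss(q_pred, q_ood, id_target, ood_target):
    """Sum of the ID TD-error and the OOD pseudo TD-error (both mean squared)."""
    td = np.mean((q_pred - id_target) ** 2)
    pseudo_td = np.mean((q_ood - ood_target) ** 2)
    return td + pseudo_td
```

A larger beta_ood pulls the Q-values of out-of-distribution actions down harder, which is the pessimism the title refers to.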

RLHF · PbRL | Select near on-policy queries to speed up the convergence of policy learning

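Query-Policy Misalignment: queries chosen because they look informative may in fact be misaligned with the RL agent's interests, so they contribute almost nothing to policy learning and ultimately lead to poor feedback efficiency. ......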

policy on-policy learning speed query · Updated 2023-12-17

Object detection in optical remote sensing images: A survey and a new benchmark

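Object detection in optical remote sensing images: A survey and a new benchmark. Substantial effort has recently been devoted to proposing various methods for object detection in optical remote sensing images. However, at present, for object detection in optical remote sensing images ......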

detection benchmark optical sensing Object · Updated 2023-12-16

[Bash] Benchmark with hyperfine

https://github.com/sharkdp/hyperfine hyperfine --runs 5 "CMD_1" "CMD_2" This runs each command 5 times and compares CMD_1 with CMD_2, printing a nice result summary ......

Benchmark hyperfine Bash with · Updated 2023-12-13

RLHF · PBRL | B-Pref: generate diverse irrational preferences to build a PBRL benchmark

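Contribution: proposes a method for generating irrational (human-like) preferences, and uses these diverse preferences to evaluate the algorithmic design choices at each stage of PBRL (selecting informative queries, feedback schedule). ......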

irrational PBRL preference benchmark B-Pref · Updated 2023-11-30
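
One way to picture an "irrational" simulated teacher is the sketch below. It is an illustration under assumptions, not the benchmark's code: the function name simulated_teacher and its parameters (beta for rationality, gamma for myopia, eps for mistakes, skip_thresh and equal_thresh) are hypothetical names chosen here, and the order of the checks is also an assumption.

```python
import numpy as np

def simulated_teacher(r0, r1, rng, beta=1.0, gamma=1.0, eps=0.0,
                      skip_thresh=-np.inf, equal_thresh=0.0):
    """Label a segment pair from its per-step ground-truth rewards r0, r1.

    Returns 1 if segment 1 is preferred, 0 if segment 0 is preferred,
    0.5 if the pair is judged equally good, and None if the query is skipped.
    """
    r0, r1 = np.asarray(r0, dtype=float), np.asarray(r1, dtype=float)
    # Skip: neither segment is good enough to be worth comparing.
    if max(r0.sum(), r1.sum()) < skip_thresh:
        return None
    # Equal: the two returns are too close to tell apart.
    if abs(r0.sum() - r1.sum()) < equal_thresh:
        return 0.5
    # Myopia: with gamma < 1, later steps weigh more than earlier ones.
    weights = gamma ** np.arange(len(r0) - 1, -1, -1)
    s0, s1 = beta * (weights * r0).sum(), beta * (weights * r1).sum()
    # Stochastic choice: larger beta means a more rational teacher.
    p1 = 1.0 / (1.0 + np.exp(s0 - s1))
    label = int(rng.random() < p1)
    # Mistake: flip the label with probability eps.
    if rng.random() < eps:
        label = 1 - label
    return label
```

For example, simulated_teacher([0, 0, 0], [1, 1, 1], np.random.default_rng(0), beta=50.0) prefers segment 1, while setting eps=1.0 flips that answer.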

[Concrete Mathematics] Rational Delight: Chapter 2

Summation factors. In Chapter 1, for the recurrence \[T_0 = 0, \\ T_n = 2 T_{n-1} + 1 \ \ (n > 0) \] we added \(1\) to both sides and converted it to \(U_n\), which gave \(T_n = 2^n - 1\). There is another approach: divide both sides by \(2^n\), ......

rational mathematics Chapter 2 · Updated 2023-11-28
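
The snippet above is cut off, so the following is only a sketch of how dividing by \(2^n\) can proceed; the symbol \(S_n\) is introduced here for illustration. Dividing the recurrence by \(2^n\) and writing \(S_n = T_n / 2^n\) gives \[S_0 = 0, \\ S_n = S_{n-1} + 2^{-n} \ \ (n > 0), \] so \(S_n\) is a plain sum: \[S_n = \sum_{k=1}^{n} 2^{-k} = 1 - 2^{-n}. \] Multiplying back by \(2^n\) recovers \(T_n = 2^n S_n = 2^n - 1\), in agreement with the result from Chapter 1.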

[go][test] benchmark

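Contents: preface; go testing; basic test; fib.go; test_fib.go; memory usage; generate_test.go; test parameters; testing generate_test.go parameters; testing timeRest; null; sort_test.go; starting ⌛️ and stopping ⌛️ the test timer; testing; references; preface; related posts; blog home page; disclaimer ......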

benchmark test · Updated 2023-11-21

RLHF · PBRL | Some D4RL tasks turn out to be unsuitable as benchmarks for offline reward learning

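Finds that for many tasks (as long as expert trajectories are provided), setting the reward to 0 or to random numbers still yields a good policy, which shows that these tasks are unsuitable for evaluating how well reward learning performs. ......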

benchmark learning offline some reward · Updated 2023-11-13

Fixing the srsLTE ctest error where benchmark_radio_multi_rf fails

First run cd build and then ctest --rerun-failed --output-on-failure to rerun only the failing tests. LastTest.log and LastTestsFailed.log in build/Testing/Temporary state the cause of the error, as follows: Error: allocati ......

benchmark_radio_multi_rf benchmark error fix srsLTE · Updated 2023-11-13

RLHF · PBRL | SURF: use semi-supervised learning and data augmentation on labeled segment pairs

① Assign pseudo-labels to high-confidence predictions on (σ0, σ1); ② temporally crop labeled segment pairs to obtain more augmented labeled pairs. ......

labeled segment data RLHF PBRL · Updated 2023-11-11
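
The two steps in this entry can be sketched as below. This is a rough illustration with assumptions stated up front: the names pseudo_label and temporal_crop, the threshold tau, and the choice to crop the two segments at independent random offsets are all hypothetical choices made for the example.

```python
import numpy as np

def pseudo_label(p1, tau=0.99):
    """Step 1: keep only confident predictions on unlabeled pairs.

    p1: predicted probability that segment 1 is preferred, shape (N,).
    Returns (mask, labels): mask marks pairs whose confidence exceeds tau,
    labels holds the predicted winner (0 or 1) for every pair.
    """
    p1 = np.asarray(p1, dtype=float)
    confidence = np.maximum(p1, 1.0 - p1)
    mask = confidence > tau
    labels = (p1 > 0.5).astype(int)
    return mask, labels

def temporal_crop(seg0, seg1, crop_len, rng):
    """Step 2: crop a labeled pair to shorter sub-segments of equal length.

    seg0, seg1: arrays of shape (H, D). The preference label is reused as-is.
    Assumption: each segment gets its own random start offset.
    """
    horizon = seg0.shape[0]
    start0 = rng.integers(0, horizon - crop_len + 1)
    start1 = rng.integers(0, horizon - crop_len + 1)
    return seg0[start0:start0 + crop_len], seg1[start1:start1 + crop_len]
```

Only the pairs selected by mask would be added to the training set with their pseudo-labels; the rest stay unlabeled.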

RLHF · PBRL | RUNE: encourage the agent to explore (s,a) pairs about which the reward model is more uncertain

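The reward model's uncertainty about a given (s,a) is measured by the variance of the outputs of an ensemble of reward models; this measure is multiplied directly by a hyperparameter and used as part of the intrinsic reward. ......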

reward agent model RLHF PBRL · Updated 2023-11-10
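
The uncertainty bonus described here can be sketched in a few lines. This is a minimal illustration, assuming the ensemble mean serves as the extrinsic reward and the ensemble standard deviation as the uncertainty measure; the function name rune_reward and the weight beta are hypothetical.

```python
import numpy as np

def rune_reward(ensemble_preds, beta=0.05):
    """Combine an ensemble of reward predictions into one training reward.

    ensemble_preds: (K, B) predictions of K reward models for B (s, a) pairs.
    """
    preds = np.asarray(ensemble_preds, dtype=float)
    extrinsic = preds.mean(axis=0)  # what the learned reward models agree on
    intrinsic = preds.std(axis=0)   # disagreement = uncertainty bonus
    return extrinsic + beta * intrinsic
```

Pairs on which the reward models disagree receive a larger bonus, which steers the agent toward them.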

RLHF · PBRL | PEBBLE: learn a reward model from human preferences

① Pre-train the agent with an entropy-based intrinsic reward; ② select queries that are as informative as possible to collect preferences; ③ relabel the replay buffer with the updated reward model. ......

preference PEBBLE reward human model · Updated 2023-11-09
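
Learning a reward model from preferences is commonly done with a Bradley-Terry style cross-entropy loss over segment pairs. The sketch below shows that loss in NumPy; it assumes this standard formulation, and the function name preference_loss and the array layout are hypothetical.

```python
import numpy as np

def preference_loss(rhat0, rhat1, labels):
    """Cross-entropy between predicted and given preferences.

    rhat0, rhat1: (B, H) predicted per-step rewards of the two segments.
    labels:       (B,)   1 if segment 1 was preferred, 0 if segment 0 was.
    """
    labels = np.asarray(labels, dtype=float)
    s0 = np.asarray(rhat0, dtype=float).sum(axis=1)
    s1 = np.asarray(rhat1, dtype=float).sum(axis=1)
    # log P[seg1 preferred] = s1 - log(exp(s0) + exp(s1)), computed stably.
    log_z = np.logaddexp(s0, s1)
    log_p1 = s1 - log_z
    log_p0 = s0 - log_z
    return -np.mean(labels * log_p1 + (1.0 - labels) * log_p0)
```

When both segments have the same predicted return, the model assigns probability 0.5 to each and the loss equals log 2.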

Go benchmark trilogy, part 3: advanced

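Welcome to my GitHub, which categorizes and collects all of Xinchen's original work (with accompanying source code): https://github.com/zq2599/blog_demos Overview: the Go benchmark trilogy is nearing its end. After the hands-on practice of the basics and memory parts, you should now be comfortable with routine benchmarking operations and various ......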

trilogy benchmarking benchmark language · Updated 2023-11-03

Steven Pinker, Rationality (1)

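I recently started reading Steven Pinker's Rationality. The book argues that becoming a "rational person" takes training, and that a lack of rationality makes people prone to wrong judgments and decisions in everyday life. It offers many cases showing that "judging by gut feeling alone is unreliable". These cases are not mere word games; many are examples close to everyday life. In this book the author provides 7 sets of rationality-improving ......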

rationality 183 · Updated 2023-11-02

Go benchmark trilogy, part 2: memory

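Welcome to my GitHub, which categorizes and collects all of Xinchen's original work (with accompanying source code): https://github.com/zq2599/blog_demos Overview: this is the second part of the Go benchmark trilogy. The goal is to learn how to use benchmarks to observe the memory allocations of the method under test. Today, besides the routine operations, namely specifying ......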

trilogy benchmarking benchmark memory language · Updated 2023-11-02

Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]

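November 1, 2023. Only two months remain in 2023, and I hope to make some gains and progress before the year ends. Let's go, old salted fish. Abstract: the paper addresses how to access and use diverse urban spatial-temporal datasets that come from different sources and are stored in different formats, and how to determine effective model structures and components. 1. It designs a unified storage format for urban spatial-temporal big data, the "atomic file", and validates its effectiveness on 40 diverse datasets, simplifying ......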

data management Spatial-Temporal benchmark Comprehensive Performance · Updated 2023-11-01
