noise reinforcement exploration learning

Off-Policy Deep Reinforcement Learning without Exploration

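**Published:** 2019 (ICML 2019) **Key points:** This paper argues that in the offline RL setting, because of extrapolation error, standard off-policy algorithms such as DQN and DDPG, when the data distribution differs greatly from the distribution of the current policy, ......

Reinforcement Exploration Off-Policy Learning without · Updated 2023-05-21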

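Paper reading on feature interaction: "AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks"

Background: this paper uses a multi-head attention mechanism for feature interaction. Model structure: the AutoInt architecture is shown in the figure above. The model consists of three parts: an Embedding Layer, an Interacting Layer, and an Output Layer. Of these, the Embedding Layer and the Output Layer, compared with ordinary models, have no ......

Self-Attentive Interaction Attentive Automatic Learning · Updated 2023-05-19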

Jan 2023-Prioritizing Samples in Reinforcement Learning with Reducible Loss

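#1 Introduction This paper proposes sampling according to a sample's learnability rather than sampling uniformly at random from the experience replay buffer. A sample is considered learnable if the agent's loss on it can potentially be reduced. The amount by which a sample's loss can be reduced is called its reducible loss (ReLo). This differs from the vanilla prioritization of Schaul et al. [2016], which simply gives samples with high loss ......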
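A minimal sketch of sampling by reducible loss. The excerpt does not say how the reducible part of the loss is estimated. The sketch assumes it is approximated by the gap between a sample's loss under the online network and its loss under a target network, clipped at zero. The function names, the `eps` floor, and the loss values are all hypothetical.

```python
import random

def relo_priorities(online_losses, target_losses, eps=1e-3):
    """Priority = clipped (online loss - target loss) plus a small floor.

    Assumption: the target-network loss stands in for the part of the loss
    that cannot be reduced. eps keeps every sample reachable.
    """
    return [max(lo - lt, 0.0) + eps
            for lo, lt in zip(online_losses, target_losses)]

def sample_indices(priorities, k, rng):
    """Draw k buffer indices with probability proportional to priority."""
    return rng.choices(range(len(priorities)), weights=priorities, k=k)

# Made-up per-sample losses for a three-item replay buffer.
online_losses = [1.0, 0.5, 2.0]
target_losses = [0.2, 0.5, 2.5]

priorities = relo_priorities(online_losses, target_losses)
batch = sample_indices(priorities, k=4, rng=random.Random(0))
```

The third sample has the highest raw loss but gets only the floor priority, because its loss under the target network is even higher. Vanilla loss-based prioritization would favor it instead.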

Reinforcement Prioritizing Reducible Learning Samples · Updated 2023-05-17

[Image Data Augmentation] Image Data Augmentation for Deep Learning: A Survey

Augmentation Learning image-data Survey · Updated 2023-05-17

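Oracle Collections-Learning-1

Collections-Test1: bulk collect into loads rows into a collection in bulk; limit can be used to cap the number of rows loaded. type ... is table of DataType Index by binary_Integer, where index by binary_integer is not used when defining a schema-level type, ......

Learning Oracle · Updated 2023-05-17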

Short-Term Plasticity Neurons Learning to Learn and Forget

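Notice: see the title for the original paper; in case of infringement, please contact the author and this post will be withdrawn. Proceedings of the 39th International Conference on Machine Learning ......

Short-Term Plasticity Learning Neurons Forget · Updated 2023-05-16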

SAP UI5 Flexible Programming Model Explorer

According to the official SAP UI5 website, the SAPUI5 freestyle templates are deprecated, and it’s recommended to use the custom page SAP Fiori template based on the flexible ......

Programming Flexible Explorer Model SAP · Updated 2023-05-16

Paper reading notes: "Training Socially Engaging Robots Modeling Backchannel Behaviors with Batch Reinforcement Learning"

Training Socially Engaging Robots Modeling Backchannel Behaviors with Batch Reinforcement Learning. Published in TAC 2022. Hussain N, ......

Reinforcement Backchannel Behaviors Training Socially · Updated 2023-05-09

Robust Deep Reinforcement Learning through Adversarial Loss

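Notice: see the title for the original paper; in case of infringement, please contact the author and this post will be withdrawn. 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Abstract: Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs ......

Reinforcement Adversarial Learning through Robust · Updated 2023-05-09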

[Cohort 5, Zou Yufu] CCF-A (NeurIPS'19) Inverting gradients-how easy is it to break privacy in federated learning?

"Geiping J, Bauermeister H, Dröge H, et al. Inverting gradients-how easy is it to break privacy in federated learning?[J]. Advances in Neural Informat ......

gradients-how Inverting gradients federated learning · Updated 2023-05-08

Exploring the Use of Humanized Mouse Models in Drug Safety Evaluation

However, there are differences between animals and humans, safety studies cannot be conducted on animal models alone, and normal animals do not respon... ......

Evaluation Exploring Humanized Models Safety · Updated 2023-05-08

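How prompt learning computes the loss

In prompt learning, when a class has several candidate words, the loss function usually sums the logits of all of that class's words and compares the result with the true label. Take sentiment classification as an example: suppose the positive class has two candidate words, “positive” and “optimistic”, and the negative class has two candidate words, “negative” and “pessimistic”. The model then ......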
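The summation described above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the post's own code. The `VERBALIZER` mapping, the function names, and the logit values are hypothetical. The final step, softmax cross-entropy over the summed class scores, is one common way to "compare with the true label" and is not spelled out in the excerpt.

```python
import math

# Hypothetical verbalizer: each class maps to its candidate label words.
VERBALIZER = {
    "positive": ["positive", "optimistic"],
    "negative": ["negative", "pessimistic"],
}

def class_scores(word_logits, verbalizer):
    """Sum the logits of every candidate word belonging to each class."""
    return {label: sum(word_logits[w] for w in words)
            for label, words in verbalizer.items()}

def cross_entropy(scores, true_label):
    """Softmax over class scores, then negative log-likelihood of the label."""
    max_score = max(scores.values())  # subtract the max for numerical stability
    log_norm = max_score + math.log(
        sum(math.exp(s - max_score) for s in scores.values()))
    return log_norm - scores[true_label]

# Made-up logits at the masked position for a single input sentence.
word_logits = {"positive": 2.0, "optimistic": 1.0,
               "negative": 0.5, "pessimistic": -0.5}

scores = class_scores(word_logits, VERBALIZER)
loss = cross_entropy(scores, "positive")
```

With these made-up numbers the positive class scores 3.0 and the negative class 0.0, so the loss for a positive example is small. Swapping the true label to "negative" makes it large.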

learning loss prompt · Updated 2023-05-07

Paper walkthrough (ID-MixGCL): "ID-MixGCL: Identity Mixup for Graph Contrastive Learning"

Paper information. Title: ID-MixGCL: Identity Mixup for Graph Contrastive Learning. Authors: Gehang Zhang..... Source: arXiv 2023. Paper: download. Code: download. Video walkthrough: click. Introduction ......

ID-MixGCL MixGCL Contrastive Identity Learning · Updated 2023-05-07

20230507 TI Engineer It - How to test power supplies - Measuring Noise

Hi. I'm Bob Hanrahan, application engineering at Texas Instruments. This is a series on measuring performance of power supplies. We will be measuring no ......

Measuring 20230507 Engineer supplies Noise · Updated 2023-05-07

Heuristic-Guided Reinforcement Learning

**Published:** 2021 (NeurIPS 2021) **Key points:** This paper proposes the Heuristic-Guided Reinforcement Learning (HuRL) framework, which builds a heuristic from domain knowledge or offline data and turns the problem into a sho ......

Heuristic-Guided Reinforcement Heuristic Learning Guided · Updated 2023-05-06

Medicine River - Learning journals 9

Dear diary. 6 May 2020. Hey, Harlan, long time no see. How have you been lately? I've been quite busy lately. I hope you don't blame me for not coming ......

Medicine Learning journals River · Updated 2023-05-06

LLL (Life Long Learning) & Catastrophic Forgetting

LLL (Life Long Learning) & Catastrophic Forgetting https://www.youtube.com/watch?v=Y9Jay_vxOsM Life Long Learning: in typical machine learning, a single model solves only one task or a handful of tasks. For a new task, ......

catastrophic Catastrophic Forgetting catastrophe Learning · Updated 2023-05-06

Error:All flavors must now belong to a named flavor dimension. Learn more at

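{ https://blog.csdn.net/qq_15807167/article/details/79528063 } Since plugin 3.0.0 there is a mechanism that automatically matches the library to consume, so that, for example, a debug variant automatically consumes a library. As a consequence, every flavor must belong to the same dimension defaultC ......

dimension flavors belong flavor Error · Updated 2023-05-05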

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

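Notice: see the title for the original paper; in case of infringement, please contact the author and this post will be withdrawn. NeurIPS 2020 ......

Reinforcement Perturbations Observations Adversarial Learning · Updated 2023-05-05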

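Notes on Hung-yi Lee's meta learning

Learning how to learn really means learning the model itself, including the model's hyperparameters. Define a function whose input is a set of training tasks and whose output is a model. This is not fundamentally different from traditional machine learning, so it also breaks into three steps: define what to learn and the corresponding learning model (meta learning itself also has its own meta ......), define a loss function, and solve it with an optimization algorithm. But this L ......

learning notes meta · Updated 2023-05-05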

Learning A Single Network for Scale-Arbitrary Super-Resolution

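Learning A Single Network for Scale-Arbitrary Super-Resolution. Abstract: Existing single image SR networks are developed for images with specific integer scale factors (e.g., ×2/3/4) and cannot handle non-integer or asymmetric SR. In this paper, the authors propose, starting from specific ......

Super-Resolution Scale-Arbitrary Resolution Arbitrary Learning · Updated 2023-05-05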

Teachable Reinforcement Learning via Advice Distillation

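**Published:** 2021 (NeurIPS 2021) **Key points:** This paper proposes a supervision paradigm for learning a policy. The rough idea is to structure the advice first, then learn to interpret the advice, and then learn the policy from the advice. The advice comes from an external teacher, which amounts to a kind of human-in-the-l ......

Reinforcement Distillation Teachable Learning Advice · Updated 2023-05-02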

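How to generate a dump (dmp) file with process explorer

I dump directly with proc exp, because the default Task Manager cannot dump every process. Dumping with Task Manager: Task Manager is arguably the most readily available system tool, and it can also generate dump files. Note, however, that on a 64-bit operating system the 64-bit Task Manager is launched by default. Generating a dump file with Task Manager requires following one rule: ......

explorer process file dmp · Updated 2023-05-02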

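Paper reading-sparse gpu kernels for deep learning

Paper: https://ieeexplore.ieee.org/document/9355309 Source code: https://github.com/google-research/sputnik Background: deep neural networks consist of a large number of matrix multiplications and convolutions. The matrices used in these operations can be converted into sparse matrices without losing ......

learning kernels sparse paper deep · Updated 2023-05-01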

Deep Dynamics Models for Learning Dexterous Manipulation

**Published:** 2019 (CoRL 2019) **Key points:** The paper proposes an algorithm, online planning with deep dynamics models (PDDM), for learning dexterous multi-fingered hands, that is, learning human-like, dexterous finger manipulation skills. Roughly ......

Manipulation Dexterous Dynamics Learning Models · Updated 2023-04-30

2. Title: The Informed Design Teaching and Learning Matrix

Journal information (1) Author: Crismond, David P. (2) Journal: Journal of Engineering Education, 2012, 101(4): 738–797 (3) DOI: 10.1002/j.2168-9830.2012.tb01127.x (4) ISSN: 10694730 ......

Informed Teaching Learning title Design · Updated 2023-04-28

Paper reading notes: "Residual Physics Learning and System Identification for Sim to real Transfer of Policies on Buoyancy Assisted Legged Robots"

Residual Physics Learning and System Identification for Sim to real Transfer of Policies on Buoyancy Assisted Legged Robots. Published in 2023. The paper is recent, and no publication venue was found. Based on ......

Identification Residual Learning Buoyancy Assisted · Updated 2023-04-28
