Evaluation

LLM Evaluation, Microsoft Research Asia: Paper Share of "A Survey on Evaluation of Large Language Models"

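"A Survey on Evaluation of Large Language Models". 1. About the paper: Microsoft Research Asia has published "A Survey on Evaluation of Large Language Models", a paper introducing the field of LLM evaluation. The paper surveys 219 works in total, and by evaluation content ......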

Evaluation, Research Institute, Language, Model, Survey · Updated 2024-01-02

LandBench 1.0: a benchmark dataset and evaluation metrics for data-driven land surface variables prediction

Teacher Li's paper on the LandBench benchmark models. Its descriptions of the variables and datasets are useful when writing a paper. Title: "LandBench 1.0: a benchmark dataset and evaluation metrics for data-driven land surface variables ......

data-driven, evaluation, prediction, LandBench, benchmark · Updated 2023-12-22

large language model evaluation

1 Evaluate medical model fine-tuned by llama 1.1 evaluation dataset here how to organize the dataset ......

evaluation, language, large, model · Updated 2023-12-19

JMeter BeanShell common problem: "BeanShellInterpreter: Error invoking bsh method: eval In file: inline evaluation of....

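When using BeanShell files in JMeter, you often run into this problem: BeanShellInterpreter: Error invoking bsh method: eval In file: inline evaluation of.... Possible causes: 1. The JAR has not been placed in the expected location. Fix: put it in lib/ex ......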

BeanShellInterpreter, evaluation, beanshell, invoking, FAQ · Updated 2023-11-21

"Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]"

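November 1, 2023. Only two months until 2023 is over; I hope to learn something and make progress before it ends. Go for it, you old slacker. Abstract: the paper addresses the difficulty of accessing and using diverse urban spatial-temporal datasets that come from different sources and are stored in different formats, as well as the problem of identifying effective model structures and components. 1. It designs "atomic files", a unified storage format for urban spatial-temporal big data, validates it on 40 diverse datasets, and simplifies ......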

Data Management, Spatial-Temporal, Benchmark, Comprehensive, Performance · Updated 2023-11-01

Black-Box Attack-Based Security Evaluation Framework for Credit Card Fraud Detection Models

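Black-Box Attack-Based Security Evaluation Framework for Credit Card Fraud Detection Models. Motivation: AI models are vulnerable to adversarial attacks (adversarial examples generated by adding carefully crafted perturbations to samples). Existing adversarial attacks can be divided into white-box attacks and black-box attacks ......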

Attack-Based, Evaluation, Black-Box, Detection, Framework · Updated 2023-09-23

Error after installing the unlimited reset plugin: "Your evaluation license has expired …." (your evaluation license has expired, IntelliJ IDEA will exit)

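Error after installing the unlimited reset plugin: "Your evaluation license has expired …." (your evaluation license has expired, IntelliJ IDEA will exit). Recently quite a few readers have reported that, even with the IDE Eval Reset plugin installed, they still get an error during use, with a pop-up saying: "Your evaluation l ......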

evaluation, plugin, IntelliJ, license · Updated 2023-09-18

MCUs: Microchip PIC16F17146 Curiosity NANO Evaluation Kit Review Report

Having compared RISC (Proprietary) with RISC-V (Open Source), here is a hands-on test of Microchip's PIC16F17146 Curiosity Nano (Revision 4 has PIC16F17146 rev B2) Evaluation Kit: this board is a multilayer PCB ......

Review Report, Evaluation, Curiosity, Microchip, Report · Updated 2023-07-22

Windows 11 Enterprise (Evaluation) Download Link

- [Download link: https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/](https://developer.microsoft.com/en-us/windows/downloads/virtual-ma ......

Enterprise, Evaluation, Windows, Address, 11 · Updated 2023-06-27

Automatic quality of generated text Evaluation for Large Language Models: research on automated evaluation of LLM-generated output

Automatic quality of generated text Evaluation for Large Language Models: research on automated evaluation of LLM-generated output ......

Evaluation, Automatic, generated, Language, Model · Updated 2023-06-23

Exploring the Use of Humanized Mouse Models in Drug Safety Evaluation

However, there are differences between animals and humans, safety studies cannot be conducted on animal models alone, and normal animals do not respon... ......

Evaluation, Exploring, Humanized, Models, Safety · Updated 2023-05-08

A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms

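Introduction: Multi-view stereo reconstruction is a very important research direction in computer vision, with applications in 3D modeling, virtual reality, robot navigation, and many other fields. However, the field still faces many problems and challenges, such as limited accuracy and insufficient completeness. The authors therefore aim to compare and evaluate the current mainstream algorithms in this paper, providing a reference for further development of the field. To evaluate the various algorithms more accurately ......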

Reconstruction, Comparison, Algorithms, Evaluation, Multi-View · Updated 2023-04-22

[Cohort 5, Li Weiping] CCF-B (PR'12) Feature evaluation and selection with cooperative game theory

Xin, S., et al. "Feature evaluation and selection with cooperative game theory." Pattern Recognition 45.8 (2012): 2992-3002. Uses cooperative games to find the optimal feature subset, focusing on addressing traditional information-theory-based ......

cooperative, evaluation, selection, Feature, theory · Updated 2023-03-31

ISYS3401 IT Evaluation

ISYS3401 IT Evaluation (2023) Individual Assignment 1 (30%) This IT Evaluation assignment follows a 3-phase assessment plan introduced in lectu ......

Evaluation, ISYS, 3401, IT · Updated 2023-03-30

Novelty and diversity in information retrieval evaluation

Clarke C. L. A., Kolla M., Cormack G. V., Vechtomova O., Ashkan A., Büttcher S. and MacKinnon I. Novelty and diversity in information retrieval ......

information, evaluation, diversity, retrieval, Novelty · Updated 2023-03-22

Paper Share | Holistic Evaluation of Language Models

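Summary: This is a survey paper on LLM evaluation. This post is shared from the Huawei Cloud community article "[Paper Share] Holistic Evaluation of Language Models", by DevAI. Large language models (LLMs) have become the foundation of most language-related technologies, yet their capabilities, limitations, and risks are still not fully understood. This paper, for LLM evaluation ......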

Evaluation, Holistic, Language, Models, Paper · Updated 2023-03-22
