Classic Papers Related to Large Language Models

Published: 2023-12-24 17:56:36  Author: l_v_y_forever

Classic papers related to large language models:

 

    • 2014 Neural Machine Translation by Jointly Learning to Align and Translate - This paper adds an attention mechanism to RNNs to improve their long-sequence modelling, which lets RNN-based models translate longer sentences more accurately.
    • 2017 Attention Is All You Need - This paper introduces the original Transformer architecture and is the basis for the Transformer family of models.
    • 2018 BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - This paper ushered in the era of pre-training in NLP. BERT seemed to come out of nowhere.
    • 2018 Improving Language Understanding by Generative Pre-Training - This paper introduces another popular pre-trained model, later known as GPT-1.
    • 2019 Language Models are Unsupervised Multitask Learners - This paper introduces GPT-2.
    • 2020 Language Models are Few-Shot Learners - This paper introduces GPT-3.
    • 2022 Training language models to follow instructions with human feedback - This paper presents InstructGPT, which fine-tunes a language model with supervised learning and then with reinforcement learning from human feedback (RLHF). It is also regarded as the paper that explains the core idea behind ChatGPT. Presumably, ChatGPT is an extended version of InstructGPT fine-tuned on larger datasets.
    • 2023 GPT-4 Technical Report - "We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs."
    • awesome-chatgpt-papers https://www.aliyundrive.com/s/RenfDZjta8T (extraction code: 5y6m)

Source: https://github.com/OpenMindClub/awesome-chatgpt#the-technical-principle-of-chatgpt