Paper notes: Language Models as Knowledge Bases?

Abstract

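Language models pretrained on large text corpora improve performance on downstream NLP tasks. While learning linguistic knowledge, they may also store relational knowledge present in the training data, and may be able to answer queries phrased as "fill-in-the-blank" cloze statements.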

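Compared with structured knowledge bases, language models require no schema engineering, allow practitioners to query an open class of relations, are easy to extend to more data, and require no human supervision to train.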

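Findings from an analysis of the relational knowledge in state-of-the-art pretrained language models:

1. Without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.

2. BERT also does remarkably well on open-domain question answering against a supervised baseline.

3. Certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches.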

1 Introduction

Pretrained language models capture linguistic knowledge that, once fine-tuned, is valuable for a wide range of tasks.

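Knowledge bases are an effective solution for accessing annotated gold-standard relational data by supporting queries. Populating them, however, requires complex NLP pipelines involving entity extraction, coreference resolution, entity linking and relation extraction. (These components often need supervised data and fixed schemas.)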

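The alternative: query neural language models for relational data. This requires no schema engineering and no human annotation, and it supports an open set of queries.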

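Questions investigated:

How much relational knowledge is already present in off-the-shelf pretrained language models such as ELMo and BERT? How does this differ across types of knowledge, such as facts about entities, common sense, and general question answering?

Without fine-tuning, how does their performance compare with symbolic knowledge bases automatically extracted from text?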

Proposal:

LAMA: consists of a set of knowledge sources, each of which is made up of a set of facts.

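We define a pretrained language model as knowing a fact (subject, relation, object), such as (Dante, born-in, Florence), if it can successfully predict the masked object in a cloze sentence that expresses the fact, such as "Dante was born in ___". We test several types of knowledge: relations between entities stored in Wikidata, commonsense relations between concepts from ConceptNet, and the knowledge needed to answer natural language questions in SQuAD. In the last case, we manually map a subset of SQuAD questions to cloze sentences.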
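As a rough illustration of the definition above, the sketch below turns a fact triple into a cloze query by filling a relation template. The `[X]`/`[Y]` placeholder convention, the `[MASK]` string, and the names `Fact` and `make_cloze` are assumptions made for this example, not something stated in these notes.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """A (subject, relation, object) triple; names are illustrative."""
    subject: str
    relation: str
    obj: str


def make_cloze(fact: Fact, template: str, mask_token: str = "[MASK]") -> str:
    """Fill the subject slot [X] and replace the object slot [Y] with a mask."""
    return template.replace("[X]", fact.subject).replace("[Y]", mask_token)


fact = Fact("Dante", "born-in", "Florence")
query = make_cloze(fact, "[X] was born in [Y].")
print(query)  # Dante was born in [MASK].
```

A model would then be said to know the fact if its prediction for the masked position is `fact.obj`.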

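Conclusions:

1. The largest BERT model (BERT-large) captures (accurate) relational knowledge comparable to that of a knowledge base extracted with an off-the-shelf relation extractor and an oracle-based entity linker.

2. Factual knowledge can be recovered surprisingly well from pretrained language models; however, for some relations (particularly N-to-M relations) performance is very poor.

3. BERT-large consistently outperforms other language models in recovering factual and commonsense knowledge, while being more robust to the phrasing of a query.

4. BERT-large achieves remarkable results for open-domain QA, reaching 57.1% precision@10, compared to 63.5% for a knowledge base constructed using a task-specific supervised relation extraction system.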

2 Background

2.1 Unidirectional Language Models

2.2 Bidirectional “Language Models”

3 Related Work

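Our investigation asks to what extent pretrained language models store factual and commonsense knowledge, comparing them with symbolic knowledge bases populated by traditional relation extraction approaches.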

4 The LAMA Probe

We introduce the LAMA (LAnguage Model Analysis) probe to test the factual and commonsense knowledge in language models:

It provides a set of knowledge sources which are composed of a corpus of facts. Facts are either subject-relation-object triples or question-answer pairs.

We evaluate each model based on how highly it ranks the ground truth token against every other word in a fixed candidate vocabulary.

Assumption: models which rank ground truth tokens high for these cloze statements have more factual knowledge.
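The ranking step can be sketched as follows, assuming the model's output for the masked position has already been reduced to a score per candidate word. The function name `rank_of`, the score values, and the choice to break ties in the gold token's favour are all assumptions for illustration.

```python
from typing import Dict


def rank_of(gold: str, scores: Dict[str, float]) -> int:
    """Return the 1-based rank of `gold` among all candidates (higher score = better).

    Ties are resolved optimistically: only strictly higher scores push the gold
    token down the ranking.
    """
    gold_score = scores[gold]
    return 1 + sum(1 for s in scores.values() if s > gold_score)


# Hypothetical scores over a tiny fixed candidate vocabulary
# for the query "Dante was born in [MASK]."
scores = {"Florence": 2.1, "Rome": 2.7, "Italy": 1.4, "Paris": 0.3}
print(rank_of("Florence", scores))  # 2
```

Under the stated assumption, a lower rank for the ground truth token is read as evidence of more factual knowledge.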

4.1 Knowledge Sources

We cover a variety of sources of factual and commonsense knowledge. For each source, we describe the origin of fact triples (or question-answer pairs), how we transform them into cloze templates, and to what extent aligned texts exist in Wikipedia that are known to express a particular fact. We use the latter information in supervised baselines that extract knowledge representations directly from the aligned text.

4.2 Models

4.3 Baselines

Freq: For a subject and relation pair......

RE: For the relation-based knowledge source......

DrQA: For open-domain question answering......

4.4 Metrics

We consider rank-based metrics and compute results per relation along with mean values across all relations. To account for multiple valid objects for a subject-relation pair (i.e., for N-M relations), we follow Bordes et al. (2013) and remove from the candidates when ranking at test time all other valid objects in the training data other than the one we test. We use the mean precision at k (P@k). For a given fact, this value is 1 if the object is ranked among the top k results, and 0 otherwise.
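A minimal sketch of P@k with the filtering step described above, assuming per-candidate scores are available as a dictionary. The function names, the example scores, and the example set of "other valid objects" are hypothetical and only serve to show the mechanics.

```python
from typing import Dict, List, Set, Tuple


def precision_at_k(gold: str, scores: Dict[str, float],
                   other_valid: Set[str], k: int) -> int:
    """Return 1 if `gold` is in the top k after dropping other valid objects, else 0."""
    # Remove every other valid object, but always keep the one being tested.
    candidates = {w: s for w, s in scores.items()
                  if w == gold or w not in other_valid}
    top_k = sorted(candidates, key=candidates.get, reverse=True)[:k]
    return int(gold in top_k)


def mean_precision_at_k(examples: List[Tuple[str, Dict[str, float], Set[str]]],
                        k: int) -> float:
    """Average P@k over (gold, scores, other_valid) examples."""
    hits = [precision_at_k(gold, s, other, k) for gold, s, other in examples]
    return sum(hits) / len(hits)


# Hypothetical scores for one cloze query with two valid objects.
scores = {"Italian": 3.0, "Latin": 2.5, "French": 2.0, "Rome": 1.0}
examples = [
    ("Latin", scores, {"Italian"}),  # "Italian" is also valid, so it is filtered out
    ("French", scores, set()),       # nothing to filter
]
print(mean_precision_at_k(examples, k=1))  # 0.5
print(mean_precision_at_k(examples, k=3))  # 1.0
```

Without the filter, the first example would score 0 at k=1 because another valid object outranks the one under test; filtering avoids penalising the model for that.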

4.5 Considerations

Manually Defined Templates

Single Token

Object Slots

Intersection of Vocabularies

5 Results

6 Discussion and Conclusion
