
Heterogeneous Supervision for Relation Extraction:
A Representation Learning Approach

Liyuan Liu, Xiang Ren, Qi Zhu, Shi Zhi, Huan Gui, Heng Ji, Jiawei Han


Relation Extraction

Sentence-level Relation Extraction: classify a relation mention as one of a set of relation types of interest, or as Not-Target-Type (None).

Relation Mention
An entity pair together with the sentence (context) in which it appears.
E.g., ("Hussein", "Amman", Hussein was born in Amman on 14 November 1935.)
Relation Types of Interest
A set of relation types.
E.g., {Born-in, President-of, Died-in, Parents-of, ...}
Relation Extraction
Predict Born-in for ("Hussein", "Amman", Hussein was born in Amman on 14 November 1935.)

Distant Supervision

Automatically generate annotations from a knowledge base (KB).

return r for (e1, e2, s) if r(e1, e2) in KB

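For example, because Born-in(Obama, USA), President-of(Obama, USA), and Citizen-of(Obama, USA) all exist in the KB, the mention ("Obama", "USA", Obama was born in Honolulu, Hawaii, USA as he has always said) would be annotated as Born-in (correct) and President-of (wrong). The KB lookup ignores the sentence, so every relation recorded for the entity pair is attached to the mention regardless of what the sentence actually expresses.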
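The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the released code: the `RelationMention` type, the `distant_supervision` function, and the set-of-triples KB layout are all assumptions made for this example. It shows that the lookup never reads the sentence, which is where the wrong labels come from.

```python
from typing import NamedTuple


class RelationMention(NamedTuple):
    """Hypothetical container for a relation mention (e1, e2, s)."""
    e1: str
    e2: str
    sentence: str


# Toy KB holding the three facts from the example, stored as (relation, e1, e2).
KB = {
    ("Born-in", "Obama", "USA"),
    ("President-of", "Obama", "USA"),
    ("Citizen-of", "Obama", "USA"),
}


def distant_supervision(mention, kb):
    """Return every r such that r(e1, e2) is in the KB.

    The sentence is deliberately unused: distant supervision
    only looks at the entity pair.
    """
    return sorted(
        r for (r, e1, e2) in kb if (e1, e2) == (mention.e1, mention.e2)
    )


mention = RelationMention(
    "Obama",
    "USA",
    "Obama was born in Honolulu, Hawaii, USA as he has always said",
)
labels = distant_supervision(mention, KB)
print(labels)  # ['Born-in', 'Citizen-of', 'President-of']
```

Only Born-in is expressed by the sentence; the other labels are noise introduced by the context-free lookup.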

Heterogeneous Supervision

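Distant supervision encodes only the KB, but other sources of supervision, such as domain-specific patterns and heuristics, are often available as well.

Heterogeneous supervision unifies these sources by expressing each one as a labeling function, which annotates the relation mentions it applies to.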
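As a rough sketch of what labeling functions might look like, the snippet below defines one KB-based function and one pattern-based function and applies both to the same mention. All names (`lf_kb`, `lf_born_pattern`, `annotate`, `TOY_KB`) and the regular expression are hypothetical, chosen for illustration rather than taken from the released labeling functions. A Python `None` return value here means the function abstains, which is distinct from the None relation type.

```python
import re
from typing import NamedTuple


class RelationMention(NamedTuple):
    """Hypothetical container for a relation mention (e1, e2, s)."""
    e1: str
    e2: str
    sentence: str


# Toy KB for illustration: one relation per entity pair.
TOY_KB = {("Obama", "USA"): "President-of"}


def lf_kb(m):
    """KB-based labeling function: looks up the entity pair, ignores the sentence."""
    return TOY_KB.get((m.e1, m.e2))  # Python None means "abstain"


def lf_born_pattern(m):
    """Pattern-based labeling function: fires on 'e1 ... born in ... e2'."""
    pattern = re.escape(m.e1) + r"\b.*\bborn in\b.*" + re.escape(m.e2)
    return "Born-in" if re.search(pattern, m.sentence) else None


def annotate(m, labeling_functions):
    """Collect the annotation of every labeling function that does not abstain."""
    results = {}
    for lf in labeling_functions:
        label = lf(m)
        if label is not None:
            results[lf.__name__] = label
    return results


mention = RelationMention(
    "Obama",
    "USA",
    "Obama was born in Honolulu, Hawaii, USA as he has always said",
)
annotations = annotate(mention, [lf_kb, lf_born_pattern])
has_conflict = len(set(annotations.values())) > 1

print(annotations)   # {'lf_kb': 'President-of', 'lf_born_pattern': 'Born-in'}
print(has_conflict)  # True
```

The two functions disagree on this mention, which is the kind of conflict that has to be resolved before the annotations can be used for training.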

Challenges

Conflicts among Heterogeneous Supervision

Our Solution

Intuition

Since context plays an important role in both tasks (true label discovery and relation extraction), we employ representation learning to capture context information and to bridge the two tasks.

A Representation Learning Approach

Framework

Experiments

Labeling Function

True Label Discovery

In Table 1, the first two relation mentions come from Wiki-KBP, and their annotations are {born-in, None}. The last two are created by replacing the key words of the first two. Key words are marked in bold and entity mentions in italics.

Table 1. Context-aware True Label Discovery
| Relation Mention | ReHession | Investment / Universal Schemas |
|---|---|---|
| Ann Demeulemeester ( born 1959 , Waregem , Belgium ) is ... | Born-in | None |
| Raila Odinga was born at ..., in Maseno, Kisumu District, ... | Born-in | None |
| Ann Demeulemeester ( elected 1959 , Waregem , Belgium ) is ... | None | None |
| Raila Odinga was examined at ..., in Maseno, Kisumu District, ... | None | None |

Investment and Universal Schemas infer None as the true type for all four instances in Table 1, whereas our method infers born-in as the true label for the first two relation mentions. After the matched context (born) is replaced with other words (elected and examined), our method no longer trusts born-in, because the modified contexts no longer match, and it infers None as the true label. In other words, our proposed method infers the true label in a context-aware manner.
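The behaviour in Table 1 can be mimicked with a deliberately simplified toy. This is not the representation learning model: a hand-written cue-word lookup (`TRUSTED_CUES`, a hypothetical name) stands in for the learned notion of which contexts make an annotation trustworthy. The sketch only illustrates the idea that the same candidate annotations {Born-in, None} lead to different inferred labels once the context changes.

```python
# Candidate annotations attached to every mention in Table 1.
CANDIDATES = ["Born-in", "None"]

# Toy stand-in for learned context trust (assumption for this sketch):
# a Born-in annotation is trusted only if the context contains a cue word.
TRUSTED_CUES = {"Born-in": {"born"}}


def infer_true_label(sentence, candidates):
    """Pick the first candidate whose cue words appear in the sentence.

    Falls back to the None type when no annotation is supported by the context.
    """
    tokens = set(sentence.lower().split())
    for label in candidates:
        if tokens & TRUSTED_CUES.get(label, set()):
            return label
    return "None"


# Sentences copied from Table 1, elisions included.
mentions = [
    "Ann Demeulemeester ( born 1959 , Waregem , Belgium ) is ...",
    "Raila Odinga was born at ..., in Maseno, Kisumu District, ...",
    "Ann Demeulemeester ( elected 1959 , Waregem , Belgium ) is ...",
    "Raila Odinga was examined at ..., in Maseno, Kisumu District, ...",
]

inferred = [infer_true_label(s, CANDIDATES) for s in mentions]
print(inferred)  # ['Born-in', 'Born-in', 'None', 'None']
```

A context-free scheme would have to assign the same label to the original and the modified sentences, since their candidate annotations are identical.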

Relation Extraction

Table 2 summarizes the performance comparison with several relation extraction systems on the KBP 2013 dataset (sentence-level extraction).
Table 2. Performance on Wiki-KBP
| Method | Precision | Recall | F1 |
|---|---|---|---|
| DSL (Mintz et al., 2009) | 0.3301 | 0.5446 | 0.4067 |
| MultiR (Hoffmann et al., 2011) | 0.3045 | 0.5277 | 0.3810 |
| FCM (Gormley et al., 2015) | 0.2523 | 0.5258 | 0.3410 |
| CoType-RM (Ren et al., 2017) | 0.3701 | 0.4767 | 0.4122 |
| ReHession (Ours) | 0.3677 | 0.4933 | 0.4208 |

Resources

The software and labeling functions are available on GitHub.

Reference

Please cite the following paper if you find the code and datasets useful:

@inproceedings{Liu2017rehession,
  title={Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach},
  author={Liu, Liyuan and Ren, Xiang and Zhu, Qi and Zhi, Shi and Gui, Huan and Ji, Heng and Han, Jiawei},
  booktitle={Proc. EMNLP},
  year={2017}
}