Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach

Liyuan Liu*, Xiang Ren*, Qi Zhu, Shi Zhi, Huan Gui, Heng Ji, and Jiawei Han

September 2017


Abstract

Relation extraction is a fundamental task in information extraction. Most existing methods rely heavily on annotations labeled by human experts, which are costly and time-consuming to obtain. To overcome this drawback, we propose a novel framework, REHession, that learns relation extractors using annotations from heterogeneous information sources, e.g., knowledge bases and domain heuristics. These annotations, referred to as heterogeneous supervision, often conflict with each other, which brings a new challenge to the original relation extraction task: how to infer the true label from noisy labels for a given instance. Identifying context information as the backbone of both relation extraction and true label discovery, we adopt embedding techniques to learn distributed representations of context, which bridge all components with mutual enhancement in an iterative fashion. Extensive experimental results demonstrate the superiority of REHession over the state of the art.

Type

Conference paper

Publication

The 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)

Liyuan Liu

Senior Researcher @ MSR

Understand the underlying mechanism of pretraining heuristics.