
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling

Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han


LD-Net

LD-Net provides sequence labeling models built on pruned language models, delivering efficient contextualized representations in a plug-in-and-play manner.

Notably, our pruned pre-trained NER model keeps a mean F1 of 91.84 on CoNLL 2003 while cutting #FLOPs from 51M to 5M (see Table 1).

Motivation

Language models have demonstrated their effectiveness for contextualized word representations and have pushed the state of the art on various tasks. Despite these performance improvements, contextualization also makes the resulting models too slow for real-world applications. In this paper, we conduct language model pruning for efficient contextualized representation, while maintaining the plug-in-and-play manner.
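To make the plug-in-and-play manner concrete, here is a minimal PyTorch sketch of a tagger that consumes features from a frozen, pre-trained language model alongside ordinary word embeddings. This is an illustration under assumed shapes and module names (`PluginTagger`, `frozen_lm`, and all dimensions are hypothetical), not the LD-Net codebase's API:

```python
import torch
import torch.nn as nn

class PluginTagger(nn.Module):
    """Sequence tagger using a frozen LM as a plug-in feature extractor."""

    def __init__(self, vocab_size, emb_dim, lm_dim, hidden_dim, n_tags, frozen_lm):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lm = frozen_lm                       # pre-trained; weights stay fixed
        for p in self.lm.parameters():
            p.requires_grad = False
        self.rnn = nn.LSTM(emb_dim + lm_dim, hidden_dim,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, n_tags)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        emb = self.embed(tokens)                  # (batch, seq_len, emb_dim)
        with torch.no_grad():                     # LM acts as a fixed feature extractor
            ctx, _ = self.lm(emb)                 # (batch, seq_len, lm_dim)
        feats, _ = self.rnn(torch.cat([emb, ctx], dim=-1))
        return self.out(feats)                    # per-token tag scores

# Toy usage, with a plain LSTM standing in for the pre-trained LM:
lm = nn.LSTM(50, 100, batch_first=True)
tagger = PluginTagger(vocab_size=1000, emb_dim=50, lm_dim=100,
                      hidden_dim=64, n_tags=9, frozen_lm=lm)
scores = tagger(torch.randint(0, 1000, (2, 7)))   # shape: (2, 7, 9)
```

Because the language model is frozen, swapping in a pruned LM only changes the feature extractor; the tagger on top is untouched.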

Details about LD-Net can be found in our paper: PDF

Figure: overview of the LD-Net framework.
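One simple way to picture layer-wise pruning is a greedy search: mask one layer at a time, re-evaluate on the development set, and permanently drop the layer whose removal hurts the metric least. The sketch below illustrates that generic idea with a stand-in `evaluate` function; it is an assumption-laden simplification, not the exact selection procedure from the paper:

```python
from typing import Callable, Set

def greedy_layer_prune(
    n_layers: int,
    evaluate: Callable[[Set[int]], float],  # dev metric with the given layers kept
    budget: int,                            # number of layers to keep
) -> Set[int]:
    """Greedily drop layers whose removal costs the least dev performance."""
    kept = set(range(n_layers))
    while len(kept) > budget:
        best_layer, best_score = None, float("-inf")
        for layer in sorted(kept):
            score = evaluate(kept - {layer})  # re-evaluate with one layer masked
            if score > best_score:
                best_layer, best_score = layer, score
        kept.remove(best_layer)
    return kept

# Toy usage: hypothetical per-layer importances; pruning keeps the heavy layers.
weights = [0.1, 0.5, 0.2, 0.9, 0.4, 0.8, 0.3, 0.7]
print(greedy_layer_prune(8, lambda kept: sum(weights[i] for i in kept), budget=3))
# -> {3, 5, 7}
```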

Benchmarks

NER

When models are trained only on the CoNLL 2003 English NER dataset, the results are summarized below.
Table 1: Performance on the CoNLL 2003 NER dataset

| Model for CoNLL03                        | #FLOPs | Mean (F1) | Std (F1) |
| ---------------------------------------- | ------ | --------- | -------- |
| Vanilla NER w/o LM                       | 3M     | 90.78     | 0.24     |
| LD-Net (w/o pruning)                     | 51M    | 91.86     | 0.15     |
| LD-Net (origin, picked based on dev F1)  | 51M    | 91.95     | –        |
| LD-Net (pruned)                          | 5M     | 91.84     | 0.14     |
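For intuition about the #FLOPs column, here is a back-of-the-envelope estimate of the per-token cost of one LSTM layer, counting each multiply-add of the four gates as two FLOPs. The dimensions are hypothetical, not LD-Net's actual configuration:

```python
def lstm_flops_per_token(d_in: int, d_hidden: int) -> int:
    """Rough FLOPs per token for one LSTM layer (ignores biases and nonlinearities)."""
    mult_adds = 4 * d_hidden * (d_in + d_hidden)  # W_x @ x and W_h @ h for 4 gates
    return 2 * mult_adds                          # one multiply + one add each

# e.g., a 300-d input feeding a 300-d hidden state:
print(f"{lstm_flops_per_token(300, 300) / 1e6:.1f}M FLOPs per token")  # 1.4M
```

Pruning layers of the language model reduces exactly this per-token cost, which is what drives the drop from 51M to 5M FLOPs in Table 1.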

Chunking

When models are trained only on the CoNLL 2000 English Chunking dataset, the results are summarized below.
Table 2: Performance on the CoNLL 2000 Chunking dataset

| Model for CoNLL00                        | #FLOPs | Mean (F1) | Std (F1) |
| ---------------------------------------- | ------ | --------- | -------- |
| Vanilla w/o LM                           | 3M     | 94.42     | 0.08     |
| LD-Net (w/o pruning)                     | 51M    | 96.01     | 0.07     |
| LD-Net (origin, picked based on dev F1)  | 51M    | 96.13     | –        |
| LD-Net (pruned)                          | 10M    | 95.66     | 0.04     |

BibTeX

Please cite the following paper if you find the code and datasets useful.

@inproceedings{liu2018efficient,
  title = "{Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling}",
  author = {Liu, Liyuan and Ren, Xiang and Shang, Jingbo and Peng, Jian and Han, Jiawei},
  booktitle = {EMNLP},
  year = 2018,
}