LD-Net
LD-Net provides sequence labeling models featuring:
- Efficiency: constructing efficient contextualized representations without retraining LMs.
- Portability: well-organized, well-documented, and easy to modify.
- 92.08 test F1 on the CoNLL03 NER task.
- 160K words/sec decoding speed (a 6× speedup over the unpruned model).
Motivation
Language models have demonstrated their effectiveness for contextualized word representations and have pushed the state-of-the-art performance on various tasks. Despite the performance improvements, language models also make the resulting models too slow for real-world applications. In this paper, we conduct language model pruning to build efficient contextualized representations while maintaining their plug-and-play nature.
Details about LD-Net can be accessed at: PDF
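Below is a minimal, hypothetical sketch of the layer-selection idea behind LD-Net: a densely connected language model whose layers can be pruned at inference time, with the retained layers' outputs concatenated as the contextualized representation. The class and argument names (`DenseLM`, `keep_layers`) are illustrative only and do not mirror the actual LD-Net code base; pruned layers are simply zeroed out here to keep the sketch short.

```python
import torch
import torch.nn as nn

class DenseLM(nn.Module):
    """Toy densely connected LM: layer i reads the word embedding plus the
    outputs of all earlier layers (hypothetical sketch, not LD-Net's API)."""
    def __init__(self, emb_dim=100, hid_dim=100, num_layers=8):
        super().__init__()
        self.hid_dim = hid_dim
        self.layers = nn.ModuleList(
            nn.LSTM(emb_dim + i * hid_dim, hid_dim, batch_first=True)
            for i in range(num_layers)
        )

    def forward(self, emb, keep_layers=None):
        # keep_layers: indices of layers retained after pruning; None = keep all
        feats, kept = [emb], []
        for i, layer in enumerate(self.layers):
            if keep_layers is not None and i not in keep_layers:
                # pruned layer: skip its computation, feed zeros downstream
                feats.append(emb.new_zeros(emb.size(0), emb.size(1), self.hid_dim))
                continue
            out, _ = layer(torch.cat(feats, dim=-1))
            feats.append(out)
            kept.append(out)
        # concatenation of retained layers = contextualized word representation
        return torch.cat(kept, dim=-1)

emb = torch.randn(2, 10, 100)          # (batch, seq_len, emb_dim)
lm = DenseLM()
full = lm(emb)                         # all 8 layers -> 800-dim features
pruned = lm(emb, keep_layers={2, 6})   # 2 layers -> 200-dim, far fewer FLOPs
```

The point of the sketch is only that dropping layers shrinks both the feature size and the FLOPs of the forward pass without retraining the remaining layers; the actual selection criterion and training regularization are described in the paper.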
Benchmarks
NER
When models are trained only on the CoNLL 2003 English NER dataset, the results are summarized below.

Model for CoNLL03 | #FLOPs | Mean(F1) | Std(F1) |
---|---|---|---|
Vanilla NER w.o. LM | 3 M | 90.78 | 0.24 |
LD-Net (w.o. pruning) | 51 M | 91.86 | 0.15 |
LD-Net (origin, picked based on dev F1) | 51 M | 91.95 | |
LD-Net (pruned) | 5 M | 91.84 | 0.14 |
Chunking
When models are trained only on the CoNLL 2000 English Chunking dataset, the results are summarized below.

Model for CoNLL00 | #FLOPs | Mean(F1) | Std(F1) |
---|---|---|---|
Vanilla Chunking w.o. LM | 3 M | 94.42 | 0.08 |
LD-Net (w.o. pruning) | 51 M | 96.01 | 0.07 |
LD-Net (origin, picked based on dev F1) | 51 M | 96.13 | |
LD-Net (pruned) | 10 M | 95.66 | 0.04 |
BibTeX
Please cite the following paper if you find the code and datasets useful.
@inproceedings{liu2018efficient,
  title = "{Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling}",
  author = {Liu, Liyuan and Ren, Xiang and Shang, Jingbo and Peng, Jian and Han, Jiawei},
  booktitle = {EMNLP},
  year = 2018,
}