Specializing Word Embeddings (for Parsing) by Information Bottleneck

Xiang Lisa Li, Jason Eisner


This paper proposes a method based on the Variational Information Bottleneck (VIB) to compress contextual word embeddings such as BERT and ELMo into either a discrete or a continuous representation. The compression is very fast, and the compressed embeddings are also more accurate for parsing. The paper won the Best Paper Award at EMNLP 2019.
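
To make the idea of compressing an embedding through a Variational Information Bottleneck more concrete, here is a minimal sketch of the continuous case only. It is not the authors' implementation: the linear encoder, the standard normal prior, and the names `compress`, `vib_loss`, `w_mu`, `w_logvar` and `beta` are all assumptions made for illustration. The sketch maps one embedding vector to a Gaussian, samples a compressed code with the reparameterization trick, and charges a KL penalty that, in the usual VIB formulation, upper-bounds the information kept about the input.

```python
import math
import random


def compress(x, w_mu, w_logvar, rng):
    """Stochastically compress one embedding x (list of floats).

    w_mu and w_logvar are hypothetical linear encoders, given as lists of
    rows; each row has len(x) weights and yields one output dimension.
    Returns (t, kl): the sampled code and its KL cost.
    """
    mu = [sum(w * xi for w, xi in zip(row, x)) for row in w_mu]
    logvar = [sum(w * xi for w, xi in zip(row, x)) for row in w_logvar]

    # Reparameterization: t = mu + sigma * eps, with eps ~ N(0, 1).
    t = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
         for m, lv in zip(mu, logvar)]

    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), the compression penalty.
    # The standard normal prior is an assumption of this sketch.
    kl = 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                   for m, lv in zip(mu, logvar))
    return t, kl


def vib_loss(task_nll, kl, beta):
    """Generic VIB trade-off: task loss plus beta times the KL penalty."""
    return task_nll + beta * kl


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(8)]            # stand-in embedding
    w_mu = [[rng.gauss(0.0, 0.1) for _ in range(8)] for _ in range(3)]
    w_logvar = [[rng.gauss(0.0, 0.1) for _ in range(8)] for _ in range(3)]
    t, kl = compress(x, w_mu, w_logvar, rng)
    # task_nll=1.0 is a placeholder for a parser's negative log-likelihood.
    print(len(t), kl, vib_loss(task_nll=1.0, kl=kl, beta=0.1))
```

In this sketch a larger `beta` pushes the code toward the prior (more compression), while a smaller `beta` lets it keep more of the original embedding. The discrete variant mentioned above is not covered here.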

Main Contribution

Loss Function

Architecture & Implementation Details