THESIS
2021
1 online resource (x, 62 pages) : illustrations (chiefly color)
Abstract
LASER, one of the current state-of-the-art multilingual document embedding models, is based on
a bidirectional LSTM (BiLSTM) neural machine translation (NMT) model. This paper presents
a Transformer-based Multilingual sentence/Document Embedding model, T-MDE, which makes
two significant improvements. First, the BiLSTM encoder is replaced by an attention-based
Transformer structure with a novel information bottleneck design. The new model structure is more
capable of learning sequential patterns in longer texts, and it is faster in both training and
embedding generation. Second, we augment the NMT translation loss function with a carefully
designed distance constraint loss term, which brings the embeddings of parallel sentences
closer together in the vector space. We call the T-MDE model trained with the distance constraint
cT-MDE. Our T-MDE model significantly outperforms the BiLSTM-based LASER on cross-lingual document classification tasks.
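
To illustrate the idea of augmenting a translation loss with a distance constraint, the following is a minimal sketch only. The abstract does not give the exact form of the distance term, so the squared Euclidean distance, the `weight` parameter, and all function names here are assumptions made for illustration, not the formulation used in the thesis.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the reference token under the predicted distribution."""
    return -math.log(probs[target_index])


def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings (assumed distance measure)."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def combined_loss(token_probs, target_indices, src_embedding, tgt_embedding, weight=1.0):
    """Mean translation loss plus a weighted distance penalty (hypothetical form).

    token_probs: one predicted probability distribution per target token.
    target_indices: index of the reference token at each position.
    src_embedding, tgt_embedding: embeddings of a pair of parallel sentences.
    weight: hypothetical coefficient balancing the two terms.
    """
    translation = sum(
        cross_entropy(p, t) for p, t in zip(token_probs, target_indices)
    ) / len(target_indices)
    distance = squared_distance(src_embedding, tgt_embedding)
    return translation + weight * distance
```

Under these assumptions, the penalty is zero when the two parallel sentences map to the same embedding and grows as the embeddings move apart, so minimizing the combined loss pulls them together while still training the translation objective.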