Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12323/5448
Title: Generating Word Embeddings for the Azerbaijani Language and Their Empirical Evaluation on Intrinsic and Extrinsic Evaluation Tasks
Authors: Suleymanov, Umid
Keywords: Word Embeddings
Machine Learning
word2vec
GloVe
NLP
Sentiment Analysis
LSTM
Deep Learning
Semantic Analogy
fastText
Issue Date: 2021
Abstract: Recently, the success of word embeddings and pre-trained language representation architectures has significantly increased interest in natural language processing. They are at the core of many breakthroughs in Natural Language Processing tasks such as question answering, machine translation, and sentiment analysis. At the same time, the rapid growth in the number of techniques and architectures makes a thesis on these topics especially relevant for Azerbaijani, an agglutinative and low-resource language. In this thesis, word embeddings will be generated using various architectures, and their effectiveness will be analyzed through empirical research on datasets covering various natural language processing tasks, including sentiment analysis, text classification, and semantic and syntactic analogy. The thesis will also survey natural language representation approaches, from traditional vector space models to recent word embeddings and the pre-training of deep bidirectional transformer architectures. The novelty introduced by each new technique, its main advantages, and the challenges it addresses are discussed. Mathematical background is explained where necessary to make the distinctions between architectures clearer.
URI: http://hdl.handle.net/20.500.12323/5448
Appears in Collections:Thesis



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.