Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.12323/5448
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Suleymanov, Umid | - |
dc.date.accessioned | 2022-03-11T10:45:20Z | - |
dc.date.available | 2022-03-11T10:45:20Z | - |
dc.date.issued | 2021 | - |
dc.identifier.uri | http://hdl.handle.net/20.500.12323/5448 | - |
dc.description.abstract | Recently, the success of word embeddings and pre-trained language representation architectures has significantly increased interest in natural language processing. They are at the core of many breakthroughs in natural language processing tasks such as question answering, machine translation, and sentiment analysis. At the same time, the rapid growth in the number of techniques and architectures makes a thesis on these topics highly relevant for Azerbaijani, an agglutinative and low-resource language. In this thesis, word embeddings are generated using various architectures, and their effectiveness is analyzed empirically on datasets covering several natural language processing tasks, including sentiment analysis, text classification, and semantic and syntactic analogy. The thesis also surveys natural language representation approaches, from traditional vector space models to recent word embedding methods and the pre-training of deep bidirectional transformer architectures. The novelty introduced by each technique, its main advantages, and the challenges it addresses are discussed. I also aim to explain the mathematical background where necessary to make the distinctions between architectures clearer. | en_US |
dc.language.iso | en | en_US |
dc.subject | Word Embeddings | en_US |
dc.subject | Machine Learning | en_US |
dc.subject | word2vec | en_US |
dc.subject | GloVe | en_US |
dc.subject | NLP | en_US |
dc.subject | Sentiment Analysis | en_US |
dc.subject | LSTM | en_US |
dc.subject | Deep Learning | en_US |
dc.subject | Semantic Analogy | en_US |
dc.subject | fastText | en_US |
dc.title | Generating Word Embeddings for the Azerbaijani Language and Their Empirical Evaluation on Intrinsic and Extrinsic Evaluation Tasks | en_US |
dc.type | Thesis | en_US |
Appears in Collections: | Thesis |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Generating Word Embeddings for the Azerbaijani Language and Their Empirical Evaluation on Intrinsic and Extrinsic Evaluation Tasks.pdf | | 2.54 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.