Automatic POS tagging of Macedonian Language

Bonchanoski, Martin and Zdravkova, Katerina (2017) Automatic POS tagging of Macedonian Language. In: PROCEEDINGS of the 14th Conference on Informatics and Information Technology. Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University in Skopje, Macedonia, Skopje, Macedonia, pp. 136-140. ISBN 978-608-4699-07-1

Official URL: http://ciit.finki.ukim.mk

Abstract

This paper presents research that led to a system for automatic disambiguation of part-of-speech tags for the Macedonian language. It first explains the need for such a system and then describes the characteristics of the Macedonian language. It goes on to introduce the pre-processing of the lexical corpus and to explain the systems used for manual tagging and for disambiguating the crowd-sourced results. The work achieved 96.90% accuracy, which is comparable to state-of-the-art taggers for other languages. The paper describes the machine learning techniques applied to obtain these results. The models built include a TnT tagger, an averaged perceptron, a neural network model implemented with SyntaxNet, and a model based on guided learning for bidirectional sequence classification. The final result is a system that automatically assigns part-of-speech tags to unlabeled text.
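
To make the reported accuracy figure concrete, the following is a minimal sketch of how token-level tagging accuracy is commonly computed. It assumes accuracy means the share of tokens whose predicted tag matches the gold tag; the abstract does not state the exact evaluation protocol. The function name, the toy sentences and the tag labels are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# A sentence is a list of (token, tag) pairs.
TaggedSentence = List[Tuple[str, str]]


def token_accuracy(gold: List[TaggedSentence],
                   predicted: List[TaggedSentence]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("aligned sentences must have the same number of tokens")
        for (_, gold_tag), (_, pred_tag) in zip(gold_sent, pred_sent):
            correct += int(gold_tag == pred_tag)
            total += 1
    # An empty evaluation set is reported as 0.0 rather than raising.
    return correct / total if total else 0.0


# Hypothetical toy data: four tokens, one of them mis-tagged.
gold = [[("the", "DET"), ("dog", "NOUN")], [("it", "PRON"), ("runs", "VERB")]]
pred = [[("the", "DET"), ("dog", "NOUN")], [("it", "PRON"), ("runs", "NOUN")]]
score = token_accuracy(gold, pred)
```

Under this definition, a score of 0.9690 would mean that roughly 969 of every 1000 tokens in the evaluation set received the correct tag.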

Item Type: Book Section
Subjects: International Conference on Informatics and Information Technologies > Artificial Intelligence
International Conference on Informatics and Information Technologies > Robotics
International Conference on Informatics and Information Technologies > Bioinformatics
Depositing User: Vangel Ajanovski
Date Deposited: 29 Nov 2017 18:30
Last Modified: 29 Nov 2017 18:30
URI: http://eprints.finki.ukim.mk/id/eprint/11391
