Develop a Part-of-Speech Tagger and a Tagger-Maker: Algorithms, Implementations, Results, and APIs - Softcover

Han, Jiayun

 
9783659376221: Develop a Part-of-Speech Tagger and a Tagger-Maker: Algorithms, Implementations, Results, and APIs

Synopsis

This project aims to build an efficient, scalable, portable, and trainable part-of-speech tagger. Using 98% of Penn Treebank-3 as training data, it first builds a raw tagger based on Bayes' theorem, a hidden Markov model, and the Viterbi algorithm. A reinforcement machine learning algorithm and contextual transformation rules are then applied to increase the tagger's accuracy. The tagger's final accuracy on the test data is 96.51%, and its speed is about 26,000 words per second on a computer with two gigabytes of random access memory and two 3.00 GHz Pentium duo processors. The tagger's portability and trainability are demonstrated by the tagger-maker's success in building a new tagger from a corpus annotated with a tagset different from that of the Penn Treebank.
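For readers unfamiliar with the raw-tagger stage, the following is a minimal sketch of hidden Markov model tagging with Viterbi decoding. It is not the book's implementation: the function names (`train`, `log_prob`, `viterbi`), the additive smoothing constant, and the three-sentence toy corpus are all assumptions made for illustration, and the reinforcement learning and transformation-rule stages are not covered.

```python
import math
from collections import defaultdict

def train(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    for sent in tagged_sentences:
        prev = "<s>"  # hypothetical sentence-start marker
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            prev = tag
    return trans, emit

def log_prob(counts, key, n_outcomes, alpha=0.1):
    """Log probability with additive smoothing (alpha is an assumed value)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + alpha) / (total + alpha * n_outcomes))

def viterbi(words, trans, emit):
    """Return the most probable tag sequence for words under the HMM."""
    if not words:
        return []
    tags = sorted(emit)
    n_tags = len(tags)
    # Vocabulary size plus one slot reserved for unseen words.
    vocab = len({w for t in emit for w in emit[t]}) + 1
    # best[i][tag] = (log score of best path ending in tag at i, previous tag)
    best = [{}]
    for t in tags:
        score = (log_prob(trans["<s>"], t, n_tags)
                 + log_prob(emit[t], words[0].lower(), vocab))
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i].lower(), vocab)
            # Pick the predecessor tag that maximizes the path score.
            cands = [(best[i - 1][p][0] + log_prob(trans[p], t, n_tags), p)
                     for p in tags]
            score, prev_tag = max(cands)
            best[i][t] = (score + e, prev_tag)
    # Backtrace from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]

# Toy corpus using Penn Treebank style tags (illustrative data only).
corpus = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]
trans, emit = train(corpus)
print(viterbi(["the", "cat", "barks"], trans, emit))
```

Scores are kept in log space so that long sentences do not underflow. A tagger trained on a full corpus would also need a better unknown-word model than the single smoothed slot used here.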

The synopsis may refer to a different edition of this title.

About the author

Jiayun Han obtained his PhD in Linguistics and MS in Artificial Intelligence from the University of Georgia, U.S.A. He worked for North Side Inc. as a natural language processing engineer and is currently employed by Manwin Canada as a software developer.

"About this title" may refer to a different edition of this title.