Combining Rules and CRF Learning for Opinion Source Identification in Spanish Texts

Type
Journal article
Year
2012
Publisher
Springer Berlin Heidelberg
Pages
452
Volume
7637
Abstract

In this work we present a system for the automatic annotation of opinions in Spanish texts. We focus mainly on defining a TFS-style model for opinion predicates and their arguments, on creating a lexicon of opinion predicates, and on two additional variants for identifying the source of opinions. The original system extracts opinions and all their elements (predicate, source, topic and message) using hand-coded rules. The first variant uses a CRF model to learn the source, assuming that the predicate is already tagged. The second variant is a combined version, in which the output of the rule-based source recognition is added as an additional attribute for training the CRF model. We found that this hybrid system performs better than either system evaluated separately. This work involved the construction of several resources for Spanish: a lexicon of opinion predicates, a 13,000-word corpus with full opinion annotations, and a 40,000-word corpus with annotations of opinion predicates and sources.
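The combined variant described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the toy predicate list `OPINION_PREDICATES`, the single rule in `rule_based_source_tags`, and the feature names are all hypothetical. The sketch only shows the general idea of feeding a rule-based tagger's output to a CRF as one more per-token attribute.

```python
from typing import Dict, List

# Hypothetical toy lexicon of opinion predicates (not the authors' lexicon).
OPINION_PREDICATES = {"dijo", "afirmó", "opinó"}


def rule_based_source_tags(tokens: List[str]) -> List[str]:
    """Toy stand-in for hand-coded rules: mark the token immediately
    before an opinion predicate as the source ("SRC"), all others "O"."""
    tags = ["O"] * len(tokens)
    for i, tok in enumerate(tokens):
        if tok.lower() in OPINION_PREDICATES and i > 0:
            tags[i - 1] = "SRC"
    return tags


def token_features(tokens: List[str], rule_tags: List[str], i: int) -> Dict[str, object]:
    """Build the attribute dictionary for token i."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_title": tok.istitle(),
        # The predicate is assumed to be tagged already.
        "is_predicate": tok.lower() in OPINION_PREDICATES,
        # Hybrid step: the rule-based decision becomes an extra attribute.
        "rule_tag": rule_tags[i],
    }


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Return one attribute dictionary per token in the sentence."""
    rule_tags = rule_based_source_tags(tokens)
    return [token_features(tokens, rule_tags, i) for i in range(len(tokens))]


features = sentence_features(["Pérez", "dijo", "que", "llueve"])
```

Per-token dictionaries of this kind, paired with gold source labels, are the sort of input a CRF toolkit would typically be trained on; the learner can then weigh the rule-based attribute against the other features.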

Authors

Jean-Luc Minel
Aiala Rosá
Néstor D. Duque-Méndez
Juan Pavón
Rubén Fuentes-Fernández
Citekey
citeulike:12275311
doi
10.1007/978-3-642-34654-5_46
Keywords