Welcome to HAHA (Humor Analysis based on Human Annotation), a shared task on classifying tweets in Spanish as humorous or not and on determining how funny they are. This task is part of IberEval 2018.



While humor has historically been studied from psychological, cognitive and linguistic standpoints, its study from a computational perspective is still largely unexplored in Machine Learning and Computational Linguistics. Some previous work exists (Mihalcea & Strapparava, 2005; Sjöbergh & Araki, 2007; Castro et al., 2016), but a characterization of humor that allows its automatic recognition and generation is far from being specified. The aim of this task is to gain better insight into what is humorous and what causes laughter.

Related shared tasks have been organized before. SemEval-2015 Task 11 addressed figurative language, such as metaphor and irony, but focused on sentiment analysis. SemEval-2017 Task 6 presented a task similar to this one.

The HAHA evaluation campaign proposes different subtasks related to automatic humor detection. In order to carry out the tasks, an annotated corpus of tweets in Spanish will be provided.


We provide a corpus of 20,000 crowd-annotated tweets based on (Castro et al., 2017), divided into 16,000 tweets for training and 4,000 tweets for testing. The annotation used a voting scheme in which users could select one of six options: the tweet does not contain humor, or the tweet contains humor and deserves a number of stars from one to five.

All tweets are classified as humorous or not humorous. The humorous tweets received at least three votes indicating a number of stars, and at least five votes in total. The not humorous tweets received at least three votes for not humor (they might have fewer than five votes in total).
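The labelling rule described above can be sketched in Python as follows. This is a minimal illustration, not the organizers' code: the function name `label_tweet` and the vote layout are hypothetical, and returning `None` when neither criterion holds, or when both hold, is an assumption, since the text does not say how such tweets were handled.

```python
from typing import Optional, Sequence


def label_tweet(not_humor_votes: int, star_votes: Sequence[int]) -> Optional[bool]:
    """Return True (humorous), False (not humorous) or None (undecided).

    star_votes[i] is the number of votes that gave the tweet i + 1 stars.
    """
    humor_votes = sum(star_votes)
    total_votes = not_humor_votes + humor_votes

    # Humorous: at least three star votes and at least five votes in total.
    meets_humorous = humor_votes >= 3 and total_votes >= 5
    # Not humorous: at least three "not humor" votes, regardless of the total.
    meets_not_humorous = not_humor_votes >= 3

    # Assumption: tweets meeting neither or both criteria are left unlabelled.
    if meets_humorous == meets_not_humorous:
        return None
    return meets_humorous


# Hypothetical vote counts, for illustration only.
print(label_tweet(1, [0, 1, 2, 0, 1]))  # four star votes, five votes in total
print(label_tweet(3, [0, 0, 0, 0, 0]))  # three "not humor" votes
```

With these hypothetical counts, the first call satisfies only the humorous criterion and the second only the not humorous one.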

The corpus contains annotated tweets such as the following:

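Text: – La semana pasada mi hijo hizo un triple salto mortal desde 20 metros de altura. – ¿Es trapecista? – Era :(
(English: "Last week my son did a triple somersault from a height of 20 meters." "Is he a trapeze artist?" "He was :(")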
Is humorous
Votes: Not humor
Votes: 1 star 0
Votes: 2 stars 1
Votes: 3 stars 2
Votes: 4 stars 0
Votes: 5 stars 1
Average stars 3.25
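
The average in the example above can be reproduced from the star votes alone, assuming it is the mean number of stars over the votes that assigned stars (votes for not humor are excluded). The helper name `average_stars` is hypothetical and this is only a sketch of that assumed computation.

```python
from typing import Optional, Sequence


def average_stars(star_votes: Sequence[int]) -> Optional[float]:
    """Mean star rating; star_votes[i] is the number of votes for i + 1 stars."""
    total_votes = sum(star_votes)
    if total_votes == 0:
        return None  # no star votes, so the average is undefined
    weighted_sum = sum(stars * count for stars, count in enumerate(star_votes, start=1))
    return weighted_sum / total_votes


# Star votes from the example: 0, 1, 2, 0 and 1 votes for 1 to 5 stars.
print(average_stars([0, 1, 2, 0, 1]))  # (2 + 6 + 5) / 4 = 3.25
```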

Task description

Three subtasks are proposed, based on tweets written in Spanish:

Important Dates


The training and test data can be downloaded here. If you use this corpus, please cite (Castro et al., 2018a) or (Castro et al., 2018b).

Training data

Test data without annotations

Test data with gold annotations

You can find more information about this corpus here.


If you want to participate in this task, please join the Google Group hahaibereval2018. We will share news and important information about the task in that group. If you have any questions that you would prefer to ask privately, contact us at hahapln@fing.edu.uy.

The organizers of the task are:

We are part of the NLP research group at Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Uruguay.


(Castro et al., 2018a) Castro, S., Chiruzzo, L., Rosá, A., Garat, D., & Moncecchi, G. (2018). A crowd-annotated Spanish corpus for humor analysis. In Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media (pp. 7-11).

(Castro et al., 2018b) Castro, S., Chiruzzo, L., & Rosá, A. (2018). Overview of the HAHA Task: Humor Analysis based on Human Annotation at IberEval 2018.

(Castro et al., 2016) Castro, S., Cubero, M., Garat, D., & Moncecchi, G. (2016). Is This a Joke? Detecting Humor in Spanish Tweets. In Ibero-American Conference on Artificial Intelligence (pp. 139-150). Springer International Publishing.

(Castro et al., 2017) Castro, S., Cubero, M., Garat, D., & Moncecchi, G. (2017). HUMOR: A Crowd-Annotated Spanish Corpus for Humor Analysis. arXiv preprint arXiv:1710.00477.

(Fleiss, 1971) Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378.

(Mihalcea & Strapparava, 2005) Mihalcea, R., & Strapparava, C. (2005). Making Computers Laugh: Investigations in Automatic Humor Recognition. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT ’05 (pp. 531–538). Association for Computational Linguistics, Vancouver, British Columbia, Canada.

(Sjöbergh & Araki, 2007) Sjöbergh, J., & Araki, K. (2007). Recognizing Humor Without Recognizing Meaning. In WILF (pp. 469–476). Springer.