Welcome to the 2019 edition of the shared task HAHA - Humor Analysis based on Human Annotation, a task to classify tweets in Spanish as humorous or not, and to determine how funny they are. This task is part of IberLEF 2019.

Introduction

While humor has historically been studied from psychological, cognitive and linguistic standpoints, its study from a computational perspective remains a largely unexplored area in Machine Learning and Computational Linguistics. There is some previous work (Mihalcea & Strapparava, 2005; Sjöbergh & Araki, 2007; Castro et al., 2016), but a characterization of humor that allows its automatic recognition and generation is far from being specified. The aim of this task is to gain better insight into what is humorous and what causes laughter.

There is past work on this topic. SemEval-2015 Task 11 addressed figurative language, such as metaphors and irony, but focused on Sentiment Analysis. SemEval-2017 Task 6 also presented a task similar to this one. This is the second edition of the HAHA task; the results of last year's edition are also available (Castro et al., 2018b).

The HAHA evaluation campaign proposes different subtasks related to automatic humor detection. In order to carry out the tasks, an annotated corpus of tweets in Spanish will be provided.

Corpus

We provide a corpus of crowd-annotated tweets based on (Castro et al., 2018a), split into 80% for training and 20% for testing. The annotation followed a voting scheme in which users could select one of six options: the tweet is not humorous, or the tweet is humorous and a score is given from one (not funny) to five (excellent).

All tweets are classified as humorous or not humorous. Humorous tweets received at least three votes indicating a number of stars, and at least five votes in total. Not humorous tweets received at least three votes for not humor (they might have fewer than five votes in total).

The corpus contains annotated tweets such as the following:

Text: – La semana pasada mi hijo hizo un triple salto mortal desde 20 metros de altura. – ¿Es trapecista? – Era :(
(English: – Last week my son did a triple somersault from a height of 20 meters. – Is he a trapeze artist? – He was :( )
Is humorous: True
Votes (not humor): 1
Votes (1 star): 0
Votes (2 stars): 1
Votes (3 stars): 2
Votes (4 stars): 0
Votes (5 stars): 1
Funniness score: 3.25
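The voting scheme described above can be sketched in Python. This is an illustrative sketch, not part of the official corpus tools; the function name and vote representation are assumptions.

```python
def label_tweet(not_humor_votes, star_votes):
    """Apply the corpus labeling rules to raw vote counts.

    not_humor_votes: number of "not humorous" votes.
    star_votes: list of five counts, star_votes[i] = votes for (i + 1) stars.
    Returns (is_humorous, funniness_score); the score is None when the
    tweet is not labeled humorous.
    """
    n_star_votes = sum(star_votes)
    total_votes = not_humor_votes + n_star_votes

    # Humorous: at least three star votes and at least five votes in total.
    if n_star_votes >= 3 and total_votes >= 5:
        weighted = sum(count * (stars + 1) for stars, count in enumerate(star_votes))
        return True, weighted / n_star_votes
    # Not humorous: at least three "not humor" votes.
    if not_humor_votes >= 3:
        return False, None
    # Otherwise the tweet meets neither threshold.
    return None, None

# The example tweet above: 1 "not humor" vote and star votes [0, 1, 2, 0, 1].
print(label_tweet(1, [0, 1, 2, 0, 1]))  # (True, 3.25)
```

For the example tweet, the funniness score is the average of the star votes: (2 + 3 + 3 + 5) / 4 = 3.25, matching the annotation shown above.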

Task description

Based on tweets written in Spanish, the following subtasks are proposed:

Humor Detection (Task 1): determining whether a tweet is a joke or not. Systems are ranked by F1 score on the humorous class; precision, recall and accuracy are also reported.

Funniness Score Prediction (Task 2): predicting the funniness score (average number of stars) of a humorous tweet. Systems are ranked by root mean squared error (RMSE).

Data

The training and test data can be downloaded here. If you use this corpus, please cite (Castro et al., 2018a) or (Chiruzzo et al., 2019).

Training data

Test data used in the competition (without annotations and with autogenerated tweet ids)

Test data with gold annotations

You can find more information about this corpus here.
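The exact file format is documented with the download. Assuming a CSV layout with one tweet per row (the column names below are illustrative, not the official ones), the data can be read with Python's standard library:

```python
import csv
import io

# A two-row stand-in for the training file; the real columns may differ.
sample = io.StringIO(
    "id,text,is_humor,funniness_average\n"
    '1,"- ¿Es trapecista? - Era :(",1,3.25\n'
    '2,"Hoy llueve en Montevideo.",0,\n'
)

rows = list(csv.DictReader(sample))
texts = [row["text"] for row in rows]
labels = [int(row["is_humor"]) for row in rows]
# A funniness score is only defined for humorous tweets (Task 2).
scores = [float(row["funniness_average"]) for row in rows if row["is_humor"] == "1"]

print(labels)  # [1, 0]
print(scores)  # [3.25]
```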

Results

The following are the results for Task 1:
Team             F1    Precision  Recall  Accuracy
adilism          82.1  79.1       85.2    85.5
Kevin & Hiromi   81.6  80.2       83.1    85.4
bfarzin          81.0  78.2       83.9    84.6
jamestjw         79.8  79.3       80.4    84.2
INGEOTEC         78.8  75.8       81.9    82.8
BLAIR GMU        78.4  74.5       82.7    82.2
UO UPV2          77.3  78.0       76.5    82.4
vaduvabogdan     77.2  72.9       82.0    81.1
UTMN             76.0  75.6       76.5    81.2
LaSTUS/TALN      75.9  77.4       74.5    81.6
Taha             75.7  81.0       71.1    82.2
LadyHeidy        72.5  74.4       70.8    79.1
Aspie96          71.1  67.8       74.9    76.3
OFAI–UKP         66.0  58.8       75.3    69.8
acattle          64.0  68.3       60.2    73.6
jmeaney          63.6  61.3       66.1    70.5
garain           59.3  49.1       74.8    59.9
Amrita CEN       49.5  47.8       51.4    59.1
random baseline  44.0  39.4       49.7    50.5
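As a reference for how the Task 1 columns are computed, the four metrics can be obtained from gold labels and system predictions as follows (the toy data below is illustrative, not actual system output):

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall, F1 and accuracy for the positive (humorous) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = correct / len(y_true)
    return precision, recall, f1, accuracy

# Toy example: 6 gold labels vs. 6 predicted labels.
p, r, f1, acc = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
print(round(p, 3), round(r, 3), round(f1, 3), round(acc, 3))  # 0.667 0.667 0.667 0.667
```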
The following are the results for Task 2:
Team             RMSE
adilism          0.736
bfarzin          0.746
Kevin & Hiromi   0.769
jamestjw         0.798
INGEOTEC         0.822
BLAIR GMU        0.910
LaSTUS/TALN      0.919
UTMN             0.945
acattle          0.963
Amrita CEN       1.074
garain           1.653
Aspie96          1.673
OFAI–UKP         1.810
random baseline  2.455
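Task 2 is scored with root mean squared error (RMSE) between the predicted and gold funniness scores; a minimal reference implementation (with toy scores, not actual system output):

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between gold and predicted funniness scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Predicting the gold scores exactly gives RMSE 0; a constant prediction
# of 3.0 against these gold scores gives about 0.661.
print(rmse([3.25, 2.5, 4.0], [3.25, 2.5, 4.0]))            # 0.0
print(round(rmse([3.25, 2.5, 4.0], [3.0, 3.0, 3.0]), 3))   # 0.661
```

Lower RMSE is better, which is why the Task 2 table above is sorted in ascending order.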

Contact

If you want to participate in this task, please join the Google Group hahaiberlef2019, where we will share news and important information about the task. If you have any questions you would prefer to ask privately, contact us at hahapln@fing.edu.uy.

The organizers of the task are:

We are part of the NLP research group at Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Uruguay.

Bibliography

(Chiruzzo et al., 2019) Chiruzzo, L., Castro, S., Etcheverry, M., Garat, D., Prada, J. J., & Rosá, A. (2019). Overview of HAHA at IberLEF 2019: Humor Analysis based on Human Annotation. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019). CEUR Workshop Proceedings, CEUR-WS, Bilbao, Spain (9 2019).

(Castro et al., 2018a) Castro, S., Chiruzzo, L., Rosá, A., Garat, D., & Moncecchi, G. (2018). A crowd-annotated spanish corpus for humor analysis. In Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media (pp. 7-11).

(Castro et al., 2018b) Castro, S., Chiruzzo, L., & Rosá, A. (2018). Overview of the HAHA Task: Humor Analysis based on Human Annotation at IberEval 2018.

(Castro et al., 2016) Castro, S., Cubero, M., Garat, D., & Moncecchi, G. (2016). Is This a Joke? Detecting Humor in Spanish Tweets. In Ibero-American Conference on Artificial Intelligence (pp. 139-150). Springer International Publishing.

(Castro et al., 2017) Castro, S., Cubero, M., Garat, D., & Moncecchi, G. (2017). HUMOR: A Crowd-Annotated Spanish Corpus for Humor Analysis. arXiv preprint arXiv:1710.00477.

(Fleiss, 1971) Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological bulletin, 76(5), 378.

(Mihalcea & Strapparava, 2005) Mihalcea, R., & Strapparava, C. (2005). Making Computers Laugh: Investigations in Automatic Humor Recognition. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. HLT '05 (pp. 531–538). Association for Computational Linguistics, Vancouver, British Columbia, Canada.

(Sjöbergh & Araki, 2007) Sjöbergh, J., & Araki, K. (2007). Recognizing Humor Without Recognizing Meaning. In WILF (pp. 469–476). Springer.