This page shows the results of the submissions to the two HAHA tasks.
Task 1: Humor Detection
This task consists of deciding whether a tweet is a joke, that is, whether its author intended it to be humorous. F1 is the main metric for this task.
Baselines for this task over the test data:
- baseline1: decide randomly with 50% probability.
- baseline2: label tweets that start with a hyphen as humorous.
Team | Run | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|
INGEOTEC | run 2 | 0.8452 | 0.7796 | 0.8157 | 0.7972 |
UO_UPV | run 1 | 0.8455 | 0.8158 | 0.7567 | 0.7851 |
UO_UPV | run 2 | 0.8448 | 0.8322 | 0.7312 | 0.7785 |
ELiRF-UPV | run 1 | 0.8367 | 0.8046 | 0.7426 | 0.7724 |
UO_UPV | run 3 | 0.8397 | 0.8281 | 0.7198 | 0.7702 |
INGEOTEC | run 1 | 0.8403 | 0.8557 | 0.6877 | 0.7625 |
ELiRF-UPV | run 2 | 0.7552 | 0.6546 | 0.7279 | 0.6893 |
baseline | baseline1 | 0.4915 | 0.3645 | 0.4886 | 0.4175 |
baseline | baseline2 | 0.6595 | 0.9392 | 0.0932 | 0.1695 |
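
The columns above can be reproduced from gold labels and predictions. The following is a minimal sketch, not the organizers' evaluation script: the function names, the treatment of leading whitespace, and the sample tweets are assumptions made for illustration. It treats "humorous" as the positive class and includes the two baselines as described.

```python
import random


def binary_metrics(gold, pred):
    """Return (accuracy, precision, recall, F1) with humor as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    tn = sum(1 for g, p in zip(gold, pred) if not g and not p)
    total = tp + fp + fn + tn
    acc = (tp + tn) / total if total else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1


def baseline_random(tweets, seed=0):
    """baseline1: label each tweet as humorous with 50% probability."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    return [rng.random() < 0.5 for _ in tweets]


def baseline_hyphen(tweets):
    """baseline2: label a tweet as humorous if it starts with a hyphen.

    Ignoring leading whitespace is an assumption of this sketch.
    """
    return [t.lstrip().startswith("-") for t in tweets]


# Made-up examples, not taken from the task data.
sample_tweets = [
    "- Doctor, it hurts when I do this. - Then don't do that.",
    "It is raining again today.",
    "- What time is it? - Same as yesterday at this hour.",
    "My dog believes the mail carrier is his nemesis.",
]
sample_gold = [True, False, True, True]

acc, prec, rec, f1 = binary_metrics(sample_gold, baseline_hyphen(sample_tweets))
print(f"acc={acc:.4f} prec={prec:.4f} rec={rec:.4f} f1={f1:.4f}")
```

On this toy sample the hyphen rule makes no false positives but misses the joke that has no leading hyphen. That is the same pattern as the baseline2 row in the table: high precision, low recall, and therefore a low F1.
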
Task 2: Funniness Score Prediction
This task consists of predicting a funniness score (the average number of stars) for a tweet on a 5-star scale, considering only tweets that are jokes (non-jokes are not counted). Results are measured using root-mean-squared error (RMSE), so lower is better.
The baseline for this task over the test data: choose the value 3 (middle of the scale) for all tweets.
Team | Run | RMSE |
---|---|---|
INGEOTEC | run 1 | 0.9784 |
baseline | baseline | 1.1419 |
UO_UPV | run 4 | 1.5919 |
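
The RMSE computation and the constant baseline can be sketched as follows. This is an illustrative sketch, not the official scorer: the function names, the boolean `is_joke` mask used to skip non-jokes, and the sample scores are all assumptions.

```python
import math


def rmse_on_jokes(gold_scores, pred_scores, is_joke):
    """Root-mean-squared error over the tweets flagged as jokes only."""
    pairs = [
        (g, p)
        for g, p, joke in zip(gold_scores, pred_scores, is_joke)
        if joke  # non-jokes are not counted
    ]
    if not pairs:
        raise ValueError("no humorous tweets to evaluate")
    return math.sqrt(sum((g - p) ** 2 for g, p in pairs) / len(pairs))


def baseline_constant(n, value=3.0):
    """Baseline: predict the middle of the 5-star scale for every tweet."""
    return [value] * n


# Made-up scores, not taken from the task data.
sample_gold = [2.0, 4.0, 0.0]
sample_is_joke = [True, True, False]
sample_pred = baseline_constant(len(sample_gold))

print(rmse_on_jokes(sample_gold, sample_pred, sample_is_joke))
```

Here the third tweet is not a joke, so it is dropped before averaging. The remaining two predictions are each off by one star.
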