This page shows the submission results for the two HAHA tasks.
Task 1: Humor Detection
This task consists of determining whether a tweet is a joke, that is, whether the author intended it to be humorous. The F1 score is the main metric for this task.
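As a rough illustration of how the four reported metrics relate to each other, the following sketch computes accuracy, precision, recall, and F1 from gold and predicted labels. It is not the organizers' evaluation script. It assumes the humorous class is the positive one (the page does not say this explicitly), and the function names `classification_metrics` and `f1_from_precision_recall` are hypothetical.

```python
from typing import NamedTuple, Sequence


class Metrics(NamedTuple):
    accuracy: float
    precision: float
    recall: float
    f1: float


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(gold: Sequence[bool], predicted: Sequence[bool]) -> Metrics:
    """Compute metrics for the positive class (assumed here: humorous = True)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    # Confusion-matrix counts.
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    tn = sum(1 for g, p in zip(gold, predicted) if not g and not p)

    accuracy = (tp + tn) / len(gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return Metrics(accuracy, precision, recall, f1_from_precision_recall(precision, recall))
```

Applying `f1_from_precision_recall` to the precision and recall reported for a run should give that run's F1 up to rounding.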
Baselines for this task over the test data:
- baseline1: label each tweet as humorous or not at random, with 50% probability.
- baseline2: label tweets that start with a hyphen as humorous.
 
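The two baselines can be sketched in a few lines. This is only an illustration under assumptions the page does not state: the function names are hypothetical, `baseline2` ignores leading whitespace and checks only for the ASCII hyphen `-`, and the optional `seed` parameter is added for reproducibility.

```python
import random
from typing import List, Optional, Sequence


def baseline1(tweets: Sequence[str], seed: Optional[int] = None) -> List[bool]:
    """Label each tweet as humorous with probability 0.5, ignoring its text."""
    rng = random.Random(seed)  # seed is an assumption, used only for reproducibility
    return [rng.random() < 0.5 for _ in tweets]


def baseline2(tweets: Sequence[str]) -> List[bool]:
    """Label a tweet as humorous if it starts with a hyphen.

    Assumption: leading whitespace is skipped and only ASCII '-' counts.
    """
    return [tweet.lstrip().startswith("-") for tweet in tweets]
```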
| Team | Run | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|
| INGEOTEC | run 2 | 0.8452 | 0.7796 | 0.8157 | 0.7972 | 
| UO_UPV | run 1 | 0.8455 | 0.8158 | 0.7567 | 0.7851 | 
| UO_UPV | run 2 | 0.8448 | 0.8322 | 0.7312 | 0.7785 | 
| ELiRF-UPV | run 1 | 0.8367 | 0.8046 | 0.7426 | 0.7724 | 
| UO_UPV | run 3 | 0.8397 | 0.8281 | 0.7198 | 0.7702 | 
| INGEOTEC | run 1 | 0.8403 | 0.8557 | 0.6877 | 0.7625 | 
| ELiRF-UPV | run 2 | 0.7552 | 0.6546 | 0.7279 | 0.6893 | 
| baseline | baseline1 | 0.4915 | 0.3645 | 0.4886 | 0.4175 | 
| baseline | baseline2 | 0.6595 | 0.9392 | 0.0932 | 0.1695 |

Task 2: Funniness Score Prediction
This task consists of predicting a funniness score (the average number of stars) for a tweet on a 5-star scale. Only tweets that are jokes are scored; non-jokes are not counted. Results are measured using root-mean-squared error (RMSE), so lower is better.
The baseline for this task over the test data: choose the value 3 (middle of the scale) for all tweets.
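The metric and the constant baseline can be sketched as follows. This is a minimal illustration rather than the official scorer: the function names `rmse` and `constant_baseline` are hypothetical, and the sketch assumes the gold scores passed in have already been restricted to tweets that are jokes.

```python
import math
from typing import List, Sequence


def rmse(gold_scores: Sequence[float], predicted_scores: Sequence[float]) -> float:
    """Root-mean-squared error between gold and predicted funniness scores."""
    if len(gold_scores) != len(predicted_scores):
        raise ValueError("gold and predicted must have the same length")
    if not gold_scores:
        raise ValueError("at least one score is required")
    squared_errors = [(g - p) ** 2 for g, p in zip(gold_scores, predicted_scores)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def constant_baseline(n_tweets: int, value: float = 3.0) -> List[float]:
    """Predict the same score (3, the middle of the scale) for every tweet."""
    return [value] * n_tweets


# Illustrative gold scores only, not taken from the task data.
example_gold = [2.0, 4.0, 3.0]
example_error = rmse(example_gold, constant_baseline(len(example_gold)))
```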
| Team | Run | RMSE |
|---|---|---|
| INGEOTEC | run 1 | 0.9784 | 
| baseline | baseline | 1.1419 | 
| UO_UPV | run 4 | 1.5919 |