VERN compared to WASSA 2017 – Shared Task on Emotion Intensity

As part of our ongoing process of testing, improving, and comparing VERN against human- and computer-derived emotional models, we scour sources for data. One of the more interesting datasets we have encountered comes from the WASSA 2017 Shared Task on Emotion Intensity. Twenty-two teams participated in the shared task, with the best system obtaining a Pearson correlation of 0.747 with the gold intensity scores. (For those who aren’t into content analysis, that would be pretty good coder agreement.)

The resulting database contains emotions that we also found with VERN and have detectors for, including Fear, which is the emotion we compare here. Our sample is the top 135 Fear reactions as scored in WASSA 2017, compared against VERN. While the methodologies of the WASSA 17 model and the VERN model differ somewhat on intensities (most likely due to personal frames), they largely agree on what would be considered “fear.” We agreed with 78% of their detections, with 105 of 135 entries in agreement.
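For readers who want to check the arithmetic, here is a minimal Python sketch of the two measures mentioned so far: percent agreement (our 105 of 135) and the Pearson correlation used to rank the shared task systems. The function names `percent_agreement` and `pearson` are our own illustrative choices, not part of VERN or of the WASSA tooling, and the code is only meant to show how such figures are computed.

```python
from math import sqrt


def percent_agreement(agreed: int, total: int) -> float:
    """Share of entries where both systems made the same call, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * agreed / total


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of intensity scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Figures reported in this post: 105 of 135 entries in agreement.
print(f"{percent_agreement(105, 135):.1f}%")  # 77.8%, which rounds to 78%
```

Note that the two numbers measure different things. Percent agreement counts how often two coders reach the same detection, while Pearson correlation measures how closely two sets of intensity scores rise and fall together, so the 78% here and the 0.747 from the shared task are not directly interchangeable.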

VERN Fear vs WASSA 17 Comparison (PDF)

So what does this show?

Well, it means that VERN could do the job of content analysis professionals, or at the very least assist anyone who wants to find actual emotions and emotion intensity in communication. Twenty-two teams of highly educated academic specialists from around the world came together to work on this project. With just the raw VERN analysis (without the additional logic already available in our production-ready models), we’re at almost 80% agreement with humans, and not just any humans: experts in emotions.

Intensity variations are likely due to personal frames and to different operationalizations of the content coding process.