SentiWords is a high-coverage resource containing roughly 155,000 English words, each associated with a sentiment score between -1 and 1. Words in this resource are in the form lemma#PoS and are aligned with the WordNet lists (which include adjectives, nouns, verbs and adverbs). Scores are learned from SentiWordNet (SWN) and represent the state-of-the-art computation of words' prior polarities (i.e. the polarity of non-disambiguated words) using SWN. SentiWords was built using the method described in Guerini et al. (2013) and the dataset presented in Warriner et al. (2013). For a thorough description see Gatti et al. (2016).
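As an illustration of the lemma#PoS form described above, here is a minimal Python sketch of how such entries could be loaded into a lookup table. It assumes one entry per line with the lemma#PoS key and the score separated by a tab, and that lines starting with "#" are comments; neither detail is stated on this page, so check them against the downloaded file. The function name parse_sentiwords and the sample scores are hypothetical and are not taken from the resource.

```python
from typing import Dict, Iterable, Tuple


def parse_sentiwords(lines: Iterable[str]) -> Dict[Tuple[str, str], float]:
    """Parse lines of the ASSUMED form 'lemma#PoS<TAB>score' into a dict."""
    lexicon: Dict[Tuple[str, str], float] = {}
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' carry no entry.
        if not line or line.startswith("#"):
            continue
        entry, score_text = line.split("\t")
        # Split on the LAST '#', so a lemma containing '#' stays intact.
        lemma, pos = entry.rsplit("#", 1)
        score = float(score_text)
        # Scores are described as lying between -1 and 1.
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {entry}: {score}")
        lexicon[(lemma, pos)] = score
    return lexicon


# Made-up sample lines for illustration only (not real SentiWords scores).
sample = [
    "# hypothetical header comment",
    "good#a\t0.5",
    "bad#a\t-0.5",
    "run#v\t0.0",
]
lexicon = parse_sentiwords(sample)
print(lexicon[("good", "a")])
```

To read an actual file, the same function can be given an open file handle, since it accepts any iterable of lines. Keying the dictionary on the (lemma, PoS) pair keeps, for instance, a noun and a verb with the same lemma as separate entries.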

How to obtain it:

SentiWords is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. To obtain it, click the "Download SentiWords" button below. When referring to this resource, please cite the papers listed in the reference sections below.

Main Reference:

  • Gatti, L., Guerini, M., & Turchi, M. (2016). SentiWords: Deriving a high precision and high coverage lexicon for sentiment analysis. IEEE Transactions on Affective Computing, 7(4), 409-421. [Preprint]

Additional References:

  • Guerini, M., Gatti, L., & Turchi, M. (2013). Sentiment analysis: How to derive prior polarities from SentiWordNet. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP'13), pp. 1259-1269. Seattle, Washington, USA. [Preprint]

  • Warriner, A. B., Kuperman, V., & Brysbaert, M. (2013). Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45(4), 1191-1207. [Download page]


Contact: Marco Guerini