Measuring Ideological Spectrum Through NLP

Authors: Franco Demarco, Juan Manuel Ortiz de Zarate and Esteban Feuerstein.

Abstract:
In the evolving landscape of online communities, the tension between social integration and fragmentation remains a subject of ongoing debate. With the advent of technologically mediated social networks, understanding the structure of these communities remains a challenge. This study introduces a new text-based technique to quantify the alignment of online communities along social dimensions. Through the analysis of historical Reddit data, community representations are generated from Reddit posts and projected onto ideological-partisan axes. This approach scores communities and thereby situates them on the political-ideological spectrum.
Our approach rests on the premise that the language, topics, parlance, and discourse style employed by communities offer insights into their ideological leanings. We found that post text alone yields a partisanship ranking that is closely correlated with the one inferred from user interactions, which supports this premise. This text-based approach also enables the analysis of books, news, blogs, and other sources that previous, interaction-based approaches could not handle. Our results underscore the advantages of transformer-based embeddings when compared to skip-gram embeddings trained on the same dataset. This work contributes to the understanding of online community structures and their ideological foundations.
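
The projection idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes that each community is already represented by one vector (for example, an average of its post embeddings), that the partisan axis is the normalized difference between two seed communities, and that scores are cosine projections onto that axis. The community names, the toy 3-dimensional vectors, and the interaction-based scores are all hypothetical placeholders.

```python
import numpy as np


def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def partisan_axis(left_seed, right_seed):
    """Axis pointing from the left seed community to the right one (assumed definition)."""
    return unit(unit(right_seed) - unit(left_seed))


def score_communities(embeddings, axis):
    """Cosine projection of each community vector onto the axis."""
    return {name: float(np.dot(unit(vec), axis)) for name, vec in embeddings.items()}


def spearman(a, b):
    """Spearman rank correlation for two score lists without ties."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])


# Hypothetical toy vectors standing in for text-derived community embeddings.
communities = {
    "community_a": np.array([1.0, 0.1, 0.0]),
    "community_b": np.array([0.6, 0.5, 0.1]),
    "community_c": np.array([0.1, 0.9, 0.0]),
    "community_d": np.array([0.0, 1.0, 0.2]),
}

# Use the two most extreme communities as seeds for the axis.
axis = partisan_axis(communities["community_a"], communities["community_d"])
text_scores = score_communities(communities, axis)
ranking = sorted(text_scores, key=text_scores.get)

# Hypothetical scores from an interaction-based method, in the same community order.
interaction_scores = [-0.8, -0.2, 0.3, 0.9]
rho = spearman(list(text_scores.values()), interaction_scores)

print(ranking)
print(rho)
```

Comparing the two rankings with a rank correlation mirrors the kind of check the abstract mentions, where the text-based ranking is compared against one inferred from user interactions. With real data the embeddings would come from a trained model rather than hand-written vectors.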

More information:
https://ceur-ws.org/Vol-3551/paper5.pdf

Published: 9 April 2024 | Papers