The Sustaining Peace (SP) Project on Understanding Linguistic Differences Across Peaceful and Non-Peaceful Societies

January 22, 2024

The Sustaining Peace (SP) Project focuses on understanding linguistic differences across peaceful and non-peaceful societies. Its interdisciplinary team has been working closely with the Data Science Institute to analyze large datasets and test whether speech in peaceful and non-peaceful societies differs in meaningful ways. Over the past three years, our results have consistently suggested that there are linguistic differences between peaceful and non-peaceful countries.

After a deep dive into our large dataset of over 900 million news articles across 20 countries, the SP Project worked with data science consultants to meticulously clean the dataset and prepare models that can be applied both to existing data and to new data as it arrives. After cleaning, the final dataset included 2 million articles and 1.35 billion words across the 20 countries.

This fall, the SP Project is turning to generative AI and beginning to experiment with ChatGPT. We are starting with small tests, such as asking ChatGPT to generate categories from the words that our data analysis identified as most important in determining whether speech was peaceful or non-peaceful. We plan to continue exploring how ChatGPT would categorize speech, and how it might respond when trained on new information about positive peace, prosocial behavior, and conflict resolution.

See our recent publication, "Word differences in news media of lower and higher peace countries revealed by natural language processing and machine learning," in PLOS ONE.

Learn more about the SP Project here or on our AC4 page!