Using Twitter to Examine the Relationship Between Social Media and Air Quality in American Cities


  • Yuhao "Ben" Pan
  • Dr. Timothy Leslie
  • Dr. Edward Oughton



Air pollution is one of the leading environmental risk factors for human health, and anthropogenic pollution contributes to around 100,000 deaths each year in the United States despite efforts to improve air quality. Prior research has suggested that social media can be an effective monitor of public health, and a 2015 study of Chinese social media showed a correlation between the volume of pollution-related messages and a city's air quality. This study tests the same hypothesis in the United States. The following air pollution-related keywords were chosen for their association with public health: “Air Pollution”, “Clean Air”, “Air Quality”, “Ground-level Ozone”, “PM2.5”, “PM10”, “Particulate Matter”, “Carbon Monoxide”, “Sulfur Dioxide”, and “Nitrogen Dioxide”. The 40 most populous metropolitan areas in the United States were selected, and tweets containing these keywords that were posted from these locations in 2021 were scraped from Twitter. A regression between the volume of tweets in each city and that city's median Air Quality Index for 2021 yielded an r² value of 0.24, indicating a weak correlation. Our results demonstrate the feasibility of analyzing the relationship between social media and air quality, but the dataset needs adjustment before social media can serve as a useful predictor of air quality in U.S. cities. Using additional search terms or locations to eliminate background noise may be a starting point for further research.
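
The regression step described above can be illustrated with a minimal sketch. This is not the authors' code: it assumes a simple ordinary least squares fit with one row per city, treats median AQI as the explanatory variable and tweet count as the response (the abstract does not state the direction), and uses made-up placeholder numbers rather than the study's data. The function name `linear_regression` is hypothetical.

```python
from statistics import mean


def linear_regression(x, y):
    """Ordinary least squares fit of y on x.

    Returns (slope, intercept, r_squared) for paired samples x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")

    x_bar, y_bar = mean(x), mean(y)
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain some variation")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # For a one-variable fit, r^2 is the squared Pearson correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Placeholder values only (NOT the study's data): one entry per city.
median_aqi = [40, 45, 50, 55, 60]
tweet_count = [120, 90, 150, 200, 170]

slope, intercept, r2 = linear_regression(median_aqi, tweet_count)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r^2={r2:.2f}")
```

Because r² in a one-variable fit is the squared correlation coefficient, it comes out the same whichever variable is treated as the response; only the slope and intercept depend on that choice.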





College of Science: Department of Geography and Geoinformation Science