Developing an Adaptive Social Media Data Collection System


  • Colin Berry, Aspiring Scientists' Summer Internship Program Intern
  • Ron Mahabir, Aspiring Scientists' Summer Internship Program Mentor



The rise in social media services and users over the past decade has given the scientific community a unique opportunity for data collection and analysis in support of a wide array of research topics. One such service, the microblogging system Twitter, currently boasts 186 million active users, 38 million of them in the United States alone. Every day, millions of people use Twitter to communicate with friends and family, to get their news, and to engage in public discourse. Because conversations within the Twittersphere are dynamic and can change at any time in response to events unfolding in the physical world or in other cyber-communities, it is important to be able to capture this changing information. In this paper, we develop an adaptive data collection system that automatically updates the trending topical keywords used to infer the emergence and decay of topics over time. This contrasts with the typical static approach, which collects data using keywords that do not change. Our system gives researchers a flexible way to follow and study the evolution of conversations taking place on Twitter, ultimately providing new insights into human behavior.
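The adaptive idea described above can be illustrated with a minimal Python sketch. It is not the system developed in this work: the function names (`tokenize`, `update_keywords`), the tiny stopword list, and the thresholds `max_new`, `min_count`, and `max_idle` are all assumptions chosen for illustration, and the sketch operates on post text that has already been collected rather than calling any Twitter API. After each batch, terms that frequently co-occur with the tracked keywords are promoted (topic emergence), and keywords that match nothing for several consecutive batches are dropped (topic decay).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for illustration).
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in",
             "is", "on", "for", "with", "at", "it"}


def tokenize(text):
    """Lowercase a post and return its word and hashtag tokens, minus stopwords."""
    tokens = re.findall(r"#?[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def update_keywords(keywords, posts, max_new=2, min_count=2, max_idle=2):
    """Return an updated {keyword: idle_batches} map after one batch of posts.

    keywords: dict mapping each tracked keyword to the number of consecutive
              batches in which it matched no post.
    posts:    iterable of post texts collected in the latest batch.
    """
    tracked = set(keywords)
    seen = set()          # tracked keywords that matched at least one post
    counts = Counter()    # co-occurring terms in matching posts

    for post in posts:
        tokens = set(tokenize(post))
        matched = tokens & tracked
        if not matched:
            continue      # post is unrelated to the tracked topics
        seen |= matched
        counts.update(tokens - tracked)

    # Decay: age keywords that matched nothing, drop those idle too long.
    updated = {}
    for kw, idle in keywords.items():
        idle = 0 if kw in seen else idle + 1
        if idle < max_idle:
            updated[kw] = idle

    # Emergence: promote the most frequent co-occurring terms.
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    candidates = [term for term, n in ranked if n >= min_count]
    for term in candidates[:max_new]:
        updated[term] = 0
    return updated
```

In a static approach, the keyword set passed to the collector would never change; here, the map returned by `update_keywords` would be fed back in as the keyword set for the next collection round. Note that this sketch treats a hashtag such as `#storm` and the plain word `storm` as different terms, which a real system might prefer to normalize.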





College of Science: Department of Computational and Data Sciences