Computer Systems Technology - Projects
Challenge Launch: Health in My Community
This project’s purpose is to establish and demonstrate a simple method for extracting one useful set of data from the immense stream of data available online. Using a web interface, this method allows useful epidemiological data to be extracted from a social media source, in this implementation Twitter, and returned in an easily understandable format.
This project demonstrates a new application of social media: detecting disease outbreaks in an area. Previous work has shown that social media data can be put to useful purposes, including epidemiology. However, that work has focused on predicting outbreaks of a single disease rather than covering a range of potential epidemics. General trend tracking does not surface disease-related information either, because it focuses on topics that have recently become popular.
Using the data gathered by the application, action could be taken to prevent widespread illness in cities, because the data would be retrieved in time to respond to any sudden trends. Epidemiological data gathered through social media can be received and analyzed almost instantly, which helps track diseases as they spread. Medical professionals could then be alerted to expect cases of diseases that are becoming more common, warnings could be issued to the general population, and a greater outbreak of the disease could be prevented.
The project requirements stated that the application must search for tweets posted within the past day in the area around the input location, and determine a frequency count for each disease mentioned.
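The "past day, within the area" requirement can be illustrated with a minimal sketch. It assumes tweets have already been retrieved and carry coordinates and a timestamp. The `TweetFilter` and `Tweet` classes, the radius parameter, and the local filtering approach are all hypothetical and used only for illustration; the original application may well have delegated this filtering to the search request itself.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

class TweetFilter {
    /** Hypothetical minimal tweet holder; not a type from any Twitter library. */
    static final class Tweet {
        final String text;
        final double lat;
        final double lon;
        final Instant postedAt;

        Tweet(String text, double lat, double lon, Instant postedAt) {
            this.text = text;
            this.lat = lat;
            this.lon = lon;
            this.postedAt = postedAt;
        }
    }

    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in kilometres, using the haversine formula. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Keeps tweets posted in the 24 hours before 'now' and within radiusKm of the given point. */
    static List<Tweet> recentNearby(List<Tweet> tweets, double lat, double lon,
                                    double radiusKm, Instant now) {
        Instant cutoff = now.minus(Duration.ofDays(1));
        List<Tweet> kept = new ArrayList<>();
        for (Tweet t : tweets) {
            boolean recent = !t.postedAt.isBefore(cutoff) && !t.postedAt.isAfter(now);
            boolean nearby = distanceKm(lat, lon, t.lat, t.lon) <= radiusKm;
            if (recent && nearby) {
                kept.add(t);
            }
        }
        return kept;
    }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the method, keeps the time window easy to test.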
The application is a web application made up of several Java Servlets. The first servlet generates a list of locations available for searching, from which the user selects one. The data for the chosen location is passed to a second servlet, which communicates with Twitter’s servers to retrieve the results of the required searches. The second servlet then analyzes the retrieved tweets for references to diseases, using a list of key phrases associated with each disease that is loaded from an external source. The final result of this analysis is presented to the user.
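The key-phrase analysis step can be sketched as follows. This is a minimal illustration, not the project's actual code: the `DiseaseCounter` class, the `countMentions` method, and the choice of case-insensitive substring matching are assumptions. The key-phrase map stands in for the list the application loads from an external source.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class DiseaseCounter {
    /**
     * For each disease, counts the tweets that contain at least one of its
     * key phrases. Matching is a case-insensitive substring test (an assumption).
     */
    static Map<String, Integer> countMentions(List<String> tweets,
                                              Map<String, List<String>> keyPhrases) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String disease : keyPhrases.keySet()) {
            counts.put(disease, 0); // report diseases with no mentions as zero
        }
        for (String tweet : tweets) {
            String text = tweet.toLowerCase(Locale.ROOT);
            for (Map.Entry<String, List<String>> entry : keyPhrases.entrySet()) {
                for (String phrase : entry.getValue()) {
                    if (text.contains(phrase.toLowerCase(Locale.ROOT))) {
                        counts.merge(entry.getKey(), 1, Integer::sum);
                        break; // count each tweet at most once per disease
                    }
                }
            }
        }
        return counts;
    }
}
```

Substring matching is simple but produces false positives (the phrase "flu" also matches "fluent"), so a real deployment would probably want word-boundary matching or more specific key phrases.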
With the application introduced in this project, epidemiologists will be able to determine the locations of disease outbreaks more efficiently and effectively as they occur, without having to rely on the illnesses being directly reported to them.