Shih-Fu Chang | Exploring Multimedia Recognition Tools in Big Data Applications
Advances in computer vision and the growth of digital photos and videos have created new opportunities to integrate content-recognition tools with mobile apps and large-scale systems. If you want more information about a building, product or bottle of wine, it’s now possible to search the Web with an image on your phone. New 3D sensors and search tools allow users to scan real-world objects and find matching models to make new products. Emerging multimedia-recognition tools are making it possible to track and summarize breaking news from streaming video and social media. This technology is also embedded in smart search engines that can mine video footage from sporting events, roads and security cameras to flag key events, from touchdowns to traffic accidents to criminal activity. I will give an overview of the novel technologies we are developing and discuss open issues.
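The search-with-an-image idea above boils down to nearest-neighbor matching over feature vectors. Below is a minimal sketch, assuming a toy index of named objects with hand-made three-dimensional feature vectors; a real system would extract high-dimensional features with a convolutional network, and the names and numbers here are purely illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def best_match(query, index):
    """Return the indexed object whose feature vector is closest
    to the query (nearest neighbor under cosine similarity)."""
    return max(index, key=lambda name: cosine(query, index[name]))

# Hypothetical toy index: object name -> feature vector.
index = {
    "eiffel_tower": [0.9, 0.1, 0.0],
    "wine_bottle":  [0.1, 0.8, 0.3],
    "sneaker":      [0.0, 0.2, 0.9],
}
query = [0.85, 0.15, 0.05]   # features extracted from the phone photo
print(best_match(query, index))  # → eiffel_tower
```

At Web scale, the exhaustive `max` over the index is replaced by approximate nearest-neighbor search, but the matching criterion is the same.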
Julia Hirschberg | Applications for Detecting Emotion in Text and Speech
Identifying the emotional content of written and spoken language is increasingly useful in business, medicine and security. Large data sets of text and speech, including social media, interviews and phone conversations, can be used to train systems to detect consumer reactions to products and services (and to flag ‘fake’ reviews), to diagnose medical conditions such as depression, and to identify deception in a wide variety of government, business and social service settings. Each application picks up on subtle cues that may indicate whether a speaker is angry, happy, disgusted, afraid, sad or surprised. Similar approaches have been used to distinguish among personality traits, and to infer how tired, drunk or bored someone might be.
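The cue-based idea can be illustrated with a deliberately simple lexicon classifier: score each emotion by how many of its cue words appear, and pick the best. The cue lists below are hypothetical; a trained system would learn weighted lexical, prosodic and acoustic cues from labeled data rather than rely on hand-picked keywords.

```python
# Hypothetical cue lexicon; a real system learns these weights from data.
CUES = {
    "angry":     {"furious", "outraged", "hate", "annoyed"},
    "happy":     {"delighted", "love", "great", "wonderful"},
    "sad":       {"miserable", "lonely", "crying", "grieving"},
    "afraid":    {"terrified", "scared", "worried", "panic"},
    "surprised": {"astonished", "unexpected", "wow", "suddenly"},
    "disgusted": {"gross", "revolting", "nauseating", "foul"},
}

def detect_emotion(text):
    """Score each emotion by how many of its cue words appear in the
    text; return the highest-scoring label, or 'neutral' if none match."""
    words = set(text.lower().split())
    scores = {label: len(words & cues) for label, cues in CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"

print(detect_emotion("I was terrified and worried the whole flight"))
# → afraid
```

The same scoring scaffold extends naturally to the personality and fatigue applications mentioned above by swapping in different cue sets.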
Kathy McKeown | Tracking Events Through Time: Objective and Personal Views
The chaos following Hurricane Sandy in 2012 brought home the need for a faster, more accurate way to filter the oceans of text streaming over social media and news sites during and after a crisis. We have been working on an automated method for monitoring and summarizing news as events unfold. Our method can flag new information as it becomes available, and generate updates. This can be extremely useful during emergencies as well as for tracking a wide variety of everyday events. In a related project, we’ve come up with a way to automatically identify the most compelling part of a personal narrative, what we call the “most reportable event.” I will discuss the natural language processing techniques that underlie this work, and future research directions.
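One core ingredient of update summarization is deciding whether an incoming sentence is genuinely new or repeats what has already been reported. Here is a minimal sketch of that novelty filter, using word-overlap (Jaccard) similarity against previously emitted updates; the threshold and the toy news feed are assumptions for illustration, not the method described in the abstract.

```python
def tokens(sentence):
    """Lowercased word set for a sentence."""
    return set(sentence.lower().split())

def stream_updates(sentences, threshold=0.5):
    """Emit a sentence as an update only if its Jaccard overlap with
    every previously emitted update stays below `threshold`."""
    emitted = []
    for s in sentences:
        novel = all(
            len(tokens(s) & tokens(e)) / len(tokens(s) | tokens(e)) < threshold
            for e in emitted
        )
        if novel:
            emitted.append(s)
    return emitted

feed = [
    "Storm makes landfall near the city",
    "Storm makes landfall near the city tonight",   # near-duplicate, dropped
    "Power outages reported across two boroughs",
]
for update in stream_updates(feed):
    print(update)
```

Run on the toy feed, the near-duplicate second sentence is suppressed and only the two genuinely distinct updates survive.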
Tian Zheng | Mapping Subpopulations within Big Networks
Estimating the size of stigmatized groups such as the homeless, people with HIV and commercial sex workers remains difficult, even in the digital age. Those belonging to marginalized subpopulations may be hard to reach by phone or online survey, or may simply prefer to keep sensitive personal information to themselves. Advances in network science are now allowing researchers to move past these obstacles to learn more about hard-to-reach demographic groups. My colleagues and I have developed a modeling framework to infer the size and other hidden features of subpopulations within a large study sample. Our method produces inferential results that are easy to interpret and relevant for visualizing, monitoring and understanding structures underlying large, complex networks.
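To make the size-estimation problem concrete, here is the classic network scale-up estimator, a simple ancestor of the richer modeling frameworks the abstract describes: respondents report how many people they know in the hidden group and how many people they know overall, and the fraction is scaled to the population. The survey numbers below are invented for illustration.

```python
def scale_up_estimate(known_hidden, degrees, population_size):
    """Classic network scale-up estimate of a hidden group's size:
        N_hidden ≈ N * (sum of hidden contacts) / (sum of degrees)
    i.e. the fraction of respondents' contacts in the hidden group,
    scaled to the general population.
    """
    return population_size * sum(known_hidden) / sum(degrees)

# Hypothetical survey of five respondents: hidden-group contacts each
# reports knowing, and each respondent's total network size (degree).
known_hidden = [2, 0, 1, 3, 0]
degrees      = [300, 250, 400, 500, 150]
N = 1_000_000   # size of the general population

print(round(scale_up_estimate(known_hidden, degrees, N)))  # → 3750
```

This simple ratio assumes everyone's network mirrors the population; the modeling frameworks described above exist precisely because that assumption fails for stigmatized groups, whose members are under-represented in typical respondents' networks.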