All in Artificial Intelligence

The Latent Archive

National Endowment for the Humanities Digital Humanities Advancement Grant. 2022

Previously, extracting information from moving images has been challenging and time-consuming, requiring historians and film scholars to manually review footage over and over to parse settings, rituals, camera angles, narratives, and the material cultures involved. Now, developments in computer vision and spatial analysis technologies have opened up exciting possibilities for these scholarly processes, with direct implications for improved public access and future translational tools for disabled communities. The “latent archive” that has always been embedded in moving images can now be captured via machine-enabled analysis: locating the urban or architectural setting, producing 3D spatial reconstructions, and allowing fine-grained examination of point-of-view and shot sequence.
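As an illustration of the simplest kind of machine-enabled shot-sequence analysis mentioned above (this is a generic sketch, not the Latent Archive pipeline itself; the function name and threshold are assumptions), a cut detector can flag a shot boundary wherever the mean pixel difference between consecutive frames spikes:

```python
import numpy as np

def detect_cuts(frames, threshold=0.3):
    """Return the indices i where frames[i] begins a new shot.

    A boundary is declared when the mean absolute pixel difference
    between consecutive frames exceeds `threshold` * 255.
    """
    cuts = []
    for i in range(1, len(frames)):
        diff = np.mean(np.abs(frames[i].astype(float) - frames[i - 1].astype(float)))
        if diff > threshold * 255:
            cuts.append(i)
    return cuts

# Synthetic "footage": ten dark frames followed by ten bright frames.
dark = [np.zeros((4, 4), dtype=np.uint8)] * 10
bright = [np.full((4, 4), 200, dtype=np.uint8)] * 10
print(detect_cuts(dark + bright))  # the single cut falls at frame 10
```

Real systems use more robust signals (color histograms, learned features), but the principle of segmenting footage before higher-level analysis is the same.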

Towards Adversarial Architecture

ACADIA Conference, 2022. Vanguard Paper Award Runner-Up

A key technological weakness of artificial intelligence (AI) is adversarial images: constructed image noise that, when added to an image, can manipulate machine learning algorithms yet remains imperceptible to humans. Over the past several years, we developed Adversarial Architecture, a scalable systems approach for designing adversarial surfaces on physical objects that manipulate machine learning algorithms.
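To make the mechanism concrete, here is a minimal sketch of the fast gradient sign method (FGSM), one common way such adversarial noise is constructed. A tiny logistic classifier stands in for a full neural network; all names and the epsilon budget are illustrative assumptions, not part of the Adversarial Architecture system:

```python
import numpy as np

def predict(w, b, x):
    """Sigmoid probability that input x belongs to class 1."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def fgsm_perturb(w, b, x, y, eps):
    """Add eps-bounded noise in the direction that increases the loss.

    For a logistic model, the gradient of the cross-entropy loss
    with respect to the input x is (p - y) * w.
    """
    p = predict(w, b, x)
    grad_x = (p - y) * w
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

rng = np.random.default_rng(0)
w = rng.normal(size=64)                       # classifier weights
b = 0.0
x = rng.uniform(size=64)                      # a "clean image" as a flat pixel vector
y = 1.0 if predict(w, b, x) > 0.5 else 0.0    # the model's own label

x_adv = fgsm_perturb(w, b, x, y, eps=0.05)
# Every pixel moves by at most eps (imperceptible in the image setting),
# yet the model's confidence in its original label drops.
print(np.max(np.abs(x_adv - x)))
print(predict(w, b, x), predict(w, b, x_adv))
```

The same idea scales to deep networks by backpropagating the loss gradient to the input pixels; physical adversarial surfaces additionally have to survive changes in viewpoint, lighting, and printing.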

Supersense AI

Supersense is a mobile assistive-technology application for visually impaired and blind (VI&B) people. Mobile artificial intelligence systems present a significant opportunity to improve the lives of more than 300 million VI&B people by reducing everyday challenges such as navigation, orientation and mobility (O&M), and reading. Accessibility-enabled applications often yield poor, overwhelming experiences for VI&B users, who have different needs and drastically different ways of interacting with mobile technology. In making Supersense, we adopted a user-centered design process and introduced several guidelines for designing accessibility-first applications. The project is funded by the MIT DesignX program, the National Science Foundation, and the Department of Veterans Affairs, and has reached more than one hundred thousand blind users globally.

NavigAid: AI-Driven Orientation and Mobility System for the Blind

National Science Foundation SBIR Phase II Project. 2020

NavigAid is an AI-driven mobile Orientation and Mobility (O&M) system that provides contextually relevant, task-driven solutions to problems such as finding objects, identifying paths of ingress and egress, and understanding the layout of an environment. NavigAid is enabled by our core technical innovation, Ally Networks, a novel neural network architecture that extracts semantically and functionally relevant spatial features from images, supporting a human-like understanding of physical environments.

Reasonable Perception: Connecting Vision and Language Systems

Human-Robot Interaction (HRI) 2018. ACM.

Understanding explanations of machine perception is an important step towards developing accountable, trustworthy machines. Speech and vision are the primary modalities by which humans collect information about the world, yet linking the visual and natural-language domains is a relatively new pursuit in computer vision, and performance is difficult to test in a safe environment. To couple human visual understanding with machine perception, we present an explanatory system for creating a library of possible context-specific actions associated with 3D objects in immersive virtual worlds.