Dr. Cagri Hakan Zaman is a multidisciplinary researcher, educator, and entrepreneur. He currently serves as a Senior AI Researcher at Samsung Research America's Samsung Design Innovation Center and supervises research at Mediate Labs, an MIT-spinoff innovation lab. He is the founder and former director of the Virtual Experience Design Lab at the Massachusetts Institute of Technology (MIT). His research focuses on the development of cognitive and sensory enhancement technologies using immersive media and artificial intelligence. Dr. Zaman's innovative approach to spatial computing is showcased in his dissertation, "Spatial Experience in Humans and Machines." With extensive research experience in embodied intelligence, computational design, and immersive media, he has previously conducted research at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and the MIT Design Lab.
A recipient of the MIT DesignX challenge grant in 2017, Dr. Zaman founded Mediate, a Somerville-based research and innovation laboratory that develops AI and XR solutions to empower people in physical spaces. His project Supersense, an AI-powered mobile application for blind and visually impaired individuals, is recognized among the leading assistive technology solutions and has been supported by the National Science Foundation and the US Department of Veterans Affairs.
HUMAN EXPERIENCE
IMMERSIVE MEDIA
AI & ACCESSIBILITY
ACM Conference on Human Factors in Computing Systems (CHI) 2023.
Piano learning applications in Mixed Reality (MR) are a promising substitute for in-person instruction when a piano teacher is not available. Existing applications that highlight the notes to be played with visual indicators on the keyboard, or that project video of a pianist, provide minimal guidance on how the learner should execute hand movements to develop proper technique and prevent injury. To address this gap, we developed an immersive first-person piano learning experience that uses a library of targeted visualizations of the teacher's hands and 3D traces of hand movements in MR.
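Purely as an illustration (the system described above relies on the MR headset's own hand tracking), the sketch below records 3D hand-landmark traces from a recorded video with MediaPipe Hands; the video path is a placeholder. It shows the kind of per-frame data from which hand-movement visualizations can be built.

# Illustrative sketch: capture 3D hand-landmark traces from a recorded video
# with MediaPipe Hands. The MR system described above uses headset hand tracking;
# this only shows what a "trace" of hand movement can look like as data.
import cv2
import mediapipe as mp

def record_hand_traces(video_path):
    traces = []  # one list of 21 (x, y, z) landmark tuples per detected hand per frame
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.hands.Hands(max_num_hands=2) as hands:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.multi_hand_landmarks:
                for hand in result.multi_hand_landmarks:
                    traces.append([(lm.x, lm.y, lm.z) for lm in hand.landmark])
    cap.release()
    return traces

# traces = record_hand_traces("teacher_demo.mp4")  # placeholder path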
National Endowment for the Humanities Digital Humanities Advancement Grant. 2022
Extracting information from moving images has historically been challenging and time-consuming, requiring historians and film scholars to review footage repeatedly to parse the setting, rituals, camera angles, narratives, and material cultures involved. Developments in computer vision and spatial analysis technologies have now opened up exciting possibilities for these scholarly processes, with direct implications for improved public access and future translational tools for disabled communities. The "latent archive" that has always been embedded in moving images can now be captured via machine-enabled analysis: locating the urban or architectural setting, producing 3D spatial reconstructions, and allowing fine-grained examination of point-of-view and shot sequence.
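As a minimal sketch of one such machine-enabled pass over footage, the example below detects candidate shot boundaries by comparing frame-to-frame histograms with OpenCV; the file path and threshold are placeholders, and the project's actual spatial-reconstruction pipeline is far more involved.

# Illustrative sketch: flag candidate shot boundaries in archival footage by
# thresholding the correlation between consecutive frame histograms (OpenCV).
import cv2

def shot_boundaries(path, threshold=0.5):
    cap = cv2.VideoCapture(path)
    boundaries, prev_hist, index = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        if prev_hist is not None:
            # Low correlation between consecutive histograms suggests a cut.
            if cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL) < threshold:
                boundaries.append(index)
        prev_hist, index = hist, index + 1
    cap.release()
    return boundaries

# print(shot_boundaries("archival_reel.mp4"))  # placeholder path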
ACADIA Conference, 2022. Vanguard Paper Award Runner-Up
A key technological weakness of artificial intelligence (AI) is adversarial images: constructed noise added to an image that can manipulate machine learning algorithms yet remains imperceptible to humans. Over the past years, we have developed Adversarial Architecture, a scalable systems approach to designing adversarial surfaces for physical objects that manipulate machine learning algorithms.
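To illustrate the underlying phenomenon (not the Adversarial Architecture system itself), the sketch below applies the standard fast gradient sign method (FGSM) to perturb an image just enough to change a classifier's prediction while staying visually unchanged; the pretrained model and epsilon value are placeholder choices.

# Illustrative FGSM sketch; assumes a pretrained classifier, not the project's method.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
loss_fn = torch.nn.CrossEntropyLoss()

def fgsm_perturb(image, label, epsilon=0.01):
    # Nudge the image in the gradient-sign direction: visually indistinguishable
    # from the original, yet often enough to flip the model's prediction.
    image = image.clone().detach().requires_grad_(True)
    loss = loss_fn(model(image), label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# Usage: x is a 1x3x224x224 image tensor scaled to [0, 1], y its true class index.
# x_adv = fgsm_perturb(x, y)
# model(x_adv).argmax(dim=1) often differs from model(x).argmax(dim=1).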
Virtual Experience Design Lab, 2017
September 1955 is an 8-minute virtual-reality documentary of the Istanbul Pogrom, a government-initiated, organized attack on the minorities of Istanbul on September 6-7, 1955. This interactive installation places the viewer in a reconstructed photography studio in the midst of the pogrom, allowing one to witness the events from the perspective of a local shop owner.
Supersense is a mobile assistive technology application for visually impaired and blind (VI&B) people. Mobile artificial intelligence systems present a significant opportunity to improve the lives of more than 300 million VI&B individuals by reducing everyday challenges such as navigation, orientation and mobility (O&M), and reading. Accessibility-enabled applications often yield poor and overwhelming experiences for VI&B users, who have different needs and drastically different ways of interacting with mobile technology. In making Supersense, we adopted a user-centered design process and introduced several guidelines for designing accessibility-first applications. The project is funded by the MIT DesignX program, the National Science Foundation, and the US Department of Veterans Affairs. It has reached more than one hundred thousand blind users globally.
PhD Dissertation. Massachusetts Institute of Technology. 2020
Spatial experience is the process by which we locate ourselves within our environment, and understand and interact with it. Understanding spatial experience has been a major endeavor within the social sciences, the arts, and architecture throughout history, giving rise to recent theories of embodied and enacted cognition. It has also been a pursuit of computer science. However, despite substantial advances in artificial intelligence and computer vision, there has yet to be a computational model of human spatial experience. What are the computations involved in human spatial experience? Can we develop machines that can describe and represent spatial experience? In this dissertation, I take a step towards developing a computational account of human spatial experience and outline the steps for developing machine spatial experience.
National Science Foundation SBIR Phase II Project. 2020
NavigAid is an AI-driven mobile Orientation and Mobility (O&M) system that provides contextually relevant, task-driven solutions to problems such as finding objects, identifying paths of ingress and egress, and understanding the layout of an environment. NavigAid is enabled by our core technical innovation, Ally Networks, a novel neural network architecture capable of extracting semantically and functionally relevant spatial features from images, which help create a human-like understanding of physical environments.
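The sketch below is a generic stand-in, not the Ally Networks architecture (which is not described here): it only illustrates the idea of extracting two kinds of spatial features, semantic and functional, from a single image with a shared encoder and separate heads; all layer sizes are arbitrary.

# Generic illustration of a shared encoder with semantic and functional heads.
# This is NOT Ally Networks; names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torchvision.models as models

class TwoHeadSpatialNet(nn.Module):
    def __init__(self, num_semantic=20, num_functional=10):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop classifier
        self.semantic_head = nn.Linear(512, num_semantic)      # e.g. "what is here"
        self.functional_head = nn.Linear(512, num_functional)  # e.g. "what can be done here"

    def forward(self, image):
        features = self.encoder(image).flatten(1)
        return self.semantic_head(features), self.functional_head(features)

# sem, fun = TwoHeadSpatialNet()(torch.randn(1, 3, 224, 224))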
Human-Robot Interaction (HRI) 2018. ACM.
Understanding explanations of machine perception is an important step towards developing accountable, trustworthy machines. Speech and vision are the primary modalities by which humans collect information about the world, but linking the visual and natural-language domains is a relatively new pursuit in computer vision, and it is difficult to test performance in a safe environment. To couple human visual understanding and machine perception, we present an explanatory system for creating a library of possible context-specific actions associated with 3D objects in immersive virtual worlds.
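A hedged sketch of the kind of data structure such a library implies: 3D objects mapped to context-specific actions, each paired with the explanation the system could surface. All object names, contexts, and explanations below are hypothetical examples, not entries from the published system.

# Hypothetical sketch of a library of context-specific actions for 3D objects.
from dataclasses import dataclass

@dataclass
class ActionExplanation:
    action: str        # action an agent could take on the object
    context: str       # context in which the action is applicable
    explanation: str   # natural-language rationale shown to the human

action_library = {
    "door_handle": [
        ActionExplanation("grasp_and_turn", "door_closed",
                          "The handle affords turning, which unlatches the door."),
        ActionExplanation("push", "door_ajar",
                          "The door is already unlatched, so pushing opens it fully."),
    ],
    "mug": [
        ActionExplanation("lift_by_handle", "mug_on_table",
                          "Lifting by the handle avoids contact with hot surfaces."),
    ],
}

def explain_actions(object_id, context):
    # Return explanations for actions applicable to this object in this context.
    return [a for a in action_library.get(object_id, []) if a.context == context]

# print(explain_actions("door_handle", "door_closed"))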
Exhibition at the Museo Egizio, Turin, Italy. 2019-2021
The purpose of the “Invisible Archeology” exhibition was to illustrate the principles, tools, examples, and results of the meticulous work of recomposing information, data, and knowledge made possible today by applying scientific methods to other disciplines and, in particular, to the study of artifacts. What can an object tell of itself? Our senses give us basic information about it, such as its appearance, size, shape, and colour, even the traces that humans, nature, or time have impressed on it. Yet all this is not enough to reveal its whole story and life cycle.