🅜🅐🅡🅒 Music and Acoustics Research Centre

for the mathematics, sciences, medicine and technologies of music and sound

Join us on LinkedIn or subscribe to our Events mailing list
Seminar

Iran Roman, Lecturer at QMUL, on Advancing Multimodal Machine Perception Through Neural Dynamics and Music AI

Fri 6th FEB 2026, 11.30am, KCL Strand Campus, STRAND BLDG S2.49 

Iran Roman will give a talk at the KCL Strand Campus as part of the MARC Seminar Series, with refreshments provided courtesy of the NMES Research Culture Fund.

Title: Advancing Multimodal Machine Perception Through Neural Dynamics and Music AI

If you are unable to attend in person, you can join the event virtually via the following MS Teams link: MARC Seminar Talk – Iran Roman | Meeting-Join | Microsoft Teams

Abstract

Iran Roman's research integrates three complementary perspectives: multimodal machine perception, theoretical neuroscience, and music AI. In multimodal perception, he develops systems that process audio, visual, and spatial information to understand complex environments, from egocentric action recognition in augmented reality to spatial sound event localization. This work addresses fundamental challenges in cross-modal learning and real-time scene understanding. In theoretical neuroscience, he investigates how neural dynamics explain periodic behaviors, proposing that brain–body systems physically embody temporal structure through resonance and dynamical coupling rather than through predictive models. This framework reveals how spontaneous motor tempo affects synchronization and how delayed feedback shapes anticipatory behavior in rhythmic coordination. In music AI, he probes the perceptual and reasoning capabilities of large language models, revealing persistent gaps between symbolic and acoustic understanding. He evaluates fundamental music perception skills and develops benchmark datasets that expose limitations in current systems’ ability to truly “hear” rather than merely read music. Together, these perspectives advance understanding of intelligent systems that perceive, reason about, and interact with temporal multimedia information.

Speaker

Iran R. Roman is a Lecturer at the School of Electronic Engineering and Computer Science of Queen Mary University of London. Within Queen Mary, he is a member of the Center for Multimodal AI, Center for Digital Music, Center for Human-Centered Computing, the Computer Vision group, and the Cognitive Science group. His research area is machine perception, with the goal of creating algorithms that allow computers to perceive environments as living agents do. To this end, Iran has developed algorithms that leverage multimodal signals to sense, identify, and track objects in the real world. These algorithms draw inspiration from the neural mechanisms that allow living organisms to carry out similar tasks. Iran’s work has found applications in products at companies such as Apple, Tesla, Raytheon/BBN, and Plantronics. His research has been funded by the US National Science Foundation (NSF), the US Defense Advanced Research Projects Agency (DARPA), and the Howard Hughes Medical Institute (HHMI).