
The MIx Group @ University of Birmingham



Machine Intelligence + X

We are the Machine Intelligence + X (MIx) group at the School of Computer Science, University of Birmingham. Welcome!
Our group mainly studies machine learning and computer vision, and we are also interested in other applied machine learning problems involving multimodal data, neuroscience, healthcare, physics, and chemistry, to name a few. That is where the X comes in.

Key research interests:
  • Learning representations with limited human supervision, e.g. self-/semi-/weakly-supervised learning
  • Multimodal data processing and analysis, e.g. vision-language, vision-audio, etc.
  • Open-world problems, e.g. incremental learning, open-vocabulary visual understanding
  • Visual semantics understanding, e.g. semantic segmentation, saliency modelling
  • 3D problems, e.g. depth estimation, multi-view geometry, 3D generation
  • Healthcare, e.g. medical image understanding and analysis, explainable AI for healthcare
  • AI for science, including neuroscience, physics and chemistry

News

We are organising the BinEgo-360 Workshop and Challenge at ICCV 2025 in Hawaii. Please participate in the Challenge for a chance to win various awards, and join us to present your papers!
  • Nov 2025: Two papers (a new 3D Superquadric Splatting, and a new approach for training-free open-world learning) have been accepted to WACV 2026, congrats to the students and co-authors!
  • Oct 2025: Our paper about What Time Tells Us has been published at TMLR, congrats to the 0th-year PhD Dongheng!
  • Oct 2025: Congrats to Dongheng on receiving the NeurIPS 2025 Scholar Award!
  • Sep 2025: One paper about Zero-Shot Video Anomaly Analysis has been accepted to NeurIPS 2025, congrats to Dongheng and the co-authors!
  • Jul 2025: We have 4 papers accepted to BMVC 2025, congrats to the students!
  • Jun 2025: Our paper about dynamic 3D scene reconstruction is accepted to ICCV 2025, congrats to the co-authors!
  • May 2025: Our BinEgo-360 Workshop is accepted to ICCV 2025; we invite paper presentations and challenge participation!
  • Mar 2025: Our paper about incremental learning is accepted to CVPR 2025, congrats to the co-authors!
  • Feb 2025: Congratulations to Kangning on successfully defending her viva!
  • Feb 2025: Our paper about molecular structure prediction is accepted to The Journal of Physical Chemistry, congrats to Wenjin!
  • Jan 2025: Our paper about open-world learning is accepted to ICLR 2025, congrats to Qiming!
  • Nov 2024: Very glad to receive the Best Paper Award at MICAD 2024! Congrats to Kangning and the team!
  • Nov 2024: Great thanks to Meta Project Aria for their in-kind contributions; honoured to be an Academic Partner!
  • Oct 2024: Very glad to receive the Best Paper Award and the Best Presentation (Runner-Up) Award at the MICCAI 2024 ASMUS Workshop! Congrats to Kangning and the team!
  • Sep 2024: Two papers (1 Spotlight) accepted to NeurIPS 2024, and one paper accepted to ACCV 2024, congrats to all the co-authors!
  • Sep 2024: Welcome to the new PhD students Isaac, Hao, Peixi, and Haotian joining the group!
  • Jul 2024: The paper "Show from Tell" is now published in Scientific Reports (Nature Portfolio)! Please check it out here: https://rdcu.be/dNcmb :)
  • Jul 2024: Three papers accepted to ECCV 2024, and two papers accepted to the MICCAI Workshop and ACM MM 2024, congrats to all the co-authors!
  • Mar 2024: Very grateful to be awarded the Amazon Research Award!
  • Feb 2024: Two papers (the 360+x (Oral) multi-modal holistic scene understanding dataset, and the DyMvHuman dynamic multi-view dataset) are accepted to CVPR 2024, and another two papers are accepted to ISBI 2024 (Oral) and T-IP. Congrats to all the co-authors!
  • Oct 2023: Two papers (1 Oral, 1 Poster) are accepted to WACV 2024, congrats to all the co-authors!
  • Sep 2023: Grateful to be awarded the Royal Society Short Industry Fellowship!
  • Aug 2023: One paper is accepted to IJCV. Congrats to all the co-authors!
  • Jul 2023: Four papers are accepted to ICCV 2023. Congrats to all the co-authors (esp. the MSc students Hao and Chenyuan)!
  • Apr 2023: Grateful to receive the International Exchanges Grant from The Royal Society!
  • Apr 2023: Two papers are accepted to CVPR 2023 Workshops (Foundation Model, and Sight and Sound), about self-supervised multi-modal (video-text-audio) representation learning.
  • Mar 2023: Two papers are accepted to ICLR 2023 workshops (TML4H and Neural Fields), about medical video quality assessment and neural representations in low-level vision. Congrats to Jong (PhD) and Wentian (MSc)!
  • Oct 2022: Very glad to receive the Best Paper Award at the ECCV 2022 Workshop on Medical Computer Vision! Congrats to the PULSENet Team!
  • Sep 2022: One paper is accepted to NeurIPS 2022, about continual learning.
  • Aug 2022: One paper is accepted to ECCV 2022 Workshop (ECCV-MCV), about anatomy-aware contrastive medical representation learning.
  • Feb 2022: Birthday of the MIx group @ the University of Birmingham!







Contact and Join Us

Contact

E-mail: mix.group.uk@gmail.com

Join us

We are always looking for people with strong self-motivation, unusual creativity, and a passion for hard problems! If you share the same interests and passion, please send your CV together with a short description (2 – 3 sentences) of your research interests to the above email address, with the keywords “[PhD/Postdoc/RA/Visitor/Collaboration application]” in your email subject.

Prospective PhD students

Please apply via the University application system here, and mention the PI’s name in your application.


Datasets

TOC Dataset

TMLR. Dataset link: https://www.kaggle.com/datasets/011af1d77cea3112779e0ea0139debab55141b1dd93d0c2524cfc68ec5be774d

The Time-Oriented Collection (TOC) dataset introduces a new benchmark about time, consisting of high-quality images sourced from social media with reliable image metadata. We collected 117,815 training samples and 13,091 test samples from the Cross-View Time dataset, mitigating various limitations of previous datasets. The dataset reflects real-world scenarios and human activities, making time-of-day estimation more applicable to practical applications. For more details, please refer to the paper.

Atypical Video Dataset

BMVC. Dataset link: https://huggingface.


Projects

PCo3D: Physically Plausible Controllable 3D Generative Models

Amazon Research Award, PI, with Aleš Leonardis

Generative AI has shown remarkable performance across various content-generation applications, showcasing its potential in both academic research and industrial settings. While its effectiveness in generating images and videos is well established, a notable gap remains in 3D content creation, particularly in the consideration of physical properties during the generation process. Another gap is the controllability of such physics-aware generation.


Publications

*Equal contribution

What Time Tells Us? An Explorative Study of Time-Awareness Learned from Static Images
Dongheng Lin*, Han Hu*, Jianbo Jiao
Transactions on Machine Learning Research (TMLR), 2025
[PDF] [BibTeX] [arXiv] [Short Video Intro] [Project Page] [Dataset]

A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis
Dongheng Lin, Mengxue Qu, Kunyang Han, Jianbo Jiao, Xiaojie Jin, Yunchao Wei
Annual Conference on Neural Information Processing Systems (NeurIPS), 2025
[PDF] [BibTeX] [arXiv] [Short Video Intro] [Project Page]

Exploring Image Representation with Decoupled Classical Visual Descriptors
Chenyuan Qu, Hao Chen, Jianbo Jiao
British Machine Vision Conference (BMVC), 2025
[PDF] [BibTeX] [arXiv] [Short Video Intro] [Project Page]

What Can We Learn from Harry Potter?


Team Members

Jianbo Jiao
Principal Investigator
Jianbo is an Associate Professor in the School of Computer Science at the University of Birmingham, a Fellow of the HEA, and a former Royal Society Short Industry Fellow.

Isaac Akintaro
PhD Student (2024 -), MI v
Isaac's research focuses on Visual Reasoning and 3D generation. He completed an MSc as a Google DeepMind Scholar.

Hao Ai
PhD Student (2024 -), MI v
Hao's research focuses on scene understanding, scene generation, and 3D geometry estimation, especially in real-world scenarios.
