Soumya Banerjee
Assistant Research Professor
University of Cambridge
Summary
I am an Assistant Research Professor at the University of Cambridge working on explainable
and trustworthy AI with applications in healthcare and computational biology. My work
spans machine learning, federated and privacy-preserving analysis, complex systems,
and reproducible research. I teach and supervise across undergraduate and postgraduate
programmes and develop openly available teaching materials.
Research interests
- Explainable & trustworthy AI
- Machine learning for healthcare and electronic health records
- Federated & privacy-preserving analysis (DataSHIELD)
- Computational & systems biology
- Complex systems, multi-scale simulation
- Reproducible research and scientific software
Recent positions
- Assistant Research Professor, University of Cambridge (2025–Present)
  - Explainable AI techniques applied to healthcare; teaching and supervision.
- Senior Research Fellow & Affiliated Lecturer, University of Cambridge (2022–Present)
  - Explainable AI techniques applied to healthcare; teaching and supervision.
- Postdoctoral Researcher, University of Cambridge (2019–2022)
  - Machine learning and data science on electronic health records; published in npj Schizophrenia.
- Postdoctoral Researcher, University of Oxford (2016–2018)
- Researcher, CSIRO, Australia (2015–2016)
- Postdoctoral Research Fellow, Harvard Medical School & Broad Institute (2014–2015)
- Postdoctoral Research Fellow, Max Planck Institute for Molecular Physiology (2013–2014)
Education
- PhD in Computer Science, University of New Mexico, USA (2013)
- B.E. (Computer Science) with Distinction, Nagpur University, India (2003)
Teaching & supervision
Fellow of the Higher Education Academy (Advance HE). I teach introductory machine learning, reproducible research, and data visualisation. I have supervised MPhil and PhD students and developed openly available course materials.
Teaching materials
A course that I developed
Selected publications
- Banerjee, S., Alsop, P., Jones, L., Cardinal, R. (2022). Patient and public involvement to build trust in artificial intelligence: a framework, tools and case studies. Patterns, 3(6):100506.
- Banerjee, S., Lio, P., Jones, P., Cardinal, R. (2021). A class-contrastive human-interpretable machine learning approach to predict mortality in severe mental illness. npj Schizophrenia, 7:60.
- Aschenbrenner, D., Quaranta, M., Banerjee, S., et al. (2020). Deconvolution of monocyte responses in inflammatory bowel disease. Gut.
- Mallick, H., Franzosa, E., McIver, L., Banerjee, S., et al. (2019). Predictive metabolomic profiling of microbial communities. Nature Communications, 10:3136.
- Banerjee, S., Chapman, S.J. (2018). Influence of correlated antigen presentation on T cell negative selection in the thymus. Journal of the Royal Society Interface, 15(148):20180311.
For a complete publication list, see my Google Scholar profile.
Skills & tools
- Programming: Python, R, MATLAB, UNIX shell, C/C++, Perl, Haskell
- Databases: MS SQL Server, Sybase
- Image analysis: ImageJ, CellProfiler
- R packages: dsSurvival, dsSurvivalClient
Grants & invited talks (selected)
- OpenAI Researcher Access Program (API credits), Apr 2024
- AI@CAM Pilot Grants, Co-investigator, £150,000, Feb 2024
- Invited talk: Responsible AI and involving patients in AI model building, Nokia Bell Labs, Cambridge, Feb 2025