ChatGPT-Like AI Model Details 1,300 Regions/Subregions in Mouse Brain Map – Genetic Engineering and Biotechnology News
AI-produced rendering of mouse brain regionalization overlaid with network motifs, symbolizing the fusion of artificial intelligence and neuroanatomical discovery. [University of California, San Francisco]
In a powerful fusion of AI and neuroscience, researchers at the University of California, San Francisco (UCSF) and at the Allen Institute have created one of the most detailed maps of the mouse brain to date, outlining 1,300 regions/subregions. At the heart of the breakthrough is CellTransformer, a powerful AI model that can automatically identify important subregions of the brain from massive spatial transcriptomics datasets. The new map includes previously uncharted subregions of the brain, opening new avenues for neuroscience exploration.
The findings offer an unprecedented level of detail and advance understanding of the brain by allowing researchers to link specific functions, behaviors, and disease states to smaller, more precise cellular regions, providing a roadmap for new hypotheses and research to explore the roles these areas play.
“Our model is built on the same powerful technology as AI tools like ChatGPT,” explained Reza Abbasi-Asl, PhD, associate professor of neurology and bioengineering at UCSF. “Both are built on a ‘transformer’ framework which excels at understanding context. While transformers are often applied to analyze the relationship between words in a sentence, we use CellTransformer to analyze the relationship between cells that are nearby in space. It learns to predict a cell’s molecular features based on its local neighborhood, allowing it to build up a detailed map of the overall tissue organization.”
Abbasi-Asl is senior and corresponding author of the team’s published paper in Nature Communications, titled “Data-driven fine-grained region discovery in the mouse brain with transformers.” In their paper the researchers stated, “CellTransformer advances the state of the art for automated domain detection by facilitating the identification of granular and biologically relevant spatial domains that are extensible to very large, multi-animal spatial transcriptomic datasets.”
Spatial transcriptomics offers what the researchers suggest are “unique opportunities to define the spatial organization of tissues and organs, such as the mouse brain.” But while spatial transcriptomics reveals where certain brain cell types are positioned in the brain, it does not reveal regions of the brain based on their composition.
![Three-dimensional representation of region/subregion in mouse brain map created by CellTransformer. Fewer regions are generated for visual clarity/simplicity [University of California, San Francisco]](https://www.genengnews.com/wp-content/uploads/2025/10/01-Low-Res_3D-Map-White-Background-300x169.jpeg)
CellTransformer allows scientists to define brain regions and subdivisions based on calculations of shared cellular neighborhoods, much like sketching a city’s borders based on the types of buildings within it. “Our objective was to develop a tool that would operationally identify plausible brain structures and substructures in a data-driven way and to satisfy a neuroanatomical convention for discrete domains,” the team noted in their paper.
The methodology is centered on a novel deep learning model that is built on a transformer-based encoder-decoder architecture, Abbasi-Asl explained. “The model takes spatial transcriptomics data and learns a rich representation of the tissue through a self-supervised process. It essentially learns to predict a cell’s molecular features based on the context of its surrounding cellular neighborhood.”
This allows the model to hierarchically build an understanding of tissue structure, from local patterns at the single-cell and molecular levels to large-scale tissue domains, Abbasi-Asl commented. “This model is capable of extracting descriptive information from high-dimensional data that is otherwise difficult for humans to interpret. The extracted information is then paired with GPU-accelerated clustering algorithms to group cells into distinct regions across entire multi-million cell datasets.”
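As a rough illustration of the clustering step described here, the Python sketch below groups a handful of toy two-dimensional "embeddings" into regions with a plain k-means loop. The article does not name the clustering algorithm, so k-means, the CPU-only implementation, the function names, and the toy values are all assumptions made for illustration and are not the authors' GPU workflow.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20):
    """Group points into k clusters with a basic k-means loop.

    Initialization is deterministic (the first k points become the
    starting centroids), an illustrative choice that keeps the toy
    example reproducible.
    """
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical embeddings for five cells; real embeddings would be
# high-dimensional and number in the millions.
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2)]
labels, centroids = kmeans(embeddings, k=2)
print(labels)
```

In this toy case, cells with similar embeddings receive the same label, which is the sense in which a cluster label can stand in for a candidate region.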
![Examples from the 1,300 regions/subregions in the mouse brain map created by CellTransformer. [University of California, San Francisco]](https://www.genengnews.com/wp-content/uploads/2025/10/Low-Res_1300-Regions-300x169.jpg)
The model is like ChatGPT in its core architecture, Abbasi-Asl further explained. “Both are built on a ‘transformer’ framework, which excels at understanding context. While ChatGPT learns the relationships between words in a sentence, CellTransformer learns the ‘language’ of cells by analyzing the relationships between a central cell and its neighbors.” Both use a self-supervised approach, meaning they learn these complex patterns directly from the data itself without needing every single cell to be manually labeled. This allows them to uncover important relationships in the data that would be otherwise difficult for a human to identify, whether that data is text or biological tissue.
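The self-supervised task described here, hiding a central cell and predicting its molecular features from its neighbors, can be sketched in a few lines of Python. The example below is a minimal sketch that swaps in a mean-of-neighbors baseline for the transformer, so it shows only the shape of the task and not the model itself. The function names, the fixed radius, and the toy coordinates and expression values are all hypothetical.

```python
import math


def neighbors_within(coords, center_idx, radius):
    """Indices of cells within `radius` of the center cell, excluding the center."""
    cx, cy = coords[center_idx]
    return [
        i
        for i, (x, y) in enumerate(coords)
        if i != center_idx and math.hypot(x - cx, y - cy) <= radius
    ]


def predict_from_neighborhood(coords, expression, center_idx, radius):
    """Predict the hidden center cell's expression as the mean of its neighbors.

    This averaging baseline is an assumption standing in for a learned
    model. Returns None when the cell has no neighbors in range.
    """
    idx = neighbors_within(coords, center_idx, radius)
    if not idx:
        return None
    n_genes = len(expression[0])
    return [sum(expression[i][g] for i in idx) / len(idx) for g in range(n_genes)]


def masked_error(coords, expression, center_idx, radius):
    """Mean squared error between the prediction and the true center cell."""
    pred = predict_from_neighborhood(coords, expression, center_idx, radius)
    if pred is None:
        return None
    truth = expression[center_idx]
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)


# Toy data: four cells with (x, y) positions and two gene measurements each.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
expression = [[2.0, 0.0], [1.0, 1.0], [3.0, 1.0], [0.0, 9.0]]

print(predict_from_neighborhood(coords, expression, 0, radius=2.0))
print(masked_error(coords, expression, 0, radius=2.0))
```

No manual labels are involved: the "answer" for each cell is its own measured expression, which is what makes the setup self-supervised. A learned model would be trained to drive this error down across millions of cells.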
The Allen Institute’s Common Coordinate Framework (CCF) served as the essential gold standard for validating CellTransformer’s accuracy. “We use this method in combination with the Allen Brain Cell-Mouse Whole Brain Atlas, one of the largest spatial transcriptomics datasets to date, to unlock new insights into the anatomy of the mammalian brain,” Abbasi-Asl added. The results confirmed that the model successfully replicates known regions of the brain, such as the hippocampus, but more importantly, showed that it can also discover previously uncatalogued, finer-grained subregions.
“Typical neuroanatomical studies of the brain are expert-driven explorations of one area using multi-modal data, but we showed our technique, in combination with the amazing data generated by our collaborators at the Allen Institute, can find these subregions all at once and across the brain. Importantly, our method uncovered previously known anatomy as well as new fine-grained regions in poorly annotated areas of the brain.”
![Study authors Reza Abbasi-Asl, Ph.D., associate professor of neurology, bioengineering and therapeutic sciences at University of California, San Francisco with Alex Lee, Ph.D. candidate at University of California, San Francisco [University of California, San Francisco]](https://www.genengnews.com/wp-content/uploads/2025/10/Low-Res_Researchers-300x196.jpg)
The authors further noted, “CellTransformer is effective at integrating cells across tissue sections, identifying domains highly similar to ones in existing ontologies such as Allen Mouse Brain Common Coordinate Framework (CCF) while allowing discovery of hundreds of uncatalogued areas with minimal loss of domain spatial coherence.”
The findings are noteworthy primarily because they solve a major bottleneck in modern neuroscience, which is scalability, Abbasi-Asl pointed out. “Existing methods are often unable to process the extremely large volume of data generated by organ-scale spatial transcriptomics, which can involve millions or tens of millions of cells. CellTransformer is one of the first workflows capable of handling this scale. The discovery is also noteworthy for its consistency; creating a coherent map from multiple individual animals is a major challenge due to natural variation, and this model’s ability to do so represents a significant step towards a truly representative reference map. Finally, the unbiased, data-driven discovery of novel and plausible brain subregions moves the field beyond traditional, manually drawn maps.”
With 1,300 regions and subregions, the map represents one of the most granular and complex data-driven brain maps of any animal to date. “It’s like going from a map showing only continents and countries to one showing states and cities,” said study co-author Bosiljka Tasic, PhD, director of molecular genetics at the Allen Institute. “This new, detailed brain parcellation solely based on data, and not human expert annotation, reveals previously uncharted subregions of the mouse brain. And based on decades of neuroscience, new regions correspond to specialized brain functions to be discovered.”
First author Alex Lee, a PhD candidate at UCSF, commented, “By comparing the brain regions automatically identified by CellTransformer to the CCF, we were able to show that our data-driven method was identifying areas aligned with known expert-defined anatomical structures. Seeing that our model produces results so similar to CCF, which is such a well-characterized and high-quality resource for the field, was reassuring. The high level of agreement with the CCF provided a critical benchmark, giving confidence that the new subregions discovered by CellTransformer may also be biologically meaningful. We are hoping to explore and validate the results with further computational and experimental studies.”
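One simple way to picture this kind of comparison is to ask, for each data-driven region, which reference label most of its cells carry, and then count how many cells agree. The Python sketch below computes that "purity" score on toy labels. It is an illustrative assumption, not the metric used in the paper, and the function name, region IDs, and reference abbreviations are hypothetical.

```python
from collections import Counter, defaultdict


def region_purity(predicted, reference):
    """Fraction of cells whose reference label matches the majority
    reference label of their predicted region."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    if not predicted:
        raise ValueError("label lists must not be empty")
    # Tally reference labels separately within each predicted region.
    by_region = defaultdict(Counter)
    for pred, ref in zip(predicted, reference):
        by_region[pred][ref] += 1
    # Count the cells that carry their region's most common reference label.
    matched = sum(counts.most_common(1)[0][1] for counts in by_region.values())
    return matched / len(predicted)


# Toy example: six cells, three data-driven regions, three reference areas.
predicted = [0, 0, 0, 1, 1, 2]
reference = ["HPC", "HPC", "CTX", "CTX", "CTX", "TH"]
print(region_purity(predicted, reference))
```

Purity rises as regions are split more finely, so a score like this is only informative when compared at a similar number of regions or alongside other agreement measures.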
The work will advance bioscience by providing a much more detailed and accurate map of the mammalian brain, which is fundamental to understanding its function, Abbasi-Asl believes. “A higher-resolution map allows scientists to link specific functions, behaviors, or disease states to much smaller and more precise cellular domains. Our work advances the field of neuroanatomy by offering a fully data-driven map of the brain which doesn’t require years of human-driven manual annotations and is less biased by historical data gathering techniques.”
The potential of this research to unlock critical insights reaches beyond neuroscience. CellTransformer’s powerful AI capabilities are tissue agnostic: They can be used on other organ systems and tissues, including cancerous tissue, where large-scale spatial transcriptomics data is available to better understand the biology of health and disease and fuel the discovery of new treatments and therapies.
“… the scalable computational workflow itself is a major advance that can be applied to other organs and species to create similarly detailed maps for a wide range of biological systems,” Abbasi-Asl stated. “While this study focused on the mouse brain, the computational method is a powerful, tissue-agnostic tool that can be applied to any organ system, such as the heart, where large-scale spatial transcriptomics data is available. This work provides a new brain map as well as a foundational and scalable solution for creating high-resolution cellular maps for virtually any tissue. This paves the way for a deeper, data-driven understanding of tissue organization across different species and disease states.”
The authors concluded, “As spatially resolved transcriptomic and multi-omics studies of the brain become more prevalent, tools such as CellTransformer provide avenues to transform data into refined anatomical maps of the brain and other complex organs and pave the way towards tissue-level structure-function mapping.”
Copyright © 2025 Sage Publications or its affiliates, licensors, or contributors. All rights reserved, including those for text and data mining and training of large language models, artificial intelligence technologies, or similar technologies.