Researchers develop a machine learning approach to classify brain cells more precisely, which could aid studies into their role in health and disease.
By Jake Siegel / Allen Institute
09.23.2024
The human brain contains an astonishing diversity of cell types, each with unique shapes, functions, and gene expression patterns. This complexity has long challenged scientists trying to understand how the brain works—and what happens when it doesn’t.
Traditionally, researchers have categorized these cells into distinct “bins,” treating each type as a separate entity. However, that approach may oversimplify the brain’s cellular landscape by overlooking the continuous spectrum of variability within each cell type. Such simplification can mask subtle differences and similarities among cells, potentially missing critical insights into brain function and disease.
Scientists at the Allen Institute have developed an innovative machine learning-based approach to enhance our view of both the distinct identities and the continuous variability within brain cells. This method, outlined in a new paper in Nature Computational Science, could advance our understanding of neurological conditions like Alzheimer’s disease.
“The brain is extreme in terms of the diversification of its cells,” said Uygar Sümbül, Ph.D., an associate investigator and the study’s senior author. “The key question is whether to view this diversity as discrete categories or as a continuum where each cell type blends into the next.”
To address this question, the researchers developed MMIDAS, short for Mixture Model Inference with Discrete-Coupled Autoencoders. It is a machine learning method that simultaneously infers both distinct cell types and the continuous variations within them.
“Most methods treat cell types as a clustering problem,” explained Yeganeh Marghi, Ph.D., a scientist on Sümbül’s team and the study’s lead author. “They first identify clusters in the dataset, then explore within-cluster variability. But this approach isn’t optimal for capturing both sources of diversity.”
MMIDAS, by contrast, combines these steps, providing a more nuanced understanding of cell types and their internal variations. It is designed as a tool for scientists to explore sprawling single-cell datasets, helping them identify cell types and understand their functions by analyzing gene expression, proteins, and other molecular features.
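For readers who want a concrete picture of the conventional two-step workflow Marghi describes, here is a minimal sketch in Python. It is not MMIDAS and not the authors' code: the toy data, the choice of k-means for the clustering step, the choice of a principal axis for the within-cluster step, and the function names `make_toy_cells` and `two_step_analysis` are all assumptions made for illustration.

```python
import numpy as np

def make_toy_cells(n_per_type=100, seed=0):
    """Toy data: two well-separated 'types', each with a hidden continuous state."""
    rng = np.random.default_rng(seed)
    state = rng.uniform(-1.0, 1.0, size=(2, n_per_type))
    cells = []
    for i, offset in enumerate([0.0, 10.0]):
        # The hidden state spreads cells along x; the type sets the offset in y.
        xy = np.stack([state[i], np.full(n_per_type, offset)], axis=1)
        cells.append(xy + rng.normal(scale=0.05, size=xy.shape))
    X = np.vstack(cells)
    true_type = np.repeat([0, 1], n_per_type)
    return X, true_type, state.ravel()

def two_step_analysis(X, k, n_iter=50, seed=0):
    """Step 1: cluster the cells. Step 2: find one axis of variation per cluster."""
    rng = np.random.default_rng(seed)

    def nearest(points, centers):
        # Index of the closest center for every point.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        return dists.argmin(axis=1)

    # Farthest-point initialisation: start from one random cell, then
    # repeatedly add the cell farthest from all centers chosen so far.
    centers = [X[rng.integers(len(X))]]
    while len(centers) < k:
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)

    # Step 1: plain k-means iterations (discrete "bins").
    for _ in range(n_iter):
        labels = nearest(X, centers)
        centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    labels = nearest(X, centers)

    # Step 2: within each cluster, project cells onto the leading principal axis.
    scores = np.zeros(len(X))
    for j in range(k):
        members = labels == j
        if members.sum() < 2:
            continue
        centered = X[members] - X[members].mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        scores[members] = centered @ vt[0]
    return labels, scores

X, true_type, true_state = make_toy_cells()
labels, scores = two_step_analysis(X, k=2)
```

In this sketch the continuous scores are computed only after the cluster labels are fixed, so any mistake in the first step carries into the second. According to the researchers, MMIDAS differs by inferring both kinds of variation together.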
Yeganeh Marghi, Ph.D. (right), and Rohan Gala, Ph.D., scientists at the Allen Institute, going over an equation representing the loss function used to train a variational model with an A-tuple of autoencoding arms. Photo by Peter Kim / Allen Institute.
The team tested MMIDAS on several datasets, including the Seattle Alzheimer’s Disease Brain Cell Atlas (SEA-AD), which profiles brain cells from individuals at various stages of Alzheimer’s disease (AD). MMIDAS identified fewer, but potentially more robust, categories of neurons than traditional methods while accounting for continuous variability within those categories. The method also detected changes in the relative abundance and gene expression of different neuron types as the disease progressed. In addition, MMIDAS uncovered continuous variables that correlated with the stage of AD, particularly in excitatory neurons but not in inhibitory neurons, suggesting that disease progression is cell type-specific.
Importantly, these correlations held true even when data from some individuals were excluded from the initial analysis, suggesting that the method can generalize well across different samples.
These findings demonstrate MMIDAS’ ability to reveal new insights into how brain cell populations change during disease progression—an approach that might contribute to better diagnostics or targeted therapies for AD, Marghi said.
While the study represents a significant advance, the researchers acknowledge that more work is needed. They are exploring ways to improve the method’s robustness and accuracy, which may require additional computational resources.
In the meantime, the authors hope that MMIDAS will be adopted by the broader scientific community. “We want researchers to use it and give us feedback, especially in applications beyond neuroscience, such as genomics, bioinformatics, and other complex, data-driven fields,” Marghi said.
MMIDAS offers a sophisticated approach to analyzing the brain’s cellular diversity, potentially transforming our understanding of neurological diseases like Alzheimer’s, the authors said. As Sümbül put it, this method might be the key to “carving nature at its joints,” offering a clearer view of the structures that define brain health and dysfunction.
The Allen Institute is an independent, 501(c)(3) nonprofit research organization founded by the late philanthropist and visionary Paul G. Allen. The Allen Institute is dedicated to answering some of the biggest questions in bioscience and accelerating research worldwide. The Institute is a recognized leader in large-scale research with a commitment to an open science model. Its research institutes and programs include the Allen Institute for Brain Science, the Allen Institute for Cell Science, the Allen Institute for Immunology, and the Allen Institute for Neural Dynamics. In 2016, the Allen Institute expanded its reach with the launch of The Paul G. Allen Frontiers Group, which identifies pioneers with new ideas to expand the boundaries of knowledge and make the world better. For more information, visit alleninstitute.org.