Data and Technology

Driving breakthroughs in neuroscience through advanced computation, data management technologies and collaborative open data sharing

Goals and Approach

UMAP image of brain cell types

The Data and Technology team at the Allen Institute for Brain Science is dedicated to advancing knowledge of the brain through the Brain Knowledge Platform. This comprehensive ecosystem integrates pipelines, tools, and data to unify neuroscience information and create a detailed atlas of mammalian brain cell types. Our expanding data platform is built on cloud infrastructure, making it easy for scientists inside and outside the Allen Institute to contribute, visualize, and analyze the growing body of knowledge. By using this innovative resource, researchers worldwide can collaborate and make meaningful contributions to neuroscience research, ultimately leading to improvements in human health. The Brain Knowledge Platform represents a major leap forward in technology for this vital field, unlocking the mysteries of the brain and enabling us to gain a deeper understanding of how it works.


Image showing two whole brain datasets open for comparison through the Allen Brain Cell Atlas web portal.

The Allen Brain Cell (ABC) Atlas empowers researchers worldwide to visually explore and analyze multiple whole-brain datasets simultaneously. The beta release currently includes a single-cell RNA-seq dataset and a MERFISH dataset, each with ~4M cells, representing ~5200 clusters in the whole mouse brain. As the Allen Institute and collaborators continue to add new modalities, species, and insights to the ABC Atlas, this platform will enable discoveries and breakthroughs in neuroscience.

Data & Resources

Screenshot of the CompBio Workbench online tool

The CompBio Workbench is a tool designed to support computational neuroscientists working from large-scale datasets. This tool extends AWS SageMaker to allow researchers to easily access, load, and analyze complex data using advanced machine learning and analytical techniques.

One unique and exciting feature of the workbench is its ability to help researchers map their gene expression data against the massive expression datasets that support the Allen Institute’s cell type taxonomies. This feature transforms the taxonomies into a powerful tool that computational neuroscientists can use to understand their data better, rather than relying solely on the information presented in academic papers. By making it easier for researchers to contribute to and analyze data, the CompBio Workbench represents a major step forward in computational neuroscience.
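To make the idea of mapping expression data against a reference taxonomy concrete, here is a minimal sketch in Python. It is not the CompBio Workbench's method or API, which this page does not describe. It assumes a simple nearest-centroid approach in which each reference cell type is summarized by an average expression profile and a query cell is assigned to the type it correlates with most strongly. The function names, cell type labels, and numbers are all hypothetical and exist only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no information to correlate
    return cov / sqrt(var_x * var_y)


def map_to_reference(cell_profile, reference_centroids):
    """Return (label, correlation) of the best-matching reference type.

    Assumption: the query and every centroid list the same genes
    in the same order.
    """
    best_label, best_score = None, float("-inf")
    for label, centroid in reference_centroids.items():
        score = pearson(cell_profile, centroid)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical toy reference: mean expression of four genes per cell type.
reference_centroids = {
    "type_A": [9.0, 1.0, 0.5, 0.2],
    "type_B": [0.3, 8.0, 7.5, 0.1],
    "type_C": [0.2, 0.4, 1.0, 9.5],
}

# Hypothetical query cell measured on the same four genes.
query_cell = [0.5, 6.0, 7.0, 0.3]

label, score = map_to_reference(query_cell, reference_centroids)
print(label, round(score, 3))
```

In practice, mapping against a taxonomy with thousands of clusters would also involve steps such as normalization, marker gene selection, and confidence estimates, none of which this sketch attempts.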

Whole mouse brain image showing spatial location of transcriptomic-defined cell types

The Allen Institute for Brain Science is driving toward a comprehensive atlas to revolutionize understanding of the human brain. To make this vision a reality, the Data and Technology team is ramping up efforts in two key areas: genomics and imaging.

With the genomics revolution impacting all biological fields, this team is developing a modern, cloud-based pipeline that empowers scientists to generate, process, and analyze massive single-cell genomics datasets. These powerful datasets are the backbone of efforts to define cell types, unlocking new insights into the intricate workings of the brain.

The Data and Technology team is also pushing the limits of cutting-edge imaging technologies to explore spatial transcriptomics, connectivity, and the impacts of genetic tools in unprecedented detail. These imaging pipelines provide the muscle for a comprehensive understanding of mammalian brain cell types.

With petabytes of data available and the amount of data rapidly growing, the Data and Technology team is dedicated to making this information findable and useful for anyone interested in neuroscience research. This team is investing in a state-of-the-art services-oriented architecture that enables continuous extension of data models and keeps up with the breakneck pace of scientific discovery, with the ultimate goal of handling exabytes of data in the future.

By using foundational services as “sources of truth” that the search and analysis services in the knowledge base build on, this team is creating a comprehensive platform that unlocks the power of vast amounts of data for developers, empowering them to create the tools and pipelines needed to drive tomorrow’s neuroscience breakthroughs. This platform is accessible to anyone, regardless of background or level of expertise.

The quality and integrity of data are paramount to advancing our understanding of the brain and unlocking the mysteries of neurological diseases. That’s why this team is continuously improving data governance policies and implementing cutting-edge data quality processes to ensure data generated at the Allen Institute for Brain Science is accurate, reliable, and trustworthy.

This team is developing robust data standards and powerful automation for data integration and validation. These tools allow us to streamline processes, identify and correct errors, and ensure that data is consistent across various platforms and datasets.

At the Allen Institute, we’re proud to be at the forefront of data-driven neuroscience research. We will continue investing in data quality and integrity to ensure our data remains a valuable resource for the scientific community.
