Science Vignette: Using gene expression data to classify cells

May 14, 2015

With the release of our newest resource, the Allen Cell Types Database, we are eager to explore how the tremendously diverse cells in the brain can be grouped into categories. But how exactly do we go about doing that? Historically, cell type classification has often been done using just a few characteristics of cells at a time, such as the expression of a small number of genes or aspects of the cells’ shapes.

Our latest Science Vignette explores how we can instead use large amounts of data—in this case, expression of every gene in individual cells—to categorize cells into types.

The animation shows how scientists took more than 1,600 individual cells from the visual cortex of the mouse brain and analyzed their complete gene expression. They then used a “cluster” analysis to split cells into types and subtypes. Because the analysis ignored information like which layer of the brain and which mouse line the cells came from, the resulting 49 types are based entirely on gene expression information.
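For readers curious about what a cluster analysis involves, here is a minimal sketch in Python. It is not the method used for the Science Vignette: the expression values are synthetic, the two "cell types" are invented, and average-linkage hierarchical clustering is simply one common technique chosen for illustration. All function names are hypothetical. The point it shows is the one made above: the grouping uses expression values only, and the hidden labels are consulted only afterward to check the result.

```python
import math
import random


def make_toy_cells(n_per_type=5, n_genes=6, seed=0):
    """Build a synthetic expression table (assumption: two made-up cell
    types, each with its own set of highly expressed genes)."""
    rng = random.Random(seed)
    cells, hidden_labels = [], []
    for type_id in range(2):
        for _ in range(n_per_type):
            profile = []
            for gene in range(n_genes):
                # First half of the genes is high in type 0,
                # second half is high in type 1.
                high = (gene < n_genes // 2) == (type_id == 0)
                base = 8.0 if high else 1.0
                profile.append(base + rng.uniform(-0.5, 0.5))
            cells.append(profile)
            hidden_labels.append(type_id)
    return cells, hidden_labels


def average_distance(cells, group_a, group_b):
    """Mean Euclidean distance between every pair of cells across two groups."""
    total = sum(math.dist(cells[i], cells[j]) for i in group_a for j in group_b)
    return total / (len(group_a) * len(group_b))


def cluster(cells, n_clusters):
    """Average-linkage agglomerative clustering on expression values only.

    Starts with every cell in its own group and repeatedly merges the two
    closest groups until n_clusters remain.
    """
    groups = [[i] for i in range(len(cells))]
    while len(groups) > n_clusters:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = average_distance(cells, groups[a], groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups


if __name__ == "__main__":
    cells, hidden_labels = make_toy_cells()
    for group in cluster(cells, n_clusters=2):
        # The hidden labels are looked at only here, after clustering.
        print(sorted(group), [hidden_labels[i] for i in sorted(group)])
```

In this toy example the two groups are far apart, so the clustering recovers them cleanly. Real single-cell data is much noisier and has thousands of genes, so an actual analysis involves many more steps than this sketch.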

The animated portion of the Science Vignette (click the video to watch) guides users through the clustering process. Interestingly, even though the cells are classified solely by their gene expression, the resulting types line up with other known features of the cells. For example, inhibitory neuron types correspond to characteristic genes, called marker genes. In contrast, excitatory neuron types correspond to the layer of the cortex where the cells are found.

After being guided through the animation, users can explore the data in the Science Vignette more deeply by investigating the cell types in a large matrix, which also contains information on the transgenic mouse lines, brain layers and gene expression of the cells in this analysis.

Click to view the entire Science Vignette and explore the resources on our data portal at