Introduction


Spatial partitioning of biological functions is a phenomenon fundamental to life. For humans, this spatial compartmentalization constitutes a hierarchy of specialized systems ranging across all scales, from organs to specialized cells, and from cells to organelles. The Human Protein Atlas uses an antibody-based approach, with both in-house generated antibodies and commercial antibodies from different providers. The antibodies are used for immunofluorescence staining of cell lines to determine spatial distribution at the subcellular level, and for immunohistochemistry on tissue microarrays to determine the distribution of protein expression in normal and cancer tissues. The open access database contains millions of high-resolution images, released together with the application-specific validation performed for each antibody. The database has been developed in a gene-centric manner and includes all human genes predicted from genome efforts. Search functionalities allow for complex queries regarding protein expression profiles, protein classes, and chromosome location.

Uhlén M et al, 2015. Tissue-based map of the human proteome. Science.
PubMed: 25613900 DOI: 10.1126/science.1260419

Uhlén M et al, 2010. Towards a knowledge-based Human Protein Atlas. Nat Biotechnol.
PubMed: 21139605 DOI: 10.1038/nbt1210-1248

The full publication list is available here.



The Human Protein Atlas


The Human Protein Atlas contains information for a large majority of all human protein-coding genes regarding the expression and localization of the corresponding proteins, based on both RNA and protein data. The atlas consists of three subparts: cell, normal tissue, and cancer, with each subpart containing images and data based on antibody-based proteomics and transcriptomics. The tissue atlas contains information on 44 different human tissues and organs, with annotation data for altogether 76 different cell types. The transcriptomics data provide quantitative data on gene expression levels across the tissues and organs, while the antibody-based protein profiles show the spatial distribution of the corresponding protein at a single-cell level in the various substructures and cell types of the tissues. Version 16.1 of the Human Protein Atlas contains RNA data for 100% and protein data for 87% of the predicted human genes and includes more than 10 million images with primary data from immunohistochemistry and immunofluorescence.

The cell atlas


The Cell Atlas provides high-resolution insights into the spatial distribution of proteins within cells. Firstly, it contains mRNA expression profiles from a diverse panel of human-derived cell lines (n=56) representing different germ layers and tissues. Secondly, the atlas contains high-resolution, multicolour images of immunofluorescently labeled cells that detail the subcellular distribution pattern of proteins in these cells. By default, each antibody is probed in U-2OS cells and in two additional cell lines selected on the basis of expression. The cells are stained in a standardized way, with the antibody of interest visualized in green, the microtubules in red, the endoplasmic reticulum in yellow, and the nuclei counterstained in blue. The images are manually annotated in terms of spatial distribution to 30 different cellular structures representing 14 major organelles. The annotated locations for every protein are classified as main or additional and are assigned a reliability score.

Example:

CCNB1
Cyclin B1.

Protein localized to the cytosol in human and mouse cells, and expressed in a cell cycle-dependent manner. The location has been validated by siRNA-mediated gene silencing, analysis of GFP-tagged protein, and paired antibodies.



The normal tissue atlas


The normal tissue atlas contains quantitative data and images describing the expression and distribution of human proteins across tissues and organs, at both the mRNA and protein level. The protein expression data is derived from annotation of immunohistochemical staining of cell populations in all major human tissues and organs, including the brain, liver, kidney, lymphoid tissues, heart, lung, skin, gastrointestinal tract, pancreas, endocrine tissues, and the reproductive organs. In total, 44 different human tissues are included, with annotation data for altogether 76 different cell types. The antibody-based protein profiles are qualitative and describe the spatial distribution, cell type specificity, and rough relative abundance of proteins in these tissues, whereas the mRNA data provide quantitative data on the average gene expression within an entire tissue. For each gene, the immunohistochemical staining profile, based on a single or multiple antibodies, is matched with mRNA data and gene/protein characterization data to yield an "annotated protein expression" profile.

Example:

MYL7
Myosin, light chain 7, regulatory.

Selective cytoplasmic expression in cardiomyocytes at the protein level, highly tissue enriched in heart muscle at the mRNA level.

The mouse brain atlas

The mouse brain atlas is a complement to the normal tissue atlas, providing a more extended overview of the brain proteome. In the normal tissue atlas, three forebrain regions (cerebral cortex, hippocampus, and caudate) and one hindbrain region (cerebellum) are included. Immunofluorescently labeled full mouse brain sections provide a more extensive overview, covering more brain areas and cell types. A selected set of brain-relevant genes is profiled in the mouse brain, providing detailed information on the regional and cellular location of proteins in the mammalian brain.

Example:

NECAB1
N-terminal EF-hand calcium binding protein 1.

Subsets of neurons showed distinct positivity in cell bodies and dendrites. Main location of the positive neurons is layer 4 of the cerebral cortex.


The cancer atlas


The cancer tissue atlas contains a multitude of human cancer specimens representing the 20 most common forms of cancer, including breast, colon, prostate, lung, urothelial, skin, endometrial, and cervical cancer. Altogether, 216 different cancer samples are used to generate protein expression profiles for all proteins using immunohistochemistry. The data is presented as pathology-based annotation of protein expression levels in tumor cells, along with the images underlying the annotation. This enables the identification of a potential protein signature for each given type of cancer and supports further analyses of cancer type-specific proteins. Because the cancer atlas contains a large number of cancer samples, the available protein profiles provide an excellent starting point for identifying new potential cancer biomarkers.

Example:

KLK3
Kallikrein-related peptidase 3.

Selective cytoplasmic expression in prostate cancers. All other malignant tissues were negative.


Background and history


The Human Protein Atlas project was initiated in 2003 with funding from the Knut and Alice Wallenberg Foundation. Primarily based in Sweden, the Human Protein Atlas project involves the joint efforts of the Royal Institute of Technology in Stockholm, Uppsala University, Uppsala Akademiska University Hospital, and more recently also Science for Life Laboratory, based in both Uppsala and Stockholm. Formal collaborations exist with groups in India, South Korea, Japan, China, Germany, France, Switzerland, USA, Canada, Denmark, Finland, The Netherlands, Spain, and Italy.

The pathologists and staff at the Pathology Clinic, Uppsala University Hospital, Uppsala, Sweden, are gratefully acknowledged for all efforts regarding handling and diagnostics of the tissues used in the Human Protein Atlas. Dr Sanjay Navani and Lab Surgpath, Mumbai, India, are also acknowledged for their major contribution to the annotation of immunohistochemically stained normal and cancer tissues.

The first version of the Human Protein Atlas website was launched in 2005 and contained protein expression data based on approximately 700 antibodies. Since then, each new release has included more data and added new functionalities and features to the website. Important additions are the inclusion of cell-line data in version 2 and the inclusion of confocal images showing subcellular localizations in version 3. Version 3 also included a new search function that allowed advanced query-based searches. In version 4, the overall database structure was shifted from an antibody-centric structure to a gene-centric structure in order to include information on all genes predicted by Ensembl. The next major restructuring came in 2010 with version 7, when the concept of annotated protein expression for paired antibodies (two independent antibodies directed against different, non-overlapping epitopes on the same protein) was introduced. In 2013, version 12 of the protein atlas database was complemented with transcriptomics profiles from 27 normal tissues, and the format with four sub-atlases was introduced. Version 13 was released at the end of 2014 and included an analysis of all major organs and tissues in the human body using transcriptomics and antibody-based profiling. The results were summarized on interactive knowledge pages divided into 7 human proteomes and 27 tissues and organs. In version 14, a new mouse brain atlas was introduced, and in version 15, RNA-seq data from the Genotype-Tissue Expression (GTEx) consortium was included. In version 16, a new Cell Atlas was launched with subcellular localization corresponding to over 12,000 protein-coding genes, together with a new approach for visualization of antibody validation and the inclusion of transcriptomics data from the FANTOM5 program.

Release history is found here



Number of genes/antibodies included per new release