DOWNLOADABLE DATA


  Search results
The data files presented here include data available in the Human Protein Atlas version 16.1. A subset of this data can also be downloaded from the Search page, limited to the genes in the current search result, in three formats: XML, RDF and TAB.
 
  Single entry
Data in XML, RDF and TAB format can be accessed for a single entry using the URL structure below:
/ENSG00000134057.xml
/ENSG00000134057.trig
/ENSG00000134057.tab
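
The path pattern above can be sketched as a small helper. This is only an illustration: the function name `entry_urls` is hypothetical, and the default base address `https://www.proteinatlas.org` is an assumption, since the page lists only the paths.

```python
def entry_urls(ensembl_id, base="https://www.proteinatlas.org"):
    """Return the per-format URLs for a single gene entry.

    `base` is an assumed site address; only the path part
    (/<Ensembl id>.<extension>) is taken from the page.
    """
    # File extension used for each download format listed on the page.
    extensions = {"XML": "xml", "RDF": "trig", "TAB": "tab"}
    return {fmt: f"{base}/{ensembl_id}.{ext}" for fmt, ext in extensions.items()}


if __name__ == "__main__":
    for fmt, url in entry_urls("ENSG00000134057").items():
        print(fmt, url)
```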

 
  Archived data
As of version 13 of the Human Protein Atlas, the site can be reached using the URL structure "vXX.proteinatlas.org", where XX is the version number. For example, version 13 of the Human Protein Atlas has the URL v13.proteinatlas.org.
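
As a minimal sketch of the versioned host pattern, the hypothetical helper below builds the host name for a given major version. It assumes that XX is a whole number and rejects versions below 13, since the pattern is only stated to apply from version 13 onward.

```python
def archived_host(version):
    """Return the host name for an archived Human Protein Atlas version.

    Assumes `version` is a whole major version number (for example 13).
    """
    if not isinstance(version, int) or version < 13:
        raise ValueError("the vXX host pattern is described only for version 13 and later")
    return f"v{version}.proteinatlas.org"


if __name__ == "__main__":
    print(archived_host(13))
```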

 
1 Normal tissue data
Expression profiles for proteins in human tissues based on immunohistochemistry using tissue microarrays. The comma-separated file includes Ensembl gene identifier ("Gene"), tissue name ("Tissue"), annotated cell type ("Cell type"), expression value ("Level"), and the gene reliability of the expression value ("Reliability"). The data is based on The Human Protein Atlas version 16.1 and Ensembl version 83.38.

normal_tissue.csv.zip
CSV-file, 4.3 MB
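
The zipped CSV files can be read without unpacking them first. The sketch below uses only the Python standard library and assumes that the archive holds a single UTF-8 text file whose first row contains the column names quoted above; the function name `read_zipped_table` is hypothetical.

```python
import csv
import io
import zipfile


def read_zipped_table(source, delimiter=","):
    """Yield one dict per data row from the first file inside a zip archive.

    `source` may be a path or a binary file object. The single-member
    layout and UTF-8 encoding are assumptions, not documented facts.
    """
    with zipfile.ZipFile(source) as archive:
        member = archive.namelist()[0]
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # DictReader takes the column names from the header row.
            yield from csv.DictReader(text, delimiter=delimiter)


if __name__ == "__main__":
    # Print the rows for one gene from a locally downloaded copy.
    for row in read_zipped_table("normal_tissue.csv.zip"):
        if row["Gene"] == "ENSG00000134057":
            print(row["Tissue"], row["Cell type"], row["Level"], row["Reliability"])
```

The same helper can be pointed at the tab-separated downloads by passing `delimiter="\t"`.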
 
2 Cancer tumor data
Staining profiles for proteins in human tumor tissue based on immunohistochemistry using tissue microarrays. The comma-separated file includes Ensembl gene identifier ("Gene"), tumor name ("Tumor"), staining value ("Level"), the number of patients with this staining value ("Count patients") and the total number of patients for this tumor type ("Total patients"). The data is based on The Human Protein Atlas version 16.1 and Ensembl version 83.38.

cancer.csv.zip
CSV-file, 4.8 MB
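
Because each row carries both a patient count and a patient total, the share of patients at a given staining level can be derived directly. The following is a minimal sketch, assuming both columns hold whole numbers stored as text; the helper name `patient_fraction` and the sample row are invented for illustration.

```python
def patient_fraction(row):
    """Return "Count patients" divided by "Total patients" for one row.

    Returns None when the total is zero, so callers can skip such rows.
    """
    total = int(row["Total patients"])
    if total == 0:
        return None
    return int(row["Count patients"]) / total


if __name__ == "__main__":
    # Invented example row; the values are not real atlas data.
    sample = {"Gene": "ENSG00000134057", "Tumor": "example tumor",
              "Level": "High", "Count patients": "3", "Total patients": "12"}
    print(patient_fraction(sample))
```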
 
3 Subcellular location data
Subcellular localization of proteins based on immunofluorescently stained cells. The comma-separated file includes the following columns: Ensembl gene identifier ("Gene"), name of gene ("Gene name"), gene reliability score ("Reliability"), validated locations ("Validated"), supported locations ("Supported"), approved locations ("Approved"), uncertain locations ("Uncertain"), locations with single-cell variation in intensity ("Cell-to-cell variation intensity"), locations with spatial single-cell variation ("Cell-to-cell variation spatial"), locations with observed cell cycle dependency (type can be one or more of biological definition, custom data or correlation) ("Cell cycle dependency"), and Gene Ontology Cellular Component term identifier ("GO id").
The data is based on The Human Protein Atlas version 16.1 and Ensembl version 83.38.

subcellular_location.csv.zip
CSV-file, 172.8 KB
 
4 RNA gene data
RNA levels in 56 cell lines and 37 tissues based on RNA-seq. The comma-separated file includes Ensembl gene identifier ("Gene"), analysed sample ("Sample") and transcripts per million ("Value" and "Unit"). The data is based on The Human Protein Atlas version 16.1 and Ensembl version 83.38.
RNA sequencing data for human tissue
RNA sequencing data for human cell lines

rna_tissue.csv.zip
CSV-file, 3.8 MB
rna_celline.csv.zip
CSV-file, 5.7 MB
 
5 RNA isoform data
RNA levels in 56 cell lines and 37 tissues based on RNA-seq. The tab-separated file includes Ensembl gene identifier ("Gene"), Ensembl transcript identifier ("Transcript"), analysed sample ("Sample") and transcripts per million ("TPM"). The data is based on The Human Protein Atlas version 16.1 and Ensembl version 83.38.



transcript_rna_tissue.tsv.zip
TSV-file, 73.3 MB
transcript_rna_celline.tsv.zip
TSV-file, 45.7 MB
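
One common way to get a gene-level figure from isoform data is to sum the TPM values of all transcripts of a gene within each sample. The sketch below does this for rows already parsed into dicts keyed by the column names above. Whether this sum matches the values in the RNA gene files is not stated here, so treat it as an illustration only; the function name and sample rows are invented.

```python
from collections import defaultdict


def gene_level_tpm(rows):
    """Sum transcript TPM values per (gene, sample) pair.

    `rows` is any iterable of dicts with "Gene", "Sample" and "TPM" keys,
    for example the output of csv.DictReader with delimiter="\t".
    """
    totals = defaultdict(float)
    for row in rows:
        totals[(row["Gene"], row["Sample"])] += float(row["TPM"])
    return dict(totals)


if __name__ == "__main__":
    # Invented rows; identifiers and values are placeholders.
    rows = [
        {"Gene": "G1", "Transcript": "T1", "Sample": "sample A", "TPM": "1.5"},
        {"Gene": "G1", "Transcript": "T2", "Sample": "sample A", "TPM": "2.5"},
    ]
    print(gene_level_tpm(rows))
```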
 
6 Data from the Human Protein Atlas in XML format
The XML file contains most of the data in the Human Protein Atlas version 16.1, including protein expression data (in normal and tumor tissues and in cell lines), antigen sequences, Western blot data for antibodies, protein array data for antibodies, RNA-seq data, external references such as UniProt identifiers, and more. The data is based on Ensembl version 83.38. The file structure is presented in the XSD-schema. This data can also be downloaded for a resulting gene set when using the search function (via the xml link on the result page).
The XML file presented here is compressed with gzip because of its size. It can be decompressed with an archive program such as 7-Zip.

proteinatlas.xml.gz
XML-file (gzip compressed), 263.3 MB
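
As an alternative to decompressing the file first, it can be streamed straight from the gzip archive. The sketch below uses Python's standard `gzip` and `xml.etree.ElementTree` modules to count how often each element name occurs, which is one way to get a first overview before consulting the XSD schema. It makes no assumption about specific element names; the function name `count_elements` is hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(source):
    """Count element names in a gzip-compressed XML file.

    `source` may be a path or a binary file object. The file is parsed
    incrementally, so it is never fully decompressed to disk.
    """
    counts = Counter()
    with gzip.open(source, "rb") as stream:
        for _event, element in ET.iterparse(stream, events=("end",)):
            counts[element.tag] += 1
            # Drop the element's text and children once it has been counted
            # to keep memory use down on a large file.
            element.clear()
    return counts


if __name__ == "__main__":
    for tag, n in count_elements("proteinatlas.xml.gz").most_common(10):
        print(tag, n)
```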
 
7 Data from the Human Protein Atlas in RDF format
This file contains a subset of the data in the Human Protein Atlas version 16.1 corresponding to the tissue annotations at the gene level. This data can also be downloaded for a resulting gene set when using the search function (via the RDF link on the result page). This RDF release is a beta and will be extended and developed in coming releases. We thank Mark Thompson, Rajaram Kaliyaperumal and Eelke van der Horst (LUMC, The Netherlands), and Christine Chichester (SIB, Switzerland) for providing templates for generating the first beta release of HPA nanopublications. Their contribution was made possible by IMI project Open PHACTS and EU FP7 project RD-Connect. This beta was developed within an ELIXIR collaboration.

proteinatlas.trig.gz
RDF trig-file (gzip compressed), 84.6 MB
 
8 Data from the Human Protein Atlas in TAB format
This file contains a subset of the data in the Human Protein Atlas version 16.1 corresponding to the data seen in the search result. This data can also be downloaded for a resulting gene set when using the search function (via the TAB link on the result page).

proteinatlas.tab.gz
TAB-file (gzip compressed), 1.3 MB