Biomedical datasets
There’s a lot of publicly accessible data out there, so why not keep a list to have it all handy? The best sites have nice interfaces but also let you easily download and work with the data. Here are some resources, not yet in any logical order.
Downloadable data tables:
Source | Link | Description |
---|---|---|
gnomAD | http://gnomad.broadinstitute.org/ | Human polymorphisms observed in genome / exome data. |
GTEx | https://gtexportal.org/home/ | Tissue RNA-seq data. The data files are large; download them and search locally with Bash. |
ClinVar | https://www.ncbi.nlm.nih.gov/clinvar/ | Interpretations of genetic variants. |
DepMap | https://depmap.org/portal/ | Mutations in cell lines. |
cBioPortal | http://www.cbioportal.org/ | Cancer genomics data |
COSMIC | https://cancer.sanger.ac.uk/cosmic | Catalogue of Somatic Mutations in Cancer |
Genomics of Drug Sensitivity in Cancer | https://www.cancerrxgene.org/ | Cell-line specific sensitivities to compounds |
NIH RePORTER | https://projectreporter.nih.gov/reporter.cfm | Information on NIH-funded grants |
Human Protein Atlas | https://www.proteinatlas.org/about/download | Downloadable tables of RNA and protein expression in tissues and cells |
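
The GTEx entry above suggests downloading the large files and searching them locally. Here is a minimal sketch of the same idea in Python rather than Bash, assuming the download is a tab-delimited text file, optionally gzipped. The function names, the example file name, the column names, and the number of preamble lines to skip are all illustrative assumptions, so check the actual file before relying on them.

```python
import csv
import gzip
from typing import Dict, Iterator


def open_text(path: str):
    """Open a plain or gzip-compressed text file for reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def search_table(
    path: str, column: str, value: str, skip_lines: int = 0
) -> Iterator[Dict[str, str]]:
    """Yield rows (as dicts) whose `column` equals `value`.

    The file is streamed line by line, so it never has to fit in memory.
    `skip_lines` drops any preamble lines that come before the header row.
    """
    with open_text(path) as handle:
        for _ in range(skip_lines):
            next(handle)
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            if row.get(column) == value:
                yield row


# Hypothetical usage: the file name, the "Description" column, and the
# two preamble lines are assumptions, not guaranteed by any portal.
# for row in search_table("expression.gct.gz", "Description", "TP53", skip_lines=2):
#     print(row)
```

Because the rows are yielded one at a time, this works on files too large to load in full, which is the same reason `grep` is convenient for these downloads.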
Additional types of data:
Source | Link | Description |
---|---|---|
Protein Data Bank | http://www.rcsb.org/ | Molecular structures |
The Human Protein Atlas | https://www.proteinatlas.org/ | Protein localization |
Clinical Genome Resource | https://www.clinicalgenome.org/ | Clinical information for genes |
Genetic Testing Registry | https://www.ncbi.nlm.nih.gov/gtr/ | Registry of genetic tests, with information submitted by the test providers |
EVcouplings | https://evcouplings.org/ | Evolutionary coupling data |
IntAct | https://www.ebi.ac.uk/intact/ | Protein interactions |
TimeTree | http://www.timetree.org/ | Divergence times between species |
PheWAS | https://phewascatalog.org/ | Phenome-wide association studies |
FPbase | https://www.fpbase.org/ | Fluorescent proteins |
PaxDb: Protein Abundance Database | https://pax-db.org/ | Protein abundance (Mass Spec) |
PepTracker | https://peptracker.com/ | More Mass Spec |
Gene Ontology Consortium | http://geneontology.org/ | Gene / protein functional annotations |
denovo-db | http://denovo-db.gs.washington.edu/ | Tables of de novo variants seen in trios |
DECIPHER | https://decipher.sanger.ac.uk/ | Genomic variants and associated phenotypes in patients |
DGIdb | http://dgidb.org | Drug-gene interactions and druggable targets |
Mouse Genome Informatics | http://www.informatics.jax.org/ | Information on genes (in mice) |
Human Phenotype Ontology | https://hpo.jax.org/app/ | Standardized vocabulary of phenotypic abnormalities in human disease |
Virus data:
Source | Link | Description |
---|---|---|
HIV databases at Los Alamos National Laboratory | https://www.hiv.lanl.gov | HIV genetic sequences and immunological epitopes |
Virus Pathogen Resource | https://www.viprbrc.org | Viral sequences (general) |