This page outlines the resources generated for the Human Brain Pharmacome project lead by Charles Tapley Hoyt in 2018 and 2019.
Knowledge Storage, Manipulation, and Exploration
One of the first goals of the Human Brain Pharmacome Project is to bring order to the unstructured knowledge locked in the biomedical literature, patents, and electronic health records.
Storing Knowledge: Biological Expression Language
We heavily rely on Biological Expression Language (BEL) as a format for storing qualitative causal and correlative relations between biological entities across multiple modes and scales, with full provenance information including namespace references, relation provenance (citation and evidence), and biological context-specific relation metadata (anatomy, cell, disease etc.)
Slater, T. (2014). Recent advances in modeling languages for pathway maps and computable biological networks. Drug Discovery Today, 19(2), 193–198.
- BEL Enhancement Proposals: https://github.com/belbio/bep
Manipulating Knowledge: PyBEL
We built PyBEL for parsing, validating, compiling, and converting networks encoded in BEL. PyBEL comprises five main components: (i) network data container, (ii) parser and validator, (iii) network database manager, (iv) data converter and (v) network visualizer.
Hoyt, C. T., Konotopez, A., Ebeling, C. (2018). PyBEL: a computational framework for Biological Expression Language. Bioinformatics, 34(4), 703–704.
Exploring Knowledge: BEL Commons
BEL Commons is an environment for curating, validating, and exploring knowledge assemblies encoded in BEL to support elucidating disease-specific, mechanistic insight. BEL Commons comprises five components: (i) the network uploader and validator, (ii) user rights and project management, (iii) the query builder, (iv) the biological network explorer and (iv) the analytical service.
Hoyt, C. T., Domingo-Fernández, D., & Hofmann-Apitius, M. (2018). BEL Commons: an environment for exploration and analysis of networks encoded in Biological Expression Language. Database, 2018(3), 1–11.
Curation of Unstructured Knowledge
BEL Curation Workflow
- Source Code
- Zenodo Reference: Hoyt, C. T. (2018, December 22). pybel/pybel-git v0.0.2 (Version v0.0.2). Zenodo. http://doi.org/10.5281/zenodo.2508180
Curation of Neurodegeneration Supporting Ontology (CONSO)
While there are several useful public terminologies useful for curation of biomedical relations, there is often the need to develop new controlled vocabularies, thesauri, taxonomies, and ontologies. We have maintained ours with the ability to export it as OBO, OWL, and as a BEL namespace.
Curation of Neurodegeneration in BEL (CONIB)
We have prioritized manual curation of the highest granularity for new content related to several neurodegenerative disease, with special focus on the human Tau protein, nicotoinic receptor biology, and proteostasis. They can be downloaded and handled with PyBEL (or other BEL tools) using the link below.
Multimodal mechanistic signatures for neurodegenerative diseases (NeuroMMSig) consists of three parts: a taxonomy of candidate pathophysiological mechanisms in three neurodegenerative diseases (i.e., Alzheimer’s disease, Parkinson’s disease, and epilepsy), disease-specific mechanistic models of each, and a web server implementing a novel mechanism enrichment algorithm.
Mechanism Taxonomy Resources
Mechanism Inventory Resources
Mechanism Enrichment Server Reference
Domingo-Fernández, D., et al. (2017). Multimodal mechanistic signatures for neurodegenerative diseases (NeuroMMSig): A web server for mechanism enrichment. Bioinformatics, 33(22), 3679–3681.
Mechanism Enrichment Server Resources
- Web Application: https://neurommsig.fraunhofer.de
Rational Enrichment of NeuroMMSig
Hoyt, C. T., et al (2019). Re-curation and Rational Enrichment of Knowledge Graphs in Biological Expression Language. Database, 2019.
Topic-driven Semi-Automated Curation
We used the tooling developed for the rational enrichment workflow to support topic-specific, semi-automated curation. So far, it has been applied to the MAPT and GSK3B proteins.
We’ve generated a dynamic web application for exploring the content curated during this project that can be found at https://github.com/pharmacome/taubase. It will be deployed publicly soon.
Acquisition and Transformation of Structured Knowledge
BEL is well-suited as an biomedical knowledge integration standard as it has precise semantics and is extensible. We generated several reusable packages for converting and harmonizing databases across modes and scales in BEL.
Hoyt, C. T., et al. (2019). Integration of Structured Biological Data Sources using Biological Expression Language. bioRxiv, 631812.
We curated mappings between three major pathway databases (KEGG, Reactome, and WikiPathways) and MSigDB to identify their overlapping entries.
Domingo-Fernandez, D., et al. (2018). ComPath: an ecosystem for exploring, analyzing, and curating mappings across pathway databases. Npj Systems Biology and Applications, 4:43.
We harmonized and extracted pathway knowledge from three major pathway databases: KEGG, Reactome, and WikiPathways in order to make comparisons about their coverage and to generate consensus pathways (by using ComPath) for downstream applications, such as with Signalling Pathway Impact Analysis (SPIA).
Domingo-Fernandez, D., et al. (2018). PathMe: Merging and exploring mechanistic pathway knowledge. BMC Bioinformatics, 20:243.
We generated super-pathways using ComPath and PathMe in order to benchmark several pathway-based functional enrichment and prediction methods.
Mubeen, S., et al. (2019). The Impact of Pathway Database Choice on Statistical Enrichment Analysis and Predictive Modeling. Frontiers in Genetics, 10:1203.
Comparative Mechanism Enrichment
We proposed an explanation for the cross-indication effects of Carbamazepine towards epilepsy and Alzheimer’s disease by using the NeuroMMSig mechanism enrichment server to identify overlapping mechanisms with its experimentally measured and computationally predicted targets.
Hoyt, C. T. and Domingo-Fernández, D., et al. (2018). A systematic approach for identifying shared mechanisms in epilepsy and its comorbidities. Database, 2018(1).
Target Prioritization with Network Representation Learning
We used GAT2VEC to generate random walk-based node embeddings for protein-protein interaction (PPI) networks annotated with disease-specific differential gene expression profiles from GEO, ArrayExpress, and other sources. Following the prediction and evaluation pipeline from Emig, et al. (2013), we built simple generalized linear models using annotations from OpenTargets as a training set and generated rankings for all targets.
Muslu, Ö., Hoyt, C. T., Hofmann-Apitius, M., & Fröhlich, H. (2019). GuiltyTargets: Prioritization of Novel Therapeutic Targets with Deep Network Representation Learning. bioRxiv, 1–14.
Link Prediction with Network Representation Learning
Ali, M., Hoyt, C. T., Domingo-Fernández, D., Lehmann, J., & Jabeen, H. (2019). BioKEEN: A library for learning and evaluating biological knowledge graph embeddings. Bioinformatics, btz117.
Side Effect Prediction with Network Representation Learning
We generated node and edge embeddings for a graph of drugs, their chemical similarities, their side effects, their indications, their targets in order to predict side effects for new chemicals as well as to propose mechanisms of action for some side effects.
- master’s thesis in preparation
Drug Repositioning with Network Representation Learning
- master’s thesis in preparation