IASO: A System for Medical Knowledge Graph Visualization and Curation


We present a system for node-link-based medical knowledge graph (KG) visualization that allows visual inspection of knowledge subgraphs based on a probabilistic graph layout. Specifically, we reuse existing knowledge bases to alleviate the difficulty of building a high-quality knowledge graph, which ranges in size up to 7 million edges. We then provide insights into the probability distributions of the subgraph consisting of individual nodes and edges. The visualization is created by transforming these probability distributions into a two-dimensional embedding using graph layout techniques. Splatting and edge bundling are used to visualize point clouds and graph topology.

In IASO, the Graph Module displays detailed information about the input entity, including its KG subgraph, the connected entities and their corresponding relationships. The Contents Index Module presents the complete hierarchical taxonomy of the KG to make the inheritance relationships easier to understand. The labeled datasets for the antibiotic drug similarity measure can be downloaded from the Data Download Module. The RESTful API allows users to obtain query results in JSON format.

Paper introducing the IASO system:

Shen Y, Yuan K, Dai J, Tang B, Yang M, Lei K*. KGDDS: A System for Drug-Drug Similarity Measure in Therapeutic Substitution based on Knowledge Graph Curation. Journal of Medical Systems, 2019. (SCI, IF: 2.098)



IASO Knowledge Graph


The displayed antibiotics KG was constructed from the DO, IDO, NCBI, HPO and DrugBank databases. It covers 507 infectious diseases and their therapy methods, 332 different infection sites, 936 relevant symptoms of the digestive, reproductive, neurological and other systems, 371 types of complications, 838,407 types of bacteria, 341 types of antibiotics and their introductions, 1,504 pairs of reaction rates (antibacterial spectrum) between antibiotics and bacteria, 431 pairs of drug interaction relationships, and 86 pairs of contraindication relationships between antibiotics and specific populations.


Our antibiotics KG is available for download.

Paper about antibiotics KG construction and diagnosis reasoning:

      Shen Y, Yuan K, Chen D, Colloc J, Yang M, Li Y, Lei K*. An ontology-driven clinical decision support system (IDDAP) for infectious disease diagnosis and antibiotic prescription. Artificial Intelligence in Medicine, 2018, 86, 20-32. (JCR 1, IF: 2.879)

Drug-drug similarity measure


Measuring drug-drug similarity is important but challenging. Significant progress has been made for drugs with sufficient labeled training data. However, handling data skewness and incompleteness with a domain-specific knowledge graph is still relatively new and under-explored territory. In this context, we use the IASO Knowledge Graph we built to aid the drug-drug similarity measure.

     Data Collection. IASO conducts the drug similarity evaluation mainly based on the antibiotic-relevant information in DrugBank. We study the relationships between antibiotics and their corresponding side effects from SIDER, explore the mechanisms of the essential pharmacologic properties of medications from NDF-RT, and extract textual features from more than 500,000 medical papers provided by PubMed.

       Antibiotic Pairs Labeling. To verify the effectiveness of IASO, we conduct experiments on 1,326 pairs of the most commonly used antibiotics. Doctors score the similarity between two antibiotics in the range [0, 1] according to both the antibacterial spectrum and the efficacy of the medicines (see www.iasokg.com). A score of 0 indicates that there is no similarity between the two antibiotics, while 1 implies that they are extremely similar. To make the labeling more accurate, each pair is labeled by at least 3 doctors and the average is taken as the final result. The Pearson coefficient between the scores given by each doctor and the average score ranges from 0.827 to 0.864, while the Spearman coefficient ranges from 0.792 to 0.888; both support the reliability of the doctors' assessments. The labeled antibiotic pairs are divided into a training set and a test set.
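The labeling procedure above can be illustrated with a minimal sketch in Python. It averages the scores of several doctors for each antibiotic pair and then computes the Pearson and Spearman coefficients between each doctor's scores and that average. The doctor names and score values are hypothetical placeholders, not IASO data, and the helper functions are written here for illustration rather than taken from the IASO code base.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical similarity scores in [0, 1]: one list per doctor,
# one entry per antibiotic pair (five pairs here for brevity).
doctor_scores = {
    "doctor_a": [0.9, 0.2, 0.6, 0.1, 0.8],
    "doctor_b": [0.8, 0.3, 0.5, 0.0, 0.9],
    "doctor_c": [1.0, 0.1, 0.7, 0.2, 0.7],
}

n_pairs = len(next(iter(doctor_scores.values())))

# Final label of each pair: the mean over all doctors who scored it.
final_scores = [
    sum(scores[i] for scores in doctor_scores.values()) / len(doctor_scores)
    for i in range(n_pairs)
]

# Agreement of each doctor with the averaged label.
agreement = {
    name: (pearson(scores, final_scores), spearman(scores, final_scores))
    for name, scores in doctor_scores.items()
}

for name, (p, s) in agreement.items():
    print(f"{name}: Pearson={p:.3f}, Spearman={s:.3f}")
```

Coefficients close to 1 for every doctor would indicate that the individual assessments agree well with the averaged label, which is the kind of check reported above.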

Paper about the drug-drug similarity measure:

       Lei K, Yuan K, Zhang Q, Shen Y*. MedSim: A Novel Semantic Similarity Measure in Bio-medical Knowledge Graphs. In The 11th International Conference on Knowledge Science, Engineering and Management (KSEM 2018). Changchun, China, August 17-19, 2018. pp. 479-490.

Paper about drug representation learning:

      Shen Y, Yuan K, Li Y, Tang B, Yang M, Du N, Lei K*. Drug2Vec: Knowledge-aware Feature-driven Method for Drug Representation Learning. In The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2018). Madrid, Spain, December 3-6 2018.

Group Introduction



Kai Lei received his Ph.D. in C.S. from Peking University, China, in 2015, his M.Sc. in C.S. from Columbia University in 1999 and his B.Sc. in C.S. from Peking University in 1998. He worked for companies including the IBM T.J. Watson Research Center, Citigroup, Oracle and Google from 1999 to 2004. He is currently an associate professor in the School of Electronics and Computer Engineering (SECE), Peking University, Shenzhen, and has participated in the CENI project supported by the National Development and Reform Commission since 2016. His research interests include social networks, knowledge graphs, big data technologies and named data networking.



Ying Shen is now an Assistant Research Professor in the School of Electronics and Computer Engineering (SECE) at Peking University, leading the Artificial Intelligence - Knowledge Graph team for medical research. She received her Ph.D. degree from the University of Paris Ouest Nanterre La Défense (France), specializing in Medical & Biomedical Information Science. She received her Erasmus Mundus Master's degree in Natural Language Processing from the University of Franche-Comté (France) and the University of Wolverhampton (England). Her research interests mainly focus on medical informatics, natural language processing and machine learning.



Kaiqi Yuan received her BS degree in computer science from Beijing University of Posts and Telecommunications in 2016. She is working toward her MS degree in computer science at Peking University. Her research interests mainly focus on biomedical informatics, deep learning and knowledge graphs.



Desi Wen received his B.S. and M.S. degrees in Computer Science from Peking University, China, in 2015 and 2018, respectively. He is currently an algorithm engineer at Microsoft Corporation. His research interests mainly focus on the construction and completion of medical knowledge graphs.




Lizhu Zhang received her BS degree in software engineering from Wuhan University, China, in 2015. She is working toward her MS degree in computer science at Peking University. Her research interests mainly focus on knowledge graphs and data mining.



Daoyuan Chen received his BS degree in computer science from the University of Electronic Science and Technology of China in 2016. He is working toward his MS degree in computer science at Peking University. His research interests mainly focus on deep learning and knowledge graphs.



Yang Deng received his BS degree in computer science from Beijing University of Posts and Telecommunications in 2016. He is working toward his MS degree in computer science at Peking University. His research interests mainly focus on deep learning and question answering.



Jingchao Dai received her BS degree in computer science from Beijing University of Posts and Telecommunications in 2018. She is working toward her MS degree in computer science at Peking University. Her research interests mainly focus on biomedical informatics, deep learning and knowledge graphs.



Jiyue Huang visited City University of Hong Kong as an exchange student in 2016 and received her BS degree in electronic information engineering from Tianjin University, China, in 2017. She is working toward her MS degree in computer science at Peking University. Her research interests mainly focus on knowledge graphs and federated learning.




Contact information:

Kai Lei, Ying Shen:
Email: {leik, shenying}{AT}pkusz.edu.cn
Address: 114, Building A, School of Electronics and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen 518055, P.R. China
Academic website: https://netlab.pkusz.edu.cn
                                https://www.researchgate.net/profile/Kai_Lei8
                                https://www.researchgate.net/profile/Ying_Shen21

Selected publications


  1. Shen Y, Colloc J, Jacquet-Andrieu A, Lei K*. Emerging Medical Informatics with Case-Based Reasoning for Aiding Clinical Decision in Multi-Agent System. Journal of Biomedical Informatics, 2015, 56: 307–317. (JCR 1, IF: 2.882)

  2. Shen Y, Yuan K, Chen D, Colloc J, Yang M, Li Y, Lei K*. An ontology-driven clinical decision support system (IDDAP) for infectious disease diagnosis and antibiotic prescription. Artificial Intelligence in Medicine, 2018, 86, 20-32. (JCR 1, IF: 2.879)

  3. Shen Y, Li Y, Si S, Zhang J, Yang M, Lei K*. Gastroenterology Ontology Construction using Synonym Identification and Relation Extraction. IEEE Access, 2018, 6(1), 52095-52104. (JCR 1, IF: 3.557)

  4. Shen Y, Zhang L, Zhang J, Yang M, Tang B, Li Y, Lei K*. CBN: Constructing a Clinical Bayesian Network based on Data from the Electronic Medical Record. Journal of Biomedical Informatics, 2018, 88: 1–10. (JCR 1, IF: 2.882)

  5. Shen Y, Yuan K, Tang B, Li Y, Du N, Yang M, Lei K*. KMR: Knowledge-oriented Medicine Representation Learning for Drug-Drug Interaction and Similarity Computation. Journal of Cheminformatics, 2019, 11(1), 22. (JCR 2, IF: 4.154)

  6. Shen Y, Chen D, Tang B, Yang M, Lei K*. EAPB: Entropy-Aware Path-Based Metric for Ontology Quality. Journal of Biomedical Semantics, 2018, 9(1), 20. (JCR 2, IF: 1.883)

  7. Shen Y, Yuan K, Dai J, Tang B, Yang M, Lei K*. KGDDS: A System for Drug-Drug Similarity Measure in Therapeutic Substitution based on Knowledge Graph Curation. Journal of Medical Systems, 2019. (JCR 4, IF: 2.098)

  8. Shen Y, Li Y, Huang J, Zhang J, Si S, Yang M, Lei K*. Discovering Medical Entity Relations from Texts using Dependency Information. Natural Language Engineering, 2018, 1–21. Cambridge University Press. (JCR 4, IF: 0.8)

  9. Lei K, Zhang J, Xie Y, Wen D, Chen D, Yang M, Shen Y*. Path-based Reasoning with Constrained Type Attention for Knowledge Graph Completion. Neural Computing and Applications, 2019, 1-10. (JCR 1, IF: 4.664)

  10. Lei K, Xie Y, Zhong S, Dai J, Yang M, Shen Y*. Generative Adversarial Fusion Network for Class Imbalance Credit Scoring. Neural Computing and Applications, 2019, 1-10. (JCR 1, IF: 4.664)

  11. Lei K, Liu Y, Zhong S, Liu Y, Xu K, Shen Y, Yang M. Understanding User Behavior in Sina Weibo Online Social Network: A Community Approach. IEEE Access, 2018, 6, 13302-13316. (JCR 1, IF: 3.557)

  12. Lei K, Zhang LZ, Liu Y, Shen Y, Liu CW. An event summarizing algorithm based on the timeline relevance model in Sina Weibo. Science China Information Sciences (SCIS), 61(12), 129101. (JCR 1, IF: 2.188)

  13. Shen Y, Chen D, Yang M, Li Y, Du N, Lei K*. Ontology Evaluation with Path-based Text-aware Entropy Computation. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR 2018). SIGIR: Ann Arbor, Michigan, USA, July 8-12. pp. 881-884. ACM. (CCF A)

  14. Shen Y, Deng Y, Yang M, Li Y, Du N, Fan W, Lei K*. Knowledge-aware Attentive Neural Network for Ranking Question Answer Pairs. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR 2018). SIGIR: Ann Arbor, Michigan, USA, July 8-12. pp. 901-904. ACM. (CCF A)

  15. Deng Y, Xie Y, Li Y, Yang M, Du N, Fan W, Lei K*, Shen Y. Multi-Task Learning with Multi-View Attention for Answer Selection and Knowledge Base Question Answering. In The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19). Honolulu, Hawaii, USA, January 27 – February 1, 2019. ACM (CCF A)

  16. Shen Y, Deng Y, Zhang J, Li Y, Du N, Fan W, Lei K*. IDDAT: An Ontology-Driven Decision Support System for Infectious Disease Diagnosis and Therapy. In The 18th IEEE International Conference on Data Mining (ICDM 2018). Singapore, November 17-20, 2018. (CCF B)

  17. Shen Y, Yuan K, Li Y, Tang B, Yang M, Du N, Lei K*. Drug2Vec: Knowledge-aware Feature-driven Method for Drug Representation Learning. In The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2018). Madrid, Spain, December 3-6 2018. (CCF B)

  18. Deng Y, Shen Y, Yang M, Li Y, Du N, Fan W, Lei K*. Knowledge as A Bridge: Improving Cross-domain Answer Selection with External Knowledge. In The 27th International Conference on Computational Linguistics (COLING 2018). Santa Fe, New-Mexico, USA, August 20-26, 2018. pp. 3295-3305. (CCF B)

  19. Lei K, Chen D, Li Y, Yang M, Du N, Fan W, Shen Y. Cooperative Denoising for Distantly Supervised Relation Extraction. In The 27th International Conference on Computational Linguistics (COLING 2018). Santa Fe, New-Mexico, USA, August 20-26, 2018. pp. 426-436. (CCF B)

  20. Lei K, Yuan K, Zhang Q, Shen Y. MedSim: A Novel Semantic Similarity Measure in Bio-medical Knowledge Graphs. In The 11th International Conference on Knowledge Science, Engineering and Management (KSEM 2018). Changchun, China, August 17-19, 2018. pp. 479-490. (CCF C)

  21. Lei K, Huang J, Si S, Shen Y. Semantic Similarity Measures to Disambiguate Terms in Medical Text. In The 25th International Conference on Neural Information Processing (ICONIP2018). Siem Reap, Cambodia, December 13-16, 2018. pp. 398-409. (CCF C)

  22. Shen Y, Zhang Q, Zhang J, Huang J, Lu Y, Lei K*. Improving Medical Short Text Classification with Semantic Expansion using Word-Cluster Embedding. In The 9th iCatse Conference on Information Science and Applications (ICISA2018). Hong Kong, June 25-27, 2018. pp. 401-412.

  23. Colloc J, Yameogo R, Summons P, Shen Y, Park M, Aronson J E. EPICE an Emotion Fuzzy Vectorial Space for Time Modeling in Medical Decision. In The International Conference on Internet of Things and Machine Learning (IML 2017). Liverpool, United Kingdom, October 17-18, 2017. pp. 29-38.


Project support


  1. This work has been financially supported by the Natural Science Foundation of Guangdong (No. 2018A030313017) and the Shenzhen Key Fundamental Project (JCYJ20170412151008290).

  2. 2016-2018: Research on Key Technologies of Knowledge Reasoning in Medical Diagnosis and Treatment, National Natural Science Foundation of China Youth Science Foundation.

  3. 2016-2019: Research on methods and techniques of knowledge mapping for general practice decision-making, Shenzhen Key Basic Research Project.