An analysis of disease-gene relationship from Medline abstracts by DigSee.
- Author : Jeongkyun Kim, Jung-jae Kim and Hyunju Lee
- Published Date : 2017
- Category : Bioinformatics and Text Mining
- Place of publication : Scientific Reports
Diseases develop through abnormal behavior of genes in biological events such as gene regulation, mutation, phosphorylation, epigenetics, and post-translational modification. Many text mining studies have attempted to identify relationships between genes and diseases by mining the literature, but they did not consider the biological events in which genes show abnormal behavior in response to diseases. In this study, we propose to identify, from Medline abstracts, disease-related genes that are involved in the development of disease through biological events. We identified associations between 13,054 genes and 4,494 disease types, which cover more disease-related genes than manually curated databases for all disease types (e.g., Online Mendelian Inheritance in Man) and than those for specific diseases (e.g., Alzheimer’s disease and hypertension). We show that the text mining findings are reliable at PubMed scale, in that the disease-disease relationships inferred from the literature-wide findings are similar to those inferred from manually curated databases in a well-known study. In addition, the literature-wide distribution of biological events across disease types reveals different characteristics of disease types.
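The abstract does not say how disease-disease relationships are derived from the mined associations. As a rough illustration only, the sketch below assumes one common approach: scoring each pair of diseases by the Jaccard overlap of their associated gene sets. The record layout, the function names, and the toy identifiers (`disease_A`, `gene_1`, and so on) are hypothetical placeholders, not DigSee output or the authors' method.

```python
from itertools import combinations

# Hypothetical toy records of (disease, gene, biological event).
# These are placeholders for illustration, not results from the paper.
associations = [
    ("disease_A", "gene_1", "mutation"),
    ("disease_A", "gene_2", "gene regulation"),
    ("disease_B", "gene_2", "phosphorylation"),
    ("disease_B", "gene_3", "mutation"),
    ("disease_C", "gene_4", "gene regulation"),
]


def genes_by_disease(records):
    """Group the associated genes into one set per disease."""
    grouped = {}
    for disease, gene, _event in records:
        grouped.setdefault(disease, set()).add(gene)
    return grouped


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disease_similarity(records):
    """Score every unordered disease pair by the overlap of their gene sets."""
    grouped = genes_by_disease(records)
    return {
        (d1, d2): jaccard(grouped[d1], grouped[d2])
        for d1, d2 in combinations(sorted(grouped), 2)
    }


similarity = disease_similarity(associations)
for pair, score in sorted(similarity.items()):
    print(pair, round(score, 3))
```

In this toy input, `disease_A` and `disease_B` share one of three distinct genes, so their score is 1/3, while pairs with no shared gene score 0. Whether Jaccard overlap or some other measure was used in the study is not stated in the abstract.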