Bayarbaatar Amgalan and Hyunju Lee (2015) DEOD: Uncovering dominant effects of cancer-driver genes based on a partial covariance selection method. Bioinformatics, 2015 Aug 1;31(15):2452-60. (IF: 4.981) (JCR: 4/52, 7.7%, MATHEMATICAL & COMPUTATIONAL BIOLOGY).

DEOD: Uncovering dominant effects of cancer-driver genes based on a partial covariance selection method.



Motivation: The generation of a large volume of cancer genomes has allowed us to identify disease-related alterations more accurately, which is expected to enhance our understanding regarding the mechanism of cancer development. With genomic alterations detected, one challenge is to pinpoint cancer-driver genes that cause functional abnormalities.

Results: Here, we propose a method for uncovering the dominant effects of cancer-driver genes (DEOD) based on a partial covariance selection approach. Inspired by a convex optimization technique, it estimates the dominant effects of candidate cancer-driver genes on the expression level changes of their target genes. It constructs a gene network as a directed-weighted graph by integrating DNA copy numbers, single nucleotide mutations, and gene expressions from matched tumor samples, and estimates partial covariances between driver genes and their target genes. Then, a scoring function to measure the cancer-driver score for each gene is applied. To test the performance of DEOD, a novel scheme is designed for simulating conditional multivariate normal variables (targets and free genes) given a group of variables (driver genes). When we applied the DEOD method to both the simulated data and breast cancer data, DEOD successfully uncovered driver variables in the simulation data, and identified well-known oncogenes in breast cancer. In addition, two highly ranked genes by DEOD were related to survival time. The copy number amplifications of MYC (8q24.21) and TRPS1 (8q23.3) were closely related to the survival time with p-values = 0.00246 and 0.00092, respectively. The results demonstrate that DEOD can efficiently uncover cancer-driver genes.
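
The simulation idea mentioned in the Results (drawing target and free genes from a multivariate normal distribution conditioned on a group of driver genes) can be illustrated with the textbook conditional-normal formulas. The sketch below is not the authors' MATLAB implementation; the function names, the ordering of variables (drivers first), and the example covariance matrix are all assumptions made for illustration. The conditional covariance it computes is the same Schur complement that defines a partial covariance.

```python
import math
import random


def solve(a, b):
    """Solve a @ x = b (a: n x n, b: n x m) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]  # copy, keep inputs intact
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def cholesky(m):
    """Lower-triangular factor of a symmetric positive definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def conditional_normal(mu, sigma, n_given, x_given):
    """Mean and covariance of variables n_given.. given variables 0..n_given-1.

    Assumes sigma is symmetric positive definite and the given (driver)
    variables come first in mu and sigma.
    """
    d, k = n_given, len(mu)
    s11 = [row[:d] for row in sigma[:d]]
    s12 = [row[d:] for row in sigma[:d]]
    s22 = [row[d:] for row in sigma[d:]]
    w = solve(s11, s12)  # w = S11^{-1} S12
    diff = [x_given[i] - mu[i] for i in range(d)]
    cond_mu = [mu[d + j] + sum(diff[i] * w[i][j] for i in range(d))
               for j in range(k - d)]
    # Schur complement S22 - S21 S11^{-1} S12 (the partial covariance).
    cond_sigma = [[s22[a][b] - sum(s12[i][a] * w[i][b] for i in range(d))
                   for b in range(k - d)] for a in range(k - d)]
    return cond_mu, cond_sigma


def sample_conditional(cond_mu, cond_sigma, rng):
    """Draw one sample from N(cond_mu, cond_sigma) via a Cholesky factor."""
    low = cholesky(cond_sigma)
    z = [rng.gauss(0.0, 1.0) for _ in cond_mu]
    return [m + sum(low[i][j] * z[j] for j in range(i + 1))
            for i, m in enumerate(cond_mu)]


# Hypothetical example: one driver gene followed by two target genes.
mu = [0.0, 0.0, 0.0]
sigma = [[1.0, 0.8, 0.3],
         [0.8, 1.0, 0.2],
         [0.3, 0.2, 1.0]]
cond_mu, cond_sigma = conditional_normal(mu, sigma, n_given=1, x_given=[1.5])
print(cond_mu, cond_sigma)
print(sample_conditional(cond_mu, cond_sigma, random.Random(0)))
```

Repeating `sample_conditional` for each simulated tumor sample would give target values whose dependence on the driver is known in advance, which is what makes such data usable as a benchmark.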

Availability: DEOD was implemented in MATLAB, and the source code and data are available at

Ho Jang, Jeongkyun Kim, and Hyunju Lee (2014) Data and Text Mining for Cancer Research. KIISE, 32 (3):61-70 (March 2014).

Data and Text Mining for Cancer Research.

Paper :   link 

Website :   link

Hee-Jin Lee, Sang-Hyung Shim, Mi-Ryoung Song, Hyunju Lee, and Jong C. Park (2013) CoMAGC: a Corpus with Multi-faceted Annotations of Gene-Cancer Relations. BMC Bioinformatics, 14:323 (14 November 2013) (IF: 3.02)

CoMAGC: a Corpus with Multi-faceted Annotations of Gene-Cancer Relations.

  • Author : Hee-Jin Lee, Sang-Hyung Shim, Mi-Ryoung Song, Hyunju Lee, and Jong C. Park
  • Published Date : 2013
  • Category : Bioinformatics and Text Mining
  • Place of publication : BMC Bioinformatics


Background: To access the large amount of information in biomedical literature about genes implicated in various cancers both efficiently and accurately, the aid of text mining (TM) systems is invaluable. Current TM systems target either gene-cancer relations or biological processes involving genes and cancers, but the former type produces information that is not comprehensive enough to explain how a gene affects a cancer, and the latter does not provide a concise summary of gene-cancer relations.

Result: In this paper, we present a corpus for the development of TM systems that are specifically targeting gene-cancer relations but are still able to capture complex information in biomedical sentences. We describe CoMAGC, a corpus with multi-faceted annotations of gene-cancer relations. In CoMAGC, a piece of annotation is composed of four semantically orthogonal concepts that together express 1) how a gene changes, 2) how a cancer changes and 3) the causality between the gene and the cancer. The multi-faceted annotations are shown to have high inter-annotator agreement. In addition, we show that the annotations in CoMAGC allow us to infer the prospective roles of genes in cancers and to classify the genes into three classes according to the inferred roles. We encode the mapping between multi-faceted annotations and gene classes into 10 inference rules. The inference rules produce results with high accuracy as measured against human annotations. CoMAGC consists of 821 sentences on prostate, breast and ovarian cancers. Currently, we deal with changes in gene expression levels among other types of gene changes.

Availability: The corpus is available at under the terms of the Creative Commons Attribution License (

Conclusions: The corpus will be an important resource for the development of advanced TM systems on gene-cancer relations.

Shinhyuk Kim, Daeyong Jin and Hyunju Lee (2013) Predicting drug-target interactions using drug-drug interactions. PLoS One. 8(11): e80129. (IF: 3.730).

Predicting drug-target interactions using drug-drug interactions.



Computational methods for predicting drug-target interactions have become important in drug research because they can help to reduce the time, cost, and failure rates for developing new drugs. Recently, with the accumulation of data sets on drug side effects and pharmacological data, it has become possible to predict potential drug-target interactions. In this study, we focus on drug-drug interactions (DDI), their adverse effects (DDIAE) and pharmacological information (DDIPharm), and investigate the relationship among chemical structures, side effects, and DDIs from several data sources. DDIPharm data from the STITCH database, DDIAE from, and drug-target pairs from ChEMBL and SIDER were first collected. Then, by applying two machine learning approaches, a support vector machine (SVM) and a kernel-based L1-norm regularized logistic regression (KL1LR), we showed that DDI is a promising feature for predicting drug-target interactions. Next, the accuracies of predicting drug-target interactions using DDI were compared to those obtained using the chemical structure and side effects based on the SVM and KL1LR approaches, showing that DDI was the data source contributing the most to predicting drug-target interactions.


Azad, A.K.M and Lee, H. (2013) Voting-based cancer module identification by combining topological and data-driven properties. PLoS One, 2013 Aug 5;8(8):e70498 (IF: 3.730)

Voting-based cancer module identification by combining topological and data-driven properties.

  • Author : A.K.M. Azad and Hyunju Lee
  • Published Date : 2013
  • Category : Bioinformatics and Text Mining 
  • Place of publication : PLoS One



Recently, computational approaches integrating copy number aberrations (CNAs) and gene expression (GE) have been extensively studied to identify cancer-related genes and pathways. In this work, we integrate these two data sets with protein-protein interaction (PPI) information to find cancer-related functional modules. To integrate CNA and GE data, we first built a gene-gene relationship network from a set of seed genes by enumerating all types of pairwise correlations e.g. GE-GE, CNA-GE, and CNA-CNA over multiple patients. Next, we propose a voting-based cancer module identification algorithm by combining topological and data-driven properties (VToD algorithm) by using the gene-gene relationship network as a source of data-driven information, and the PPI data as topological information. We applied the VToD algorithm to 266 glioblastoma multiforme (GBM) and 96 ovarian carcinoma (OVC) samples that have both expression and copy number measurements, and identified 22 GBM modules and 23 OVC modules. Among 22 GBM modules, 15, 12, and 20 modules were significantly enriched with cancer-related KEGG, BioCarta pathways, and GO terms, respectively. Among 23 OVC modules, 19, 18, and 23 modules were significantly enriched with cancer-related KEGG, BioCarta pathways, and GO terms, respectively. Similarly, we also observed that 9 and 2 GBM modules and 15 and 18 OVC modules were enriched with cancer gene census (CGC) and specific cancer driver genes, respectively. Our proposed module-detection algorithm significantly outperformed other existing methods in terms of both functional and cancer gene set enrichments. Most of the cancer-related pathways from both cancer data sets found in our algorithm contained more than two types of gene-gene relationships, showing strong positive correlations between the number of different types of relationship and CGC enrichment q-values (0.64 for GBM and 0.49 for OVC). 
This study suggests that the identified modules, which contain both expression changes and CNAs, can explain cancer-related activities with greater insight.
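
The first step described above, enumerating GE-GE, CNA-GE, and CNA-CNA correlations between seed genes over multiple patients, can be sketched as follows. This is only an illustration and not the VToD software: the use of Pearson correlation, the `threshold` cut-off, the function names, and the toy data are all assumptions, and the voting step that combines these edges with PPI data is not shown.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def relationship_edges(ge, cna, threshold):
    """List (source, target, type, r) for gene pairs with |r| >= threshold.

    ge and cna map a gene name to its per-patient values (same patient order).
    The threshold is a hypothetical cut-off chosen for illustration.
    """
    edges = []
    for a, b in combinations(sorted(ge), 2):
        candidates = [
            (a, b, "GE-GE", pearson(ge[a], ge[b])),
            (a, b, "CNA-CNA", pearson(cna[a], cna[b])),
            (a, b, "CNA-GE", pearson(cna[a], ge[b])),  # CNA of a vs. GE of b
            (b, a, "CNA-GE", pearson(cna[b], ge[a])),  # CNA of b vs. GE of a
        ]
        edges.extend(e for e in candidates if abs(e[3]) >= threshold)
    return edges


# Toy data for two hypothetical genes measured in four patients.
ge = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [2.0, 4.0, 6.0, 8.0]}
cna = {"GENE_A": [0.0, 1.0, 0.0, 1.0], "GENE_B": [1.0, 0.0, 1.0, 0.0]}
for edge in relationship_edges(ge, cna, threshold=0.9):
    print(edge)
```

Keeping the relationship type on each edge is what would later allow counting how many different types of relationship a module contains.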


  • Software website :   link
  • Paper: link

Kim, J., So, S., Lee, H., Park, J. C., Kim, J.J., and Lee, H. (2013) DigSee: Disease Gene Search Engine with Evidence sentences (version cancer). Nucleic acids research, 41(Web server issue) (IF: 8.026)

DigSee: Disease Gene Search Engine with Evidence sentences (version cancer).

  • Author : Jeongkyun Kim, Seongeun So, Heejin Lee, Jong C. Park, Jung-Jae Kim, and Hyunju Lee
  • Published Date : 2013
  • Category : Bioinformatics and Text Mining 
  • Place of publication : Nucleic acids research



Biological events such as gene expression, regulation, phosphorylation, localization, and protein catabolism play important roles in the development of diseases. Understanding the association between diseases and genes can be enhanced with the identification of involved biological events in this association. Although biological knowledge has been accumulated in several databases and can be accessed through the Web, there is no specialized Web tool yet allowing for a query into the relationship among diseases, genes, and biological events. For this task, we developed DigSee to search Medline abstracts for evidence sentences describing that “genes” are involved in the development of “cancer” through “biological events”. DigSee is available through


Paper :   link 

Website :   link

Media coverage :   전자신문 (Electronic Times) (2013.07.05), 디지털타임스 (Digital Times) (2013.07.04), 경제투데이 (Economy Today) (2013.07.05), 뉴스1 (News1) (2013.07.05)

Jang, H. and Lee, H. (2012) Meta-Analysis of Pain Relief Effects by Laser Irradiation on Joint Areas, Photomedicine and Laser Surgery, 30(8) (IF: 1.255)

Meta-Analysis of Pain Relief Effects by Laser Irradiation on Joint Areas

  • Author : Ho Jang and Hyunju Lee
  • Published Date : 2012
  • Category : Laser therapy
  • Place of publication : Photomedicine and Laser Surgery



Background: Laser therapy has been proposed as a physical therapy for musculoskeletal disorders and has attained popularity because no side effects have been reported after treatment. However, its true effectiveness is still controversial because several clinical trials have reported the ineffectiveness of lasers in treating pain.

Methods: In this systematic review, we investigate the clinical effectiveness of low-level laser therapy (LLLT) on joint pain. Clinical trials on joint pain satisfying the following conditions are included: the laser is irradiated on the joint area, the PEDro scale score is at least 5, and the effectiveness of the trial is measured using a visual analogue scale (VAS). To estimate the overall effectiveness of all included clinical trials, a mean weighted difference in change of pain on VAS was used.

Results: MEDLINE was the main source for the literature search. After the literature search, 22 trials related to joint pain were selected. The average methodological quality score of the 22 trials, consisting of 1014 patients, was 7.96 on the PEDro scale; 11 trials reported positive effects and 11 trials reported negative effects. The mean weighted difference in change of pain on VAS was 13.96 mm (95% CI, 7.24–20.69) in favor of the active LLLT groups. When we only considered the clinical trials in which the energy dose was within the dose range suggested in the review by Bjordal et al. in 2003 and in the World Association for Laser Therapy (WALT) dose recommendation, the mean effect sizes were 19.88 and 21.05 mm in favor of the true LLLT groups, respectively.

Conclusions: The review shows that laser therapy on the joint reduces pain in patients. Moreover, when we restrict the energy doses of the laser therapy to the dose window suggested in the previous study, we can expect more reliable pain relief treatments.
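
As a rough illustration of how a pooled "mean weighted difference" and its 95% confidence interval can be computed, the sketch below uses fixed-effect inverse-variance weighting. The abstract does not state which weighting scheme the review used, so the weighting, the function name, and the per-trial numbers are assumptions and do not reproduce the reported 13.96 mm estimate.

```python
import math


def pooled_mean_difference(diffs, std_errors, z=1.96):
    """Fixed-effect pooled difference with an approximate 95% CI.

    diffs: per-trial difference in change of pain on VAS (mm).
    std_errors: standard error of each per-trial difference (mm).
    Each trial is weighted by the inverse of its variance (an assumption).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, diffs)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, (pooled - z * pooled_se, pooled + z * pooled_se)


# Hypothetical trials: differences in mm on VAS and their standard errors.
trial_diffs = [18.0, 5.0, 22.0, 9.0]
trial_ses = [4.0, 3.0, 6.0, 5.0]
estimate, (ci_low, ci_high) = pooled_mean_difference(trial_diffs, trial_ses)
print(f"pooled difference: {estimate:.2f} mm (95% CI, {ci_low:.2f} to {ci_high:.2f})")
```

Trials with smaller standard errors pull the pooled estimate toward their own result. A random-effects model would be an alternative when trials are heterogeneous, as an even split between positive and negative trials might suggest.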


Paper : link