Category Archives: Publications

Yeonghun Lee, Sehhoon Park, Se-Hoon Lee$, Hyunju Lee$ (2017) Characterization of Genetic Aberrations in a Single Case of Metastatic Thymic Adenocarcinoma. BMC Cancer, 2017 May 15;17(1):330. (IF: 3.265) (JCR 2015: 85/213, 39.9%, Oncology)

Characterization of Genetic Aberrations in a Single Case of Metastatic Thymic Adenocarcinoma.

  • Author : Yeonghun Lee, Sehhoon Park, Se-Hoon Lee$, and Hyunju Lee$
  • Published Date : 2017
  • Category : Bioinformatics and Text Mining 
  • Place of publication : BMC Cancer

 

BACKGROUND:

Thymic adenocarcinoma is an extremely rare subtype of thymic epithelial tumors. Due to its rarity, there is currently no sequencing approach for thymic adenocarcinoma.

METHODS:

We performed whole exome and transcriptome sequencing on a case of thymic adenocarcinoma and performed subsequent validation using Sanger sequencing.

RESULTS:

The case of thymic adenocarcinoma showed aggressive behaviors with systemic bone metastases. We identified a high incidence of genetic aberrations, which included somatic mutations in RNASEL, PEG10, TNFSF15, TP53, TGFB2, and FAT1. Copy number analysis revealed a complex chromosomal rearrangement of chromosome 8, which resulted in gene fusion between MCM4 and SNTB1 and dramatic amplification of MYC and NDRG1. Focal deletion was detected at human leukocyte antigen (HLA) class II alleles, which was previously observed in thymic epithelial tumors. We further investigated fusion transcripts using RNA-seq data and found an intergenic splicing event between the CTBS and GNG5 transcript. Finally, enrichment analysis using all the variants represented the immune system dysfunction in thymic adenocarcinoma.

CONCLUSION:

Thymic adenocarcinoma shows highly malignant characteristics with alterations in several cancer-related genes.

Seungchul Lee#, Jingu Lee#, Sung Hoon Sim#, Yeonghun Lee, Kyung Chul Moon, Cheol Lee, Woong-Yang Park, Nayoung K. D. Kim, Se-Hoon Lee$, and Hyunju Lee$ (2017) Comprehensive somatic genome alterations of urachal carcinoma. Journal of Medical Genetics, Online Published (March 27 2017) (IF: 5.650) (JCR 2015: 19/166, 11.45%, GENETICS & HEREDITY)

Comprehensive somatic genome alterations of urachal carcinoma.

  • Author : Seungchul Lee#, Jingu Lee#, Sung Hoon Sim#, Yeonghun Lee, Kyung Chul Moon, Cheol Lee, Woong-Yang Park, Nayoung K. D. Kim, Se-Hoon Lee$, and Hyunju Lee$
  • Published Date : 2017
  • Category : Bioinformatics and Text Mining 
  • Place of publication : Journal of Medical Genetics

 

Abstract

Background: Urachal cancer is a rare cancer that develops in the urachus. Because of its rarity, standard treatment therapies for urachal cancer are not established, and chemotherapeutic regimens for bladder cancer have been unsuccessful for patients with urachal cancer. Hence, we aim to understand a systematic molecular characterization of urachal cancer.

Methods: We identified somatic single nucleotide variations (SNVs)/indels and somatic copy number aberrations (SCNAs) in 17 patients by using whole-exome sequencing (WES) and the OncoScan™ platform (Affymetrix) as follows: tumour-normal paired sequencing (WES, n = 10), tumour-only sequencing (WES, n = 1; targeted deep sequencing, n = 16), and OncoScan™ (n = 17).

Results: Our analyses identified 27 genes with somatic SNVs and indels, as well as six genes (APC, COL5A1, KIF26B, LRP1B, SMAD4, and TP53) that were recurrent in at least two patients. By analysing the SCNAs, we found that the extent of chromosomal amplification was highly associated with the patient’s cancer stage. Interestingly, 35% (6/17) of the patients had focal DNA amplifications in FGFR family genes. The integration of somatic SNVs, indels, and SCNAs revealed significant alterations in the MAPK signalling pathways.

Conclusions: Our genome-wide analysis of urachal cancer suggests that molecular characteristics may be important for the treatment of urachal cancer.

Jaeyong Kang and Hyunju Lee* (2017) Modeling User Interest in Social Media using News Media and Wikipedia. Information Systems, 2017 April 01; 65:52-64 (IF: 1.832) (JCR 2016: 34/144, 23.61%, COMPUTER SCIENCE, INFORMATION SYSTEMS).

Modeling User Interest in Social Media using News Media and Wikipedia.

  • Author : Jaeyong Kang and Hyunju Lee
  • Published Date : 2017
  • Category : Social Media
  • Place of publication : Information Systems

 

Abstract

Social media has become an important source of information and a medium for following and spreading trends, news, and ideas all over the world. Although determining the subjects of individual posts is important to extract users’ interests from social media, this task is nontrivial because posts are highly contextualized and informal and have limited length. To address this problem, we propose a user modeling framework that maps the content of texts in social media to relevant categories in news media. In our framework, the semantic gaps between social media and news media are reduced by using Wikipedia as an external knowledge base. We map term-based features from a short text and a news category into Wikipedia-based features such as Wikipedia categories and article entities. A user’s microposts are thus represented in a rich feature space of words. Experimental results show that our proposed method using Wikipedia-based features outperforms other existing methods of identifying users’ interests from social media.

Jeongkyun Kim, Jung-jae Kim and Hyunju Lee* (2017) An analysis of disease-gene relationship from Medline abstracts by DigSee. Scientific Reports, 2017 January 05; 7:40154 (IF: 5.228) (JCR 2015: 7/63, 11.3%, MULTIDISCIPLINARY SCIENCES).

An analysis of disease-gene relationship from Medline abstracts by DigSee.

  • Author : Jeongkyun Kim, Jung-jae Kim and Hyunju Lee
  • Published Date : 2017
  • Category : Bioinformatics and Text Mining 
  • Place of publication : Scientific Reports

 

Abstract

Diseases are developed by abnormal behavior of genes in biological events such as gene regulation, mutation, phosphorylation, and epigenetics and post-translational modification. Many studies of text mining attempted to identify the relationship between gene and disease by mining the literature, but they did not consider the biological events in which genes show abnormal behavior in response to diseases. In this study, we propose to identify disease-related genes that are involved in the development of disease through biological events from Medline abstracts. We identified associations between 13,054 genes and 4,494 disease types, which cover more disease-related genes than manually curated databases for all disease types (e.g., Online Mendelian Inheritance in Man) and also than those for specific diseases (e.g., Alzheimer’s disease and hypertension). We show that the text mining findings are reliable, as per the PubMed scale, in that the disease-disease relationships inferred from the literature-wide findings are similar to those inferred from manually curated databases in a well-known study. In addition, literature-wide distribution of biological events across disease types reveals different characteristics of disease types.

Jiyoun Seo, Daeyong Jin, Chan-Hun Choi and Hyunju Lee* (2017) Integration of MicroRNA, mRNA, and Protein Expression Data for the Identification of Cancer-Related MicroRNAs. PLoS One, 2017 January 5; 12(1):e0168412 (IF: 3.057) (JCR 2015: 11/63, 17.5%, MULTIDISCIPLINARY SCIENCES).

Integration of MicroRNA, mRNA, and Protein Expression Data for the Identification of Cancer-Related MicroRNAs.

  • Author : Jiyoun Seo, Daeyong Jin, Chan-Hun Choi, and Hyunju Lee
  • Published Date : 2017
  • Category : Bioinformatics and Text Mining 
  • Place of publication : PLoS One

 

Abstract

MicroRNAs (miRNAs) are responsible for the regulation of target genes involved in various biological processes, and may play oncogenic or tumor suppressive roles. Many studies have investigated the relationships between miRNAs and their target genes, using mRNA and miRNA expression data. However, mRNA expression levels do not necessarily represent the exact gene expression profiles, since protein translation may be regulated in several different ways. Despite this, large-scale protein expression data have been integrated rarely when predicting gene-miRNA relationships. This study explores two approaches for the investigation of gene-miRNA relationships by integrating mRNA expression and protein expression data. First, miRNAs were ranked according to their effects on cancer development. We calculated influence scores for each miRNA, based on the number of significant mRNA-miRNA and protein-miRNA correlations. Furthermore, we constructed modules containing mRNAs, proteins, and miRNAs, in which these three molecular types are highly correlated. The regulatory interactions between miRNA and genes in these modules have been validated based on the direct regulations, indirect regulations, and co-regulations through transcription factors. We applied our approaches to glioblastomas (GBMs), ranked miRNAs depending on their effects on GBM, and obtained 52 GBM-related modules. Compared with the miRNA rankings and modules constructed using only mRNA expression data, the rankings and modules constructed using mRNA and protein expression data were shown to have better performance. Additionally, we experimentally verified that miR-504, highly ranked and included in the identified modules, plays a suppressive role in GBM development. We demonstrated that the integration of both expression profiles allows a more precise analysis of gene-miRNA interactions and the identification of a higher number of cancer-related miRNAs and regulatory mechanisms.
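
As a rough illustration of the influence score described in the abstract above, the following sketch counts, for one miRNA, how many mRNA and protein expression profiles correlate strongly with it. The abstract does not state the significance test or the exact score definition, so the fixed threshold on the absolute Pearson correlation, the function names, and the toy profiles are all assumptions made for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)

def influence_score(mirna_profile, mrna_profiles, protein_profiles, threshold=0.8):
    """Count mRNA and protein profiles with |r| >= threshold against the miRNA.

    ASSUMPTION: the threshold stands in for the paper's significance test,
    which the abstract does not specify.
    """
    mrna_hits = sum(
        abs(pearson(mirna_profile, p)) >= threshold for p in mrna_profiles.values()
    )
    protein_hits = sum(
        abs(pearson(mirna_profile, p)) >= threshold for p in protein_profiles.values()
    )
    return mrna_hits + protein_hits

# Hypothetical toy expression profiles across five samples.
mirna = [1, 2, 3, 4, 5]
mrna = {"GENE1": [5, 4, 3, 2, 1], "GENE2": [2, 1, 2, 1, 2]}
protein = {"GENE1": [4.8, 4.1, 3.2, 1.9, 1.0], "GENE2": [1, 3, 2, 3, 1]}

score = influence_score(mirna, mrna, protein)
print(score)
```

Counting mRNA and protein hits separately before summing keeps the two data types visible, which is the point of integrating protein expression rather than relying on mRNA alone.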

 

Wonjun Choi, Baeksoo Kim, Hyejin Cho, Doheon Lee and Hyunju Lee* (2016) A corpus for plant-chemical relationships in the biomedical domain. BMC Bioinformatics, 2016 September 20; 17:386 (IF: 2.435) (JCR 2015: 10/56, 17.9%, MATHEMATICAL & COMPUTATIONAL BIOLOGY).

A corpus for plant-chemical relationships in the biomedical domain.

 

Abstract

Background: Plants are natural products that humans consume in various ways including food and medicine. They have a long empirical history of treating diseases with relatively few side effects. Based on these strengths, many studies have been performed to verify the effectiveness of plants in treating diseases. It is crucial to understand the chemicals contained in plants because these chemicals can regulate activities of proteins that are key factors in causing diseases. With the accumulation of a large volume of biomedical literature in various databases such as PubMed, it is possible to automatically extract relationships between plants and chemicals in a large-scale way if we apply a text mining approach. A cornerstone of achieving this task is a corpus of relationships between plants and chemicals.

Results: In this study, we first constructed a corpus for plant and chemical entities and for the relationships between them. The corpus contains 267 plant entities, 475 chemical entities, and 1,007 plant–chemical relationships (550 and 457 positive and negative relationships, respectively), which are drawn from 377 sentences in 245 PubMed abstracts. Inter-annotator agreement scores for the corpus among three annotators were measured. The simple percent agreement scores for entities and trigger words for the relationships were 99.6 and 94.8 %, respectively, and the overall kappa score for the classification of positive and negative relationships was 79.8 %. We also developed a rule-based model to automatically extract such plant–chemical relationships. When we evaluated the rule-based model using the corpus and randomly selected biomedical articles, overall F-scores of 68.0 and 61.8 % were achieved, respectively.

Conclusion: We expect that the corpus for plant–chemical relationships will be a useful resource for enhancing plant research. The corpus is available at http://combio.gist.ac.kr/plantchemicalcorpus.

Corpus URL: http://combio.gist.ac.kr/herding
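
The agreement and evaluation measures quoted in the abstract above (simple percent agreement, kappa, and F-score) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it computes Cohen's kappa for two annotators, whereas the paper reports an overall score among three annotators, and the labels and counts below are hypothetical toy values.

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which two annotators give the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    p_expected = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)

def f_score(tp, fp, fn):
    """Balanced F-score (F1) from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical relationship labels from two annotators (not corpus data).
annotator_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
annotator_2 = ["pos", "pos", "neg", "pos", "pos", "neg", "pos", "neg"]

print(percent_agreement(annotator_1, annotator_2))
print(cohen_kappa(annotator_1, annotator_2))
print(f_score(tp=6, fp=2, fn=2))
```

Kappa is lower than raw percent agreement because it discounts the agreement two annotators would reach by chance given how often each uses each label.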

Daeyong Jin and Hyunju Lee* (2016) Prioritizing cancer-related microRNAs by integrating microRNA and mRNA datasets. Scientific Reports, 2016 October 13; 6:35350 (IF: 5.228) (JCR 2015: 7/62, 11.3%, MULTIDISCIPLINARY SCIENCES).

Prioritizing cancer-related microRNAs by integrating microRNA and mRNA datasets.

  • Author : Daeyong Jin and Hyunju Lee
  • Published Date : 2016
  • Category : Bioinformatics and Text Mining 
  • Place of publication : Scientific Reports

 

Abstract

MicroRNAs (miRNAs) are small non-coding RNAs regulating the expression of target genes, and they are involved in cancer initiation and progression. Even though many cancer-related miRNAs were identified, their functional impact may vary, depending on their effects on the regulation of other miRNAs and genes. In this study, we propose a novel method for the prioritization of candidate cancer-related miRNAs that may affect the expression of other miRNAs and genes across the entire biological network. For this, we propose three important features: the average expression of a miRNA in multiple cancer samples, the average of the absolute correlation values between the expression of a miRNA and expression of all genes, and the number of predicted miRNA target genes. These three features were integrated using order statistics. By applying the proposed approach to four cancer types, glioblastoma, ovarian cancer, prostate cancer, and breast cancer, we prioritized candidate cancer-related miRNAs and determined their functional roles in cancer-related pathways. The proposed approach can be used to identify miRNAs that play crucial roles in driving cancer development, and the elucidation of novel potential therapeutic targets for cancer treatment.
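
The integration of the three features by order statistics, as described in the abstract above, can be illustrated with one common formulation: each feature is converted to a rank ratio, and the joint cumulative distribution of uniform order statistics is evaluated at the sorted ratios, so that a smaller value indicates a miRNA that is consistently near the top. Whether the paper uses this exact formula is an assumption, as are the direction of each feature (higher is treated as better), the miRNA names, and the feature values.

```python
from math import factorial

def rank_ratios(values, higher_is_better=True):
    """Map each value to rank / n, where rank 1 is the best value.

    Ties are broken by input order (sorted() is stable).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=higher_is_better)
    ratios = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        ratios[idx] = rank / n
    return ratios

def q_statistic(ratios):
    """Joint CDF of N uniform order statistics at the sorted rank ratios.

    Recurrence: V_0 = 1,
    V_k = sum_{i=1..k} (-1)^(i-1) * V_{k-i} / i! * r_{N-k+1}^i, and Q = N! * V_N.
    """
    r = sorted(ratios)
    n = len(r)
    v = [1.0]  # V_0
    for k in range(1, n + 1):
        r_k = r[n - k]  # r_{N-k+1} in 1-based notation
        v.append(sum(
            (-1) ** (i - 1) * v[k - i] / factorial(i) * r_k ** i
            for i in range(1, k + 1)
        ))
    return factorial(n) * v[n]

# Hypothetical features per miRNA:
# (mean expression, mean absolute correlation with genes, number of predicted targets)
features = {
    "miR-A": (9.1, 0.42, 310),
    "miR-B": (7.5, 0.35, 120),
    "miR-C": (8.2, 0.15, 450),
    "miR-D": (3.0, 0.10, 40),
}
names = list(features)
per_feature = [rank_ratios([features[m][j] for m in names]) for j in range(3)]
scores = {
    m: q_statistic([per_feature[j][i] for j in range(3)])
    for i, m in enumerate(names)
}
ranking = sorted(names, key=scores.get)  # smallest Q first
print(ranking)
```

Working on rank ratios rather than raw values makes the three features comparable even though they are measured on very different scales.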

 

Baeksoo Kim, Jihoon Jo, Jonghyun Han, Chungoo Park* and Hyunju Lee* (2016) In silico re-identification of properties of drug target proteins. Accepted to BMC Medical Informatics and Decision Making. (Presented at DTMBIO 2016 in conjunction with CIKM, Indianapolis, USA)

In silico re-identification of properties of drug target proteins.

  • Author : Baeksoo Kim, Jihoon Jo, Jonghyun Han, Chungoo Park and Hyunju Lee
  • Published Date : 2016
  • Category : Bioinformatics
  • Place of publication : BMC Medical Informatics and Decision Making

 

Abstract

Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated. In this study, we first confirm previously known properties of drug targets with a higher statistical power by analyzing larger sets of drugs and targets. We then suggest new properties, such as gene essentiality, gene expression levels, tissue specificity, and solvent accessibility. We predict drug targets based on these features using a support vector machine and a random forest method. We believe that our study will provide a new aspect in inferring drug-target interactions.

Jonghyun Han and Hyunju Lee* (2016) Characterizing the interests of social media users: Refinement of a topic model for incorporating heterogeneous media. Information Sciences, 2016 September 01; 358-359:112-128 (IF: 3.364) (JCR 2015: 8/144, 5.56%, COMPUTER SCIENCE, INFORMATION SYSTEMS)

Characterizing the interests of social media users: Refinement of a topic model for incorporating heterogeneous media.

  • Author : Jonghyun Han and Hyunju Lee
  • Published Date : 2016
  • Category : Mining in Social Network
  • Place of publication : Information Sciences

 

Abstract

Recent research has focused on extracting personal interest data from social media. Although many methods have been developed, accurately estimating users’ interests is often difficult because messages on social media are short and are not classified into any predefined categories. We propose a new method to overcome this problem by incorporating heterogeneous media, such as news. In our method, we first extract explicit features and implicit topics of categories using news media, where implicit topics are determined using a refined topic model. Next, we describe social media messages using these features and topics to estimate users’ interests. Compared with several other approaches, our approach provides more accurate estimations of users’ interests. We also demonstrate that the accuracy of friend recommendations is increased using the users’ interests estimated by our method. Thus, we expect that the proposed approach could be helpful for enhancing the personalization of social media services.

Wangin Kim, Sangbin Park, Chanhun Choi, Youg Ran Kim, Inkyu Park, Changseob Seo, Daehwan Youn, Wook Shin, Yumi Lee, Donghee Choi, Mirae Kim, Hyunju Lee, Seonjong Kim, and Changsu Na (2016) Evaluation of Anti-Inflammatory Potential of the New Ganghwaljetongyeum on Adjuvant-Induced Inflammatory Arthritis in Rats. Evidence-Based Complementary and Alternative Medicine, 2016 June 13; 2016:1230294 (IF: 1.931) (JCR 2015: 7/24, 29.2%, INTEGRATIVE & COMPLEMENTARY MEDICINE).

Evaluation of Anti-Inflammatory Potential of the New Ganghwaljetongyeum on Adjuvant-Induced Inflammatory Arthritis in Rats.

  • Author : Wangin Kim, Sangbin Park, Chanhun Choi, Youg Ran Kim, Inkyu Park, Changseob Seo, Daehwan Youn, Wook Shin, Yumi Lee, Donghee Choi, Mirae Kim, Hyunju Lee, Seonjong Kim, and Changsu Na
  • Published Date : 2016
  • Category : Bioinformatics and Text Mining 
  • Place of publication : Evidence-Based Complementary and Alternative Medicine

 

Abstract

Ganghwaljetongyeum (GHJTY) has been used as a standard treatment for arthritis for approximately 15 years at the Korean Medicine Hospital of Dongshin University. GHJTY is composed of 18 medicinal herbs, of which five primary herbs were selected and named new Ganghwaljetongyeum (N-GHJTY). The purpose of the present study was to observe the effect of N-GHJTY on arthritis and to determine its mechanism of action. After confirming arthritis induction using complete Freund’s adjuvant (CFA) in rats, N-GHJTY (62.5, 125, and 250 mg/kg/day) was administered once a day for 10 days. In order to determine pathological changes, edema of the paws and weight were measured before and for 10 days after N-GHJTY administration. Cytokine (TNF-α, IL-1β, and IL-6) levels and histopathological lesions in the knee joint were also examined. Edema in the paw and knee joint of N-GHJTY-treated rats was significantly decreased at 6, 8, and 10 days after administration, compared to that in the CFA-control group, while weight consistently increased. Rats in N-GHJTY-treated groups also recovered from the CFA-induced pathological changes and showed a significant decline in cytokine levels. Taken together, our results showed that N-GHJTY administration was effective in inhibiting CFA-induced arthritis via anti-inflammatory effects while promoting cartilage recovery by controlling cytokine levels.