Identification of markers associated with gene expression and DNA methylation data in cancers


Cancer is among the most dangerous diseases and is therefore the subject of sustained research. Many cancers are now curable if they are treated at an early stage, so researchers tend to focus on early detection. Among the many fields that look for signs of cancer occurrence, one is epigenetics. For example, DNA methylation, which occurs mainly at CpG islands, regulates gene expression, and hypermethylation can inhibit the expression of tumor suppressor genes. We study this mechanism together with gene expression in order to detect cancer at an early stage. For each gene, we divide the samples into two groups by DNA methylation status and compare how the gene is expressed in each group. In this way we obtain significant genes that carry information linking DNA methylation and gene expression.

Many methodologies have so far been applied to DNA methylation and gene expression data. Experiments with 5-azacytidine (McGhee & Ginder, 1979) opened up this area, and DNA methylation (DM) data have since been used to study not only cancer but also various other diseases (Movassagh et al., 2010).

To find improved methods, we reviewed papers on the relationship between copy number (CN) and gene expression (GE) (TCGA Research Network, 2008) and tried to apply one such method (Hyman et al., 2002). However, it was difficult to determine a threshold for deciding whether a sample is hypermethylated. We found two ways to set this threshold; one of them is epigenetic variable outliers for risk prediction analysis (Teschendorff et al., 2012). In this study, we used 77 normal samples and 286 tumor samples to obtain the threshold that separates DNA methylation status. With this threshold we could divide the samples into two groups and then compute a weight indicating how important each gene is. Among these genes, we kept only those that were differentially expressed relative to normal samples according to a t-test.
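The grouping step for a single gene can be illustrated with a minimal sketch. It is not the procedure used in this study: the cutoff rule (mean plus k standard deviations of the normal samples), the function names, and the toy values are all assumptions made for illustration, and Welch's t statistic stands in for the t-test.

```python
from math import sqrt
from statistics import mean, stdev, variance


def hypermethylation_threshold(normal_beta, k=2.0):
    """Hypothetical cutoff: mean + k * SD of methylation in normal samples."""
    return mean(normal_beta) + k * stdev(normal_beta)


def split_by_methylation(tumor_beta, tumor_expr, threshold):
    """Split tumor expression values by the methylation status of each sample."""
    pairs = list(zip(tumor_beta, tumor_expr))
    hyper = [expr for beta, expr in pairs if beta > threshold]
    other = [expr for beta, expr in pairs if beta <= threshold]
    return hyper, other


def welch_t(a, b):
    """Welch's t statistic; each group needs at least two values."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Toy data for one gene (made-up values, for illustration only).
normal_beta = [0.10, 0.12, 0.08, 0.11, 0.09]
tumor_beta = [0.09, 0.55, 0.60, 0.10, 0.70, 0.12]
tumor_expr = [8.1, 3.2, 2.9, 7.8, 3.5, 8.4]

threshold = hypermethylation_threshold(normal_beta)
hyper, other = split_by_methylation(tumor_beta, tumor_expr, threshold)
t_stat = welch_t(hyper, other)
```

In this toy case the hypermethylated group has lower expression, so the statistic is negative, which is the pattern expected for a silenced tumor suppressor gene.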

We applied the genes obtained above in two kinds of analysis. First, we identified cancer-related functions using a pathway enrichment test. Second, we performed a hypergeometric test to assess the association of these genes with gene-sample modules, each built from information shared between a subset of genes and a subset of samples. In summary, by relating DNA methylation to GE, we sought candidate cancer genes and cancer-enriched pathways, and we used the gene-sample modules to check that our genes behave significantly.
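As a rough illustration of the hypergeometric test, the sketch below computes the probability of seeing at least the observed overlap between a candidate gene list and the genes of one module. The function name, the parameter names, and the toy counts are assumptions and do not come from the study.

```python
from math import comb


def hypergeom_pvalue(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric variable.

    population: number of background genes
    successes:  number of background genes that belong to the module
    draws:      number of candidate genes
    observed:   number of candidate genes found in the module
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Toy counts: 20 background genes, 5 in the module, 4 candidates, 3 overlapping.
p_value = hypergeom_pvalue(population=20, successes=5, draws=4, observed=3)
```

A small p-value suggests that the overlap is larger than expected by chance; in practice the values would also need correction for testing many modules.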