Dmitriy Babenko
Karaganda State Medical University
Dmitriy Babenko, PhD, is Research Director of the Scientific Center at Karaganda State Medical University. He received his PhD in 2017; the main focus of his doctoral study was the discriminatory ability of, and concordance between, different subspecies typing methods. He was particularly interested in typing based on whole-genome sequencing (WGS), such as wgMLST. During his studies he developed bioinformatics and data analysis skills. Since last year, Dmitriy has been researching colorectal cancer in a research group at KSMU. He has successfully presented his results at international conferences and congresses.
Background: Colorectal cancer (CRC) is a leading cause of cancer-related mortality worldwide and accounted for over 9% of all cancer cases diagnosed in 2012 (Ferlay, J. et al., 2012). The etiological factors and pathogenic mechanisms underlying CRC development appear to be complex and heterogeneous. Up to 35% of CRC cases are estimated to be attributable to genetic factors (Lichtenstein, P. et al., 2000). To date, gene discovery efforts have identified many CRC susceptibility genes, and several molecular pathways have been described, such as the chromosomal instability, microsatellite instability, and CpG island methylator phenotype pathways (Bogaert, J. et al., 2014). The aim of this study was to perform clustering analysis on oncogenes associated with CRC.
Methods: Thirteen sources, including GWAS, ClinVar, UniProt, COSMIC, HGMD, malacards.org, targetvalidation.org and others, were used to select genes associated with CRC. The OncoScore R/Bioconductor package was used to estimate the strength of each gene's association with cancer. The integrative GeneCards® database was used to obtain comprehensive information on the oncogenes. The combined BiologicalProcesses, CellularComponents, MolecularFunctions and Pathways data formed a binary profile for each gene, which was used for clustering analysis. A minimum spanning tree (MST) was generated with SeqSphere+ software (Ridom).
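The profile-and-tree step described in the Methods can be illustrated with a minimal sketch. It is not the SeqSphere+ procedure: the gene names and annotation terms are hypothetical toy data, and the use of Hamming distance between binary profiles and of Prim's algorithm for the tree are assumptions made only for illustration.

```python
# Hypothetical toy annotations (gene -> set of annotation terms); not study data.
annotations = {
    "GENE_A": {"BP:apoptosis", "CC:nucleus", "PW:Wnt"},
    "GENE_B": {"BP:apoptosis", "CC:nucleus"},
    "GENE_C": {"BP:cell_cycle", "CC:cytoplasm", "MF:kinase"},
    "GENE_D": {"BP:cell_cycle", "CC:cytoplasm"},
}


def binary_profiles(annotations):
    """Encode each gene as a 0/1 vector over the sorted union of all terms."""
    terms = sorted(set().union(*annotations.values()))
    profiles = {
        gene: [1 if term in gene_terms else 0 for term in terms]
        for gene, gene_terms in annotations.items()
    }
    return terms, profiles


def hamming(p, q):
    """Number of positions at which two binary profiles differ (assumed metric)."""
    return sum(x != y for x, y in zip(p, q))


def minimum_spanning_tree(profiles):
    """Prim's algorithm on the complete graph of genes.

    Returns a list of (distance, gene_in_tree, gene_added) edges.
    """
    genes = sorted(profiles)
    in_tree = {genes[0]}
    edges = []
    while len(in_tree) < len(genes):
        # Pick the cheapest edge that connects the tree to a gene outside it.
        best = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in genes
            if v not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges


terms, profiles = binary_profiles(annotations)
mst_edges = minimum_spanning_tree(profiles)
for distance, u, v in mst_edges:
    print(f"{u} - {v}: {distance}")
```

In this toy example the two genes sharing apoptosis and nucleus terms sit close together, as do the two sharing cell-cycle and cytoplasm terms, and a single longer edge joins the two groups. That is the kind of cluster-like structure an MST can expose.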
Results: Of the 1708 genes found to be associated with CRC, 835 (48.9%) protein-coding genes were classified as oncogenes based on OncoScore (> 25). In total, 9906 unique parameters were determined for the oncogenes, including 5135 biological processes, 668 cellular components, 1169 molecular functions and 2934 pathways. The MST of the 835 genes revealed a cluster-like structure with 3 major and several minor clusters (Figure 1). The genes most strongly associated with CRC (the top 10 genes from different databases) were distributed across different clusters (shown in red). Data for some clusters are listed in Table 1.
Conclusions: Cluster analysis of the oncogenes demonstrated a cluster-like (grouped) structure based on biological process, molecular function, pathway and cellular localization data. Cluster-forming genes had low involvement in various biological processes and metabolic pathways, although they had a sufficiently high oncogenic potential. The genes most strongly associated with CRC were located far from the cluster centers and had high involvement in various biological processes and metabolic pathways.
Integrating Big Data (genome data, pharmacogenomics, therapeutic applications of genome ed