The Bioinformatics Section of the Biostatistics SRF aims to build and maintain robust, state-of-the-art pipelines for analyzing, interpreting, and visualizing large-scale genomic, transcriptomic, and metabolomic data generated by Markey's cancer research experiments. While these pipelines can be used for general-purpose bioinformatics applications, they are specifically tailored to reveal mutations and complex behaviors of cancer genomes. We work closely with MCC investigators and provide custom bioinformatics solutions for them. Our current services focus on the following areas; new services will be added depending on demand.
- Microarray Processing and Analysis. The Bioinformatics Section has developed a pipeline for microarray data processing and analysis, including data normalization, quality assessment, differential expression identification and visualization, and pathway/functional analysis.
- Next-Generation Sequencing Data Processing and Analysis.
  - DNA-seq data analysis for whole-genome or exome sequencing. The Bioinformatics Section has developed a pipeline for exome-sequencing data analysis, including data quality control, read alignment, variant calling, functional annotation, and the identification of variants that differ with statistical significance across multiple groups.
  - RNA-seq data analysis. The Bioinformatics Section has developed a pipeline for RNA-seq data analysis, including data quality control, read alignment, differential expression identification and visualization, and pathway/functional analysis. Besides gene expression analysis, we also support the discovery of novel alternative splicing, as well as variant calling and fusion detection from RNA-seq data.
- Metabolomics Data Analysis. The Bioinformatics Section provides informatics support for raw and intermediate data analysis of metabolomics datasets, especially stable isotope-resolved metabolomics datasets. Results of these analyses can feed into other biostatistical analyses provided by the section. Custom downstream metabolic modeling and relative flux analyses can be provided on a limited basis.
- Integrative Analysis of Multiple Genomics Datasets. The Bioinformatics Section provides bioinformatics support for analyzing interactions or correlations across multiple genomic datasets. Examples include the integrative analysis of DNA-methylation and RNA-seq data to examine the regulation of global gene expression, the detection of aberrant transcripts using both DNA-seq and RNA-seq data, and correlation analysis between RNA-seq and existing microarray data. The section also supports soft multi-omics integration using CategoryCompare, which integrates omics datasets at the level of annotations. Please contact us for more details.
- Genomic Data Mining. The section uses genomic data repositories such as GEO, Oncomine, and TCGA to correlate genomic data from specific gene(s) of interest with clinical outcomes.
- Other Large-Scale Genomic Data Analysis. The section provides bioinformatics support for other genomic experimental platforms such as the NanoString nCounter system.
- Grant-writing Support. The section will help investigators with genomic study design, sample size/power calculations, data analysis plans, and the writing of bioinformatics sections of grant applications.
- Training and Outreach. It is important that rapport and dialogue are established between biomedical researchers and the bioinformaticians and computational biologists with whom they collaborate. The Bioinformatics Section will advertise new services as they become available and will work with investigators to establish new data analysis pipelines. The section's personnel will also host informational seminars on supported analysis routines and training workshops on commonly used bioinformatics tools, resources, and databases.
Storage and Computational Resources
We have access to the UK Lipscomb High Performance Computing (HPC) cluster. The cluster is built from a large number of commodity servers, a high-speed interconnect, a unified file system, and a large mass storage system. We collaborate with the Cancer Research Informatics Shared Resource Facility to ensure adequate resources for data storage and management.
As an NCI-funded facility, we follow the Biostatistics SRF prioritization mechanism. Specifically, we prioritize support for peer-reviewed, funded investigators; support for the development of external peer-reviewed and pilot applications; and work related to investigator-initiated studies by MCC members. Limited support is provided to investigators free of charge, with the expectation that Biostatistics SRF members be considered for co-authorship or for inclusion as co-investigators on grant applications.
- Department of Biostatistics, Division of Biomedical Informatics
- Kentucky Biomedical Research Infrastructure Network
- Microarray Core Facility
- Advanced Genetic Technologies Center
- Center for Computational Sciences
Download the Bioinformatics poster (PDF, 1.1 MB) from Markey Cancer Research Day 2014 for more information.