Statistical Approaches for Large-Scale and Complex Omics Data

dc.contributor.advisor: Li, Meng [en_US]
dc.contributor.advisor: Morris, Jeffrey S. [en_US]
dc.creator: Liu, Yusha [en_US]
dc.date.accessioned: 2019-12-06T19:50:36Z [en_US]
dc.date.available: 2020-12-01T06:01:11Z [en_US]
dc.date.created: 2019-12 [en_US]
dc.date.issued: 2019-12-05 [en_US]
dc.date.submitted: December 2019 [en_US]
dc.date.updated: 2019-12-06T19:50:36Z [en_US]
dc.description.abstract: In this thesis, we propose several novel statistical approaches for analyzing large-scale and complex omics data. The thesis consists of three projects. In the first project, with the goal of characterizing gene-level relationships between DNA methylation and gene expression, we introduce a sequential penalized regression approach to identify methylation-expression quantitative trait loci (methyl-eQTLs). We coin this term to denote, for each gene and tissue type, a sparse set of CpG loci that best explains gene expression, together with weights indicating the direction and strength of association. These loci and weights can be used to construct gene-level methylation summaries that are maximally correlated with gene expression for use in integrative models. Using TCGA and MD Anderson colorectal cohorts to build and validate our models, we demonstrate that our strategy explains expression variability much better than commonly used integrative methods. In the second project, we propose a unified Bayesian framework to perform quantile regression on functional responses (FQR). Our approach represents functional coefficients with basis functions to borrow strength from nearby locations, and places a global-local shrinkage prior on the basis coefficients to achieve adaptive regularization. We develop a scalable Gibbs sampler to implement the approach. Simulation studies show that our method outperforms competing methods. We apply our method to a mass spectrometry dataset and identify proteomic biomarkers of pancreatic cancer that were entirely missed by mean-regression based approaches. The third project is a theoretical investigation of the FQR problem that extends the second project. We propose an interpolation-based estimator that can be strongly approximated by a sequence of Gaussian processes, from which we derive the convergence rate of the estimator and construct simultaneous confidence bands for the functional coefficient. The strong approximation results also provide a theoretical foundation for developing alternative approaches that are shown to have better finite-sample performance in simulation studies. [en_US]
dc.embargo.terms: 2020-12-01 [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Liu, Yusha. "Statistical Approaches for Large-Scale and Complex Omics Data." (2019) Diss., Rice University. https://hdl.handle.net/1911/107813. [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/107813 [en_US]
dc.language.iso: eng [en_US]
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. [en_US]
dc.subject: Integrative Genomics [en_US]
dc.subject: Penalized Regression [en_US]
dc.subject: Functional Data Analysis [en_US]
dc.subject: Quantile Regression [en_US]
dc.subject: Bayesian Hierarchical Modeling [en_US]
dc.subject: Proteomics [en_US]
dc.title: Statistical Approaches for Large-Scale and Complex Omics Data [en_US]
dc.type: Thesis [en_US]
dc.type.material: Text [en_US]
thesis.degree.department: Statistics [en_US]
thesis.degree.discipline: Engineering [en_US]
thesis.degree.grantor: Rice University [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.name: Doctor of Philosophy [en_US]
Files
Original bundle
Name: LIU-DOCUMENT-2019.pdf
Size: 7.32 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.6 KB
Format: Plain Text