Characterization of cancer development and recurrence through mathematical and statistical modeling

dc.contributor.advisorWang, Wenyi
dc.contributor.advisorKimmel, Marek
dc.creatorNguyen, Hoai Nam
dc.date.accessioned2024-05-22T15:42:52Z
dc.date.created2024-05
dc.date.issued2024-04-10
dc.date.submittedMay 2024
dc.date.updated2024-05-22T15:42:52Z
dc.descriptionEMBARGO NOTE: This item is embargoed until 2025-05-01
dc.description.abstractLi-Fraumeni syndrome (LFS) is a genetic disorder characterized by deleterious germline mutations in the TP53 tumor suppressor gene. Due to compromised DNA repair mechanisms, patients with LFS are significantly more likely to develop a spectrum of cancer types. Furthermore, it is not uncommon for LFS patients to develop multiple primary cancers. Two risk prediction models were developed for LFS: (i) a cancer-specific model that predicts cancer-specific risks for the first primary and (ii) a multiple primary cancer model that predicts the risk of a second primary without distinguishing between cancer types. Although they have been validated on research cohorts, it is essential to show that they perform well on a clinical cohort, which more closely resembles the patient data that are observed in real counseling sessions. In the first project, we validate the models in both discrimination and calibration via the Area Under the Curve (AUC) and Observed/Expected (O/E) ratio, respectively, on a dataset collected from the Clinical Cancer Genetics program at MD Anderson Cancer Center (MDACC). To expedite the dissemination of these models, we further refine the associated software tools, LFSPRO and LFSPROShiny. A major limitation of the previous models is that they do not predict cancer-specific risks beyond the first primary. In statistical survival analysis, multiple primary cancers can be regarded as recurrent events, and different cancer types can be regarded as non-terminal competing risks. Although many models have been proposed to address these two phenomena separately, a unified statistical framework remains a gap in knowledge. In the second project, we develop a generalized and interpretable Bayesian model that fully accounts for the complex relationships between the recurrent events.
We use a non-homogeneous Poisson process to model the occurrence processes of the competing risks, each of which is characterized by a time-dependent intensity function that follows a Cox regression model. For family datasets, we further introduce frailty terms to capture within-family correlations that are induced by the unobserved covariates, and recursively compute the family-wise likelihood via the Elston-Stewart peeling algorithm to account for the dependence of family members through missing genotypes. The model parameters are estimated via a Metropolis-Hastings-within-Gibbs sampling scheme. We train and cross-validate our model on an LFS patient cohort that is prospectively collected at MDACC. In the third project, we perform a much more extensive validation of the model on independent patient cohorts from major cancer institutes across the United States. Stem cells are closely related to cancer. Given their ability to develop into many different cell types, stem cell transplants can be used to replace cells that are damaged by high doses of radiotherapy and chemotherapy, thus accelerating the process of cancer treatment. On the other hand, stem cells survive much longer than ordinary cells, and are thus more likely to accumulate harmful genetic mutations, which have the potential to trigger carcinogenesis. During cell division, a stem cell renews itself and forms a progenitor cell, which continues to differentiate into the target cell type. In the last project, we mathematically describe this process using a two-type age-dependent branching process. By deriving closed-form expressions of the probability generating functions, we study the behavior of such a process in both finite time and large time under different dynamics of the two cell types, which correspond to various biological scenarios.
dc.embargo.lift2025-05-01
dc.embargo.terms2025-05-01
dc.format.mimetypeapplication/pdf
dc.identifier.citationNguyen, Hoai Nam. Characterization of cancer development and recurrence through mathematical and statistical modeling. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/116159
dc.identifier.urihttps://hdl.handle.net/1911/116159
dc.language.isoeng
dc.rightsCopyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subjectrisk prediction
dc.subjectLi-Fraumeni syndrome
dc.subjectBayesian statistics
dc.subjectnon-homogeneous Poisson process
dc.subjectbranching process
dc.titleCharacterization of cancer development and recurrence through mathematical and statistical modeling
dc.typeThesis
dc.type.materialText
thesis.degree.departmentStatistics
thesis.degree.disciplineEngineering
thesis.degree.grantorRice University
thesis.degree.levelDoctoral
thesis.degree.nameDoctor of Philosophy