Wang, Wenyi; Kimmel, Marek
2024-05-22; 2024-05; 2024-04-10; May 2024
Nguyen, Hoai Nam. Characterization of cancer development and recurrence through mathematical and statistical modeling. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/116159
EMBARGO NOTE: This item is embargoed until 2025-05-01
Li-Fraumeni syndrome (LFS) is a genetic disorder characterized by deleterious germline mutations in the TP53 tumor suppressor gene. Because their DNA repair mechanisms are compromised, patients with LFS are significantly more likely to develop a spectrum of cancer types. Furthermore, it is not uncommon for LFS patients to develop multiple primary cancers. Two risk prediction models have been developed for LFS: (i) a cancer-specific model that predicts cancer-specific risks for the first primary and (ii) a multiple primary cancer model that predicts the risk of a second primary without distinguishing between cancer types. Although they have been validated on research cohorts, it is essential to show that they perform well on a clinical cohort, which more closely resembles the patient data observed in real counseling sessions. In the first project, we validate both the discrimination and the calibration of the models, via the area under the curve (AUC) and the observed/expected (O/E) ratio, respectively, on a dataset collected from the Clinical Cancer Genetics program at MD Anderson Cancer Center (MDACC). To expedite the dissemination of these models, we further refine the associated software tools, LFSPRO and LFSPROShiny.
A major limitation of the previous models is that they do not predict cancer-specific risks beyond the first primary. In statistical survival analysis, multiple primary cancers can be regarded as recurrent events, and different cancer types can be regarded as non-terminal competing risks. Although many models have been proposed to address these two phenomena separately, a unified statistical framework that handles both is still lacking.
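As a rough illustration of the two validation metrics named for the first project, the sketch below computes an AUC (the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case) and an O/E ratio (observed number of cases divided by the sum of predicted risks) for binary outcomes. The function names `auc` and `oe_ratio` are hypothetical and are not taken from LFSPRO. The sketch also assumes fully observed binary outcomes over a fixed horizon; a validation on time-to-event data with censoring, as in a real clinical cohort, would need more care than is shown here.

```python
def auc(scores, labels):
    """Rank-based AUC: fraction of (case, non-case) pairs ordered correctly.

    Ties in predicted risk count as half a correctly ordered pair.
    `labels` holds 1 for a case and 0 for a non-case (an assumption of this sketch).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def oe_ratio(scores, labels):
    """Observed/expected ratio: observed case count over summed predicted risks."""
    expected = sum(scores)
    if expected <= 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(labels) / expected


if __name__ == "__main__":
    # Toy predicted risks and outcomes, invented purely for illustration.
    risks = [0.1, 0.4, 0.35, 0.8]
    outcomes = [0, 0, 1, 1]
    print(auc(risks, outcomes), oe_ratio(risks, outcomes))
```

An AUC near 1 indicates good discrimination, and an O/E ratio near 1 indicates good calibration; values above or below 1 suggest that the model under- or over-predicts risk, respectively.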
In the second project, we develop a generalized and interpretable Bayesian model that fully accounts for the complex relationships between the recurrent events. We use a non-homogeneous Poisson process to model the occurrence processes of the competing risks, each of which is characterized by a time-dependent intensity function that follows a Cox regression model. For family datasets, we further introduce frailty terms to capture within-family correlations that are induced by unobserved covariates, and recursively compute the family-wise likelihood via the Elston-Stewart peeling algorithm to account for the dependence of family members through missing genotypes. The model parameters are estimated via a Metropolis-Hastings-within-Gibbs sampling scheme. We train and cross-validate our model on an LFS patient cohort that was prospectively collected at MDACC. In the third project, we perform a much more extensive validation of the model on independent patient cohorts from major cancer institutes across the United States.
Stem cells are closely related to cancer. Given their ability to develop into many different cell types, stem cell transplants can be used to replace cells that are damaged by high doses of radiotherapy and chemotherapy, thus accelerating the process of cancer treatment. On the other hand, stem cells survive much longer than ordinary cells, and are thus more likely to accumulate harmful genetic mutations, which have the potential to trigger carcinogenesis. During cell division, a stem cell renews itself and forms a progenitor cell, which continues to differentiate into the target cell type. In the last project, we mathematically describe this process using a two-type age-dependent branching process.
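To make the division rule described above concrete, the following is a minimal simulation sketch, not the thesis's model. It assumes the simplest reading of that rule: each stem-cell division yields one renewed stem cell and one progenitor, and each progenitor leaves the progenitor pool (by differentiating) after its own random lifetime. The function name `simulate_progenitors` and the gamma lifetime distributions are assumptions chosen for illustration; the thesis may allow other division outcomes and lifetime laws.

```python
import random


def simulate_progenitors(t_end, stem_life, prog_life, rng):
    """Count progenitors alive at time t_end, starting from one newborn stem cell.

    stem_life(rng) and prog_life(rng) draw the random time to the next
    stem-cell division and the progenitor lifetime, respectively.
    Non-exponential lifetimes make the process age-dependent.
    """
    alive = 0
    t = stem_life(rng)  # time of the first stem-cell division
    while t <= t_end:
        # At time t the stem cell renews itself and produces one progenitor;
        # the progenitor is counted if it has not differentiated by t_end.
        if t + prog_life(rng) > t_end:
            alive += 1
        t += stem_life(rng)  # time of the next division of the renewed stem cell
    return alive


if __name__ == "__main__":
    rng = random.Random(0)
    # Gamma-distributed lifetimes (shape, scale) are illustrative assumptions.
    stem = lambda r: r.gammavariate(2.0, 1.0)
    prog = lambda r: r.gammavariate(3.0, 1.0)
    runs = [simulate_progenitors(20.0, stem, prog, rng) for _ in range(10_000)]
    print(sum(runs) / len(runs))  # Monte Carlo estimate of the mean progenitor count
```

Monte Carlo estimates of this kind can serve as a sanity check on analytical results, since the empirical distribution of the count at a fixed time can be compared against moments derived from a probability generating function.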
By deriving closed-form expressions of the probability generating functions, we study the behavior of such a process at both finite and large times under different dynamics of the two cell types, which correspond to various biological scenarios.
application/pdf
eng
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
risk prediction; Li-Fraumeni syndrome; Bayesian statistics; non-homogeneous Poisson process; branching process
Characterization of cancer development and recurrence through mathematical and statistical modeling
Thesis
2024-05-22