Statistics Publications

Recent Submissions

  • Item
    Enabling accurate and early detection of recently emerged SARS-CoV-2 variants of concern in wastewater
    (Springer Nature, 2023) Sapoval, Nicolae; Liu, Yunxi; Lou, Esther G.; Hopkins, Loren; Ensor, Katherine B.; Schneider, Rebecca; Stadler, Lauren B.; Treangen, Todd J.
    As clinical testing declines, wastewater monitoring can provide crucial surveillance on the emergence of SARS-CoV-2 variants of concern (VoCs) in communities. In this paper we present QuaID, a novel bioinformatics tool for VoC detection based on quasi-unique mutations. The benefits of QuaID are three-fold: (i) it provides up to 3-week earlier VoC detection, (ii) it detects VoCs accurately (>95% precision on simulated benchmarks), and (iii) it leverages all mutational signatures (including insertions & deletions).
  • Item
    Functional screening of lysosomal storage disorder genes identifies modifiers of alpha-synuclein neurotoxicity
    (Public Library of Science, 2023) Yu, Meigen; Ye, Hui; De-Paula, Ruth B.; Mangleburg, Carl Grant; Wu, Timothy; Lee, Tom V.; Li, Yarong; Duong, Duc; Phillips, Bridget; Cruchaga, Carlos; Allen, Genevera I.; Seyfried, Nicholas T.; Al-Ramahi, Ismael; Botas, Juan; Shulman, Joshua M.
    Heterozygous variants in the glucocerebrosidase (GBA) gene are common and potent risk factors for Parkinson’s disease (PD). GBA also causes the autosomal recessive lysosomal storage disorder (LSD), Gaucher disease, and emerging evidence from human genetics implicates many other LSD genes in PD susceptibility. We have systematically tested 86 conserved fly homologs of 37 human LSD genes for requirements in the aging adult Drosophila brain and for potential genetic interactions with neurodegeneration caused by α-synuclein (αSyn), which forms Lewy body pathology in PD. Our screen identifies 15 genetic enhancers of αSyn-induced progressive locomotor dysfunction, including knockdown of fly homologs of GBA and other LSD genes with independent support as PD susceptibility factors from human genetics (SCARB2, SMPD1, CTSD, GNPTAB, SLC17A5). For several genes, results from multiple alleles suggest dose-sensitivity and context-dependent pleiotropy in the presence or absence of αSyn. Homologs of two genes causing cholesterol storage disorders, Npc1a / NPC1 and Lip4 / LIPA, were independently confirmed as loss-of-function enhancers of αSyn-induced retinal degeneration. The enzymes encoded by several modifier genes are upregulated in αSyn transgenic flies, based on unbiased proteomics, revealing a possible, albeit ineffective, compensatory response. Overall, our results reinforce the important role of lysosomal genes in brain health and PD pathogenesis, and implicate several metabolic pathways, including cholesterol homeostasis, in αSyn-mediated neurotoxicity.
  • Item
    Data to Action: Community-Based Participatory Research to Address Concerns about Metal Air Pollution in Overburdened Neighborhoods near Metal Recycling Facilities in Houston
    (Environmental Health Perspectives, 2023) Symanski, Elaine; An, Han Heyreoun; McCurdy, Sheryl; Hopkins, Loren; Flores, Juan; Han, Inkyu; Smith, Mary Ann; Caldwell, James; Fontenot, Cecelia; Wyatt, Bobbie; Markham, Christine
    Background: Exposures to environmental contaminants can be influenced by social determinants of health. As a result, persons living in socially disadvantaged communities may experience disproportionate health risks from environmental exposures. Mixed methods research can be used to understand community-level and individual-level exposures to chemical and nonchemical stressors contributing to environmental health disparities. Furthermore, community-based participatory research (CBPR) approaches can lead to more effective interventions. Objectives: We applied mixed methods to identify environmental health perceptions and needs among metal recyclers and residents living in disadvantaged neighborhoods near metal recycling facilities in Houston, Texas, in a CBPR study, Metal Air Pollution Partnership Solutions (MAPPS). Informed by what we learned and our previous findings from cancer and noncancer risk assessments of metal air pollution in these neighborhoods, we developed an action plan to lower metal aerosol emissions from metal recycling facilities and enhance community capacity to address environmental health risks. Methods: Key informant interviews, focus groups, and community surveys were used to identify environmental health concerns of residents. A diverse group from academia, an environmental justice advocacy group, the community, the metal recycling industry, and the local health department collaborated and translated these findings, along with results from our prior risk assessments, to inform a multifaceted public health action plan. Results: An evidence-based approach was used to develop and implement neighborhood-specific action plans. Plans included a voluntary framework of technical and administrative controls to reduce metal emissions in the metal recycling facilities, direct lines of communication among residents, metal recyclers, and local health department officials, and environmental health leadership training. 
    Discussion: Using a CBPR approach, health risk assessment findings based on outdoor air monitoring campaigns and community survey results informed a multipronged environmental health action plan to mitigate health risks associated with metal air pollution.
  • Item
    Adverse Health Outcomes Following Hurricane Harvey: A Comparison of Remotely-Sensed and Self-Reported Flood Exposure Estimates
    (Wiley, 2023) Ramesh, Balaji; Callender, Rashida; Zaitchik, Benjamin F.; Jagger, Meredith; Swarup, Samarth; Gohlke, Julia M.
    Remotely sensed inundation may help to rapidly identify areas in need of aid during and following floods. Here we evaluate the utility of daily remotely sensed flood inundation measures and estimate their congruence with self-reported home flooding and health outcomes collected via the Texas Flood Registry (TFR) following Hurricane Harvey. Daily flood inundation for 14 days following the landfall of Hurricane Harvey was acquired from FloodScan. Flood exposure, including number of days flooded and flood depth, was assigned to geocoded home addresses of TFR respondents (N = 18,920 from 47 counties). Discordance between remotely sensed flooding and self-reported home flooding was measured. Modified Poisson regression models were implemented to estimate risk ratios (RRs) for adverse health outcomes following flood exposure, controlling for potential individual-level confounders. Respondents whose home was in a flooded area based on remotely sensed data were more likely to report injury (RR = 1.5, 95% CI: 1.27–1.77), concentration problems (1.36, 95% CI: 1.25–1.49), skin rash (1.31, 95% CI: 1.15–1.48), illness (1.29, 95% CI: 1.17–1.43), headaches (1.09, 95% CI: 1.03–1.16), and runny nose (1.07, 95% CI: 1.03–1.11) compared to respondents whose home was not flooded. Effect sizes were larger when exposure was estimated using respondent-reported home flooding. Near-real-time remote sensing-based flood products may help to prioritize areas in need of assistance when on-the-ground measures are not accessible.
  • Item
    Yule’s “nonsense correlation” for Gaussian random walks
    (Elsevier, 2023) Ernst, Philip A.; Huang, Dongzhou; Viens, Frederi G.
    This paper provides an exact formula for the second moment of the empirical correlation (also known as Yule’s “nonsense correlation”) for two independent standard Gaussian random walks, as well as implicit formulas for higher moments. We also establish rates of convergence of the empirical correlation of two independent standard Gaussian random walks to the empirical correlation of two independent Wiener processes.
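The phenomenon described in this abstract can be seen in a small Monte Carlo experiment. The sketch below is not the authors' exact formula: it only estimates the second moment of the empirical correlation of two independent standard Gaussian random walks by simulation. The function names, path length, replication count, and seed are illustrative assumptions. For comparison, independent i.i.d. samples of the same length would give a second moment on the order of 1/n.

```python
import math
import random


def random_walk(n_steps, rng):
    """Standard Gaussian random walk: partial sums of independent N(0, 1) steps."""
    position = 0.0
    path = []
    for _ in range(n_steps):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


def empirical_correlation(x, y):
    """Pearson (empirical) correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def second_moment_estimate(n_steps=200, n_reps=500, seed=0):
    """Monte Carlo estimate of E[rho^2] for two independent random walks.

    n_steps, n_reps, and seed are illustrative choices, not values from the paper.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        rho = empirical_correlation(random_walk(n_steps, rng),
                                    random_walk(n_steps, rng))
        total += rho * rho
    return total / n_reps


if __name__ == "__main__":
    print("Estimated second moment:", second_moment_estimate())
```

Even though the two walks are independent, the estimate stays well away from zero as the path length grows, which is why the empirical correlation is called a "nonsense" correlation in this setting.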
  • Item
    Semiparametric count data regression for self-reported mental health
    (Wiley, 2023) Kowal, Daniel R.; Wu, Bohan
    “For how many days during the past 30 days was your mental health not good?” The responses to this question measure self-reported mental health and can be linked to important covariates in the National Health and Nutrition Examination Survey (NHANES). However, these count variables present major distributional challenges: The data are overdispersed, zero-inflated, bounded by 30, and heaped in 5- and 7-day increments. To address these challenges, which are especially common for health questionnaire data, we design a semiparametric estimation and inference framework for count data regression. The data-generating process is defined by simultaneously transforming and rounding (STAR) a latent Gaussian regression model. The transformation is estimated nonparametrically and the rounding operator ensures the correct support for the discrete and bounded data. Maximum likelihood estimators are computed using an expectation-maximization (EM) algorithm that is compatible with any continuous data model estimable by least squares. STAR regression includes asymptotic hypothesis testing and confidence intervals, variable selection via information criteria, and customized diagnostics. Simulation studies validate the utility of this framework. Using STAR regression, we identify key factors associated with self-reported mental health and demonstrate substantial improvements in goodness-of-fit compared to existing count data regression models.
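The data-generating process in this abstract (a latent Gaussian regression that is transformed and then rounded) can be sketched in a few lines. The code below is a hypothetical illustration, not the authors' implementation: it fixes the transformation to a logarithm, whereas the paper estimates it nonparametrically, and the coefficients, noise level, and function name are assumptions chosen only to produce counts bounded by 30.

```python
import math
import random


def simulate_star_counts(n, intercept=1.0, slope=0.8, sigma=1.0, upper=30, seed=0):
    """Simulate bounded counts by transforming and rounding a latent Gaussian model.

    Assumptions (not from the paper): the transformation is g(t) = log(t),
    the rounding operator is floor, and counts are capped at `upper`.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                            # one covariate
        z = intercept + slope * x + rng.gauss(0.0, sigma)  # latent Gaussian regression
        y_star = math.exp(z)                               # undo the assumed log transformation
        y = min(upper, math.floor(y_star))                 # round down and enforce the bound
        data.append((x, y))
    return data


if __name__ == "__main__":
    sample = simulate_star_counts(2000)
    counts = [y for _, y in sample]
    print("share of zeros:", counts.count(0) / len(counts))
    print("share at the bound:", counts.count(30) / len(counts))
```

Rounding maps every latent value below 1 to a count of zero and every value of at least 30 to the bound, so a single continuous model yields both the excess zeros and the bounded support that the abstract describes.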
  • Item
    Bayesian feature selection for radiomics using reliability metrics
    (Frontiers Media S.A., 2023) Shoemaker, Katherine; Ger, Rachel; Court, Laurence E.; Aerts, Hugo; Vannucci, Marina; Peterson, Christine B.
    Introduction: Imaging of tumors is a standard step in diagnosing cancer and making subsequent treatment decisions. The field of radiomics aims to develop imaging-based biomarkers using methods rooted in artificial intelligence applied to medical imaging. However, a challenging aspect of developing predictive models for clinical use is that many quantitative features derived from image data exhibit instability or lack of reproducibility across different imaging systems or image-processing pipelines. Methods: To address this challenge, we propose a Bayesian sparse modeling approach for image classification based on radiomic features, where the inclusion of more reliable features is favored via a probit prior formulation. Results: We verify through simulation studies that this approach can improve feature selection and prediction given correct prior information. Finally, we illustrate the method with an application to the classification of head and neck cancer patients by human papillomavirus status, using as our prior information a reliability metric quantifying feature stability across different imaging systems.
  • Item
    Evaluating bone marrow dosimetry with the addition of bone marrow structures to the medical internal radiation dose phantom
    (Wiley, 2023) Ferrone, Kristine L.; Willis, Charles E.; Guan, Fada; Ma, Jingfei; Peterson, Leif E.; Kry, Stephen F.
    Background: Reliable estimates of radiation dose to bone marrow are critical to understanding the risk of radiation-induced cancers. Although the medical internal radiation dose phantom is routinely used for dose estimation, bone marrow is not defined in the phantom. Consequently, methods of indirectly estimating bone marrow dose have been implemented based on dose to surrogate volumes or average dose to soft tissue. Methods: In this study, new bone marrow structures were implemented in the medical internal radiation dose phantom in Geant4 and evaluated, offering improved fidelity. The dose equivalent to the bone marrow was calculated across medical, occupational, and space radiation exposure scenarios, and compared with results using prior indirect estimation methods. Conclusion: Our results show that bone marrow dose may be overestimated by up to a factor of three when using the traditional methods when compared with the improved fidelity medical internal radiation dose method, specifically at clinical x-ray energies.
  • Item
    Moran process version of the tug-of-war model: Behavior revealed by mathematical analysis and simulation studies
    (American Institute of Mathematical Sciences, 2023) Bobrowski, Adam; Kimmel, Marek; Kurpas, Monika K.; Ratajczyk, Elżbieta
    In a series of publications, McFarland and co-authors introduced the tug-of-war model of evolution of cancer cell populations. The model explains the joint effect of rare advantageous and frequent slightly deleterious mutations, which may be identifiable with driver and passenger mutations in cancer. In this paper, we put the tug-of-war model in the framework of a denumerable-type Moran process and use mathematics and simulations to understand its behavior. The model is associated with a time-continuous Markov Chain (MC), with a generator that can be split into a sum of the drift and selection process part and of the mutation process part. Operator semigroup theory is then employed to prove that the MC does not explode, as well as to characterize a strong-drift limit version of the MC which displays an 'instant fixation' effect, which was an assumption in McFarland's original model. Mathematical results are confirmed by simulations of the complete and limit versions. Simulations also visualize complex stochastic transients and genealogies of clones arising in the model.
  • Item
    Bayesian non-homogeneous hidden Markov model with variable selection for investigating drivers of seizure risk cycling
    (Project Euclid, 2023) Wang, Emily T.; Chiang, Sharon; Haneef, Zulfi; Rao, Vikram R.; Moss, Robert; Vannucci, Marina
    A major issue in the clinical management of epilepsy is the unpredictability of seizures. Yet, traditional approaches to seizure forecasting and risk assessment in epilepsy rely heavily on raw seizure frequencies which are a stochastic measurement of seizure risk. We consider a Bayesian nonhomogeneous hidden Markov model for unsupervised clustering of zero-inflated seizure count data. The proposed model allows for a probabilistic estimate of the sequence of seizure risk states at the individual level. It also offers significant improvement over prior approaches by incorporating a variable selection prior for the identification of clinical covariates that drive seizure risk changes and accommodating highly granular data. For inference, we implement an efficient sampler that employs stochastic search and data augmentation techniques. We evaluate model performance on simulated seizure count data. We then demonstrate the clinical utility of the proposed model by analyzing daily seizure count data from 133 patients with Dravet syndrome collected through the Seizure Tracker™ system, a patient-reported electronic seizure diary. We report on the dynamics of seizure risk cycling, including validation of several known pharmacologic relationships. We also uncover novel findings characterizing the presence and volatility of risk states in Dravet syndrome which may directly inform counseling to reduce the unpredictability of seizures for patients with this devastating cause of epilepsy.
  • Item
    Detection and characterization of constitutive replication origins defined by DNA polymerase epsilon
    (Springer Nature, 2023) Jaksik, Roman; Wheeler, David A.; Kimmel, Marek
    Despite the process of DNA replication being mechanistically highly conserved, the location of origins of replication (ORI) may vary from one tissue to the next, or between rounds of replication in eukaryotes, suggesting flexibility in the choice of locations to initiate replication. Lists of human ORI therefore vary widely in number and location, and there are currently no methods available to compare them. Here, we propose a method of detection of ORI based on somatic mutation patterns generated by the mutator phenotype of damaged DNA polymerase epsilon (POLE).
  • Item
    Bayes goes fast: Uncertainty quantification for a covariant energy density functional emulated by the reduced basis method
    (Frontiers Media S.A., 2023) Giuliani, Pablo; Godbey, Kyle; Bonilla, Edgard; Viens, Frederi; Piekarewicz, Jorge
    A covariant energy density functional is calibrated using a principled Bayesian statistical framework informed by experimental binding energies and charge radii of several magic and semi-magic nuclei. The Bayesian sampling required for the calibration is enabled by the emulation of the high-fidelity model through the implementation of a reduced basis method (RBM), a set of dimensionality reduction techniques that can speed up demanding calculations involving partial differential equations by several orders of magnitude. The RBM emulator we build, using only 100 evaluations of the high-fidelity model, is able to accurately reproduce the model calculations in tens of milliseconds on a personal computer, an increase in speed of nearly a factor of 3,300 when compared to the original solver. Besides the analysis of the posterior distribution of parameters, we present model calculations for masses and radii with properly estimated uncertainties. We also analyze the model correlation between the slope of the symmetry energy L and the neutron skin of ⁴⁸Ca and ²⁰⁸Pb. The straightforward implementation and outstanding performance of the RBM make it an ideal tool for assisting the nuclear theory community in providing reliable estimates with properly quantified uncertainties of physical observables. Such uncertainty quantification tools will become essential given the expected abundance of data from the recently inaugurated and future experimental and observational facilities.
  • Item
    Wastewater surveillance of SARS-CoV-2 and influenza in preK-12 schools shows school, community, and citywide infections
    (Elsevier, 2023) Wolken, Madeline; Sun, Thomas; McCall, Camille; Schneider, Rebecca; Caton, Kelsey; Hundley, Courtney; Hopkins, Loren; Ensor, Katherine; Domakonda, Kaavya; Prashant, Kalvapalle; Persse, David; Williams, Stephen; Stadler, Lauren B.
    Wastewater surveillance is a passive and efficient way to monitor the spread of infectious diseases in large populations and high transmission areas such as preK-12 schools. Infections caused by respiratory viruses in school-aged children are likely underreported, particularly because many children may be asymptomatic or mildly symptomatic. Wastewater monitoring of SARS-CoV-2 has been studied extensively and primarily by sampling at centralized wastewater treatment plants, and there are limited studies on SARS-CoV-2 in preK-12 school wastewater. Similarly, wastewater detections of influenza have only been reported in wastewater treatment plant and university manhole samples. Here, we present the results of a 17-month wastewater monitoring program for SARS-CoV-2 (n = 2176 samples) and influenza A and B (n = 1217 samples) in 51 preK-12 schools. We show that school wastewater concentrations of SARS-CoV-2 RNA were strongly associated with COVID-19 cases in schools and community positivity rates, and that influenza detections in school wastewater were significantly associated with citywide influenza diagnosis rates. Results were communicated back to schools and local communities to enable mitigation strategies to stop the spread, and direct resources such as testing and vaccination clinics. This study demonstrates that school wastewater surveillance is reflective of local infections at several population levels and plays a crucial role in the detection and mitigation of outbreaks.
  • Item
    Bayesian graphical models for modern biological applications
    (Springer Nature, 2022) Ni, Yang; Baladandayuthapani, Veerabhadran; Vannucci, Marina; Stingo, Francesco C.
    Graphical models are powerful tools that are regularly used to investigate complex dependence structures in high-throughput biomedical datasets. They allow for a holistic, systems-level view of the various biological processes, for intuitive and rigorous understanding and interpretations. In the context of large networks, Bayesian approaches are particularly suitable because they encourage sparsity of the graphs, incorporate prior information, and most importantly account for uncertainty in the graph structure. These features are particularly important in applications with limited sample size, including genomics and imaging studies. In this paper, we review several recently developed techniques for the analysis of large networks under non-standard settings, including but not limited to, multiple graphs for data observed from multiple related subgroups, graphical regression approaches used for the analysis of networks that change with covariates, and other complex sampling and structural settings. We also illustrate the practical utility of some of these methods using examples in cancer genomics and neuroimaging.
  • Item
    Fast, Optimal, and Targeted Predictions Using Parameterized Decision Analysis
    (Taylor & Francis, 2022) Kowal, Daniel R.
    Prediction is critical for decision-making under uncertainty and lends validity to statistical inference. With targeted prediction, the goal is to optimize predictions for specific decision tasks of interest, which we represent via functionals. Although classical decision analysis extracts predictions from a Bayesian model, these predictions are often difficult to interpret and slow to compute. Instead, we design a class of parameterized actions for Bayesian decision analysis that produce optimal, scalable, and simple targeted predictions. For a wide variety of action parameterizations and loss functions—including linear actions with sparsity constraints for targeted variable selection—we derive a convenient representation of the optimal targeted prediction that yields efficient and interpretable solutions. Customized out-of-sample predictive metrics are developed to evaluate and compare among targeted predictors. Through careful use of the posterior predictive distribution, we introduce a procedure that identifies a set of near-optimal, or acceptable targeted predictors, which provide unique insights into the features and level of complexity needed for accurate targeted prediction. Simulations demonstrate excellent prediction, estimation, and variable selection capabilities. Targeted predictions are constructed for physical activity (PA) data from the National Health and Nutrition Examination Survey to better predict and understand the characteristics of intraday PA. Supplementary materials for this article are available online.
  • Item
    Genomic Analysis of SARS-CoV-2 Alpha, Beta and Delta Variants of Concern Uncovers Signatures of Neutral and Non-Neutral Evolution
    (MDPI, 2022) Kurpas, Monika Klara; Jaksik, Roman; Kuś, Pawel; Kimmel, Marek
    Due to the emergence of new variants of the SARS-CoV-2 coronavirus, the question of how the viral genomes evolved, leading to the formation of highly infectious strains, becomes particularly important. Three major emergent strains, Alpha, Beta and Delta, characterized by a significant number of missense mutations, provide a natural test field. We accumulated and aligned 4.7 million SARS-CoV-2 genomes from the GISAID database and carried out a comprehensive set of analyses. This collection covers the period until the end of October 2021, i.e., the beginnings of the Omicron variant. First, we explored combinatorial complexity of the genomic variants emerging and their timing, indicating very strong, albeit hidden, selection forces. Our analyses show that the mutations that define variants of concern did not arise gradually but rather co-evolved rapidly, leading to the emergence of the full variant strain. To explore in more detail the evolutionary forces at work, we developed time trajectories of mutations at all 29,903 sites of the SARS-CoV-2 genome, week by week, and stratified them into trends related to (i) point substitutions, (ii) deletions and (iii) non-sequenceable regions. We focused on classifying the genetic forces active at different ranges of the mutational spectrum. We observed the agreement of the lowest-frequency mutation spectrum with the Griffiths–Tavaré theory, under the Infinite Sites Model and neutrality. If we widen the frequency range, we observe that the site frequency spectra are much more consistent with the Tung–Durrett model assuming clone competition and selection. The coefficients of the fitting model indicate the possibility of selection acting to promote gradual growth slowdown, as observed in the history of the variants of concern. These results add up to a model of genomic evolution, which partly fits into the classical drift barrier ideas. Certain observations, such as mutation “bands” persistent over the epidemic history, suggest contribution of genetic forces different from mutation, drift and selection, including recombination or other genome transformations. In addition, we show that a “toy” mathematical model can qualitatively reproduce how new variants (clones) stem from rare advantageous driver mutations, and then acquire neutral or disadvantageous passenger mutations which gradually reduce their fitness so they can be then outcompeted by new variants due to other driver mutations.
  • Item
    Bayesian data synthesis and the utility-risk trade-off for mixed epidemiological data
    (Project Euclid, 2022) Feldman, Joseph; Kowal, Daniel R.
    Much of the microdata used for epidemiological studies contain sensitive measurements on real individuals. As a result, such microdata cannot be published out of privacy concerns, and without public access to these data, any statistical analyses originally published on them are nearly impossible to reproduce. To promote the dissemination of key datasets for analysis without jeopardizing the privacy of individuals, we introduce a cohesive Bayesian framework for the generation of fully synthetic high-dimensional microdatasets of mixed categorical, binary, count, and continuous variables. This process centers around a joint Bayesian model that is simultaneously compatible with all of these data types, enabling the creation of mixed synthetic datasets through posterior predictive sampling. Furthermore, a focal point of epidemiological data analysis is the study of conditional relationships between various exposures and key outcome variables through regression analysis. We design a modified data synthesis strategy to target and preserve these conditional relationships, including both nonlinearities and interactions. The proposed techniques are deployed to create a synthetic version of a confidential dataset containing dozens of health, cognitive, and social measurements on nearly 20,000 North Carolina children.
  • Item
    Sequencing individual genomes with recurrent genomic disorder deletions: an approach to characterize genes for autosomal recessive rare disease traits
    (Springer Nature, 2022) Yuan, Bo; Schulze, Katharina V.; Assia Batzir, Nurit; Sinson, Jefferson; Dai, Hongzheng; Zhu, Wenmiao; Bocanegra, Francia; Fong, Chin-To; Holder, Jimmy; Nguyen, Joanne; Schaaf, Christian P.; Yang, Yaping; Bi, Weimin; Eng, Christine; Shaw, Chad; Lupski, James R.; Liu, Pengfei
    In medical genetics, discovery and characterization of disease trait contributory genes and alleles depends on genetic reasoning, study design, and patient ascertainment; we suggest a segmental haploid genetics approach to enhance gene discovery and molecular diagnostics.
  • Item
    Modes of Selection in Tumors as Reflected by Two Mathematical Models and Site Frequency Spectra
    (Frontiers Media S.A., 2022) Kurpas, Monika K.; Kimmel, Marek
    The tug-of-war model was developed in a series of papers of McFarland and co-authors to account for existence of mutually counteracting rare advantageous driver mutations and more frequent slightly deleterious passenger mutations in cancer. In its original version, it was a state-dependent branching process. Because of its formulation, the tug-of-war model is of importance for tackling the problem as to whether evolution of cancerous tumors is “Darwinian” or “non-Darwinian.” We define two Time-Continuous Markov Chain versions of the model, including identical mutation processes but adopting different drift and selection components. In Model A, drift and selection process preserves expected fitness whereas in Model B it leads to non-decreasing expected fitness. We investigate these properties using mathematical analysis and extensive simulations, which detect the effect of the so-called drift barrier in Model B but not in Model A. These effects are reflected in different structure of clone genealogies in the two models. Our work is related to the past theoretical work in the field of evolutionary genetics, concerning the interplay among mutation, drift and selection, in absence of recombination (asexual reproduction), where epistasis plays a major role. Finally, we use the statistics of mutation frequencies known as the Site Frequency Spectra (SFS), to compare the variant frequencies in DNA of sequenced HER2+ breast cancers, to those based on Model A and B simulations. The tumor-based SFS are better reproduced by Model A, pointing out a possible selection pattern of HER2+ tumor evolution. To put our models in context, we carried out an exploratory study of how publicly accessible data from breast, prostate, skin and ovarian cancers fit a range of models found in the literature.
  • Item
    Racial residential segregation shapes the relationship between early childhood lead exposure and fourth-grade standardized test scores
    (National Academy of Sciences, 2022) Bravo, Mercedes A.; Zephyr, Dominique; Kowal, Daniel; Ensor, Katherine; Miranda, Marie Lynn
    Racial/ethnic disparities in academic performance may result from a confluence of adverse exposures that arise from structural racism and accrue to specific subpopulations. This study investigates childhood lead exposure, racial residential segregation, and early educational outcomes. Geocoded North Carolina birth data is linked to blood lead surveillance data and fourth-grade standardized test scores (n = 25,699). We constructed a census tract-level measure of racial isolation (RI) of the non-Hispanic Black (NHB) population. We fit generalized additive models of reading and mathematics test scores regressed on individual-level blood lead level (BLL) and neighborhood RI of NHB (RINHB). Models included an interaction term between BLL and RINHB. BLL and RINHB were associated with lower reading scores; among NHB children, an interaction was observed between BLL and RINHB. Reading scores for NHB children with BLLs of 1 to 3 µg/dL were similar across the range of RINHB values. For NHB children with BLLs of 4 µg/dL, reading scores were similar to those of NHB children with BLLs of 1 to 3 µg/dL at lower RINHB values (less racial isolation/segregation). At higher RINHB levels (greater racial isolation/segregation), children with BLLs of 4 µg/dL had lower reading scores than children with BLLs of 1 to 3 µg/dL. This pattern becomes more marked at higher BLLs. Higher BLL was associated with lower mathematics test scores among NHB and non-Hispanic White (NHW) children, but there was no evidence of an interaction. In conclusion, NHB children with high BLLs residing in high RINHB neighborhoods had worse reading scores.