Statistics Publications
Browsing Statistics Publications by Issue Date
Item Making an R package (2010-11-09) Wickham, Hadley
Item A systems biology approach reveals common metastatic pathways in osteosarcoma (BioMed Central, 2012) Flores, Ricardo J.; Li, Yiting; Yu, Alexander; Shen, Jianhe; Rao, Pulivarthi H.; Lau, Serrine S.; Vannucci, Marina; Lau, Ching C.; Man, Tsz-Kwong
Background: Osteosarcoma (OS) is the most common malignant bone tumor in children and adolescents. The survival rate of patients with metastatic disease remains very dismal. Nevertheless, metastasis is a complex process and a single-level analysis is not likely to identify its key biological determinants. In this study, we used a systems biology approach to identify common metastatic pathways that are jointly supported by both mRNA and protein expression data in two distinct human metastatic OS models. Results: mRNA expression microarray and N-linked glycoproteomic analyses were performed on two commonly used isogenic pairs of human metastatic OS cell lines, namely HOS/143B and SaOS-2/LM7. Pathway analysis of the differentially regulated genes and glycoproteins separately revealed pathways associated to metastasis including cell cycle regulation, immune response, and epithelial-to-mesenchymal-transition. However, no common significant pathway was found at both genomic and proteomic levels between the two metastatic models, suggesting a very different biological nature of the cell lines. To address this issue, we used a topological significance analysis based on a “shortest-path” algorithm to identify topological nodes, which uncovered additional biological information with respect to the genomic and glycoproteomic profiles but remained hidden from the direct analyses. Pathway analysis of the significant topological nodes revealed a striking concordance between the models and identified significant common pathways, including “Cytoskeleton remodeling/TGF/WNT”, “Cytoskeleton remodeling/Cytoskeleton remodeling”, and “Cell adhesion/Chemokines and adhesion”. 
Of these, the “Cytoskeleton remodeling/TGF/WNT” was the top ranked common pathway from the topological analysis of the genomic and proteomic profiles in the two metastatic models. The up-regulation of proteins in the “Cytoskeleton remodeling/TGF/WNT” pathway in the SaOS-2/LM7 and HOS/143B models was further validated using an orthogonal Reverse Phase Protein Array platform. Conclusions: In this study, we used a systems biology approach by integrating genomic and proteomic data to identify key and common metastatic mechanisms in OS. The use of the topological analysis revealed hidden biological pathways that are known to play critical roles in metastasis. Wnt signaling has been previously implicated in OS and other tumors, and inhibitors of Wnt signaling pathways are available for clinical testing. Further characterization of this common pathway and other topological pathways identified from this study may lead to a novel therapeutic strategy for the treatment of metastatic OS.
Item Hyaluronan turnover and hypoxic brown adipocytic differentiation are co-localized with ossification in calcified human aortic valves (Elsevier, 2012) Stephens, Elizabeth H.; Saltarrelli, Jerome G. Jr.; Balaoing, Liezl R.; Baggett, L. Scott; Nandi, Indrajit; Anderson, Kristin M.; Morrisett, Joel D.; Reardon, Michael J.; Simpson, Melanie A.; Weigel, Paul H.; Olmsted-Davis, Elizabeth A.; Davis, Alan R.; Grande-Allen, K. Jane
The calcification process in aortic stenosis has garnered considerable interest but only limited investigation into selected signaling pathways. This study investigated mechanisms related to hypoxia, hyaluronan homeostasis, brown adipocytic differentiation, and ossification within calcified valves. 
Surgically explanted calcified aortic valves (n = 14) were immunostained for markers relevant to these mechanisms and evaluated in the center (NodCtr) and edge (NodEdge) of the calcified nodule (NodCtr), tissue directly surrounding nodule (NodSurr); center and tissue surrounding small “prenodules” (PreNod, PreNodSurr); and normal fibrosa layer (CollFibr). Pearson correlations were determined between staining intensities of markers within regions. Ossification markers primarily localized to NodCtr and NodEdge, along with markers related to hyaluronan turnover and hypoxia. Markers of brown adipocytic differentiation were frequently co-localized with markers of hypoxia. In NodCtr and NodSurr, brown fat and ossification markers correlated with hyaluronidase-1, whereas these markers, as well as hypoxia, correlated with hyaluronan synthases in NodEdge. The protein product of tumor necrosis factor-α stimulated gene-6 strongly correlated with ossification markers and hyaluronidase in the regions surrounding the nodules (NodSurr, PreNodSurr). In conclusion, this study suggests roles for hyaluronan homeostasis and the promotion of hypoxia by cells demonstrating brown fat markers in calcific aortic valve disease.
Item Accuracy of optical spectroscopy for the detection of cervical intraepithelial neoplasia without colposcopic tissue information; a step toward automation for low resource settings (Society of Photo-Optical Instrumentation Engineers, 2012-04) Yamal, Jose-Miguel; Zewdie, Getie A.; Cox, Dennis D.; Atkinson, E. Neely; Cantor, Scott B.; MacAulay, Calum; Davies, Kalatu; Adewole, Isaac; Buys, Timon P. H.; Follen, Michele
Optical spectroscopy has been proposed as an accurate and low-cost alternative for detection of cervical intraepithelial neoplasia. 
We previously published an algorithm using optical spectroscopy as an adjunct to colposcopy and found good accuracy (sensitivity = 1.00 [95% confidence interval (CI) = 0.92 to 1.00], specificity = 0.71 [95% CI = 0.62 to 0.79]). Those results used measurements taken by expert colposcopists as well as the colposcopy diagnosis. In this study, we trained and tested an algorithm for the detection of cervical intraepithelial neoplasia (i.e., identifying those patients who had histology reading CIN 2 or worse) that did not include the colposcopic diagnosis. Furthermore, we explored the interaction between spectroscopy and colposcopy, examining the importance of probe placement expertise. The colposcopic diagnosis-independent spectroscopy algorithm had a sensitivity of 0.98 (95% CI = 0.89 to 1.00) and a specificity of 0.62 (95% CI = 0.52 to 0.71). The difference in the partial area under the ROC curves between spectroscopy with and without the colposcopic diagnosis was statistically significant at the patient level (p = 0.05) but not the site level (p = 0.13). The results suggest that the device has high accuracy over a wide range of provider accuracy and hence could plausibly be implemented by providers with limited training.
Item Regularized partial least squares with an application to NMR spectroscopy (John Wiley & Sons, Inc., 2013) Allen, Genevera I.; Peterson, Christine; Vannucci, Marina; Maletic-Savatic, Mirjana
High-dimensional data common in genomics, proteomics, and chemometrics often contains complicated correlation structures. Recently, partial least squares (PLS) and Sparse PLS methods have gained attention in these areas as dimension reduction techniques in the context of supervised data analysis. We introduce a framework for Regularized PLS by solving a relaxation of the SIMPLS optimization problem with penalties on the PLS loadings vectors. 
Our approach enjoys many advantages including flexibility, general penalties, easy interpretation of results, and fast computation in high-dimensional settings. We also outline extensions of our methods leading to novel methods for non-negative PLS and generalized PLS, an adoption of PLS for structured data. We demonstrate the utility of our methods through simulations and a case study on proton Nuclear Magnetic Resonance (NMR) spectroscopy data.
Item An Integrative Bayesian Modeling Approach to Imaging Genetics (Taylor & Francis, 2013) Stingo, Francesco C.; Guindani, Michele; Vannucci, Marina; Calhoun, Vince D.
Item Molecular pathway identification using biological network-regularized logistic models (BioMed Central, 2013) Zhang, Wen; Wan, Ying-wooi; Allen, Genevera I.; Pang, Kaifang; Anderson, Matthew L.; Liu, Zhandong
Background: Selecting genes and pathways indicative of disease is a central problem in computational biology. This problem is especially challenging when parsing multi-dimensional genomic data. A number of tools, such as L1-norm based regularization and its extensions elastic net and fused lasso, have been introduced to deal with this challenge. However, these approaches tend to ignore the vast amount of a priori biological network information curated in the literature. Results: We propose the use of graph Laplacian regularized logistic regression to integrate biological networks into disease classification and pathway association problems. Simulation studies demonstrate that the performance of the proposed algorithm is superior to elastic net and lasso analyses. Utility of this algorithm is also validated by its ability to reliably differentiate breast cancer subtypes using a large breast cancer dataset recently generated by the Cancer Genome Atlas (TCGA) consortium. Many of the protein-protein interaction modules identified by our approach are further supported by evidence published in the literature. 
Source code of the proposed algorithm is freely available at http://www.github.com/zhandong/Logit-Lapnet. Conclusion: Logistic regression with graph Laplacian regularization is an effective algorithm for identifying key pathways and modules associated with disease subtypes. With the rapid expansion of our knowledge of biological regulatory networks, this approach will become more accurate and increasingly useful for mining transcriptomic, epi-genomic, and other types of genome wide association studies.
Item Inferring metabolic networks using the Bayesian adaptive graphical lasso with informative priors (International Press, 2013) Peterson, Christine; Vannucci, Marina; Karakas, Cemal; Choi, William; Ma, Lihua; Maletic-Savatic, Mirjana
Metabolic processes are essential for cellular function and survival. We are interested in inferring a metabolic network in activated microglia, a major neuroimmune cell in the brain responsible for the neuroinflammation associated with neurological diseases, based on a set of quantified metabolites. To achieve this, we apply the Bayesian adaptive graphical lasso with informative priors that incorporate known relationships between covariates. To encourage sparsity, the Bayesian graphical lasso places double exponential priors on the off-diagonal entries of the precision matrix. The Bayesian adaptive graphical lasso allows each double exponential prior to have a unique shrinkage parameter. These shrinkage parameters share a common gamma hyperprior. We extend this model to create an informative prior structure by formulating tailored hyperpriors on the shrinkage parameters. By choosing parameter values for each hyperprior that shift probability mass toward zero for nodes that are close together in a reference network, we encourage edges between covariates with known relationships. This approach can improve the reliability of network inference when the sample size is small relative to the number of parameters to be estimated. 
When applied to the data on activated microglia, the inferred network includes both known relationships and associations of potential interest for further investigation.
Item Small median tumor diameter at cure threshold (<20 mm) among aggressive non-small cell lung cancers in male smokers predicts both chest X-ray and CT screening outcomes in a novel simulation framework (UICC, 2013) Goldwasser, Deborah L.; Kimmel, Marek
The effectiveness of population-wide lung cancer screening strategies depends on the underlying natural course of lung cancer. We evaluate the expected stage distribution in the Mayo CT screening study under an existing simulation model of non-small cell lung cancer (NSCLC) progression calibrated to the Mayo lung project (MLP). Within a likelihood framework, we evaluate whether the probability of 5-year NSCLC survival conditional on tumor diameter at detection depends significantly on screening detection modality, namely chest X-ray and computed tomography. We describe a novel simulation framework in which tumor progression depends on cellular proliferation and mutation within a stem cell compartment of the tumor. We fit this model to randomized trial data from the MLP and produce estimates of the median radiologic size at the cure threshold. We examine the goodness of model fit with respect to radiologic tumor size and 5-year NSCLC survival among incident cancers in both the MLP and Mayo CT studies. An existing model of NSCLC progression under-predicts the number of advanced-stage incident NSCLCs among males in the Mayo CT study (p-value = 0.004). The probability of 5-year NSCLC survival conditional on tumor diameter depends significantly on detection modality (p-value = 0.0312). In our new model, selected solution sets having a median tumor diameter of 16.2–22.1 mm at cure threshold among aggressive NSCLCs predict both MLP and Mayo CT outcomes. 
We conclude that the median lung tumor diameter at cure threshold among aggressive NSCLCs in male smokers may be small (<20 mm).
Item Robust fitting of a Weibull model with optional censoring (Elsevier, 2013) Yang, Jingjing; Scott, David W.
The Weibull family is widely used to model failure data, or lifetime data, although the classical two-parameter Weibull distribution is limited to positive data and monotone failure rate. The parameters of the Weibull model are commonly obtained by maximum likelihood estimation; however, it is well-known that this estimator is not robust when dealing with contaminated data. A new robust procedure is introduced to fit a Weibull model by using L2 distance, i.e. integrated square distance, of the Weibull probability density function. The Weibull model is augmented with a weight parameter to robustly deal with contaminated data. Results comparing a maximum likelihood estimator with an L2 estimator are given in this article, based on both simulated and real data sets. It is shown that this new L2 parametric estimation method is more robust and does a better job than maximum likelihood in the newly proposed Weibull model when data are contaminated. The same preference for L2 distance criterion and the new Weibull model also happens for right-censored data with contamination.
Item Investigating Multiple Candidate Genes and Nutrients in the Folate Metabolism Pathway to Detect Genetic and Nutritional Risk Factors for Lung Cancer (Public Library of Science, 2013) Swartz, Michael D.; Peterson, Christine B.; Lupo, Philip J.; Wu, Xifeng; Forman, Michele R.; Spitz, Margaret R.; Hernandez, Ladia M.; Vannucci, Marina; Shete, Sanjay
Purpose: Folate metabolism, with its importance to DNA repair, provides a promising region for genetic investigation of lung cancer risk. 
This project investigates genes (MTHFR, MTR, MTRR, CBS, SHMT1, TYMS), folate metabolism related nutrients (B vitamins, methionine, choline, and betaine) and their gene-nutrient interactions. Methods: We analyzed 115 tag single nucleotide polymorphisms (SNPs) and 15 nutrients from 1239 and 1692 non-Hispanic white, histologically-confirmed lung cancer cases and controls, respectively, using stochastic search variable selection (a Bayesian model averaging approach). Analyses were stratified by current, former, and never smoking status. Results: Rs6893114 in MTRR (odds ratio [OR] = 2.10; 95% credible interval [CI]: 1.20–3.48) and alcohol (drinkers vs. non-drinkers, OR = 0.48; 95% CI: 0.26–0.84) were associated with lung cancer risk in current smokers. Rs13170530 in MTRR (OR = 1.70; 95% CI: 1.10–2.87) and two SNP*nutrient interactions [betaine*rs2658161 (OR = 0.42; 95% CI: 0.19–0.88) and betaine*rs16948305 (OR = 0.54; 95% CI: 0.30–0.91)] were associated with lung cancer risk in former smokers. SNPs in MTRR (rs13162612; OR = 0.25; 95% CI: 0.11–0.58; rs10512948; OR = 0.61; 95% CI: 0.41–0.90; rs2924471; OR = 3.31; 95% CI: 1.66–6.59), and MTHFR (rs9651118; OR = 0.63; 95% CI: 0.43–0.95) and three SNP*nutrient interactions (choline*rs10475407; OR = 1.62; 95% CI: 1.11–2.42; choline*rs11134290; OR = 0.51; 95% CI: 0.27–0.92; and riboflavin*rs8767412; OR = 0.40; 95% CI: 0.15–0.95) were associated with lung cancer risk in never smokers. 
Conclusions: This study identified possible nutrient and genetic factors related to folate metabolism associated with lung cancer risk, which could potentially lead to nutritional interventions tailored by smoking status to reduce lung cancer risk.
Item Stochastic hypothesis of transition from inborn neutropenia to AML: interactions of cell population dynamics and population genetics (Frontiers Media, 2013) Kimmel, Marek; Corey, Seth
We present a stochastic model of driver mutations in the transition from severe congenital neutropenia to myelodysplastic syndrome to acute myeloid leukemia (AML). The model has the form of a multitype branching process. We derive equations for the distributions of the times to consecutive driver mutations and set up simulations involving a range of hypotheses regarding acceleration of the mutation rates in successive mutant clones. Our model reproduces the clinical distribution of times at diagnosis of secondary AML. Surprisingly, within the framework of our assumptions, stochasticity of the mutation process is incapable of explaining the spread of times at diagnosis of AML in this case; it is necessary to additionally assume a wide spread of proliferative parameters among disease cases. This finding is unexpected but generally consistent with the wide heterogeneity of characteristics of human cancers.
Item Modeling neutral evolution using an infinite-allele Markov branching process (Hindawi, 2013) Wu, Xiaowei; Kimmel, Marek
We consider an infinite-allele Markov branching process (IAMBP). Our main focus is the frequency spectrum of this process, i.e., the proportion of alleles having a given number of copies at a specified time point. We derive the variance of the frequency spectrum, which is useful for interval estimation and hypothesis testing for process parameters. 
In addition, for a class of special IAMBP with birth and death offspring distribution, we show that the mean of its limiting frequency spectrum has an explicit form in terms of the hypergeometric function. We also derive an asymptotic expression for convergence rate to the limit. Simulations are used to illustrate the results for the birth and death process.
Item Imaging genetics via sparse canonical correlation analysis (IEEE, 2013) Chi, Eric C.; Allen, Genevera I.; Zhou, Hua; Kohannim, Omid; Lange, Kenneth; Thompson, Paul M.
The collection of brain images from populations of subjects who have been genotyped with genome-wide scans makes it feasible to search for genetic effects on the brain. Even so, multivariate methods are sorely needed that can search both images and the genome for relationships, making use of the correlation structure of both datasets. Here we investigate the use of sparse canonical correlation analysis (CCA) to home in on sets of genetic variants that explain variance in a set of images. We extend recent work on penalized matrix decomposition to account for the correlations in both datasets. Such methods show promise in imaging genetics as they exploit the natural covariance in the datasets. They also avoid an astronomically heavy statistical correction for searching the whole genome and the entire image for promising associations.
Item Heavy-tailed densities (Wiley, 2013) Rojo, Javier
The concept of heavy- or long-tailed densities (or distributions) has attracted much well-deserved attention in the literature. A quick search in Google using the keywords long-tailed statistics retrieves almost 12 million items. The concept has become a pillar of the theory of extremes, and through its connection with outlier-prone distributions, long-tailed distributions also play a central role in the theory of robustness. 
The concept of tail heaviness is by now ubiquitous, appearing in a diverse set of disciplines that includes: economics, communications, atmospheric sciences, climate modeling, social sciences, physics, modeling of complex systems, etc. Nevertheless, the precise meaning of ‘long-’ or ‘heavy tails’ remains somewhat elusive. Thus, in a substantial portion of the early literature, long-tailedness meant that the underlying distribution was capable of producing anomalous observations in the sense that they were ‘too far’ from the main body of observations. Implicit in these informal definitions was the notion that any distribution that behaved that way had to do so because its tails were longer than those of the normal distribution. This paper discusses tail orderings and several approaches for the classification of probability distributions according to tail heaviness. It is concluded that an approach based on the limiting behavior of the residual life function, and its corresponding characterizations based on functions of regular variation and asymptotic distribution of extreme spacings, provides the more natural and illuminating concepts of tail behavior.
Item Model-based clustering of large networks (Project Euclid, 2013) Vu, Duy Q.; Hunter, David R.; Schweinberger, Michael
We describe a network clustering framework, based on finite mixture models, that can be applied to discrete-valued networks with hundreds of thousands of nodes and billions of edge variables. Relative to other recent model-based clustering work for networks, we introduce a more flexible modeling framework, improve the variational-approximation estimation algorithm, discuss and implement standard error estimation via a parametric bootstrap approach, and apply these methods to much larger data sets than those seen elsewhere in the literature. 
The more flexible framework is achieved through introducing novel parameterizations of the model, giving varying degrees of parsimony, using exponential family models whose structure may be exploited in various theoretical and algorithmic ways. The algorithms are based on variational generalized EM algorithms, where the E-steps are augmented by a minorization-maximization (MM) idea. The bootstrapped standard error estimates are based on an efficient Monte Carlo network simulation idea. Last, we demonstrate the usefulness of the model-based clustering framework by applying it to a discrete-valued network with more than 131,000 nodes and 17 billion edge variables.
Item Neural Networks of Colored Sequence Synesthesia (Society for Neuroscience, 2013) Tomson, Steffie N.; Narayan, Manjari; Allen, Genevera I.; Eagleman, David M.
Synesthesia is a condition in which normal stimuli can trigger anomalous associations. In this study, we exploit synesthesia to understand how the synesthetic experience can be explained by subtle changes in network properties. Of the many forms of synesthesia, we focus on colored sequence synesthesia, a form in which colors are associated with overlearned sequences, such as numbers and letters (graphemes). Previous studies have characterized synesthesia using resting-state connectivity or stimulus-driven analyses, but it remains unclear how network properties change as synesthetes move from one condition to another. To address this gap, we used functional MRI in humans to identify grapheme-specific brain regions, thereby constructing a functional “synesthetic” network. We then explored functional connectivity of color and grapheme regions during a synesthesia-inducing fMRI paradigm involving rest, auditory grapheme stimulation, and audiovisual grapheme stimulation. Using Markov networks to represent direct relationships between regions, we found that synesthetes had more connections during rest and auditory conditions. 
We then expanded the network space to include 90 anatomical regions, revealing that synesthetes tightly cluster in visual regions, whereas controls cluster in parietal and frontal regions. Together, these results suggest that synesthetes have increased connectivity between grapheme and color regions, and that synesthetes use visual regions to a greater extent than controls when presented with dynamic grapheme stimulation. These data suggest that synesthesia is better characterized by studying global network dynamics than by individual properties of a single brain region.
Item Using community level strategies to reduce asthma attacks triggered by outdoor air pollution: a case crossover analysis (BioMed Central, 2014) Raun, Loren H.; Ensor, Katherine B.; Persse, David
Evidence indicates that asthma attacks can be triggered by exposure to ambient air pollutants, however, detailed pollution information is missing from asthma action plans. Asthma is commonly associated with four criteria pollutants with standards derived by the United States Environmental Protection Agency. Since multiple pollutants trigger attacks and risks depend upon city-specific mixtures of pollutants, there is lack of specific guidance to reduce exposure. Until multi-pollutant statistical modeling fully addresses this gap, some guidance on pollutant attack risk is required. This study examines the risks from exposure to the asthma-related pollutants in a large metropolitan city and defines the city-specific association between attacks and pollutant mixtures. Our goal is that city-specific pollution risks be incorporated into individual asthma action plans as additional guidance to prevent attacks. Case-crossover analysis and conditional logistic regression were used to measure the association between ozone, fine particulate matter, nitrogen dioxide, sulfur dioxide and carbon monoxide pollution and 11,754 emergency medical service ambulance treated asthma attacks in Houston, Texas from 2004-2011. 
Both single and multi-pollutant models are presented. In Houston, ozone and nitrogen dioxide are important triggers (RR = 1.05; 95% CI: 1.00, 1.09), (RR = 1.10; 95% CI: 1.05, 1.15) with 20 and 8 ppb increase in ozone and nitrogen dioxide, respectively, in a multi-pollutant model. Both pollutants are simultaneously high at certain times of the year. The risk attributed to these pollutants differs when they are considered together, especially as concentrations increase. Cumulative exposure for ozone (0-2 day lag) is of concern, whereas for nitrogen dioxide the concern is with single day exposure. Persons at highest risk are aged 46-66, African Americans, and males. Accounting for cumulative and concomitant outdoor pollutant exposure is important to effectively attribute risk for triggering of an asthma attack, especially as concentrations increase. Improved asthma action plans for Houston individuals should warn of these pollutants, their trends, correlation and cumulative effects. Our Houston based study identifies nitrogen dioxide levels and the three-day exposure to ozone to be of concern whereas current single pollutant based national standards do not.
Item Point source influence on observed extreme pollution levels in a monitoring network (Elsevier, 2014) Ensor, Katherine B.; Ray, Bonnie K.; Charlton, Sarah J.
This paper presents a strategy to quantify the influence major point sources in a region have on extreme pollution values observed at each of the monitors in the network. We focus on the number of hours in a day the levels at a monitor exceed a specified health threshold. The number of daily exceedances are modeled using observation-driven negative binomial time series regression models, allowing for a zero-inflation component to characterize the probability of no exceedances in a particular day. 
The spatial nature of the problem is addressed through the use of a Gaussian plume model for atmospheric dispersion computed at locations of known emissions, creating covariates that impact exceedances. In order to isolate the influence of emitters at individual monitors, we fit separate regression models to the series of counts from each monitor. We apply a final model clustering step to group monitor series that exhibit similar behavior with respect to mean, variability, and common contributors to support policy decision making. The methodology is applied to eight benzene pollution series measured at air quality monitors around the Houston ship channel, a major industrial port.
Item A Bayesian hierarchical model for maximizing the vascular adhesion of nanoparticles (Springer, 2014) Fronczyk, Kassandra; Guindani, Michele; Vannucci, Marina; Palange, Annalisa; Decuzzi, Paolo
The complex vascular dynamics and wall deposition of systemically injected nanoparticles is regulated by their geometrical properties (size, shape) and biophysical parameters (ligand–receptor bond type and surface density, local shear rates). Although sophisticated computational models have been developed to capture the vascular behavior of nanoparticles, it is increasingly recognized that purely deterministic approaches, where the governing parameters are known a priori and conclusively describe behaviors based on physical characteristics, may be too restrictive to accurately reflect natural processes. Here, a novel computational framework is proposed by coupling the physics dictating the vascular adhesion of nanoparticles with a stochastic model. In particular, two governing parameters (i.e. the ligand–receptor bond length and the ligand surface density on the nanoparticle) are treated as two stochastic quantities, whose values are not fixed a priori but would rather range in defined intervals with a certain probability. 
This approach is used to predict the deposition of spherical nanoparticles with different radii, ranging from 750 to 6,000 nm, in a parallel plate flow chamber under different flow conditions, with a shear rate ranging from 50 to 90 s⁻¹. It is demonstrated that the resulting stochastic model can predict the experimental data more accurately than the original deterministic model. This approach allows one to increase the predictive power of mathematical models of any natural process by accounting for the experimental and intrinsic biological uncertainties.