Center for Research Computing
The Center for Research Computing (CRC) supports computational work by Rice faculty, staff, and student researchers. When the lead author deems these contributions to merit explicit acknowledgement in the paper or dataset, or when the lead author is CRC staff, the item is manually added to this collection (in addition to any other collections it may already belong to).
Browsing Center for Research Computing by Title
Now showing 1 - 20 of 69
Item: A Crowdsourcing Approach to Developing and Assessing Prediction Algorithms for AML Prognosis (Public Library of Science, 2016)
Noren, David P.; Long, Byron L.; Norel, Raquel; Rrhissorrakrai, Kahn; Hess, Kenneth; Hu, Chenyue Wendy; Bisberg, Alex J.; Schultz, Andre; Engquist, Erik; Liu, Li; Lin, Xihui; Chen, Gregory M.; Xie, Honglei; Hunter, Geoffrey A.M.; Boutros, Paul C.; Stepanov, Oleg; DREAM 9 AML-OPC Consortium; Norman, Thea; Friend, Stephen H.; Stolovitzky, Gustavo; Kornblau, Steven; Qutub, Amina A.

Acute Myeloid Leukemia (AML) is a fatal hematological cancer. The genetic abnormalities underlying AML are extremely heterogeneous among patients, making prognosis and treatment selection very difficult. While clinical proteomics data have the potential to improve prognosis accuracy, the quantitative means to do so have thus far yet to be developed. Here we report the results and insights gained from the DREAM 9 Acute Myeloid Leukemia Outcome Prediction Challenge (AML-OPC), a crowdsourcing effort designed to promote the development of quantitative methods for AML prognosis prediction. We identify the most accurate and robust models in predicting patient response to therapy, remission duration, and overall survival. We further investigate patient response to therapy, a clinically actionable prediction, and find that patients who are classified as resistant to therapy are harder to predict than responsive patients across the 31 models submitted to the challenge. The top two performing models, which achieved high sensitivity for these patients, made substantial use of the proteomics data in their predictions. Using these models, we also identify which signaling proteins were useful in predicting patient therapeutic response.

Item: A discrepancy-based penalty method for extended waveform inversion (Society of Exploration Geophysicists, 2017)
Fu, Lei; Symes, William W.; The Rice Inversion Project

Extended waveform inversion globalizes the convergence of seismic waveform inversion by adding nonphysical degrees of freedom to the model, thus permitting it to fit the data well throughout the inversion process. These extra degrees of freedom must be curtailed at the solution, for example, by penalizing them as part of an optimization formulation. For separable (partly linear) models, a natural objective function combines a mean-square data residual and a quadratic regularization term penalizing the nonphysical (linear) degrees of freedom. The linear variables are eliminated in an inner optimization step, leaving a function of the outer (nonlinear) variables to be optimized. This variable projection method is convenient for computation, but it requires that the penalty weight be increased as the estimated model tends to the (physical) solution. We describe an algorithm based on discrepancy, that is, on maintaining the data residual at the inner optimum within a prescribed range, to control the penalty weight during the outer optimization. We evaluate this algorithm in the context of constant-density acoustic waveform inversion, by recovering a background model and a perturbation that fit bandlimited waveform data in the Born approximation.
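A minimal sketch of the discrepancy idea on a generic separable least-squares problem may help fix intuition. It is not the authors' seismic implementation: the matrix G, the identity penalty on x, and the residual targets are illustrative assumptions.

```python
import numpy as np

def discrepancy_penalty_step(G, d, alpha, target_lo, target_hi, factor=2.0):
    """One outer iteration of a discrepancy-controlled quadratic penalty
    for min_x ||G x - d||^2 + alpha ||x||^2, where x plays the role of
    the nonphysical (linear) degrees of freedom."""
    n = G.shape[1]
    # Variable projection: eliminate the linear variables at fixed alpha.
    x = np.linalg.solve(G.T @ G + alpha * np.eye(n), G.T @ d)
    residual = np.linalg.norm(G @ x - d)
    # Discrepancy control: keep the inner-optimum residual in the
    # prescribed range by adjusting the penalty weight.
    if residual > target_hi:
        alpha /= factor   # residual too large: relax the penalty
    elif residual < target_lo:
        alpha *= factor   # residual too small: tighten the penalty
    return x, residual, alpha
```

Iterating this step while updating the outer (physical) variables mirrors the role the discrepancy principle plays in the paper: the penalty weight grows automatically as the physical model improves and the data residual shrinks.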
Item: A Refined Parallel Simulation of Crossflow Membrane Filtration (2011)
Boyle, Paul Martin; Houchens, Brent C.

This work builds upon previous research in the development of a simulation that incorporated a dynamically updating velocity profile and electric interactions between particles with a Force Bias Monte Carlo method. Surface roughness of the membranes is added to this work by fixing particles to the membrane surface. Additionally, the previous electric interactions are verified through the addition of an all-range solution for the electrostatic double-layer potential between two particles. Numerous numerical refinements are made to the simulation to ensure accuracy and to confirm that previous results using single-precision variables agree with double-precision work. Finally, the method by which particles move within a Monte Carlo step is altered to implement a different data-handling structure for the parallel environment. This new data-handling structure greatly reduces the runtime while providing a more realistic movement scheme for the particles. Additionally, this data-handling scheme offers the possibility of using a variety of n-body algorithms that could, in the future, improve the speed of the simulation in cases with very high particle counts.
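As a rough illustration of the force-bias Monte Carlo idea, here is a one-dimensional sketch in which the trial displacement is drawn with a bias along the instantaneous force and the acceptance rule is corrected for that bias to preserve detailed balance. The potential, step size, and bias strength lam are arbitrary assumptions, not the thesis's membrane model.

```python
import numpy as np

rng = np.random.default_rng(0)

def force_bias_step(x, U, F, beta=1.0, delta=0.2, lam=0.5):
    """One force-bias Monte Carlo move for a single 1-D coordinate.
    The trial displacement dx in [-delta, delta] has density ~ exp(a*dx),
    a = beta*lam*F(x), so moves along the force are favored; the
    Metropolis-Hastings acceptance corrects for this bias."""
    def log_z(c):  # log normalization of exp(c*dx) over [-delta, delta]
        if abs(c) < 1e-12:
            return np.log(2 * delta)
        return np.log((np.exp(c * delta) - np.exp(-c * delta)) / c)

    a = beta * lam * F(x)
    u = rng.random()
    # Inverse-transform sample of dx from the force-tilted density.
    if abs(a) < 1e-12:
        dx = (2 * u - 1) * delta
    else:
        dx = np.log(np.exp(-a * delta)
                    + u * (np.exp(a * delta) - np.exp(-a * delta))) / a
    x_new = x + dx
    b = beta * lam * F(x_new)
    # Boltzmann factor times the forward/reverse proposal-density ratio.
    log_acc = -beta * (U(x_new) - U(x)) - (a + b) * dx + log_z(a) - log_z(b)
    return x_new if np.log(rng.random()) < log_acc else x

# Example: sampling a harmonic well, U = x^2/2, F = -dU/dx = -x.
x = 1.0
for _ in range(1000):
    x = force_bias_step(x, U=lambda s: 0.5 * s * s, F=lambda s: -s)
```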
Item: A Software Framework for Finite Difference Simulation (2009-04)
Terentyev, Igor

This paper describes a software framework for solving time-dependent PDEs in simple domains using finite difference (FD) methods. The framework is designed for parallel computation on distributed- and shared-memory computers, thus allowing efficient solution of large-scale problems. The framework provides automated data exchange between processors based on stencil information. This automated data exchange allows a user to add FD schemes without knowledge of the underlying parallel infrastructure. The framework includes an acoustic solver based on staggered FD schemes, second-order in time and of various orders in space, with perfectly matched layer and/or free-surface boundary conditions.
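The stencil-driven exchange can be pictured with a small sketch: once a scheme declares its stencil half-width, the ghost-cell traffic is fully determined. The emulation below uses a list of 1-D subdomain arrays in place of MPI ranks; the names and layout are illustrative assumptions, not the framework's API.

```python
import numpy as np

def exchange_halos(subdomains, radius):
    """Fill each subdomain's ghost cells (width = stencil half-width)
    from its neighbors' outermost interior points. List entries stand
    in for MPI ranks; a real run would use sends/receives instead."""
    for i, u in enumerate(subdomains):
        if i > 0:                       # left ghost <- left neighbor's interior
            u[:radius] = subdomains[i - 1][-2 * radius:-radius]
        if i < len(subdomains) - 1:     # right ghost <- right neighbor's interior
            u[-radius:] = subdomains[i + 1][radius:2 * radius]

def apply_laplacian(u):
    """Second-order 1-D Laplacian on the interior; needs radius >= 1."""
    return u[:-2] - 2 * u[1:-1] + u[2:]

# Two subdomains of 8 interior points each, one ghost cell per side.
parts = [np.zeros(10), np.zeros(10)]
parts[0][1:-1] = np.arange(8)
parts[1][1:-1] = np.arange(8, 16)
exchange_halos(parts, radius=1)      # ghost cells now mirror the neighbors
interior_update = apply_laplacian(parts[0])
```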
Item: Advanced Computational Methods for High-accuracy Refinement of Protein Low-quality Models (2016-11-10)
Zang, Tianwu; Ma, Jianpeng

Predicting the three-dimensional structure of proteins has been a major interest in modern computational biology. While many successful methods can generate models with 3-5 Å root-mean-square deviation (RMSD) from the solution, progress in refining these models has been quite slow. Effective methods are therefore urgently needed to bring low-quality models into higher-accuracy ranges (e.g., less than 2 Å RMSD). In this thesis, I present several novel computational methods to address the high-accuracy refinement problem. First, an enhanced sampling method, named parallel continuous simulated tempering (PCST), is developed to accelerate molecular dynamics (MD) simulation. Second, two energy-biasing methods, the Structure-Based Model (SBM) and the Ensemble-Based Model (EBM), are introduced to perform targeted sampling around important conformations. Third, a three-step method is developed to blindly select high-quality models along the MD simulation. These methods work together to achieve significant refinement of low-quality models without any knowledge of the solution. The effectiveness of these methods is examined in different applications. Using the PCST-SBM method, models with higher global distance test scores (GDT_TS) are generated and selected in the MD simulation of 18 targets from the refinement category of the 10th Critical Assessment of Structure Prediction (CASP10). In addition, a refinement test on two CASP10 targets using the PCST-EBM method indicates that EBM may bring the initial model to even higher-quality levels. Furthermore, a multi-round PCST-SBM refinement protocol improves the model quality of a protein to a level sufficiently high for molecular replacement in X-ray crystallography. Our results confirm the crucial role of enhanced sampling in protein structure prediction and demonstrate that considerable improvement of low-accuracy structures is still achievable with current force fields.
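For readers unfamiliar with tempering, the discrete variant below captures the core mechanism that PCST builds on; PCST itself makes the temperature a continuous dynamical variable and runs in parallel, which this sketch does not attempt. The potential, temperature ladder, and rung weights are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulated_tempering(U, betas, weights, n_steps=10000, step=0.5):
    """Discrete simulated tempering: the walker samples x jointly with a
    temperature index k. Moves in x use Metropolis at the current beta;
    moves in k are accepted with exp(-(b' - b) * U(x) + (w' - w))."""
    x, k = 0.0, 0
    samples = []
    for _ in range(n_steps):
        # Position update at the current temperature.
        x_new = x + step * rng.standard_normal()
        if np.log(rng.random()) < -betas[k] * (U(x_new) - U(x)):
            x = x_new
        # Temperature update to a random neighboring rung.
        k_new = k + rng.choice([-1, 1])
        if 0 <= k_new < len(betas):
            log_acc = (-(betas[k_new] - betas[k]) * U(x)
                       + (weights[k_new] - weights[k]))
            if np.log(rng.random()) < log_acc:
                k = k_new
        samples.append((x, k))
    return samples

betas = np.linspace(1.0, 0.2, 5)   # inverse-temperature ladder
weights = np.zeros(5)              # ideally ~ per-rung free-energy estimates
traj = simulated_tempering(lambda s: (s**2 - 1.0)**2, betas, weights)
```

Excursions to high temperature let the walker cross barriers that trap a fixed-temperature simulation, which is the property the thesis exploits for refinement.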
Item: Advanced Computational Methods for Macromolecular Modeling and Structure Determination (2013-12-05)
Zhang, Chong; Ma, Jianpeng; Nordlander, Peter J.; Kiang, Ching-Hwa; Raphael, Robert M.

As the volume and complexity of macromolecules increase, theories and algorithms that deal with structure determination at low X-ray resolution are of particular importance. With limited diffraction data in hand, experimentalists rely on advanced computational tools to extract and utilize useful information, seeking to determine a three-dimensional model that best fits the experimental data. The success of further studies on the properties and function of a specific molecule, the key to practical applications, is therefore heavily dependent on the validity and accuracy of the solved structure. In this thesis I propose the Deformable Complex Network (DCN) and introduce Normal Mode Analysis (NMA), which are designed to model the average coordinates of atoms and the associated fluctuations, respectively. Their applications to structure determination target two major branches: positional refinement and temperature-factor refinement. I demonstrate their remarkable performance in structure improvement based on several criteria, such as the free R value, the overfitting effect, and Ramachandran statistics, with tests carried out across a broad range of real systems for generality and consistency.
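A bare-bones NMA sketch shows where the modeled fluctuations come from: diagonalize the mass-weighted Hessian at an energy minimum, then accumulate per-atom mean-square fluctuations from the nonzero modes, which scale like temperature (B) factors. The Hessian and masses are assumed inputs here; DCN itself is not reproduced.

```python
import numpy as np

def normal_modes(hessian, masses):
    """Diagonalize the mass-weighted Hessian (3N x 3N, evaluated at an
    energy minimum). Columns of `modes` are mass-weighted eigenvectors;
    in a real system the first ~6 near-zero modes are rigid-body motions."""
    m = np.repeat(masses, 3)
    inv_sqrt_m = 1.0 / np.sqrt(m)
    evals, modes = np.linalg.eigh(hessian * np.outer(inv_sqrt_m, inv_sqrt_m))
    return evals, modes, m

def mean_square_fluctuations(evals, modes, m, kT=1.0, n_skip=6):
    """Per-atom mean-square fluctuations from the nonzero modes,
    <dx_i^2> = kT * sum_k v_ik^2 / (lambda_k * m_i)."""
    msf = np.zeros(len(m) // 3)
    for lam, v in zip(evals[n_skip:], modes.T[n_skip:]):
        msf += (kT / lam) * (v ** 2 / m).reshape(-1, 3).sum(axis=1)
    return msf

# Toy example: a random positive semidefinite "Hessian" for 4 atoms.
B = np.random.default_rng(3).standard_normal((12, 12))
evals, modes, m = normal_modes(B.T @ B, masses=np.ones(4))
msf = mean_square_fluctuations(evals, modes, m)
```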
Item: An adaptive multiscale algorithm for efficient extended waveform inversion (Society of Exploration Geophysicists, 2017)
Fu, Lei; Symes, William W.; The Rice Inversion Project

Subsurface-offset extended full-waveform inversion (FWI) may converge to kinematically accurate velocity models without the low-frequency data accuracy required for standard data-domain FWI. However, this robust alternative approach to waveform inversion suffers from a very high computational cost resulting from its use of nonlocal wave physics: the computation of strain from stress involves an integral over the subsurface-offset axis, which must be performed at every space-time grid point. We found that a combination of data-fit-driven offset limits, grid coarsening, and low-pass data filtering can reduce the cost of extended inversion by one to two orders of magnitude.

Item: Anhydrosugars as tracers of fire air quality effects, carbon cycling and paleoclimate (2020-04-24)
Suciu, Loredana G; Masiello, Caroline A; Griffin, Robert J

Wild and prescribed fires are important sources of a broad suite of organic compounds collectively termed pyrogenic carbon (PyC). Most PyC compounds have additional sources beyond fire, adding uncertainty to their use as tracers. However, members of the anhydrosugar family of isomeric compounds (levoglucosan, galactosan and mannosan) are generated exclusively by the pyrolysis and combustion of cellulose and hemicellulose. Although anhydrosugars are some of the only unique organic markers for fire, their use as tracers in atmospheric, marine, and terrestrial systems is challenging because there is no clear theoretical framework to deal with their reactivity and phase partitioning. The atmospheric science community has made the first approximation that they are unreactive on timescales of interest. This assumption is problematic, because there is ample evidence of anhydrosugar reactivity on short timescales. On the other hand, the terrestrial and marine science communities have not yet seen wide use of anhydrosugars as tracers because our understanding of their biogeochemistry and transport through the Earth system is poorly constrained. Chapter 2 of this thesis reviews evidence for anhydrosugar production, degradation and detection in various environments and uses this information to develop a framework for the use of anhydrosugars in research on PyC and organic matter in the Earth system. Anhydrosugars are chemically reactive in all phases (gaseous, aqueous and particulate), molecularly diffusive in semisolid matter, semivolatile, water-soluble, and biodegradable. Their chemical composition also suggests that they sorb to soil mineral surfaces. Together, these characteristics mean that anhydrosugars are not conservative tracers. While these traits have historically been perceived as drawbacks, here I argue that they present opportunities for new research avenues, including tracking organic matter transport and degradation in multiple environments. Chapter 3 of this thesis provides insights into the development and simulation of a model of the atmospheric degradation of levoglucosan (LEV), the most abundant anhydrosugar emitted from biomass burning, and of its effects on the formation of secondary organic aerosols (SOA) and other gases, using a zero-dimensional (0-D) modeling framework. Existing chemical mechanisms (homogeneous gas-phase chemistry and heterogeneous chemistry) were updated to include the chemical degradation of LEV and its intermediary degradation products in both phases (gas and aerosol). In addition, a gas-particle partitioning mechanism was added to the model to account for the effect of evaporation and condensation on the concentrations of LEV and its degradation products. Comparison of simulation results with measurements from various chamber experiments shows that the degradation timescale of LEV varies by phase: 1.5-3.5 days (gas phase) and 8-21 hours (aerosol phase). These relatively short timescales suggest that most of the initial LEV concentration can be lost chemically or deposited locally before being transported regionally. Estimated SOA yields (5-32%) reveal that conversion of LEV to secondary products is significant and occurs rapidly in the studied scenarios. The chemical degradation of LEV affects other gases, for example by increasing the concentrations of radicals and total reactive nitrogen. Decreases in nitrogen oxides (NOx) appear to drive a more rapid increase in ozone (O3) relative to volatile organic compound (VOC) levels. Future model evaluations and subsequent implementation of the 0-D multiphase LEV chemistry (extended to include its isomers) in chemical transport models (CTMs) will allow modeling of both the regional transport and the deposition of anhydrosugars, and thus better assessment of their atmospheric implications and use as tracers. Another application of anhydrosugars would be to trace regional air transported into highly polluted urban areas, such as the Houston area. Vegetation fires occurring outside this region contribute emissions of O3 precursors, such as VOC and NOx. However, in the Houston area there are multiple sources of such emissions (industrial activity, vehicle exhaust, etc.), and mixing with those sources complicates the quantification of regional contributions to locally measured concentrations. This is important because air pollution control measures impact the industrial activity in the area. While anhydrosugars have not been used in this study to help constrain regional background O3 and NOx, they open an unexplored pathway for future studies that can build on the additional work presented in Chapter 4 of this thesis: the estimation of regional background O3 and NOx using statistical analysis of O3, NOx and meteorology measured in the Houston-Galveston-Brazoria (HGB) region. This study used four different approaches based on principal component analysis (PCA). Three of these approaches consist of independent PCA on both O3 and NOx at both the 1-h and 8-h averaging levels, to compare the results with previous studies and to highlight the effect of both temporal and spatial scales. In the fourth approach, O3, NOx and meteorology were co-varied. Results show that the estimation of regional background O3 has less inherent uncertainty when constrained by NOx and meteorology, yielding a statistically significant temporal trend of -0.68 ± 0.27 ppb y^-1. Likewise, the estimated regional background NOx trend constrained by O3 and meteorology was -0.04 ± 0.02 ppb y^-1 (upper bound) and -0.03 ± 0.01 ppb y^-1 (lower bound). The best estimates of the 17-year average of season-scale background O3 and NOx were 46.72 ± 2.08 ppb and 6.80 ± 0.13 ppb (upper bound) or 4.45 ± 0.08 ppb (lower bound), respectively. Average background O3 is consistent with previous studies and between the approaches used in this study, although the approaches based on 8-h averages likely overestimate background O3 compared to the hourly median approach by 7-9 ppb. Similarly, the upper bound of average background NOx is consistent between approaches in this study but overestimated compared to the hourly approach by 1 ppb, on average. The study likely overestimates the upper-bound background NOx due to instrument overdetection of NOx and the 8-h averaging of NOx and meteorology coinciding with the maximum daily eight-hour average O3. Regional background O3 and NOx in the HGB region have both declined over the past two decades. This decline became steadier after 2007, overlapping with the effects of controlling precursor emissions and a prevailing southeasterly-southerly flow.
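To see what the reported degradation timescales imply for regional transport, one can treat them as first-order (e-folding) lifetimes; that reading, and the one-day transport time, are assumptions made here only for illustration.

```python
import numpy as np

def fraction_remaining(t_hours, lifetime_hours):
    """First-order loss: C(t)/C0 = exp(-t / tau)."""
    return np.exp(-t_hours / lifetime_hours)

# Degradation timescales quoted in the abstract, read as e-folding times.
cases = [("gas phase, 1.5 d", 1.5 * 24), ("gas phase, 3.5 d", 3.5 * 24),
         ("aerosol phase, 8 h", 8.0), ("aerosol phase, 21 h", 21.0)]
for label, tau in cases:
    print(f"{label}: {fraction_remaining(24.0, tau):.0%} of LEV remains after 24 h")
```

Under these assumptions, only about 5-32% of aerosol-phase LEV survives a day of transport, consistent with the abstract's conclusion that much of the initial LEV is lost or deposited before reaching regional scales.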
Item: Communication Optimizations for Distributed-Memory X10 Programs (2010-04-10)
Barik, Rajkishore; Budimlić, Zoran; Grove, David; Peshansky, Igor; Sarkar, Vivek; Zhao, Jisheng

X10 is a new object-oriented PGAS (Partitioned Global Address Space) programming language with support for distributed asynchronous dynamic parallelism that goes beyond past SPMD message-passing models such as MPI and SPMD PGAS models such as UPC and Co-Array Fortran. The concurrency constructs in X10 make it possible to express complex computation and communication structures with higher productivity than other distributed-memory programming models. However, this productivity often comes at the cost of high performance overhead when the language is used in its full generality. This paper introduces high-level compiler optimizations and transformations to reduce communication and synchronization overheads in distributed-memory implementations of X10 programs. Specifically, we focus on locality optimizations such as scalar replacement and task localization, combined with supporting transformations such as loop distribution, scalar expansion, loop tiling, and loop splitting. We have completed a prototype implementation of these high-level optimizations and performed a performance evaluation that shows significant improvements in performance, scalability, communication volume, and number of tasks. We evaluated the communication optimizations on three platforms: a 128-node BlueGene/P cluster, a 32-node Nehalem cluster, and a 16-node Power7 cluster. On the BlueGene/P cluster, we observed a maximum performance improvement of 31.46× relative to the unoptimized case (for the MolDyn benchmark). On the Nehalem cluster, we observed a maximum performance improvement of 3.01× (for the NQueens benchmark), and on the Power7 cluster, we observed a maximum performance improvement of 2.73× (for the MolDyn benchmark). In addition, there was no case in which the optimized code was slower than the unoptimized case. We also believe that the optimizations presented in this paper will be necessary for any high-productivity PGAS language based on modern object-oriented principles that is designed for execution on future extreme-scale systems, which place a high premium on locality improvement for performance and energy efficiency.
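Scalar replacement, the first locality optimization named above, is easy to picture outside X10: a loop-invariant access that would be remote in a PGAS setting is hoisted into a local scalar, so the communication happens once rather than once per iteration. The Python below is only an analogy for the source-level transformation, not the paper's compiler output.

```python
from dataclasses import dataclass

@dataclass
class Params:
    scale: float   # imagine this field lives at a remote place in X10

def dot_before(a, b, p: Params):
    s = 0.0
    for i in range(len(a)):
        s += p.scale * a[i] * b[i]   # one (conceptually remote) read per iteration
    return s

def dot_after(a, b, p: Params):
    scale = p.scale                  # scalar replacement: read once, keep local
    s = 0.0
    for i in range(len(a)):
        s += scale * a[i] * b[i]
    return s
```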
Item: Coupling a dynamically updating velocity profile and electric field interactions with force bias Monte Carlo methods to simulate colloidal fouling in membrane filtration (2009)
Boyle, Paul Martin; Houchens, Brent C.

Work has been completed on the modeling of pressure-driven channel flow with particulate volume fractions ranging from one to ten percent. Transport of particles is influenced by Brownian and shear-induced diffusion and by convection due to the axial crossflow. The particles in the simulation are also subject to electrostatic double-layer repulsion and van der Waals attraction, both between particles and between the particles and channel surfaces. These effects are modeled using Hydrodynamic Force Bias Monte Carlo (HFBMC) simulations to predict the deposition of the particles on the channel surfaces. Hydrodynamics and the change in particle potential determine the probability that a proposed, random move of a particle will be accepted. These discrete particle effects are coupled to the continuum flow via an apparent local viscosity, yielding a dynamically updating quasi-steady-state velocity profile. Results of this study indicate that particles subject to combined hydrodynamic and electric effects reach a highly stable steady state when compared to systems in which particles are subject only to hydrodynamic effects.
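The competing double-layer repulsion and van der Waals attraction mentioned here are commonly combined in a DLVO-type pair potential. The sketch below uses textbook Derjaguin-limit expressions for equal spheres with illustrative parameter values; it is not the thesis's interaction model.

```python
import numpy as np

EPS_WATER = 80.1 * 8.854e-12      # permittivity of water (F/m)

def dlvo_potential(h, radius, psi0, kappa, hamaker):
    """DLVO pair potential for two equal spheres in the Derjaguin limit
    (surface separation h << radius): electrostatic double-layer
    repulsion plus van der Waals attraction. Units: h, radius in m;
    psi0 (surface potential) in V; kappa (inverse Debye length) in 1/m;
    hamaker (Hamaker constant) in J."""
    v_edl = 2 * np.pi * EPS_WATER * radius * psi0**2 * np.log1p(np.exp(-kappa * h))
    v_vdw = -hamaker * radius / (12 * h)
    return v_edl + v_vdw

# Illustrative 100 nm particle in ~10 mM monovalent salt (Debye length ~3 nm).
h = np.linspace(0.2e-9, 20e-9, 200)
v = dlvo_potential(h, radius=50e-9, psi0=-0.025, kappa=1 / 3e-9, hamaker=1e-20)
barrier_kT = v.max() / (1.38e-23 * 298)   # repulsive barrier in units of kT
```

The height of that barrier relative to kT is what determines whether a proposed Monte Carlo move toward the membrane is likely to be accepted, linking the potential to the deposition behavior described in the abstract.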
Item: Critical Theory, Normativity, and Catastrophe: A Critique of Amy Allen's Metanormative Contextualism (Rice University, 2020-05)
Rehman, Bilal; Crowell, Steven

Critical theory is an approach to philosophical and cultural analysis that focuses on oppression and liberation. In this essay, I consider the prospect of moral-political progress in critical theory, focusing primarily on Amy Allen's position of metanormative contextualism as described in her 2016 work, "The End of Progress." I first consider Allen's arguments against Jürgen Habermas' theory of communicative action, and then explore how metanormative contextualism is rooted in the thought of Theodor Adorno and Michel Foucault. Lastly, by showing how postcolonial studies reminds us of the deeply political stakes of critical theory, I argue that ideas about moral-political progress can be grounded in the urgent need to "avoid catastrophe."

Item: Depth and coral cover drive the distribution of a coral macroborer across two reef systems (Public Library of Science, 2018)
Maher, Rebecca L.; Johnston, Michelle A.; Brandt, Marilyn E.; Smith, Tyler Burton; Correa, Adrienne M.S.

Bioerosion, the removal of calcium carbonate from coral frameworks by living organisms, influences a variety of reef features, from their topographic complexity to the net balance of carbonate budgets. Little is known, however, about how macroborers, which bore into reef substrates leaving traces greater than 0.1 mm in diameter, are distributed across coral reefs, particularly reef systems with high (>50%) stony coral cover or at mesophotic depths (≥30 m). Here, we present an accurate and efficient method for quantifying macroborer densities from stony coral hosts via image analysis, using the bioeroding barnacle Lithotrya dorsalis and its host coral Orbicella franksi as a case study. We found that in 2014, L. dorsalis densities varied consistently with depth and host percent cover in two Atlantic reef systems: the Flower Garden Banks (FGB, northwest Gulf of Mexico) and the U.S. Virgin Islands (USVI). Although average barnacle density was nearly 4.5 times greater overall in the FGB than in the USVI, barnacle density decreased with depth in both reef regions. Barnacle density also scaled negatively with increasing coral cover in the study areas, suggesting that barnacle populations are not strictly space-limited in their distribution and settlement opportunities. Our findings suggest that depth and host coral cover, and potentially local factors, may strongly influence the abundance of macroborers, and thus the rate of CaCO3 loss, in a given reef system. Our image analysis method for quantifying macroborers can be standardized across historical and modern reef records to better understand how borers impact host growth and reef health.

Item: Digital Scholarship Services Fall 2019 Newsletter (Rice University, 2019-09-26)
Fondren Library Digital Scholarship Services

In this newsletter, Fondren Library's Digital Scholarship Services highlights services, tools and collections of interest to the Rice community. It also shares staff news and spotlights recent collaborations.

Item: Digital Scholarship Services Fall 2020 Newsletter (Rice University, 2020-10-13)
Fondren Library Digital Scholarship Services

In this newsletter, Fondren Library's Digital Scholarship Services highlights services, tools and collections of interest to the Rice community. It also shares staff news and spotlights recent collaborations. Note: In an effort to preserve the original newsletter content, URLs have been replaced with Perma.cc links, which provide an archived snapshot of the webpage at the time the newsletter was created.

Item: Digital Scholarship Services Spring 2020 Newsletter (Rice University, 2020-02-20)
Fondren Library Digital Scholarship Services

In this newsletter, Fondren Library's Digital Scholarship Services highlights services, tools and collections of interest to the Rice community. It also shares staff news and spotlights recent collaborations.

Item: Digital Scholarship Services Spring 2021 Newsletter (Rice University, 2021-02-25)
Fondren Library Digital Scholarship Services

In this newsletter, Fondren Library's Digital Scholarship Services highlights services, tools and collections of interest to the Rice community. It also shares staff news and spotlights recent collaborations. Note: In an effort to preserve the original newsletter content, URLs have been replaced with Perma.cc links, which provide an archived snapshot of the webpage at the time the newsletter was created.

Item: Digital Scholarship Services Summer 2019 Newsletter (Rice University, 2019-07-10)
Fondren Library Digital Scholarship Services

In this newsletter, Fondren Library's Digital Scholarship Services highlights services, tools and collections of interest to the Rice community. It also shares staff news and spotlights recent collaborations.

Item: Digital Scholarship Services Summer 2020 Newsletter (Rice University, 2020-06-11)
Fondren Library Digital Scholarship Services

In this newsletter, Fondren Library's Digital Scholarship Services highlights services, tools and collections of interest to the Rice community. It also shares staff news and spotlights recent collaborations. Note: In an effort to preserve the original newsletter content, URLs have been replaced with Perma.cc links, which provide an archived snapshot of the webpage at the time the newsletter was created.

Item: Digital Scholarship Services Summer 2021 Newsletter (Rice University, 2021-06-08)
Fondren Library Digital Scholarship Services

In this newsletter, Fondren Library's Digital Scholarship Services highlights services, tools and collections of interest to the Rice community. It also shares staff news and spotlights recent collaborations. Note: In an effort to preserve the original newsletter content, URLs have been replaced with Perma.cc links, which provide an archived snapshot of the webpage at the time the newsletter was created.

Item: Discontinuous Galerkin Methods for Parabolic Partial Differential Equations with Random Input Data (2013-09-16)
Liu, Kun; Riviere, Beatrice M.; Heinkenschloss, Matthias; Symes, William W.; Vannucci, Marina

This thesis discusses and develops one approach to solving parabolic partial differential equations with random input data. The stochastic problem is first transformed into a parametrized one by using a finite-dimensional noise assumption and a truncated Karhunen-Loève expansion. The approach, the Monte Carlo discontinuous Galerkin (MCDG) method, randomly generates M realizations of the uncertain coefficients and approximates the expected value of the solution by averaging the M numerical solutions. This approach is applied to two numerical examples. The first example is a two-dimensional parabolic partial differential equation with a random convection term, and the second is a benchmark problem coupling flow and transport equations. I first apply second-order polynomial kernel principal component analysis to generate M realizations of random permeability fields. These are used to obtain M realizations of the random convection term, computed by solving the flow equation. Using this approach, I solve the transport equation M times, corresponding to the M velocity realizations. The MCDG solution spreads toward the whole domain from the initial location, and the contaminant does not leave the initial location completely as time elapses. The results show that the MCDG solution is realistic, because it takes the uncertainty in the velocity fields into consideration. In addition, to correct overshoot and undershoot in the solution caused by the high level of oscillation in the random velocity realizations, I solve the transport equation on meshes of finer resolution than that of the permeability, and use a slope limiter as well as lower- and upper-bound constraints to address this difficulty. Finally, future work is proposed.
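The MCDG recipe (draw a Karhunen-Loève random vector, solve a deterministic problem, average) can be sketched compactly. The stand-in below solves a steady 1-D diffusion problem by finite differences rather than a parabolic problem by discontinuous Galerkin, so everything here (the two-term KL field, mesh size, and sample count M) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(42)

def kl_field(x, mean, eigvals, eigfuncs, xi):
    """Truncated Karhunen-Loeve expansion:
    a(x) = mean + sum_k sqrt(lambda_k) * phi_k(x) * xi_k."""
    return mean + sum(np.sqrt(lv) * phi(x) * z
                      for lv, phi, z in zip(eigvals, eigfuncs, xi))

def solve_realization(xi, n=64):
    """Toy deterministic solve: -(a u')' = 1 on (0,1), u(0) = u(1) = 0,
    with a log-normal two-term KL diffusivity, by finite differences."""
    x = np.linspace(0.0, 1.0, n + 1)
    xm = 0.5 * (x[:-1] + x[1:])                    # cell midpoints
    eigvals = [0.5, 0.25]
    eigfuncs = [lambda s: np.sin(np.pi * s),
                lambda s: np.sin(2 * np.pi * s)]
    a = np.exp(kl_field(xm, 0.0, eigvals, eigfuncs, xi))
    h = 1.0 / n
    main = (a[:-1] + a[1:]) / h**2                 # diagonal, size n-1
    off = -a[1:-1] / h**2                          # off-diagonals, size n-2
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    u = np.linalg.solve(A, np.ones(n - 1))
    return np.concatenate(([0.0], u, [0.0]))

# Monte Carlo estimate of E[u]: average M independent solves.
M, n_terms = 200, 2
mean_u = np.mean([solve_realization(rng.standard_normal(n_terms))
                  for _ in range(M)], axis=0)
```

Swapping `solve_realization` for a time-stepping DG solver leaves the outer Monte Carlo loop unchanged, which is the sense in which MCDG is non-intrusive.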