Center for Research Computing
The Center for Research Computing (CRC) supports computational work by Rice faculty, staff, and student researchers. In cases where the lead author deems these contributions to merit an explicit acknowledgement in the paper or dataset, or the lead author is CRC staff, that item is manually added to this collection (in addition to any other collections it may already belong to).
Browsing Center for Research Computing by Issue Date
Now showing 1 - 20 of 69
Item: Molecules in motion: Computing structural flexibility (2008). Shehu, Amarda; Kavraki, Lydia E.
Growing databases of protein sequences in the post-genomic era call for computational methods to extract structure and function from a protein sequence. In flexible molecules like proteins, function cannot be reliably extracted from a few structures. The amino-acid chain assumes various spatial arrangements (conformations) to modulate biological function. Characterizing the flexibility of a protein under physiological (native) conditions remains an open problem in computational biology. This thesis addresses the problem of characterizing the native flexibility of a protein by computing conformations populated under native conditions. Such computation involves locating free-energy minima in a high-dimensional conformational space. The methods proposed in this thesis search for native conformations using systematically less information from experiment: first employing an experimental structure, then using only a closure constraint in cyclic cysteine-rich peptides, and finally employing only the amino-acid sequence of small- to medium-size proteins. A novel method is proposed to compute structural fluctuations of a protein around an experimental structure. The method combines a robotics-inspired exploration of the conformational space with a statistical mechanics formulation. Thermodynamic quantities measured over generated conformations reproduce experimental data over broad time scales for small (∼100 amino acid) proteins with non-concerted motions. Capturing concerted motions motivates the development of the next methods. A second method is proposed that employs a closure constraint to generate native conformations of cyclic cysteine-rich peptides. The method first explores the entire conformational space, then explores the energy minima present until no lower-energy minima emerge. The method captures relevant features of the native state also observed in experiment for 20–30 amino-acid long peptides. A final method is proposed that implements a similar exploration for longer proteins, employing only the amino-acid sequence. In its first stage, the method explores the entire conformational space at a coarse-grained level of detail. A second stage focuses the exploration on low-energy regions in more detail. All-atom conformational ensembles are obtained for proteins that populate various functional states through large-scale concerted motions. These ensembles capture well the populated functional states of proteins up to 214 amino acids long.
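The statistical-mechanics step described above, measuring thermodynamic quantities over a set of generated conformations, amounts to a Boltzmann-weighted average over the ensemble. The Python sketch below illustrates only that averaging step; the energies, temperature, and observable values are placeholder inputs, not data from the thesis.

```python
import numpy as np

def ensemble_average(energies_kcal, observable, temperature_k=300.0):
    """Boltzmann-weighted average of an observable over sampled conformations.

    energies_kcal : per-conformation potential energies (kcal/mol)
    observable    : per-conformation values of the quantity of interest
    """
    kB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)
    beta = 1.0 / (kB * temperature_k)
    energies = np.asarray(energies_kcal, dtype=float)
    # Subtract the minimum energy for numerical stability before exponentiating.
    weights = np.exp(-beta * (energies - energies.min()))
    weights /= weights.sum()
    return float(np.dot(weights, np.asarray(observable, dtype=float)))

# Hypothetical ensemble: three conformations with made-up energies and radii of gyration.
print(ensemble_average([-120.0, -118.5, -115.0], [12.1, 12.8, 14.0]))
```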
Item: From high-level tasks to low-level motions: Motion planning for high-dimensional nonlinear hybrid robotic systems (2008). Plaku, Erion; Kavraki, Lydia E.
A significant challenge of autonomous robotics in transportation, exploration, and search-and-rescue missions lies in the area of motion planning. The overall objective is to enable robots to automatically plan the low-level motions needed to accomplish assigned high-level tasks. Toward this goal, this thesis proposes a novel multi-layered approach, termed Synergic Combination of Layers of Planning (SyCLoP), that synergically combines high-level discrete planning and low-level motion planning. High-level discrete planning, which draws from research in AI and logic, guides low-level motion planning during the search for a solution. Information gathered during the search is in turn fed back from the low-level to the high-level layer in order to improve the high-level plan in the next iteration. In this way, high-level plans become increasingly useful in guiding the low-level motion planner toward a solution. This synergic combination of high-level discrete planning and low-level motion planning allows SyCLoP to solve motion-planning problems with respect to rich models of the robot and the physical world. This facilitates the design of feedback controllers that enable the robot to execute in the physical world solutions obtained in simulation. In particular, SyCLoP effectively solves challenging motion-planning problems that incorporate robot dynamics, physics-based simulations, and hybrid systems. Hybrid systems move beyond continuous models by employing discrete logic to instantaneously modify the underlying robot dynamics in response to mishaps or unanticipated changes in the environment. Experiments in this thesis show that SyCLoP obtains significant computational speedups of one to two orders of magnitude when compared to state-of-the-art motion planners. In addition to planning motions that allow the robot to reach a desired destination while avoiding collisions, SyCLoP can take into account high-level tasks specified using the expressiveness of linear temporal logic (LTL). LTL allows for complex specifications, such as sequencing, coverage, and other combinations of temporal objectives. Going beyond motion planning, SyCLoP also provides a useful framework for discovering violations of safety properties in hybrid systems.

Item: Multi-scale Behavior in Chemical Reaction Systems: Modeling, Applications, and Results (2008-08). Turner, Jesse Hosea III
Four major approaches model the time-dependent behavior of chemical reaction systems: ordinary differential equations (ODEs), the tau-leap algorithm, stochastic differential equations (SDEs), and Gillespie's stochastic simulation algorithm (SSA). ODEs are simulated the most quickly of these, but are often inaccurate for systems with slow rates and molecular species present in small numbers. Under ideal conditions, the SSA is exact, but computationally inefficient. Unfortunately, many reaction systems exhibit characteristics not well captured individually by any of these methods. Therefore, hybrid models incorporating aspects from all four must be employed. The aim is to construct an approach that is close in accuracy to the SSA, useful for a wide range of reaction system examples, and computationally efficient. The Adaptive Multi-scale Simulation Algorithm (AMSA) uses the SSA for slow reactions, SDEs for medium-speed reactions, ODEs for fast reactions, and the tau-leap algorithm for non-slow reactions involving species small in number. This article introduces AMSA and applies it to examples of reaction systems involving genetic regulation. A thorough review of existing reaction simulation algorithms is included. The computational performance and accuracy of AMSA's molecular distributions are compared to those of the SSA, which is used as the gold standard of accuracy. The use of supercomputers can generate much larger data sets than serial processors in roughly the same amount of computational time. Therefore, multi-processor machines are also employed to assess the accuracy of AMSA simulations.
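The regime split that AMSA applies, as described above, can be pictured as a simple classification of reaction channels by speed and copy number. The Python sketch below only illustrates that partition rule; the thresholds, the propensity and species-count fields, and the example reactions are assumptions, not values taken from the work.

```python
# Illustrative sketch (not the AMSA implementation): partition reaction channels
# into solver regimes as the abstract describes: SSA for slow reactions,
# tau-leap for non-slow reactions with low-copy-number species, SDEs for
# medium-speed reactions, and ODEs for fast reactions. Thresholds are made up.

def classify_reactions(reactions, slow=1.0, fast=1e4, small_count=50):
    """reactions: list of dicts with 'propensity' and 'min_species_count'."""
    regimes = {"SSA": [], "tau-leap": [], "SDE": [], "ODE": []}
    for rxn in reactions:
        a, n_min = rxn["propensity"], rxn["min_species_count"]
        if a <= slow:
            regimes["SSA"].append(rxn)        # slow: simulate exactly
        elif n_min <= small_count:
            regimes["tau-leap"].append(rxn)   # non-slow but low copy numbers
        elif a >= fast:
            regimes["ODE"].append(rxn)        # fast: deterministic limit
        else:
            regimes["SDE"].append(rxn)        # medium-speed: diffusion approximation
    return regimes

example = [{"propensity": 0.2, "min_species_count": 3},
           {"propensity": 80.0, "min_species_count": 12},
           {"propensity": 5e5, "min_species_count": 1e6}]
print({k: len(v) for k, v in classify_reactions(example).items()})
```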
Item: Coupling a dynamically updating velocity profile and electric field interactions with force bias Monte Carlo methods to simulate colloidal fouling in membrane filtration (2009). Boyle, Paul Martin; Houchens, Brent C.
Work has been completed in the modeling of pressure-driven channel flow with particulate volume fractions ranging from one to ten percent. Transport of particles is influenced by Brownian and shear-induced diffusion, and by convection due to the axial crossflow. The particles in the simulation are also subject to electrostatic double layer repulsion and van der Waals attraction, both between particles and between the particles and channel surfaces. These effects are modeled using Hydrodynamic Force Bias Monte Carlo (HFBMC) simulations to predict the deposition of the particles on the channel surfaces. Hydrodynamics and the change in particle potential determine the probability that a proposed, random move of a particle will be accepted. These discrete particle effects are coupled to the continuum flow via an apparent local viscosity, yielding a dynamically updating quasi-steady-state velocity profile. Results of this study indicate that particles subject to combined hydrodynamic and electric effects reach a highly stable steady-state condition when compared to systems in which particles are subject only to hydrodynamic effects.
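The acceptance step described above, in which the change in particle potential sets the probability of accepting a proposed random move, can be illustrated with a generic Metropolis-style criterion. The sketch below shows only that generic rule; the force-bias and hydrodynamic weighting of the actual HFBMC method are not reproduced, and the example energy change is a made-up value in units of kT.

```python
import math, random

def accept_move(delta_potential_kT):
    """Metropolis-style acceptance: always accept downhill moves, accept uphill
    moves with probability exp(-dU/kT). delta_potential_kT is dU in units of kT."""
    if delta_potential_kT <= 0.0:
        return True
    return random.random() < math.exp(-delta_potential_kT)

# Hypothetical trial move: a particle displacement that raises its interaction
# energy by 1.3 kT is accepted roughly 27% of the time.
print(sum(accept_move(1.3) for _ in range(10_000)) / 10_000)
```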
Item: A Software Framework for Finite Difference Simulation (2009-04). Terentyev, Igor
This paper describes a software framework for solving time-dependent PDEs in simple domains using finite difference (FD) methods. The framework is designed for parallel computations on distributed and shared memory computers, thus allowing for efficient solution of large-scale problems. The framework provides automated data exchange between processors based on stencil information. This automated data exchange allows a user to add FD schemes without knowledge of the underlying parallel infrastructure. The framework includes an acoustic solver based on staggered FD schemes, second-order in time and of various orders in space, with perfectly matched layer and/or free surface boundary conditions.
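As a minimal illustration of the kind of stencil computation such a framework automates, the sketch below advances the 1D acoustic wave equation with a scheme that is second order in time and space (non-staggered, unlike the solver described above). The grid, time step, and boundary handling are illustrative assumptions; in a distributed run, the automated exchange would amount to trading one ghost cell per side, matching this stencil's width.

```python
import numpy as np

def acoustic_step_1d(u_prev, u_curr, c, dt, dx):
    """One explicit time step of the 1D wave equation u_tt = c^2 u_xx,
    second order in time and space. Interior points only; a distributed
    implementation would exchange one ghost cell per side, since the
    spatial stencil reaches one point left and right."""
    u_next = np.empty_like(u_curr)
    lap = u_curr[2:] - 2.0 * u_curr[1:-1] + u_curr[:-2]      # second difference in space
    u_next[1:-1] = 2.0 * u_curr[1:-1] - u_prev[1:-1] + (c * dt / dx) ** 2 * lap
    u_next[0] = u_next[-1] = 0.0                             # pressure-release (u = 0) ends
    return u_next

# Tiny demonstration: a pulse propagating on a 1D grid.
n, dx, dt, c = 101, 1.0, 0.4, 1.0                            # dt chosen to satisfy the CFL limit
x = np.arange(n) * dx
u_prev = np.exp(-0.1 * (x - 50.0) ** 2)
u_curr = u_prev.copy()
for _ in range(50):
    u_prev, u_curr = u_curr, acoustic_step_1d(u_prev, u_curr, c, dt, dx)
print(float(u_curr.max()))
```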
Item: Efficient optimization of memory accesses in parallel programs (2010). Barik, Rajkishore; Sarkar, Vivek
The power, frequency, and memory wall problems have caused a major shift in mainstream computing by introducing processors that contain multiple low-power cores. As multi-core processors are becoming ubiquitous, software trends in both parallel programming languages and dynamic compilation have added new challenges to program compilation for multi-core processors. This thesis proposes a combination of high-level and low-level compiler optimizations to address these challenges. The high-level optimizations introduced in this thesis include new approaches to May-Happen-in-Parallel analysis and Side-Effect analysis for parallel programs and a novel parallelism-aware Scalar Replacement for Load Elimination transformation. A new Isolation Consistency (IC) memory model is described that permits additional scalar replacement transformation opportunities compared to many existing memory models. The low-level optimizations include a novel approach to register allocation that retains the compile time and space efficiency of Linear Scan, while delivering runtime performance superior to both Linear Scan and Graph Coloring. The allocation phase is modeled as an optimization problem on a Bipartite Liveness Graph (BLG) data structure. The assignment phase focuses on reducing the number of spill instructions by using register-to-register move and exchange instructions wherever possible. Experimental evaluations of our scalar replacement for load elimination transformation in the Jikes RVM dynamic compiler show decreases in dynamic counts for getfield operations of up to 99.99%, and performance improvements of up to 1.76x on 1 core, and 1.39x on 16 cores, when compared with the load elimination algorithm available in Jikes RVM. A prototype implementation of our BLG register allocator in Jikes RVM demonstrates runtime performance improvements of up to 3.52x relative to Linear Scan on an x86 processor. When compared to the Graph Coloring register allocator in the GCC compiler framework, our allocator resulted in an execution time improvement of up to 5.8%, with an average improvement of 2.3% on a POWER5 processor. Combining these experimental evaluations with the foundations presented in this thesis, we believe that the proposed high-level and low-level optimizations are useful in addressing some of the new challenges emerging in the optimization of parallel programs for multi-core architectures.

Item: Grid-centric scheduling strategies for workflow applications (2010). Zhang, Yang; Cooper, Keith D.
Grid computing faces a great challenge because the resources are not localized, but distributed, heterogeneous, and dynamic. Thus, it is essential to provide a set of programming tools that execute an application on the Grid resources with as little input from the user as possible. The thesis of this work is that Grid-centric scheduling techniques for workflow applications can provide good usability of the Grid environment by reliably executing the application on a large-scale distributed system with good performance. We support our thesis with new and effective approaches in the following five aspects. First, we modeled the performance of the existing scheduling approaches in a multi-cluster Grid environment. We implemented several widely-used scheduling algorithms and identified the best candidate. The study further introduced a new measurement, based on our experiments, which can improve the schedule quality of some scheduling algorithms as much as 20-fold in a multi-cluster Grid environment. Second, we studied the scalability of the existing Grid scheduling algorithms. To deal with Grid systems consisting of hundreds of thousands of resources, we designed and implemented a novel approach that performs explicit resource selection decoupled from scheduling. Our experimental evaluation confirmed that our decoupled approach can be scalable in such an environment without sacrificing the quality of the schedule by more than 10%. Third, we proposed solutions to address the dynamic nature of Grid computing with a new cluster-based hybrid scheduling mechanism. Our experimental results collected from real executions on production clusters demonstrated that this approach produces programs running 30% to 100% faster than the other scheduling approaches we implemented on both reserved and shared resources. Fourth, we improved the reliability of Grid computing by incorporating fault-tolerance and recovery mechanisms into the workflow application execution. Our experiments on a simulated multi-cluster Grid environment demonstrated the effectiveness of our approach and also characterized the three-way trade-off between reliability, performance, and resource usage when executing a workflow application. Finally, we addressed the large batch-queue wait times often found in production Grid clusters. We developed a novel approach to partition the workflow application and submit the parts judiciously to reduce the total batch-queue wait time. The experimental results derived from production site batch queue logs show that our approach can reduce total wait time by as much as 70%. Our approaches combined can greatly improve the usability of Grid computing while increasing the performance of workflow applications on a multi-cluster Grid environment.
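A simple way to picture workflow scheduling across clusters, of the general kind studied above, is list scheduling: visit the tasks of a workflow DAG in topological order and map each to whichever cluster finishes it earliest. The sketch below is a generic illustration, not one of the thesis's algorithms; the task costs, cluster speeds, and example workflow are made up.

```python
from collections import defaultdict

def list_schedule(tasks, deps, cluster_speeds):
    """tasks: {name: work units}; deps: {name: [prerequisites]};
    cluster_speeds: {cluster: work units per second}.
    Returns task -> (cluster, start, finish)."""
    # Topological order via repeated selection of tasks whose prerequisites are done.
    order, done = [], set()
    while len(order) < len(tasks):
        ready = [t for t in tasks if t not in done and all(d in done for d in deps.get(t, []))]
        order.extend(sorted(ready))
        done.update(ready)
    cluster_free = defaultdict(float)          # time each cluster becomes free
    schedule = {}
    for t in order:
        ready_time = max([schedule[d][2] for d in deps.get(t, [])], default=0.0)
        # Pick the cluster that yields the earliest finish time for this task.
        best = min(cluster_speeds,
                   key=lambda c: max(cluster_free[c], ready_time) + tasks[t] / cluster_speeds[c])
        start = max(cluster_free[best], ready_time)
        finish = start + tasks[t] / cluster_speeds[best]
        cluster_free[best] = finish
        schedule[t] = (best, start, finish)
    return schedule

# Hypothetical three-task workflow on two clusters of different speeds.
print(list_schedule({"a": 4, "b": 2, "c": 6}, {"c": ["a", "b"]}, {"fast": 2.0, "slow": 1.0}))
```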
Item: Low viscosity channels and the stability of long wavelength convection (2010). Ahmed, Omar Khalil; Lenardic, Adrian
Mantle convection simulations with a low viscosity channel, akin to the Earth's asthenosphere, are characterized by long wavelength flow structure. Boundary layer theory predicts that as the viscosity of the channel decreases, the wavelength that maximizes heat transfer increases. As a pattern selection criterion, this analysis is not complete. It provides no mechanism to relate the optimal heat transfer wavelength to the wavelength that is realized or preferred in nature. We present numerical simulation suites, for bottom and internally heated end-members, to demonstrate that the cell wavelengths that maximize heat transfer are also the most stable. This does not rule out the possibility of multiple wavelengths being realizable, but it does imply that wavelengths near the stability peak will be preferred and, for the configurations we explore, the stability peak corresponds to the energetically most efficient flow configuration.

Item: Communication Optimizations for Distributed-Memory X10 Programs (2010-04-10). Barik, Rajkishore; Budimlić, Zoran; Grove, David; Peshansky, Igor; Sarkar, Vivek; Zhao, Jisheng
X10 is a new object-oriented PGAS (Partitioned Global Address Space) programming language with support for distributed asynchronous dynamic parallelism that goes beyond past SPMD message-passing models such as MPI and SPMD PGAS models such as UPC and Co-Array Fortran. The concurrency constructs in X10 make it possible to express complex computation and communication structures with higher productivity than other distributed-memory programming models. However, this productivity often comes at the cost of high performance overhead when the language is used in its full generality. This paper introduces high-level compiler optimizations and transformations to reduce communication and synchronization overheads in distributed-memory implementations of X10 programs. Specifically, we focus on locality optimizations such as scalar replacement and task localization, combined with supporting transformations such as loop distribution, scalar expansion, loop tiling, and loop splitting. We have completed a prototype implementation of these high-level optimizations, and performed a performance evaluation that shows significant improvements in performance, scalability, communication volume, and number of tasks. We evaluated the communication optimizations on three platforms: a 128-node BlueGene/P cluster, a 32-node Nehalem cluster, and a 16-node Power7 cluster. On the BlueGene/P cluster, we observed a maximum performance improvement of 31.46× relative to the unoptimized case (for the MolDyn benchmark). On the Nehalem cluster, we observed a maximum performance improvement of 3.01× (for the NQueens benchmark), and on the Power7 cluster, we observed a maximum performance improvement of 2.73× (for the MolDyn benchmark). In addition, there was no case in which the optimized code was slower than the unoptimized case. We also believe that the optimizations presented in this paper will be necessary for any high-productivity PGAS language based on modern object-oriented principles that is designed for execution on future Extreme Scale systems, which place a high premium on locality improvement for performance and energy efficiency.

Item: Efficient Selection of Vector Instructions using Dynamic Programming (2010-06-17). Barik, Rajkishore; Sarkar, Vivek; Zhao, Jisheng
Accelerating program performance via SIMD vector units is very common in modern processors, as evidenced by the use of SSE, MMX, VSE, and VSX SIMD instructions in multimedia, scientific, and embedded applications. To take full advantage of the vector capabilities, a compiler needs to generate efficient vector code automatically. However, most commercial and open-source compilers fall short of using the full potential of vector units, and only generate vector code for simple innermost loops. In this paper, we present the design and implementation of an auto-vectorization framework in the backend of a dynamic compiler that not only generates optimized vector code but is also well integrated with the instruction scheduler and register allocator. The framework includes a novel compile-time efficient dynamic programming-based vector instruction selection algorithm for straight-line code that expands opportunities for vectorization in the following ways: (1) scalar packing explores opportunities of packing multiple scalar variables into short vectors; (2) judicious use of shuffle and horizontal vector operations, when possible; and (3) algebraic reassociation expands opportunities for vectorization by algebraic simplification. We report performance results on the impact of auto-vectorization on a set of standard numerical benchmarks using the Jikes RVM dynamic compilation environment. Our results show performance improvement of up to 57.71% on an Intel Xeon processor, compared to non-vectorized execution, with a modest increase in compile time in the range from 0.87% to 9.992%. An investigation of the SIMD parallelization performed by v11.1 of the Intel Fortran Compiler (IFC) on three benchmarks shows that our system achieves speedup with vectorization in all three cases and IFC does not. Finally, a comparison of our approach with an implementation of the Superword Level Parallelization (SLP) algorithm from [21] shows that our approach yields a performance improvement of up to 13.78% relative to SLP.

Item: A Refined Parallel Simulation of Crossflow Membrane Filtration (2011). Boyle, Paul Martin; Houchens, Brent C.
This work builds upon the previous research carried out in the development of a simulation that incorporated a dynamically-updating velocity profile and electric interactions between particles with a Force Bias Monte Carlo method. Surface roughness of the membranes is added to this work by fixing particles to the membrane surface. Additionally, the previous electric interactions are verified through the addition of an all-range solution to the calculation of the electrostatic double layer potential between two particles. Numerous numerical refinements are made to the simulation in order to ensure accuracy and confirm that previous results using single-precision variables are accurate when compared to double-precision work. Finally, the method by which the particles move within a Monte Carlo step was altered in order to implement a different data handling structure for the parallel environment. This new data handling structure greatly reduces the runtime while providing a more realistic movement scheme for the particles. Additionally, this data handling scheme offers the possibility of using a variety of n-body algorithms that could, in the future, improve the speed of the simulation in cases with very high particle counts.
Item: Mie and Finite-Element Simulations of the Optical and Plasmonic Properties of Micro- and Nanostructures (2012). Hu, Ying Samuel; Drezek, Rebekah A.
A Mie-based code is developed for multilayer concentric spheres. The code is used in conjunction with a finite-element package to investigate the plasmonic and optical properties of micro- and nanostructures. For plasmonic nanostructures, gold-silica-gold multilayer nanoshells are computationally investigated. A plasmon hybridization theory is used to interpret the optical tunability. The interaction between the plasmon modes on the inner core and the outer shell results in dual resonances. The low-energy dipole mode is red-shifted by reducing the spacing (i.e., the intermediate silica layer) between the core and the shell. This extra tunability allows the plasmon resonance of a multilayer nanoshell to be tuned to the near-infrared region from a visible silica-gold nanoshell whose gold shell cannot be further reduced in thickness. For multilayer nanoshells with reduced geometrical symmetry (i.e., the inner core is offset from the center), modes of different orders interact. The mixed interaction introduces the dipolar (bright) characteristic into the higher-order (dark) modes and improves their coupling efficiency to the excitation light. The excitation of the dark modes attenuates and red-shifts the dipole mode and gives it higher-order characteristics. For non-plasmonic structures, simulations have demonstrated that multilayered structures can either reduce or enhance the scattering of light. By adding an anti-reflection layer to a microsphere made of a high-index material, the scattering force can be dramatically reduced. The reduced scattering allows optical trapping of high-index particles. Additionally, the improved trapping is largely insensitive to the refractive index or the thickness of the coating. The technique has the practical potential to lower the requirement on the numerical aperture of the microscope objectives, making possible the integration of the imaging and optical trapping systems. While the anti-reflection coating reduces scattering, the photothermal bubble (PTB) generated by gold nanoparticles by and large enhances the scattering of light. Transient PTBs are generated by super-heating gold nanoparticles with short laser pulses. Mie-based simulations predict that the scattering of PTBs strongly depends on the transient environment immediately surrounding the nanoparticles. A scattering enhancement of two to four orders of magnitude from PTBs is demonstrated from both calculations and experiments. Lastly, the near-field coupling between different plasmonic structures for surface-enhanced Raman scattering is investigated. A gold-coated silicon-germanium nanocone substrate has been fabricated and characterized. Finite-element simulations reveal that individual nanocones generate strong tip enhancement with axially polarized light (i.e., light polarized along the vertical axis of the nanocone), while the enhancement from transversely polarized light (i.e., light polarized in the plane of the substrate) is relatively weak. By simply filling the valleys between nanocones with plasmonic gold nanoparticles, the performance of the substrate is improved with in-plane excitation. Simulations reveal strong coupling between nanoparticles and adjacent nanocones with transverse excitations. An improvement of over one order of magnitude has been experimentally observed.

Item: Structure of androcam supports specialized interactions with myosin VI (National Academy of Sciences, 2012). Joshi, Mehul K.; Moran, Sean; Beckingham, Kathleen M.; MacKenzie, Kevin R.
Androcam replaces calmodulin as a tissue-specific myosin VI light chain on the actin cones that mediate D. melanogaster spermatid individualization. We show that the androcam structure and its binding to the myosin VI structural (Insert 2) and regulatory (IQ) light chain sites are distinct from those of calmodulin and provide a basis for specialized myosin VI function. The androcam N lobe noncanonically binds a single Ca2+ and is locked in a “closed” conformation, causing androcam to contact the Insert 2 site with its C lobe only. Androcam replacing calmodulin at Insert 2 will increase myosin VI lever arm flexibility, which may favor the compact monomeric form of myosin VI that functions on the actin cones by facilitating the collapse of the C-terminal region onto the motor domain. The tethered androcam N lobe could stabilize the monomer through contacts with C-terminal portions of the motor or recruit other components to the actin cones. Androcam binds the IQ site at all calcium levels, constitutively mimicking a conformation adopted by calmodulin only at intermediate calcium levels. Thus, androcam replacing calmodulin at IQ will abolish a Ca2+-regulated, calmodulin-mediated myosin VI structural change. We propose that the N lobe prevents androcam from interfering with other calmodulin-mediated Ca2+ signaling events. We discuss how gene duplication and mutations that selectively stabilize one of the many conformations available to calmodulin support the molecular evolution of structurally and functionally distinct calmodulin-like proteins.
Item: RCMP: A System Enabling Efficient Re-computation Based Failure Resilience for Big Data Analytics (2013-04-30). Dinu, Florin; Ng, T. S. Eugene
Multi-job I/O-intensive big-data computations can suffer a significant performance hit due to relying on data replication as the main failure resilience strategy. Data replication is inherently an expensive operation for I/O-intensive jobs because the datasets to be replicated are very large. Moreover, since the failure resilience guarantees provided by replication are fundamentally limited by the number of available replicas, jobs may fail when all replicas are lost. In this paper we argue that job re-computation should also be a first-order failure resilience strategy for big data analytics. Re-computation support is especially important for multi-job computations because they can require cascading re-computations to deal with the data loss caused by failures. We propose RCMP, a system that performs efficient job re-computation. RCMP improves on state-of-the-art big data processing systems, which rely on data replication and consequently lack any dedicated support for re-computation. RCMP can speed up a job's re-computation by leveraging outputs that it stored during that job's successful run. During re-computation, RCMP can efficiently utilize the available compute node parallelism by switching to a finer-grained task scheduling granularity. Furthermore, RCMP can mitigate hot-spots specific to re-computation runs. Our experiments on a moderate-sized cluster show that compared to using replication, RCMP can provide significant benefits during failure-free periods while still finishing multi-job computations in comparable or better time when impacted by single and double data loss events.
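The cascading re-computation mentioned above can be pictured as a walk over job lineage: given which outputs survived and which are needed, determine the set of jobs that must be re-run. The sketch below is a generic illustration of that idea, not RCMP's implementation; the lineage map, dataset names, and example pipeline are hypothetical.

```python
def jobs_to_recompute(produced_by, available_outputs, needed_outputs):
    """Given a lineage map (output dataset -> (producing job, input datasets)),
    the set of datasets still available, and the datasets needed downstream,
    return the jobs that must be re-run, including cascading re-computations."""
    to_rerun, missing = set(), list(needed_outputs)
    while missing:
        ds = missing.pop()
        if ds in available_outputs:
            continue                       # output survived; no work needed
        job, inputs = produced_by[ds]
        if job not in to_rerun:
            to_rerun.add(job)
            missing.extend(inputs)         # the job's inputs may themselves be lost
    return to_rerun

# Hypothetical three-job pipeline: job1 -> d1, job2 -> d2 (needs d1), job3 -> d3 (needs d2).
lineage = {"d1": ("job1", ["raw"]), "d2": ("job2", ["d1"]), "d3": ("job3", ["d2"])}
# d2 and d3 were lost but d1 survived: job3 must re-run, which cascades to job2; job1 does not.
print(jobs_to_recompute(lineage, available_outputs={"raw", "d1"}, needed_outputs=["d3"]))
```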
Item: Parallel Sparse Optimization (2013-08-27). Peng, Zhimin; Yin, Wotao; Zhang, Yin; Baraniuk, Richard G.
This thesis proposes parallel and distributed algorithms for solving very large-scale sparse optimization problems on computer clusters and clouds. Many modern application problems from compressive sensing, machine learning, and signal and image processing involve large-scale data and can be modeled as sparse optimization problems. Those problems are at such a large scale that they can no longer be processed on single workstations running single-threaded computing approaches. Moving to parallel/distributed/cloud computing becomes a viable option. I propose two approaches for solving these problems. The first approach comprises distributed implementations of a class of efficient proximal linear methods for solving convex optimization problems, which take advantage of the separability of the terms in the objective. The second approach is a parallel greedy coordinate descent method (GRock), which greedily chooses several entries to update in parallel in each iteration. I establish the convergence of GRock and explain why it often performs exceptionally well for sparse optimization. Extensive numerical results on a computer cluster and Amazon EC2 demonstrate the efficiency and elasticity of my algorithms.
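The GRock idea summarized above, greedily picking a few coordinates and updating them in parallel each iteration, can be sketched for a LASSO-type objective as follows. The objective, the soft-threshold update, the number of coordinates updated per iteration, and the random test problem are illustrative assumptions; the thesis's exact algorithm and its convergence conditions are not reproduced here.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def greedy_parallel_cd(A, b, lam=0.1, n_parallel=4, iters=200):
    """Greedy parallel coordinate descent for 0.5*||Ax - b||^2 + lam*||x||_1."""
    m, n = A.shape
    x = np.zeros(n)
    col_norm2 = (A ** 2).sum(axis=0)                 # per-coordinate curvature constants
    residual = A @ x - b
    for _ in range(iters):
        grad = A.T @ residual
        # Coordinate-wise minimizers and the progress each would make alone.
        candidates = soft_threshold(x - grad / col_norm2, lam / col_norm2)
        progress = np.abs(candidates - x)
        chosen = np.argsort(progress)[-n_parallel:]  # greedily pick the best few
        delta = np.zeros(n)
        delta[chosen] = candidates[chosen] - x[chosen]
        x += delta
        residual += A @ delta                        # incremental residual update
    return x

# Small synthetic test: recover a 3-sparse vector from random measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))
x_true = np.zeros(20); x_true[[2, 7, 15]] = [1.5, -2.0, 0.8]
b = A @ x_true
print(np.round(greedy_parallel_cd(A, b, lam=0.05), 2))
```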
Item: Programming Models and Runtimes for Heterogeneous Systems (2013-09-16). Grossman, Max; Sarkar, Vivek; Mellor-Crummey, John; Cox, Alan L.
With the plateauing of processor frequencies and the increase in energy consumption in computing, application developers are seeking new sources of performance acceleration. Heterogeneous platforms with multiple processor architectures offer one possible avenue to address these challenges. However, modern heterogeneous programming models tend to be either so low-level as to severely hinder programmer productivity, or so high-level as to limit optimization opportunities. The novel systems presented in this thesis strike a better balance between abstraction and transparency, enabling programmers to be productive and produce high-performance applications on heterogeneous platforms. This thesis starts by summarizing the strengths, weaknesses, and features of existing heterogeneous programming models. It then introduces and evaluates four novel heterogeneous programming models and runtime systems: JCUDA, CnC-CUDA, DyGR, and HadoopCL. It concludes by positioning the key contributions of each piece relative to the state-of-the-art and by outlining possible directions for future work.

Item: Discontinuous Galerkin Methods for Parabolic Partial Differential Equations with Random Input Data (2013-09-16). Liu, Kun; Riviere, Beatrice M.; Heinkenschloss, Matthias; Symes, William W.; Vannucci, Marina
This thesis discusses and develops one approach to solving parabolic partial differential equations with random input data. The stochastic problem is first transformed into a parametrized one by using the finite-dimensional noise assumption and the truncated Karhunen-Loeve expansion. The approach, the Monte Carlo discontinuous Galerkin (MCDG) method, randomly generates M realizations of uncertain coefficients and approximates the expected value of the solution by averaging M numerical solutions. This approach is applied to two numerical examples. The first example is a two-dimensional parabolic partial differential equation with a random convection term, and the second example is a benchmark problem coupling flow and transport equations. I first apply polynomial kernel principal component analysis of second order to generate M realizations of random permeability fields. They are used to obtain M realizations of the random convection term, computed by solving the flow equation. Using this approach, I solve the transport equation M times, corresponding to the M velocity realizations. The MCDG solution spreads toward the whole domain from the initial location, and the contaminant does not leave the initial location completely as time elapses. The results show that the MCDG solution is realistic, because it takes the uncertainty in velocity fields into consideration. In addition, in order to correct overshoot and undershoot solutions caused by the high level of oscillation in random velocity realizations, I solve the transport equation on meshes of finer resolution than that of the permeability field, and use a slope limiter as well as lower and upper bound constraints to address this difficulty. Finally, future work is proposed.
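The Monte Carlo layer of MCDG described above, drawing M coefficient realizations from a truncated Karhunen-Loeve expansion and averaging the corresponding solutions, is sketched below. The eigenpairs, the log-transform used to keep the coefficient positive, and the stand-in solve_pde function (which replaces the discontinuous Galerkin discretization) are placeholder assumptions so the loop is runnable.

```python
import numpy as np

def kl_realization(x, mean, eigvals, eigfuncs, xi):
    """Truncated Karhunen-Loeve expansion: field(x) = mean + sum_k sqrt(lambda_k)*xi_k*phi_k(x)."""
    field = np.full_like(x, mean)
    for lam_k, phi_k, xi_k in zip(eigvals, eigfuncs, xi):
        field += np.sqrt(lam_k) * xi_k * phi_k(x)
    return field

def solve_pde(coefficient):
    """Placeholder for the deterministic solver (the DG discretization in the thesis);
    here it just returns a made-up 'solution' so the Monte Carlo loop is runnable."""
    return 1.0 / coefficient

def mcdg_expected_solution(M, x, rng):
    eigvals = [0.5, 0.1]
    eigfuncs = [lambda s: np.sin(np.pi * s), lambda s: np.sin(2 * np.pi * s)]
    solutions = []
    for _ in range(M):
        xi = rng.standard_normal(len(eigvals))                          # random KL coordinates
        kappa = np.exp(kl_realization(x, 0.0, eigvals, eigfuncs, xi))   # keep coefficient positive
        solutions.append(solve_pde(kappa))
    return np.mean(solutions, axis=0)                                   # Monte Carlo estimate of E[u]

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 5)
print(np.round(mcdg_expected_solution(M=100, x=x, rng=rng), 3))
```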
Item: Screw and Edge Dislocations in Cement Phases: Atomic Modeling (2013-10-09). Chen, Lu; Shahsavari, Rouzbeh; Nagarajaiah, Satish; Duenas-Osorio, Leonardo
Cement is the key strengthening and the most energy-intensive ingredient in concrete. With increasing pressure for reducing energy consumption in cement manufacturing, there is an urgent need to understand the basic deformation mechanisms of cement. In this thesis, we develop a computational framework based on molecular dynamics to study two common types of defects, namely screw and edge dislocations, in complex, anisotropic crystalline polymorphs of cement clinkers and cement hydration products. We found that these defects are more likely to occur in regions with higher Young's moduli. Via analysis of Peierls stresses, we also identified the preferred cement polymorphs that require less energy for grinding. Together, the results provide a detailed understanding of the role and type of defects in cement phases, which impact the rate of hydration, crystal growth, and grinding energy. To our knowledge, this is the first atomic-resolution study of deformation-based mechanisms in crystalline cement phases.

Item: Understanding and Improving the Efficiency of Failure Resilience for Big Data Frameworks (2013-10-30). Dinu, Florin; Ng, T. S. Eugene; Cox, Alan L.; Knightly, Edward W.; Gkantsidis, Christos
Big data processing frameworks (MapReduce, Hadoop, Dryad) are hugely popular today because they greatly simplify the management and deployment of big data analysis jobs requiring the use of many machines in parallel. A strong selling point is their built-in failure resilience support. Big data frameworks can run computations to completion despite occasional failures in the system. However, an important but overlooked point has been the efficiency of their failure resilience. The vision of this thesis is that big data frameworks should not only be failure resilient but should provide that resilience in an efficient manner, with minimum impact on computations both under failures and during failure-free periods. To this end, the first part of the thesis presents the first in-depth analysis of the efficiency of the failure resilience provided by the popular Hadoop framework under failures. The results show that even single machine failures can lead to large, variable, and unpredictable job running times. This thesis determines the causes behind this inefficient behavior and points out the responsible Hadoop mechanisms and their limitations. The second part of the thesis focuses on providing efficient failure resilience for computations comprising multiple jobs. We present the design, implementation, and evaluation of RCMP, a MapReduce system based on the fundamental insight that using data replication to enable failure resilience oftentimes leads to significant and unnecessary increases in computation running time. In contrast, RCMP is designed to use job re-computation as a first-order failure resilience strategy. Re-computations under RCMP are efficient. Specifically, RCMP re-computes the minimum amount of work and, uniquely, it ensures that this minimum re-computation work is performed efficiently. In particular, RCMP mitigates hot-spots that affect data transfers during re-computations and ensures that the available compute node parallelism is well leveraged.
Item: Microbial processes influencing the attenuation and impacts of ethanol-blended fuel releases (2013-12-05). Ma, Jie; Alvarez, Pedro J.; Li, Qilin; Bennett, George N.; Rixey, Bill
Fuel releases that impact groundwater are a common occurrence, and the growing use of ethanol as a transportation biofuel is increasing the likelihood of encountering ethanol in such releases. Therefore, it is important to understand how such releases behave and affect public safety and environmental health, and how indigenous microorganisms respond and affect their migration, fate, and overall impacts. Vapor intrusion risk (i.e., methane explosion and enhanced fuel hydrocarbon vapor intrusion) associated with ethanol blend releases is a potential concern. Using both experimental measurements and mathematical model simulations, this thesis shows that methane is unlikely to build up to pose an explosion hazard (5% v:v) if diffusion is the only mass transport pathway through the unsaturated zone. However, if methanogenic activity near the source zone is sufficiently high to cause advective gas transport, the methane indoor concentration may exceed the flammable threshold. As a group of widely distributed microorganisms, methanotrophs can significantly attenuate methane migration through the vadose zone, and thus alleviate the associated explosion risk. However, methane biodegradation could consume soil oxygen that would otherwise be available to support biodegradation of volatile hydrocarbons, and increase their vapor intrusion potential. A pilot-scale release experiment, in which an ethanol blend solution (10% v:v ethanol mixed with 50 mg/L benzene and 50 mg/L toluene) was released into an 8 m³ aquifer tank, produced a large amount of volatile fatty acids (VFAs). The accumulation of VFAs (particularly butyric acid) exceeded the secondary maximum contaminant level value for odor, which represents a previously unreported aesthetic impact. After the release was shut off, ethanol anaerobic degradation was temporarily stimulated when the dissolved ethanol concentration decreased below its toxicity threshold (~2,000 mg/L for this system). Methane generation persisted for more than 100 days after the disappearance of dissolved ethanol. The persistent methane was likely generated from ethanol degradation byproducts (e.g., acetate) and solid organic carbon in aquifer materials. Ethanol blend releases stimulate microbial growth and increase the organic carbon content in the aquifer. Microorganisms play a critical role in the fate of ethanol-blended fuel releases, often determining their region of influence and potential impacts. This thesis used advanced molecular tools, including 454 pyrosequencing and real-time PCR (qPCR), to characterize changes in the structure of indigenous microbial communities in response to (1) a pilot-scale ethanol blend release and (2) the shut-off of that release. This thesis shows that the ethanol blend release stimulated microbial growth and significantly changed the microbial community structure by enriching microbial groups involved in the fermentative degradation process. The growth of putative hydrocarbon degraders and commensal anaerobes, and increases in degradation rates, suggest an adaptive response that increases the potential for natural attenuation of ethanol blend releases. After the release was shut off, the microbial community returned toward its pre-contamination state; however, restoration was relatively slow and far from complete even one year later. Overall, this thesis advanced current understanding of vapor intrusion risks and groundwater quality impacts associated with ethanol blend releases, and of microbial ecology in the impacted aquifer. The integration of this knowledge with site-specific information on pertinent hydrogeological processes will undoubtedly enhance engineering practices such as site investigation, risk assessment, and bioremediation implementation and maintenance to deal with releases of current and future biofuel blends.