A Data-Driven Perspective on Molecular Coarse-Graining

dc.contributor.advisor: Clementi, Cecilia [en_US]
dc.creator: Boninsegna, Lorenzo [en_US]
dc.date.accessioned: 2019-05-16T19:00:17Z [en_US]
dc.date.available: 2019-05-16T19:00:17Z [en_US]
dc.date.created: 2019-05 [en_US]
dc.date.issued: 2019-04-09 [en_US]
dc.date.submitted: May 2019 [en_US]
dc.date.updated: 2019-05-16T19:00:17Z [en_US]
dc.description.abstract: Coarse-graining is a ubiquitous concept in the sciences, and denotes a variety of diverse methods to consistently formulate a low-resolution model of a physical system. If detailed data from a higher-resolution model is available, a popular bottom-up approach consists of renormalizing that information into a surrogate model by filtering out non-essential details while preserving what is considered essential. For biological molecules, a coarse-grained model requires groups of atoms to be replaced by effective degrees of freedom and their new interactions to be specified. In addition, the long-timescale features of the original dynamics should be preserved, since they correlate with physico-chemically relevant conformational rearrangements, such as (mis)folding. It can be shown that such features are completely encoded in the first few eigenvalues and eigenvectors of the operator implementing the dynamics. The task thus amounts to approximating those quantities from the high-resolution data and then ensuring that the coarse-graining procedure does not perturb them. In this dissertation, different data-driven techniques addressing various aspects of molecular coarse-graining will be presented. First, the problem of distilling a set of physically meaningful collective descriptors from high-resolution data is discussed. In particular, a novel strategy that combines existing algorithms (Variationally optimized Diffusion Maps) is presented, both as a validation strategy against different choices of the model parameters and as an optimized algorithm. Such an approach often requires the computation and storage of large correlation matrices, so a compressed sensing procedure (oASIS) is discussed, which allows sparse matrices to be fully reconstructed using only a subset of their entries.
Second, the Structure and State Space Decomposition (S3D) protocol will be discussed, which maps a molecular primary sequence onto a set of disjoint, dynamically coherent domains. Such units are compelling candidates for effective coarse-grained degrees of freedom and provide a novel interpretation of the conformational rearrangements the molecule undergoes in terms of the splitting and merging of those units. In particular, results suggest that different model resolutions may be appropriate for different regions of the conformational space. Next, the Stepwise Sparse Regressor and Spectral Coarse-Graining will be introduced; both allow one to infer the constitutive renormalized interactions that regulate the effective diffusive dynamics of the coarser variables. Both approaches rely on constructing a data-based loss function and optimizing its parameters. Preliminary results on toy models indicate that both methods consistently capture the long-timescale features expressed by the input data. Finally, future developments and ideas on how to extend the approaches to real molecular systems will also be addressed. [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Boninsegna, Lorenzo. "A Data-Driven Perspective on Molecular Coarse-Graining." (2019) Diss., Rice University. https://hdl.handle.net/1911/105392. [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/105392 [en_US]
dc.language.iso: eng [en_US]
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. [en_US]
dc.subject: Data-driven [en_US]
dc.subject: coarse-graining [en_US]
dc.subject: molecules [en_US]
dc.title: A Data-Driven Perspective on Molecular Coarse-Graining [en_US]
dc.type: Thesis [en_US]
dc.type.material: Text [en_US]
thesis.degree.department: Chemistry [en_US]
thesis.degree.discipline: Natural Sciences [en_US]
thesis.degree.grantor: Rice University [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.major: Physical Chemistry [en_US]
thesis.degree.name: Doctor of Philosophy [en_US]
Files
Original bundle
Name: BONINSEGNA-DOCUMENT-2019.pdf
Size: 36.33 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.85 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text