A comparative analysis of clustering algorithms: O2 migration in truncated hemoglobin I from transition networks

Cazade, Pierre-André; Zheng, Wenwei; Prada-Gracia, Diego; Berezovska, Ganna; Rao, Francesco; Clementi, Cecilia; Meuwly, Markus

dc.citation.articleNumber	25103	en_US
dc.citation.issueNumber	2	en_US
dc.citation.journalTitle	The Journal of Chemical Physics	en_US
dc.citation.volumeNumber	142	en_US
dc.contributor.author	Cazade, Pierre-André	en_US
dc.contributor.author	Zheng, Wenwei	en_US
dc.contributor.author	Prada-Gracia, Diego	en_US
dc.contributor.author	Berezovska, Ganna	en_US
dc.contributor.author	Rao, Francesco	en_US
dc.contributor.author	Clementi, Cecilia	en_US
dc.contributor.author	Meuwly, Markus	en_US
dc.date.accessioned	2017-06-05T17:33:45Z	en_US
dc.date.available	2017-06-05T17:33:45Z	en_US
dc.date.issued	2015	en_US
dc.description.abstract	The ligand migration network for O2 diffusion in truncated Hemoglobin N is analyzed based on three different clustering schemes. For coordinate-based clustering, the conventional k-means and the kinetics-based Markov Clustering (MCL) methods are employed, whereas the locally scaled diffusion map (LSDMap) method is a collective-variable-based approach. It is found that all three methods agree well in their geometrical definition of the most important docking site, and all experimentally known docking sites are recovered by all three methods. For most of the states, the populations also agree quite favourably, whereas the kinetics of, and between, the states differ. One of the major differences between k-means and MCL clustering on the one hand and LSDMap on the other is that the latter finds one large primary cluster containing the Xe1a, IS1, and ENT states. This is related to the fact that the motion within the state occurs on similar time scales, whereas structurally the state is found to be quite diverse. In agreement with previous explicit atomistic simulations, the Xe3 pocket is found to be a highly dynamical site, which points to its potential role as a hub in the network. This is also highlighted by the fact that LSDMap cannot identify this state. First passage time distributions from MCL clusterings using a one-dimensional (ligand-position) and a two-dimensional (ligand-position and protein-structure) descriptor suggest that ligand and protein motions are coupled. The benefits and drawbacks of the three methods are discussed in a comparative fashion and highlight that, depending on the questions at hand, the best-performing method for a particular data set may differ.	en_US
dc.identifier.citation	Cazade, Pierre-André, Zheng, Wenwei, Prada-Gracia, Diego, et al. "A comparative analysis of clustering algorithms: O2 migration in truncated hemoglobin I from transition networks." The Journal of Chemical Physics 142, no. 2 (2015). AIP Publishing LLC. http://dx.doi.org/10.1063/1.4904431.	en_US
dc.identifier.doi	http://dx.doi.org/10.1063/1.4904431	en_US
dc.identifier.uri	https://hdl.handle.net/1911/94767	en_US
dc.language.iso	eng	en_US
dc.publisher	AIP Publishing LLC.	en_US
dc.rights	Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.	en_US
dc.title	A comparative analysis of clustering algorithms: O2 migration in truncated hemoglobin I from transition networks	en_US
dc.type	Journal article	en_US
dc.type.dcmi	Text	en_US
dc.type.publication	publisher version	en_US

Files

Original bundle

Name: clustering-algorithms.pdf
Size: 5.93 MB
Format: Adobe Portable Document Format

Collections

Faculty Publications
Chemistry Publications