dc.contributor.author: Eisner, Michelle
dc.date.accessioned: 2015-11-11T20:36:46Z
dc.date.available: 2015-11-11T20:36:46Z
dc.identifier.uri: http://hdl.handle.net/10464/7471
dc.description.abstract: The goal of most clustering algorithms is to find the optimal number of clusters (i.e., the fewest clusters). However, analysis of molecular conformations of biological macromolecules obtained from computer simulations may benefit from a larger array of clusters. The Self-Organizing Map (SOM) clustering method has the advantage of generating large numbers of clusters, but often gives ambiguous results. In this work, SOMs have been shown to be reproducible when the same conformational dataset is independently clustered multiple times (~100), with the help of the Cramér's V index (C_v). The ability of C_v to determine which SOMs are reproduced is generalizable across different SOM source codes. The conformational ensembles produced from MD (molecular dynamics) and REMD (replica exchange molecular dynamics) simulations of the pentapeptide Met-enkephalin (MET) and the 34-amino-acid protein human Parathyroid Hormone (hPTH) were used to evaluate SOM reproducibility. The training length for the SOM has a strong effect on the reproducibility. Analysis of MET conformational data definitively determined that toroidal SOMs cluster data better than bordered maps because toroidal maps do not have an edge effect. For the source code from MATLAB, it was determined that the learning rate function should be LINEAR with an initial learning rate factor of 0.05 and that the SOM should be trained by a sequential algorithm. The trained SOMs can be used as supervised classifiers for another dataset. The toroidal 10×10 hexagonal SOMs generated by the MATLAB program for hPTH conformational data yielded three sets of reproducible clusters (27%, 15%, and 13% of 100 independent runs), which find partitionings similar to those of smaller 6×6 SOMs. The χ^2 values produced as part of the C_v calculation were used to locate clusters with identical conformational memberships on independently trained SOMs, even those with different dimensions. The χ^2 values could relate the different SOM partitionings to each other. (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Brock University (en_US)
dc.subject: Computational Chemistry, self-organizing maps, molecular dynamics (en_US)
dc.title: Assessing the Reproducibility of Clustering of Molecular Dynamics Conformations on Self-Organizing Maps (en_US)
dc.type: Electronic Thesis or Dissertation (en_US)
dc.degree.name: M.Sc. Chemistry (en_US)
dc.degree.level: Masters (en_US)
dc.contributor.department: Department of Chemistry (en_US)
dc.degree.discipline: Faculty of Mathematics and Science (en_US)
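
Illustrative note on the abstract: the record does not give the formula used for C_v, so the following is only a minimal sketch of the textbook Cramér's V statistic applied to two cluster assignments of the same conformations. The function name `cramers_v` and the toy labels are hypothetical and are not taken from the thesis; the thesis may compute or normalize C_v differently.

```python
import math
import numpy as np

def cramers_v(labels_a, labels_b):
    """Return (chi2, V) for two cluster labelings of the same items.

    Textbook definition: V = sqrt(chi2 / (n * min(r - 1, c - 1))),
    where the r x c contingency table counts how many items fall in
    cluster i of the first labeling and cluster j of the second.
    This is an assumption about what C_v means, not the thesis code.
    """
    a = np.asarray(labels_a).ravel()
    b = np.asarray(labels_b).ravel()
    if a.shape != b.shape:
        raise ValueError("both labelings must cover the same items")

    # Map arbitrary cluster labels to 0..r-1 and 0..c-1.
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    r, c = a_idx.max() + 1, b_idx.max() + 1

    # Build the contingency table of co-memberships.
    table = np.zeros((r, c))
    np.add.at(table, (a_idx, b_idx), 1)

    # Pearson chi-squared against the independence expectation.
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = float(((table - expected) ** 2 / expected).sum())

    k = min(r, c) - 1
    if k == 0:
        # Only one cluster in a labeling: V is undefined, report 0.
        return chi2, 0.0
    return chi2, math.sqrt(chi2 / (n * k))

# Two runs that find the same partition under different cluster names.
run_1 = [0, 0, 1, 1, 2, 2]
run_2 = [5, 5, 3, 3, 9, 9]
chi2, v = cramers_v(run_1, run_2)
```

Under this definition, V is 1 when two runs produce the same partition up to relabeling and 0 when the two labelings are statistically independent. The per-cell terms of the χ^2 sum indicate which pairs of clusters share members, which is the kind of cluster matching the abstract describes.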

