Staff view: Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers

Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers

Three major aspects of chemometrics have been investigated in this study namely Quantitative Structure-Activity Relationship (QSAR) and database mining, classification and multiblock methods. In the first analysis, 197 artemisinin compounds were divided into training set and test set together with s...

Bibliographic details
Main author:	Jamaludin, Rosmahaida
Format:	Thesis
Language:	English
Published:	2015
Subjects:	QD Chemistry
Online access:	http://eprints.utm.my/id/eprint/61066/1/RosmahaidaJamaludinPFS2015.pdf http://eprints.utm.my/id/eprint/61066/ http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:96405

id	my.utm.61066
record_format	eprints
spelling	my.utm.61066 2017-10-08T08:57:57Z http://eprints.utm.my/id/eprint/61066/ Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers Jamaludin, Rosmahaida QD Chemistry 2015-09 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/61066/1/RosmahaidaJamaludinPFS2015.pdf Jamaludin, Rosmahaida (2015) Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers. PhD thesis, Universiti Teknologi Malaysia, Faculty of Science. http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:96405
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
language	English
topic	QD Chemistry
spellingShingle	QD Chemistry Jamaludin, Rosmahaida Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers
description	Three major aspects of chemometrics were investigated in this study: Quantitative Structure-Activity Relationship (QSAR) modelling with database mining, classification, and multiblock methods. In the first analysis, 197 artemisinin compounds were divided into a training set and a test set, and structural descriptors generated by DRAGON 6.0 software were used to develop three QSAR models. The model statistics (r²/r²_test) were 0.790/0.853 for Forward Stepwise-Multiple Linear Regression (MLR), 0.807/0.789 for Genetic Algorithm (GA)-MLR and 0.795/0.811 for GA-Partial Least Squares (PLS). The rigorously validated QSAR models were then applied to mine a chemical database, which yielded four potential new anti-malarial agents. The same artemisinin data set was then classified into active and less active compounds to develop reliable predictive classification models and to investigate how different data-splitting and data pre-processing methods affect classification. Principal Component Analysis (PCA) and boundary plots were used to visualize four classifiers: Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Learning Vector Quantization (LVQ) and Quadratic Discriminant Analysis (QDA). Kennard-Stone data splitting and standardization produced better results in terms of percent correctly classified (% CC) than Duplex data splitting and mean-centering. Moreover, LDA was found to be superior to the other three classifiers, with a lower risk of over-fitting. Lastly, multiblock analysis methods such as Multiblock PLS and Consensus PCA were applied to a polychlorinated diphenyl ethers (PCDEs) data set whose descriptors were grouped into three blocks labelled X_1D, X_2D and X_3D, together with a property block, Y, consisting of log P_L (Pa, 25°C), log K_OW (25°C) and log S_WL (mol/L, 25°C). Their performance was then compared with that of the single-block methods PLS and PCA. The PLS models of each descriptor block with respect to each property were well fitted and predicted well, with r²_train values greater than 0.96 and r_test values ranging from 0.86 to 0.98. Notably, combining the three descriptor blocks to produce the Multiblock PLS superscores (MBSS) model, which was superior to the Multiblock PLS block-scores (MBBS) model, yielded a slightly better r²_train value and significantly better prediction, with higher r_test, than the PLS models of the individual descriptor blocks. In addition, three measures of block similarity (the Mantel test, the Rv coefficient and Procrustes analysis) were used to investigate similarity and correlation between the blocks, along with Monte Carlo simulations to determine their significance. Based on the similarity index between two blocks, X jD descriptors resembled the Y block better, while X_2D was more correlated with the X_1D block. In short, the chemometric methods were applied successfully to both data sets using various descriptors generated by DRAGON software and yielded promising results of benefit not only to chemometrics but also to drug design.
format	Thesis
author	Jamaludin, Rosmahaida
author_facet	Jamaludin, Rosmahaida
author_sort	Jamaludin, Rosmahaida
title	Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers
title_short	Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers
title_full	Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers
title_fullStr	Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers
title_full_unstemmed	Chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers
title_sort	chemometrics and multiblock methods for quantitative structure-activity studies of artemisinin analogues and polychlorinated diphenylethers
publishDate	2015
url	http://eprints.utm.my/id/eprint/61066/1/RosmahaidaJamaludinPFS2015.pdf http://eprints.utm.my/id/eprint/61066/ http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:96405
_version_	1643655060097335296
score	13.251813
