Determination of the optimal number of PLS components based on the combination of cross-validation and RMD-MRCD-PCA weighting function
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | en |
| Published: | Penerbit Universiti Kebangsaan Malaysia, 2025 |
| Online Access: | http://journalarticle.ukm.my/26523/1/SSS%2016.pdf http://journalarticle.ukm.my/26523/ https://www.ukm.my/jsm/english_journals/vol54num11_2025/contentsVol54num11_2025.html |
| Summary: | Partial least squares (PLS) regression is a widely used tool for the analysis of high-dimensional data (HDD). Choosing the number of PLS components is a vital step in model building, because selecting too many or too few components reduces the accuracy of the model. Numerous classical methods, such as leave-one-out cross-validation (LOOCV) and K-fold cross-validation (K-FoldCV), have been developed to determine the optimal number of PLS components. However, they are easily affected by high leverage points (HLPs). Robust cross-validation techniques, denoted RMD-MRCD-PCA-LOOCV and RMD-MRCD-PCA-K-FoldCV, are therefore proposed to remedy this problem. The results of a simulation study and a real data set indicate that the proposed methods successfully select the appropriate number of PLS components. Keywords: High leverage points; leave-one-out cross-validation; minimum regularized covariance determinant; partial least squares; principal component analysis. |
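
For orientation, the classical baseline named in the summary (choosing the number of PLS components by K-fold cross-validation) can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' RMD-MRCD-PCA weighted method: the function names `pls1_fit`, `kfold_rmse` and `select_components`, the single-response NIPALS fit, and the synthetic data are all illustrative choices that do not come from the record.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS (PLS1) with NIPALS-style deflation.

    Assumes n_components does not exceed the rank of the centered X.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))  # weight vectors
    P = np.zeros((n_features, n_components))  # X loadings
    q = np.zeros(n_components)                # y loadings
    for a in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w                            # score vector
        tt = t @ t
        p_a = Xk.T @ t / tt
        q_a = yk @ t / tt
        Xk = Xk - np.outer(t, p_a)            # deflate X
        yk = yk - q_a * t                     # deflate y
        W[:, a], P[:, a], q[a] = w, p_a, q_a
    beta = W @ np.linalg.solve(P.T @ W, q)    # regression coefficients
    return beta, x_mean, y_mean


def kfold_rmse(X, y, max_components, k=5, seed=0):
    """Cross-validated RMSE for 1..max_components; k = len(y) gives LOOCV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    rmse = np.zeros(max_components)
    for a in range(1, max_components + 1):
        sq_err = 0.0
        for test_idx in folds:
            train = np.ones(len(y), dtype=bool)
            train[test_idx] = False
            beta, x_mean, y_mean = pls1_fit(X[train], y[train], a)
            pred = (X[test_idx] - x_mean) @ beta + y_mean
            sq_err += np.sum((y[test_idx] - pred) ** 2)
        rmse[a - 1] = np.sqrt(sq_err / len(y))
    return rmse


def select_components(X, y, max_components, k=5, seed=0):
    """Return the component count with the smallest cross-validated RMSE."""
    rmse = kfold_rmse(X, y, max_components, k, seed)
    return int(np.argmin(rmse)) + 1, rmse


# Synthetic high-dimensional example (more predictors than observations).
rng = np.random.default_rng(1)
n, p, r = 60, 100, 3
T = rng.normal(size=(n, r))
X = T @ rng.normal(size=(r, p)) + 0.1 * rng.normal(size=(n, p))
y = T @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)

best, rmse = select_components(X, y, max_components=8)
print("selected components:", best)
print("CV RMSE per component count:", rmse.round(3))
```

Every observation enters the squared-error sum with equal weight here, which is why a few high leverage points can dominate the criterion; a robust variant would, as the summary describes, weight observations before this step.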
