metbit.cross_validation
Perform cross validation.
import metbit.cross_validation
Classes
CrossValidation
Stratified cross validation
Parameters:
estimator (str): Algorithm used for model construction. Valid values are "pls" for PLS and "opls" for OPLS. Default is "opls".
kfold (int): Number of folds k for cross validation. If k equals len(X), leave-one-out cross validation is performed. Default is 10.
scaler (str): Scaler applied to the data matrix. Valid values are "uv" for zero-mean-unit-variance scaling, "pareto" for Pareto scaling, "minmax" for Min-Max scaling and "mean" for mean centering. Default is "pareto".
Returns
CrossValidation object
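The four scaler options can be illustrated with a minimal NumPy sketch. This is not metbit's own code: the helper name `scale` is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

def scale(x, method="pareto"):
    """Column-wise scaling (hypothetical helper, not part of metbit)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    std = x.std(axis=0, ddof=1)  # assumption: sample standard deviation
    if method == "uv":
        # Zero mean, unit variance.
        return (x - mean) / std
    if method == "pareto":
        # Mean-center, then divide by the square root of the standard deviation.
        return (x - mean) / np.sqrt(std)
    if method == "minmax":
        lo, hi = x.min(axis=0), x.max(axis=0)
        return (x - lo) / (hi - lo)
    if method == "mean":
        # Mean centering only.
        return x - mean
    raise ValueError(f"unknown scaler: {method!r}")

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
scaled = {m: scale(X, m) for m in ("uv", "pareto", "minmax", "mean")}
```

Pareto scaling sits between mean centering and unit-variance scaling: it reduces the dominance of high-variance variables without fully equalizing them.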
Methods
__init__(self, estimator='opls', kfold=10, scaler='pareto')
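As a rough illustration of what kfold controls, the sketch below assigns samples to stratified folds with plain NumPy. The function `stratified_folds` and its `seed` argument are hypothetical, and metbit's actual splitting procedure may differ.

```python
import numpy as np

def stratified_folds(y, kfold=10, seed=0):
    """Return a fold index per sample, keeping class proportions similar.

    Hypothetical helper for illustration; not part of metbit.
    """
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    # Shuffle within each class, then concatenate class by class.
    order = np.concatenate(
        [rng.permutation(np.flatnonzero(y == label)) for label in np.unique(y)]
    )
    fold_of = np.empty(len(y), dtype=int)
    # Deal samples round-robin so each fold receives a similar class mix.
    fold_of[order] = np.arange(len(y)) % kfold
    return fold_of

y = np.array([0] * 6 + [1] * 6)
folds = stratified_folds(y, kfold=3)
for k in range(3):
    test_idx = np.flatnonzero(folds == k)   # held-out samples for fold k
    train_idx = np.flatnonzero(folds != k)  # samples used to fit the model

# With kfold equal to the number of samples, every fold holds one sample,
# which is leave-one-out cross validation.
folds_loo = stratified_folds(y, kfold=len(y))
```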
fit(self, x, y)
Fit the model to the variable matrix X.
Parameters
x (np.ndarray): Variable matrix with size n samples by p variables.
y (np.ndarray | list): Dependent variable with size n samples by 1. The values in this vector must be 0 and 1; otherwise the classification performance will be reported incorrectly.
Returns
CrossValidation object
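Because y must contain only 0 and 1, class labels in any other form need to be encoded before calling fit. A minimal sketch, assuming exactly two classes; the helper `encode_binary` is hypothetical and not part of metbit:

```python
import numpy as np

def encode_binary(labels):
    """Map two class labels to 0 and 1; sorted order decides which becomes 0."""
    classes, encoded = np.unique(np.asarray(labels), return_inverse=True)
    if len(classes) != 2:
        raise ValueError(f"expected exactly 2 classes, got {len(classes)}")
    return encoded, classes

y, classes = encode_binary(["control", "treated", "treated", "control"])
# y can now be passed as the dependent vector; classes records the mapping.
```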
predict(self, x)
Predict using the optimal model.
Parameters
x (np.ndarray): Variable matrix with size n samples by p variables.
Returns
np.ndarray Predictions for x.
reset_optimal_num_component(self, k)
Reset the optimal number of components for manual setup.
Parameters
k (int): Number of components according to the error plot.
Returns
None
orthogonal_score(self)
Cross validated orthogonal score.
Returns
np.ndarray The first orthogonal scores.
Raises
ValueError If OPLS / OPLS-DA is not used.
predictive_score(self)
Cross validated predictive score.
Returns
np.ndarray The first predictive scores.
Raises
ValueError If OPLS / OPLS-DA is not used.
scores(self)
Returns
np.ndarray The first predictive score if the method is OPLS / OPLS-DA; otherwise, the scores of X.
q2(self)
Q2 of the cross-validated model.
Returns
q2 (float)
optimal_component_num(self)
Number of components determined by CV.
Returns
int
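Both values above come from cross validation. A common definition of Q2 is 1 - PRESS/TSS, where PRESS is the squared error of the held-out predictions and TSS is the total sum of squares of y. The sketch below computes it for several component counts and picks the count with the highest Q2. The predictions are made-up illustration data, and whether metbit selects components by Q2 or by misclassification count is not stated here, so the selection rule is an assumption.

```python
import numpy as np

def q2_score(y_true, y_cv_pred):
    """Q2 = 1 - PRESS / TSS, computed from held-out predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_cv_pred = np.asarray(y_cv_pred, dtype=float)
    press = np.sum((y_true - y_cv_pred) ** 2)    # prediction error sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - press / tss

y = np.array([0, 0, 1, 1])
# Hypothetical cross-validated predictions for 1, 2 and 3 components.
cv_pred_by_ncomp = {
    1: [0.4, 0.5, 0.5, 0.6],
    2: [0.1, 0.2, 0.8, 0.9],
    3: [0.2, 0.3, 0.7, 0.9],
}
q2_by_ncomp = {k: q2_score(y, p) for k, p in cv_pred_by_ncomp.items()}
best_ncomp = max(q2_by_ncomp, key=q2_by_ncomp.get)
```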
R2Xcorr(self)
Returns
float Modeled joint X-y covariation of X.
Raises
ValueError If OPLS / OPLS-DA is not used.
R2XYO(self)
Returns
float Modeled structured noise variation of X.
Raises
ValueError If OPLS / OPLS-DA is not used.
R2X(self)
Returns
float Modeled variation of X
R2y(self)
Returns
float Modeled variation of y
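The "modeled variation" quantities are commonly computed as one minus the ratio of residual to total sum of squares on centered (or scaled) data. The sketch below shows that ratio with a rank-one SVD approximation standing in for a fitted model. The helper `explained_fraction` is hypothetical, and the exact formula metbit uses is an assumption.

```python
import numpy as np

def explained_fraction(data, fitted):
    """1 - SS(residual) / SS(data), for data that is already centered."""
    data = np.asarray(data, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    return 1.0 - np.sum((data - fitted) ** 2) / np.sum(data ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
X = X - X.mean(axis=0)  # mean-center each variable

# Stand-in for a one-component model: the best rank-one approximation of X.
u, s, vt = np.linalg.svd(X, full_matrices=False)
X_hat = s[0] * np.outer(u[:, 0], vt[0])

r2x = explained_fraction(X, X_hat)
```

The same function applied to y and its fitted values gives the corresponding fraction for y.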
correlation(self)
Correlation
Returns
np.ndarray Correlation loading profile
Raises
ValueError If OPLS / OPLS-DA is not used.
References
[1] Wiklund S, et al. Visualization of GC/TOF-MS-Based Metabolomics Data for Identification of Biochemically Interesting Compounds Using OPLS Class Models. Anal Chem. 2008, 80, 115-122.
covariance(self)
Covariance
Returns
np.ndarray Covariance loading profile
Raises
ValueError If OPLS / OPLS-DA is not used.
References
[1] Wiklund S, et al. Visualization of GC/TOF-MS-Based Metabolomics Data for Identification of Biochemically Interesting Compounds Using OPLS Class Models. Anal Chem. 2008, 80, 115-122.
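Following the definitions in Wiklund et al. [1], the covariance and correlation profiles relate a score vector t to each variable of X: Cov(t, X_i) = t^T X_i / (N - 1) and Corr(t, X_i) = Cov(t, X_i) / (s_t s_Xi). A minimal NumPy sketch on random data; the function name `splot_loadings` is hypothetical, and how metbit obtains t under cross validation is not shown here.

```python
import numpy as np

def splot_loadings(x, t):
    """Covariance and correlation between a score vector t and each column of x."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float).ravel()
    n = x.shape[0]
    xc = x - x.mean(axis=0)  # center each variable
    tc = t - t.mean()        # center the scores
    cov = tc @ xc / (n - 1)
    corr = cov / (tc.std(ddof=1) * xc.std(axis=0, ddof=1))
    return cov, corr

rng = np.random.default_rng(1)
x = rng.normal(size=(20, 5))
# Illustrative score vector that tracks the first variable closely.
t = 2.0 * x[:, 0] + 0.1 * rng.normal(size=20)
cov, corr = splot_loadings(x, t)
```

Plotting cov against corr for every variable gives the S-plot described in the reference.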
loadings_cv(self)
Loadings from cross validation.
Returns
np.ndarray Loading profile from cross validation
Raises
ValueError If OPLS / OPLS-DA is not used.
min_nmc(self)
Returns
float Minimal number of mis-classifications
mis_classifications(self)
Returns
list Mis-classifications at different principal components.
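The relationship between the two methods above can be sketched as follows: count the misclassified held-out samples for each number of components, then take the minimum. The 0.5 decision threshold and the prediction values are assumptions made for illustration; metbit's internal rule may differ.

```python
import numpy as np

def count_misclassified(y_true, y_pred, threshold=0.5):
    """Number of samples whose thresholded prediction differs from the 0/1 label."""
    y_true = np.asarray(y_true)
    predicted_class = (np.asarray(y_pred, dtype=float) > threshold).astype(int)
    return int(np.sum(predicted_class != y_true))

y = np.array([0, 0, 1, 1])
# Hypothetical cross-validated predictions for 1, 2 and 3 components.
cv_pred_by_ncomp = [
    [0.6, 0.4, 0.4, 0.6],
    [0.1, 0.2, 0.8, 0.9],
    [0.2, 0.6, 0.7, 0.9],
]
mis = [count_misclassified(y, p) for p in cv_pred_by_ncomp]
min_nmc = min(mis)
```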