Preprocessing as part of the model

Starting from v. 0.15.0, preprocessing methods can be combined into a list and provided as an additional argument to the modelling functions, including pca(). Once the list is provided, the model takes care of estimating the preprocessing parameters (e.g. the reference spectrum for EMSC) and automatically applies all preprocessing methods to new data when the user calls predict().
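The idea of estimating preprocessing parameters once and reusing them can be illustrated with a minimal base R sketch. It does not use mdatools and is not how the package implements this internally. The helper names msc_fit() and msc_apply() are hypothetical, and the data are synthetic. A simple multiplicative scatter correction stands in for EMSC here: the reference spectrum is computed from the calibration set and then reused, unchanged, for new spectra.

```r
# Hypothetical helpers for illustration only (not part of mdatools).

# "Fit" step: estimate the preprocessing parameter (reference spectrum)
# from the calibration set.
msc_fit <- function(X) {
  list(ref = colMeans(X))
}

# "Apply" step: regress every spectrum on the stored reference and
# remove the estimated offset and slope.
msc_apply <- function(params, X) {
  t(apply(X, 1, function(x) {
    cf <- coef(lm(x ~ params$ref))
    (x - cf[1]) / cf[2]
  }))
}

# Synthetic spectra: one peak with a random offset and random scaling.
set.seed(1)
peak <- dnorm(seq(-3, 3, length.out = 50))
make_spectra <- function(n) {
  t(replicate(n, runif(1, 0, 0.5) + runif(1, 0.5, 1.5) * peak + rnorm(50, sd = 0.001)))
}

Xc_demo <- make_spectra(20)  # calibration set
Xt_demo <- make_spectra(5)   # new data

# Parameters come from the calibration set only ...
params <- msc_fit(Xc_demo)

# ... and the same parameters are used for both sets.
Xc_corr <- msc_apply(params, Xc_demo)
Xt_corr <- msc_apply(params, Xt_demo)
```

The key point is that msc_apply() never recomputes the reference from the new data. Passing the preprocessing list to the model serves the same purpose, without the user having to keep track of the stored parameters.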

Here is an example. We will create a PCA model for the Simdata UV/Vis spectra, but we would like to smooth the spectra first and then apply SNV normalization. Here is how to do this:

# load the package and the Simdata dataset
library(mdatools)
data(simdata)

# get spectra for the calibration and test sets
Xc = simdata$spectra.c
Xt = simdata$spectra.t

# define chain of preprocessing methods
p = list(
   prep("savgol", width = 7, porder = 1, dorder = 0),
   prep("norm", type = "snv")
)

# create two PCA models - with and without preprocessing
m1 = pca(Xc, 6)
m2 = pca(Xc, 6, prep = p)

# apply the models to the test set
r1 = predict(m1, Xt)
r2 = predict(m2, Xt)

# show scores and loadings for both models on a 2 x 2 grid
par(mfrow = c(2, 2))
plotScores(m1, res = list("cal" = m1$res$cal, "test" = r1), main = "Scores (no preprocessing)")
plotScores(m2, res = list("cal" = m2$res$cal, "test" = r2), main = "Scores (with preprocessing)")
plotLoadings(m1, comp = c(1, 2), type = "l", main = "Loadings (no preprocessing)")
plotLoadings(m2, comp = c(1, 2), type = "l", main = "Loadings (with preprocessing)")

As you can see, there is no need to apply the preprocessing methods to the test set manually: predict() takes care of everything.
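SNV itself can be sketched in a few lines of base R. This is an illustration of the transformation, not the mdatools code, and snv_sketch() is a hypothetical name. Each spectrum (row) is centred by its own mean and scaled by its own standard deviation, so, unlike EMSC, there is nothing to estimate from the calibration set.

```r
# Hypothetical helper for illustration only (not part of mdatools).
# Standard normal variate: row-wise centring and scaling.
snv_sketch <- function(X) {
  X <- as.matrix(X)
  # rowMeans(X) and the row-wise sd are recycled down the columns,
  # so element [i, j] is corrected with the statistics of row i
  (X - rowMeans(X)) / apply(X, 1, sd)
}

# Synthetic data: 5 "spectra" with 20 variables each
set.seed(42)
X_demo <- matrix(rnorm(5 * 20, mean = 10, sd = 2), nrow = 5)
X_snv <- snv_sketch(X_demo)

# After SNV every row has zero mean and unit standard deviation
round(rowMeans(X_snv), 10)
round(apply(X_snv, 1, sd), 10)
```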

Information about the preprocessing methods is also shown in the model summary:

summary(m2)
## 
## Summary for PCA model (class pca)
## Type of limits: ddmoments
## Alpha: 0.05
## Gamma: 0.01
## 
## Preprocessing methods:
##  - savgol: width = 7, porder = 1, dorder = 0 
##  - norm: type = snv,  
## 
##        Eigenvals Expvar Cumexpvar Nq Nh
## Comp 1     0.380  82.75     82.75  1  1
## Comp 2     0.076  16.60     99.34  1  1
## Comp 3     0.001   0.20     99.55  2  1
## Comp 4     0.000   0.06     99.60  3  1
## Comp 5     0.000   0.04     99.65  3  1
## Comp 6     0.000   0.03     99.68  4  1