Plotting methods
Plotting methods, again, work similar to PCA, so in this section we will look more detailed at the available methods instead of on how to customize them. PLS has a lot of different results and much more possible plots. Here is a list of methods, which will work both for a model and for a particular results.
Plotting methods for summary statistics.
Method | Description |
---|---|
plotRMSE(obj, ny, ...) |
RMSE values vs. number of components in a model |
plotRMSERatio(obj, ny, ...) |
RMSECV/RMSEC ratio values vs. RMSECV in a model |
plotXVariance(obj, ...) |
explained variance for X decomposition for each component |
plotXCumVariance(obj, ...) |
same as above but for cumulative variance |
plotYVariance(obj, ...) |
explained variance for Y decomposition for each component |
plotYCumVariance(obj, ...) |
same as above but for cumulative variance |
Plotting methods for objects.
Method | Description |
---|---|
plotPredictions(obj, ny, ncomp, ...) |
plot with predicted vs. measured (reference) y-values |
plotXScores(obj, comp, ...) |
scores for decomposition of X (similar to PCA plot) |
plotYScores(obj, comp, ...) |
scores for decomposition of Y (similar to PCA plot) |
plotXResiduals(obj, ncomp, ...) |
distance plot for decomposition of X (similar to PCA) |
plotYResiduals(obj, ncomp, ...) |
distance plot for decomposition of Y |
plotXYResiduals(obj, ncomp, ...) |
distance plot for both X and Y decomposition |
plotXYScores(obj, ncomp, ...) |
Y-scores vs. X-scores for a particular PLS component. |
Parameter obj
is either a model or a result object and it is the only mandatory argument for the plots. All other parameters have reasonable default values. Parameter ny
is used to specify which y-variable you want to see a plot for (if \(\mathbf{Y}\) is multivariate). You can also provide any parameter from mdaplot()
or mdaplotg()
thus change limits or labels for axis, main title, colors, line and marker style etc.
Parameter comp
allows to provide a number of selected components (one or several) to show the plot
for, while parameter ncomp
assumes that only one number is expected (number of components in a model or particular individual component). So if e.g. you create model for five components and selected three as optimal, you can also see, for example, prediction plot for having only one or for components in your model.
Here is an example for creating prediction plot for m
model (left for automatically selected number of components, right for user specified value):
By the way, when plotPredictions()
is made for results object, you can show performance statistics on the plot:
par(mfrow = c(2, 2))
plotPredictions(m$res$cal)
plotPredictions(m$res$cal, ncomp = 2)
plotPredictions(m$res$cal, show.stat = TRUE)
plotPredictions(m$res$cal, ncomp = 2, show.stat = TRUE)
The plots for variables are available only for a model object and include:
Methods | Description |
---|---|
plotXLoadings(obj, comp) |
loadings plot for decomposition of X |
plotXYLoadings(obj, comp) |
loadings plot for both X and Y decomposition |
plotWeights(obj, comp) |
plot with weights (W) for PLS decomposition |
plotRegcoeffs(obj, ny, ncomp) |
plot with regression coefficients |
plotVIPScores(obj, ny) |
VIP scores plot |
plotSelectivityRatio(obj, ny, ncomp) |
Selectivity ratio plot |
And, of course, both model and result objects have method plot()
for giving an overview.
Excluding rows and columns
PLS, like PCA, also can exclude rows and columns from calculations. The implementation works similar to what was described for PCA. For example, it can be useful if you have some candidates for outliers or do variable selection and do not want to remove rows and columns physically from the data matrix. In this case you can just specify two additional parameters, exclcols
and exclrows
, using either numbers or names of rows/columns to be excluded. You can also specify a vector with logical values (all TRUE
s will be excluded).
The excluded rows are not used for creating a model and calculation of model’s and results’ performance (e.g. explained variance). However main results (for PLS — scores, predictions, distances) are calculated for these rows as well and set hidden, so you will not see them on plots. You can always e.g. show scores for excluded objects by using show.excluded = TRUE
. It is implemented via attributes “known” for plotting methods from mdatools so if you use e.g. ggplot2 you will see all points.
The excluded columns are not used for any calculations either, the corresponding results (e.g. loadings, weights or regression coefficients) will have zero values for such columns and be also hidden on plots.