Plotting methods

Plotting methods, again, work similar to PCA, so in this section we will look more detailed at the available methods instead of on how to customize them. PLS has a lot of different results and much more possible plots. Here is a list of methods, which will work both for a model and for a particular results.

Plotting methods for summary statistics.

Method Description
plotRMSE(obj, ny, ...) RMSE values vs. number of components in a model
plotRMSERatio(obj, ny, ...) RMSECV/RMSEC ratio values vs. RMSECV in a model
plotXVariance(obj, ...) explained variance for X decomposition for each component
plotXCumVariance(obj, ...) same as above but for cumulative variance
plotYVariance(obj, ...) explained variance for Y decomposition for each component
plotYCumVariance(obj, ...) same as above but for cumulative variance

Plotting methods for objects.

Method Description
plotPredictions(obj, ny, ncomp, ...) plot with predicted vs. measured (reference) y-values
plotXScores(obj, comp, ...) scores for decomposition of X (similar to PCA plot)
plotYScores(obj, comp, ...) scores for decomposition of Y (similar to PCA plot)
plotXResiduals(obj, ncomp, ...) distance plot for decomposition of X (similar to PCA)
plotYResiduals(obj, ncomp, ...) distance plot for decomposition of Y
plotXYResiduals(obj, ncomp, ...) distance plot for both X and Y decomposition
plotXYScores(obj, ncomp, ...) Y-scores vs. X-scores for a particular PLS component.

Parameter obj is either a model or a result object and it is the only mandatory argument for the plots. All other parameters have reasonable default values. Parameter ny is used to specify which y-variable you want to see a plot for (if Y is multivariate). You can also provide any parameter from mdaplot() or mdaplotg() thus change limits or labels for axis, main title, colors, line and marker style etc.

Parameter comp allows to provide a number of selected components (one or several) to show the plot for, while parameter ncomp assumes that only one number is expected (number of components in a model or particular individual component). So if e.g. you create model for five components and selected three as optimal, you can also see, for example, prediction plot for having only one or for components in your model.

Here is an example for creating prediction plot for m model (left for automatically selected number of components, right for user specified value):

par(mfrow = c(1, 2))
plotPredictions(m)
plotPredictions(m, ncomp = 1)

By the way, when plotPredictions() is made for results object, you can show performance statistics on the plot:

par(mfrow = c(2, 2))
plotPredictions(m$res$cal)
plotPredictions(m$res$cal, ncomp = 2)
plotPredictions(m$res$cal, show.stat = TRUE)
plotPredictions(m$res$cal, ncomp = 2, show.stat = TRUE)

The plots for variables are available only for a model object and include:

Methods Description
plotXLoadings(obj, comp) loadings plot for decomposition of X
plotXYLoadings(obj, comp) loadings plot for both X and Y decomposition
plotWeights(obj, comp) plot with weights (W) for PLS decomposition
plotRegcoeffs(obj, ny, ncomp) plot with regression coefficients
plotVIPScores(obj, ny) VIP scores plot
plotSelectivityRatio(obj, ny, ncomp) Selectivity ratio plot

And, of course, both model and result objects have method plot() for giving an overview.

plot(m)

Excluding rows and columns

PLS, like PCA, also can exclude rows and columns from calculations. The implementation works similar to what was described for PCA. For example, it can be useful if you have some candidates for outliers or do variable selection and do not want to remove rows and columns physically from the data matrix. In this case you can just specify two additional parameters, exclcols and exclrows, using either numbers or names of rows/columns to be excluded. You can also specify a vector with logical values (all TRUEs will be excluded).

The excluded rows are not used for creating a model and calculation of model’s and results’ performance (e.g. explained variance). However main results (for PLS — scores, predictions, distances) are calculated for these rows as well and set hidden, so you will not see them on plots. You can always e.g. show scores for excluded objects by using show.excluded = TRUE. It is implemented via attributes “known” for plotting methods from mdatools so if you use e.g. ggplot2 you will see all points.

The excluded columns are not used for any calculations either, the corresponding results (e.g. loadings, weights or regression coefficients) will have zero values for such columns and be also hidden on plots.