When good predictive ability is the one and only goal, it is perhaps less important to think about orthogonality. The reason is simple: we have decided to look only at the predicted values, and we don't care about anything else in the models we have created. When talking about differences between linear algorithms like PLS (partial least squares), it is very tempting to draw the erroneous conclusion that they are all the same just because the predicted values are similar. Similar in predictive ability, yes, but what about other model properties? What if one algorithm predicts as well as the others, but at the same time has better mathematical properties?
If you only want good predictions, don't use linear methods like PLS. If you want something non-orthogonal, choose something that is good at being non-orthogonal: use ANN (artificial neural networks), other non-linear black-box methods, support vector machines or LOCAL, like the professionals do. Selecting a PLS algorithm just because it has a similar predictive ability to other PLS algorithms does not make any sense. As I said in the conclusion of my paper comparing nine different PLS algorithms:
There is no reason to let PLS stand for “Partial Little Squares” or “Partial Less Squares” when there is nothing to gain from it. Use only the numerically stable algorithms and let PLS stand for “Partial Least Squares”.
This refers to the least squares of BOTH the X and the Y sides, and the conclusion still holds. The paper was very focused on reaching a conclusion to the rather difficult question of which algorithms are the best, and perhaps I focused too much on the numerical differences. I did emphasize that the differences relate to the orthogonality properties of the underlying latent vectors, but now I think that this point needs some more attention. Why would orthogonality be important? Are similar predictions not enough? Actually, I used to think so myself, and I remember thinking: “Who cares?”.
I will now list some simple reasons why orthogonality of mathematical models is important:
- If you look at the scores (the data projected onto the model) and plot them in xy-scatter plots, you may prefer to have 90 degrees between the axes. If you use inaccurate PLS algorithms, you can't be sure it's 90 degrees (see the first sketch after this list).
- If you like to calculate Mahalanobis distances between points, you may want to compute them as plain Euclidean distances in the score space. That shortcut only holds if you have an orthogonal coordinate system (second sketch below).
- You may want to calculate the distance to the model for new data. If the axes of the underlying model are twisted away from 90 degrees, you don't know what you will get (third sketch below).
- You may like to use linear algebra and the rules of mathematics to create new algorithms. Orthogonality is then important if you want to link your models to old as well as new theory (last sketch below).
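To make the first point concrete, here is a minimal sketch of how you can check the angles between score axes yourself. It uses scikit-learn's PLSRegression and made-up random data purely as an illustration; substitute whatever PLS implementation you are actually evaluating. A numerically sound algorithm should give off-diagonal cosines that are essentially zero.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Made-up example data: 50 samples, 10 X-variables, one response.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

pls = PLSRegression(n_components=4).fit(X, y)
T = pls.transform(X)                 # score matrix, one column per latent vector

# Cosine of the angle between every pair of score axes:
# exactly zero means the axes really are at 90 degrees.
G = T.T @ T
cosines = G / np.sqrt(np.outer(np.diag(G), np.diag(G)))
off_diag = cosines - np.eye(cosines.shape[0])
print("largest |cos(angle)| between score axes:", np.abs(off_diag).max())
```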
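For the Mahalanobis point, a small sketch with hypothetical scores. When the score columns are mean-centred and orthogonal, the covariance matrix in score space is diagonal, and the Mahalanobis distance collapses into an ordinary Euclidean distance on std-scaled scores; twist the axes and that equivalence disappears.

```python
import numpy as np

# Hypothetical score matrix with mean-centred, exactly orthogonal columns,
# standing in for the scores of a numerically sound PLS model.
rng = np.random.default_rng(1)
T = rng.normal(size=(50, 3))
T -= T.mean(axis=0)                      # centre the columns
T, _ = np.linalg.qr(T)                   # force the columns to be orthogonal
T *= np.array([3.0, 1.5, 0.5])           # give each latent axis its own spread

cov_inv = np.linalg.inv(np.cov(T, rowvar=False))   # diagonal for orthogonal scores
a, b = T[0], T[1]
d_mahal = np.sqrt((a - b) @ cov_inv @ (a - b))

# With 90-degree axes this is the same as a plain Euclidean distance
# on scores scaled by their standard deviations.
scaled = T / T.std(axis=0, ddof=1)
d_eucl = np.linalg.norm(scaled[0] - scaled[1])
print(d_mahal, d_eucl)                   # the two numbers agree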
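For the distance-to-model point, a sketch of one common recipe: project new samples onto the model, reconstruct them, and measure what the model could not reproduce. Again, scikit-learn's PLSRegression and the random data are only stand-ins; the split into an "inside the model" part and a residual part only behaves predictably when the model axes are at 90 degrees.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Made-up training data and two new samples.
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 8))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=60)
X_new = rng.normal(size=(2, 8))

pls = PLSRegression(n_components=3).fit(X, y)

# Project the new samples onto the model, reconstruct them, and measure
# what the model cannot reproduce: a raw, DModX-style residual distance.
T_new = pls.transform(X_new)
X_hat = pls.inverse_transform(T_new)
dist_to_model = np.linalg.norm(X_new - X_hat, axis=1)
print(dist_to_model)
```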
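And for the linear-algebra point, a tiny sketch (with made-up numbers) of the kind of simplification orthogonality buys you: when the score matrix has orthonormal columns, the least-squares fit of y on the scores needs no matrix inversion at all.

```python
import numpy as np

rng = np.random.default_rng(3)
T, _ = np.linalg.qr(rng.normal(size=(30, 4)))    # orthonormal score columns
y = rng.normal(size=30)

q_shortcut = T.T @ y                             # valid only because T'T = I
q_lstsq, *_ = np.linalg.lstsq(T, y, rcond=None)  # ordinary least squares
print(np.allclose(q_shortcut, q_lstsq))          # True
```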
I am sure there are more reasons. Conclusion: use only the best PLS algorithms if you want to use PLS. Don't judge which algorithm is better by looking only at the predictions. Orthogonality is indeed important, but if you care only about predictions, it is okay to forget about orthogonality for a while. While doing so, you'd better switch to ANN or another non-linear method that is good at being non-linear.