A highly interpretable approach to dimension reduction is to partition the collection of variables into subsets and then select a single variable from each subset to represent it. We introduce two general approaches to finding the partition and its representatives: correlation cliques and variable clustering. The former finds maximal subsets of variables whose correlations satisfy a specified lower bound; the latter optimizes a general criterion for dimension reduction. This criterion is built from two maps: a dimension reduction map from the full set of variables to the reduced set, and an approximation map from the reduced set back to the full set. The objective function is the sum of squared errors incurred when the full set is approximated from the reduced set. Examples, including one with spatial structure, illustrate the methods and their utility for data analysis.
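
As a rough illustration of the sum-of-squared-errors criterion described above, the following Python sketch evaluates the objective for a candidate set of representative variables. It rests on assumptions not stated in the abstract: the reduction map simply keeps the chosen columns, the approximation map is an ordinary linear least-squares fit on centered data, and the function name `reconstruction_sse` is hypothetical. It is a sketch under these assumptions, not the authors' implementation.

```python
import numpy as np

def reconstruction_sse(X, selected):
    """Sum of squared errors when every column of X is approximated
    from the columns listed in `selected`.

    Assumption (not from the source): the approximation map is linear,
    fitted by least squares on column-centered data.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable
    Z = Xc[:, list(selected)]                    # reduction map: keep representatives
    B, *_ = np.linalg.lstsq(Z, Xc, rcond=None)   # approximation map: Z @ B ~ Xc
    resid = Xc - Z @ B
    return float(np.sum(resid ** 2))

# Toy data: the second column is an exact multiple of the first,
# the third is unrelated noise.
rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = rng.normal(size=50)
X = np.column_stack([a, 2 * a, b])

sse_one_per_group = reconstruction_sse(X, [0, 2])  # one representative per natural group
sse_missing_group = reconstruction_sse(X, [0])     # third variable left unrepresented
```

In this toy setting, choosing one representative from each natural group reproduces the full data almost exactly, while dropping the representative of the third variable leaves its variation unexplained. A search over partitions and representatives could compare candidates by this value.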