Let T be a tree on n vertices. The subtree frequency vector (STF-vector) of T, denoted by stf(T) is a vector of length n whose k th coordinate is the number of subtrees of T that have exactly k vertices. We present algorithms for calculating the subtree frequencies. We give a combinatorial interpretation for the first few and last few entries of the STF-vector. The main question we investigate {originally motivated by the problem of determining molecule structure from mass spectrometry data {is whether T can be reconstructed from stf(T). We show that there exist examples of non-isomorphic pairs of unlabeled free (i.e. unrooted) trees that are STF-equivalent, i.e. have identical subtree frequency vectors. Using exhaustive computer search, we determine all such pairs for small sizes. We show that there are in finitely many non-isomorphic STF-equivalent pairs of trees by constructing in finite families of examples. We also show that for special kinds of trees (e.g. paths, stars and trees containing a single vertex of degree larger than 2), the tree is reconstructible from the subtree frequencies. We consider a version of the problem for rooted trees, where only subtrees containing the root are counted. Finally, we formulate some conjectures and open problems and outline further research directions.
展开▼