Given a set of strings S = {s_1,..., s_n}, the Shortest Super-string problem asks for the shortest string s which contains each S_i as a substring. We consider two measures of success in this problem: the length measure, which is the length of s, and the compression measure, which is the difference between the sum of lengths of the s_i and the length of s. Both the length and the compression versions of the problem are known to be MAX-SNP-hard. The only explicit approximation ratio lower bounds are by Ott: 1.000057 for the length measure and 1.000089 for the compression measure. Using a natural construction we improve these lower bounds to 1.00082 for the length measure and 1.00093 for the compression measure. Our lower bounds hold even for instances in which the strings are over a binary alphabet and have equal lengths. In fact, we show a somewhat surprising result, that the Shortest Superstring problem (with respect to both measures) is as hard to approximate on instances over a binary alphabet, as it is over any alphabet.
展开▼