A method and system are provided for automatically disambiguating multiples of syntactically related words using the notion of semantic similarity between words. Based on syntactically related words derived from a sample of text, a set is formed for each associating word, containing that word and the words associated with it in the syntactic relationship. The associating words are expanded to all of their word senses. Pairwise intersections of the resulting sets are then formed, yielding pairs of semantically compatible word clusters, which may be stored as pairs of co-occurrence restriction codes.
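The steps summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the sense inventory `SENSES`, its code names, the sample verb-object pairs, and the helper names are all hypothetical, and "semantic similarity" is reduced here to the assumption that two words are similar when they share at least one sense code.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy sense inventory: word -> set of semantic codes.
# A real system would draw these from a thesaurus or lexical database.
SENSES = {
    "drink": {"INGEST", "ALCOHOL_USE"},
    "sip": {"INGEST"},
    "fire": {"DISMISS", "SHOOT"},
    "sack": {"DISMISS", "PLUNDER"},
    "water": {"LIQUID"},
    "tea": {"LIQUID", "MEAL"},
    "clerk": {"PERSON"},
    "gun": {"WEAPON"},
    "employee": {"PERSON"},
}

# Hypothetical (associating word, associated word) pairs, e.g. verb-object
# pairs extracted from a parsed text sample.
SAMPLE_PAIRS = [
    ("drink", "water"), ("drink", "tea"), ("sip", "water"),
    ("fire", "clerk"), ("fire", "gun"), ("sack", "employee"),
]


def group_by_associating_word(pairs):
    """Form one set of associated words per associating word."""
    groups = defaultdict(set)
    for head, dependent in pairs:
        groups[head].add(dependent)
    return dict(groups)


def expand_senses(words):
    """Expand a collection of words to the union of all their sense codes."""
    codes = set()
    for word in words:
        codes |= SENSES.get(word, set())
    return codes


def restriction_codes(pairs):
    """Intersect the expanded sets pairwise and collect compatible code pairs."""
    groups = group_by_associating_word(pairs)
    result = set()
    for h1, h2 in combinations(sorted(groups), 2):
        # Senses shared by the two associating words.
        head_codes = expand_senses([h1]) & expand_senses([h2])
        # Senses shared by the words each of them associates with.
        dep_codes = expand_senses(groups[h1]) & expand_senses(groups[h2])
        # Each surviving combination is one co-occurrence restriction pair.
        result.update((hc, dc) for hc in head_codes for dc in dep_codes)
    return result


if __name__ == "__main__":
    for code_pair in sorted(restriction_codes(SAMPLE_PAIRS)):
        print(code_pair)
```

In this toy data, intersecting the sets for "drink" and "sip" keeps only the ingestion sense of the verbs and the liquid sense of "tea", while "fire" and "sack" keep only the dismissal sense paired with a person; the senses that no other set supports are discarded.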