Table of Contents
wngroups - discussion of WordNet search code to group similar senses
Standard dictionaries commonly group related senses of a
word with expressions such as "the act, process, or result of X", or else
they enter the act, process, and result senses as subheads under a single
sense number. Unlike standard dictionaries, the default display for the
WordNet browser is to show senses in order of frequency of use in the
semantically tagged texts described in semcor(7WN)
. The RELATIVES
search
displays similar senses of a word together. At the present time, nouns
and some verbs are grouped.
Three relations are used to group
noun senses: cousins, sisters, and twins.
The cousin groupings exploit
the hyponym relation in WordNet. Many WordNet nodes whose hyponyms bear
a specific relation to each other have been identified. For example, the
noun crab refers to an animal, as well as the edible meat of the animal.
The same relation, that of "an animal and its edible meat", holds for
lobster , chicken and most other matching strings under the food and
animal nodes.
Another class of related senses in the noun hierarchy is
called sisters. Sisters are matching strings that are both the immediate
hyponyms of the same superordinate. For example, the noun flounder can
refer to several kinds of flatfish .
The third grouping relation is called
twins. These are synsets that have at least three words in common. For
example, one sense of duo is a musical group and another is a musical
composition. Both synsets contain duet , duette , and duo .
Transitivity
is used to combine groups of overlapping senses into the largest sense
groups possible.
Some similar senses of verbs have been grouped
by the lexicographers. This grouping is done statically in the lexicographer
source files using the semantic pointer_symbol $ . As with the noun senses,
transitivity is used to combine groups of overlapping senses into the
largest sense groups possible.
There are, of course, exceptions
to all of the relations. For example, the noun coral is in both the animal
and food hierarchies, but does not bear the relation of "an animal and
its edible meat". Candidates for cousin and twin groupings are checked
by hand and those that should not be grouped together are listed in the
exception list file.
Coverage of noun cousins is incomplete.
Coverage
of verb groups is incomplete.
Groups of noun senses are determined at
run-time - not statically - when the RELATIVES
search is requested. Depending
on the computer platform and the number of senses involved, this search
may run slowly.
- WNHOME
- Base directory for WordNet.
Unix default is /usr/local/wordnet1.6 , PC default is C:\wn16 , Macintosh
default is : .
- WNSEARCHDIR
- Directory in which the WordNet database has
been installed. Unix default is WNHOME/dict , PC default is WNHOME\dict
, Macintosh default is :Database .
All files are in directory WNSEARCHDIR
:
- cousin.tops
- pairs of noun top nodes (Unix and Macintosh)
- cousin.tps
- pairs of noun top nodes (PC)
- cousin.exc
- senses that should not be grouped
wn(1WN)
, wnb(1WN)
, wnsearch(3WN)
, wndb(5WN)
semcor(7WN)
.
Table of Contents