A thesaurus is a controlled index language that makes indexing and searching databases easier and more cost-effective. It has a vocabulary of standard terms, with explanatory scope notes where necessary, and a syntactical structure that governs and displays the relationships between terms. The thesaurus is developed by collecting raw vocabulary from documents, user questions and existing vocabularies, and then organising that vocabulary into "facets": mutually exclusive groups of terms representing different aspects of the subject. Further analysis may generate subfacets, and will generate hierarchies of terms within the facets and subfacets. Notation attached to each term places it in its facet and hierarchy. References are made to each preferred term from non-preferred terms and from its broader, narrower and related terms. The faceted structure can be used to develop hierarchical, alphabetical, and rotated displays.
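The structure described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the class names (`Term`, `Thesaurus`), the sample terms, the facet names and the notation scheme (`A1`, `A1.2`, ...) are all hypothetical, and the display tags SN, UF, BT, NT and RT are the conventional thesaurus abbreviations for scope note, used for, broader term, narrower term and related term. The sketch shows how one faceted data structure might yield hierarchical, alphabetical and rotated displays.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Term:
    """A preferred term; all field names are illustrative assumptions."""
    label: str
    facet: str
    notation: str                      # places the term in its facet and hierarchy
    broader: Optional[str] = None      # label of the broader term, if any
    related: List[str] = field(default_factory=list)
    scope_note: str = ""


class Thesaurus:
    def __init__(self) -> None:
        self.terms: Dict[str, Term] = {}
        self.use: Dict[str, str] = {}  # non-preferred label -> preferred label

    def add(self, term: Term) -> None:
        self.terms[term.label] = term

    def add_use(self, non_preferred: str, preferred: str) -> None:
        self.use[non_preferred] = preferred

    def resolve(self, label: str) -> str:
        """Follow a USE reference from a non-preferred term, if one exists."""
        return self.use.get(label, label)

    def narrower(self, label: str) -> List[str]:
        """Narrower terms are derived from the broader-term links."""
        ordered = sorted(self.terms.values(), key=lambda t: t.notation)
        return [t.label for t in ordered if t.broader == label]

    def hierarchy(self, facet: str) -> List[str]:
        """Hierarchical display: terms of one facet, indented by depth."""
        lines: List[str] = []

        def walk(label: str, depth: int) -> None:
            lines.append(f"{'  ' * depth}{self.terms[label].notation} {label}")
            for child in self.narrower(label):
                walk(child, depth + 1)

        ordered = sorted(self.terms.values(), key=lambda t: t.notation)
        for top in ordered:
            if top.facet == facet and top.broader is None:
                walk(top.label, 0)
        return lines

    def entry(self, label: str) -> List[str]:
        """Alphabetical-display entry for one preferred term."""
        t = self.terms[label]
        lines = [f"{t.label} [{t.notation}]"]
        if t.scope_note:
            lines.append(f"  SN {t.scope_note}")
        lines += [f"  UF {np}" for np, p in sorted(self.use.items()) if p == label]
        if t.broader:
            lines.append(f"  BT {t.broader}")
        lines += [f"  NT {n}" for n in self.narrower(label)]
        lines += [f"  RT {r}" for r in sorted(t.related)]
        return lines

    def rotated(self) -> List[Tuple[str, str]]:
        """Rotated display: every word of every term becomes an entry point."""
        pairs = [(word.lower(), label)
                 for label in self.terms for word in label.split()]
        return sorted(pairs)


# Hypothetical sample vocabulary, two facets.
thesaurus = Thesaurus()
thesaurus.add(Term("Mammals", "Animals", "A1"))
thesaurus.add(Term("Cats", "Animals", "A1.1", broader="Mammals"))
thesaurus.add(Term("Dogs", "Animals", "A1.2", broader="Mammals", related=["Training"]))
thesaurus.add(Term("Guide dogs", "Animals", "A1.2.1", broader="Dogs"))
thesaurus.add(Term("Training", "Activities", "B1", related=["Dogs"]))
thesaurus.add_use("Canines", "Dogs")

print("\n".join(thesaurus.hierarchy("Animals")))
print("\n".join(thesaurus.entry("Dogs")))
```

In this sketch only the broader-term link is stored; narrower terms are computed from it, so the two directions of the hierarchy cannot fall out of step. A production thesaurus would more likely validate reciprocal related-term links and derive notation from the hierarchy rather than entering it by hand.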