THESAURUS (Greek thesauros, treasure, treasury): a reference dictionary in which the terms used in a particular subject field are collected and the semantic relations between them are given. Examples of T. are the thesauri of various information retrieval systems (see), encyclopedic publications, in particular medical encyclopedias (see Big Medical Encyclopedia, Medical encyclopedias), and other works of a reference character.
A term may be a single word or a set phrase of two or more words. Semantic relations are divided into hierarchical (of the genus-species and part-whole types), synonymic and associative. Synonymy may be absolute or conditional. Terms are considered conditionally synonymous when their meanings coincide only partially, or coincide completely only from the point of view of some of the users of the information system. Associative relations are relations such as process and product, cause and effect, object and field of application, and any other relations that differ from the hierarchical and synonymic ones. T. may also contain short definitions of terms, most often in the form of an indication of the field of application for terms with several meanings (homonyms), e.g. morphology (biology) and morphology (linguistics).
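The three kinds of relations described above can be sketched as a small data structure. This is a minimal illustration, not the layout of any particular thesaurus: the class names `Term` and `Thesaurus`, their fields, and the sample terms are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One thesaurus entry (hypothetical layout)."""
    name: str
    scope: str = ""                              # field of application, used to separate homonyms
    broader: set = field(default_factory=set)    # hierarchical: genus or whole
    narrower: set = field(default_factory=set)   # hierarchical: species or part
    synonyms: set = field(default_factory=set)   # absolute or conditional synonyms
    related: set = field(default_factory=set)    # associative relations


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def term(self, name, scope=""):
        # Homonyms share a name but are stored under different scopes.
        key = (name, scope)
        if key not in self.terms:
            self.terms[key] = Term(name, scope)
        return self.terms[key]

    def add_hierarchy(self, broader, narrower):
        broader.narrower.add(narrower.name)
        narrower.broader.add(broader.name)

    def add_synonyms(self, a, b):
        a.synonyms.add(b.name)
        b.synonyms.add(a.name)

    def add_association(self, a, b):
        a.related.add(b.name)
        b.related.add(a.name)


# Illustrative sample data only.
t = Thesaurus()
t.add_hierarchy(t.term("antibiotic"), t.term("penicillin"))      # genus-species
t.add_synonyms(t.term("vitamin C"), t.term("ascorbic acid"))     # synonymy
t.add_association(t.term("infection"), t.term("fever"))          # cause and effect
t.term("morphology", "biology")                                  # homonyms kept apart
t.term("morphology", "linguistics")
```

Each relation is recorded on both terms it connects, so that it can be followed from either end.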
The composition and structure of T. depend on the technology of information processing. In T. used for manual indexing, a distinction is made between the main terms, or descriptors (see), and the terms that are synonyms of these descriptors and carry references to the corresponding descriptor. During indexing, the main terms (keywords) that define the meaning of the document are first selected from its text. The keywords are then replaced with descriptors by means of T., and in this way the search image of the document is created. When a query is indexed, narrower (specific) or associative descriptors may be added to its search image. Broader (generic) descriptors are additionally entered into the search image of a query only after a negative response to the initial query has been received. Descriptors with very broad meanings are not recommended for use in indexing.
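The indexing procedure just described can be sketched as follows. The tables `USE`, `NARROWER`, `BROADER` and `RELATED`, the sample terms, and the matching rule (a document is returned if its search image shares at least one descriptor with the query) are all assumptions made for the illustration.

```python
# Hypothetical miniature thesaurus: every entry term points to its descriptor.
USE = {
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "cardiac disease": "heart diseases",
    "heart diseases": "heart diseases",
    "ecg": "electrocardiography",
    "electrocardiography": "electrocardiography",
}
NARROWER = {"heart diseases": ["myocardial infarction"]}
BROADER = {"myocardial infarction": ["heart diseases"]}
RELATED = {"myocardial infarction": ["electrocardiography"]}


def search_image(keywords):
    """Replace keywords with descriptors; keywords unknown to the thesaurus are dropped."""
    return {USE[k.lower()] for k in keywords if k.lower() in USE}


def expand_query(image, widen=False):
    """Add narrower and associative descriptors; add broader ones only if widen is set."""
    expanded = set(image)
    for descriptor in image:
        expanded.update(NARROWER.get(descriptor, []))
        expanded.update(RELATED.get(descriptor, []))
        if widen:
            expanded.update(BROADER.get(descriptor, []))
    return expanded


def search(documents, query_keywords):
    """Search once without broader descriptors, then retry with them if nothing is found."""
    image = search_image(query_keywords)
    for widen in (False, True):
        wanted = expand_query(image, widen)
        hits = sorted(d for d, doc_image in documents.items() if doc_image & wanted)
        if hits:
            return hits
    return []


documents = {
    "doc1": search_image(["Heart attack", "ECG"]),
    "doc2": search_image(["cardiac disease"]),
}
first = search(documents, ["heart attack"])                       # answered on the first pass
fallback = search({"doc2": documents["doc2"]}, ["heart attack"])  # needs the broader descriptor
```

In the second call the first pass finds nothing, so the broader descriptor "heart diseases" is added and the query is repeated.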
The degree of development of T. is characterized by the ratio of the number of synonyms to the total number of terms. T. intended for manual indexing contains, as a rule, several arrangements of its terms: the lexico-semantic index and various other indexes (the classified index, the index of hierarchical relations, etc.). In the lexico-semantic index all terms are arranged in alphabetical order. In the classified index the same terms are arranged by subject (thematically). In the index of hierarchical relations descriptors are grouped into families, each headed by its generic term. In the permuted index, multi-word terms are put in alphabetical order under each of their significant words.
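Two of the notions above, the permuted index and the synonym ratio, lend themselves to a short sketch. The function names, the small list of non-significant words and the sample terms are assumptions made for the illustration.

```python
# Hypothetical list of non-significant words that do not get index entries.
NON_SIGNIFICANT = {"of", "the", "in", "and"}


def permuted_index(terms):
    """List every multi-word term under each of its significant words, alphabetically."""
    entries = []
    for term in terms:
        for word in term.split():
            if word.lower() not in NON_SIGNIFICANT:
                entries.append((word.lower(), term))
    return sorted(entries)


def synonym_ratio(descriptors, synonyms):
    """Ratio of the number of synonyms to the total number of terms."""
    total = len(descriptors) + len(synonyms)
    return len(synonyms) / total if total else 0.0


index = permuted_index(["information retrieval system", "retrieval of documents"])
ratio = synonym_ratio(
    descriptors=["myocardial infarction", "heart diseases", "electrocardiography"],
    synonyms=["heart attack"],
)
```

Here the term "information retrieval system" appears three times in `index`, under "information", "retrieval" and "system", while "of" produces no entry.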
In systems with automatic indexing each term can be recorded in computer memory only once, and the relations between terms are represented by address references. In a number of modern information retrieval systems the full text of the document or of its abstract is entered into computer memory. All significant words are selected from the text automatically (a list of non-significant words, including prepositions, conjunctions and words of general meaning, is held in computer memory), and from them inverted files are formed (see Information retrieval system), which are used for searching in response to queries. T. stored in computer memory is used for the automatic creation of the search image of a query (the search prescription): the keywords of the query are automatically supplemented with synonyms and narrower (specific) descriptors. Interactive (dialogue) formation of the search image of a query is also used. Storing the full text of the abstract or document in computer memory makes it possible, as T. is replenished, to search with greater completeness than when only the search images of documents are stored.
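A minimal sketch of this scheme, assuming English text and toy data: significant words are taken from the full text, an inverted file maps each word to the documents containing it, and the query is supplemented with synonyms and narrower descriptors. The names `STOP_WORDS`, `SYNONYMS`, `NARROWER` and the sample documents are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical list of non-significant words held in memory.
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "or", "for", "with", "is"}

# Hypothetical thesaurus fragments used to supplement the query.
SYNONYMS = {"drug": {"medicine"}, "medicine": {"drug"}}
NARROWER = {"drug": {"antibiotic"}}


def significant_words(text):
    """Select all words of the text that are not in the list of non-significant words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def build_inverted_file(documents):
    """Map every significant word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in significant_words(text):
            index[word].add(doc_id)
    return index


def expand(keywords):
    """Supplement query keywords with synonyms and narrower descriptors."""
    expanded = set(keywords)
    for keyword in keywords:
        expanded |= SYNONYMS.get(keyword, set())
        expanded |= NARROWER.get(keyword, set())
    return expanded


def search(index, query):
    hits = set()
    for word in expand(significant_words(query)):
        hits |= index.get(word, set())
    return sorted(hits)


documents = {
    1: "Storage of medicine in the hospital",
    2: "An antibiotic for infection",
    3: "Registration of personnel",
}
index = build_inverted_file(documents)
drug_hits = search(index, "drug")
before = search(index, "staff")
SYNONYMS["staff"] = {"personnel"}   # the thesaurus is replenished; documents are not re-indexed
after = search(index, "staff")
```

The last three lines illustrate the closing point of the paragraph: because the inverted file is built from the full text, a synonym added to the thesaurus later makes an already stored document retrievable without indexing it again.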
An important problem is the replenishment of T. with new terms and new semantic relations. It can be carried out manually or in an automated mode, on the basis of an analysis of new documents and queries. The final establishment of semantic relations in T. is carried out by a specialist. Automatic indexing raises rather complex linguistic problems. For automatic indexing, T. must store either the stems of words or the words with all their possible endings.
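The first of the two storage options, keeping word stems, can be sketched very roughly as a lookup by longest matching stem. The table `STEMS` and the function `normalize` are assumptions made for the illustration, and plain prefix matching is a deliberate simplification that can over-match, which hints at why the linguistic problems are described as complex.

```python
# Hypothetical table of stored stems and the thesaurus terms they stand for.
STEMS = {
    "vaccin": "vaccine",
    "infect": "infection",
    "immun": "immunity",
}


def normalize(word):
    """Return the term whose stored stem is the longest prefix of the word, or None."""
    word = word.lower()
    matches = [stem for stem in STEMS if word.startswith(stem)]
    if not matches:
        return None
    return STEMS[max(matches, key=len)]


found = [normalize(w) for w in ["Vaccines", "vaccination", "infected", "table"]]
```

Different word forms ("Vaccines", "vaccination") are brought to the same term, while a word with no stored stem is left unrecognized.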
T. in factographic automated systems (see Automated control system), e.g. systems for accounting for stocks of medicines, for personnel records, etc., differs from T. of an information retrieval system for scientific information in that it contains ordered lists of terms, each of which holds all the textual values of some set of objects or all the values of one attribute of an object. Such lists may contain the names of countries (suppliers of pharmaceuticals), a list of diseases, a list of types of packaging, etc. Keeping such lists in computer memory simplifies automatic indexing, makes factographic search unambiguous and makes data processing possible.
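Such ordered lists of permitted values can be sketched as follows. The attribute names, the sample values and the idea of using a value's position in its list as its code are assumptions made for the illustration.

```python
from bisect import bisect_left

# Hypothetical ordered lists: one list of permitted values per attribute.
LISTS = {
    "supplier_country": sorted(["Hungary", "India", "Poland"]),
    "packaging": sorted(["ampoule", "bottle", "tablet blister"]),
}


def code_of(attribute, value):
    """Return the position of the value in its ordered list; reject values not on the list."""
    values = LISTS[attribute]
    i = bisect_left(values, value)   # binary search is possible because the list is ordered
    if i < len(values) and values[i] == value:
        return i
    raise ValueError(f"{value!r} is not a permitted value of {attribute!r}")


def index_record(record):
    """Replace every textual value of a record with its unambiguous code."""
    return {attribute: code_of(attribute, value) for attribute, value in record.items()}


coded = index_record({"supplier_country": "India", "packaging": "bottle"})
```

Because every value must appear on its list, two records that name the same country or packaging always receive the same code, which is what makes the search unambiguous.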
Bibliography: Wheaten JI. AA. The thesaurus in a documentary information retrieval system, Kiev, 1977; Shemakin Yu. I. The thesaurus in automated control systems and information processing, M., 1974; Schultz C. K. Thesaurus of information science terminology, N. Y., 1978.
G. A. Shastova.