A thesaurus is a kind of hierarchically arranged dictionary: a standardized vocabulary built from the generic terms and narrower terms of a field. It provides definitions only incidentally; the relations between terms and the choice of terms take precedence over their meanings.
Note on spelling in French: thesaurus is a learned word borrowed directly from Latin, and in theory should therefore not take an accent. French dictionaries accept both spellings, thesaurus and thésaurus, but the gallicized form seems to be the most frequent in the literature. The Latin plural thesauri is sometimes used, but it is regarded as obsolete or, oddly, as an Anglicism (English uses the Latin plural). Consistency of course requires writing either un thesaurus, des thesauri, or un thésaurus, des thésaurus. The French original of this article adopts the accented form.
A thesaurus is a structured set of terms chosen for their ability to facilitate the description of a field and to harmonize communication and data processing about it. Each term, called a descriptor, is as unambiguous as possible and is preferred over closely related (quasi-synonymous) or synonymous terms, the non-descriptors, in all significant exchanges.
In practice, the thesaurus is a documentary tool for indexing. Guided by a relevant thesaurus, one can represent any document by a rigorous selection of precise words, called keywords. Any form of document management then becomes easy to provide.
For consultation and exploitation of the data, the thesaurus becomes a search tool: knowing the vocabulary and the rules of the indexing, the user can optimize queries.
A thesaurus is developed as a subset of the everyday vocabulary and of at least one specialized vocabulary. It is a controlled vocabulary, since it results from a long process of sorting the words, names and expressions used in an abstract way in a particular field. This is a pragmatic and continuous effort to rationalize the descriptive terms. A new thesaurus or a new version must generally go through a phase of validation by the community concerned.
Automatic text-processing systems (automatic indexing) can extract the most frequent terms of a corpus and, to a certain extent, help bring out their semantic relations.
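The frequency step of such automatic processing can be sketched as follows. This is only a minimal illustration in Python, not a description of any particular indexing system: the stop-word list, the sample corpus and the function name `candidate_terms` are all assumptions made for the example, and it does not attempt to detect semantic relations.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller,
# language-specific list.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}

def candidate_terms(corpus, top_n=5):
    """Return the top_n most frequent non-stop-words across all documents."""
    counts = Counter()
    for text in corpus:
        # Lowercase the text and keep runs of ASCII letters only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Hypothetical three-document corpus.
corpus = [
    "The reader borrows a book from the library.",
    "A library lends each book to a reader.",
    "The author of the book visits the library.",
]
print(candidate_terms(corpus, top_n=3))
```

The words returned are only candidates: choosing which of them become descriptors remains the sorting and validation work described above.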
To fit the field as closely as possible, the terms are inventoried, compared, linked and finally arranged hierarchically so as to reflect the essential features of the field. This hierarchy is based on a typology: each term belongs to a category that locates it relative to all the other selected terms and thereby fixes its priority of use. The hierarchy of terms can differ completely from one thesaurus to another, and can even be applied inconsistently from one use to another of the same thesaurus.
Starting from the highest level, which corresponds to the field of the thesaurus, one first finds the major subdivisions representing the components of the field (subdivisions often called microthesauri), then, for each subdivision, the hierarchy specific to its descriptors. A thesaurus can also cover several fields.
There always remains an arbitrary element in the hierarchy of a thesaurus, whether in the choice of terms or in their hierarchical position.
Standards exist for the development of thesauri:
The terms of a thesaurus are organized hierarchically (within microthesauri, which are often arranged alphabetically). This hierarchy makes it possible to adjust the precision of indexing or of querying. Indexing relies as far as possible on identifying specific terms (that is, terms at the lowest possible level), whereas a search can, depending on the case, call on generic terms to increase the number of results.
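How the hierarchy adjusts the breadth of a search can be illustrated with a small Python sketch. The vocabulary, the document index and the function names (`expand`, `search`) are hypothetical and chosen only for the example; documents are indexed with specific terms, and a query on a generic term is expanded to every term beneath it.

```python
# Hypothetical hierarchy: each generic term maps to its narrower terms.
NARROWER = {
    "vehicle": ["car", "bicycle"],
    "car": ["electric car"],
}

def expand(term):
    """Return the term followed by every term below it in the hierarchy."""
    result = [term]
    for child in NARROWER.get(term, []):
        result.extend(expand(child))
    return result

# Hypothetical index: document id -> descriptors assigned at indexing time.
# Each document is indexed at the most specific level available.
INDEX = {
    "doc1": {"electric car"},
    "doc2": {"bicycle"},
    "doc3": {"car"},
}

def search(term):
    """Find documents indexed with the term or any of its narrower terms."""
    wanted = set(expand(term))
    return sorted(doc for doc, descriptors in INDEX.items() if descriptors & wanted)

print(search("electric car"))  # narrow query on a specific term
print(search("vehicle"))       # broad query on a generic term
```

Under these assumptions, querying the specific term matches only the document indexed with it, while querying the generic term also retrieves everything indexed under its narrower terms.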
The relations between terms are of three types:
Any thesaurus comprises at least three categories of terms: generic terms and narrower terms, which must be used as descriptors, and equivalent terms, which are treated as non-descriptors under the conventions of the thesaurus.
* Generic terms are generally marked by the initials TG (French terme générique); they denote the principal entities or concepts relative to the other terms and to the field considered;
Very commonly one also finds associated terms, identified by MT (association relationship: causality, location, temporal relations, composition, etc.). Being descriptors themselves, these related terms let the searcher modify a query gradually, or broaden it, without resorting to generic terms.
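The equivalence and association relations described above can be modelled with a very small data structure. The following Python sketch is one possible representation under stated assumptions, not a standard one: the class name `Thesaurus`, its methods and the sample terms are all hypothetical.

```python
class Thesaurus:
    """Minimal store for equivalence and association relations."""

    def __init__(self):
        self.descriptors = set()
        self.preferred_for = {}  # non-descriptor -> descriptor to use instead
        self.related = {}        # descriptor -> set of associated descriptors

    def add_descriptor(self, term):
        self.descriptors.add(term)
        self.related.setdefault(term, set())

    def add_non_descriptor(self, term, descriptor):
        # A non-descriptor must always point at an existing descriptor.
        if descriptor not in self.descriptors:
            raise ValueError(f"{descriptor!r} is not a descriptor")
        self.preferred_for[term] = descriptor

    def relate(self, a, b):
        # Associated terms are themselves descriptors; the link is
        # recorded in both directions.
        for term in (a, b):
            if term not in self.descriptors:
                raise ValueError(f"{term!r} is not a descriptor")
        self.related[a].add(b)
        self.related[b].add(a)

    def resolve(self, term):
        """Map any known term to the descriptor used for indexing."""
        if term in self.descriptors:
            return term
        return self.preferred_for[term]  # raises KeyError for unknown terms


# Hypothetical sample vocabulary.
t = Thesaurus()
t.add_descriptor("car")
t.add_descriptor("road")
t.add_non_descriptor("automobile", "car")
t.relate("car", "road")

print(t.resolve("automobile"))
print(sorted(t.related["car"]))
```

Resolving a non-descriptor to its descriptor before indexing or searching is what keeps the vocabulary controlled, while the associated terms give the searcher a way to move sideways through the thesaurus.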
Various kinds of relations and additional headings can be added to this basic structure to enrich the thesaurus or to improve its use. In particular, one can provide linguistic equivalents for multilingual thesauri, as well as bridges to other thesauri in the same field or in different fields.
Consider the principal headings of a microthesaurus for a collaborative computing system:
* Individuals >
The heading Individuals would consist, for example, of:
* Reader (TG);
The person responsible for any contribution could thus be specified by at least one descriptive term chosen from among the five narrower terms or the three generic terms, as needed. Terms marked EP will in principle be avoided in indexing, but could be used later to retrieve exclusively a given type of contribution without strictly using the proper terms of the initial description.
Whatever its medium, a thesaurus usually presents its terms in alphabetical order, as a first step before presenting the hierarchical relations. A user may thus be thrown at first by the absence of a term from one list, whereas another way of consulting the thesaurus reveals that the term is indeed covered, but through its relation to one of the preferred terms. Presentations in the form of graphs and charts allow more complex exploration.
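One way to spare the user that confusion is to merge non-descriptors into the alphabetical list, each with a pointer to its preferred term. The Python sketch below illustrates the idea; the function name, the sample terms and the "USE" label (borrowed from English-language thesaurus practice) are assumptions made for the example.

```python
def alphabetical_display(descriptors, non_descriptors):
    """Build one alphabetical list covering both kinds of term.

    descriptors: iterable of preferred terms.
    non_descriptors: dict mapping each non-descriptor to its preferred term.
    """
    # Descriptors are shown as-is; non-descriptors carry a pointer.
    entries = {d: d for d in descriptors}
    for term, preferred in non_descriptors.items():
        entries[term] = f"{term} -> USE {preferred}"
    # Sort case-insensitively on the term itself.
    return [entries[key] for key in sorted(entries, key=str.lower)]

# Hypothetical sample vocabulary.
for line in alphabetical_display(["car", "bicycle"], {"automobile": "car"}):
    print(line)
```

With such a display, a term that is not a descriptor still appears in the list, and the entry itself tells the user which descriptor to consult.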
A thesaurus can usually be used or explored through several modes of presentation:
* Alphabetical list(s) of the terms, for a comprehensive approach or to look up a particular term;
In these lists one may find the symbol MT, indicating the microthesaurus to which the term belongs.
Associated with the descriptors, one finds definitions (in cases of homonymy), notes to assist the user, links of all kinds, etc.