ISO 15924
ISO 15924 is the standard that lists the “Codes for the representation of names of scripts”. The Unicode Consortium acts as the registration and maintenance authority for the standard on behalf of ISO, which defines and approves it. However, ISO 15924 is not part of the Unicode Standard (which uses unified scripts that reflect only distinctions between abstract characters).
Designation and organization of scripts according to ISO 15924
For each writing system, the standard defines:
- a descriptive name in English;
- a descriptive name in French;
- a four-letter alphabetic code (normative), for example:
  - Arab: Arabic;
  - Cyrl: Cyrillic;
  - Egyp: Egyptian hieroglyphs;
  - Latn: Latin;
  - Laoo: Lao;
  - Yiii: Yi;
- a numeric code (normative) between 000 and 999; and finally
- a reference date that makes it possible to track the changes (and possible corrections) to each script in the standard itself.
For a complete and up-to-date list of the codes and names defined, see the website given at the end of this article.
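The fields listed above can be pictured as one record per script. The following is a minimal sketch, not an official data model: the class name `ScriptEntry` and its field names are hypothetical, and the reference date is left empty rather than copied from the standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ScriptEntry:
    """One row of the code list (hypothetical representation)."""
    code: str                 # four-letter alphabetic code, e.g. "Latn"
    number: int               # numeric code, 0..999
    english_name: str
    french_name: str
    reference_date: Optional[date] = None  # not filled in here on purpose

    def __post_init__(self) -> None:
        # The alphabetic code must be exactly four basic Latin letters.
        if not (len(self.code) == 4 and self.code.isascii() and self.code.isalpha()):
            raise ValueError(f"not a four-letter code: {self.code!r}")
        # The numeric code must lie between 000 and 999.
        if not 0 <= self.number <= 999:
            raise ValueError(f"numeric code out of range: {self.number}")

    @property
    def numeric_code(self) -> str:
        """Numeric code as a zero-padded three-digit string."""
        return f"{self.number:03d}"


latn = ScriptEntry("Latn", 215, "Latin", "latin")
egyp = ScriptEntry("Egyp", 50, "Egyptian hieroglyphs", "hiéroglyphes égyptiens")
print(latn.code, latn.numeric_code)  # Latn 215
print(egyp.code, egyp.numeric_code)  # Egyp 050
```

Storing the number as an integer and formatting it on demand keeps comparisons simple while still producing the three-digit form used in the tables.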
Nomenclature and numerical classification
The numeric codes are grouped in series of one hundred according to the typology and relative proximity of the scripts (see the examples below).
The codes and names are also defined to serve bibliographic needs concerning whole texts and documents, and are not reserved for isolated characters alone. Accordingly, different styles of writing that use the same abstract alphabet have their own codes, placed close together in the same series and, if possible, consecutively. For this reason the numeric codes are not assigned simply in increments of 1 (there are gaps in the classification).
The following series are currently in use:
- 000 to 099: hieroglyphic scripts (Egyptian or Maya) and cuneiform scripts (including Ugaritic);
- 100 to 199: right-to-left alphabetic scripts (including Phoenician and the other Semitic alphabets, Tifinagh, the abjads, Mongolian, N'Ko and Old Hungarian);
- 200 to 299: left-to-right alphabetic scripts (including the European alphabets derived from ancient Greek, the invented Bopomofo and Hangul alphabets, and literary alphabets);
- 300 to 399: alphasyllabic scripts (including the many Brahmic abugidas of South and Southeast Asia);
- 400 to 499: syllabic scripts (including the Linear A and Linear B syllabaries, Cypriot, Hiragana and Katakana, Ethiopic, Canadian Aboriginal syllabics, Cherokee, etc.);
- 500 to 599: ideographic scripts or symbolic systems (including Braille);
- 600 to 699: undeciphered scripts (whose classification is still unknown, such as the Indus script and Rongorongo);
- 700 to 799 and 800 to 899: series not yet used;
- 900 to 999: private-use codes, aliases (none at present) and special codes.
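The series above amount to a lookup on the hundreds digit of the numeric code. The sketch below assumes exactly the ranges and labels given in this article; the names `SERIES` and `series_of` are illustrative only.

```python
# (low, high, label) triples, following the series listed in this article.
SERIES = [
    (0, 99, "hieroglyphic and cuneiform scripts"),
    (100, 199, "right-to-left alphabetic scripts"),
    (200, 299, "left-to-right alphabetic scripts"),
    (300, 399, "alphasyllabic scripts"),
    (400, 499, "syllabic scripts"),
    (500, 599, "ideographic scripts or symbolic systems"),
    (600, 699, "undeciphered scripts"),
    (700, 899, "not yet used"),
    (900, 999, "private-use codes, aliases and special codes"),
]


def series_of(number: int) -> str:
    """Return the label of the series that contains a numeric code."""
    for low, high, label in SERIES:
        if low <= number <= high:
            return label
    raise ValueError(f"numeric code out of range: {number}")


print(series_of(215))  # left-to-right alphabetic scripts
print(series_of(371))  # alphasyllabic scripts
print(series_of(500))  # ideographic scripts or symbolic systems
```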
Composition and assignment of the alphabetic codes
The four-letter alphabetic codes use the basic 26-letter Latin alphabet. Case is not significant in these codes, but the recommended form is one capital letter followed by three lowercase letters. The alphabetic codes are modeled on the names of the scripts for mnemonic reasons. However, stylistic variants of the same script differ, as far as possible, only in their fourth letter. These variants can also be recognized by their neighboring numeric codes in the same series. For example (code = numeric code = French name = English name):
- Latn = 215 = “latin” = “Latin”;
- Latf = 217 = “latin (variante brisée)” = “Latin (Fraktur variant)”;
- Latg = 216 = “latin (variante gaélique)” = “Latin (Gaelic variant)”.
Likewise:
- Geor = 240 = “géorgien (mkhédrouli)” = “Georgian (Mkhedruli)”;
- Geok = 241 = “khoutsouri (assomtavrouli et nouskhouri)” = “Khutsuri (Asomtavruli and Nuskhuri)”.
And also:
- Hani = 500 = “idéogrammes han” = “Han (Hanzi, Kanji, Hanja)”;
- Hans = 501 = “idéogrammes han (variante simplifiée)” = “Han (Simplified variant)”;
- Hant = 502 = “idéogrammes han (variante traditionnelle)” = “Han (Traditional variant)”.
However, two alphabetic codes that begin with the same three letters do not necessarily denote two variants of the same script (which can also be seen from their numeric classification in different series):
- Hani = 500 = “idéogrammes han” = “Han (Hanzi, Kanji, Hanja)”;
- Hano = 371 = “hanounóo” = “Hanunoo (Hanunóo)”.
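The casing rule and the caveat about shared prefixes can be illustrated with two small helpers. This is only a sketch: `normalize` and `may_be_variants` are hypothetical names, and the variant test is a heuristic built from the two hints given above (same first three letters, same hundreds series), not a rule stated by the standard.

```python
from typing import Tuple


def normalize(code: str) -> str:
    """Return the recommended casing: one capital, then three lowercase letters."""
    if not (len(code) == 4 and code.isascii() and code.isalpha()):
        raise ValueError(f"not a four-letter code: {code!r}")
    return code[0].upper() + code[1:].lower()


def may_be_variants(a: Tuple[str, int], b: Tuple[str, int]) -> bool:
    """Heuristic: same three-letter stem AND same hundreds series."""
    (code_a, num_a), (code_b, num_b) = a, b
    same_stem = normalize(code_a)[:3] == normalize(code_b)[:3]
    same_series = num_a // 100 == num_b // 100
    return same_stem and same_series


print(normalize("LATN"))                              # Latn
print(may_be_variants(("Hani", 500), ("Hans", 501)))  # True
print(may_be_variants(("Hani", 500), ("Hano", 371)))  # False
```

The second call shows why the prefix alone is not enough: Hani and Hano share three letters, but their numeric codes fall in different series.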
Special codes
If the standardized scripts are not sufficient, 50 codes are available for users to assign as they wish (the names used are not normative and may be changed):
- Qaaa = 900 = “réservé à l'usage privé (début)” = “Reserved for private use (start)”;
- Qaab = 901 = “réservé à l'usage privé (2e)” = “Reserved for private use (2nd)”;
- …
- Qaaz = 925 = “réservé à l'usage privé (26e)” = “Reserved for private use (26th)”;
- Qaba = 926 = “réservé à l'usage privé (27e)” = “Reserved for private use (27th)”;
- …
- Qabx = 949 = “réservé à l'usage privé (fin)” = “Reserved for private use (end)”.
There are also special codes for unwritten languages (for example, for classifying photographs and video or audio recordings in the collections of media libraries and museums), for cases where a script cannot be given reliably because several are involved (from distinct families, with no more precise predefined code for the set), and for cases where the script has not been specified but could possibly be indicated more precisely with another code:
- Zxxx = 997 = “codet pour les langues non écrites” = “Code for unwritten languages”;
- Zyyy = 998 = “codet pour écriture indéterminée” = “Code for undetermined script”;
- Zzzz = 999 = “codet pour écriture non codée” = “Code for uncoded script”.
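The private-use codes listed above follow a regular pattern: the 50 codes run from Qaaa to Qabx, paired with the numbers 900 to 949. A minimal sketch of that pairing follows; the function and constant names are hypothetical, and the `SPECIAL_CODES` table contains only the three special codes named in this article.

```python
import string
from typing import Tuple

PRIVATE_USE_COUNT = 50        # Qaaa .. Qabx
FIRST_PRIVATE_NUMBER = 900    # numeric code of Qaaa

# Special codes named in this article.
SPECIAL_CODES = {"Zxxx": 997, "Zyyy": 998, "Zzzz": 999}


def private_use_code(index: int) -> Tuple[str, int]:
    """Map a 1-based index (1..50) to (alphabetic code, numeric code)."""
    if not 1 <= index <= PRIVATE_USE_COUNT:
        raise ValueError(f"index must be in 1..{PRIVATE_USE_COUNT}: {index}")
    offset = index - 1
    third, fourth = divmod(offset, 26)  # the last two letters count in base 26
    letters = string.ascii_lowercase
    return "Qa" + letters[third] + letters[fourth], FIRST_PRIVATE_NUMBER + offset


PRIVATE_USE = {private_use_code(i)[0] for i in range(1, PRIVATE_USE_COUNT + 1)}


def is_private_use(code: str) -> bool:
    """Case-insensitive membership test for the private-use range."""
    return code.capitalize() in PRIVATE_USE


print(private_use_code(1))   # ('Qaaa', 900)
print(private_use_code(26))  # ('Qaaz', 925)
print(private_use_code(27))  # ('Qaba', 926)
print(private_use_code(50))  # ('Qabx', 949)
```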
History
This list of script codes and names was created and is maintained by Michael Everson, who is also a member of the Unicode Technical Committee (UTC). The text of ISO 15924 was first approved on January 9, 2004, which fixed the general principles for defining the codes.
The first list of codes, already very complete, was published online on May 1, 2004 on the Unicode Consortium website. It included, among others, all the scripts used or defined at the time in the Unicode Standard 4.0 and in the ISO/IEC 10646 standard. A large number of corrections followed in the subsequent weeks, and the list was finalized on May 29, 2004.
Since then, new scripts have been added regularly, either to meet the needs of scripts undergoing standardization in ISO/IEC 10646 and Unicode, or for bibliographic uses, for example for scripts that are not yet standardized and still require study.
Relationships to other standards and recommendations
Relationship to the ISO 639 language codes
In addition, the ISO 15924 alphabetic script codes begin, as far as possible, with the same letters as the three-letter language codes of ISO 639-2 or its extension ISO 639-3 (which covers a much larger list of languages), when the script and the language have the same name. For example:
- language name = “Latin” (French: “latin”); ISO 639-2 alphabetic language code = lat;
- script name = “Latin” (French: “latin”); the names are the same, so the ISO 15924 alphabetic script code = Latn.
The future ISO 639-6 standard, currently in preparation, which should extend language codes to four letters (in order to cover a greater number of language variants), follows the same principle and, where possible, reuses the codes already adopted in ISO 15924 for scripts that share their name with a language, in order to preserve compatibility with the current RFC 4646 (BCP 47):
- script name = “Latin” (French: “latin”); ISO 15924 alphabetic script code = Latn;
- language name = “Latin” (French: “latin”); the names are the same, so the ISO/CD 639-6 alphabetic language code = latn.
Locale designation according to RFC 4646, with ISO 639 and ISO 3166
In practice, the alphabetic codes are preferred in internationalized applications that need to localize data. These are the codes used in locale identifiers, together with the alphabetic language codes of ISO 639 and the alphabetic or numeric country and region codes of ISO 3166.
Applications designate locales in accordance with RFC 4646, which also takes the ISO 15924 script codes into account, in addition to the ISO 639 language codes and the ISO 3166 country and region codes.
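To make the combination of the three code sets concrete, here is a deliberately simplified sketch that splits a tag of the form language-script-region into its parts. It is not a full RFC 4646 parser: it ignores extended language subtags, variants, extensions and private-use subtags, and the names `Locale` and `parse_tag` as well as the sample tags are illustrative assumptions.

```python
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    language: str            # ISO 639 code, lowercase
    script: Optional[str]    # ISO 15924 code, e.g. "Latn"
    region: Optional[str]    # ISO 3166 code: two letters or three digits


def parse_tag(tag: str) -> Locale:
    """Parse a simplified language[-script][-region] tag."""
    if not tag.isascii():
        raise ValueError(f"non-ASCII tag: {tag!r}")
    parts = tag.split("-")
    language = parts[0].lower()
    if not (language.isalpha() and 2 <= len(language) <= 3):
        raise ValueError(f"unsupported language subtag: {parts[0]!r}")
    rest = parts[1:]
    script = region = None
    # A four-letter alphabetic subtag right after the language is the script.
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        script = rest.pop(0).capitalize()
    # The region is two letters or three digits.
    if rest and ((len(rest[0]) == 2 and rest[0].isalpha())
                 or (len(rest[0]) == 3 and rest[0].isdigit())):
        region = rest.pop(0).upper()
    if rest:
        raise ValueError(f"unsupported subtags: {rest!r}")
    return Locale(language, script, region)


print(parse_tag("zh-Hant-TW"))  # Locale(language='zh', script='Hant', region='TW')
print(parse_tag("SR-LATN"))     # Locale(language='sr', script='Latn', region=None)
```

The script subtag is recognized purely by its shape (four letters), which is what allows it to sit unambiguously between the language and the region.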
Differences between the names and those of the ISO/IEC 10646 standard
There is no exact one-to-one correspondence between the English and French script names defined in ISO 15924 and the English and French designations used in the normative names of the characters and character blocks allocated in ISO/IEC 10646 (and therefore also in Unicode).
However, future character blocks and characters standardized in ISO/IEC 10646 (and therefore also in Unicode) will, where possible, be named in accordance with ISO 15924.
Differences between the alphabetic codes and those of the Unicode Standard
Similarly, there is no exact one-to-one correspondence between the alphabetic script codes standardized in ISO 15924 and the script codes used in the Unicode character property tables. ISO 15924 contains additional entries that draw distinctions useful for bibliographic purposes between scripts that were unified in the ISO and Unicode character encoding standards. It therefore has distinct codes and names for scripts that Unicode has unified into one (treating them as typographic variants, with no difference in encoding at the level of the characters or of their normative or informative properties).
In addition, because ISO 15924 was created after the Unicode Standard, the format of the ISO 15924 alphabetic codes can differ from the normative codes used in the Unicode property tables (which can be longer and contain underscores).
For information only, ISO 15924 defines an alias (or “property value alias”) for the standardized scripts, so that the correspondence with the character properties defined in the Unicode Standard is known where such a difference exists. Since ISO 15924 was published, the Unicode Consortium has committed to no longer defining new codes that differ from those of ISO 15924, and so uses the ISO 15924 alphabetic codes whenever possible. This is why not all Unicode property aliases are listed in the ISO 15924 tables: the codes used can be found in the property files of the Unicode Standard itself, and Unicode has added property value aliases that now make it possible to use only ISO 15924 codes in Unicode-conformant applications.
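The difference in format between the two sets of codes can be shown with a tiny lookup table. The four pairs below are given as I understand them from the Unicode property value alias data and should be treated as an illustrative sample, not as an authoritative or complete table; the names `UNICODE_LONG_NAME` and `to_iso15924` are hypothetical.

```python
# ISO 15924 code -> Unicode Script property long name (illustrative sample).
UNICODE_LONG_NAME = {
    "Latn": "Latin",
    "Cyrl": "Cyrillic",
    "Ital": "Old_Italic",            # longer, with an underscore
    "Cans": "Canadian_Aboriginal",   # longer, with an underscore
}

# Reverse table: Unicode long name -> ISO 15924 code.
ISO_CODE = {long_name: code for code, long_name in UNICODE_LONG_NAME.items()}


def to_iso15924(unicode_script_name: str) -> str:
    """Translate a Unicode long script name into its four-letter code."""
    try:
        return ISO_CODE[unicode_script_name]
    except KeyError:
        raise KeyError(f"no alias known for {unicode_script_name!r}") from None


print(to_iso15924("Old_Italic"))  # Ital
print(UNICODE_LONG_NAME["Cans"])  # Canadian_Aboriginal
```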
See also
Internal links
- ISO 639 (language codes)
- ISO 3166 (country and region codes)
- ISO/IEC 10646, Unicode (character encoding)
- International Phonetic Alphabet (IPA)
External links
- http://www.unicode.org/iso15924/ - Registration and maintenance authority for the codes for the representation of names of scripts:
- ISO 15924 standard, official online version (hosted on the Unicode Consortium site);
- Table 1. Alphabetical list of the four-letter script codes;
- Table 2. Numerical list of the script codes;
- Table 3. Alphabetical list of the script names in English;
- Table 4. Alphabetical list of the script names in French;
- RFC 4646 (BCP 47), recommendations for choosing language and locale identifiers in applications, IETF, January 2001;
- Erratum for RFC 4646, corrections made after publication (last known correction, July 2002).