Mate Barany 9413c19cc1 Update CLDR to v46
New languages added with v46:
- Kara-Kalpak
- Swampy Cree

Several new Chinese-language locales have been added, including one
using Latin script. This invalidated some prior QLocale tests, which
have been adjusted to fit.

Some obsolete time-zone identifiers are now treated as deprecated
aliases. These have lost their AnyTerritory association, implying
changes to QTimeZone tests.

Many redundant likely sub-tag rules for unspecified language have been
dropped, in favor of simpler rules.

[ChangeLog][Third-Party Code] Updated CLDR data, used by QLocale, to
v46.

Task-number: QTBUG-130877
Pick-to: 6.8
Change-Id: I92cf210422c7759dd829a7ca2f845d20e263d25b
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
(cherry picked from commit e316276b76b9c3768ca4e19a04d03308ef21fe12)
Reviewed-by: Qt Cherry-pick Bot <cherrypick_bot@qt-project.org>
2025-01-06 17:56:36 +00:00

# Copyright (C) 2021 The Qt Company Ltd.
# SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GPL-3.0-only WITH Qt-GPL-exception-1.0
"""Assorted enumerations implicated in public API.

The numberings of these enumerations can only change at major
versions. When new CLDR data implies adding entries, the new ones must
go after all existing ones. See also zonedata.py for enumerations
related to timezones and CLDR, which can more freely be changed
between versions.

A run of cldr2qlocalexml.py will produce output reporting any
language, script and territory codes it sees, in data, for which it
can find a name (taken always from en.xml) that could potentially be
used. There is no point adding a mapping for such a code unless the
CLDR's common/main/ contains an XML file for at least one locale that
exercises it (and little point, even then, absent substantial data,
ignoring draft='unconfirmed' entries).

Each *_map reflects the current values of its enums in qlocale.h; if
new XML language files are available in CLDR, these languages and
territories need to be *appended* to this list (for compatibility
between versions). Include any spaces and dashes present in names
(they'll be squished out for the enum entries) in *_map, but use the
squished forms of names in the *_aliases mappings. The squishing also
turns the first letter of each word into a capital so you can safely
preserve the case of en.xml's name; but omit (or replace with space)
any punctuation aside from dashes and map any accented letters to
their un-accented plain ASCII. The two tables, for each enum, have
the forms:

* map { Numeric value: ("Proper name", "ISO code") }
* alias { "OldName": "CurrentName" }

TODO: add support for marking entries as deprecated from a specified
version. For aliases, that merely deprecates the name. Where we have a
name for which CLDR offers no data, we may also want to deprecate
entries in the map - although they may be worth keeping for the
benefit of QLocaleSelector (see QTBUG-112765), if other
locale-specific resources might have use of them.

For a new major version (and only then), we can change the numbering,
so re-sort each list into alphabetic order (e.g. using sort -k2); but
keep the Any and C entries first. That's why those are offset with a
blank line, below. After doing that, regenerate locale data as usual;
this will cause a binary-incompatible change.

Note on 'macrolanguage' comments: see QTBUG-107781 and 'ISO 639
macrolanguage' on Wikipedia. A 'macrolanguage' is (loosely speaking) a
group of languages so closely related to one another that they could
also be regarded as divergent dialects of the macrolanguage. In some
cases this may mean a resource (such as translation or text-to-speech
data) may describe itself as pertaining to the macrolanguage, implying
its suitability for use in any of the languages within the
macrolanguage. For example, no_NO might be used for a generic
Norwegian resource, embracing both nb_NO and nn_NO.
"""
language_map = {
0: ("AnyLanguage", " "),
1: ("C", " "),

2: ("Abkhazian", "ab"),
3: ("Afar", "aa"),
4: ("Afrikaans", "af"),
5: ("Aghem", "agq"),
6: ("Akan", "ak"), # macrolanguage
7: ("Akkadian", "akk"),
8: ("Akoose", "bss"),
9: ("Albanian", "sq"), # macrolanguage
10: ("American Sign Language", "ase"),
11: ("Amharic", "am"),
12: ("Ancient Egyptian", "egy"),
13: ("Ancient Greek", "grc"),
14: ("Arabic", "ar"), # macrolanguage
15: ("Aragonese", "an"),
16: ("Aramaic", "arc"),
17: ("Armenian", "hy"),
18: ("Assamese", "as"),
19: ("Asturian", "ast"),
20: ("Asu", "asa"),
21: ("Atsam", "cch"),
22: ("Avaric", "av"),
23: ("Avestan", "ae"),
24: ("Aymara", "ay"), # macrolanguage
25: ("Azerbaijani", "az"), # macrolanguage
26: ("Bafia", "ksf"),
27: ("Balinese", "ban"),
28: ("Bambara", "bm"),
29: ("Bamun", "bax"),
30: ("Bangla", "bn"),
31: ("Basaa", "bas"),
32: ("Bashkir", "ba"),
33: ("Basque", "eu"),
34: ("Batak Toba", "bbc"),
35: ("Belarusian", "be"),
36: ("Bemba", "bem"),
37: ("Bena", "bez"),
38: ("Bhojpuri", "bho"),
39: ("Bislama", "bi"),
40: ("Blin", "byn"),
41: ("Bodo", "brx"),
42: ("Bosnian", "bs"),
43: ("Breton", "br"),
44: ("Buginese", "bug"),
45: ("Bulgarian", "bg"),
46: ("Burmese", "my"),
47: ("Cantonese", "yue"),
48: ("Catalan", "ca"),
49: ("Cebuano", "ceb"),
50: ("Central Atlas Tamazight", "tzm"),
51: ("Central Kurdish", "ckb"),
52: ("Chakma", "ccp"),
53: ("Chamorro", "ch"),
54: ("Chechen", "ce"),
55: ("Cherokee", "chr"),
56: ("Chickasaw", "cic"),
57: ("Chiga", "cgg"),
58: ("Chinese", "zh"), # macrolanguage
59: ("Church", "cu"), # macrolanguage
60: ("Chuvash", "cv"),
61: ("Colognian", "ksh"),
62: ("Coptic", "cop"),
63: ("Cornish", "kw"),
64: ("Corsican", "co"),
65: ("Cree", "cr"), # macrolanguage
66: ("Croatian", "hr"),
67: ("Czech", "cs"),
68: ("Danish", "da"),
69: ("Divehi", "dv"),
70: ("Dogri", "doi"), # macrolanguage
71: ("Duala", "dua"),
72: ("Dutch", "nl"),
73: ("Dzongkha", "dz"),
74: ("Embu", "ebu"),
75: ("English", "en"),
76: ("Erzya", "myv"),
77: ("Esperanto", "eo"),
78: ("Estonian", "et"), # macrolanguage
79: ("Ewe", "ee"),
80: ("Ewondo", "ewo"),
81: ("Faroese", "fo"),
82: ("Fijian", "fj"),
83: ("Filipino", "fil"),
84: ("Finnish", "fi"),
85: ("French", "fr"),
86: ("Friulian", "fur"),
87: ("Fulah", "ff"), # macrolanguage
88: ("Gaelic", "gd"),
89: ("Ga", "gaa"),
90: ("Galician", "gl"),
91: ("Ganda", "lg"),
92: ("Geez", "gez"),
93: ("Georgian", "ka"),
94: ("German", "de"),
95: ("Gothic", "got"),
96: ("Greek", "el"),
97: ("Guarani", "gn"), # macrolanguage
98: ("Gujarati", "gu"),
99: ("Gusii", "guz"),
100: ("Haitian", "ht"),
101: ("Hausa", "ha"),
102: ("Hawaiian", "haw"),
103: ("Hebrew", "he"),
104: ("Herero", "hz"),
105: ("Hindi", "hi"),
106: ("Hiri Motu", "ho"),
107: ("Hungarian", "hu"),
108: ("Icelandic", "is"),
109: ("Ido", "io"),
110: ("Igbo", "ig"),
111: ("Inari Sami", "smn"),
112: ("Indonesian", "id"),
113: ("Ingush", "inh"),
114: ("Interlingua", "ia"),
115: ("Interlingue", "ie"),
116: ("Inuktitut", "iu"), # macrolanguage
117: ("Inupiaq", "ik"), # macrolanguage
118: ("Irish", "ga"),
119: ("Italian", "it"),
120: ("Japanese", "ja"),
121: ("Javanese", "jv"),
122: ("Jju", "kaj"),
123: ("Jola-Fonyi", "dyo"),
124: ("Kabuverdianu", "kea"),
125: ("Kabyle", "kab"),
126: ("Kako", "kkj"),
127: ("Kalaallisut", "kl"),
128: ("Kalenjin", "kln"),
129: ("Kamba", "kam"),
130: ("Kannada", "kn"),
131: ("Kanuri", "kr"), # macrolanguage
132: ("Kashmiri", "ks"),
133: ("Kazakh", "kk"),
134: ("Kenyang", "ken"),
135: ("Khmer", "km"),
136: ("Kiche", "quc"),
137: ("Kikuyu", "ki"),
138: ("Kinyarwanda", "rw"),
139: ("Komi", "kv"), # macrolanguage
140: ("Kongo", "kg"), # macrolanguage
141: ("Konkani", "kok"),
142: ("Korean", "ko"),
143: ("Koro", "kfo"),
144: ("Koyraboro Senni", "ses"),
145: ("Koyra Chiini", "khq"),
146: ("Kpelle", "kpe"),
147: ("Kuanyama", "kj"),
148: ("Kurdish", "ku"), # macrolanguage
149: ("Kwasio", "nmg"),
150: ("Kyrgyz", "ky"),
151: ("Lakota", "lkt"),
152: ("Langi", "lag"),
153: ("Lao", "lo"),
154: ("Latin", "la"),
155: ("Latvian", "lv"), # macrolanguage
156: ("Lezghian", "lez"),
157: ("Limburgish", "li"),
158: ("Lingala", "ln"),
159: ("Literary Chinese", "lzh"),
160: ("Lithuanian", "lt"),
161: ("Lojban", "jbo"),
162: ("Lower Sorbian", "dsb"),
163: ("Low German", "nds"),
164: ("Luba-Katanga", "lu"),
165: ("Lule Sami", "smj"),
166: ("Luo", "luo"),
167: ("Luxembourgish", "lb"),
168: ("Luyia", "luy"),
169: ("Macedonian", "mk"),
170: ("Machame", "jmc"),
171: ("Maithili", "mai"),
172: ("Makhuwa-Meetto", "mgh"),
173: ("Makonde", "kde"),
174: ("Malagasy", "mg"), # macrolanguage
175: ("Malayalam", "ml"),
176: ("Malay", "ms"), # macrolanguage
177: ("Maltese", "mt"),
178: ("Mandingo", "man"), # macrolanguage
179: ("Manipuri", "mni"),
180: ("Manx", "gv"),
181: ("Maori", "mi"),
182: ("Mapuche", "arn"),
183: ("Marathi", "mr"),
184: ("Marshallese", "mh"),
185: ("Masai", "mas"),
186: ("Mazanderani", "mzn"),
187: ("Mende", "men"),
188: ("Meru", "mer"),
189: ("Meta", "mgo"),
190: ("Mohawk", "moh"),
191: ("Mongolian", "mn"), # macrolanguage
192: ("Morisyen", "mfe"),
193: ("Mundang", "mua"),
194: ("Muscogee", "mus"),
195: ("Nama", "naq"),
196: ("Nauru", "na"),
197: ("Navajo", "nv"),
198: ("Ndonga", "ng"),
199: ("Nepali", "ne"), # macrolanguage
200: ("Newari", "new"),
201: ("Ngiemboon", "nnh"),
202: ("Ngomba", "jgo"),
203: ("Nigerian Pidgin", "pcm"),
204: ("Nko", "nqo"),
205: ("Northern Luri", "lrc"),
206: ("Northern Sami", "se"),
207: ("Northern Sotho", "nso"),
208: ("North Ndebele", "nd"),
209: ("Norwegian Bokmal", "nb"),
210: ("Norwegian Nynorsk", "nn"),
211: ("Nuer", "nus"),
212: ("Nyanja", "ny"),
213: ("Nyankole", "nyn"),
214: ("Occitan", "oc"),
215: ("Odia", "or"), # macrolanguage
216: ("Ojibwa", "oj"), # macrolanguage
217: ("Old Irish", "sga"),
218: ("Old Norse", "non"),
219: ("Old Persian", "peo"),
220: ("Oromo", "om"), # macrolanguage
221: ("Osage", "osa"),
222: ("Ossetic", "os"),
223: ("Pahlavi", "pal"),
224: ("Palauan", "pau"),
225: ("Pali", "pi"), # macrolanguage
226: ("Papiamento", "pap"),
227: ("Pashto", "ps"), # macrolanguage
228: ("Persian", "fa"), # macrolanguage
229: ("Phoenician", "phn"),
230: ("Polish", "pl"),
231: ("Portuguese", "pt"),
232: ("Prussian", "prg"),
233: ("Punjabi", "pa"),
234: ("Quechua", "qu"), # macrolanguage
235: ("Romanian", "ro"),
236: ("Romansh", "rm"),
237: ("Rombo", "rof"),
238: ("Rundi", "rn"),
239: ("Russian", "ru"),
240: ("Rwa", "rwk"),
241: ("Saho", "ssy"),
242: ("Sakha", "sah"),
243: ("Samburu", "saq"),
244: ("Samoan", "sm"),
245: ("Sango", "sg"),
246: ("Sangu", "sbp"),
247: ("Sanskrit", "sa"),
248: ("Santali", "sat"),
249: ("Sardinian", "sc"), # macrolanguage
250: ("Saurashtra", "saz"),
251: ("Sena", "seh"),
252: ("Serbian", "sr"),
253: ("Shambala", "ksb"),
254: ("Shona", "sn"),
255: ("Sichuan Yi", "ii"),
256: ("Sicilian", "scn"),
257: ("Sidamo", "sid"),
258: ("Silesian", "szl"),
259: ("Sindhi", "sd"),
260: ("Sinhala", "si"),
261: ("Skolt Sami", "sms"),
262: ("Slovak", "sk"),
263: ("Slovenian", "sl"),
264: ("Soga", "xog"),
265: ("Somali", "so"),
266: ("Southern Kurdish", "sdh"),
267: ("Southern Sami", "sma"),
268: ("Southern Sotho", "st"),
269: ("South Ndebele", "nr"),
270: ("Spanish", "es"),
271: ("Standard Moroccan Tamazight", "zgh"),
272: ("Sundanese", "su"),
273: ("Swahili", "sw"), # macrolanguage
274: ("Swati", "ss"),
275: ("Swedish", "sv"),
276: ("Swiss German", "gsw"),
277: ("Syriac", "syr"),
278: ("Tachelhit", "shi"),
279: ("Tahitian", "ty"),
280: ("Tai Dam", "blt"),
281: ("Taita", "dav"),
282: ("Tajik", "tg"),
283: ("Tamil", "ta"),
284: ("Taroko", "trv"),
285: ("Tasawaq", "twq"),
286: ("Tatar", "tt"),
287: ("Telugu", "te"),
288: ("Teso", "teo"),
289: ("Thai", "th"),
290: ("Tibetan", "bo"),
291: ("Tigre", "tig"),
292: ("Tigrinya", "ti"),
293: ("Tokelau", "tkl"),
294: ("Tok Pisin", "tpi"),
295: ("Tongan", "to"),
296: ("Tsonga", "ts"),
297: ("Tswana", "tn"),
298: ("Turkish", "tr"),
299: ("Turkmen", "tk"),
300: ("Tuvalu", "tvl"),
301: ("Tyap", "kcg"),
302: ("Ugaritic", "uga"),
303: ("Ukrainian", "uk"),
304: ("Upper Sorbian", "hsb"),
305: ("Urdu", "ur"),
306: ("Uyghur", "ug"),
307: ("Uzbek", "uz"), # macrolanguage
308: ("Vai", "vai"),
309: ("Venda", "ve"),
310: ("Vietnamese", "vi"),
311: ("Volapuk", "vo"),
312: ("Vunjo", "vun"),
313: ("Walloon", "wa"),
314: ("Walser", "wae"),
315: ("Warlpiri", "wbp"),
316: ("Welsh", "cy"),
317: ("Western Balochi", "bgn"),
318: ("Western Frisian", "fy"),
319: ("Wolaytta", "wal"),
320: ("Wolof", "wo"),
321: ("Xhosa", "xh"),
322: ("Yangben", "yav"),
323: ("Yiddish", "yi"), # macrolanguage
324: ("Yoruba", "yo"),
325: ("Zarma", "dje"),
326: ("Zhuang", "za"), # macrolanguage
327: ("Zulu", "zu"),
# added in CLDR v40
328: ("Kaingang", "kgp"),
329: ("Nheengatu", "yrl"),
# added in CLDR v42
330: ("Haryanvi", "bgc"),
331: ("Northern Frisian", "frr"),
332: ("Rajasthani", "raj"),
333: ("Moksha", "mdf"),
334: ("Toki Pona", "tok"),
335: ("Pijin", "pis"),
336: ("Obolo", "ann"),
# added in CLDR v43
337: ("Baluchi", "bal"),
338: ("Ligurian", "lij"),
339: ("Rohingya", "rhg"),
340: ("Torwali", "trw"),
# added in CLDR v44
341: ("Anii", "blo"),
342: ("Kangri", "xnr"),
343: ("Venetian", "vec"),
# added in CLDR v45
344: ("Kuvi", "kxv"),
# added in CLDR v46
345: ("Kara-Kalpak", "kaa"),
346: ("Swampy Cree", "csw"),
}
# Don't add languages just because they exist; check CLDR does provide
# substantial data for locales using it; and check, once added, they
# don't show up in cldr2qlocalexml.py's unused listing. Do also check
# the data's draft status; if it's (nearly) all unconfirmed, leave it.
language_aliases = {
# Renamings prior to Qt 6.0 (CLDR v37):
'Afan': 'Oromo',
'Byelorussian': 'Belarusian',
'Bhutani': 'Dzongkha',
'Cambodian': 'Khmer',
'Kurundi': 'Rundi',
'RhaetoRomance': 'Romansh',
'Chewa': 'Nyanja',
'Frisian': 'WesternFrisian',
'Uigur': 'Uyghur',
# Renamings:
'Uighur': 'Uyghur',
'Kwanyama': 'Kuanyama',
'Inupiak': 'Inupiaq',
'Bengali': 'Bangla',
'CentralMoroccoTamazight': 'CentralAtlasTamazight',
'Greenlandic': 'Kalaallisut',
'Walamo': 'Wolaytta',
'Navaho': 'Navajo',
'Oriya': 'Odia',
'Kirghiz': 'Kyrgyz'
}
territory_map = {
0: ("AnyTerritory", "ZZ"),
1: ("Afghanistan", "AF"),
2: ("Aland Islands", "AX"),
3: ("Albania", "AL"),
4: ("Algeria", "DZ"),
5: ("American Samoa", "AS"),
6: ("Andorra", "AD"),
7: ("Angola", "AO"),
8: ("Anguilla", "AI"),
9: ("Antarctica", "AQ"),
10: ("Antigua and Barbuda", "AG"),
11: ("Argentina", "AR"),
12: ("Armenia", "AM"),
13: ("Aruba", "AW"),
14: ("Ascension Island", "AC"),
15: ("Australia", "AU"),
16: ("Austria", "AT"),
17: ("Azerbaijan", "AZ"),
18: ("Bahamas", "BS"),
19: ("Bahrain", "BH"),
20: ("Bangladesh", "BD"),
21: ("Barbados", "BB"),
22: ("Belarus", "BY"),
23: ("Belgium", "BE"),
24: ("Belize", "BZ"),
25: ("Benin", "BJ"),
26: ("Bermuda", "BM"),
27: ("Bhutan", "BT"),
28: ("Bolivia", "BO"),
29: ("Bosnia and Herzegovina", "BA"),
30: ("Botswana", "BW"),
31: ("Bouvet Island", "BV"),
32: ("Brazil", "BR"),
33: ("British Indian Ocean Territory", "IO"),
34: ("British Virgin Islands", "VG"),
35: ("Brunei", "BN"),
36: ("Bulgaria", "BG"),
37: ("Burkina Faso", "BF"),
38: ("Burundi", "BI"),
39: ("Cambodia", "KH"),
40: ("Cameroon", "CM"),
41: ("Canada", "CA"),
42: ("Canary Islands", "IC"),
43: ("Cape Verde", "CV"),
44: ("Caribbean Netherlands", "BQ"),
45: ("Cayman Islands", "KY"),
46: ("Central African Republic", "CF"),
47: ("Ceuta and Melilla", "EA"),
48: ("Chad", "TD"),
49: ("Chile", "CL"),
50: ("China", "CN"),
51: ("Christmas Island", "CX"),
52: ("Clipperton Island", "CP"),
53: ("Cocos Islands", "CC"),
54: ("Colombia", "CO"),
55: ("Comoros", "KM"),
56: ("Congo - Brazzaville", "CG"),
57: ("Congo - Kinshasa", "CD"),
58: ("Cook Islands", "CK"),
59: ("Costa Rica", "CR"),
60: ("Croatia", "HR"),
61: ("Cuba", "CU"),
62: ("Curacao", "CW"),
63: ("Cyprus", "CY"),
64: ("Czechia", "CZ"),
65: ("Denmark", "DK"),
66: ("Diego Garcia", "DG"),
67: ("Djibouti", "DJ"),
68: ("Dominica", "DM"),
69: ("Dominican Republic", "DO"),
70: ("Ecuador", "EC"),
71: ("Egypt", "EG"),
72: ("El Salvador", "SV"),
73: ("Equatorial Guinea", "GQ"),
74: ("Eritrea", "ER"),
75: ("Estonia", "EE"),
76: ("Eswatini", "SZ"),
77: ("Ethiopia", "ET"),
78: ("Europe", "150"),
79: ("European Union", "EU"),
80: ("Falkland Islands", "FK"),
81: ("Faroe Islands", "FO"),
82: ("Fiji", "FJ"),
83: ("Finland", "FI"),
84: ("France", "FR"),
85: ("French Guiana", "GF"),
86: ("French Polynesia", "PF"),
87: ("French Southern Territories", "TF"),
88: ("Gabon", "GA"),
89: ("Gambia", "GM"),
90: ("Georgia", "GE"),
91: ("Germany", "DE"),
92: ("Ghana", "GH"),
93: ("Gibraltar", "GI"),
94: ("Greece", "GR"),
95: ("Greenland", "GL"),
96: ("Grenada", "GD"),
97: ("Guadeloupe", "GP"),
98: ("Guam", "GU"),
99: ("Guatemala", "GT"),
100: ("Guernsey", "GG"),
101: ("Guinea-Bissau", "GW"),
102: ("Guinea", "GN"),
103: ("Guyana", "GY"),
104: ("Haiti", "HT"),
105: ("Heard and McDonald Islands", "HM"),
106: ("Honduras", "HN"),
107: ("Hong Kong", "HK"),
108: ("Hungary", "HU"),
109: ("Iceland", "IS"),
110: ("India", "IN"),
111: ("Indonesia", "ID"),
112: ("Iran", "IR"),
113: ("Iraq", "IQ"),
114: ("Ireland", "IE"),
115: ("Isle of Man", "IM"),
116: ("Israel", "IL"),
117: ("Italy", "IT"),
# Officially Côte d'Ivoire, which we'd need to map to CotedIvoire
# or CoteDIvoire, either failing to make the d' separate from Cote
# or messing with its case. So stick with Ivory Coast:
118: ("Ivory Coast", "CI"),
119: ("Jamaica", "JM"),
120: ("Japan", "JP"),
121: ("Jersey", "JE"),
122: ("Jordan", "JO"),
123: ("Kazakhstan", "KZ"),
124: ("Kenya", "KE"),
125: ("Kiribati", "KI"),
126: ("Kosovo", "XK"),
127: ("Kuwait", "KW"),
128: ("Kyrgyzstan", "KG"),
129: ("Laos", "LA"),
130: ("Latin America", "419"),
131: ("Latvia", "LV"),
132: ("Lebanon", "LB"),
133: ("Lesotho", "LS"),
134: ("Liberia", "LR"),
135: ("Libya", "LY"),
136: ("Liechtenstein", "LI"),
137: ("Lithuania", "LT"),
138: ("Luxembourg", "LU"),
139: ("Macao", "MO"),
140: ("Macedonia", "MK"),
141: ("Madagascar", "MG"),
142: ("Malawi", "MW"),
143: ("Malaysia", "MY"),
144: ("Maldives", "MV"),
145: ("Mali", "ML"),
146: ("Malta", "MT"),
147: ("Marshall Islands", "MH"),
148: ("Martinique", "MQ"),
149: ("Mauritania", "MR"),
150: ("Mauritius", "MU"),
151: ("Mayotte", "YT"),
152: ("Mexico", "MX"),
153: ("Micronesia", "FM"),
154: ("Moldova", "MD"),
155: ("Monaco", "MC"),
156: ("Mongolia", "MN"),
157: ("Montenegro", "ME"),
158: ("Montserrat", "MS"),
159: ("Morocco", "MA"),
160: ("Mozambique", "MZ"),
161: ("Myanmar", "MM"),
162: ("Namibia", "NA"),
163: ("Nauru", "NR"),
164: ("Nepal", "NP"),
165: ("Netherlands", "NL"),
166: ("New Caledonia", "NC"),
167: ("New Zealand", "NZ"),
168: ("Nicaragua", "NI"),
169: ("Nigeria", "NG"),
170: ("Niger", "NE"),
171: ("Niue", "NU"),
172: ("Norfolk Island", "NF"),
173: ("Northern Mariana Islands", "MP"),
174: ("North Korea", "KP"),
175: ("Norway", "NO"),
176: ("Oman", "OM"),
177: ("Outlying Oceania", "QO"),
178: ("Pakistan", "PK"),
179: ("Palau", "PW"),
180: ("Palestinian Territories", "PS"),
181: ("Panama", "PA"),
182: ("Papua New Guinea", "PG"),
183: ("Paraguay", "PY"),
184: ("Peru", "PE"),
185: ("Philippines", "PH"),
186: ("Pitcairn", "PN"),
187: ("Poland", "PL"),
188: ("Portugal", "PT"),
189: ("Puerto Rico", "PR"),
190: ("Qatar", "QA"),
191: ("Reunion", "RE"),
192: ("Romania", "RO"),
193: ("Russia", "RU"),
194: ("Rwanda", "RW"),
195: ("Saint Barthelemy", "BL"),
196: ("Saint Helena", "SH"),
197: ("Saint Kitts and Nevis", "KN"),
198: ("Saint Lucia", "LC"),
199: ("Saint Martin", "MF"),
200: ("Saint Pierre and Miquelon", "PM"),
201: ("Saint Vincent and Grenadines", "VC"),
202: ("Samoa", "WS"),
203: ("San Marino", "SM"),
204: ("Sao Tome and Principe", "ST"),
205: ("Saudi Arabia", "SA"),
206: ("Senegal", "SN"),
207: ("Serbia", "RS"),
208: ("Seychelles", "SC"),
209: ("Sierra Leone", "SL"),
210: ("Singapore", "SG"),
211: ("Sint Maarten", "SX"),
212: ("Slovakia", "SK"),
213: ("Slovenia", "SI"),
214: ("Solomon Islands", "SB"),
215: ("Somalia", "SO"),
216: ("South Africa", "ZA"),
217: ("South Georgia and South Sandwich Islands", "GS"),
218: ("South Korea", "KR"),
219: ("South Sudan", "SS"),
220: ("Spain", "ES"),
221: ("Sri Lanka", "LK"),
222: ("Sudan", "SD"),
223: ("Suriname", "SR"),
224: ("Svalbard and Jan Mayen", "SJ"),
225: ("Sweden", "SE"),
226: ("Switzerland", "CH"),
227: ("Syria", "SY"),
228: ("Taiwan", "TW"),
229: ("Tajikistan", "TJ"),
230: ("Tanzania", "TZ"),
231: ("Thailand", "TH"),
232: ("Timor-Leste", "TL"),
233: ("Togo", "TG"),
234: ("Tokelau", "TK"),
235: ("Tonga", "TO"),
236: ("Trinidad and Tobago", "TT"),
237: ("Tristan da Cunha", "TA"),
238: ("Tunisia", "TN"),
239: ("Turkey", "TR"),
240: ("Turkmenistan", "TM"),
241: ("Turks and Caicos Islands", "TC"),
242: ("Tuvalu", "TV"),
243: ("Uganda", "UG"),
244: ("Ukraine", "UA"),
245: ("United Arab Emirates", "AE"),
246: ("United Kingdom", "GB"),
247: ("United States Outlying Islands", "UM"),
248: ("United States", "US"),
249: ("United States Virgin Islands", "VI"),
250: ("Uruguay", "UY"),
251: ("Uzbekistan", "UZ"),
252: ("Vanuatu", "VU"),
253: ("Vatican City", "VA"),
254: ("Venezuela", "VE"),
255: ("Vietnam", "VN"),
256: ("Wallis and Futuna", "WF"),
257: ("Western Sahara", "EH"),
258: ("world", "001"),
259: ("Yemen", "YE"),
260: ("Zambia", "ZM"),
261: ("Zimbabwe", "ZW"),
}
territory_aliases = {
# Renamings prior to Qt 6.0 (CLDR v37):
'DemocraticRepublicOfCongo': 'CongoKinshasa',
'PeoplesRepublicOfCongo': 'CongoBrazzaville',
'DemocraticRepublicOfKorea': 'NorthKorea',
'RepublicOfKorea': 'SouthKorea',
'RussianFederation': 'Russia',
'SyrianArabRepublic': 'Syria',
'LatinAmericaAndTheCaribbean': 'LatinAmerica',
# Renamings:
'EastTimor': 'TimorLeste',
'Bonaire': 'CaribbeanNetherlands',
'Macau': 'Macao',
'SouthGeorgiaAndTheSouthSandwichIslands': 'SouthGeorgiaAndSouthSandwichIslands',
'WallisAndFutunaIslands': 'WallisAndFutuna',
'SaintVincentAndTheGrenadines': 'SaintVincentAndGrenadines',
'BosniaAndHerzegowina': 'BosniaAndHerzegovina',
'SvalbardAndJanMayenIslands': 'SvalbardAndJanMayen',
'VaticanCityState': 'VaticanCity',
'Swaziland': 'Eswatini',
'UnitedStatesMinorOutlyingIslands': 'UnitedStatesOutlyingIslands',
'CuraSao': 'Curacao',
'CzechRepublic': 'Czechia',
# Backwards compatibility with old Country enum, prior to Qt 6.2:
'AnyCountry': 'AnyTerritory',
'NauruCountry': 'NauruTerritory',
'TokelauCountry': 'TokelauTerritory',
'TuvaluCountry': 'TuvaluTerritory',
}
script_map = {
0: ("AnyScript", "Zzzz"),
1: ("Adlam", "Adlm"),
2: ("Ahom", "Ahom"),
3: ("Anatolian Hieroglyphs", "Hluw"),
4: ("Arabic", "Arab"),
5: ("Armenian", "Armn"),
6: ("Avestan", "Avst"),
7: ("Balinese", "Bali"),
8: ("Bamum", "Bamu"),
9: ("Bangla", "Beng"),
10: ("Bassa Vah", "Bass"),
11: ("Batak", "Batk"),
12: ("Bhaiksuki", "Bhks"),
13: ("Bopomofo", "Bopo"),
14: ("Brahmi", "Brah"),
15: ("Braille", "Brai"),
16: ("Buginese", "Bugi"),
17: ("Buhid", "Buhd"),
18: ("Canadian Aboriginal", "Cans"),
19: ("Carian", "Cari"),
20: ("Caucasian Albanian", "Aghb"),
21: ("Chakma", "Cakm"),
22: ("Cham", "Cham"),
23: ("Cherokee", "Cher"),
24: ("Coptic", "Copt"),
25: ("Cuneiform", "Xsux"),
26: ("Cypriot", "Cprt"),
27: ("Cyrillic", "Cyrl"),
28: ("Deseret", "Dsrt"),
29: ("Devanagari", "Deva"),
30: ("Duployan", "Dupl"),
31: ("Egyptian hieroglyphs", "Egyp"),
32: ("Elbasan", "Elba"),
33: ("Ethiopic", "Ethi"),
34: ("Fraser", "Lisu"),
35: ("Georgian", "Geor"),
36: ("Glagolitic", "Glag"),
37: ("Gothic", "Goth"),
38: ("Grantha", "Gran"),
39: ("Greek", "Grek"),
40: ("Gujarati", "Gujr"),
41: ("Gurmukhi", "Guru"),
42: ("Hangul", "Hang"),
43: ("Han", "Hani"),
44: ("Hanunoo", "Hano"),
45: ("Han with Bopomofo", "Hanb"),
46: ("Hatran", "Hatr"),
47: ("Hebrew", "Hebr"),
48: ("Hiragana", "Hira"),
49: ("Imperial Aramaic", "Armi"),
50: ("Inscriptional Pahlavi", "Phli"),
51: ("Inscriptional Parthian", "Prti"),
52: ("Jamo", "Jamo"),
53: ("Japanese", "Jpan"),
54: ("Javanese", "Java"),
55: ("Kaithi", "Kthi"),
56: ("Kannada", "Knda"),
57: ("Katakana", "Kana"),
58: ("Kayah Li", "Kali"),
59: ("Kharoshthi", "Khar"),
60: ("Khmer", "Khmr"),
61: ("Khojki", "Khoj"),
62: ("Khudawadi", "Sind"),
63: ("Korean", "Kore"),
64: ("Lanna", "Lana"),
65: ("Lao", "Laoo"),
66: ("Latin", "Latn"),
67: ("Lepcha", "Lepc"),
68: ("Limbu", "Limb"),
69: ("Linear A", "Lina"),
70: ("Linear B", "Linb"),
71: ("Lycian", "Lyci"),
72: ("Lydian", "Lydi"),
73: ("Mahajani", "Mahj"),
74: ("Malayalam", "Mlym"),
75: ("Mandaean", "Mand"),
76: ("Manichaean", "Mani"),
77: ("Marchen", "Marc"),
78: ("Meitei Mayek", "Mtei"),
79: ("Mende", "Mend"),
80: ("Meroitic Cursive", "Merc"),
81: ("Meroitic", "Mero"),
82: ("Modi", "Modi"),
83: ("Mongolian", "Mong"),
84: ("Mro", "Mroo"),
85: ("Multani", "Mult"),
86: ("Myanmar", "Mymr"),
87: ("Nabataean", "Nbat"),
88: ("Newa", "Newa"),
89: ("New Tai Lue", "Talu"),
90: ("Nko", "Nkoo"),
91: ("Odia", "Orya"),
92: ("Ogham", "Ogam"),
93: ("Ol Chiki", "Olck"),
94: ("Old Hungarian", "Hung"),
95: ("Old Italic", "Ital"),
96: ("Old North Arabian", "Narb"),
97: ("Old Permic", "Perm"),
98: ("Old Persian", "Xpeo"),
99: ("Old South Arabian", "Sarb"),
100: ("Orkhon", "Orkh"),
101: ("Osage", "Osge"),
102: ("Osmanya", "Osma"),
103: ("Pahawh Hmong", "Hmng"),
104: ("Palmyrene", "Palm"),
105: ("Pau Cin Hau", "Pauc"),
106: ("Phags-pa", "Phag"),
107: ("Phoenician", "Phnx"),
108: ("Pollard Phonetic", "Plrd"),
109: ("Psalter Pahlavi", "Phlp"),
110: ("Rejang", "Rjng"),
111: ("Runic", "Runr"),
112: ("Samaritan", "Samr"),
113: ("Saurashtra", "Saur"),
114: ("Sharada", "Shrd"),
115: ("Shavian", "Shaw"),
116: ("Siddham", "Sidd"),
117: ("SignWriting", "Sgnw"), # Oddly, en.xml leaves no space in it.
118: ("Simplified Han", "Hans"),
119: ("Sinhala", "Sinh"),
120: ("Sora Sompeng", "Sora"),
121: ("Sundanese", "Sund"),
122: ("Syloti Nagri", "Sylo"),
123: ("Syriac", "Syrc"),
124: ("Tagalog", "Tglg"),
125: ("Tagbanwa", "Tagb"),
126: ("Tai Le", "Tale"),
127: ("Tai Viet", "Tavt"),
128: ("Takri", "Takr"),
129: ("Tamil", "Taml"),
130: ("Tangut", "Tang"),
131: ("Telugu", "Telu"),
132: ("Thaana", "Thaa"),
133: ("Thai", "Thai"),
134: ("Tibetan", "Tibt"),
135: ("Tifinagh", "Tfng"),
136: ("Tirhuta", "Tirh"),
137: ("Traditional Han", "Hant"),
138: ("Ugaritic", "Ugar"),
139: ("Vai", "Vaii"),
140: ("Varang Kshiti", "Wara"),
141: ("Yi", "Yiii"),
# Added at CLDR v43
142: ("Hanifi", "Rohg"), # Used for Rohingya
}
script_aliases = {
# Renamings prior to Qt 6.0 (CLDR v37):
'SimplifiedChineseScript': 'SimplifiedHanScript',
'TraditionalChineseScript': 'TraditionalHanScript',
# Renamings:
'OriyaScript': 'OdiaScript',
'MendeKikakuiScript': 'MendeScript',
'BengaliScript': 'BanglaScript',
}
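
The "squishing" rule from the module docstring, and the way an alias table points an old enum name at its current one, can be illustrated with a short self-contained sketch. This is not the helper that Qt's generator scripts actually use: `squish`, `enum_names` and `resolve` are hypothetical names introduced here, and the sample tables copy only a handful of entries from the maps above.

```python
import re

def squish(name):
    """Hypothetical helper: turn a *_map display name into an enum-style name.

    Spaces and dashes are removed and the first letter of each word is
    capitalised; the rest of each word keeps its original case.
    """
    words = re.split(r"[ -]+", name)
    return "".join(word[:1].upper() + word[1:] for word in words if word)

# A few entries copied from language_map and language_aliases above.
sample_language_map = {
    30: ("Bangla", "bn"),
    123: ("Jola-Fonyi", "dyo"),
    345: ("Kara-Kalpak", "kaa"),
    346: ("Swampy Cree", "csw"),
}
sample_language_aliases = {"Bengali": "Bangla"}

def enum_names(enum_map):
    """Map each squished enum name to its numeric value."""
    return {squish(name): value for value, (name, code) in enum_map.items()}

def resolve(name, enum_map, aliases):
    """Look up a (possibly old) squished name, following at most one alias."""
    current = aliases.get(name, name)
    return enum_names(enum_map)[current]

print(squish("Swampy Cree"))          # SwampyCree
print(squish("Congo - Brazzaville"))  # CongoBrazzaville
print(resolve("Bengali", sample_language_map, sample_language_aliases))  # 30
```

Because an alias only renames an entry, the old and new names resolve to the same numeric value, which is what keeps the numbering stable between minor versions.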