Update QLocale and calendar data to CLDR v44.1

(This turns out to be identical to v44, for our purposes.)

As of v44, the CLDR license has been revised to "UNICODE LICENSE V3",
which is now included (as LICENSES/UNICODE-3.0.txt) in addition to the
old license (presumably still in use by UCD, at least until its next
update). Three new QLocale::Language entries are needed: Anii, Kangri
and Venetian. There is no change to the time-zone data.
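
As a rough illustration of the new language entries, here is a minimal
Python sketch shaped like the generator's language_map, with IDs and
codes copied from this change's diff. The helpers last_language() and
ids_are_contiguous() are hypothetical, introduced only for this
illustration; they are not part of Qt's scripts.

```python
# Excerpt modeled on the generator's language_map:
# numeric ID -> (QLocale::Language enum name, language code).
language_map = {
    338: ("Ligurian", "lij"),
    339: ("Rohingya", "rhg"),
    340: ("Torwali", "trw"),
    # added in CLDR v44
    341: ("Anii", "blo"),
    342: ("Kangri", "xnr"),
    343: ("Venetian", "vec"),
}


def last_language(mapping):
    """Return the enum name with the highest ID (hypothetical helper)."""
    return mapping[max(mapping)][0]


def ids_are_contiguous(mapping):
    """Check that the IDs form an unbroken run (hypothetical helper)."""
    ids = sorted(mapping)
    return ids == list(range(ids[0], ids[-1] + 1))


print(last_language(language_map), ids_are_contiguous(language_map))
```

The name with the highest ID is the one LastLanguage has to be updated
to whenever entries are appended.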

Some tests needed changes:
* Various Arabic locales now use U+0623 (Arabic letter alef with
  hamza above) in the exponent separator, replacing plain U+0627
  (Arabic letter alef); it is still followed by U+0633 (Arabic letter
  seen).
* Where likely sub-tags used to give World (001) as the territory for
  a language, they now give a specific country (e.g. Poland for
  Prussian, Ukraine for Yiddish).
* Tamil locales now have a mix of inherited and localized forms for
  AM/PM (e.g. ta_LK's AM is now plain "AM" while its PM stays
  localized), which looks a lot like a mistake in CLDR.
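
To make the Arabic exponent change concrete, the following Python
sketch compares the old and new expected ar_EG renderings of 4e-3,
using the strings from the toReal test data in this change. The helper
names describe() and changed() are hypothetical; only the standard
unicodedata module is assumed.

```python
import unicodedata

# Expected ar_EG text for 4e-3 before and after the update
# (copied from the test data touched by this change).
OLD = "\u0664\u0627\u0633\u061c-\u0660\u0663"
NEW = "\u0664\u0623\u0633\u061c-\u0660\u0663"


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def changed(old, new):
    """Return (index, old char, new char) for each differing position."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


for index, before, after in changed(OLD, NEW):
    print(index, describe(before), "->", describe(after))
```

Only the first character of the two-character exponent separator
differs; the digits, the seen and the sign are untouched.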

Conflict resolution in 6.7: a test fixed in dev is not present in 6.7,
as it wasn't reworked or given the new test-case, so that's omitted.
Conflict resolution in 6.6: regenerated data using 6.6's scripts.

[ChangeLog][Third-Party Code] Updated QLocale's data extracted from
the Unicode Common Locale Data Repository (CLDR) to v44.1. The license
changed to Unicode License V3.

Pick-to: 6.5
Fixes: QTBUG-121485
Task-number: QTBUG-121325
Change-Id: Ide1a68016129526d7a5aa3fc67f1a674858696bc
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
(cherry picked from commit 063026cc503e0c02af781caf920f5abfa0416268)
(cherry picked from commit 41c786781ae3f75d02cf34c1b41e326181a03e38)
Edward Welbourne 2024-01-09 16:03:55 +01:00
parent 9a3da1b6d6
commit 4f88c3e3ac
10 changed files with 6843 additions and 6444 deletions

LICENSES/UNICODE-3.0.txt

@@ -0,0 +1,39 @@
UNICODE LICENSE V3
COPYRIGHT AND PERMISSION NOTICE
Copyright © 2004-2023 Unicode, Inc.
NOTICE TO USER: Carefully read the following legal agreement. BY
DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING DATA FILES, AND/OR
SOFTWARE, YOU UNEQUIVOCALLY ACCEPT, AND AGREE TO BE BOUND BY, ALL OF THE
TERMS AND CONDITIONS OF THIS AGREEMENT. IF YOU DO NOT AGREE, DO NOT
DOWNLOAD, INSTALL, COPY, DISTRIBUTE OR USE THE DATA FILES OR SOFTWARE.
Permission is hereby granted, free of charge, to any person obtaining a
copy of data files and any associated documentation (the "Data Files") or
software and any associated documentation (the "Software") to deal in the
Data Files or Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, and/or sell
copies of the Data Files or Software, and to permit persons to whom the
Data Files or Software are furnished to do so, provided that either (a)
this copyright and permission notice appear with all copies of the Data
Files or Software, or (b) this copyright and permission notice appear in
associated Documentation.
THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF
THIRD PARTY RIGHTS.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE
BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES,
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA
FILES OR SOFTWARE.
Except as contained in this notice, the name of a copyright holder shall
not be used in advertising or otherwise to promote the sale, use or other
dealings in these Data Files or Software without prior written
authorization of the copyright holder.


@@ -380,6 +380,9 @@ public:
Ligurian = 338,
Rohingya = 339,
Torwali = 340,
+ Anii = 341,
+ Kangri = 342,
+ Venetian = 343,
Afan = Oromo,
Bengali = Bangla,
@@ -401,7 +404,7 @@ public:
Uigur = Uyghur,
Walamo = Wolaytta,
- LastLanguage = Torwali
+ LastLanguage = Venetian
};
enum Script : ushort {


@@ -50,7 +50,7 @@
\note For the current keyboard input locale take a look at
QInputMethod::locale().
- QLocale's data is based on Common Locale Data Repository v43.
+ QLocale's data is based on Common Locale Data Repository v44.1.
\section1 Matching combinations of language, script and territory
@@ -102,6 +102,7 @@
\value Amharic
\value [since 5.1] AncientEgyptian
\value [since 5.1] AncientGreek
+ \value [since 6.7] Anii
\value Arabic
\value [since 5.1] Aragonese
\value [since 5.1] Aramaic
@@ -230,6 +231,7 @@
\value [since 6.0] Kalaallisut
\value Kalenjin
\value Kamba
+ \value [since 6.7] Kangri
\value Kannada
\value Kanuri
\value Kashmiri
@@ -428,6 +430,7 @@
\value Uzbek
\value Vai
\value Venda
+ \value [since 6.7] Venetian
\value Vietnamese
\value Volapuk
\value Vunjo

File diff suppressed because it is too large


@@ -34,11 +34,9 @@
world's languages, with the largest and most extensive standard repository of locale data
available.",
"Homepage": "https://cldr.unicode.org/",
- "Version": "v43",
- "Comment": { "License": "as specified in https://spdx.org/licenses/Unicode-DFS-2016.html" },
- "License": "Unicode License Agreement - Data Files and Software (2016)",
- "LicenseId": "Unicode-DFS-2016",
- "LicenseFile": "UNICODE_LICENSE.txt",
- "Copyright": "Copyright (C) 1991-2022 Unicode, Inc."
+ "Version": "v44.1",
+ "License": "Unicode License V3",
+ "LicenseId": "UNICODE-3.0",
+ "Copyright": "Copyright (C) 2004-2023 Unicode, Inc."
}
]

File diff suppressed because it is too large


@@ -25,8 +25,8 @@ namespace QtPrivate::Jalali {
// GENERATED PART STARTS HERE
/*
- This part of the file was generated on 2023-07-27 from the
- Common Locale Data Repository v43
+ This part of the file was generated on 2024-02-06 from the
+ Common Locale Data Repository v44.1
http://www.unicode.org/cldr/
@@ -118,7 +118,7 @@ static constexpr QCalendarLocale locale_data[] = {
{ 49, 66, 185, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Cebuano/Latin/Philippines
{ 50, 66, 159, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Central Atlas Tamazight/Latin/Morocco
{ 51, 4, 113, 645, 645, 645, 645, 153, 153,101,101,101,101, 26, 26 },// Central Kurdish/Arabic/Iraq
- { 51, 4, 112, 645, 746, 645, 645, 153, 153,101,100,101,101, 26, 26 },// Central Kurdish/Arabic/Iran
+ { 51, 4, 112, 746, 746, 746, 746, 153, 153,100,100,100,100, 26, 26 },// Central Kurdish/Arabic/Iran
{ 52, 21, 20, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Chakma/Chakma/Bangladesh
{ 52, 21, 110, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Chakma/Chakma/India
{ 54, 27, 193, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Chechen/Cyrillic/Russia
@@ -196,6 +196,7 @@ static constexpr QCalendarLocale locale_data[] = {
{ 75, 66, 103, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// English/Latin/Guyana
{ 75, 66, 107, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// English/Latin/Hong Kong
{ 75, 66, 110, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// English/Latin/India
+ { 75, 66, 111, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// English/Latin/Indonesia
{ 75, 66, 114, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// English/Latin/Ireland
{ 75, 66, 115, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// English/Latin/Isle Of Man
{ 75, 66, 116, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// English/Latin/Israel
@@ -378,6 +379,7 @@ static constexpr QCalendarLocale locale_data[] = {
{ 111, 66, 83, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Inari Sami/Latin/Finland
{ 112, 66, 111, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Indonesian/Latin/Indonesia
{ 114, 66, 258, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Interlingua/Latin/World
+ { 115, 66, 75, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Interlingue/Latin/Estonia
{ 116, 18, 41, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Inuktitut/Canadian Aboriginal/Canada
{ 116, 66, 41, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Inuktitut/Latin/Canada
{ 118, 66, 114, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Irish/Latin/Ireland
@@ -407,6 +409,7 @@ static constexpr QCalendarLocale locale_data[] = {
{ 138, 66, 194, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Kinyarwanda/Latin/Rwanda
{ 141, 29, 110, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Konkani/Devanagari/India
{ 142, 63, 218, 2685, 2685, 2685, 2685, 153, 153, 54, 54, 54, 54, 26, 26 },// Korean/Korean/South Korea
+ { 142, 63, 50, 2685, 2685, 2685, 2685, 153, 153, 54, 54, 54, 54, 26, 26 },// Korean/Korean/China
{ 142, 63, 174, 2685, 2685, 2685, 2685, 153, 153, 54, 54, 54, 54, 26, 26 },// Korean/Korean/North Korea
{ 144, 66, 145, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Koyraboro Senni/Latin/Mali
{ 145, 66, 145, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Koyra Chiini/Latin/Mali
@@ -501,7 +504,7 @@ static constexpr QCalendarLocale locale_data[] = {
{ 227, 4, 1, 3384, 3384, 3384, 3384, 3446, 3446, 62, 62, 62, 62, 26, 26 },// Pashto/Arabic/Afghanistan
{ 227, 4, 178, 3384, 3384, 3384, 3384, 3446, 3446, 62, 62, 62, 62, 26, 26 },// Pashto/Arabic/Pakistan
{ 228, 4, 112, 3472, 3472, 3472, 3472, 3538, 3538, 66, 66, 66, 66, 23, 23 },// Persian/Arabic/Iran
- { 228, 4, 1, 3472, 3561, 3472, 3472, 3617, 3538, 66, 56, 66, 66, 23, 23 },// Persian/Arabic/Afghanistan
+ { 228, 4, 1, 3561, 3561, 3561, 3561, 3617, 3617, 56, 56, 56, 56, 23, 23 },// Persian/Arabic/Afghanistan
{ 230, 66, 187, 3640, 3640, 3640, 3640, 153, 153, 83, 83, 83, 83, 26, 26 },// Polish/Latin/Poland
{ 231, 66, 32, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Portuguese/Latin/Brazil
{ 231, 66, 7, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Portuguese/Latin/Angola
@@ -515,7 +518,7 @@ static constexpr QCalendarLocale locale_data[] = {
{ 231, 66, 204, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Portuguese/Latin/Sao Tome And Principe
{ 231, 66, 226, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Portuguese/Latin/Switzerland
{ 231, 66, 232, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Portuguese/Latin/Timor-Leste
- { 232, 66, 258, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Prussian/Latin/World
+ { 232, 66, 187, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Prussian/Latin/Poland
{ 233, 41, 110, 3723, 3723, 3723, 3723, 153, 153, 77, 77, 77, 77, 26, 26 },// Punjabi/Gurmukhi/India
{ 233, 4, 178, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Punjabi/Arabic/Pakistan
{ 234, 66, 184, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Quechua/Latin/Peru
@@ -677,10 +680,11 @@ static constexpr QCalendarLocale locale_data[] = {
{ 320, 66, 206, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Wolof/Latin/Senegal
{ 321, 66, 216, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Xhosa/Latin/South Africa
{ 322, 66, 40, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Yangben/Latin/Cameroon
- { 323, 47, 258, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Yiddish/Hebrew/World
+ { 323, 47, 244, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Yiddish/Hebrew/Ukraine
{ 324, 66, 169, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Yoruba/Latin/Nigeria
{ 324, 66, 25, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Yoruba/Latin/Benin
{ 325, 66, 170, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Zarma/Latin/Niger
+ { 326, 66, 50, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Zhuang/Latin/China
{ 327, 66, 216, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Zulu/Latin/South Africa
{ 328, 66, 32, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Kaingang/Latin/Brazil
{ 329, 66, 32, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Nheengatu/Latin/Brazil
@@ -699,6 +703,9 @@ static constexpr QCalendarLocale locale_data[] = {
{ 339, 142, 161, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Rohingya/Hanifi/Myanmar
{ 339, 142, 20, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Rohingya/Hanifi/Bangladesh
{ 340, 4, 178, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Torwali/Arabic/Pakistan
+ { 341, 66, 25, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Anii/Latin/Benin
+ { 342, 29, 110, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Kangri/Devanagari/India
+ { 343, 66, 117, 0, 0, 0, 0, 153, 153, 83, 83, 83, 83, 26, 26 },// Venetian/Latin/Italy
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },// trailing zeros
};

File diff suppressed because it is too large


@@ -472,7 +472,7 @@ void tst_QLocale::defaulted_ctor()
TEST_CTOR("en-GB", English, UnitedKingdom)
TEST_CTOR("en-GB@bla", English, UnitedKingdom)
TEST_CTOR("eo", Esperanto, World)
- TEST_CTOR("yi", Yiddish, World)
+ TEST_CTOR("yi", Yiddish, Ukraine)
TEST_CTOR("no", NorwegianBokmal, Norway)
TEST_CTOR("nb", NorwegianBokmal, Norway)
@@ -901,13 +901,13 @@ void tst_QLocale::toReal_data()
QTest::newRow("se_NO 4x-3") // Only first character of exponent
<< u"se_NO"_s << u"4\u00b7\u2212" "03"_s << false << 0.0;
QTest::newRow("ar_EG 4e-3") // Arabic, Egypt
- << u"ar_EG"_s << u"\u0664\u0627\u0633\u061c-\u0660\u0663"_s << true << 4e-3;
+ << u"ar_EG"_s << u"\u0664\u0623\u0633\u061c-\u0660\u0663"_s << true << 4e-3;
QTest::newRow("ar_EG 4e!3") // Only first character of sign:
- << u"ar_EG"_s << u"\u0664\u0627\u0633\u061c\u0660\u0663"_s << false << 0.0;
+ << u"ar_EG"_s << u"\u0664\u0623\u0633\u061c\u0660\u0663"_s << false << 0.0;
QTest::newRow("ar_EG 4x-3") // Only first character of exponent
- << u"ar_EG"_s << u"\u0664\u0627\u061c-\u0660\u0663"_s << false << 0.0;
+ << u"ar_EG"_s << u"\u0664\u0623\u061c-\u0660\u0663"_s << false << 0.0;
QTest::newRow("ar_EG 4x!3") // Only first character of exponent and sign
- << u"ar_EG"_s << u"\u0664\u0627\u061c\u0660\u0663"_s << false << 0.0;
+ << u"ar_EG"_s << u"\u0664\u0623\u061c\u0660\u0663"_s << false << 0.0;
QTest::newRow("fa_IR 4e-3") // Farsi, Iran
<< u"fa_IR"_s << u"\u06f4\u00d7\u06f1\u06f0^\u200e\u2212\u06f0\u06f3"_s << true << 4e-3;
QTest::newRow("fa_IR 4e!3") // Only first character of sign:
@@ -1201,9 +1201,9 @@ void tst_QLocale::doubleToString_data()
QTest::newRow("se 0.000003945 g 1") // Northern Sami
<< u"se"_s << u"4\u00b7" "10^\u2212" "06"_s << 0.000003945 << 'g' << 1;
QTest::newRow("ar_EG 0.000003945 g 1") // Arabic, Egypt (among others)
- << u"ar_EG"_s << u"\u0664\u0627\u0633\u061c-\u0660\u0666"_s << 0.000003945 << 'g' << 1;
+ << u"ar_EG"_s << u"\u0664\u0623\u0633\u061c-\u0660\u0666"_s << 0.000003945 << 'g' << 1;
QTest::newRow("ar_EG 3945e3 g 1")
- << u"ar_EG"_s << u"\u0664\u0627\u0633\u061c+\u0660\u0666"_s << 3945e3 << 'g' << 1;
+ << u"ar_EG"_s << u"\u0664\u0623\u0633\u061c+\u0660\u0666"_s << 3945e3 << 'g' << 1;
QTest::newRow("fa_IR 0.000003945 g 1") // Farsi, Iran (same for Afghanistan)
<< u"fa_IR"_s << u"\u06f4\u00d7\u06f1\u06f0^\u200e\u2212\u06f0\u06f6"_s
<< 0.000003945 << 'g' << 1;
@@ -2514,7 +2514,7 @@ void tst_QLocale::doubleRoundTrip_data()
QTest::newRow("se_NO 4e-06 g") // Northern Sami, Norway
<< u"se_NO"_s << u"4\u00b7" "10^\u2212" "06"_s << 'g';
QTest::newRow("ar_EG 4e-06 g") // Arabic, Egypt
- << u"ar_EG"_s << u"\u0664\u0627\u0633\u061c-\u0660\u0666"_s << 'g';
+ << u"ar_EG"_s << u"\u0664\u0623\u0633\u061c-\u0660\u0666"_s << 'g';
QTest::newRow("fa_IR 4e-06 g") // Farsi, Iran
<< u"fa_IR"_s << u"\u06f4\u00d7\u06f1\u06f0^\u200e\u2212\u06f0\u06f6"_s << 'g';
}
@@ -3146,7 +3146,8 @@ void tst_QLocale::ampm_data()
QTest::newRow("tr_TR") << QString::fromUtf8("\303\226\303\226")
<< QString::fromUtf8("\303\226\123");
QTest::newRow("id_ID") << QStringLiteral("AM") << QStringLiteral("PM");
- QTest::newRow("ta_LK") << QString::fromUtf8("முற்பகல்") << QString::fromUtf8("பிற்பகல்");
+ // CLDR v44 made Tamil's AM/PM inconsistent; AM was "முற்பகல்" before.
+ QTest::newRow("ta_LK") << QString::fromUtf8("AM") << QString::fromUtf8("பிற்பகல்");
}
void tst_QLocale::ampm()


@@ -372,10 +372,15 @@ language_map = {
338: ("Ligurian", "lij"),
339: ("Rohingya", "rhg"),
340: ("Torwali", "trw"),
+ # added in CLDR v44
+ 341: ("Anii", "blo"),
+ 342: ("Kangri", "xnr"),
+ 343: ("Venetian", "vec"),
}
# Don't add languages just because they exist; check CLDR does provide
# substantial data for locales using it; and check, once added, they
- # don't show up in cldr2qlocalexmo.py's unused listing.
+ # don't show up in cldr2qlocalexmo.py's unused listing. Do also check
+ # the data's draft status; if it's (nearly) all unconfirmed, leave it.
language_aliases = {
# Renamings prior to Qt 6.0 (CLDR v37):