43 Commits

Author SHA1 Message Date
Edward Welbourne
303863170c QLocale: fix likely subtags to include und -> en_Latn_US
The lack of this was hidden by other rules (redundant with it) until
CLDR v45, but v46 prunes the redundant rules, breaking this. So
include the missing rule and tweak the code that assumed likely
sub-tag rules preserved language, since this one doesn't. Rework the
tail of withLikelySubtagsAdded() to correctly use this rule, now that
we have it. (The prior comment about there being no match-all was
wrong: CLDR did have it, but our data skipped it.) Amended one test
affected by it (when system locale wasn't en_US).
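
A minimal Python sketch of how a match-all rule can act as the last resort
in a likely-subtag lookup. The table entries, the search order and the name
add_likely_subtags() are assumptions made for illustration; they are not
taken from withLikelySubtagsAdded() or from the generated data.

```python
# Toy likely-subtag table; entries are illustrative, not extracted from CLDR.
LIKELY = {
    ('en', '', ''): ('en', 'Latn', 'US'),
    ('und', 'Cyrl', ''): ('ru', 'Cyrl', 'RU'),
    ('und', '', ''): ('en', 'Latn', 'US'),  # the match-all rule
}


def add_likely_subtags(language, script='', territory=''):
    """Fill in the empty fields of a (language, script, territory) triple."""
    language = language or 'und'
    # Try progressively less specific keys, ending with the match-all.
    # This search order is an assumption of the sketch.
    candidates = [
        (language, script, territory),
        (language, '', territory),
        (language, script, ''),
        (language, '', ''),
        ('und', script, territory),
        ('und', '', territory),
        ('und', script, ''),
        ('und', '', ''),
    ]
    for key in candidates:
        if key in LIKELY:
            found = LIKELY[key]
            # A rule may only supply the language when none was given, so a
            # rule keyed on und need not preserve the (unknown) language.
            return (found[0] if language == 'und' else language,
                    script or found[1],
                    territory or found[2])
    return language, script, territory
```

With this toy table, an entirely unspecified locale resolves through the
und rule, while a known script still selects its own rule first.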

Pick-to: 6.8
Task-number: QTBUG-130877
Change-Id: I2a415b67af4bc8aa6a766bcc1e349ee5bda9f174
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-12-05 14:42:56 +01:00
Edward Welbourne
e23dc7c420 Correct handling of World in mapping MS's zone IDs to IANA ones
The AnyTerritory entries in the zoneDataTable are derived from
territory="ZZ" entries in the upstream CLDR data; the World ones from
territory="001". The latter give the default IANA ID for each MS ID,
the former give an (often legacy) IANA ID for the MS ID, that is not
based on geography. Some of these are being removed at CLDR v46.

The documentation said the ZZ entries have "no known territorial
association", hinting that there may be some (unknown) territorial
association; however, CLDR's inclusion of them is as entries with a
known non-territorial association, so revise the phrasing to reflect
this.

Also document that windowsIdToDefaultIanaId() returns empty when
there is no territory-specific value, and callers can use the
territory-neutral call to get a suitable value in that case. (They
may, however, wish to distinguish this case, to treat it differently,
so I decided not to just return that in place of empty in any case.)

The upstream CLDR tables do have entries for territory 001, so we
should report these if asked for World as territory. Amend the
available zone ID lookup and mapping from MS to IANA functions that
take a territory to duly handle World via the default-data that was
derived from 001 data in CLDR, instead of from the territory-varying
table, from which those were effectively filtered out when generating
the two tables. Update docs to mention this handling of World, for
contrast with that of AnyTerritory.

In the process remove a spurious split-on-space from the MS to default
IANA lookup, asserting there is no space (in a field now stored in the
table for single IANA ID entries, instead of the one for space-joined
lists of them in which it used to be stored, before I noticed it's
always only one ID). There is a matching assertion in the cldr.py code
that extracts the data. Added an assertion to this last, that each
default IANA ID given by CLDR's MS data does in fact also appear as
one of the IANA IDs for at least one territory (potentially ZZ), and
comment in C++ code on why this means we don't need to scan the
windowsDataTable in a few places, where it would just produce
duplicate entries.

[ChangeLog][QtCore][QTimeZone] Corrected handling of QLocale::World
and clarified in docs how QLocale::AnyTerritory is handled when
QTimeZone selects zones by territory.

Pick-to: 6.8
Task-number: QTBUG-130877
Change-Id: I861c777c68b0cb73a194138fe23fbff839df49e6
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2024-12-02 17:43:50 +01:00
Mate Barany
3045a08e5e Add type annotations to QLocaleXmlWriter
Also fix the annotation of englishNaming in cldr.py. Spotted it while
annotating __enumTable.

Task-number: QTBUG-129564
Pick-to: 6.8
Change-Id: I93f698b4cf1b5ae90c21fe77330e4f167143a9f3
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2024-11-11 19:33:44 +01:00
Mate Barany
812f79e75f Add type annotations to CldrReader
Add some type annotations to cldr2qlocalexml.py as well. Based on the
default arguments the constructor of CldrReader was expecting callables
that return None, but in reality we are passing in functions that
return integers.

Task-number: QTBUG-129613
Pick-to: 6.8
Change-Id: I06832240956ea635ca0cc0ec45c466a3b2539ff7
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2024-10-24 23:22:40 +02:00
Mate Barany
defd1549de Add type annotations to CldrAccess
Task-number: QTBUG-129613
Pick-to: 6.8
Change-Id: I8a00cca718554909b7ab9dcad15cc9b9ac702e94
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2024-10-24 11:53:52 +02:00
Edward Welbourne
98db7a35d2 Fix check for duplicated Windows time-zone IDs
A missing update of a "last" variable meant the loop inevitably did
nothing useful. Include type-annotation for last, while doing this.
Thankfully the check still doesn't find any duplications, now that
I've fixed it so that it actually would, were any present.
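
A minimal sketch of the kind of loop described, assuming the check walks
the IDs in sorted order and compares each with its predecessor. The name
find_duplicates() is hypothetical; the real check lives in cldr.py and may
differ in detail.

```python
from typing import Iterable, Iterator, Optional


def find_duplicates(windows_ids: Iterable[str]) -> Iterator[str]:
    """Yield an ID each time it repeats its predecessor in sorted order."""
    last: Optional[str] = None
    for name in sorted(windows_ids):
        if name == last:
            yield name
        # Without this update, last stays None and nothing is ever reported.
        last = name
```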

Pick-to: 6.8 6.5
Change-Id: I672e6570359a3ff102a364d8af98c5c8c0bdc4d9
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-10-23 20:34:58 +02:00
Mate Barany
b9e4f53b7e Add type annotations to LocaleScanner
Task-number: QTBUG-129566
Pick-to: 6.8
Change-Id: I768fda6b5202ebabc8283ecedead9157653862be
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2024-10-23 20:34:10 +02:00
Mate Barany
ba9d6b261b Remove unused parameters, variables from cldr.py and ldml.py
Found these while adding type annotations.

Task-number: QTBUG-129566
Pick-to: 6.8
Change-Id: I51c8e5676f958094946c0e6f396b98c083fd9de0
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2024-10-23 20:02:05 +02:00
Edward Welbourne
f6afd98e7e Purge some archaic complications from CLDR parsing
Apparently there used to be a mechanism where an alias element in a
top-level LDML element could serve to provide a parent locale as its
source attribute. That is long gone and, since at least a decade ago,
alias elements only ever appear in root.xml, with source="locale" and
a path that starts ../ (so is a relative XPath).

Ditch some complications (that I transcribed faithfully five-ish years
ago when transforming the scripts), replacing them with assertions
that check what's now documented in the LDML spec and confirmed by my
own grep-checks in the CLDR data. This incidentally made one prior
(weaker) check redundant, so I've now removed that from the look-up
for the tags that identify a locale. That look-up is only ever
performed after the DOM root nodes it uses have come through the scan
of locale roots that now does the stronger check.

Makes no difference to generated data.

Change-Id: I811ffbef5f5ecb69183d68fa8bda57281f2a579d
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-09-20 11:29:21 +02:00
Mate Barany
67ec126168 Fix typo in cldr.py
The variable ianalist is not really used for anything, it was probably
meant to be ianaList.

Pick-to: 6.8
Change-Id: Ie9f42bf9716da28ee0017319dda96389c415ef4f
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2024-09-17 19:12:04 +02:00
Edward Welbourne
8b456bbd9a Include timezone L10n data in QLocaleXML files
This makes the XML file bigger by a factor of roughly 8, at about 30
MB. Code to read the new data out of it shall follow in a later
commit.

Task-number: QTBUG-115158
Change-Id: I7b9b6abe88be2457fa6cf0e8d7b6a68845136770
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-08-20 18:43:05 +02:00
Edward Welbourne
96e0d5e82c Extract locale-appropriate names for zones and metazones
Task-number: QTBUG-115158
Change-Id: Ie687c97b038604ab5ec4d65b3052a632d0f2292b
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-07-22 10:41:50 +02:00
Edward Welbourne
99285e6060 Include metazone data in QTZP_data_p.h when timezone_locale is active
This also expands the IANA ID table (by about 5 KiB) even when the
feature is inactive, since it includes all IANA zones referenced by
the new data, as well as those for which CLDR has aliases.

Add code to QTZlocale.cpp to use this locale-independent data. This
shall need to be expanded once locale-dependent data is also available.

Task-number: QTBUG-115158
Change-Id: I720f10cb9ae4cf87dfd8bb66af965a45d49c389a
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2024-07-22 10:41:34 +02:00
Edward Welbourne
2cdfe35712 Implement scanning of CLDR's supplemental data on meta-zones
This is the locale-independent part of the data, for inclusion in
qtimezoneprivate_data_p.h in some form.

Task-number: QTBUG-115158
Change-Id: Ic46f53dd22d45ddc999633bc1bb4a0a3cf6d5112
Reviewed-by: Mate Barany <mate.barany@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2024-07-22 10:41:23 +02:00
Edward Welbourne
bd5bb70b7c QLocaleXML: Use enum values instead of names in likely subtag map
The existing naming lists provide the needed mapping and this prepares
the way to move the language, script and territory into the from and
to elements as attributes, saving some file-size. It incidentally
pushes the mapping to enum values upstream and simplifies the
downstream processing.

Change-Id: I8f6d2615d52b14d46d1b795539c71f8afdc310ca
Reviewed-by: Dennis Oberst <dennis.oberst@qt.io>
2024-07-11 15:57:51 +02:00
Edward Welbourne
0c809fc3b5 Derive C locale data from en_US, overriding minor details
The qlocalexml.py Locale.C() had to replicate a whole lot of data that
isn't really relevant to how C differs from en_US and every addition
to what we support required further additions to it. So pass the en_US
Locale object to the pseudoconstructor so that C can inherit from it
and only override the parts where we care about the difference.

Hand-code shortening for short Jalali month names, to match Soroush's
original contribution, and include the narrow forms in the hard-coded
data to keep the generated data unchanged (for now). Note some of the
departures from CLDR; we may want to drop these overrides later.

In the process, convert the mapping from keys to locales to
consistently use IDs for all members of the key, instead of using the
(empty) code value for (as yet unused) variant; it now gets ID 0 and
is consistent with returns from codesToIdNames(). This makes life
easier for the code that now has to construct an en_US key.

Task-number: QTBUG-115158
Change-Id: I3d7acb6a4059daec1bba341fcf015c39c7a6803b
Reviewed-by: Kai Köhne <kai.koehne@qt.io>
2024-06-02 15:25:52 +02:00
Edward Welbourne
9534341654 Integrate timezone data into the CLDR-via-QLocaleXml pipeline
Future work shall need the timezone alias data to be synchronized
between the (expanded) locale-independent timezone data and the
(coming) locale-dependent timezone data. The latter shall need to come
via QLocaleXml, hence the former now needs to, too.

This makes no change to the generated data, aside from changing the
regeneration instructions for qtimezoneprivate_data_p.h, to use the
same scripts as locale data, instead of cldr2qtimezone.py, which is
now removed.

Task-number: QTBUG-115158
Change-Id: I47ddd95f6af1855cbb1f601e9074c13f213cd61c
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-06-02 15:25:27 +02:00
Edward Welbourne
4e23dbb742 Add assorted notes and suggestions in util/locale_database/
Change-Id: I22534943f2c9710d501235672811a861a5fd3aea
Reviewed-by: Øystein Heskestad <oystein.heskestad@qt.io>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
2024-06-02 15:25:21 +02:00
Edward Welbourne
99475db542 Revise Windows time-zone mapping to use proper IANA IDs
The CLDR's "IANA" IDs may (for the sake of stability) date back to
before IANA's own naming has been updated. As a result, the "IANA" IDs
we were using were in some cases out of date. CLDR does provide a
mapping from its stable IDs to all aliases and the current IANA name
for each (which I shall soon be needing in other work), so use that to
map the CLDR IDs to contemporary IANA ones.

Revise the documentation of CldrAccess.readWindowsTimeZones() to take
this into account, pass it the alias mapping from the table, use that
to map IDs internally and, in passing, rename a variable.  Update
cldr2qtimezone.py to match the new CldrAccess methods and regenerate
the data.
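
A minimal sketch of mapping CLDR's stable IDs to contemporary IANA ones
via an alias table. The function name modernize() and the two-entry table
are assumptions for illustration; readWindowsTimeZones() itself receives
the full alias mapping extracted from CLDR.

```python
# Illustrative alias table: legacy ID -> contemporary IANA ID.
ALIAS = {
    'Asia/Calcutta': 'Asia/Kolkata',
    'Europe/Kiev': 'Europe/Kyiv',
}


def modernize(iana_ids, alias=ALIAS):
    """Map each ID in a space-joined list to its contemporary name."""
    # IDs with no alias entry are already current, so pass through as-is.
    return ' '.join(alias.get(name, name) for name in iana_ids.split())
```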

Change-Id: I23d8a7d048d76392099d125376b544a41faf7eb3
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-05-30 20:28:55 +02:00
Edward Welbourne
bcadcb029e Use CLDR alias data to find canonical IANA IDs
There are various legacy IANA IDs that we should recognize as aliases
for their contemporary equivalents. Later work shall also take these
into account in the Windows IDs. Scan CLDR's data about these aliases
and use it when constructing QTimeZone. This adds aliasMappingTable
and aliasIdData arrays to QTZP_data_p.h and an AliasData type to its
QtTimeZoneCldr namespace.

Change-Id: I1bbfce62959a7e1b7a0bc4a320c32f5a174a2ff2
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2024-05-21 17:23:21 +02:00
Edward Welbourne
f2a2379de8 Use dict comprehensions more in cldr.py and qlocalexml.py
They're a bit more readable than calling dict on a generator.
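
For illustration only, the two equivalent spellings side by side; the
sample pairs are made up and do not come from enumdata.py.

```python
pairs = [('en', 'English'), ('fr', 'French')]

# Calling dict on a generator of pairs:
via_constructor = dict((code, name) for code, name in pairs)

# The same mapping as a dict comprehension:
via_comprehension = {code: name for code, name in pairs}
```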

Change-Id: I3177e31b1f617b80d1cf5d5f83df7036fc0c4c01
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2024-04-22 18:56:20 +02:00
Edward Welbourne
d935a89d25 Tweak the message for variants
Although the code does not, in fact, know about them, it's more
pertinent to say that they're unsupported than to say that the variant
in question is unknown.

Change-Id: I411d792dc91f2d7af58a4b7919c952a005b3417e
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2024-04-22 17:22:12 +02:00
Edward Welbourne
693bf76306 Minor tidy-up of CldrAccess.__enumMap: revise comment, modernize code
A comment dated from when variables misleadingly named language_list,
script_list and country_list actually held mappings not lists; they've
been renamed to s/list/map/ a while back, so rephrase.

Use a dict-comprehension rather than the somewhat lisp-ier invocation
of the dict constructor on an iterator over pairs.

Change-Id: Ibcb97122434122dbb1dcb0f621aae02b25a4e1fa
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2024-02-13 15:58:43 +01:00
Edward Welbourne
1ae24f8b50 Use CLDR's names in QLocale::*ToName() for language, script, territory
Various comments need to continue using the enumdata.py names, as they
associate data with particular enum members, but we can now correctly
use the en.xml versions of their names when we report them, rather
than the enum-friendly names we use in the code. Since this now means
the data may stray outside plain ASCII - it'll be UTF-8-encoded - this
implies replacing the QLatin1StringView()s of the code that formerly
read this data with QString::fromUtf8().

Fixes: QTBUG-94460
Change-Id: Id3b08875a46af58c0555c3e303b0e15a19441509
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2023-08-09 17:53:42 +02:00
Edward Welbourne
e212b3633c Break clashing-names test function out of CldrAccess.__checkEnum()
Moving it makes it easier to document what it's up to and why, while
leaving __checkEnum() easier to read; and I'm going to need it
elsewhere anyway. This makes no difference to generated data.

Task-number: QTBUG-94460
Change-Id: I684375bc926d5d54928fbf5b5e08978528aef487
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
2023-08-09 17:53:20 +02:00
Edward Welbourne
40b063cd74 Tweak lookup of en.xml names for languages, scripts and territories
Prefer stand-alone versions of the names when available. This saves
the need for a Han-specific kludge in the check for discrepancies
between our enum names and the en.xml names. Causes no change to
generated locale data.

Pick-to: 6.6 6.5
Change-Id: I162f3107d6ffc1f8b893b206e0b78b61cf7254f6
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
2023-08-03 19:16:27 +02:00
Edward Welbourne
69a0cec4d0 Canonicalize space in lists of IANA time-zones
In the Windows zone-ID code, we tokenize() a text extracted from CLDR
data. However, a leading or trailing space (or a repeated internal
space) would then give an empty "IANA ID" for us to match, causing the
empty ID to be mapped to the Windows ID for the entry with the
superfluous space. This was uncovered by an entry with a trailing
space in CLDR v43's data.

Canonicalize spacing in the IANA ID lists extracted from CLDR so as to
ensure this doesn't happen. (We could pass Qt::SkipEmptyParts to the
tokenize() call, but fixing the issue when generating the data is
cheaper and more robust than fixing it at run-time every time it's
consulted.)
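
A minimal sketch of the canonicalization, assuming the list is a plain
space-separated string; the helper name canonical_spacing() is
hypothetical. Splitting on a literal space keeps empty fields, which is
the failure mode described above, whereas str.split() with no argument
drops them.

```python
def canonical_spacing(iana_ids):
    """Collapse runs of white space and trim both ends of an ID list."""
    return ' '.join(iana_ids.split())


# A trailing space yields an empty "ID" when splitting on single spaces:
broken = 'Europe/Oslo Europe/Paris '.split(' ')
fixed = canonical_spacing(' Europe/Oslo  Europe/Paris ').split(' ')
```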

Task-number: QTBUG-111550
Change-Id: Ib3883419558d6574141e9ab0bc929ade2d73e020
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
2023-08-02 09:38:14 +02:00
Edward Welbourne
615047e98f Ignore parentLocales nodes with component="..." attributes
From CLDR v43, "The parentLocale elements now have an optional
component attribute, with a value of segmentations or
collations. These should be used for inheritance for those respective
elements." Since we aren't extracting collation or segmentation data
for the present, omit these elements from the scan for parentLocale
information.
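
A minimal sketch of such a filter using xml.dom.minidom. The sample
document, its locale names and the function name parent_locales() are
assumptions for illustration; the real scan uses the DOM wrappers in
ldml.py.

```python
from xml.dom import minidom

# Made-up sample in the shape of CLDR's supplemental data.
SAMPLE = """<supplementalData>
  <parentLocales>
    <parentLocale parent="root" locales="az_Arab sr_Latn"/>
  </parentLocales>
  <parentLocales component="collations">
    <parentLocale parent="zh_Hant" locales="yue"/>
  </parentLocales>
</supplementalData>"""


def parent_locales(xml_text):
    """Map each locale to its parent, skipping component-specific groups."""
    result = {}
    doc = minidom.parseString(xml_text)
    for group in doc.getElementsByTagName('parentLocales'):
        if group.hasAttribute('component'):
            continue  # only relevant to collation or segmentation data
        for node in group.getElementsByTagName('parentLocale'):
            parent = node.getAttribute('parent')
            for child in node.getAttribute('locales').split():
                result[child] = parent
    return result
```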

Task-number: QTBUG-111550
Change-Id: I42871929f539c1852471812801953f2fc8be0e8a
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
2023-08-01 15:36:06 +02:00
Edward Welbourne
914857be08 Fix trivial typo in cldr.py doc-string
Change-Id: I24b039f9256adb3dc7808cd04bd621226b1a5ed5
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2022-09-14 17:46:39 +02:00
Lucie Gérard
05fc3aef53 Use SPDX license identifiers
Replace the current license disclaimer in files with
an SPDX-License-Identifier.
Files that have to be modified by hand are modified.
License files are organized under LICENSES directory.

Task-number: QTBUG-67283
Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
2022-05-16 16:37:38 +02:00
Ievgenii Meshcheriakov
2d0c2c9f1c locale_database: Use pathlib to manipulate paths in Python code
pathlib's API is more modern and easier to use than os.path. It
also makes it possible to distinguish between paths and other strings
in type annotations.
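
A small sketch of the style this enables. The helper name
main_locale_file() is hypothetical; the common/main layout is that of the
CLDR data tree.

```python
from pathlib import Path


def main_locale_file(cldr_root: Path, locale: str) -> Path:
    """Return the path of a locale's main LDML file under a CLDR root."""
    # The / operator joins path components; the annotation tells readers
    # and type checkers that cldr_root is a path, not an arbitrary string.
    return cldr_root / 'common' / 'main' / f'{locale}.xml'
```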

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Ie6d9b4e35596f7f6befa4c9635f4a65ea3b20025
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-19 22:05:54 +02:00
Ievgenii Meshcheriakov
41458fafa0 locale_database: Use f-strings in Python code
Replace most uses of str.format() and string arithmetic by f-strings.
This results in more compact code and the code is easier to read
when using an appropriate editor.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I3409f745b5d0324985cbd5690f5eda8d09b869ca
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-16 19:04:20 +02:00
Ievgenii Meshcheriakov
53382b7b07 locale_database: Don't use u prefix for strings in python files
This prefix is useless with Python 3.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Ic008d53fe506865759e9a5003f439f7ac107b9e6
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-15 17:06:53 +02:00
Ievgenii Meshcheriakov
b02d17c5c0 Convert CLDR scripts to Python 3
The conversion is mostly done using the 2to3 script, with manual cleanup
afterwards.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I4d33b04e7269c55a83ff2deb876a23a78a89f39d
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-15 17:06:53 +02:00
Ievgenii Meshcheriakov
d804d21e8f cldr.py: Avoid raising StopIteration from generators
The behavior of StopIteration in generators was changed in Python 3
(see https://www.python.org/dev/peps/pep-0479/). Not raising that
exception makes it easier to port the code to Python 3.
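
A minimal sketch of the pattern, using a hypothetical generator pairs()
rather than code from cldr.py. Under PEP 479, a StopIteration that escapes
a generator body is turned into a RuntimeError, so the generator catches
it and returns instead.

```python
def pairs(items):
    """Yield consecutive pairs from items, dropping an odd trailing item."""
    it = iter(items)
    while True:
        try:
            first = next(it)
            second = next(it)
        except StopIteration:
            return  # end the generator cleanly rather than leak the exception
        yield first, second
```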

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Iac6e3f6f1e1e8ef3a1a0d89b19d2ac2d186434f5
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-09 13:42:21 +02:00
Edward Welbourne
e51831260a Nomenclature change: s/countr/territor/g in locale scripts
Change the nomenclature used in the scripts and the QLocaleXML data
format to use "territory" and "territories" in place of "country" and
"countries". Does not change the generated source files.

Change-Id: I4b208d8d01ad2bfc70d289fa6551f7e0355df5ef
Reviewed-by: JiDe Zhang <zhangjide@uniontech.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2021-05-26 18:00:01 +02:00
Edward Welbourne
21e0ef3ccf Rename util/locale_database/enumdata.py's various *_list to *_map
These variables provide mappings, not lists, so name them non-deceptively.

Change-Id: Idf15e78ad73790bc86dd8b9d4f248d1c4f73993c
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2021-05-26 18:00:01 +02:00
JiDe Zhang
50a7eb8cf7 Add the "Territory" enumerated type for QLocale
The use of "Country" is misleading as some entries in the enumeration
are not countries (eg, HongKong), for all that most are. The Unicode
Consortium's Common Locale Data Repository (CLDR, from which QLocale's
data is taken) calls these territories, so introduce territory-based
names and prepare to deprecate the country-based ones in due course.

[ChangeLog][QtCore][QLocale] QLocale now has Territory as an alias for
its Country enumeration, and associated territory-based names to match
its country-named methods, to better match the usage in relevant
standards. The country-based names shall in due course be deprecated
in favor of the territory-based names.

Fixes: QTBUG-91686
Change-Id: Ia1ae1ad7323867016186fb775c9600cd5113aa42
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2021-04-15 20:17:49 +08:00
Edward Welbourne
d11bf5fc24 Check our enumdata.py tables are consistent with CLDR
Compare the code->name mappings we're using to the ones CLDR's
common/main/en.xml provides; report discrepancies. Tolerate tags
missing from en.xml if they're known to the locale-inheritance
machinery.

Change-Id: Ibe96c18bf55984a35de3b3644f3586a9f30720b2
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-11-08 03:14:00 +01:00
Qt Forward Merge Bot
8823bb8d30 Merge remote-tracking branch 'origin/5.15' into dev
Conflicts:
	examples/opengl/doc/src/cube.qdoc
	src/corelib/global/qlibraryinfo.cpp
	src/corelib/text/qbytearray_p.h
	src/corelib/text/qlocale_data_p.h
	src/corelib/time/qhijricalendar_data_p.h
	src/corelib/time/qjalalicalendar_data_p.h
	src/corelib/time/qromancalendar_data_p.h
	src/network/ssl/qsslcertificate.h
	src/widgets/doc/src/graphicsview.qdoc
	src/widgets/widgets/qcombobox.cpp
	src/widgets/widgets/qcombobox.h
	tests/auto/corelib/tools/qscopeguard/tst_qscopeguard.cpp
	tests/auto/widgets/widgets/qcombobox/tst_qcombobox.cpp
	tests/benchmarks/corelib/io/qdiriterator/qdiriterator.pro
	tests/manual/diaglib/debugproxystyle.cpp
	tests/manual/diaglib/qwidgetdump.cpp
	tests/manual/diaglib/qwindowdump.cpp
	tests/manual/diaglib/textdump.cpp
	util/locale_database/cldr2qlocalexml.py
	util/locale_database/qlocalexml.py
	util/locale_database/qlocalexml2cpp.py

Resolutions of util/locale_database/ are based on:
https://codereview.qt-project.org/c/qt/qtbase/+/294250
and src/corelib/{text,time}/*_data_p.h were then regenerated by
running those scripts.

Updated CMakeLists.txt in each of
	tests/auto/corelib/serialization/qcborstreamreader/
	tests/auto/corelib/serialization/qcborvalue/
	tests/auto/gui/kernel/
and generated new ones in each of
	tests/auto/gui/kernel/qaddpostroutine/
	tests/auto/gui/kernel/qhighdpiscaling/
	tests/libfuzzer/corelib/text/qregularexpression/optimize/
	tests/libfuzzer/gui/painting/qcolorspace/fromiccprofile/
	tests/libfuzzer/gui/text/qtextdocument/sethtml/
	tests/libfuzzer/gui/text/qtextdocument/setmarkdown/
	tests/libfuzzer/gui/text/qtextlayout/beginlayout/
by running util/cmake/pro2cmake.py on their changed .pro files.

Changed target name in
	tests/auto/gui/kernel/qaction/qaction.pro
	tests/auto/gui/kernel/qaction/qactiongroup.pro
	tests/auto/gui/kernel/qshortcut/qshortcut.pro
to ensure unique target names for CMake

Changed tst_QComboBox::currentIndex to not test the
currentIndexChanged(QString), as that one does not exist in Qt 6
anymore.

Change-Id: I9a85705484855ae1dc874a81f49d27a50b0dcff7
2020-04-08 20:11:39 +02:00
Edward Welbourne
81cf23c7a7 Take CLDR's distinguished attributes into account
When doing XPATH searches, child nodes that have distinguished
attributes that were not asked for should be skipped. This is part of
the LDML spec and matters when resolving locale inheritance. Scan the
LDML DTD (previously only scanned for the CLDR version) to find which
attributes of which tags are ignorable - all others are distinguished
- and take the result into account when performing XPATH searches.
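
A simplified sketch of the idea using xml.dom.minidom. The set of
ignorable attributes, the sample XML and the name matching_children() are
assumptions for illustration; the real code derives the ignorable
attributes from the LDML DTD and works through the classes in ldml.py.

```python
from xml.dom import minidom

# Assumed non-distinguishing attributes; the real list comes from the DTD.
IGNORABLE = {'draft', 'references'}

# Made-up sample in the shape of a locale's currency formats.
SAMPLE = """<currencyFormats>
  <currencyFormatLength>
    <currencyFormat type="standard"><pattern>A</pattern></currencyFormat>
  </currencyFormatLength>
  <currencyFormatLength type="short">
    <currencyFormat type="standard"><pattern>B</pattern></currencyFormat>
  </currencyFormatLength>
</currencyFormats>"""

root = minidom.parseString(SAMPLE).documentElement


def matching_children(parent, tag, wanted):
    """Yield child elements whose distinguished attributes equal wanted."""
    for child in parent.childNodes:
        if child.nodeType != child.ELEMENT_NODE or child.tagName != tag:
            continue
        distinguished = {name: value
                         for name, value in child.attributes.items()
                         if name not in IGNORABLE}
        # A child with a distinguished attribute that was not asked for
        # (or with a different value) is a different element: skip it.
        if distinguished == wanted:
            yield child
```

Asking for currencyFormatLength with no attributes then skips the
type="short" element instead of picking it up by accident.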

The XPath we were using for currency formats wasn't excluding
currencyFormatLength elements with type="short" and patterns specific
to thousands (and larger multiples); this is fixed by taking
distinguished attributes into account. However, the XPATH also wasn't
specifying the always distinguished attribute type="standard" that
was, in practice, used for nearly all locales that weren't (wrongly)
using short-forms for thousands; so type="standard" is now made
explicit, so as to minimize the diff.

This leaves only twenty-one locales with negative currency formats.
A later commit shall switch to using accounting by default (it falls
back via an alias to standard, in any case), thereby restoring the two
mentioned below that were using it by accident, but the present change
gives the minimal diff here.

Thousands-specific formats replaced with sensible ones:
* zh_Hant_{HK,MO} (Traditional Mandarin, Hong Kong and Macau)
* eo_001 (Esperanto)
* fr_CA (Canadian French)
* ha_* (Hausa, when not written in Arabic)
* es_{GT,MX,US} (Spanish - Guatemala, Mexico, USA)
* sw_KE (Swahili, Kenya)
* yi_001 (Yiddish)
* mfe_MU (Morisyen, Mauritius)
* lag_TZ (Langi, Tanzania)
* mgh_MZ (Makhuwa Meetto, Mozambique)
* wae_CH (Walser, Switzerland)
* kkj_CM (Kako, Cameroon)
* lkt_US (Lakota, USA)
* pa_Arab_PK (Punjabi, in Arabic script, as used in Pakistan; uses
  arabext number system, whose currency falls back to latn's, for
  which pa_Arab over-rides the thousands-format).

Format changed from an over-ridden type="accounting" to standard (so
these lost a negative-specific form) in:
* en_SI (English, Slovenia)
* es_DO (Spanish, Dominican Republic; same)

For some locales we were picking up over-rides of narrow or short list
formats, or formats for or-lists or unit-lists rather than and-lists,
in place of the standard list format, that these locales don't
over-ride, provided by a parent locale. This changed list formats for:
* en_CA, en_IN (dropped "Oxford" comma before "and")
* qu_* (Quechua; dropped "utaq", presumably meaning "and")
* ur_IN (Urdu, India; was using unit-list formats)

[ChangeLog][QtCore][QLocale] Data used for currency formats in several
locales and list patterns in some locales have changed due to now
parsing the CLDR data more faithfully.

Fixes: QTBUG-81344
Change-Id: I6b95c6c37db92df167153767c1b103becfb0ac98
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:43:28 +01:00
Edward Welbourne
be3dfd7a71 Rework cldr2qlocalexml.py's reading of CLDR data
Move the code out to a CldrReader class in cldr.py, expand CldrAccess
with the facilities that class needs, expand ldml.py to include support for more
features, finally making xpathlite.py redundant. This initial commit
aims, though, to be bug-for-bug compatible with xpathlite in its
reading of the CLDR data.

It turns out we've been using draftier data than we were aware of
(which might not be a bad thing). The xpathlite code appeared to check
for draft attributes, but these only appear on leaf nodes and most
data were fetched by finding a parent and then scanning its children
without the draft check; only am/pm data was actually being excluded
based on draft values.  (We allowed contributed, for am/pm, in
addition to approved, which is all the xpathlite code allows
otherwise.) There are also some less equivocal bugs; I'll deal with
these in later commits.

Simplified number-system data look-ups; the old get_number_in_system()
was taking care of old LDML versions' placement of the number system
attribute; this is no longer needed. (It was also being used for a
currency value to which it was not appropriate, which is now handled
separately; this is one of the bugs mentioned above.) Ditched a
fall-back to nativeZeroDigit, which no longer exists in CLDR.

Change the command-line to take the root of the CLDR data tree, rather
than its common/main/ sub-directory. Support naming the file to which
to write output, as a second command-line argument, instead of always
writing to stdout (which remains the default) and leaving whoever runs
the script to redirect stdout.

Support (internally for now, while adding TODOs to give main() more
command-line options) separating the stderr output into its more and
less interesting parts; for now, continue producing both, but suppress
the least interesting entirely.

Task-number: QTBUG-81344
Change-Id: Ie611b47403a9452b51feaeeaaa0fbc8f7e84dc71
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:43:18 +01:00
Edward Welbourne
c834dbc6fb Move cldr2qtimezone.py's CLDR-reading to a CldrAccess class
This begins the process of replacing xpathlite.py, adding low-level
DOM-access classes to ldml.py and the CldrAccess class to cldr.py.

Moved a format comment from cldr2qtimezone.py's doc-string to the
method of CldrAccess that does the actual reading.

Task-number: QTBUG-81344
Change-Id: I46ae3f402f8207ced6d30a1de5cedaeef47b2bcf
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:43:13 +01:00