Add some type hints to unicode2hex as well, since it is used by
ByteArrayData.
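
For illustration only, a type-hinted helper of this kind might look like
the sketch below. The body is an assumption (that the helper turns a
string into hex literals for its UTF-16 code units); only the name
unicode2hex comes from this commit.

```python
def unicode2hex(s: str) -> list[str]:
    """Return hex literals for the UTF-16 code units of s (assumed behaviour)."""
    units: list[str] = []
    for ch in s:
        v = ord(ch)
        if v > 0xFFFF:
            # Outside the BMP: split into a surrogate pair.
            high = (v >> 10) + 0xD7C0
            low = (v % 0x400) + 0xDC00
            units.extend((hex(high), hex(low)))
        else:
            units.append(hex(v))
    return units
```

The annotations document that callers pass a str and get back a list of
strings, which is what a type checker can then verify at the call site.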
Task-number: QTBUG-128634
Pick-to: 6.8
Change-Id: I86b7ce8567483bf8a4d4db78c9585652526cb90d
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Unify generation of the data on enum-related names (and their indices)
with that of the data for the corresponding codes. This produces the
same tables, just in a different order, putting each code table right
after the name table and its indexing.
It'll mean more conflicts on picking future updates back to 6.8 and
before, but those should usually involve regenerating data anyway,
even when they don't get (visible) conflicts, so this'll just
encourage doing that.
As the TODO comment noted, the reason for keeping the table separate
was just that, during a major rewrite of the scripts (most of five
years ago), I wanted to be sure data didn't change. We've stabilised
plenty since then, so it's time to do that clean-up.
Change-Id: I0c3ee9d41d85debdba8b8b2624f137fadb6d8a3f
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
The lookup into it is done case-insensitively (because user-supplied
names of zones might not have the right case) but I forgot to make the
sorting of the data table case-insensitive in the aliases. Regenerate
data: only the qtimezone*_data_p.h are changed by the reindexing of
zone aliases.
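
A minimal sketch of why the sort order has to match the lookup. The
helper names and the sample rows are illustrative, and str.lower() is
only a stand-in for whatever case-insensitive comparison the C++ side
really uses.

```python
from bisect import bisect_left

def build_alias_table(aliases):
    """Sort (alias, iana_id) pairs by lower-cased alias."""
    return sorted(aliases, key=lambda pair: pair[0].lower())

def find_alias(table, name):
    """Case-insensitive binary search; returns the IANA ID or None."""
    keys = [alias.lower() for alias, _ in table]
    wanted = name.lower()
    i = bisect_left(keys, wanted)
    if i < len(keys) and keys[i] == wanted:
        return table[i][1]
    return None

# Illustrative rows only, not the generated data.
rows = [
    ("UTC", "Etc/UTC"),
    ("Universal", "Etc/UTC"),
    ("US/Pacific", "America/Los_Angeles"),
]
table = build_alias_table(rows)
order = [alias for alias, _ in table]
```

A plain sorted(rows) would put "UTC" before "Universal" (upper-case
sorts before lower-case), which is not the order a case-insensitive
binary search expects, so some look-ups would miss.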
Pick-to: 6.8
Change-Id: Id5e95c245c7ca421a77298f23baefe6b7021a396
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Intel CPUs have had this since 2013 (Ivy Bridge), but some older
Bulldozer AMD CPUs appear to be missing it. This creates a mismatch
between when the __haswell__ macro gets declared in qsimd_p.h and the
runtime check using the CpuArchHaswell value. That in turn creates a
condition where qInitDrawhelperFunctions() in qdrawhelper.cpp leaves the
memfill pointers set to null.
    #elif defined(__SSE2__)
    # ifndef __haswell__
        qt_memfill32 = qt_memfill32_sse2;
        qt_memfill64 = qt_memfill64_sse2;
    # endif
    ...
    #if defined(QT_COMPILER_SUPPORTS_AVX2)
        if (qCpuHasFeature(ArchHaswell)) {
            qt_memfill32 = qt_memfill32_avx2;
            qt_memfill64 = qt_memfill64_avx2;
It does this so the qt_memfillXX_sse2 functions don't have to be defined
anywhere, so the QtGui build won't carry unnecessary dead code.
This is old code (from Qt 4.x) and several improvements I've made for
QtCore are not applied yet. My work for qSimdDispatcher[1] isn't
complete: it might have avoided this problem here, but it would also
have required major work for the draw helpers to work in the first
place.
[1] https://codereview.qt-project.org/c/qt/qtbase/+/537384
Pick-to: 6.8 6.7 6.5 6.2
Fixes: QTBUG-129193
Change-Id: Ia427a9e502b0fb46b2bdfffda8e2131b7091c9e9
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
Apparently there used to be a mechanism where an alias element in a
top-level LDML element could serve to provide a parent locale as its
source attribute. That is long gone and, since at least a decade ago,
alias elements only ever appear in root.xml, with source="locale" and
a path that starts ../ (so is a relative XPath).
Ditch some complications (that I transcribed faithfully five-ish years
ago when transforming the scripts), replacing them with assertions
that check what's now documented in the LDML spec and confirmed by my
own grep-checks in the CLDR data. This incidentally made one prior
(weaker) check redundant, so I've now removed that from the look-up
for the tags that identify a locale. That look-up is only ever
performed after the DOM root nodes it uses have come through the scan
of locale roots that now does the stronger check.
Makes no difference to generated data.
Change-Id: I811ffbef5f5ecb69183d68fa8bda57281f2a579d
Reviewed-by: Mate Barany <mate.barany@qt.io>
Previously it re-ran only the failed tests, if a proper XML logfile was
written. But in the case of a "crash" caused by the watchdog for
QTEST_FUNCTION_TIMEOUT, the subsequent tests were not included in the
XML logfile, so they were never run.
Fixes: QTQAINFRA-5226
Change-Id: Ib4f0849fa2511bb34365fd901fd53c5a3e3ab293
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Do not print a warning about no errors in the logfile when
a logfile is not expected.
Task-number: QTQAINFRA-5084
Change-Id: I92f94452418738d31936d47362aa6090090af6de
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Since we now have quite a few "special" test wrappers, I took the
opportunity to refactor the code and add a couple of testcases too.
Change-Id: I20e1214351d71c1474be32f03d4218ae6bdd2277
Reviewed-by: Axel Spoerl <axel.spoerl@qt.io>
Reviewed-by: Toni Saario <toni.saario@qt.io>
On Unix, if target exists and it is a file, rename silently replaces it
if the user has permission. However, on Windows, if the target exists,
FileExistsError will be raised.
With replace, if target points to an existing file or empty directory,
it will be unconditionally replaced.
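
A minimal sketch of the difference, assuming the code uses pathlib; the
helper name and file names are made up for the example.

```python
import tempfile
from pathlib import Path

def install_generated(temp_file: Path, target: Path) -> None:
    """Move temp_file over target, overwriting any existing file."""
    # Path.replace() overwrites on every platform, whereas
    # Path.rename() raises FileExistsError on Windows if target exists.
    temp_file.replace(target)

with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "data_p.h"
    target.write_text("old\n", encoding="utf-8")
    fresh = Path(d) / "data_p.h.tmp"
    fresh.write_text("new\n", encoding="utf-8")
    install_generated(fresh, target)
    result = target.read_text(encoding="utf-8")
    leftover = fresh.exists()
```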
Pick-to: 6.8
Change-Id: I2774152fec78a00c4ca6c9d1b927e503df2f2e84
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Using the default system encoding, cldr2qlocalexml.py and
qlocalexml2cpp.py may terminate with encoding errors on Windows.
Warn the user to set the PYTHONUTF8 environment variable to 1 before
running those scripts to avoid encoding errors.
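
One way such a check could be written is sketched below; the function
name and the message wording are assumptions, not the scripts' actual
code.

```python
import sys

def utf8_warning(platform: str, utf8_mode: bool):
    """Return a warning string on Windows without UTF-8 mode, else None."""
    if platform.startswith("win") and not utf8_mode:
        return ("Warning: set the PYTHONUTF8 environment variable to 1 "
                "before running this script to avoid encoding errors.")
    return None

# sys.flags.utf8_mode is 1 when Python runs in UTF-8 mode
# (PYTHONUTF8=1 or -X utf8).
message = utf8_warning(sys.platform, bool(sys.flags.utf8_mode))
if message:
    print(message, file=sys.stderr)
```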
Pick-to: 6.8
Change-Id: I315a45072cb6ea516d3e9bb7613c6f251792ec59
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
The variable ianalist is not really used for anything; it was probably
meant to be ianaList.
Pick-to: 6.8
Change-Id: Ie9f42bf9716da28ee0017319dda96389c415ef4f
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
This amends commit 880d1aef99a6826c8dd690b13e1ca6ea5574f403 and
extends it to cover the testlocales program under util/.
Pick-to: 6.8 6.7 6.5
Task-number: QTBUG-121653
Change-Id: I3efadc69ce08810876f8e20aa4636c7624728153
Reviewed-by: Lucie Gerard <lucie.gerard@qt.io>
The aim is to provide CLDR-derived localized timezone display names,
enabling the MS, TZ and soon-to-be-added C++20 chrono::tzdb backends
to deliver localized names when ICU is not available.
Internal feature timezone_locale controls access to locale-appropriate
names for QTimeZone's backends, which can be taken from ICU if
available (even when not using its backend, as with TZ) or derived from
CLDR data. The Android, Darwin and ICU backends take care of this for
themselves, so do not need to enable this feature.
When the feature is enabled and ICU is not available, include data
extracted from CLDR from which to obtain the needed locale-appropriate
namings. This is extracted in the same process that updates
../text/qlocale_data_p.h, the various q*calendar_data_p.h and
qlocaleprivate_data_p.h (the last having now been brought into the
fold ready for this change).
This commit defines the locale-dependent data that complements an
earlier addition of locale-independent data in QTZP_data_p.h in the
QtTimeZoneCldr namespace. The new data goes in a QtTimeZoneLocale
namespace in qtimezonelocale* files, but is not included in this
commit, due to browsers having trouble displaying the full 12 MiB of
source code. That data compiles down to 2.3 MiB (compared to the
libicudata.so size of about 30 MB). Updated the CLDR
qt_attribution.json entry to include the new generated file.
In place of the full locale-dependent data, the present commit
substitutes minimal dummy data, with comments indicating the real
data's size. The expected failure of various testcases will only be
cleared when that data lands and the feature to activate it is
enabled.
The new data also include (in this commit) one IANA ID, Europe/Kirov,
listed in CLDR's bcp47/timezone.xml but neither as an alias nor with
any aliases, so missing from the alias data previously stored. The
addition of its naming data brings it in.
[ChangeLog][Third-Party Code] The data extracted from the Unicode
Consortium's Common Locale Data Repository (CLDR) now includes, on
platforms where this is otherwise unavailable, data on how different
locales name the world's various time-zones.
Task-number: QTBUG-68812
Task-number: QTBUG-84297
Task-number: QTBUG-112909
Task-number: QTBUG-114914
Task-number: QTBUG-115158
Task-number: QTBUG-122448
Change-Id: I3a823cc92844c380723412d12303714b9ec493ef
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
The data is very big but much of it is inherited by zones from those
that they map to via likely-subtag reduction, so omit the data where
it coincides with the result of such an inheritance; this shall
complicate the reading of the data, but saves dramatically on its
size, reducing it to "only" c. 2 MiB.
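
The idea can be sketched as follows. The helper names and the sample
strings are placeholders, and the real data layout is more involved than
a pair of dictionaries.

```python
def prune_inherited(own, inherited):
    """Keep only the entries that differ from what would be inherited."""
    return {key: value for key, value in own.items()
            if inherited.get(key) != value}

def lookup(key, pruned, inherited):
    """Read side: fall back to the inherited data when nothing is stored."""
    return pruned[key] if key in pruned else inherited.get(key)

# Placeholder names, not real CLDR content.
parent = {"Europe/Berlin": "Berlin", "Europe/Vienna": "Wien"}
child = {"Europe/Berlin": "Berlin", "Europe/Vienna": "Vienna"}
stored = prune_inherited(child, parent)
```

The saving comes from stored being much smaller than child; the cost is
that every read now has to go through something like lookup().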
Task-number: QTBUG-115158
Change-Id: I53ff13e29f1f73a551d73d75773373bb90673c8e
Reviewed-by: Mate Barany <mate.barany@qt.io>
This includes the data in the Locale objects read prior to writing
CLDR data out to relevant files. Actually writing the new data out
shall follow in a later commit.
Task-number: QTBUG-115158
Change-Id: Iaf1466242eb31e66d8ace0bec2ffe7554f66fc10
Reviewed-by: Mate Barany <mate.barany@qt.io>
This makes the XML file bigger by a factor of roughly 8, at about 30
MB. Code to read the new data out of it shall follow in a later
commit.
Task-number: QTBUG-115158
Change-Id: I7b9b6abe88be2457fa6cf0e8d7b6a68845136770
Reviewed-by: Mate Barany <mate.barany@qt.io>
As previously commented, mn_Mong_MN would end up with the same decimal
and group separators if we trusted its draft="contributed" amendment
to the decimal separator. I reported this to the CLDR folk some time
ago and they now have a Jira ticket for it, which turns out to be a
duplicate, so we can track them and know when to remove my
hack-around.
Change-Id: Ib8f49dbdce090393ad20cd50969d6323818ee4ff
Reviewed-by: Mate Barany <mate.barany@qt.io>
Expand unicode data to include information needed to
parse emoji sequences. This is a pre-requisite for
automatically preferring color fonts for emojis.
As a drive-by, this also fixes a double space in the
output of the uc_properties array.
Task-number: QTBUG-111801
Change-Id: Icd993803c87c69ed278c7724377028f3706d0272
Reviewed-by: Eirik Aavitsland <eirik.aavitsland@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
This also expands the IANA ID table (by about 5 KiB) even when the
feature is inactive, since it includes all IANA zones referenced by
the new data, as well as those for which CLDR has aliases.
Add code to QTZlocale.cpp to use this locale-independent data. This
shall need to be expanded once locale-dependent data is also available.
Task-number: QTBUG-115158
Change-Id: I720f10cb9ae4cf87dfd8bb66af965a45d49c389a
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This is the locale-independent part of the data, for inclusion in
qtimezoneprivate_data_p.h in some form.
Task-number: QTBUG-115158
Change-Id: Ic46f53dd22d45ddc999633bc1bb4a0a3cf6d5112
Reviewed-by: Mate Barany <mate.barany@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Use CDATA when outside ASCII. Share the attribute-packing code for an
open-tag in a static method. In passing, tweak a comment's text.
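
A rough sketch of the CDATA rule, assuming ASCII text is escaped as
usual; text_node is a made-up name, not the writer's actual method.

```python
from xml.sax.saxutils import escape

def text_node(text: str) -> str:
    """Serialize element text: escaped if ASCII, CDATA otherwise."""
    if text.isascii():
        return escape(text)
    # A CDATA section cannot contain "]]>", so split it if present.
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"
```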
Change-Id: Ic8b75afc56d537a1a51d13797c737d4bfcc1f910
Reviewed-by: Mate Barany <mate.barany@qt.io>
The msLandZones/territorycode and msZoneIana/iana can become
attributes of their parent nodes instead of child elements, as
territory codes and IANA IDs are plain ASCII and not unduly long.
In the process, rename territorycode to territory.
Change-Id: Iab9901da01d15abc8c5db7a7d57f925fce8bb521
Reviewed-by: Mate Barany <mate.barany@qt.io>
Replacing elements for the alias and IANA ID with attributes makes the
table more compact, albeit the ComodRivadavia line is a little long.
(Some existing msLandZones/ianaids lines are longer, though.)
Change-Id: Iab2b55a21857402ad7c863ef33abd241f1d58a8d
Reviewed-by: Mate Barany <mate.barany@qt.io>
This makes the likely subtag part of the file more compact.
Introduces a QLocaleXmlWriter.asTag() for attribute-only elements;
this requires the Spacer to recognize self-closing elements as not
increasing the indent needed.
Change-Id: I1b73b755f9841617a5c002cf624785321e808d0c
Reviewed-by: Mate Barany <mate.barany@qt.io>
The existing naming lists provide the needed mapping and this prepares
the way to move the language, script and territory into the from and
to elements as attributes, saving some file-size. It incidentally
pushes the mapping to enum values upstream and simplifies the
downstream processing.
Change-Id: I8f6d2615d52b14d46d1b795539c71f8afdc310ca
Reviewed-by: Dennis Oberst <dennis.oberst@qt.io>
These were written (and empty for the Any* enum members) but never
read. We, in any case, infer what we need from the enum members, via
the languageList, scriptList and territoryList elements.
In the process, add a comment between the fromXml() and toXml()
methods of Locale to remind those editing the code to also edit the
schema describing the XML.
Change-Id: Ie5e51f594c2636802eefd8159954105718d9af52
Reviewed-by: Øystein Heskestad <oystein.heskestad@qt.io>
Reviewed-by: Mate Barany <mate.barany@qt.io>
Two FIXME comments related to the misnaming of string data tables: a
table of space-joined lists of IANA IDs was named ianaIdData, as a
result of which a table of single IANA IDs (and some aliases) was
named aliasIdData. A field in one struct was an index into the former
even though its values were actually single IANA IDs.
So rename the list data table to ianaListData, reusing its old name
for the former ianaIdData, and transfer the single-ID data from the
ID-list table to the single-ID table. Moving that data changed
indexing into both string tables and thus all of the data-tables
referencing these tables.
Task-number: QTBUG-115158
Change-Id: I84165736e91d0bf127f3f9f3b95e9c3060a30c12
Reviewed-by: Mate Barany <mate.barany@qt.io>
This was in fact present in v44, but we overlooked it somehow. The new
version also fixes some inconsistencies in the data that I reported
against v44.1; in particular, Tamil no longer claims to override the
root AM/PM markers (probably because it uses 24-hour time so doesn't
need them).
Add the test-file under util to the list of files containing generated
content.
[ChangeLog][Third-Party Code] Updated CLDR data, used by QLocale, to
v45.
Task-number: QTBUG-126060
Pick-to: 6.8 6.7 6.5 6.2
Change-Id: I81a5bcca49519b55091fc541de6b73b606661bb4
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This means LocaleDataWriter.likelySubtags() now only gets an iterable,
so doesn't know when it's on the last item to skip the comma after it,
but that seems to be acceptable in modern C++.
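
A minimal sketch of writing a table from a plain iterable; write_table
and the table layout are illustrative, not the LocaleDataWriter code.

```python
import io

def write_table(out, name, rows):
    """Emit a C++ array initializer; every row ends in a comma."""
    out.write(f"static constexpr int {name}[][2] = {{\n")
    for a, b in rows:  # rows may be a generator; its length is unknown
        out.write(f"    {{ {a}, {b} }},\n")
    out.write("};\n")

buf = io.StringIO()
write_table(buf, "demo", ((i, i * i) for i in range(3)))
text = buf.getvalue()
```

The comma after the last row is harmless, since C++ permits a trailing
comma in a braced initializer list.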
Change-Id: I9d3bb9af3bb46b28b7a2529e27ab72a72c358503
Reviewed-by: Mate Barany <mate.barany@qt.io>
The id and code are reliably pure ASCII with no special characters, so
can safely be expressed as attributes. Extend the reader and writer
classes to handle using attributes on a simple text element.
This leaves only the name as text content, so skip the extra
<name>...</name> layer. As the resulting element is inside a *List
element that tells us whether it's a language, script or territory we
don't need to have different elements and can unify them all as simply
a <naming id="..." code="...">...</naming> element. This makes these
sections of the XML file considerably terser, with no change to the
generated data.
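
For illustration, reading such elements back might look like this. The
sample id and code values are invented, and read_namings is a
hypothetical helper rather than the reader class's API.

```python
import xml.etree.ElementTree as ET

# Invented sample; real IDs and codes come from the generated file.
SAMPLE = """\
<languageList>
  <naming id="1" code="xx">FirstExample</naming>
  <naming id="2" code="yy">SecondExample</naming>
</languageList>
"""

def read_namings(xml_text):
    """Return (id, code, name) triples from a *List element."""
    root = ET.fromstring(xml_text)
    return [(int(e.get("id")), e.get("code"), e.text)
            for e in root.iter("naming")]

namings = read_namings(SAMPLE)
```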
Change-Id: Id2e884f1d2713341524549cc49253eb33b5aa487
Reviewed-by: Mate Barany <mate.barany@qt.io>
One character instead of four adds up to a lot of saved bytes when a
file has many lines: and the timezone name L10n data is going to add a
lot of lines.
Task-number: QTBUG-115158
Change-Id: I856f3771266a70b7a9ef4078a9b4aecf42315831
Reviewed-by: Mate Barany <mate.barany@qt.io>
Make our encoding explicit and enable more tools to understand what
they're looking at.
Change-Id: I29327364a5eaac51eeda9a4fb3b8e9b7527ca488
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
Also move the CLDR version into the tag. The version numbers are plain
ASCII, with no special characters, so can safely be attributes. In
the process, fix a mistake in __openTag()'s handling of attributes;
join with plain space, no comma.
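
The attribute handling can be sketched like this; open_tag and the
attribute names are assumptions made for the example, not the writer's
real __openTag().

```python
from xml.sax.saxutils import quoteattr

def open_tag(tag, **attrs):
    """Build an opening tag, separating attributes with single spaces."""
    parts = [tag] + [f"{key}={quoteattr(value)}"
                     for key, value in attrs.items()]
    return "<" + " ".join(parts) + ">"

tag = open_tag("localeDatabase", versionCldr="45", versionQt="6.9")
```

Joining with ", " instead would put commas between attributes, which is
not well-formed XML.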
Having the Qt version in the XML makes it possible to assert
compatibility between the Qt version that generated it and the one
that's consuming it.
Change-Id: I6fa6b668b072ff3616955d81af2cffaba5b67250
Reviewed-by: Mate Barany <mate.barany@qt.io>
Support control over verbosity of output. For now just have
qlocalexml2cpp.py show a stack-trace when failing (and return on all
failures) and have cldr2qlocalexml.py route its information to stdout
(when not in use as the XML output stream, else stderr) or discard it
in quiet mode.
Change-Id: I58afd3a083794eae3a35f6e1235bd62c288fabcf
Reviewed-by: Mate Barany <mate.barany@qt.io>
6.6 is out and 6.8 is in.
Not picking this back, because the prior branches should not pick to
later branches.
Change-Id: Iaca586d2d5bafa195e2ea49730b9ee98d9ecf48b
Reviewed-by: Timur Pocheptsov <timur.pocheptsov@qt.io>
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
The duplicate entries just bulked up the intermediate file.
Makes no change to generated data.
Task-number: QTBUG-115158
Change-Id: I6dc0d1f79f8dcf2e46264c6f9d1ae06ff4c91394
Reviewed-by: Mate Barany <mate.barany@qt.io>
Move to construction time, instead of passing to each append() call;
the table's field sizes are, after all, the same for all entries.
Add support for larger tables by allowing more than 16-bit indices.
Task-number: QTBUG-115158
Change-Id: I8f1113482e80838c512da6353fa17b9f365f956a
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Mate Barany <mate.barany@qt.io>
It was setting *_code='0' for the Any* forms of language, script and
territory; this is wrong, the codes for these are all empty or other
special tokens (like 'und', 'Zzzz', 'ZZ'). The IDs for them are zero,
as an int not a string, but were omitted. Also add the variant
details, for all that they're currently unused, for consistency.
This makes no difference to the generated data.
Task-number: QTBUG-115158
Change-Id: I339d1b201e50e2bbc510758ffbbaae0fa02277d4
Reviewed-by: Mate Barany <mate.barany@qt.io>
The qlocalexml.py Locale.C() had to replicate a whole lot of data that
isn't really relevant to how C differs from en_US and every addition
to what we support required further additions to it. So pass the en_US
Locale object to the pseudoconstructor so that C can inherit from it
and only override the parts where we care about the difference.
Hand-code shortening for short Jalali month names, to match Soroush's
original contribution, and include the narrow forms in the hard-coded
data to keep the generated data unchanged (for now). Note some of the
departures from CLDR; we may want to drop these overrides later.
In the process, convert the mapping from keys to locales to
consistently use IDs for all members of the key, instead of using the
(empty) code value for (as yet unused) variant; it now gets ID 0 and
is consistent with returns from codesToIdNames(). This makes life
easier for the code that now has to construct an en_US key.
Task-number: QTBUG-115158
Change-Id: I3d7acb6a4059daec1bba341fcf015c39c7a6803b
Reviewed-by: Kai Köhne <kai.koehne@qt.io>
Omit parentheses round what python will form into a tuple anyway.
Include trailing commas on last entries of tuples so that adding future
entries doesn't drag the existing line into their diffs.
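
Both conventions in a small, made-up example:

```python
def split_tag(tag):
    """Split 'en_US' into its language and territory parts."""
    language, _, territory = tag.partition("_")
    return language, territory  # already a tuple; no parentheses needed

FIELDS = (
    "language",
    "script",
    "territory",  # trailing comma: a new entry adds one line to the diff
)
```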
Let the writer's tag-opener handle attributes, if supplied.
Clean up spacing in some doc-strings.
This is all preparation for further changes, to limit their diffs.
Change-Id: I989ae28bbd235b2af9c1d72467d4741c4f1f20ae
Reviewed-by: Mate Barany <mate.barany@qt.io>
Future work shall need the timezone alias data to be synchronized
between the (expanded) locale-independent timezone data and the
(coming) locale-dependent timezone data. The latter shall need to come
via QLocaleXml, hence the former now needs to, too.
This makes no change to the generated data, aside from changing the
regeneration instructions for qtimezoneprivate_data_p.h, to use the
same scripts as locale data, instead of cldr2qtimezone.py, which is
now removed.
Task-number: QTBUG-115158
Change-Id: I47ddd95f6af1855cbb1f601e9074c13f213cd61c
Reviewed-by: Mate Barany <mate.barany@qt.io>
It's trivial to do - and done when generating our compiled data
tables, so makes no difference to users - but makes the offset list
table simpler. Reformat the list so that the fragment-of-hour offsets
are clearly distinguished from the whole-hour ones.
Change-Id: I6e0ea23dc317542b3256e88492e4073faedef1d7
Reviewed-by: Friedemann Kleint <Friedemann.Kleint@qt.io>
It was originally (without any comment to this effect, either in the
code or the commit message) just the list of offset-zones
corresponding to known Windows zones' offsets, augmented to include
each whole hour offset out to ±14 hours. Absent documentation, of
course, this was not maintained.
Added the four offset zones implied by that, that hadn't been added
when new entries joined the Windows IDs with novel offsets. Check,
after scanning CLDR for Windows data, that this has been kept up to
date. Updated the generated data.
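
Such a check might be sketched as below. Holding offsets as seconds and
the function names are assumptions; the sample Windows offsets are made
up.

```python
HOUR = 3600  # assumption: offsets are held as seconds east of UTC

def expected_offsets(windows_offsets):
    """Every whole hour out to +/-14 h, plus each Windows zone offset."""
    whole_hours = {h * HOUR for h in range(-14, 15)}
    return whole_hours | set(windows_offsets)

def missing_offsets(known_offsets, windows_offsets):
    """Offsets that should be in the offset-zone list but are not."""
    return sorted(expected_offsets(windows_offsets) - set(known_offsets))

known = [h * HOUR for h in range(-14, 15)]
windows = [0, 19800, -12600]  # made-up sample: 0, +05:30, -03:30
missing = missing_offsets(known, windows)
```

A non-empty result is the signal that the list needs to be brought up
to date.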
Change-Id: I3cf3932c320876f7f2f74840d8c3951be49cfe70
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The CLDR's "IANA" IDs may (for the sake of stability) date back to
before IANA's own naming has been updated. As a result, the "IANA" IDs
we were using were in some cases out of date. CLDR does provide a
mapping from its stable IDs to all aliases and the current IANA name
for each (which I shall soon be needing in other work), so use that to
map the CLDR IDs to contemporary IANA ones.
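
A minimal sketch of that mapping step; modernise is a hypothetical name
and the one-entry table is only a sample (Asia/Calcutta being an older
name for what IANA now calls Asia/Kolkata).

```python
def modernise(cldr_ids, alias_to_iana):
    """Map CLDR's stable IDs to current IANA names; others pass through."""
    return [alias_to_iana.get(zone, zone) for zone in cldr_ids]

# Sample mapping only; the real table is read from CLDR.
alias_to_iana = {"Asia/Calcutta": "Asia/Kolkata"}
current = modernise(["Asia/Calcutta", "Europe/Oslo"], alias_to_iana)
```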
Revise the documentation of CldrAccess.readWindowsTimeZones() to take
this into account, pass it the alias mapping from the table, use that
to map IDs internally and, in passing, rename a variable. Update
cldr2qtimezone.py to match the new CldrAccess methods and regenerate
the data.
Change-Id: I23d8a7d048d76392099d125376b544a41faf7eb3
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Mate Barany <mate.barany@qt.io>