This fixes some read-past-end issues in assertions that verify the
next row of a table, after the last for a specific locale, belongs to a
later locale. Since those assertions happen without sight of the table
of which the locale's range is a part, they can't tell when the
range's end is in fact the table's end - so they shouldn't have been
reading from a row there. Fix by putting a row there, that belongs to
a nominal locale with index out of range.
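
As a minimal sketch of the idea, in Python (the language of the generator
scripts): the row layout and the names with_sentinel and check_range_end
are hypothetical, chosen only to illustrate why a terminal row makes the
read safe. It is not the generated C++ or its assertions.

```python
# Hypothetical layout: each row is (locale_index, payload).
def with_sentinel(rows, locale_count):
    """Return rows plus one terminal row owned by a nominal locale.

    The sentinel's locale index is locale_count, one past the last valid
    index, so it cannot be mistaken for a real locale's data.
    """
    return list(rows) + [(locale_count, None)]


def check_range_end(table, end, locale_index):
    """Check that the row just past a locale's range belongs to a later locale.

    This reads table[end] without knowing the table's length; that is only
    safe because the sentinel guarantees a row exists there.
    """
    return table[end][0] > locale_index


rows = [(0, "a"), (0, "b"), (1, "c")]
table = with_sentinel(rows, locale_count=2)
```

Here the last locale's range ends at index 3, which is the table's end; the
sentinel row is what check_range_end(table, 3, 1) reads.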
Pick-to: 6.9
Change-Id: Ib9d227ca4f86c372c13f963a08a8d637eae63ed0
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
Add type annotations to CalendarDataWriter, TestLocaleWriter and
LocaleHeaderWriter.
Task-number: QTBUG-128634
Pick-to: 6.8
Change-Id: I2c9168fda9cb79cbef3e7ef32ec67270ce168a1b
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Add some type hints to unicode2hex as well, it is used by
ByteArrayData.
Task-number: QTBUG-128634
Pick-to: 6.8
Change-Id: I86b7ce8567483bf8a4d4db78c9585652526cb90d
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Unify generation of the data on enum-related names (and their indices)
with that of the data for the corresponding codes. This produces the
same tables, just in a different order, putting each code table right
after the name table and its indexing.
It'll mean more conflicts on picking future updates back to 6.8 and
before, but those should usually involve regenerating data anyway,
even when they don't get (visible) conflicts, so this'll just
encourage doing that.
As the TODO comment noted, the reason for keeping the table separate
was just that, during a major rewrite of the scripts (most of five
years ago), I wanted to be sure data didn't change. We've stabilised
plenty since then, so it's time to do that clean-up.
Change-Id: I0c3ee9d41d85debdba8b8b2624f137fadb6d8a3f
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
The aim is to provide CLDR-derived localized timezone display names,
enabling the MS, TZ and soon-to-be-added C++20 chrono::tzdb backends
to deliver localized names when ICU is not available.
Internal feature timezone_locale controls access to locale-appropriate
names for QTimeZone's backends, which can be taken from ICU if
available (even when not using its backend, as with TZ) or derived from
CLDR data. The Android, Darwin and ICU backends take care of this for
themselves, so do not need to enable this feature.
When the feature is enabled and ICU is not available, include data
extracted from CLDR from which to obtain the needed locale-appropriate
namings. This is extracted in the same process that updates
../text/qlocale_data_p.h, the various q*calendar_data_p.h and
qlocaleprivate_data_p.h (the last having now been brought into the
fold ready for this change).
This commit defines the locale-dependent data that complements an
earlier addition of locale-independent data in QTZP_data_p.h in the
QtTimeZoneCldr namespace. The new data goes in a QtTimeZoneLocale
namespace in qtimezonelocale* files, but is not included in this
commit, due to browsers having trouble displaying the full 12 MiB of
source code. That data compiles down to 2.3 MiB (compared to the
libicudata.so size of about 30 MB). Updated the CLDR
qt_attribution.json entry to include the new generated file.
In place of the full locale-dependent data, the present commit
substitutes minimal dummy data, with comments indicating the real
data's size. The expected failure of various testcases will only be
cleared when that data lands and the feature to activate it is
enabled.
The new data also include (in this commit) one IANA ID, Europe/Kirov,
listed in CLDR's bcp47/timezone.xml but neither as an alias nor with
any aliases, so missing from the alias data previously stored. The
addition of its naming data brings it in.
[ChangeLog][Third-Party Code] The data extracted from the Unicode
Consortium's Common Locale Data Repository (CLDR) now includes, on
platforms where this is otherwise unavailable, data on how different
locales name the world's various time-zones.
Task-number: QTBUG-68812
Task-number: QTBUG-84297
Task-number: QTBUG-112909
Task-number: QTBUG-114914
Task-number: QTBUG-115158
Task-number: QTBUG-122448
Change-Id: I3a823cc92844c380723412d12303714b9ec493ef
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
The data is very big but much of it is inherited by zones from those
that they map to via likely-subtag reduction, so omit the data where
it coincides with the result of such an inheritance; this shall
complicate the reading of the data, but saves dramatically on its
size, reducing it to "only" c. 2 MiB.
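
A rough Python sketch of this kind of pruning follows. The parent_of
mapping stands in for whatever the likely-subtag reduction yields; it, the
sample names and both functions are assumptions made for illustration, not
the scripts' actual data structures.

```python
def lookup(data, parent_of, key, name):
    """Find name for key, falling back along the chain of parents."""
    while key is not None:
        if name in data.get(key, {}):
            return data[key][name]
        key = parent_of.get(key)
    return None


def prune_inherited(data, parent_of):
    """Drop each entry whose value is what inheritance would supply anyway."""
    return {
        key: {name: value for name, value in entries.items()
              if lookup(data, parent_of, parent_of.get(key), name) != value}
        for key, entries in data.items()
    }


# Hypothetical sample: the child repeats one of its parent's names.
data = {
    "parent": {"GMT": "Greenwich Mean Time"},
    "child": {"GMT": "Greenwich Mean Time", "BST": "British Summer Time"},
}
parent_of = {"child": "parent"}
pruned = prune_inherited(data, parent_of)
```

Reading the pruned data then has to go through a fallback such as lookup(),
which is the complication the message mentions.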
Task-number: QTBUG-115158
Change-Id: I53ff13e29f1f73a551d73d75773373bb90673c8e
Reviewed-by: Mate Barany <mate.barany@qt.io>
This also expands the IANA ID table (by about 5 KiB) even when the
feature is inactive, since it includes all IANA zones referenced by
the new data, as well as those for which CLDR has aliases.
Add code to QTZlocale.cpp to use this locale-independent data. This
shall need to be expanded once locale-dependent data is also available.
Task-number: QTBUG-115158
Change-Id: I720f10cb9ae4cf87dfd8bb66af965a45d49c389a
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Two FIXME comments related to the misnaming of string data tables: a
table of space-joined lists of IANA IDs was named ianaIdData, as a
result of which a table of single IANA IDs (and some aliases) was
named aliasIdData. A field in one struct was an index into the former
even though its values were actually single IANA IDs.
So rename the list data table to ianaListData, reusing its old name
for the former aliasIdData, and transfer the single-ID data from the
ID-list table to the single-ID table. Moving that data changed
indexing into both string tables and thus all of the data-tables
referencing these tables.
Task-number: QTBUG-115158
Change-Id: I84165736e91d0bf127f3f9f3b95e9c3060a30c12
Reviewed-by: Mate Barany <mate.barany@qt.io>
This means LocaleDataWriter.likelySubtags() now only gets an iterable,
so doesn't know when it's on the last item to skip the comma after it,
but that seems to be acceptable in modern C++.
Change-Id: I9d3bb9af3bb46b28b7a2529e27ab72a72c358503
Reviewed-by: Mate Barany <mate.barany@qt.io>
Support control over verbosity of output. For now just have
qlocalexml2cpp.py show a stack-trace when failing (and return on all
failures) and have cldr2qlocalexml.py route its information to stdout
(when not in use as the XML output stream, else stderr) or discard it
in quiet mode.
Change-Id: I58afd3a083794eae3a35f6e1235bd62c288fabcf
Reviewed-by: Mate Barany <mate.barany@qt.io>
The duplicate entries just bulked up the intermediate file.
Makes no change to generated data.
Task-number: QTBUG-115158
Change-Id: I6dc0d1f79f8dcf2e46264c6f9d1ae06ff4c91394
Reviewed-by: Mate Barany <mate.barany@qt.io>
Move to construction time, instead of passing to each append() call;
the table's field sizes are, after all, the same for all entries.
Add support for larger tables by allowing more than 16-bit indices.
Task-number: QTBUG-115158
Change-Id: I8f1113482e80838c512da6353fa17b9f365f956a
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Mate Barany <mate.barany@qt.io>
Future work shall need the timezone alias data to be synchronized
between the (expanded) locale-independent timezone data and the
(coming) locale-dependent timezone data. The latter shall need to come
via QLocaleXml, hence the former now needs to, too.
This makes no change to the generated data, aside from changing the
regeneration instructions for qtimezoneprivate_data_p.h, to use the
same scripts as locale data, instead of cldr2qtimezone.py, which is
now removed.
Task-number: QTBUG-115158
Change-Id: I47ddd95f6af1855cbb1f601e9074c13f213cd61c
Reviewed-by: Mate Barany <mate.barany@qt.io>
Include documentation in both, using common phrasing. Take sys.argv as
a parameter, along with sys.stdout and sys.stderr, so that we can
invoke them from python when importing into a python session to debug
or test. Supply the script name to the argument parser as prog, so it
can correctly report it and forward the rest of argv to parse_args().
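
The pattern described can be sketched as follows. The option names and the
messages are hypothetical; only the shape of main() and the use of
argparse's prog and parse_args() reflect the description above.

```python
import argparse


def main(argv, out, err):
    """Entry point taking argv and the output streams as parameters."""
    # argv[0] is the script name; give it to the parser as prog so that
    # usage and error messages name the script correctly.
    parser = argparse.ArgumentParser(
        prog=argv[0], description="Hypothetical example script.")
    parser.add_argument("input", help="hypothetical input file name")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="suppress informational output")
    # Forward the rest of argv instead of letting argparse read sys.argv.
    args = parser.parse_args(argv[1:])
    if not args.quiet:
        err.write(f"Reading {args.input}\n")
    out.write(f"processed {args.input}\n")
    return 0

# A script would end with:
#     sys.exit(main(sys.argv, sys.stdout, sys.stderr))
```

From a python session, main(["demo.py", "x.xml"], out, err) can then be
called with io.StringIO() objects to capture what the script writes.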
Remove comments anticipating one of the several calendars we don't yet
support; the existing entries suffice to make clear what shall be
needed when we get round to adding more.
Change-Id: I2cebd385679e3c84d4ccf899e60091ac823ad10d
Reviewed-by: Mate Barany <mate.barany@qt.io>
This old test program has bitrotted due to not being autogenerated as
part of CLDR updates. Amend qlocalexml2cpp.py to regenerate it and do
such an update. It was still using Qt5's QLocale enum numeric values,
many of which have changed in Qt6. Actually fixing the code so that it
compiles and runs can wait for a later commit.
Inspired by a patch supplied by Kizito Birabwa.
Task-number: QTBUG-124200
Change-Id: I33811313976a4860aad6d7b5b88a40c5b111a4fe
Reviewed-by: Mate Barany <mate.barany@qt.io>
It has many grumbles about spacing, but at least this code is
currently consistent about its departure from PEP8's spacing rules
(and closer to Qt's) for the present. We can review whether to do a
drastic spacing revolution later.
Change-Id: Ife4e8a5b02b63434bd9c7ac7ba4cbc11b6311f9f
Reviewed-by: Mate Barany <mate.barany@qt.io>
Various comments need to continue using the enumdata.py names, as they
associate data with particular enum members, but we can now correctly
use the en.xml versions of their names when we report them, rather
than the enum-friendly names we use in the code. Since this now means
the data may stray outside plain ASCII - it'll be UTF-8-encoded - this
implies replacing the QLatin1StringView()s of the code that formerly
read this data with QString::fromUtf8().
Fixes: QTBUG-94460
Change-Id: Id3b08875a46af58c0555c3e303b0e15a19441509
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The former needed the latter's .dupes to do the job, so can now just
take a method as a tool to do the job instead, letting .dupes become
private. In the process refine the munging to free enumdata.py from
having to capitalize each word in its names. This will, in due course,
let us use more natural forms in various comments. This causes no
change to generated data.
Update enumdata.py's introduction doc, mainly to reflect this but also
fixing the out-of-date names (old *_list have long been *_map) and
adding some details to other paragraphs.
Task-number: QTBUG-94460
Change-Id: If195b2e94a53a495fc4f1f216bed07a910439fa7
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
The string data tables have a mix of digit-length tokens, up to
four-digit hex; we can fit 12 of those per line within our margins.
Leave the one-row-per-locale tables as they are, though, despite long
lines.
Change-Id: I655fddecf24133c26d16187b7a5a8fbc25553e07
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
Pack some of the arrays that contain locale data more tightly. The
AlphaCode struct is a char[4] but always holds only [a-z]{,3} which
could be fit into 16 bits, halving the size of an AlphaCode struct.
With the new constructor the initialization of the AlphaCode struct
also changes - modify qlocalexml2cpp.py to reflect this change and
regenerate the languageCodeList.
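
To see why [a-z]{,3} fits in 16 bits: each letter needs only 5 bits, so
three letters take 15. The Python sketch below uses one possible encoding
(0 for an absent letter, 1 to 26 for 'a' to 'z'); this encoding and the
function names are assumptions, not necessarily what the AlphaCode
constructor does.

```python
def pack_alpha(code):
    """Pack up to three lowercase ASCII letters into a 16-bit integer."""
    if len(code) > 3 or any(not "a" <= ch <= "z" for ch in code):
        raise ValueError(f"not [a-z]{{,3}}: {code!r}")
    value = 0
    for i in range(3):
        # 0 marks an absent letter; 1..26 stand for 'a'..'z'.
        letter = ord(code[i]) - ord("a") + 1 if i < len(code) else 0
        value = (value << 5) | letter
    return value


def unpack_alpha(value):
    """Recover the letters from a value produced by pack_alpha()."""
    letters = []
    for shift in (10, 5, 0):
        letter = (value >> shift) & 0x1F
        if letter:
            letters.append(chr(letter - 1 + ord("a")))
    return "".join(letters)
```

The largest packed value, for "zzz", is 27482, comfortably below 65536.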
Fixes: QTBUG-105050
Change-Id: I2b1e93ab7cc3f2d667bf67b45769b74a15211931
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
We can easily enough obtain the root of the present source tree using
the value of __file__, so might as well do so.
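
A small sketch of how that can be done with pathlib. The depth of two
directories below the root is an assumption based on the scripts living in
util/locale_database/, and source_root is a hypothetical helper name.

```python
from pathlib import Path


def source_root(script):
    """Return the source tree's root, given a script's own path.

    Assumes the script lives two directories below the root, as in
    <root>/util/locale_database/<script>.
    """
    return Path(script).resolve().parents[2]

# Inside such a script one would call: source_root(__file__)
```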
Change-Id: If14773ac1127278b6018a090c0b376437b9c6eec
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
Replace the current license disclaimer in files by
a SPDX-License-Identifier.
Files that have to be modified by hand are modified.
License files are organized under LICENSES directory.
Task-number: QTBUG-67283
Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
This commit extends functionality for QLocale::codeToLanguage()
and QLocale::languageToCode() by adding an additional argument
that allows selection of the ISO 639 code-set to consider for
those operations.
The following ISO 639 codes are supported:
* Part 1
* Part 2 bibliographic
* Part 2 terminological
* Part 3
As a result of this change the codeToLanguage() overload without
the additional argument now returns a Language value if it matches
any known code. Previously a valid language was returned only if
the function argument matched the first code defined for that
language from the above list.
[ChangeLog][QtCore][QLocale] Added overloads for codeToLanguage()
and languageToCode() that support specifying which ISO 639 codes
to consider.
Fixes: QTBUG-98129
Change-Id: I4da8a89e2e68a673cf63a621359cded609873fa2
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Use context manager interface (with statement) to atomically update source
files. This ensures that all files are properly closed and the temporary
file is removed even in case of errors.
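
A minimal sketch of such a context manager is below; atomic_update is a
hypothetical name and this is not the scripts' own class, just one way to
get the behaviour described using the standard library.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def atomic_update(path):
    """Yield a temporary file; replace path with it only on success."""
    path = Path(path)
    # Create the temporary file beside the target so the final rename
    # stays on one file-system.
    fd, temp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as temp:
            yield temp
        # Reached only if the body of the with statement did not raise.
        os.replace(temp_name, path)
    except BaseException:
        # On any error, remove the temporary file and leave path untouched.
        os.unlink(temp_name)
        raise
```

Usage is then "with atomic_update(target) as f: f.write(...)"; the target
only changes once the whole body has completed.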
Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I18cd96f1d03e467dea6212f6576a41e65f414ce1
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
pathlib's API is more modern and easier to use than os.path. It
also allows distinguishing between paths and other strings in type
annotations.
Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Ie6d9b4e35596f7f6befa4c9635f4a65ea3b20025
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
argparse is the standard way to parse command line arguments in Python.
It provides help and usage information for free and is easier to extend
than a custom argument parser.
Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I1e4c9cd914449e083d01932bc871ef10d26f0bc2
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Replace most uses of str.format() and string arithmetic by f-strings.
This results in more compact code and the code is easier to read
when using an appropriate editor.
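
For illustration, the three spellings below build the same string; the
declaration being formatted is made up and is not taken from the scripts.

```python
name, size = "locale_data", 42

# str.format(), with positional placeholders:
formatted = "static const QLocaleData {}[{}];".format(name, size)
# String arithmetic, needing an explicit str() for the number:
concatenated = "static const QLocaleData " + name + "[" + str(size) + "];"
# f-string, with the values named where they appear:
interpolated = f"static const QLocaleData {name}[{size}];"
```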
Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I3409f745b5d0324985cbd5690f5eda8d09b869ca
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
This is the standard way to call base class methods in Python 3 and
it is shorter than the custom one used now.
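
This presumably refers to the built-in super(). The log does not show the
custom form that was replaced, so the explicit base-class call below is
just one common older spelling, and the class names are hypothetical.

```python
class Writer:
    def __init__(self, name):
        self.name = name


class ExplicitWriter(Writer):
    def __init__(self, name, width):
        # Older spelling: name the base class and pass self by hand.
        Writer.__init__(self, name)
        self.width = width


class SuperWriter(Writer):
    def __init__(self, name, width):
        # Python 3 spelling: no class name or self needed.
        super().__init__(name)
        self.width = width
```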
Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Ifaff591a46e92148fbf514856109ff794a50c9f7
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Instead of implementing all the intricacies of a cmp for the python
sort-function, support for which is due to be dropped at Python 3 in
any case, implement a much simpler key function that achieves the same
result.
In the process, eliminate the ugly kludge of setting an attribute on a
function to, in effect, communicate with it via a global. Instead,
instantiate a class, that wraps the value previously given to the
attribute and whose instance provides the key-function.
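
The shape of that change can be sketched as follows. The class name, the
defaults mapping and the ordering rule are illustrative assumptions, not
the scripts' actual key; the point is only that the instance carries the
data and is itself the key function.

```python
class LocaleKey:
    """Callable whose instance holds the data its key function needs."""

    def __init__(self, defaults):
        # Hypothetical: maps (language, script) to its default territory.
        self.defaults = defaults

    def __call__(self, key):
        language, script, territory = key
        # False sorts before True, so the default territory comes first
        # among the entries for one language and script.
        not_default = self.defaults.get((language, script)) != territory
        return language, script, not_default, territory


keys = [(1, 2, 3), (1, 1, 2), (1, 1, 5)]
ordered = sorted(keys, key=LocaleKey({(1, 1): 5}))
```

Because the key is a plain tuple, the resulting order is transitive by
construction, with no cmp function to get wrong.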
Thanks to Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io> for
pointing out that a key function is the way of the future - and
sorted() is a nicer way to sort.
Pick-to: 6.2
Change-Id: Icf1ed5597fedf420d054fbc860e3e7fc6615875c
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
The ordering function used to sort the locale data generated for
QLocale attempted to sort the default territory for a given language
and script before other territories, but was too tangled for it to be
obvious this is what it was doing. The result turned out to be
non-transitive. Replace with code that implements the same preference
but only applies it where the result is compatible with transitivity.
This leads to a shuffling of the order of the Serbian-language
locales, which sorts the Cyrillic ones before the Latin ones. This is
consistent with my reading of the CLDR data, which fills in Cyrillic
and Serbia for Serbian; Serbian/Cyrillic/Serbia did previously sort
before all other Serbian variants.
Thanks to Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io> for
discovering the non-transitivity.
Pick-to: 6.2
Change-Id: I0ce9f78e620e714f980f32b85b7100ed0f92ad74
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
IOError does not have property 'message' in Python 3. Instead of
attempting to access it, just use the string representation of
the exception object. This produces the error message possibly combined
with additional arguments in both Python 2 and Python 3.
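
A short sketch of the portable form; the file name is made up, and the
exact wording of the message depends on the platform.

```python
try:
    open("/nonexistent/dir/file.xml")
except IOError as e:
    # str(e) works in Python 2 and 3; e.message exists only in Python 2.
    message = str(e)
```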
Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Icb198a409e7f80b832e474d8390b770fdeacc6c2
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Name 'stem' is undefined inside CalendarDataWriter.write(). The error
was reported by flake8.
Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Ib816b40d0bde2afd3112da76deee0ce39985693a
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Change the nomenclature used in the scripts and the QLocaleXML data
format to use "territory" and "territories" in place of "country" and
"countries". Does not change the generated source files.
Change-Id: I4b208d8d01ad2bfc70d289fa6551f7e0355df5ef
Reviewed-by: JiDe Zhang <zhangjide@uniontech.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The use of "Country" is misleading as some entries in the enumeration
are not countries (eg, HongKong), for all that most are. The Unicode
Consortium's Common Locale Data Repository (CLDR, from which QLocale's
data is taken) calls these territories, so introduce territory-based
names and prepare to deprecate the country-based ones in due course.
[ChangeLog][QtCore][QLocale] QLocale now has Territory as an alias for
its Country enumeration, and associated territory-based names to match
its country-named methods, to better match the usage in relevant
standards. The country-based names shall in due course be deprecated
in favor of the territory-based names.
Fixes: QTBUG-91686
Change-Id: Ia1ae1ad7323867016186fb775c9600cd5113aa42
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Follow through on a comment from 2012: sort the likely subtag array
(in the CLDR update script) and use bsearch to find entries in it.
This simplifies QLocaleXmlReader.likelyMap() slightly, moving the
detection of last entry to LocaleDataWriter.likelySubtags(), but
requires collecting all likely sub-tag mapping pairs (rather than just
passing them through from read to write via generators) in order to
sort them.
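
In Python terms, the sort-then-search idea looks roughly like this, with
bisect standing in for the C++ side's bsearch. The tuples of numeric codes
and both function names are hypothetical.

```python
from bisect import bisect_left


def sorted_likely(pairs):
    """Collect all (have, give) pairs, since sorting needs them all at once."""
    return sorted(pairs)


def find_likely(table, have):
    """Binary-search a sorted table of (have, give) pairs for have."""
    keys = [key for key, _ in table]  # rebuilt per call; fine for a sketch
    i = bisect_left(keys, have)
    if i < len(keys) and keys[i] == have:
        return table[i][1]
    return None


# Pairs may arrive in any order, for example from a generator:
pairs = (pair for pair in [((7, 0, 0), (7, 2, 9)), ((3, 0, 0), (3, 1, 4))])
table = sorted_likely(pairs)
```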
Change-Id: Ieb6875ccde1ddbd475ae68c0766a666ec32b7005
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Our enumdata.py namings of countries had fallen somewhat out of sync
with CLDR's names. In the process, support including hyphenation in
the unsquashed name, along with spacing. Distinguish, in comments,
between older renamings and those first seen in Qt6.
Change-Id: I91ec444bf35222ab6a9332e389ace19cca0e4fdf
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The code pervasively presumes their values can be held in a ushort, so
make sure the compiler knows we expect that to work (and doesn't
complain about narrowing when we do convert them to ushort).
Change-Id: Idde7be6cceee8a6dae333c5b1d5a0120fec32e4a
Reviewed-by: Andrei Golubev <andrei.golubev@qt.io>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Read three more values from CLDR and add a byte to the bit-fields at
the end of QLocaleData, indicating the three group sizes. This adds
three new parameters to various low-level formatting functions. At the
same time, rename ThousandsGroup to GroupDigits, more faithfully
expressing what this (internal) option means.
This replaces commit 27d139128013c969a939779536485c1a80be977e with a
fuller implementation that handles digit-grouping in any of the ways
that CLDR supports. The formerly "Indian" formatting now also applies
to at least some locales for Bangladesh, Bhutan and Sri Lanka.
Fixed Costa Rica currency formatting test that wrongly put a separator
after the first digit; the locale (in common with several Spanish
locales) requires at least two digits before the first separator.
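
To make the three group sizes concrete, here is a Python sketch of one
reading of them: least is the size of the lowest group, higher the size of
each group above it, and first the minimum number of digits that must
precede the first separator. The function and parameter names are
assumptions and this is not QLocale's implementation.

```python
def group_digits(digits, first=1, higher=3, least=3, separator=","):
    """Insert grouping separators into a string of digits."""
    i = len(digits) - least
    if i < first:
        # Too few digits before where the first separator would go.
        return digits
    parts = [digits[i:]]
    while i > higher:
        parts.append(digits[i - higher:i])
        i -= higher
    parts.append(digits[:i])
    return separator.join(reversed(parts))
```

With first=2, "1234" stays ungrouped while "12345" becomes "12.345" (using
"." as separator); with higher=2, "1234567" becomes "12,34,567".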
[ChangeLog][QtCore][Important Behavior Changes] Some locales require
more than one digit before the first grouping separator; others use
group sizes other than three. The latter was partially supported (only
for India) at 5.15 but is now systematically supported; the former is
now also supported.
Task-number: QTBUG-24301
Fixes: QTBUG-81050
Change-Id: I4ea4e331f3254d1f34801cddf51f3c65d3815573
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Conflicts:
examples/opengl/doc/src/cube.qdoc
src/corelib/global/qlibraryinfo.cpp
src/corelib/text/qbytearray_p.h
src/corelib/text/qlocale_data_p.h
src/corelib/time/qhijricalendar_data_p.h
src/corelib/time/qjalalicalendar_data_p.h
src/corelib/time/qromancalendar_data_p.h
src/network/ssl/qsslcertificate.h
src/widgets/doc/src/graphicsview.qdoc
src/widgets/widgets/qcombobox.cpp
src/widgets/widgets/qcombobox.h
tests/auto/corelib/tools/qscopeguard/tst_qscopeguard.cpp
tests/auto/widgets/widgets/qcombobox/tst_qcombobox.cpp
tests/benchmarks/corelib/io/qdiriterator/qdiriterator.pro
tests/manual/diaglib/debugproxystyle.cpp
tests/manual/diaglib/qwidgetdump.cpp
tests/manual/diaglib/qwindowdump.cpp
tests/manual/diaglib/textdump.cpp
util/locale_database/cldr2qlocalexml.py
util/locale_database/qlocalexml.py
util/locale_database/qlocalexml2cpp.py
Resolution of util/locale_database/ is based on:
https://codereview.qt-project.org/c/qt/qtbase/+/294250
and src/corelib/{text,time}/*_data_p.h were then regenerated by
running those scripts.
Updated CMakeLists.txt in each of
tests/auto/corelib/serialization/qcborstreamreader/
tests/auto/corelib/serialization/qcborvalue/
tests/auto/gui/kernel/
and generated new ones in each of
tests/auto/gui/kernel/qaddpostroutine/
tests/auto/gui/kernel/qhighdpiscaling/
tests/libfuzzer/corelib/text/qregularexpression/optimize/
tests/libfuzzer/gui/painting/qcolorspace/fromiccprofile/
tests/libfuzzer/gui/text/qtextdocument/sethtml/
tests/libfuzzer/gui/text/qtextdocument/setmarkdown/
tests/libfuzzer/gui/text/qtextlayout/beginlayout/
by running util/cmake/pro2cmake.py on their changed .pro files.
Changed target name in
tests/auto/gui/kernel/qaction/qaction.pro
tests/auto/gui/kernel/qaction/qactiongroup.pro
tests/auto/gui/kernel/qshortcut/qshortcut.pro
to ensure unique target names for CMake.
Changed tst_QComboBox::currentIndex to not test the
currentIndexChanged(QString), as that one does not exist in Qt 6
anymore.
Change-Id: I9a85705484855ae1dc874a81f49d27a50b0dcff7
It was causing all lines after the first, in each calendar's
locale_data[], to be over-indented. This only changes spacing.
Change-Id: Ibfc4986548eecbfdba2902cc18f44a2af669bc6d
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>