36 Commits

Author SHA1 Message Date
Edward Welbourne
af57b23b62 Move clearing of self-aliases upstream to QLocaleXmlWriter
The duplicate entries just bulked up the intermediate file.
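
A minimal sketch of what clearing self-aliases could look like, assuming the aliases are held as a plain mapping from alias to canonical name. The names `aliases` and `clear_self_aliases` and the sample entries are hypothetical, not taken from QLocaleXmlWriter:

```python
def clear_self_aliases(aliases):
    """Drop entries that map a name to itself; they carry no information."""
    return {alias: name for alias, name in aliases.items() if alias != name}

# Hypothetical sample data, for illustration only.
aliases = {'old-name': 'new-name', 'new-name': 'new-name'}
cleaned = clear_self_aliases(aliases)
```

Filtering before the data is written keeps the redundant entries out of the intermediate file rather than leaving each reader to discard them.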
Makes no change to generated data.

Task-number: QTBUG-115158
Change-Id: I6dc0d1f79f8dcf2e46264c6f9d1ae06ff4c91394
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-06-05 16:39:45 +02:00
Edward Welbourne
69aefa4edf Update C Locale constructor to match others on ids and codes
It was setting *_code='0' for the Any* forms of language, script and
territory; this is wrong, the codes for these are all empty or other
special tokens (like 'und', 'Zzzz', 'ZZ'). The IDs for them are zero,
as an int not a string, but were omitted. Also add the variant
details, for all that they're currently unused, for consistency.

This makes no difference to the generated data.

Task-number: QTBUG-115158
Change-Id: I339d1b201e50e2bbc510758ffbbaae0fa02277d4
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-06-02 15:26:05 +02:00
Edward Welbourne
0c809fc3b5 Derive C locale data from en_US, overriding minor details
The qlocalexml.py Locale.C() had to replicate a whole lot of data that
isn't really relevant to how C differs from en_US and every addition
to what we support required further additions to it. So pass the en_US
Locale object to the pseudoconstructor so that C can inherit from it
and only override the parts where we care about the difference.
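
The inherit-then-override idea can be sketched roughly as follows. This `Locale` class is a hypothetical stand-in, and the field names and overridden values are illustrative assumptions rather than the real contents of qlocalexml.py:

```python
class Locale:
    """Hypothetical stand-in for the Locale class in qlocalexml.py."""
    def __init__(self, **fields):
        self.__dict__.update(fields)

    @classmethod
    def C(cls, en_US):
        """Pseudoconstructor: start from en_US, override only what differs."""
        fields = dict(vars(en_US))  # copy, so en_US itself is left untouched
        fields.update(language='C', group='')  # illustrative overrides only
        return cls(**fields)

en_US = Locale(language='English', decimal='.', group=',')
c_locale = Locale.C(en_US)
```

Any field later added to en_US is then picked up by the C locale without further edits to the pseudoconstructor.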

Hand-code shortening for short Jalali month names, to match Soroush's
original contribution, and include the narrow forms in the hard-coded
data to keep the generated data unchanged (for now). Note some of the
departures from CLDR; we may want to drop these overrides later.

In the process, convert the mapping from keys to locales to
consistently use IDs for all members of the key, instead of using the
(empty) code value for (as yet unused) variant; it now gets ID 0 and
is consistent with returns from codesToIdNames(). This makes life
easier for the code that now has to construct an en_US key.

Task-number: QTBUG-115158
Change-Id: I3d7acb6a4059daec1bba341fcf015c39c7a6803b
Reviewed-by: Kai Köhne <kai.koehne@qt.io>
2024-06-02 15:25:52 +02:00
Edward Welbourne
72a7dddc25 QLocaleXML: Improve documentation, tidy up a bit
Omit parentheses round what python will form into a tuple anyway.
Include trailing commas on last entries of tuples so that adding future
entries doesn't drag the existing line into their diffs.
Let the writer's tag-opener handle attributes, if supplied.
Clean up spacing in some doc-strings.
This is all preparation for further changes, to limit their diffs.

Change-Id: I989ae28bbd235b2af9c1d72467d4741c4f1f20ae
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-06-02 15:25:36 +02:00
Edward Welbourne
9534341654 Integrate timezone data into the CLDR-via-QLocaleXml pipeline
Future work shall need the timezone alias data to be synchronized
between the (expanded) locale-independent timezone data and the
(coming) locale-dependent timezone data. The latter shall need to come
via QLocaleXml, hence the former now needs to, too.

This makes no change to the generated data, aside from changing the
regeneration instructions for qtimezoneprivate_data_p.h, to use the
same scripts as locale data, instead of cldr2qtimezone.py, which is
now removed.

Task-number: QTBUG-115158
Change-Id: I47ddd95f6af1855cbb1f601e9074c13f213cd61c
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-06-02 15:25:27 +02:00
Edward Welbourne
4e23dbb742 Add assorted notes and suggestions in util/locale_database/
Change-Id: I22534943f2c9710d501235672811a861a5fd3aea
Reviewed-by: Øystein Heskestad <oystein.heskestad@qt.io>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
2024-06-02 15:25:21 +02:00
Edward Welbourne
5f8dc8ea5f Purge an almost-redundant duplicate datetime format conversion
The QLocale XML reader was passing datetime formats through a format
conversion despite the data being converted at the point where we read
it from CLDR. It turns out this was needed because the long date and
time formats in our hard-coded data for the C Locale object used CLDR
format strings, unlike all other Locale objects. Fix those two formats
in the C locale and remove the redundant processing step. This, in
turn, enables the parser to include the date and time formats in its
general handling of most fields that it reads.

This does not result in any change to the generated data QLocale uses
(although it does change the intermediate QLocale XML file).

Task-number: QTBUG-115158
Change-Id: Iaf9da206158043dda2e9e5a3790f009b100e46b4
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-04-30 18:30:15 +02:00
Edward Welbourne
e08ca2c9c8 Fix spacing inconsistencies brought to light by flake8
It has many grumbles about spacing, but at least this code is
currently consistent about its departure from PEP8's spacing rules
(and closer to Qt's) for the present. We can review whether to do a
drastic spacing revolution later.

Change-Id: Ife4e8a5b02b63434bd9c7ac7ba4cbc11b6311f9f
Reviewed-by: Mate Barany <mate.barany@qt.io>
2024-04-23 20:51:19 +02:00
Edward Welbourne
cf0ebc9ad3 Fix typo in doc comment for QLocaleXmlWriter.close()
Change-Id: I128ed5e0ebd01a7ed1f3a3049d2b63f1df042562
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2024-04-22 18:56:20 +02:00
Edward Welbourne
f2a2379de8 Use dict comprehensions more in cldr.py and qlocalexml.py
They're a bit more readable than calling dict on a generator.
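
For illustration, the two styles side by side. The `rows` data is hypothetical and not taken from cldr.py or qlocalexml.py:

```python
# Hypothetical (id, name, code) rows, for illustration only.
rows = [(1, 'Alpha', 'al'), (2, 'Beta', 'be')]

# Before: calling dict on a generator of (key, value) pairs.
by_id_old = dict((ident, (name, code)) for ident, name, code in rows)

# After: the equivalent dict comprehension.
by_id_new = {ident: (name, code) for ident, name, code in rows}
```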

Change-Id: I3177e31b1f617b80d1cf5d5f83df7036fc0c4c01
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2024-04-22 18:56:20 +02:00
Edward Welbourne
1ae24f8b50 Use CLDR's names in QLocale::*ToName() for language, script, territory
Various comments need to continue using the enumdata.py names, as they
associate data with particular enum members, but we can now correctly
use the en.xml versions of their names when we report them, rather
than the enum-friendly names we use in the code. Since this now means
the data may stray outside plain ASCII - it'll be UTF-8-encoded - this
implies replacing the QLatin1StringView()s of the code that formerly
read this data with QString::fromUtf8().

Fixes: QTBUG-94460
Change-Id: Id3b08875a46af58c0555c3e303b0e15a19441509
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2023-08-09 17:53:42 +02:00
Edward Welbourne
743ceb7cc2 Move enum-name-munging from LocaleHeaderWriter to QLocaleXmlReader
The former needed the latter's .dupes to do the job, so can now just
take a method as a tool to do the job instead, letting .dupes become
private. In the process refine the munging to free enumdata.py from
having to capitalize each word in its names. This will, in due course,
let us use more natural forms in various comments. This causes no
change to generated data.

Update enumdata.py's introduction doc, mainly to reflect this but also
fixing the out-of-date names (old *_list have long been *_map) and
adding some details to other paragraphs.

Task-number: QTBUG-94460
Change-Id: If195b2e94a53a495fc4f1f216bed07a910439fa7
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
2023-08-09 17:53:26 +02:00
Edward Welbourne
37c5a9f20b Fix typos in QLocaleXmlWriter
The script and territory to exclude from reports about unused ones
were swapped, so we excluded a territory from the script list (which
didn't contain it anyway) and vice versa.

The test for whether to report used the non-existent .territories
attribute by mistake for .__territories.

Change-Id: I29e9d9f8f34883d7c3a5ac15470d9e7a0366e3db
Reviewed-by: Ievgenii Meshcheriakov <ievgenii.meshcheriakov@qt.io>
2023-08-01 15:35:57 +02:00
Lucie Gérard
05fc3aef53 Use SPDX license identifiers
Replace the current license disclaimer in files by
a SPDX-License-Identifier.
Files that have to be modified by hand are modified.
License files are organized under LICENSES directory.

Task-number: QTBUG-67283
Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
2022-05-16 16:37:38 +02:00
Ievgenii Meshcheriakov
41458fafa0 locale_database: Use f-strings in Python code
Replace most uses of str.format() and string arithmetic by f-strings.
This results in more compact code and the code is easier to read
when using an appropriate editor.
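
A small illustration of the kind of replacement described. The message text and variable names are hypothetical:

```python
kind, count = 'territory', 3

# Before: str.format() and string arithmetic.
old_format = 'Unused {} entries: {}'.format(kind, count)
old_concat = 'Unused ' + kind + ' entries: ' + str(count)

# After: an f-string with the values named inline.
new_fstring = f'Unused {kind} entries: {count}'
```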

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I3409f745b5d0324985cbd5690f5eda8d09b869ca
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-16 19:04:20 +02:00
Ievgenii Meshcheriakov
65a0e04072 locale_database: Add schema for intermediate locale data files
The schema is in RelaxNG Compact syntax. It can be used to validate
files produced by the cldr2qlocalexml.py script and also gives an
overview of the file format.

Change-Id: I344978f2201c5e67e236ab580a12ad33262f33cb
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-16 18:27:28 +02:00
Ievgenii Meshcheriakov
53382b7b07 locale_database: Don't use u prefix for strings in python files
This prefix is useless with Python 3.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: Ic008d53fe506865759e9a5003f439f7ac107b9e6
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-15 17:06:53 +02:00
Ievgenii Meshcheriakov
b02d17c5c0 Convert CLDR scripts to Python 3
The conversion is mostly done using the 2to3 script, with manual cleanup
afterwards.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I4d33b04e7269c55a83ff2deb876a23a78a89f39d
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-15 17:06:53 +02:00
Ievgenii Meshcheriakov
1887c4ecc1 locale_database: Sort lists of unused tags before printing
This way the output is easier to compare between versions.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: If4053c574c4ad200a179b06276bd889f2cb9e1c6
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-06 15:17:15 +02:00
Ievgenii Meshcheriakov
0b2646c495 locale_data: Add new line at the end of script output
Output of cldr2qlocalexml.py looks weird without the final new line.

Task-number: QTBUG-83488
Pick-to: 6.2
Change-Id: I5d675e475c57cdc8101887c39052007ba0a19857
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-07-06 14:21:39 +02:00
Edward Welbourne
1a49d7d1e0 Report unused enum members after CLDR data scan
We should at least know when members of QLocale's enums aren't adding
any value, and it may make sense to deprecate the unused ones.

Change-Id: Icf202f81d2a35904c13ccdc202d41985bcb3f2e6
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2021-06-07 17:14:14 +02:00
Edward Welbourne
e51831260a Nomenclature change: s/countr/territor/g in locale scripts
Change the nomenclature used in the scripts and the QLocaleXML data
format to use "territory" and "territories" in place of "country" and
"countries". Does not change the generated source files.

Change-Id: I4b208d8d01ad2bfc70d289fa6551f7e0355df5ef
Reviewed-by: JiDe Zhang <zhangjide@uniontech.com>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2021-05-26 18:00:01 +02:00
Edward Welbourne
21e0ef3ccf Rename util/locale_database/enumdata.py's various *_list to *_map
These variables provide mappings, not lists, so name them non-deceptively.

Change-Id: Idf15e78ad73790bc86dd8b9d4f248d1c4f73993c
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2021-05-26 18:00:01 +02:00
Edward Welbourne
181424d9b5 QLocaleXmlWriter.enumData(): move enumdata import to method from caller
The only reason cldr.py imported enumdata was so as to pass what it
imported to writer.enumData(); that method might as well do the import
itself.

Change-Id: Ie77dcd29058f926b8cca4deef35837f30505859f
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2021-05-26 18:00:01 +02:00
JiDe Zhang
50a7eb8cf7 Add the "Territory" enumerated type for QLocale
The use of "Country" is misleading as some entries in the enumeration
are not countries (eg, HongKong), for all that most are. The Unicode
Consortium's Common Locale Data Repository (CLDR, from which QLocale's
data is taken) calls these territories, so introduce territory-based
names and prepare to deprecate the country-based ones in due course.

[ChangeLog][QtCore][QLocale] QLocale now has Territory as an alias for
its Country enumeration, and associated territory-based names to match
its country-named methods, to better match the usage in relevant
standards. The country-based names shall in due course be deprecated
in favor of the territory-based names.

Fixes: QTBUG-91686
Change-Id: Ia1ae1ad7323867016186fb775c9600cd5113aa42
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2021-04-15 20:17:49 +08:00
Edward Welbourne
a9e4bf7eef Implement binary search in QLocale's likely sub-tag lookup
Follow through on a comment from 2012: sort the likely subtag array
(in the CLDR update script) and use bsearch to find entries in it.

This simplifies QLocaleXmlReader.likelyMap() slightly, moving the
detection of last entry to LocaleDataWriter.likelySubtags(), but
requires collecting all likely sub-tag mapping pairs (rather than just
passing them through from read to write via generators) in order to
sort them.
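
The sort-then-search approach can be sketched in Python as below. The lookup in QLocale itself uses bsearch, per the message above; this is only an analogue using the standard `bisect` module, and the ID triples are made-up values:

```python
from bisect import bisect_left

# Hypothetical (language, script, territory) ID triples mapped to their
# likely full forms; the numbers are invented for illustration.
pairs = [
    ((5, 0, 0), (5, 2, 7)),
    ((1, 0, 0), (1, 3, 9)),
    ((3, 0, 4), (3, 1, 4)),
]

# All pairs must be collected before they can be sorted; this is done once.
table = sorted(pairs)
keys = [key for key, _ in table]

def likely(key):
    """Binary-search the sorted table; return None when key is absent."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return table[i][1]
    return None
```

The need to sort is what prevents the pairs from being streamed straight from reader to writer through generators.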

Change-Id: Ieb6875ccde1ddbd475ae68c0766a666ec32b7005
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
2020-11-08 13:01:33 +01:00
Edward Welbourne
ed853a66f8 Simplify QLocaleXmlWriter::enumData()
Move the repeated List suffix to the __enumTable() helper, where half
the parameter's uses were having to snip it off anyway.

Change-Id: Ia396e87e59ceeb81fc4b0890a86934dc67da10cb
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-11-08 13:01:06 +01:00
Edward Welbourne
bb6a73260e Support digit-grouping correctly
Read three more values from CLDR and add a byte to the bit-fields at
the end of QLocaleData, indicating the three group sizes. This adds
three new parameters to various low-level formatting functions. At the
same time, rename ThousandsGroup to GroupDigits, more faithfully
expressing what this (internal) option means.

This replaces commit 27d139128013c969a939779536485c1a80be977e with a
fuller implementation that handles digit-grouping in any of the ways
that CLDR supports. The formerly "Indian" formatting now also applies
to at least some locales for Bangladesh, Bhutan and Sri Lanka.

Fixed Costa Rica currency formatting test that wrongly put a separator
after the first digit; the locale (in common with several Spanish
locales) requires at least two digits before the first separator.

[ChangeLog][QtCore][Important Behavior Changes] Some locales require
more than one digit before the first grouping separator; others use
group sizes other than three. The latter was partially supported (only
for India) at 5.15 but is now systematically supported; the former is
now also supported.
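
A rough sketch of grouping driven by three sizes. The parameter names `least`, `higher` and `first` are assumptions made for this example. The behaviour follows CLDR's description of a minimum number of digits before the first separator, and may differ in detail from the QLocale implementation:

```python
def group_digits(digits, least=3, higher=3, first=1, sep=','):
    """Insert sep into a string of digits.

    least:  size of the least-significant group
    higher: size of each further group
    first:  minimum number of digits needed to the left of the
            least-significant group before any separator is used
    """
    split = len(digits) - least
    if split < first:
        return digits  # too short to group at all
    groups = [digits[split:]]
    while split > higher:
        groups.append(digits[split - higher:split])
        split -= higher
    groups.append(digits[:split])
    return sep.join(reversed(groups))
```

With the defaults, `group_digits('1234567')` gives `'1,234,567'`. Passing `higher=2` gives the Indian-style `'12,34,567'`, and `first=2` leaves `'1234'` ungrouped while still grouping `'12345'` as `'12,345'`.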

Task-number: QTBUG-24301
Fixes: QTBUG-81050
Change-Id: I4ea4e331f3254d1f34801cddf51f3c65d3815573
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2020-07-14 14:52:08 +02:00
Qt Forward Merge Bot
8823bb8d30 Merge remote-tracking branch 'origin/5.15' into dev
Conflicts:
	examples/opengl/doc/src/cube.qdoc
	src/corelib/global/qlibraryinfo.cpp
	src/corelib/text/qbytearray_p.h
	src/corelib/text/qlocale_data_p.h
	src/corelib/time/qhijricalendar_data_p.h
	src/corelib/time/qjalalicalendar_data_p.h
	src/corelib/time/qromancalendar_data_p.h
	src/network/ssl/qsslcertificate.h
	src/widgets/doc/src/graphicsview.qdoc
	src/widgets/widgets/qcombobox.cpp
	src/widgets/widgets/qcombobox.h
	tests/auto/corelib/tools/qscopeguard/tst_qscopeguard.cpp
	tests/auto/widgets/widgets/qcombobox/tst_qcombobox.cpp
	tests/benchmarks/corelib/io/qdiriterator/qdiriterator.pro
	tests/manual/diaglib/debugproxystyle.cpp
	tests/manual/diaglib/qwidgetdump.cpp
	tests/manual/diaglib/qwindowdump.cpp
	tests/manual/diaglib/textdump.cpp
	util/locale_database/cldr2qlocalexml.py
	util/locale_database/qlocalexml.py
	util/locale_database/qlocalexml2cpp.py

Resolution of util/locale_database/ is based on:
https://codereview.qt-project.org/c/qt/qtbase/+/294250
and src/corelib/{text,time}/*_data_p.h were then regenerated by
running those scripts.

Updated CMakeLists.txt in each of
	tests/auto/corelib/serialization/qcborstreamreader/
	tests/auto/corelib/serialization/qcborvalue/
	tests/auto/gui/kernel/
and generated new ones in each of
	tests/auto/gui/kernel/qaddpostroutine/
	tests/auto/gui/kernel/qhighdpiscaling/
	tests/libfuzzer/corelib/text/qregularexpression/optimize/
	tests/libfuzzer/gui/painting/qcolorspace/fromiccprofile/
	tests/libfuzzer/gui/text/qtextdocument/sethtml/
	tests/libfuzzer/gui/text/qtextdocument/setmarkdown/
	tests/libfuzzer/gui/text/qtextlayout/beginlayout/
by running util/cmake/pro2cmake.py on their changed .pro files.

Changed target name in
	tests/auto/gui/kernel/qaction/qaction.pro
	tests/auto/gui/kernel/qaction/qactiongroup.pro
	tests/auto/gui/kernel/qshortcut/qshortcut.pro
to ensure unique target names for CMake

Changed tst_QComboBox::currentIndex to not test the
currentIndexChanged(QString), as that one does not exist in Qt 6
anymore.

Change-Id: I9a85705484855ae1dc874a81f49d27a50b0dcff7
2020-04-08 20:11:39 +02:00
Edward Welbourne
c3dea1ffca Move some shared code to a localetools module
The time-zone script was importing two functions from the locale data
generation script. Move them to a separate module, to which I'll
shortly add some more shared utilities. Cleaned up some imports in the
process.

Combined qlocalexml2cpp's and xpathlit's error classes into a new
Error class in the new module and made it a bit more like a proper
python error class.
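
One way a shared error class of this kind might look. This is a sketch under assumptions, not the class actually added to the localetools module:

```python
class Error(Exception):
    """Hypothetical shared error class for the locale scripts."""
    def __init__(self, msg, *args):
        super().__init__(msg, *args)
        self.message = msg

    def __str__(self):
        return self.message

def describe(problem):
    """Show that callers can catch the shared class like any exception."""
    try:
        raise Error(problem)
    except Error as what:
        return f'Failed: {what}'
```

Deriving from `Exception` and calling its constructor means the usual `try`/`except` handling and `str()` conversion work without special cases.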

Task-number: QTBUG-81344
Change-Id: Idbe0139ba9aaa2f823b8f7216dee1d2539c18b75
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:42:40 +01:00
Edward Welbourne
4d9f1a87de Move qlocalexml2cpp.py's XML-reading to QLocaleXmlReader
This new class mirrors the existing QLocaleXmlWriter and places the
two side-by-side in qlocalexml.py, rather than having the writing and
reading in separate places.

Made judicious use of transformed versions of mappings to save
repeated iteration of a mapping's entries to do lookups on first
entries of pair-values; several (id, name, code) data-sets are
sometimes indexed by id, sometimes by name.
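
The idea of a transformed mapping can be sketched as follows. The data and names are hypothetical, chosen only to show the shape of the technique:

```python
# Hypothetical data-set indexed by name, with (id, code) pair-values.
by_name = {'Alpha': (1, 'al'), 'Beta': (2, 'be')}

def name_for_id_slow(ident):
    """Lookup on the first entry of each pair: a full scan on every call."""
    for name, (this_id, _code) in by_name.items():
        if this_id == ident:
            return name
    return None

# Transformed version, built once, so later lookups by id are direct.
by_id = {ident: (name, code) for name, (ident, code) in by_name.items()}
```

Building `by_id` costs one pass over the data; each subsequent lookup then avoids iterating over all entries.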

Reworked the default_map, that the complicated compareLocaleKeys()
used in sorting locale keys, to map IDs instead of names; the function
also needed the locale_map so that it could convert IDs to names,
which we can skip by going directly with IDs.

Task-number: QTBUG-81344
Change-Id: Iff6a97f7f0755b56dda70d8a6796ec074c558910
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:42:34 +01:00
Edward Welbourne
a20697a394 Rework cldr2qlocalexml.py in terms of a QLocaleXmlWriter class
Delegate the output of XML to a helper class provided by qlocalexml.py
and restructure the driver script so that it can be imported without
running anything. It now has a minimal __name__ == '__main__' block
that calls a main() function. This, for the moment, requires a global
via which it shares the CLDR directory with various other functions;
that shall go away in a later commit.

Task-number: QTBUG-81344
Change-Id: Ica2d3ec09f2d38ba42fd930258cc765283f29a71
Reviewed-by: Cristian Maureira-Fredes <cristian.maureira-fredes@qt.io>
2020-04-02 19:42:28 +01:00
Simon Hausmann
ff922e7b87 Merge remote-tracking branch 'origin/5.15' into dev
Conflicts:
	src/corelib/kernel/qmetatype.cpp

Change-Id: I88eb0d3e9c9a38abf7241a51e370c655ae74e38a
2020-03-16 18:41:27 +01:00
Edward Welbourne
ebcd8e16db Deduplicate day-name data in QLocaleXML files
This is a follow-up to commit ebb0212133bd91f1da4931b29eb1d33fb77b1444.
The day name data appeared twice in the XML files.
Skip the second copy, saving 8.8% of the intermediate file-size.
This makes no change to generated QLocale data.

Change-Id: Ic2cc543a2a85cbb1d2d47ebac7df4fa9ad6ee0a7
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-03-16 08:51:46 +01:00
Lars Knoll
2a4b957789 Merge remote-tracking branch 'origin/5.15' into dev
Change-Id: I99ee6f8b4bdc372437ee60d1feab931487fe55c4
2020-03-04 14:39:18 +00:00
Edward Welbourne
84382bde5c Rename the localexml module to qlocalexml
It implements interaction with the QLocaleXML file format type, so
rename it to match.

Task-number: QTBUG-81344
Change-Id: I46302d4ac1038cdfc5929e73b554b6d793814c56
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-03-03 07:38:06 +01:00