The convention is to start the benchmark names with the "tst_bench_"
prefix. This makes it easier to detect the benchmark in tools, such as
the Core Benchmarks Runner.
Change-Id: I2dcebb6cef0aba4133c4135462e8d76387b776bf
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
If we have no specific need for the private QHashCombine class, use the
front-end functions. For headers, we do have a need: we prefer
QHashCombine because it compiles faster.
Change-Id: I73578ea802d3b905a53bfffd504c20af0ca96cf8
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
It doesn't matter at all, but it's now got a defined value, so use it.
Change-Id: Id8e734cd81624a3d4c139b2639381e3f0b162db4
Reviewed-by: Matthias Rauter <matthias.rauter@qt.io>
If debug symbols are needed, then pass the respective flag to the
configure script.
Pick-to: 6.9 6.8
Change-Id: I99db92bdd5b7eb896e0d592117a8f218467a4bd7
Reviewed-by: Øystein Heskestad <oystein.heskestad@qt.io>
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Requested by Marc Mutz when adding the utc() benchmark.
Pick-to: 6.9 6.8 6.5 5.15
Change-Id: I6f97f9e4dab07d10718280b4fb7ac158e42b8d67
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
On my machine this gives me:
```
********* Start testing of tst_QTimeZone *********
Config: Using QtTest library 6.10.0, Qt 6.10.0 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 14.2.1 20250207), arch unknown
PASS : tst_QTimeZone::initTestCase()
PASS : tst_QTimeZone::utc()
RESULT : tst_QTimeZone::utc():
358.686871 nsecs per iteration (total: 358,686,513, iterations: 999999)
896.524312 CPU cycles per iteration, 2,5 GHz (total: 896,523,416, iterations: 999999)
2,227.000427 instructions per iteration, 2,484 instr/cycle (total: 2,226,998,200, iterations: 999999)
560.000375 branch instructions per iteration, 1,56 G/sec (total: 559,999,815, iterations: 999999)
PASS : tst_QTimeZone::cleanupTestCase()
Totals: 3 passed, 0 failed, 0 skipped, 0 blacklisted, 374ms
********* Finished testing of tst_QTimeZone *********
```
Profiling shows some quite unexpected code paths that
I will try to optimize in follow-up patches. Note that
this function can be called frequently, e.g. when deserializing
QDateTime over a QDataStream. I stumbled over it while
profiling some KDE PIM code in Akonadi.
Pick-to: 6.9 6.8 6.5 5.15
Change-Id: I7439df53ae8512c766f63cb4b0d4f33d14aa3a01
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Path trimming is used in vector graphics and design.
Change-Id: Id5f32b570182f0e8f790835b2fbeb28b3432e40d
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
Also add a benchmark test for them.
Change-Id: Icc44f54786048550d0a96fd0d1acd3801eaca132
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
Unlike CBOR maps, JSON objects can only have string keys, but, like
CBOR, they are internally stored as either US-ASCII (ie. Latin-1),
UTF-8, or UTF-16, so in order to return a view on the internal
storage, QAnyStringView is the perfect fit.
Add (const_)iterator::keyView() and add a benchmark, prepared for
testing a similar change to the value side of things.
Results (fastest of ten runs each) on my machine suggest a 40%
speedup:
PASS : BenchmarkQtJson::iteratorKey()
RESULT : BenchmarkQtJson::iteratorKey():
0.071 msecs per iteration (total: 73, iterations: 1024)
PASS : BenchmarkQtJson::iteratorKeyView()
RESULT : BenchmarkQtJson::iteratorKeyView():
0.042 msecs per iteration (total: 87, iterations: 2048)
[ChangeLog][QtCore][QJsonObject] Added keyView() methods to iterator
and const_iterator, allowing zero-copy inspection of the key().
Task-number: QTBUG-133688
Change-Id: I0ccedaf8a4fa41125b12bdbab5bea3bd2468d9a5
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This includes:
- turning VERIFY_SOURCE_SBOM ON
- adding exception to the licenseRule.json files
- correcting the licensing given via REUSE.toml files
- renaming license files not located in LICENSES folder.
They need to be named LICENSE. to be ignored by reuse and
excluded from the source SBOM. The names are updated in the
corresponding qt_attribution.json files
A lot of files are skipped during the license test,
but all are present in the source SBOM.
This is why corrections are needed before turning the
source SBOM check on.
[ChangeLog][Third-Party Code] Renaming the license files with prefix
LICENSE. to have them ignored by reuse tool.
Task-number: QTBUG-131434
Pick-to: 6.9
Change-Id: Iab517215bb10a17357d2d2436bba8d3af76e5cd1
Reviewed-by: Joerg Bornemann <joerg.bornemann@qt.io>
An application may need to load shortcuts from the settings at startup,
so QKeySequence::fromString() should be fast enough even if there are a
lot of shortcuts.
This change adds a QKeySequence::fromString() benchmark so one can get
partial insight into its performance.
Change-Id: I9e15c0e9a199787189d5076a41154f127d2930a3
Reviewed-by: Axel Spoerl <axel.spoerl@qt.io>
Simple cases, all local const containers.
tst_QHash: rename the template parameter to "Str". My eyes saw
"QList<String>", but my brain somehow assumed QList<*Q*String>. You
could argue that I am a bit slow, but it has tricked someone else in
code review, so just rename it for the sake of clarity.
Drive-by, remove braces from one-line for-loop-block.
Task-number: QTBUG-115839
Change-Id: Ia1a56bea7b931efb377ba8c04ee8933561abf341
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
The container is a member of the unittest, it's not changed by the loop
body or during iteration, so use a ranged-for and std::as_const.
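
A minimal sketch of why std::as_const matters in a ranged-for, written in
plain standard C++ rather than against the Qt test. DetachCounter is a
hypothetical stand-in for an implicitly shared container: its non-const
begin() only counts the calls that could trigger a detach in a real Qt
container.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for an implicitly shared container: the non-const
// begin() is where a real Qt container could detach.
class DetachCounter
{
public:
    explicit DetachCounter(std::vector<int> v) : m_data(std::move(v)) {}

    std::vector<int>::iterator begin() { ++m_detachCalls; return m_data.begin(); }
    std::vector<int>::iterator end() { return m_data.end(); }
    std::vector<int>::const_iterator begin() const { return m_data.begin(); }
    std::vector<int>::const_iterator end() const { return m_data.end(); }

    int detachCalls() const { return m_detachCalls; }

private:
    std::vector<int> m_data;
    int m_detachCalls = 0;
};

// Iterates through the const overloads only.
inline int sumConst(DetachCounter &c)
{
    int sum = 0;
    for (int v : std::as_const(c))
        sum += v;
    return sum;
}

// Iterates through the non-const overloads.
inline int sumMutable(DetachCounter &c)
{
    int sum = 0;
    for (int v : c)
        sum += v;
    return sum;
}
```

Calling sumConst() leaves detachCalls() at zero, while sumMutable() on the
same non-const object increments it once.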
Task-number: QTBUG-115839
Change-Id: I1be75d17ff305bc542cf7b058cc59d70cedc77ad
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
Use C arrays for data known at compile time.
Drive-by change: verify the two containers that are iterated in tandem
have the same size.
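
As a sketch of the idea (not the code from this change): with C arrays the
"same size" check can be done at compile time. The array names and their
contents below are hypothetical and only serve as an illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string_view>

// Hypothetical test data; names and values are for illustration only.
constexpr std::string_view zoneIds[] = { "UTC", "Europe/Oslo", "Asia/Tokyo" };
constexpr int offsetsInSeconds[] = { 0, 3600, 32400 };

// The two arrays are walked in tandem, so a size mismatch is a bug in the
// test data. With C arrays the check costs nothing at run time.
static_assert(std::size(zoneIds) == std::size(offsetsInSeconds),
              "tandem arrays must have the same length");

// Returns the offset paired with the given ID, or -1 if the ID is absent.
constexpr int offsetFor(std::string_view id)
{
    for (std::size_t i = 0; i < std::size(zoneIds); ++i) {
        if (zoneIds[i] == id)
            return offsetsInSeconds[i];
    }
    return -1;
}
```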
Task-number: QTBUG-115839
Change-Id: I457ddca8aa98e2f15a833a40ff45bd208701bf6b
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
For the simple case of setting a solid style pen or brush with a color,
which is what we do most in Qt.
Change-Id: Ie90a842ee9638f04941855dfd2e9211235db6cce
Reviewed-by: Christian Ehrlicher <ch.ehrlicher@gmx.de>
The default implementation creates a harfbuzz font and iterates
over all glyphs. The more optimized implementation for CoreText
uses the native API.
Change-Id: I9c5b8115f72fb9ade3892a65ddbed76f7af0a580
Reviewed-by: Anton Kudryavtsev <antkudr@mail.ru>
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
Some fonts name their glyphs in the "post" or "cff" tables, and both
CoreText and FreeType provide high-level APIs to access those values.
Also, harfbuzz has an API for OpenType fonts, which we can use on
Windows (where we would have to load the font into FreeType, as
neither GDI nor DirectWrite provide APIs to access this data), and as
default implementation that we can fall back to if the platform APIs
don't find anything.
We need to set the harfbuzz font up with the OpenType callbacks
explicitly, so cannot just use what we get from harfbuzzFont.
With this low-level API we can now make name-based lookups of glyphs,
and eventually render those in a QIconEngine implementation.
Task-number: QTBUG-102346
Change-Id: I68cc19814fc45d63a88e063b719b46f6aa6100bc
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
For some known fonts, confirm that we get the right glyph count and
glyph index for specific unicode code points.
Create QGuiApplication with a specific font engine, and test the fonts
with all engines. On Windows, that's DirectWrite, GDI, and Freetype;
on macOS CoreText and Freetype; and otherwise only Freetype.
Not all fonts will be available with all engines, so test in each test
function whether the font is a good enough match (family is enough, no
need to do a deep test).
Add a benchmark as well, using the same setup plumbing, but with
different test functions.
Change-Id: I2ed279965fc3f1dc3f283d0fe7b018fc3035c67d
Reviewed-by: Anton Kudryavtsev <antkudr@mail.ru>
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
Those files are read by reuse to complement or override the copyright
and licensing information found in the files themselves.
The use of REUSE.toml files was introduced in REUSE version 3.1.0a1.
This reuse version is compatible with reuse specification
version 3.2 [1].
With this commit's files,
* The SPDX document generated by reuse spdx conforms to SPDX 2.3,
* The reuse lint command reports that the Qt project is reuse compliant.
[1]: https://reuse.software/spec-3.2/
Task-number: QTBUG-124453
Task-number: QTBUG-125211
Pick-to: 6.8
Change-Id: I01023e862607777a5e710669ccd28bbf56091097
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Reviewed-by: Joerg Bornemann <joerg.bornemann@qt.io>
The total stack space available is 1MB, but each pointer is 8 bytes.
So we have to limit the number of pointers we allocate on the stack to
somewhere below 128K.
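
The arithmetic behind that bound, as a small sketch. The constant and
function names are made up for this illustration; the 1 MiB and 8-byte
figures are the ones stated above.

```cpp
#include <cassert>
#include <cstddef>

// Figures taken from the text above: a 1 MiB stack and 8-byte pointers.
constexpr std::size_t stackBytes = 1024 * 1024;
constexpr std::size_t pointerBytes = 8;

// Number of pointers that would fill the entire stack on their own.
constexpr std::size_t maxPointersOnStack()
{
    return stackBytes / pointerBytes;
}

static_assert(maxPointersOnStack() == 128 * 1024,
              "1 MiB / 8 bytes is exactly 128K pointers");
```

Since other stack frames also need room, the usable limit has to stay
below this value.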
Pick-to: 6.8 6.7 6.5
Change-Id: I1d1262a4048cf4b3fed8df813decc3e142430a32
Reviewed-by: Mate Barany <mate.barany@qt.io>
Both APFS and HFS+ can be both case-sensitive and case-insensitive
(the default), and the mounted file system may be any other file
system than these two as well, so hard-coding to case-sensitive
is not sufficient.
Pick-to: 6.8
Task-number: QTBUG-28246
Task-number: QTBUG-31103
Change-Id: Ibdb902df3f169b016a519f67ad5a79e6afb6aae3
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The problem was found by clang-tidy.
Use size_t instead of qsizetype as the type of the index variable to
get rid of the warning.
The clang-tidy output:
tst_bench_qtimezone.cpp:130:30: warning: comparison of integer expressions of different signedness: ‘qsizetype’ {aka ‘long long int’}
and ‘std::size_t’ {aka ‘long unsigned int’} [-Wsign-compare]
locIndex < std::size(locName) ? locName[locIndex] :
where.bcp47Name().toUtf8();
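
A sketch of the pattern in plain standard C++, assuming a small lookup
table with a fallback. knownNames and nameFor() are hypothetical; they
stand in for locName and the conditional expression in the warning.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical table, standing in for locName in the warning above.
constexpr const char *knownNames[] = { "en", "de", "nb" };

// Returns the table entry for index, or fallback when index is out of range.
// The index is std::size_t, the same type std::size() returns, so the
// comparison does not mix signed and unsigned operands.
inline std::string nameFor(std::size_t index, const std::string &fallback)
{
    return index < std::size(knownNames) ? std::string(knownNames[index]) : fallback;
}
```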
Task-number: QTBUG-105464
Change-Id: I603cbf201827e6e502c9737b02928f31ad6b2517
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
When we are casting an rvalue QSharedPointer, we do not need to
pay the cost for the atomic refcount increment / decrement. Optimize
this by adding rvalue overloads that handle this specific case
directly.
Note that this is arguably a micro optimization since in most cases
the cost to create the pointer in the first place is going to dwarf
the cost for the atomic increment / decrement. But it starts to matter
for situations like `someConstObject.ptrGetter().dynamicCast()` - in
the common case the `ptrGetter()` returns by value and the cast can
then operate on an rvalue.
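
The overload split can be sketched with a deliberately tiny shared pointer.
SketchPtr, Base, Derived and g_increments are hypothetical names for this
illustration; this is not QSharedPointer's implementation, only the
ref-qualifier technique.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Counts refcount increments, purely for illustration.
static int g_increments = 0;

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// Hypothetical minimal shared pointer, NOT QSharedPointer's implementation.
template <typename T>
class SketchPtr
{
public:
    SketchPtr() = default;
    explicit SketchPtr(T *p) : m_ptr(p), m_count(new std::atomic<int>(1)) {}
    SketchPtr(const SketchPtr &) = delete;            // kept minimal on purpose
    SketchPtr &operator=(const SketchPtr &) = delete;
    SketchPtr(SketchPtr &&o) noexcept
        : m_ptr(std::exchange(o.m_ptr, nullptr)),
          m_count(std::exchange(o.m_count, nullptr)) {}
    ~SketchPtr()
    {
        if (m_count && m_count->fetch_sub(1) == 1) {
            delete m_ptr;   // relies on the virtual destructor of Base
            delete m_count;
        }
    }

    // lvalue overload: the source keeps its reference, so the shared
    // counter has to be incremented atomically.
    template <typename U>
    SketchPtr<U> staticCast() const &
    {
        SketchPtr<U> result;
        result.m_ptr = static_cast<U *>(m_ptr);
        result.m_count = m_count;
        if (m_count) {
            m_count->fetch_add(1);
            ++g_increments;
        }
        return result;
    }

    // rvalue overload: the source gives up its reference, so ownership
    // moves over without touching the counter.
    template <typename U>
    SketchPtr<U> staticCast() &&
    {
        SketchPtr<U> result;
        result.m_ptr = static_cast<U *>(std::exchange(m_ptr, nullptr));
        result.m_count = std::exchange(m_count, nullptr);
        return result;
    }

    T *get() const { return m_ptr; }

private:
    template <typename> friend class SketchPtr;
    T *m_ptr = nullptr;
    std::atomic<int> *m_count = nullptr;
};
```

With a SketchPtr<Derived> d, d.staticCast<Base>() picks the lvalue overload
and increments the counter, whereas std::move(d).staticCast<Base>() picks
the rvalue overload and leaves d empty.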
On my system, the benchmark speaks for itself:
```
./tests/benchmarks/corelib/tools/qsharedpointer/tst_bench_shared_ptr -perf -perfcounter cycles,instructions -iterations 100000 objectCast objectCast_rvalue
********* Start testing of tst_QSharedPointer *********
Config: Using QtTest library 6.9.0, Qt 6.9.0 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 14.2.1 20240805), arch unknown
PASS : tst_QSharedPointer::initTestCase()
PASS : tst_QSharedPointer::objectCast()
RESULT : tst_QSharedPointer::objectCast():
147.05521 CPU cycles per iteration (total: 14,705,522, iterations: 100000)
147.00058 instructions per iteration, 1.000 instr/cycle (total: 14,700,058, iterations: 100000)
PASS : tst_QSharedPointer::objectCast_rvalue()
RESULT : tst_QSharedPointer::objectCast_rvalue():
52.00227 CPU cycles per iteration (total: 5,200,227, iterations: 100000)
110.00056 instructions per iteration, 2.115 instr/cycle (total: 11,000,057, iterations: 100000)
PASS : tst_QSharedPointer::cleanupTestCase()
Totals: 4 passed, 0 failed, 0 skipped, 0 blacklisted, 45ms
********* Finished testing of tst_QSharedPointer *********
./tests/benchmarks/corelib/tools/qsharedpointer/tst_bench_shared_ptr -perf -perfcounter cycles,instructions -iterations 100000 dynamicCast dynamicCast_rvalue
********* Start testing of tst_QSharedPointer *********
Config: Using QtTest library 6.9.0, Qt 6.9.0 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 14.2.1 20240802), arch unknown
PASS : tst_QSharedPointer::initTestCase()
PASS : tst_QSharedPointer::dynamicCast()
RESULT : tst_QSharedPointer::dynamicCast():
148.34457 CPU cycles per iteration (total: 14,834,457, iterations: 100000)
120.00057 instructions per iteration, 0.809 instr/cycle (total: 12,000,058, iterations: 100000)
PASS : tst_QSharedPointer::dynamicCast_rvalue()
RESULT : tst_QSharedPointer::dynamicCast_rvalue():
25.00210 CPU cycles per iteration (total: 2,500,211, iterations: 100000)
81.00057 instructions per iteration, 3.240 instr/cycle (total: 8,100,058, iterations: 100000)
PASS : tst_QSharedPointer::cleanupTestCase()
Totals: 4 passed, 0 failed, 0 skipped, 0 blacklisted, 45ms
********* Finished testing of tst_QSharedPointer *********
./tests/benchmarks/corelib/tools/qsharedpointer/tst_bench_shared_ptr -perf -perfcounter cycles,instructions -iterations 100000 staticCast staticCast_rvalue
********* Start testing of tst_QSharedPointer *********
Config: Using QtTest library 6.9.0, Qt 6.9.0 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 14.2.1 20240802), arch unknown
PASS : tst_QSharedPointer::initTestCase()
PASS : tst_QSharedPointer::staticCast()
RESULT : tst_QSharedPointer::staticCast():
142.95894 CPU cycles per iteration (total: 14,295,894, iterations: 100000)
54.00057 instructions per iteration, 0.378 instr/cycle (total: 5,400,058, iterations: 100000)
PASS : tst_QSharedPointer::staticCast_rvalue()
RESULT : tst_QSharedPointer::staticCast_rvalue():
14.00205 CPU cycles per iteration (total: 1,400,205, iterations: 100000)
22.00056 instructions per iteration, 1.571 instr/cycle (total: 2,200,057, iterations: 100000)
PASS : tst_QSharedPointer::cleanupTestCase()
Totals: 4 passed, 0 failed, 0 skipped, 0 blacklisted, 50ms
********* Finished testing of tst_QSharedPointer *********
./tests/benchmarks/corelib/tools/qsharedpointer/tst_bench_shared_ptr -perf -perfcounter cycles,instructions -iterations 100000 constCast constCast_rvalue
********* Start testing of tst_QSharedPointer *********
Config: Using QtTest library 6.9.0, Qt 6.9.0 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 14.2.1 20240802), arch unknown
PASS : tst_QSharedPointer::initTestCase()
PASS : tst_QSharedPointer::constCast()
RESULT : tst_QSharedPointer::constCast():
142.38115 CPU cycles per iteration (total: 14,238,116, iterations: 100000)
54.00057 instructions per iteration, 0.379 instr/cycle (total: 5,400,058, iterations: 100000)
PASS : tst_QSharedPointer::constCast_rvalue()
RESULT : tst_QSharedPointer::constCast_rvalue():
13.00243 CPU cycles per iteration (total: 1,300,243, iterations: 100000)
22.00057 instructions per iteration, 1.692 instr/cycle (total: 2,200,058, iterations: 100000)
PASS : tst_QSharedPointer::cleanupTestCase()
Totals: 4 passed, 0 failed, 0 skipped, 0 blacklisted, 42ms
********* Finished testing of tst_QSharedPointer *********
```
[ChangeLog][QtCore][QSharedPointer] Optimized casts on rvalue shared
pointers.
Change-Id: I7dfb4d92253d6c60286d3903bc7aef66acab5689
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The check to see if m_itemView is empty is already
included in the first condition, and the re-check in
the second condition is unnecessary.
E.g.: A || (!A && B) is equivalent to A || B.
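
The equivalence can be checked exhaustively over the four input
combinations. The function names below are made up for this sketch.

```cpp
#include <cassert>

// The condition with the redundant re-check of !a in the second operand.
constexpr bool withRecheck(bool a, bool b) { return a || (!a && b); }

// The simplified condition.
constexpr bool simplified(bool a, bool b) { return a || b; }

// Compares both forms for every combination of a and b.
constexpr bool formsAgree()
{
    for (int a = 0; a < 2; ++a) {
        for (int b = 0; b < 2; ++b) {
            if (withRecheck(a != 0, b != 0) != simplified(a != 0, b != 0))
                return false;
        }
    }
    return true;
}

static_assert(formsAgree(), "A || (!A && B) must equal A || B");
```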
Pick-to: 6.8
Change-Id: I1a9f003bacea076fc1e72765c196a327a21c33b2
Reviewed-by: Alexey Edelev <alexey.edelev@qt.io>
The separator was changed at CLDR v44 to use a plain aleph, U+0627,
rather than the U+0623 aleph with hamza above previously used. This
is, in both cases, followed by U+0633, "seen".
Task-number: QTBUG-121325
Task-number: QTBUG-126060
Pick-to: 6.8 6.7 6.5
Change-Id: I013525e0876c4c47111846135c9311e7b3442dd3
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
By extending IteratorFlag so that it replaces both QDir::Filter and
QDirIterator::IteratorFlag enums, but with better defaults (based on how
QDir/Iterator is used in 15-20 years worth of code in Qt and KDE).
Make the QDirListing(QDir ~~) ctor private, also change it to use
QDirIterator::IteratorFlags; it will be used to port existing code.
If QDir is ported to use QDirListing::IteratorFlags, instead of
QDir::Filters, a public QDirListing(QDir) constructor can then be added.
Pick-to: 6.8
Fixes: QTBUG-125504
Task-number: QTBUG-125859
Change-Id: Ide4ff8279f554029ac30d0579b0e8373ed4337f7
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
After the zip bomb checks were added the benchmark was not
adjusted.
Also move the QByteArray creation outside the loop, to not include
the time it takes to make a heap allocation.
Pick-to: 6.7 6.5
Change-Id: Ia958d497dd27fc61e0084b6f5c11d76886bb24c4
Reviewed-by: Mate Barany <mate.barany@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Functional fix will come later via separate tasks.
Task-number: QTBUG-122999
Change-Id: Ib805740c87ff21cea5a186add71cc594ab4d4df1
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Now developer build tests compile, but some are not working.
Functional fix will come later via separate tasks.
Task-number: QTBUG-122999
Change-Id: I70487b46c1b32ba4279cb02a4978e4f55ac0d310
Reviewed-by: Alexey Edelev <alexey.edelev@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
These reveal a roughly factor of six slow-down for two valid date
formats and a roughly factor of twelve slow-down for an invalid one.
Pick-to: 6.7 6.5 6.2
Task-number: QTBUG-124465
Change-Id: Ibd21e43d4c64aced33ba5b21e4602e0dc4fd7548
Reviewed-by: Mate Barany <mate.barany@qt.io>
By removing fs::directory_options::skip_permission_denied which isn't
available on VxWorks.
It's not strictly needed for the benchmark. I had added it to test
locally by listing some dirs under '/' (not all of them are readable for
users), and saw no reason at the time to remove it. The benchmark itself
lists dirs in the qtbase source dir tree.
Pick-to: 6.7
Task-number: QTBUG-115777
Change-Id: I4e68d01abd707dbf553f0a5832739ef0f4c9d585
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
It's the same aeshash() as before, except we're passing a template
parameter to indicate whether to read half and then zero-extend the
data. That is, it will perform a conversion from Latin1 on the fly.
When running in zero-extending mode, the length parameters are actually
doubled (counting the number of UTF-16 code units) and we then divide
again by 2 when advancing.
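
A simplified sketch of the zero-extension idea. The hash here is an
FNV-1a-style stand-in working on 16-bit units, NOT aeshash(), and all
function names are hypothetical. Unlike the description above, the sketch
counts the length in code units in both modes and only varies the byte
stride.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in hash (FNV-1a style over 16-bit units). This is NOT aeshash();
// it only illustrates the zero-extension template parameter.
template <bool ZeroExtend>
std::uint64_t sketchHash(const void *data, std::size_t codeUnits)
{
    const auto *bytes = static_cast<const unsigned char *>(data);
    std::uint64_t h = 14695981039346656037ull;
    for (std::size_t i = 0; i < codeUnits; ++i) {
        std::uint16_t unit;
        if constexpr (ZeroExtend) {
            // Latin-1 input: read one byte and widen it to a 16-bit unit.
            unit = bytes[i];
        } else {
            // UTF-16 input: read a full 16-bit unit in native byte order.
            std::memcpy(&unit, bytes + 2 * i, sizeof unit);
        }
        h = (h ^ unit) * 1099511628211ull;
    }
    return h;
}

// Hashes Latin-1 data as if it had been converted to UTF-16 first.
inline std::uint64_t hashLatin1(const char *s, std::size_t n)
{
    return sketchHash<true>(s, n);
}

// Hashes UTF-16 data directly.
inline std::uint64_t hashUtf16(const char16_t *s, std::size_t n)
{
    return sketchHash<false>(s, n);
}
```

Both helpers feed the same sequence of 16-bit units into the hash, so a
Latin-1 string and its UTF-16 equivalent produce the same value.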
The implementation should have the following performance
characteristics:
* QLatin1StringView now will be roughly half as fast as Qt 6.7
* QLatin1StringView now will be roughly as fast as QStringView
For the aeshash128() in default builds of QtCore (will use SSE4.1), the
long loop (32 characters or more) is:
QStringView                        QLatin1StringView
movdqu  -0x20(%rax),%xmm4        | pmovzxbw -0x10(%rdx),%xmm2
movdqu  -0x10(%rax),%xmm5        | pmovzxbw -0x8(%rdx),%xmm3
add     $0x20,%rax               | add      $0x10,%rdx
pxor    %xmm4,%xmm0              | pxor     %xmm2,%xmm0
pxor    %xmm5,%xmm1              | pxor     %xmm3,%xmm1
aesenc  %xmm0,%xmm0                aesenc   %xmm0,%xmm0
aesenc  %xmm1,%xmm1                aesenc   %xmm1,%xmm1
aesenc  %xmm0,%xmm0                aesenc   %xmm0,%xmm0
aesenc  %xmm1,%xmm1                aesenc   %xmm1,%xmm1
The number of instructions is identical, but there are actually 2 more
uops per iteration. LLVM-MCA simulation shows this should execute in the
same number of cycles on older CPUs that do not have support for VAES
(see <https://analysis.godbolt.org/z/x95Mrfrf7>).
For the VAES version in aeshash256() and the AVX10 version in
aeshash256_256():
QStringView                        QLatin1StringView
vpxor   -0x40(%rax),%ymm1,%ym    | vpmovzxbw -0x20(%rax),%ymm3
vpxor   -0x20(%rax),%ymm0,%ym    | vpmovzxbw -0x10(%rax),%ymm2
add     $0x40,%rax               | add     $0x20,%rax
                                 | vpxor   %ymm3,%ymm0,%ymm0
                                 | vpxor   %ymm2,%ymm1,%ymm1
vaesenc %ymm1,%ymm1,%ymm1        <
vaesenc %ymm0,%ymm0,%ymm0          vaesenc %ymm0,%ymm0,%ymm0
vaesenc %ymm1,%ymm1,%ymm1          vaesenc %ymm1,%ymm1,%ymm1
vaesenc %ymm0,%ymm0,%ymm0          vaesenc %ymm0,%ymm0,%ymm0
                                 > vaesenc %ymm1,%ymm1,%ymm1
In this case, the increase in number of instructions matches the
increase in number of uops. The LLVM-MCA simulation says that the
QLatin1StringView version is faster at 11 cycles/iteration vs 14 cyc/it
(see <https://analysis.godbolt.org/z/1Gv1coz13>), but that can't be
right.
Measured performance of CPU cycles, on an Intel Core i9-7940X (Skylake,
no VAES support), normalized on the QString performance (QByteArray is
used as a stand-in for the performance in Qt 6.7):
             aeshash                       | siphash
             QByteArray   QL1SV   QString  | QByteArray   QString
dictionary      94.5%     79.7%   100.0%   |  150.5%*     159.8%
paths-small     90.2%     93.2%   100.0%   |  202.8%      290.3%
uuids           81.8%    100.7%   100.0%   |  215.2%      350.7%
longstrings     42.5%    100.8%   100.0%   |  185.7%      353.2%
numbers         95.5%     77.9%   100.0%   |  155.3%*     164.5%
On an Intel Core i7-1165G7 (Tiger Lake, capable of VAES and AVX512VL):
             aeshash                       | siphash
             QByteArray   QL1SV   QString  | QByteArray   QString
dictionary      90.0%     91.1%   100.0%   |  103.3%*     157.1%
paths-small     99.4%    104.8%   100.0%   |  237.5%      358.0%
uuids           88.5%    117.6%   100.0%   |  274.5%      461.7%
longstrings     57.4%    111.2%   100.0%   |  503.0%      974.3%
numbers         90.6%     89.7%   100.0%   |   98.7%*     149.9%
On an Intel 4th Generation Xeon Scalable Platinum (Sapphire Rapids, same
Golden Cove core as Alder Lake):
             aeshash                       | siphash
             QByteArray   QL1SV   QString  | QByteArray   QString
dictionary      89.9%    102.1%   100.0%   |  158.1%*     172.7%
paths-small     78.0%     89.4%   100.0%   |  159.4%      258.0%
uuids          109.1%    107.9%   100.0%   |  279.0%      496.3%
longstrings     52.1%    112.4%   100.0%   |  564.4%     1078.3%
numbers         85.8%     98.9%   100.0%   |  152.6%*     190.4%
* dictionary contains very short entries (6 characters)
* paths-small contains strings of varying length, but very few over 32
* uuids-list contains fixed-length strings (38 characters)
* longstrings is the same but 304 characters
* numbers also contains a lot of very short strings (1 to 6 chars)
What this shows:
* For short strings, the performance difference is negligible between
all three
* For longer strings, QLatin1StringView now costs between 7 and 17% more
than QString on the tested machines instead of up to ~50% less, except on
the older machine (where I think the main QString hashing is suffering
from memory bandwidth limitations)
* The AES hash implementation is anywhere from 1.6 to 11x faster than
Siphash
* Murmurhash (marked with asterisk) is much faster than Siphash, but it
only managed to beat the AES hash in one test
Change-Id: I664b9f014ffc48cbb49bfffd17b045c1811ac0ed
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Make the benchmarks more comparable:
- Store the QDir::Filters in one central var, this way it's the same in
all the call sites
- Add a `bool forceStat`, when true force calling stat(), either
explicitly in posix_helper(), or implicitly in Qt classes by e.g.
calling a QFileInfo method that would have to call system stat()
internally. Otherwise benchmarking readdir()/dirent showed bigger
times, which was mostly due to the explicit stat() calls, whereas we
can use dirent::d_type (on the platforms where it's available)
Drive by change: for std::filesystem::recursive_directory_iterator, set
skip_permission_denied option and use the non-throwing constructor.
Change-Id: Icf138a5dc41d32741c1be611d664b01008b2f3fe
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
According to QUIP-18 [1], all test files should be licensed under
LicenseRef-Qt-Commercial OR GPL-3.0-only
[1]: https://contribute.qt-project.org/quips/18
Pick-to: 6.7
Task-number: QTBUG-121787
Change-Id: I9657df5d660820e56c96d511ea49d321c54682e8
Reviewed-by: Christian Ehrlicher <ch.ehrlicher@gmx.de>
It was shown to have poor performance compared to contains() and
insert().
Pick-to: 6.7 6.6 6.5
Change-Id: I61cfbc8c34e325d677d7954118ef68057df640cb
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
There's a lot of variation in the benchmark graphs for QHash,
presumably caused by variation in the seed.
Ideally we would set a deterministic seed for all
benchmarks, but we don't know whether or not a test is
one until the macro is reached.
Pick-to: 6.7 6.6 6.5
Change-Id: I4e412e4d4e2cc65eada94ed123243ed0047dd9cf
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>