QMeta{Enum,Property}::metaType(): perform unsigned 32-bit divisions

Instead of signed 64-bit ones. Divisions by 5 (or 20 bytes) aren't
efficient in hardware, leading the compilers to implement a
multiplication by 2^n / 20 and shifting instead.
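
The multiply-and-shift replacement can be sketched in standalone C++. This is a minimal illustration, not code from Qt: the helper names divideBy5 and checkDivideBy5 are hypothetical. The constant 0xCCCCCCCD (that is, ceil(2^34 / 5)) and the shift by 34 are the values that appear in the "After" listing.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned 32-bit division by 5 written as a multiplication and a shift.
// 0xCCCCCCCD == ceil(2^34 / 5); the 32x32-bit product fits in 64 bits,
// so only the low 64 bits of the multiplication are needed.
constexpr std::uint32_t divideBy5(std::uint32_t x)
{
    return static_cast<std::uint32_t>((std::uint64_t(x) * 0xCCCCCCCDu) >> 34);
}

static_assert(divideBy5(0) == 0);
static_assert(divideBy5(4) == 0);
static_assert(divideBy5(5) == 1);
static_assert(divideBy5(0xFFFFFFFFu) == 0xFFFFFFFFu / 5);

// Runtime spot check against the built-in division (hypothetical helper).
inline void checkDivideBy5()
{
    for (std::uint32_t x = 0; x < 100000; ++x)
        assert(divideBy5(x) == x / 5);
}
```

The signed 64-bit variant needs the upper half of a 128-bit product plus a sign correction, which is where the extra instructions in the "Before" listing come from.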

For Qt 7.0, we should consider one of:
 * storing the index to the metatype in the enum/property data block
 * increasing the enum/property data block to 8 integers
 * decreasing the enum/property data block to 4 integers

Before:
  16d1b8:       sub    %rcx,%rax         # d - mobj->d.data
  16d1bb:       sar    $0x2,%rax
  16d1bf:       movslq 0x24(%rcx),%rdx   # sign-extended enumeratorData
  16d1c3:       sub    %rdx,%rax
  16d1c6:       movabs $0x6666666666666667,%rdx
  16d1d0:       imul   %rdx              # 64-bit mul w/ 128-bit result
  16d1d3:       mov    %rdx,%rax
  16d1d6:       shr    $0x3f,%rax
  16d1da:       shr    $1,%rdx
  16d1dd:       add    %eax,%edx
  16d1df:       movslq 0x18(%rcx),%rax
  16d1e3:       movslq %edx,%rcx
  16d1e6:       add    %rax,%rcx

After:
  18def8:       sub    %rcx,%rdx         # d - mobj->d.data
  18defb:       shr    $0x2,%rdx
  18deff:       sub    0x24(%rcx),%edx   # 32-bit sub of enumeratorData
  18df02:       mov    $0xcccccccd,%esi
  18df07:       imul   %rdx,%rsi         # 64-bit mul w/ 64-bit result
  18df0b:       shr    $0x22,%rsi
  18df0f:       movslq 0x18(%rcx),%rcx
  18df13:       add    %rsi,%rcx

The IMUL with 128-bit results[1] needs 4 cycles to calculate the upper
part on current Intel P-cores and AMD cores, and 6 cycles on Intel
E-cores, which is one cycle more in all cases than the 64-bit IMUL[2].
After the multiplication, the old code has two SHR (1 cycle, executing
in parallel), an ADD (1 cycle), and one sign-extension (1 cycle) before
the final ADD, whereas the new code has just one SHR.

As a result, the new code should require 3 cycles fewer on all x86
processors: 1 from the cheaper IMUL and 2 from the shorter sequence
that follows it.

This is what it could look like with Size == 8:
  18def8:       sub    %rcx,%rdx
  18defb:       shr    $0x2,%rdx
  18deff:       sub    0x24(%rcx),%edx
  18df02:       shr    $0x3,%edx
  18df05:       movslq 0x18(%rcx),%rcx
  18df09:       add    %rcx,%rdx
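
For comparison, the two forms of the index computation can be mocked up outside Qt. This is only a sketch under assumptions: indexSigned, indexUnsigned, base and blockStart are hypothetical stand-ins for the QMetaObject data layout, and Size is a template parameter here so that 5 and 8 can be tried side by side. With a power-of-two Size, the unsigned division reduces to a single shift, as in the listing above.

```cpp
#include <cassert>
#include <cstddef>

// Old form: pointer difference kept as ptrdiff_t, signed division.
template <int Size>
int indexSigned(const int *d, const int *base, int blockStart)
{
    return int((d - base - blockStart) / Size);
}

// New form: offset truncated to unsigned 32-bit, unsigned division.
template <int Size>
int indexUnsigned(const int *d, const int *base, int blockStart)
{
    return int((unsigned(d - base) - unsigned(blockStart)) / unsigned(Size));
}

// Both forms agree whenever d points at or past the start of the block,
// which is the only case the index computation has to handle.
inline void checkIndex()
{
    int table[64] = {};
    const int blockStart = 14;   // hypothetical offset of the first record
    for (int i = 0; i < 6; ++i) {
        assert(indexSigned<5>(table + blockStart + i * 5, table, blockStart) == i);
        assert(indexUnsigned<5>(table + blockStart + i * 5, table, blockStart) == i);
        assert(indexUnsigned<8>(table + blockStart + i * 8, table, blockStart) == i);
    }
}
```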

[1] https://uops.info/html-instr/IMUL_R64.html
[2] https://uops.info/html-instr/IMUL_R64_R64.html

Change-Id: Ife2dfbd4c341cae4c6cdfffd580455ecf3e99704
Reviewed-by: Ulf Hermann <ulf.hermann@qt.io>
Thiago Macieira 2024-09-12 09:26:16 -07:00
parent a90d99d8da
commit 9e6a59758c

@@ -3413,7 +3413,10 @@ QMetaEnum::QMetaEnum(const QMetaObject *mobj, int index)
 int QMetaEnum::Data::index(const QMetaObject *mobj) const
 {
-    return (d - mobj->d.data - priv(mobj->d.data)->enumeratorData) / Size;
+#if QT_VERSION >= QT_VERSION_CHECK(7, 0, 0)
+# warning "Consider changing Size to a power of 2"
+#endif
+    return (unsigned(d - mobj->d.data) - priv(mobj->d.data)->enumeratorData) / Size;
 }
 /*!
@@ -3551,7 +3554,10 @@ QMetaType QMetaProperty::metaType() const
 int QMetaProperty::Data::index(const QMetaObject *mobj) const
 {
-    return (d - mobj->d.data - priv(mobj->d.data)->propertyData) / Size;
+#if QT_VERSION >= QT_VERSION_CHECK(7, 0, 0)
+# warning "Consider changing Size to a power of 2"
+#endif
+    return (unsigned(d - mobj->d.data) - priv(mobj->d.data)->propertyData) / Size;
 }
 /*!