QXmlStreamReader::addData: lock encoding for QLatin1 case

This fixes a bug when addData() is used to add a full Latin1-encoded
XML document with a proper "encoding" attribute.

When addData() is called with a QLatin1StringView, it immediately
converts the data to UTF-8. However, if the encoding is not locked,
the parser will later see the "encoding" attribute and try to
interpret the already-converted UTF-8 data according to the specified
encoding.

The QXSR(QASV) constructor is not affected, because it already locks
the encoding.
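
As a rough illustration of the two paragraphs above, the following
stand-alone C++ sketch models the problem without using Qt. ToyReader,
addLatin1() and text() are hypothetical names invented for this example;
they are not part of QXmlStreamReader, and the real parser is considerably
more involved. The sketch only assumes what the message states: the input
is transcoded to UTF-8 up front, and an unlocked reader re-interprets the
buffer according to the declared encoding.

```cpp
#include <cassert>
#include <string>

// Transcode Latin-1 bytes to UTF-8. Every Latin-1 byte maps to the code
// point with the same value (U+0000..U+00FF), so at most two bytes result.
std::string latin1ToUtf8(const std::string &latin1)
{
    std::string utf8;
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            utf8 += static_cast<char>(c);
        } else {
            utf8 += static_cast<char>(0xC0 | (c >> 6));
            utf8 += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return utf8;
}

// Hypothetical toy model, NOT the Qt implementation.
struct ToyReader {
    bool lockEncoding = false;
    std::string buffer; // holds UTF-8 once addLatin1() has run

    // Models addData() for Latin-1 input: the data is converted to UTF-8
    // right away; 'lock' stands in for setting d->lockEncoding.
    void addLatin1(const std::string &data, bool lock)
    {
        lockEncoding = lock;
        buffer += latin1ToUtf8(data);
    }

    // Models the parser reaching an encoding="iso-8859-1" declaration.
    // Returns the text as UTF-8, the way a caller would receive it.
    std::string text(bool declaredLatin1) const
    {
        if (declaredLatin1 && !lockEncoding) {
            // Unlocked: the UTF-8 buffer is wrongly treated as Latin-1,
            // so every non-ASCII character gets transcoded a second time.
            return latin1ToUtf8(buffer);
        }
        return buffer; // locked: the declaration is ignored
    }
};
```

With the lock set, "M\xE5rten" comes back as the UTF-8 bytes for "Mårten".
Without it, the two UTF-8 bytes of "å" are each expanded again, which
yields the familiar "MÃ¥rten" mojibake.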

Add a unit-test for the issue.

Amends 6bc227a06a0d1392d220aa79ddb1cdc145d4f76e.

[ChangeLog][QtCore][QXmlStreamReader] Fixed a bug where calling
addData() with a Latin1-encoded string containing a full XML document
with an encoding attribute could result in incorrect parsing of this
document.

Fixes: QTBUG-135033
Pick-to: 6.8 6.5
Change-Id: I9a35d16d743050ea4feccab3d1336747ce0abff4
Reviewed-by: Marc Mutz <marc.mutz@qt.io>
(cherry picked from commit 4b8659ebf689b79ac88f5935ad662a604f0c8bea)
Reviewed-by: Qt Cherry-pick Bot <cherrypick_bot@qt-project.org>
Ivan Solovev 2025-03-21 11:55:38 +01:00 committed by Qt Cherry-pick Bot
parent 5405f6b575
commit cfda487726
2 changed files with 21 additions and 0 deletions


@@ -572,6 +572,7 @@ void QXmlStreamReader::addData(QAnyStringView data)
         } else if constexpr (std::is_same_v<decltype(data), QLatin1StringView>) {
             // Conversion to a QString is required, to avoid breaking
             // pre-existing (before porting to QAnyStringView) behavior.
+            d->lockEncoding = true;
             if (!d->decoder.isValid())
                 d->decoder = QStringDecoder(QStringDecoder::Utf8);
             addDataImpl(QString::fromLatin1(data).toUtf8());


@@ -568,6 +568,7 @@ private slots:
     void readFromQBuffer() const;
     void readFromQBufferInvalid() const;
     void readFromLatin1String() const;
+    void readLatin1Document() const;
     void readNextStartElement() const;
     void readElementText() const;
     void readElementText_data() const;
@@ -1229,6 +1230,25 @@ void tst_QXmlStream::readFromLatin1String() const
     }
 }

+void tst_QXmlStream::readLatin1Document() const
+{
+    const auto in = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?><a>M\xE5rten</a>"_L1;
+    {
+        QXmlStreamReader reader(in);
+        QVERIFY(reader.readNextStartElement());
+        QString text = reader.readElementText();
+        QCOMPARE(text, "M\xE5rten"_L1);
+    }
+    // Same as above, but with addData(), QTBUG-135033
+    {
+        QXmlStreamReader reader;
+        reader.addData(in);
+        QVERIFY(reader.readNextStartElement());
+        QString text = reader.readElementText();
+        QCOMPARE(text, "M\xE5rten"_L1);
+    }
+}
+
 void tst_QXmlStream::readNextStartElement() const
 {
     QLatin1String in("<?xml version=\"1.0\"?><A><!-- blah --><B><C/></B><B attr=\"value\"/>text</A>");