macOS: Generate UTF-16 clipboard content without BOM

Qt on macOS has traditionally not included a BOM in the UTF-16 data,
but this was changed in 4e196159 because iOS requires one. This had the
unfortunate side effect of breaking macOS applications that were not
prepared for the BOM, most notably Microsoft Excel, even though the
public.utf16-plain-text UTI allows an optional BOM. It also resulted in
the public.utf8-plain-text content having a BOM, as that is generated
automatically by macOS from the UTF-16 content we give it. A BOM in
UTF-8 is technically valid, but neither required nor recommended.

The fact that iOS requires a BOM is a bit dubious, and most likely a
result of applications or system frameworks decoding the data using
NSUTF16StringEncoding, which assumes big-endian byte order if there
is no BOM, as opposed to public.utf16-plain-text, which assumes native
byte order. Since we can't fix iOS, our best bet is to include a BOM.
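
The byte-order mismatch described above can be illustrated with a small
sketch in standard C++ only. The helper names (ByteOrder, encodeUtf16,
decodeUtf16) are hypothetical and are not Qt or Apple API. The decoder
only models the behavior this message attributes to
NSUTF16StringEncoding: honor a BOM if present, otherwise fall back to
a caller-chosen byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helpers for illustration; not part of Qt or any Apple framework.
enum class ByteOrder { Little, Big };

// Serialize UTF-16 code units in the given byte order,
// optionally prefixed with a BOM (U+FEFF).
std::vector<std::uint8_t> encodeUtf16(const std::u16string &text,
                                      ByteOrder order, bool withBom)
{
    std::vector<std::uint8_t> out;
    auto put = [&](char16_t unit) {
        const auto hi = static_cast<std::uint8_t>(unit >> 8);
        const auto lo = static_cast<std::uint8_t>(unit & 0xFF);
        if (order == ByteOrder::Big) {
            out.push_back(hi);
            out.push_back(lo);
        } else {
            out.push_back(lo);
            out.push_back(hi);
        }
    };
    if (withBom)
        put(char16_t(0xFEFF));
    for (char16_t unit : text)
        put(unit);
    return out;
}

// Decode UTF-16 bytes. A leading BOM decides the byte order and is
// stripped; without one, the caller-supplied fallback order is assumed.
std::u16string decodeUtf16(const std::vector<std::uint8_t> &bytes,
                           ByteOrder fallback)
{
    assert(bytes.size() % 2 == 0); // whole code units only
    ByteOrder order = fallback;
    std::size_t pos = 0;
    if (bytes.size() >= 2) {
        if (bytes[0] == 0xFE && bytes[1] == 0xFF) {
            order = ByteOrder::Big;
            pos = 2;
        } else if (bytes[0] == 0xFF && bytes[1] == 0xFE) {
            order = ByteOrder::Little;
            pos = 2;
        }
    }
    std::u16string out;
    for (; pos + 1 < bytes.size(); pos += 2) {
        const unsigned a = bytes[pos];
        const unsigned b = bytes[pos + 1];
        out.push_back(order == ByteOrder::Big
                          ? static_cast<char16_t>((a << 8) | b)
                          : static_cast<char16_t>((b << 8) | a));
    }
    return out;
}
```

Assuming a little-endian host, BOM-less data written in native order
decodes correctly only when the reader also falls back to little-endian.
A reader that falls back to big-endian gets the right text only if the
BOM is present.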

For macOS, though, we revert to the old behavior of not including
a BOM, since that seems to surprise macOS frameworks and applications
the least, even if having a BOM in public.utf16-plain-text should be
fully supported.

Longer term we should look at what kind of UTIs we generate. Most apps
on macOS do not generate public.utf16-plain-text, but instead generate
public.utf16-external-plain-text, which differs from the former in that
it assumes big-endian byte order when there's no BOM. On iOS, apps
seem to generate public.utf8-plain-text, and do not generate any UTF-16
UTIs. Moving Qt over to these UTIs would fix the problem as well, but
is a larger change that needs more research.

Change-Id: I4769c8b7d09daef7e3012e99cacc3237f7b0fc1a
Fixes: QTBUG-61562
Reviewed-by: Tor Arne Vestbø <tor.arne.vestbo@qt.io>
Tor Arne Vestbø 2019-05-16 11:44:44 +02:00
parent 299675e665
commit 387691498a

@@ -435,8 +435,23 @@ QList<QByteArray> QMacPasteboardMimeUnicodeText::convertFromMime(const QString &
     if (flavor == QLatin1String("public.utf8-plain-text"))
         ret.append(string.toUtf8());
 #if QT_CONFIG(textcodec)
-    else if (flavor == QLatin1String("public.utf16-plain-text"))
-        ret.append(QTextCodec::codecForName("UTF-16")->fromUnicode(string));
+    else if (flavor == QLatin1String("public.utf16-plain-text")) {
+        QTextCodec::ConverterState state;
+#if defined(Q_OS_MACOS)
+        // Some applications such as Microsoft Excel, don't deal well with
+        // a BOM present, so we follow the traditional approach of Qt on
+        // macOS to not generate public.utf16-plain-text with a BOM.
+        state.flags = QTextCodec::IgnoreHeader;
+#else
+        // Whereas iOS applications will fail to paste if we do _not_
+        // include a BOM in the public.utf16-plain-text content, most
+        // likely due to converting the data using NSUTF16StringEncoding
+        // which assumes big-endian byte order if there is no BOM.
+        state.flags = QTextCodec::DefaultConversion;
+#endif
+        ret.append(QTextCodec::codecForName("UTF-16")->fromUnicode(
+            string.constData(), string.length(), &state));
+    }
 #endif
     return ret;
 }