wxWidgets/include
Vadim Zeitlin 4502e7563b Fix wxTextInputStream for input starting with BOM-like bytes
Contrary to what a comment in wxTextInputStream::GetChar() said, it is
actually possible to get more than one wide character from a call to
wxMBConv::ToWChar(len+1) even if a previous call to ToWChar(len) failed
to decode anything at all. This happens with wxConvAuto because it keeps
returning an error while it doesn't have enough data to determine if the
input contains a BOM or not, but then returns all the characters
examined so far at once if it turns out that there was no BOM, after
all.

The simplest case in which this created problems was just input starting
with a NUL byte as it as this could be a start of UTF-32BE BOM.

The fix consists in keeping all the bytes read but not yet decoded in
the m_lastBytes buffer and retrying to decode them during the next
GetChar() call. This implies keeping track of how much valid data is
there in m_lastBytes exactly, as we can't discard the already decoded
data immediately, but need to keep it in the buffer too, in order to
allow implementing UngetLast(). Incidentally, UngetLast() was totally
broken for UTF-16/32 input (containing NUL bytes in the middle of the
characters) before and this change fixes this as a side effect.

Also add test cases for previously failing inputs.
2017-11-09 23:49:59 +01:00
..
msvc/wx Add Microsoft Visual Studio 2017 solution file for building wxMSW 2017-01-16 17:02:10 +01:00
wx Fix wxTextInputStream for input starting with BOM-like bytes 2017-11-09 23:49:59 +01:00