Mention wxString caching in UTF-8 mode
git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@55344 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
parent 3f5506cfd3
commit a6919a6aca
@@ -232,11 +232,12 @@ internal representation and this implies that it can't guarantee constant-time
 access to N-th element of the string any longer as to find the position of this
 character in the string we have to examine all the preceding ones. Usually this
 doesn't matter much because most algorithms used on the strings examine them
-sequentially anyhow, but it can have serious consequences for the algorithms
-using indexed access to string elements as they typically acquire O(N^2) time
+sequentially anyhow and because wxString implements a cache for iterating over
+the string by index but it can have serious consequences for algorithms
+using random access to string elements as they typically acquire O(N^2) time
 complexity instead of O(N) where N is the length of the string.
 
-To return to the linear complexity, indexed access should be replaced with
+Even despite caching the index, indexed access should be replaced with
 sequential access using string iterators. For example a typical loop:
 @code
 wxString s("hello");
@@ -65,28 +65,41 @@ public:
 /**
 @class wxString
 
+The wxString class has been completely rewritten for wxWidgets 3.0
+and this change was actually the main reason for the calling that
+version wxWidgets 3.0.
+
 wxString is a class representing a Unicode character string.
 wxString uses @c std::string internally to store its content
 unless this is not supported by the compiler or disabled
-specifically when building wxWidgets. Therefore wxString
-inherits many features from @c std::string. Most
-implementations of @c std::string are thread-safe and don't
-use reference counting. By default, wxString uses @c std::string
-internally even if wxUSE_STL is not defined.
+specifically when building wxWidgets and it therefore inherits
+many features from @c std::string. Most implementations of
+@c std::string are thread-safe and don't use reference counting.
+By default, wxString uses @c std::string internally even if
+wxUSE_STL is not defined.
+
+wxString now internally uses UTF-16 under Windows and UTF-8 under
+Unix, Linux and OS X to store its content. Note that when iterating
+over a UTF-16 string under Windows, the user code has to take care
+of surrogate pair handling whereas Windows itself has built-in
+support pairs in UTF-16, such as for drawing strings on screen.
 
-Since wxWidgets 3.0 wxString internally uses UCS-2 (basically 2-byte per
-character wchar_t and nearly the same as UTF-16) under Windows and
-UTF-8 under Unix, Linux and OS X to store its content.
 Much work has been done to make existing code using ANSI string literals
-work as before. If you need to have a wxString that uses wchar_t on Unix
-and Linux, too, you can specify this on the command line with the
-@c configure @c --disable-utf8 switch.
+work as before. If you nonetheless need to have a wxString that uses wchar_t
+on Unix and Linux, too, you can specify this on the command line with the
+@c configure @c --disable-utf8 switch or you can consider using wxUString
+or std::wstring instead.
 
-If you need a Unicode string class with O(1) access on all platforms
-you should consider using wxUString.
+Accessing a UTF-8 string by index can be very inefficient because
+a single character is represented by a variable number of bytes so that
+the entire string has to be parsed in order to find the character.
+Since iterating over a string by index is a common programming technique and
+was also possible and encouraged by wxString using the access operator[]()
+wxString implements caching of the last used index so that iterating over
+a string is a linear operation even in UTF-8 mode.
 
-Since iterating over a wxString by index can become inefficient in UTF-8
-mode iterators should be used instead of index based access:
+It is nonetheless recommended to use iterators (instead of index bases
+access) like this:
 
 @code
 wxString s = "hello";
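The added documentation says that caching the last used index keeps index-based iteration linear in UTF-8 mode, while random access stays expensive. The following is a minimal sketch of that idea only, not wxString's actual implementation: `CachedUtf8`, `SeqLen`, `OffsetOf` and `Steps` are hypothetical names invented for illustration, and the input is assumed to be valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not a wxWidgets function): number of bytes in the
// UTF-8 sequence starting with the given lead byte. Assumes valid UTF-8.
static std::size_t SeqLen(unsigned char lead)
{
    if (lead < 0x80) return 1;          // 0xxxxxxx: ASCII
    if ((lead >> 5) == 0x6) return 2;   // 110xxxxx
    if ((lead >> 4) == 0xE) return 3;   // 1110xxxx
    return 4;                           // 11110xxx
}

// Hypothetical class (not wxString): a UTF-8 buffer that remembers the last
// (character index, byte offset) pair it resolved.
class CachedUtf8
{
public:
    explicit CachedUtf8(std::string bytes) : m_bytes(std::move(bytes)) {}

    // Byte offset of the n-th character. Walks forward from the cached
    // position when possible, otherwise restarts from the beginning.
    std::size_t OffsetOf(std::size_t n)
    {
        if (n < m_cachedIndex)
        {
            // Target lies before the cached position: the cache cannot help.
            m_cachedIndex = 0;
            m_cachedOffset = 0;
        }
        while (m_cachedIndex < n && m_cachedOffset < m_bytes.size())
        {
            m_cachedOffset +=
                SeqLen(static_cast<unsigned char>(m_bytes[m_cachedOffset]));
            ++m_cachedIndex;
            ++m_steps;  // count decoded characters to make the cost visible
        }
        return m_cachedOffset;
    }

    // Total number of characters skipped over so far, across all lookups.
    std::size_t Steps() const { return m_steps; }

private:
    std::string m_bytes;
    std::size_t m_cachedIndex = 0;   // character index of the cached position
    std::size_t m_cachedOffset = 0;  // byte offset of the cached position
    std::size_t m_steps = 0;
};
```

With this sketch, looking up indices 0, 1, 2, ... in increasing order advances the cached position one character at a time, so a full forward pass costs O(N) in total. A lookup that jumps backwards has to rescan from the start, which is why the documentation still recommends iterators over indexed access.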