Update the QUrl documentation concerning the encoding

Looks like I failed to update this earlier, when the behaviour changed.

Change-Id: Ic020c2a14d4e9153f2bc9d22d943a3a380c0851c
Reviewed-by: Shane Kearns <shane.kearns@accenture.com>
This commit is contained in:
Thiago Macieira 2012-04-24 13:33:16 +02:00 committed by Qt by Nokia
parent 2a683c1f14
commit 4544ec931f
2 changed files with 121 additions and 19 deletions

20
dist/changes-5.0.0 vendored
View File

@ -625,6 +625,26 @@ Qt for Windows CE
cleared). Any QPointers tracking a widget will NOT be cleared before the
QWidget destructor destroys the children for the widget being tracked.
- QUrl
* QUrl has been changed to operate only on percent-encoded
forms. Fully-decoded forms, where the percent character stands for itself,
are no longer possible. For that reason, the getters and setters with
"encoded" in the name are deprecated, except for QUrl::toEncoded() and
QUrl::fromEncoded().
QUrl now operates in a mode where it decodes as much as it can of the
percent-encoding sequences. In addition, the setter methods possess a mode
in which a '%' character not part of a percent-encoding sequence will cause
the parser to correct the input. Therefore, most software will not require
changes to adapt, since the getter methods will continue returning the
components in their most-decoded form as they did before and the setter
methods will accept input as they did before..
The most notable difference is when dealing with
QUrl::toString(). Previously, this function would return percent characters
in the URL by themselves. Now, it will return "%25", like
QUrl::toEncoded().
- QVariant

View File

@ -126,32 +126,47 @@
The parsing mode controls the way QUrl parses strings.
\value TolerantMode QUrl will try to correct some common errors in URLs.
This mode is useful when processing URLs entered by
users.
This mode is useful for parsing URLs coming from sources
not known to be strictly standards-conforming.
\value StrictMode Only valid URLs are accepted. This mode is useful for
general URL validation.
In TolerantMode, the parser corrects the following invalid input:
In TolerantMode, the parser has the following behaviour:
\list
\li Spaces and "%20": If an encoded URL contains a space, this will be
replaced with "%20". If a decoded URL contains "%20", this will be
replaced with a single space before the URL is parsed.
\li Spaces and "%20": unencoded space characters will be accepted and will
be treated as equivalent to "%20".
\li Single "%" characters: Any occurrences of a percent character "%" not
followed by exactly two hexadecimal characters (e.g., "13% coverage.html")
will be replaced by "%25".
will be replaced by "%25". Note that one lone "%" character will trigger
the correction mode for all percent characters.
\li Reserved and unreserved characters: An encoded URL should only
contain a few characters as literals; all other characters should
be percent-encoded. In TolerantMode, these characters will be
automatically percent-encoded where they are not allowed:
space / double-quote / "<" / ">" / "[" / "\" /
"]" / "^" / "`" / "{" / "|" / "}"
space / double-quote / "<" / ">" / "\" /
"^" / "`" / "{" / "|" / "}"
Those same characters can be decoded again by passing QUrl::DecodeReserved
to toString() or toEncoded().
\endlist
When in StrictMode, if a parsing error is found, isValid() will return \c
false and errorString() will return a simple message describing the error.
If more than one error is detected, it is undefined which error gets
reported.
Note that TolerantMode is not usually enough for parsing user input, which
often contains more errors and expectations than the parser can deal with.
When dealing with data coming directly from the user -- as opposed to data
coming from data-transfer sources, such as other programs -- it is
recommended to use fromUserInput().
\sa fromUserInput(), setUrl(), toString(), toEncoded(), QUrl::FormattingOptions
*/
/*!
@ -178,6 +193,62 @@
Note that the case folding rules in \l{RFC 3491}{Nameprep}, which QUrl
conforms to, require host names to always be converted to lower case,
regardless of the Qt::FormattingOptions used.
The options from QUrl::ComponentFormattingOptions are also possible.
\sa QUrl::ComponentFormattingOptions
*/
/*!
\enum QUrl::ComponentFormattingOptions
\since 5.0
The component formatting options define how the components of an URL will
be formatted when written out as text. They can be combined with the
options from QUrl::FormattingOptions when used in toString() and
toEncoded().
\value PrettyDecoded The component is returned in a "pretty form", with
most percent-encoded characters decoded. The exact
behavior of PrettyDecoded varies from component to
component and may also change from Qt release to Qt
release. This is the default.
\value EncodeSpaces Leave space characters in their encoded form ("%20").
\value EncodeUnicode Leave non-US-ASCII characters encoded in their UTF-8
percent-encoded form (e.g., "%C3%A9" for the U+00E9
codepoint, LATIN SMALL LETTER E WITH ACUTE).
\value EncodeDelimiters Leave certain delimiters in their encoded form, as
would appear in the URL when the full URL is
represented as text. The delimiters are affected
by this option change from component to component.
\value EncodeReserved Leave the US-ASCII reserved characters in their encoded
forms.
\value DecodeReseved Decode the US-ASCII reserved characters.
\value FullyEncoded Leave all characters in their properly-encoded form,
as this component would appear as part of a URL. When
used with toString(), this produces a fully-compliant
URL in QString form, exactly equal to the result of
toEncoded()
\value MostDecoded Attempt to decode as much as possible. For individual
components of the URL, this decodes every percent
encoding sequence, control characters (U+0000 to U+001F)
and non-US-ASCII sequences that aren't valid UTF-8
sequences.
The values of EncodeReserved and DecodeReserved should not be used together
in one call. The behaviour is undefined if that happens. They are provided
as separate values because the behaviour of the "pretty mode" with regards
to reserved characters is different on certain components and specially on
the full URL.
\sa QUrl::FormattingOptions
*/
#include "qurl.h"
@ -1373,12 +1444,17 @@ const QByteArray &QUrlPrivate::normalized() const
/*!
Constructs a URL by parsing \a url. \a url is assumed to be in human
readable representation, with no percent encoding. QUrl will automatically
percent encode all characters that are not allowed in a URL.
The default parsing mode is TolerantMode.
Constructs a URL by parsing \a url. QUrl will automatically percent encode
all characters that are not allowed in a URL and decode the percent-encoded
sequences that represent a character that is allowed in a URL.
Parses the \a url using the parser mode \a parsingMode.
Parses the \a url using the parser mode \a parsingMode. In TolerantMode
(the default), QUrl will correct certain mistakes, notably the presence of
a percent character ('%') not followed by two hexadecimal digits, and it
will accept any character in any position. In StrictMode, encoding mistakes
will not be tolerated and QUrl will also check that certain forbidden
characters are not present in unencoded form. If an error is detected in
StrictMode, isValid() will return false.
Example:
@ -1457,12 +1533,18 @@ void QUrl::clear()
}
/*!
Parses \a url using the parsing mode \a parsingMode.
Parses \a url and sets this object to that value. QUrl will automatically
percent encode all characters that are not allowed in a URL and decode the
percent-encoded sequences that represent a character that is allowed in a
URL.
\a url is assumed to be in unicode format, with no percent
encoding.
Calling isValid() will tell whether or not a valid URL was constructed.
Parses the \a url using the parser mode \a parsingMode. In TolerantMode
(the default), QUrl will correct certain mistakes, notably the presence of
a percent character ('%') not followed by two hexadecimal digits, and it
will accept any character in any position. In StrictMode, encoding mistakes
will not be tolerated and QUrl will also check that certain forbidden
characters are not present in unencoded form. If an error is detected in
StrictMode, isValid() will return false.
\sa setEncodedUrl()
*/