IPv6 addresses can start with ":", for which QDir::isAbsolute() would
always return true (QResourceFileEngine::isRelativePath() returns
constant false) and would trip the calculation for local files.
Similarly, IPv6 addresses can start with strings that look like Windows
drives: "a:", "b:", "c:", "d:", "e:" and "f:" (though not today, as
those address blocks are unassigned). Since a valid IPv6 address will
definitely require at least one more colon and Windows file names cannot
contain ':', there's no ambiguity: a valid IPv6 address is never a valid
file on Windows.
This resolves the ambiguity in favor of IPv6 for Unix filenames (which
can contain a colon) and in case of an URL containing scheme, relative
path and no authority ("dead:beef::" for example could have been parsed
as scheme() == "dead" and path() == "beef::").
Task-number: QTBUG-41089
Change-Id: Id9119af1acf8a75a786519af3b48b4ca3dbf3719
Reviewed-by: David Faure <david.faure@kdab.com>
This is much more useful than the URL "file:", it allows to use
"empty path" and "empty URL" for the same meaning (e.g. not set).
QFileDialog actually uses "file:" though, as the URL for the
"My Computer" item in the sidebar. This patch preserves that.
[ChangeLog][QtCore][QUrl] QUrl::fromLocalFile now returns an empty URL
if the input string is empty.
Change-Id: Ib5ce1a3cdf5f229368e5bcd83c62c1d1ac9f8a17
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This was prompted by https://git.reviewboard.kde.org/r/119221
Change-Id: Ia148f07f6d711df533693918bbedfa5e7dc02cd5
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Useful for any application that can take URLs on the command-line, so that
full paths and relative paths can also be accepted.
Change-Id: I8a2c50f36d60bdc49c065b3065972fd5d45fa12a
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This partially reverts 5764f5e6ea and
fixes the problem differently.
After this commit, "file:///foo" is still equal to "file:/foo", but
"foo:///foo" becomes different from "foo:/foo", as it should be.
Task-number: QTBUG-36151
Change-Id: Ia38638b0f30a7dcf110aa89aa427254c007fc107
Reviewed-by: David Faure <david.faure@kdab.com>
New API:
static QString QString::fromCFString(CFStringRef string);
CFStringRef QString::toCFString() const;
static QString QString::fromNSString(const NSString *string);
NSString *QString::toNSString() const;
static QUrl QUrl::fromCFURL(CFURLRef url);
CFURLRef QUrl::toCFURL() const;
static QUrl QUrl::fromNSURL(const NSURL *url);
NSURL * QUrl::toNSURL() const;
Add Q_OS_MAC-protected function declarations to header
files, add implementation to _mm files.
CF and NS types are forward-declared in the header
files to avoid including the CoreFoundation and Foundation
headers. This prevents accidental use of native types
in application code. Add helper macros for forward-
declaration to qglobal.h
Add cf_returns_retained/ns_returns_autoreleased attributes
to toCFString() and toNSURL(). These attributes assists
the clang static analyzer. Add Q_DECL_ helper macros
to qcompilerdetection.h.
Add test functions (in _mac.mm files) to the QString
and QUrl tests. Split out the test class declarations
into a separate headers files.
Change-Id: I60fd5e93f042316196284c3db0595835fe8c4ad4
Reviewed-by: Gabriel de Dietrich <gabriel.dedietrich@digia.com>
which would interpret 'path' as a hostname.
The check is in the public setPath so that the internal one can still
support parsing URLs such as ftp://ftp.example.com//path.
[ChangeLog][Important Behavior Changes][QUrl and QUrlQuery]QUrl now
normalizes the path given in setPath, removing ./ and ../ and duplicate
slashes.
Change-Id: I05ccd8a1d813de45e460384239c059418a8e6a08
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
QUrl("http%3A%2F%2Fexample.com") has only a path of
"http%3A%2F%2Fexample.com". In Qt 5.0 and 5.1, the %3A would get decoded
to ':', which in turn makes the URL invalid (colon before first slash).
Found via discussion on the interest mailing list.
Change-Id: I7f4f242b330df280e635eb97cce123e742aa1b10
Reviewed-by: David Faure <david.faure@kdab.com>
This fixes the wrong value for path() and fileName() when a
path or file name actually contains a '%'.
userInfo() and authority() are not individual getters, they combine
two or more fields, so full decoding isn't possible (e.g. username
containing a ':').
[ChangeLog][Important Behavior Changes][QUrl and QUrlQuery]QUrl now
defaults to decoded mode in the getters and setters for userName,
password, host, topLevelDomain, path and fileName. This means a '%'
in one of those fields is now returned (or set) as '%' rather than "%25".
In the unlikely case where the former behavior was expected, pass PrettyDecoded
to the getter and TolerantMode to the setter.
Change-Id: Iaeecbde9c269882e79f08b29ff8c661157c41743
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
For some time, we've assumed that the URL specification had a mistake in
that it didn't allow the "#" character to appear decoded in the
fragment. We've gotten away with it so far.
However, turns out that the CoreFoundation NSURL class doesn't like it.
So we have to be stricter.
[ChangeLog][Important Behavior Changes][QUrl and QUrlQuery] QUrl no
longer decodes %23 found in the fragment to "#" in the output of
toString(QUrl::FullyEncoded) or toEncoded()
Task-number: QTBUG-31945
Change-Id: If5e0fb37bae84710986c9ca89bd69ec98437cd63
Reviewed-by: David Faure (KDE) <faure@kde.org>
Those sections contain more than one components of a URL, separated by
delimiters. For that reason, QUrl::FullyDecoded and QUrl::DecodedMode do
not make sense, since they would cause the returned value to be
ambiguous and/or fail to parse again.
In fact, there was a comment in the test saying "look how it becomes
ambiguous".
Those modes are already forbidden in the setters and getters of the full
URL (setUrl(), url(), toString() and toEncoded()).
[ChangeLog][Important Behavior Changes][QUrl and QUrlQuery] QUrl no
longer supports QUrl::FullyDecoded mode in authority() and userInfo(),
nor QUrl::DecodedMode in setAuthority() and setUserInfo().
Change-Id: I538f7981a9f5a09f07d3879d31ccf6f0c8bfd940
Reviewed-by: David Faure (KDE) <faure@kde.org>
The longer explanation can be found in the comment in qurl.cpp. The
short version is as follows:
Up to now, we considered that every character could be replaced with
its percent-encoding equivalent and vice-versa, so long as the parsing
of the URL did not change. For example, x:/path+path and
x:/path%2Bpath were the same. However, to do this and yet be compliant
with most URL uses in the real world, we had to add exceptions:
- "/" and "%2F" were not the same in the path, despite the delimiter
being behind (rationale was the complex definition of path)
- "+" and "%2B" were not the same in the query, so we ended up not
transforming any sub-delim in the query at all
Now, we change our understanding based on the following line from
RFC 3986 section 2.2:
URIs that differ in the replacement of a reserved character with
its corresponding percent-encoded octet are not equivalent.
From now on, QUrl will not replace any sub-delim or gen-delim
("reserved character"), except where such a character could not exist
in the first place. This simplifies the code and removes all
exceptions.
As a side-effect, this has also changed the behaviour of the "{" and
"}" characters, which we previously allowed to remain decoded.
[ChangeLog][Important Behavior Changes][QUrl and QUrlQuery] QUrl no
longer considers all delimiter characters equivalent to their
percent-encoded forms. Now, both classes always keep all delimiters
exactly as they were in the original URL text.
[ChangeLog][Important Behavior Changes][QUrl and QUrlQuery] QUrl no
longer decodes %7B and %7D to "{" and "}" in the output of toString()
Task-number: QTBUG-31660
Change-Id: Iba0b5b31b269635ac2d0adb2bb0dfb74c139e08c
Reviewed-by: David Faure (KDE) <faure@kde.org>
We don't know what it might be used for. The RFC for URI says it's an
HEXDIG, and since we uppercase all other HEXDIGs already (in
percent-encodings...).
Change-Id: I56d0a81315576dd98eaa2657c0307d79332543a5
Reviewed-by: David Faure (KDE) <faure@kde.org>
We have no idea what it might contain, but test it anyway to make sure
it works. Turns out there were a few bugs the unit tests have now
caught.
Change-Id: I0a6c868365feec31c2360b3c341c8ca6944f4352
Reviewed-by: David Faure (KDE) <faure@kde.org>
Registered names and IP addresses can only contain unreserved
characters (letters, digits, dots, hyphens, underscores) and the
colon, which is a gen-delim. For registered names and IPv4 addresses,
we can simply use the default config -- if anything that remains
percent-encoded, it means it's not a valid hostname anyway.
For IPv6, we just need to decode the colon.
Change-Id: If8083d47f6e5375f760e7a6c59631c89e4da8378
Reviewed-by: David Faure (KDE) <faure@kde.org>
This is a bit like QDir::cleanPath(), but for URL paths.
The code is shared with QDir::cleanPath(), by extracting the common parts
it into a helper, qt_normalizePathSegments().
Change-Id: I7133c5e4aa2bf17fba98af13eb5371afba64197a
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This allows to find the parent directory url using
url.adjusted(QUrl::RemoveFilename).
Change-Id: I1ca433ac67e4f93080de54a9b7ab2e538509ed04
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
qt_ACE_do(".co.uk") was returning an empty string because of the
leading dot. Allow leading dots from topLevelDomain, but not from
other calls.
Change-Id: I757d9960708e205d30554cd2bbcf618c8624792b
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
This reverts commit e3fa266623b08e837cb4ccc7fe59da243d03dd27
That commit applied a change at the wrong place in the code.
Change-Id: I21e3045a3af14ad2f90c5fe338815c35a2d27ae6
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: David Faure (KDE) <faure@kde.org>
Leading and double dots are bad, but trailing dots are fine. The ASCII
part of a hostname is supposed to be LDH (letters, digits, hyphen) only,
but we accept '_' (underscore) as an exception too.
Change-Id: I79957ddec4da78a0e2357fe50c8687db03e1c99e
Reviewed-by: David Faure (KDE) <faure@kde.org>
qt_ACE_do(".co.uk") was returning an empty string because of the
leading dot. This has always caused issues in KDE code too, where ACE
normalization needs the dot removed, and re-added afterwards.
Change-Id: Id9fcea0333cf55c14d755a86d4bf33a50f194429
Reviewed-by: Frederik Gladhorn <frederik.gladhorn@digia.com>
qt_nameprep is tested by tst_qurlinternal. We just need to be sure that
QUrl handles them correctly.
Change-Id: Ic563004870d2cf2fa7a31ce49fff7280d5ffb5f3
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Most notably, .com and .net now may contain non-ASCII characters.
list has been generated from
http://www.mozilla.org/projects/security/tld-idn-policy-list.html
Change-Id: Idc3191dc782bc4173ccb19b4bc81f4f061ca7999
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
The table is there to know which domains are allowed to set cookies
and which are not. There are more than 2000 new entries since the
list has last been generated.
The split to 64K chunks was made because this is the hard limit for
strings in Visual Studio.
Change-Id: I511aec062af673555e9a69442c055f75bdcd1606
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
These tests were added to Qt 4 on commit
a17fc85b51a6bdcfa33dcff183d2b7efd667fb92
Task-number: QTBUG-28985
Change-Id: I3cf595384f14272197dcfb85943213c8f8ddeba0
Reviewed-by: Shane Kearns <shane.kearns@accenture.com>
This is a very common thing to do, e.g. in order to send urls via DBus.
Change-Id: I277902460ee1ad6780446e862e86b3c2eb8c5315
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
QUrl::fromUserInput("http://") was invalid, which doesn't make sense
since QUrl("http://") is valid. Same for "smb:" which is actually
even more a valid URL from a user's point of view.
Change-Id: I371ac393d61b49499edf5adbbc2a90b426fe9e5d
Reviewed-by: Marco Martin <mart@kde.org>
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
These cases weren't handled before.
The validateComponent function is copied from QUrlPrivate::parse, with
the added modification that it now needs to check the gen-delims for
the userinfo.
Change-Id: I055167b977199fa86b56a3a7259a7445585129c6
Reviewed-by: David Faure (KDE) <faure@kde.org>
... with respect to empty and null strings.
Change-Id: Ic107d5bcc8b659497a567b75a7244caceba5a715
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Keep the original QString that triggered the parsing error, instead of
just one QChar. This provides more powerful error messages, like:
Invalid IPv6 address; source was "http://[:::]"; scheme = "http", host = ""
(QUrl cannot keep invalid hostnames)
Invalid port or port number out of range; source was "http://example.com:abc"; scheme = "http", host = "example.com"
(QUrl cannot keep a non-numeric port number)
Invalid path (character '%' not permitted); source was "foo:/path%?"; scheme = "foo", path = "/path%25%1F"
(the tolerant parser runs first, so the faulty component is fixed)
This stores the error state in a special structure which is not
allocated under normal conditions, keeping the memory consumption
down. On 32-bit systems, QUrlPrivate does not increase in size; on
64-bit systems, it grows by 8 bytes.
Change-Id: I93d798d43401dfeb9fca7b6eed7ea758da10136b
Reviewed-by: David Faure <faure@kde.org>
Make both invalid hostname messages start with "Invalid hostname". And
split the empty port error from the invalid port one.
Change-Id: I870d1ed6fb07ec494f553871a37ed167141ffc06
Reviewed-by: David Faure <faure@kde.org>
Reviewed-by: Shane Kearns <shane.kearns@accenture.com>
That's what we have QUrl::errorString() for. This will become evident
especially now that QUrl::toString() / toEncoded() return empty if
there are errors.
Change-Id: I64a84e9c6ee57c0fc38cc0c58f5286ddc1248d1f
Reviewed-by: Shane Kearns <shane.kearns@accenture.com>
Reviewed-by: David Faure <faure@kde.org>
These two errors can only happen if one calls setPath() explicitly. They
cannot happen for parsed URLs, which is why they are only caught with
isValid(). It's not possible to set the error condition in setPath()
either because they depend on the presence / absence of the authority
and scheme.
Also update all the unit tests that set a path not starting with a slash
and were just "freeloaders" on the previous behaviour.
Change-Id: Ice58cd4589a850452d7573a5b19667bbab2fb43e
Reviewed-by: David Faure <faure@kde.org>