From Hanno:
When a server replies to a cookieless ClientHello with a HelloVerifyRequest,
it is supposed to reset the connection and wait for a subsequent ClientHello
which includes the cookie from the HelloVerifyRequest.
In testing environments, it might happen that the reset of the server
takes longer than for the client to replying to the HelloVerifyRequest
with the ClientHello+Cookie. In this case, the ClientHello gets lost
and the client will need retransmit. This may happen even if the underlying
datagram transport is reliable.
We previously observed random-looking failures from this test. I think they
were caused by a race condition where the client tries to reconnect while the
server is still closing the connection and has not yet returned to an
accepting state. In that case, the server would fail to see and reply to the
ClientHello, and the client would have to resend it.
I believe logs of failing runs are compatible with this interpretation:
- the proxy logs show the new ClientHello and the server's closing Alert are
sent the same millisecond.
- the client logs show the server's closing Alert is received after the new
handshake has been started (discarding message from wrong epoch).
The attempted fix is for the client to wait a bit before reconnecting, which
should vastly enhance the probability of the server reaching its accepting
state before the client tries to reconnect. The value of 1 second is arbitrary
but should be more than enough even on loaded machines.
The test was run locally 100 times in a row on a slightly loaded machine (an
instance of all.sh running in parallel) without any failure after this fix.
Use the same values as other 3d tests: this makes the test hopefully a bit
faster than the default values, while not increasing the failure rate.
While at it:
- adjust "needs_more_time" setting for 3d interop tests (we can't set the
timeout values for other implementations, so the test might be slow)
- fix some supposedly DTLS 1.0 test that were using dtls1_2 on the command
line
Adds a requirement for GNUTLS_NEXT (3.5.3 or above, in practice we should
install 3.6.3) on the CI.
See internal ref IOTSSL-2401 for analysis of the bugs and their impact on the
tests.
For now, just check that it causes us to fragment. More tests are coming in
follow-up commits to ensure we respect the exact value set, including when
renegotiating.
Note: no interop tests in ssl-opt.sh for now, as some of them make us run into
bugs in (the CI's default versions of) OpenSSL and GnuTLS, so interop tests
will be added later once the situation is clarified. <- TODO
1. Update the test script to un the ECC tests only if the relevant
configurations are defined in `config.h` file
2. Change the HASH of the ciphersuite from SHA1 based to SHA256
for better example
The code paths in the library are different for decryption and for
signature. Improve the test coverage by doing some error path tests
for decryption in addition to signature.
The certificate passed to async callbacks may not be the one set by
mbedtls_ssl_conf_own_cert. For example, when using an SNI callback,
it's whatever the callback is using. Document this, and add a test
case (and code sample) with SNI.
Add a test case for SSL asynchronous signature where f_async_resume is
called twice. Verify that f_async_sign_start is only called once.
This serves as a non-regression test for a bug where f_async_sign_start
was only called once, which turned out to be due to a stale build
artifacts with mismatched numerical values of
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS.
Testing the case where the resume callback returns an error at the
beginning and the case where it returns an error at the end is
redundant. Keep the test after the output has been produced, to
validate that the product does not use even a valid output if the
return value is an error code.
Document how the SSL async sign callback must treat its md_alg and
hash parameters when doing an RSA signature: sign-the-hash if md_alg
is nonzero (TLS 1.2), and sign-the-digestinfo if md_alg is zero
(TLS <= 1.1).
In ssl_server2, don't use md_alg=MBEDTLS_MD_NONE to indicate that
ssl_async_resume must perform an encryption, because md_alg is also
MBEDTLS_MD_NONE in TLS <= 1.1. Add a test case to exercise this
case (signature with MBEDTLS_MD_NONE).
Conflict resolution:
* ChangeLog: put the new entry from my branch in the proper place.
* include/mbedtls/error.h: counted high-level module error codes again.
* include/mbedtls/ssl.h: picked different numeric codes for the
concurrently added errors; made the new error a full sentence per
current standards.
* library/error.c: ran scripts/generate_errors.pl.
* library/ssl_srv.c:
* ssl_prepare_server_key_exchange "DHE key exchanges": the conflict
was due to style corrections in development
(4cb1f4d49c) which I merged with
my refactoring.
* ssl_prepare_server_key_exchange "For key exchanges involving the
server signing", first case, variable declarations: merged line
by line:
* dig_signed_len: added in async
* signature_len: removed in async
* hashlen: type changed to size_t in development
* hash: size changed to MBEDTLS_MD_MAX_SIZE in async
* ret: added in async
* ssl_prepare_server_key_exchange "For key exchanges involving the
server signing", first cae comment: the conflict was due to style
corrections in development (4cb1f4d49c)
which I merged with my comment changes made as part of refactoring
the function.
* ssl_prepare_server_key_exchange "Compute the hash to be signed" if
`md_alg != MBEDTLS_MD_NONE`: conflict between
ebd652fe2d
"ssl_write_server_key_exchange: calculate hashlen explicitly" and
46f5a3e9b4 "Check return codes from
MD in ssl code". I took the code from commit
ca1d742904 made on top of development
which makes mbedtls_ssl_get_key_exchange_md_ssl_tls return the
hash length.
* programs/ssl/ssl_server2.c: multiple conflicts between the introduction
of MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS and new auxiliary functions and
definitions for async support, and the introduction of idle().
* definitions before main: concurrent additions, kept both.
* main, just after `handshake:`: in the loop around
mbedtls_ssl_handshake(), merge the addition of support for
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS and SSL_ASYNC_INJECT_ERROR_CANCEL
with the addition of the idle() call.
* main, if `opt.transport == MBEDTLS_SSL_TRANSPORT_STREAM`: take the
code from development and add a check for
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS.
* main, loop around mbedtls_ssl_read() in the datagram case:
take the code from development and add a check for
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS; revert to a do...while loop.
* main, loop around mbedtls_ssl_write() in the datagram case:
take the code from development and add a check for
MBEDTLS_ERR_SSL_ASYNC_IN_PROGRESS; revert to a do...while loop.
Add test cases for SSL asynchronous signature to ssl-opt.sh:
* Delay=0,1 to test the sequences of calls to f_async_resume
* Test fallback when the async callbacks don't support that key
* Test error injection at each stage
* Test renegotiation
Previously, the idling loop in ssl_server2 didn't check whether
the underlying call to mbedtls_net_poll signalled that the socket
became invalid. This had the consequence that during idling, the
server couldn't be terminated through a SIGTERM, as the corresponding
handler would only close the sockets and expect the remainder of
the program to shutdown gracefully as a consequence of this.
This was subsequently attempted to be fixed through a change
in ssl-opt.sh by terminating the server through a KILL signal,
which however lead to other problems when the latter was run
under valgrind.
This commit changes the idling loop in ssl_server2 and ssl_client2
to obey the return code of mbedtls_net_poll and gracefully shutdown
if an error occurs, e.g. because the socket was closed.
As a consequence, the server termination via a KILL signal in
ssl-opt.sh is no longer necessary, with the previous `kill; wait`
pattern being sufficient. The commit reverts the corresponding
change.
The UDP tests involving the merging of multiple records into single
datagrams accumulate records for 10ms, which can be less than the
total flight preparation time if e.g. the tests are being run with
valgrind.
This commit increases the packing time for the relevant tests
from 10ms to 50ms.