This allows us to get the file size even when the input file is passed
via stdin. This fixes `--content-size` not working in situations like
$ lz4 -v --content-size < /tmp/test > /tmp/test.lz4
Warning : cannot determine input content size
With this change, it works.
Also helps with #904.
which serves no more purpose.
The comment implies that the simple presence of this unused function was affecting performance,
and that's the reason why it was not removed earlier.
This is likely another side effect of instruction alignment.
It's obviously unreliable to rely on it in this way,
meaning that the impact will be different, positive of negative,
with any minor code change, and any minor compiler version change, even parameter change.
EndMark, the 4-bytes value indicating the end of frame,
must be `0x00000000`.
Previously, it was just mentioned as a `0-size` block.
But such definition could encompass uncompressed blocks of size 0,
with a header of value `0x80000000`.
But the intention was to also support uncompressed empty blocks.
They could be used as a keep-alive signal.
Note that compressed empty blocks are already supported,
it's just that they have a size 1 instead of 0 (for the `0` token).
Unfortunately, the decoder implementation was also wrong,
and would also interpret a `0x80000000` block header as an endMark.
This issue evaded detection so far simply because
this situation never happens, as LZ4Frame always issues
a clean 0x00000000 value as a endMark.
It also does not flush empty blocks.
This is fixed in this PR.
The decoder can now deal with empty uncompressed blocks,
and do not confuse them with EndMark.
The specification is also clarified.
Finally, FrameTest is updated to randomly insert empty blocks during fuzzing.
`LZ4_memcpy()` uses `__builtin_memcpy()` to ensure that clang/gcc
can inline the `memcpy()` calls in freestanding mode.
This is necessary for decompressing the Linux Kernel with LZ4.
Without an analogous patch decompression ran at 77 MB/s, and with
the patch it ran at 884 MB/s.
Similar work in the kernel:
https://patchwork.kernel.org/patch/11351499/
UBsan (+clang-10) complains about doing pointer arithmetic (adding 0)
to a nullpointer.
This patch is tested with clang-10+ubsan