SPIRV-Tools/tools/io.h
Shahbaz Youssefi 02433568af
tools: Accept hex representation as binary input (#5870)
Sometimes when debugging or logging, SPIR-V may be dumped as a stream of
hex values.  There are tools to convert such a stream to binary
(such as [1]) but they create an inconvenient extra step when for
example the disassembly of that hex stream is needed.

[1]: https://www.khronos.org/spir/visualizer/hexdump.html

In this change, the binary reader used by the tools is enhanced to
detect when the binary is actually a hex stream, and parse that instead.
The following formats are accepted, detected based on how the SPIR-V
magic number is output:

=== Words

If the first token of the hex stream is one of 0x07230203, 0x7230203,
x07230203, or x7230203, the hex stream is expected to consist of 32-bit
hex words prefixed with 0x or x.  For example:

    0x7230203, 0x10400, 0x180001, 0x79, 0x0

is parsed as:

    0x07230203 0x00010400 0x00180001 0x00000079 0x00000000

Note that `,` is optional in the stream, but the hex values are expected
to be delimited by either `,` or whitespace.

=== Bytes With Prefix

If the first token of the hex stream is one of 0x07, 0x7, x07, x7, 0x03,
0x3, x03, or x3, the hex stream is expected to consist of 8-bit hex
bytes prefixed with 0x or x.  If the first token has a value of 7, the
stream is big-endian.  Otherwise it's little-endian.  For example:

    0x3, 0x2, 0x23, 0x7, 0x0, 0x4, 0x1, 0x0, 0x1, 0x0, 0x18, 0x0, 0x79, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0

is parsed as:

    0x07230203 0x00010400 0x00180001 0x00000079 0x00000000

Similar to "Words", `,` is optional in the stream, but the hex values
are expected to be delimited by either `,` or whitespace.

=== Bytes Without Prefix

If the first two characters of the hex stream is 07, or 03, the hex
stream is expected to consist of 8-bit hex bytes of 2 characters each.
If the first token is 07, the stream is big-endian.  Otherwise it's
little-endian.  Unlike the other modes, delimiter is optional (which
automatically handles 32-bit word streams), but no 0-padding is done.
For example, all of the following:

    03, 02, 23, 07, 00, 04, 01, 00, 01, 00, 18, 00, 79, 00, 00, 00, 00, 00, 00, 00
    03 02 23 07 00 04 01 00 01 00 18 00 79 00 00 00 00 00 00 00
    03022307 00040100 01001800 79000000 00000000
    07,23,02,03,00,01,04,00,00,18,00,01,00,00,00,79,00,00,00,00
    07230203, 00010400, 00180001, 00000079, 00000000

are parsed as:

    0x07230203 0x00010400 0x00180001 0x00000079 0x00000000
2024-11-04 09:57:37 -05:00

66 lines
2.9 KiB
C++

// Copyright (c) 2016 Google Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#ifndef TOOLS_IO_H_
#define TOOLS_IO_H_
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>
// Sets the contents of the file named |filename| in |data|, assuming each
// element in the file is of type |uint32_t|. The file is opened as a binary
// file. If |filename| is nullptr or "-", reads from the standard input, but
// reopened as a binary file. If any error occurs, writes error messages to
// standard error and returns false.
//
// If the given input is detected to be in ascii hex, it is converted to binary
// automatically. In that case, the shape of the input data is determined based
// on the representation of the magic number:
//
// * "[0]x[0]7230203": Every following "0x..." represents a word.
// * "[0]x[0]7[,] [0]x23...": Every following "0x..." represents a byte, stored
// in big-endian order
// * "[0]x[0]3[,] [0]x[0]2...": Every following "0x..." represents a byte,
// stored in little-endian order
// * "07[, ]23...": Every following "XY" represents a byte, stored in
// big-endian order
// * "03[, ]02...": Every following "XY" represents a byte, stored in
// little-endian order
bool ReadBinaryFile(const char* filename, std::vector<uint32_t>* data);
// The hex->binary logic of |ReadBinaryFile| applied to a pre-loaded stream of
// bytes. Used by tests to avoid having to call |ReadBinaryFile| with temp
// files. Returns false in case of parse errors.
bool ConvertHexToBinary(const std::vector<char>& stream,
std::vector<uint32_t>* data);
// Sets the contents of the file named |filename| in |data|, assuming each
// element in the file is of type |char|. The file is opened as a text file. If
// |filename| is nullptr or "-", reads from the standard input, but reopened as
// a text file. If any error occurs, writes error messages to standard error and
// returns false.
bool ReadTextFile(const char* filename, std::vector<char>* data);
// Writes the given |data| into the file named as |filename| using the given
// |mode|, assuming |data| is an array of |count| elements of type |T|. If
// |filename| is nullptr or "-", writes to standard output. If any error occurs,
// returns false and outputs error message to standard error.
template <typename T>
bool WriteFile(const char* filename, const char* mode, const T* data,
size_t count);
#endif // TOOLS_IO_H_