mirror of
https://github.com/KhronosGroup/SPIRV-Tools
synced 2024-11-23 12:10:06 +00:00
0f3bea06ef
The current EnumSet implementation is only efficient for enums with values below 64. The reason is that values below 64 are stored as a bitmask in a 64-bit unsigned integer, and all other values are stored in a std::set. For small enums this is fine (most SPIR-V enums have IDs below 64), but performance drops with larger enums (capabilities, opcodes).

Design considerations:
----------------------

This PR changes the internal behavior of the EnumSet to handle enums with arbitrary values while staying performant. The idea is to extend the 64-bit buckets sparsely:

- Each bucket can store 64 values, starting from a multiple of 64. This could be considered as a hashset with linear probing.
- For small enums, there is a slight memory overhead due to the bucket storage, but lookup is still constant.
- For linearly distributed values, lookup is constant.
- The worst case for storage is an enum whose values are all multiples of 64, since each value then needs its own bucket. Lookup is still constant.
- The worst case for lookup is an enum with many small ranges scattered across the value space (this requires linear probing).

For enums like capabilities/opcodes, this bucketing is useful because values are usually scattered in distinct, but almost contiguous, blocks (vendors usually have allocated ranges, like [5000;5500], while [1000;5000] is mostly unused).

Benchmarking:
-------------

Benchmarking was done in 2 ways:

- a benchmark built for the occasion, which only measures the EnumSet performance.
- SPIRV-Tools tests, to measure a more realistic scenario.

Running the SPIRV-Tools tests with both implementations shows the same performance (delta < noise), so there appears to be no regression. This method is noisy by nature (I/O, etc.), but it is the most representative of a real-life scenario.

Protocol:
- run spirv-tests with no stdout using perf, multiple times.

Result:
- measurement noise is larger than the observed difference.

The custom benchmark tested the EnumSet interfaces using SPIR-V enums, doing thousands of insertions/deletions/lookups in 2 kinds of scenarios:

- add once, lookup many times.
- add/delete/lookup many times.

For small enums, results are similar (delta < noise). This is consistent with the SPIRV-Tools test results above, as most SPIR-V enums are small and SPIRV-Tools does not perform that many intensive operations on EnumSets.

Performance on large enums (opcodes/capabilities) shows an improvement:

+-----------------------------+---------+---------+---------+
| Metric                      | Old     | New     | Delta % |
+-----------------------------+---------+---------+---------+
| Execution time              | 27s     | 7s      | -72%    |
| Instruction count           | 174b    | 129b    | -25%    |
| Branch count                | 28b     | 33b     | +17%    |
| Branch miss                 | 490m    | 26m     | -94%    |
| Cache-misses                | 149k    | 26k     | -82%    |
+-----------------------------+---------+---------+---------+

Future work
-----------

This was by design an NFC (no functional change) change, to compare apples to apples. The next PR aims to add STL-like iterators to the EnumSet so that it can be used with STL algorithms and range-based for loops.

Signed-off-by: Nathan Gauër <brioche@google.com>
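The sparse bucketing described in the commit message can be illustrated with a minimal sketch. This is not the SPIRV-Tools code: the class name `BucketedSet`, its members, and the probing strategy (a sorted vector of buckets, probed backwards from the index `start / 64`) are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a sparse, bucketed set of small integers.
// Each bucket covers 64 consecutive values starting at a multiple of 64.
class BucketedSet {
 public:
  // Returns true if `value` was not already present.
  bool insert(uint32_t value) {
    const uint32_t start = bucket_start(value);
    const size_t pos = probe(start);
    assert(pos <= buckets_.size());
    if (pos == buckets_.size() || buckets_[pos].start != start) {
      // No bucket covers this range yet: create one, keeping the vector sorted.
      buckets_.insert(buckets_.begin() + static_cast<std::ptrdiff_t>(pos),
                      Bucket{start, 0});
    }
    const uint64_t bit = bit_of(value);
    const bool added = (buckets_[pos].mask & bit) == 0;
    buckets_[pos].mask |= bit;
    return added;
  }

  bool contains(uint32_t value) const {
    const uint32_t start = bucket_start(value);
    const size_t pos = probe(start);
    return pos < buckets_.size() && buckets_[pos].start == start &&
           (buckets_[pos].mask & bit_of(value)) != 0;
  }

  // Returns true if `value` was present. Empty buckets are dropped.
  bool erase(uint32_t value) {
    if (!contains(value)) return false;
    const size_t pos = probe(bucket_start(value));
    buckets_[pos].mask &= ~bit_of(value);
    if (buckets_[pos].mask == 0) {
      buckets_.erase(buckets_.begin() + static_cast<std::ptrdiff_t>(pos));
    }
    return true;
  }

  size_t bucket_count() const { return buckets_.size(); }

 private:
  static constexpr uint32_t kBucketSize = 64;

  struct Bucket {
    uint32_t start;  // First value covered; always a multiple of 64.
    uint64_t mask;   // Bit i set means (start + i) is in the set.
  };

  static uint32_t bucket_start(uint32_t value) {
    return value - (value % kBucketSize);
  }
  static uint64_t bit_of(uint32_t value) {
    return uint64_t{1} << (value % kBucketSize);
  }

  // Returns the index of the first bucket whose start is >= `start`.
  // Buckets are sorted and distinct, so the bucket at index j starts at
  // j * 64 or later; the target therefore sits at or before start / 64.
  // Dense enums hit on the first step; sparse ones walk backwards linearly.
  size_t probe(uint32_t start) const {
    if (buckets_.empty()) return 0;
    size_t pos =
        std::min<size_t>(start / kBucketSize, buckets_.size() - 1) + 1;
    while (pos > 0 && buckets_[pos - 1].start >= start) --pos;
    return pos;
  }

  std::vector<Bucket> buckets_;  // Sorted by `start`.
};
```

In this sketch, values such as 5000 and 5001 share a single bucket (the one starting at 4992), which mirrors the observation that vendor ranges form almost contiguous blocks, while a set holding only multiples of 64 needs one bucket per value.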
||
---|---|---|
diff | ||
fuzz | ||
fuzzers | ||
link | ||
lint | ||
opt | ||
reduce | ||
scripts | ||
tools | ||
util | ||
val | ||
wasm | ||
assembly_context_test.cpp | ||
assembly_format_test.cpp | ||
binary_destroy_test.cpp | ||
binary_endianness_test.cpp | ||
binary_header_get_test.cpp | ||
binary_parse_test.cpp | ||
binary_strnlen_s_test.cpp | ||
binary_to_text_test.cpp | ||
binary_to_text.literal_test.cpp | ||
c_interface_test.cpp | ||
CMakeLists.txt | ||
comment_test.cpp | ||
cpp_interface_test.cpp | ||
diagnostic_test.cpp | ||
enum_set_test.cpp | ||
enum_string_mapping_test.cpp | ||
ext_inst.cldebug100_test.cpp | ||
ext_inst.debuginfo_test.cpp | ||
ext_inst.glsl_test.cpp | ||
ext_inst.non_semantic_test.cpp | ||
ext_inst.opencl_test.cpp | ||
fix_word_test.cpp | ||
generator_magic_number_test.cpp | ||
hex_float_test.cpp | ||
immediate_int_test.cpp | ||
libspirv_macros_test.cpp | ||
log_test.cpp | ||
name_mapper_test.cpp | ||
named_id_test.cpp | ||
opcode_make_test.cpp | ||
opcode_require_capabilities_test.cpp | ||
opcode_split_test.cpp | ||
opcode_table_get_test.cpp | ||
operand_capabilities_test.cpp | ||
operand_pattern_test.cpp | ||
operand_test.cpp | ||
operand-class-test-coverage.csv | ||
parse_number_test.cpp | ||
pch_test.cpp | ||
pch_test.h | ||
preserve_numeric_ids_test.cpp | ||
software_version_test.cpp | ||
string_utils_test.cpp | ||
target_env_test.cpp | ||
test_fixture.h | ||
text_advance_test.cpp | ||
text_destroy_test.cpp | ||
text_literal_test.cpp | ||
text_start_new_inst_test.cpp | ||
text_to_binary_test.cpp | ||
text_to_binary.annotation_test.cpp | ||
text_to_binary.barrier_test.cpp | ||
text_to_binary.composite_test.cpp | ||
text_to_binary.constant_test.cpp | ||
text_to_binary.control_flow_test.cpp | ||
text_to_binary.debug_test.cpp | ||
text_to_binary.device_side_enqueue_test.cpp | ||
text_to_binary.extension_test.cpp | ||
text_to_binary.function_test.cpp | ||
text_to_binary.group_test.cpp | ||
text_to_binary.image_test.cpp | ||
text_to_binary.literal_test.cpp | ||
text_to_binary.memory_test.cpp | ||
text_to_binary.misc_test.cpp | ||
text_to_binary.mode_setting_test.cpp | ||
text_to_binary.pipe_storage_test.cpp | ||
text_to_binary.reserved_sampling_test.cpp | ||
text_to_binary.subgroup_dispatch_test.cpp | ||
text_to_binary.type_declaration_test.cpp | ||
text_word_get_test.cpp | ||
timer_test.cpp | ||
unit_spirv.cpp | ||
unit_spirv.h |