mirror of
https://github.com/KhronosGroup/SPIRV-Tools
synced 2024-10-18 19:20:05 +00:00
0f3bea06ef
The current EnumSet implementation is only efficient for enums with values < 64. The reason is that values below 64 are stored as a bitmask in a 64-bit unsigned integer, while all other values are stored in a std::set.
For small enums, this is fine (most SPIR-V enums have IDs < 64), but performance starts to drop with larger enums (capabilities, opcodes).

Design considerations:
----------------------

This PR changes the internal behavior of the EnumSet to handle enums with arbitrary values while staying performant.
The idea is to extend the 64-bit buckets sparsely:
- Each bucket can store 64 values, starting from a multiple of 64. This could be considered as a hashset with linear probing.
- For small enums, there is a slight memory overhead due to the bucket storage, but lookup is still constant.
- For linearly distributed values, lookup is constant.
- The worst case for storage is an enum whose values are all multiples of 64, but lookup is still constant.
- The worst case for lookup is an enum with a lot of small ranges scattered in the space (requires linear probing).

For enums like capabilities/opcodes, this bucketing is useful because values are usually scattered in distinct but almost contiguous blocks (vendors usually have allocated ranges, like [5000;5500], while [1000;5000] is mostly unused).

Benchmarking:
-------------

Benchmarking was done in 2 ways:
- a benchmark built for the occasion, which only measures the EnumSet performance.
- SPIRV-Tools tests, to measure a more realistic scenario.

Running the SPIRV-Tools tests with both implementations shows the same performance (delta < noise), so there seems to be no regression.
This method is noisy by nature (I/O, etc.), but it is the most representative of a real-life scenario.

Protocol:
- run spirv-tests with no stdout using perf, multiple times.

Result:
- measurement noise is larger than the observed difference.

The custom benchmark tested the EnumSet interfaces using SPIR-V enums, doing thousands of insertions/deletions/lookups in 2 kinds of scenarios:
- add once, lookup many times.
- add/delete/lookup many times.

For small enums, results are similar (delta < noise). This is consistent with the SPIRV-Tools test results above, as most SPIR-V enums are small and SPIRV-Tools does not perform that many intensive operations on EnumSets.

Performance on large enums (opcodes/capabilities) shows an improvement:

+-----------------------------+---------+---------+---------+
| Metric                      | Old     | New     | Delta % |
+-----------------------------+---------+---------+---------+
| Execution time              |     27s |      7s |    -72% |
| Instruction count           |    174b |    129b |    -25% |
| Branch count                |     28b |     33b |    +17% |
| Branch miss                 |    490m |     26m |    -94% |
| Cache-misses                |    149k |     26k |    -82% |
+-----------------------------+---------+---------+---------+

Future work
-----------

This was by design an NFC (no functional change) PR, to compare apples to apples.
The next PR aims to add STL-like iterators to the EnumSet to allow using it with STL algorithms and range-based for loops.

Signed-off-by: Nathan Gauër <brioche@google.com>
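The sparse bucket layout described under the design considerations can be illustrated with a minimal sketch. This is not the SPIRV-Tools EnumSet: the class name `SparseBitSet` and all member names are hypothetical, and the only assumptions taken from the commit message are that each bucket covers 64 values starting at a multiple of 64 and that lookup probes linearly. The sketch keeps the buckets sorted by start value and probes downward from the highest index at which a bucket could possibly sit.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration of sparse 64-bit buckets; names are not from
// SPIRV-Tools.
class SparseBitSet {
 public:
  // Inserts `value`; returns true if it was not already present.
  bool insert(uint32_t value) {
    const uint32_t start = bucket_start(value);
    const uint64_t bit = bit_for(value);
    const size_t index = find_slot(start);
    if (index == buckets_.size() || buckets_[index].start != start) {
      // No bucket covers this range yet: create one, keeping the order.
      buckets_.insert(buckets_.begin() + static_cast<std::ptrdiff_t>(index),
                      Bucket{start, 0});
    }
    const bool added = (buckets_[index].bits & bit) == 0;
    buckets_[index].bits |= bit;
    return added;
  }

  // Returns true if `value` is in the set.
  bool contains(uint32_t value) const {
    const uint32_t start = bucket_start(value);
    const size_t index = find_slot(start);
    return index < buckets_.size() && buckets_[index].start == start &&
           (buckets_[index].bits & bit_for(value)) != 0;
  }

  // Removes `value`; returns true if it was present.
  bool erase(uint32_t value) {
    const uint32_t start = bucket_start(value);
    const size_t index = find_slot(start);
    if (index == buckets_.size() || buckets_[index].start != start) {
      return false;
    }
    const uint64_t bit = bit_for(value);
    if ((buckets_[index].bits & bit) == 0) return false;
    buckets_[index].bits &= ~bit;
    if (buckets_[index].bits == 0) {
      // Drop empty buckets so storage stays proportional to the ranges used.
      buckets_.erase(buckets_.begin() + static_cast<std::ptrdiff_t>(index));
    }
    return true;
  }

  size_t bucket_count() const { return buckets_.size(); }

 private:
  struct Bucket {
    uint32_t start;  // First value covered; always a multiple of 64.
    uint64_t bits;   // Bit i set means (start + i) is in the set.
  };

  static uint32_t bucket_start(uint32_t value) { return value - (value % 64); }
  static uint64_t bit_for(uint32_t value) { return uint64_t{1} << (value % 64); }

  // Returns the index of the bucket starting at `start`, or the index where
  // such a bucket would have to be inserted. Because buckets are sorted and
  // their starts are distinct multiples of 64, the bucket cannot sit above
  // index start / 64, so probing begins there and walks downward.
  size_t find_slot(uint32_t start) const {
    size_t index = std::min<size_t>(buckets_.size(), start / 64 + 1);
    while (index > 0 && buckets_[index - 1].start >= start) --index;
    return index;
  }

  std::vector<Bucket> buckets_;
};
```

With values packed densely from 0, the probe in `find_slot` stops after one step, which matches the constant lookup described for small and linearly distributed enums. Many small, widely scattered ranges make the downward walk longer, which is the lookup worst case named above.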
basic_block.cpp
basic_block.h
construct.cpp
construct.h
decoration.h
function.cpp
function.h
instruction.cpp
instruction.h
validate_adjacency.cpp
validate_annotation.cpp
validate_arithmetics.cpp
validate_atomics.cpp
validate_barriers.cpp
validate_bitwise.cpp
validate_builtins.cpp
validate_capability.cpp
validate_cfg.cpp
validate_composites.cpp
validate_constants.cpp
validate_conversion.cpp
validate_debug.cpp
validate_decorations.cpp
validate_derivatives.cpp
validate_execution_limitations.cpp
validate_extensions.cpp
validate_function.cpp
validate_id.cpp
validate_image.cpp
validate_instruction.cpp
validate_interfaces.cpp
validate_layout.cpp
validate_literals.cpp
validate_logicals.cpp
validate_memory_semantics.cpp
validate_memory_semantics.h
validate_memory.cpp
validate_mesh_shading.cpp
validate_misc.cpp
validate_mode_setting.cpp
validate_non_uniform.cpp
validate_primitives.cpp
validate_ray_query.cpp
validate_ray_tracing_reorder.cpp
validate_ray_tracing.cpp
validate_scopes.cpp
validate_scopes.h
validate_small_type_uses.cpp
validate_type.cpp
validate.cpp
validate.h
validation_state.cpp
validation_state.h