It uses the bitmanip extension to optimize index_fist and index_last
with clz/ctz (using generic implementation that routes to compiler
builtin) and orc.b to check null bytes.
Checked the string test on riscv64 user mode.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>