On LoongArch GCC compiles __builtin_ffs{,ll} to basically
`(x ? __builtin_ctz (x) : -1) + 1`. Since a hardware ctz instruction is
available, this is much better than the table-driven generic
implementation.
Tested on loongarch64.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>