do not forget pipelining: IALU operations do not cause stalls in this scenario
-test, bit is constant
lsl // 1 cycle
blt // 1 if not taken, 4 if taken
b // 4, use plain branch instead!
overall either 5 cycles (blt taken: 1 + 4) or 6 cycles (blt not taken: 1 + 1 + 4), so inlining would make sense
-bit stored in register
mov // 1
sub // 1
lsl // 1
blt // 1 if not taken, 4 if taken
b // 4
overall either 7 cycles (blt taken: 3 + 4) or 8 cycles (blt not taken: 3 + 1 + 4)
// write this in C instead; the compiler might be able to compute (31-param) at compile time!
// the compiler could also inline it!
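
A minimal sketch of what the C version could look like, assuming a 32-bit word and a bit index counted from 0 (LSB) to 31 (MSB). The names `test_bit`, `value` and `bit` are made up for illustration; the real routine's signature is not shown in these notes. It uses the same idea as the lsl + blt sequence: shift the wanted bit into the top position, then look at it.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if bit `bit` (0 = LSB, 31 = MSB) of
 * `value` is set, 0 otherwise. `bit` must be in the range 0..31. */
static inline int test_bit(uint32_t value, unsigned bit)
{
    /* Move the wanted bit into position 31, as the lsl does.
     * If `bit` is a constant at the call site, the compiler can fold
     * (31 - bit) at compile time once the function is inlined. */
    uint32_t shifted = (uint32_t)(value << (31u - bit));

    /* Bring the top bit down to position 0 to get a 0/1 result. */
    return (int)(shifted >> 31);
}
```

With `static inline` and a constant bit index, the call overhead and the mov/sub pair should both disappear, though that depends on the compiler and optimization level.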