Mark the ra register as undefined in _start, so that unwinding through main works correctly. Also, don't use a tail call so that ra points after the call to __libc_start_main, not after the previous call.