added alternative batching kernel (slow)
tweaked controls a bit
added command-line options --selected_demo=<int> and --new_batching
started looking into parallel 3d sap
start adding various scenes to test gpu rigid body pipeline
reserve more memory for shapes (concave triangle mesh can be huge) in GLInstancingRenderer
fix a few crashes when 0 objects