bool bStartSchedularOnStartup { true }; // Spawns the scheduler thread during runtime initialization; otherwise the spawn is deferred until the last possible moment.
bool bEnableLegacyTicks { false }; // Turn this on to have an async-app/singleton thread pool tick SysPump on worker id zero. Alternatively, use SetMainThreadForSysPumpScheduling once you have a thread pool and a worker id.
AuUInt32 dwLegacyMainThreadSystemTickMS { 60 }; // Nowadays this is used to dispatch AuConsole commands to a main thread with AuAsync.
bool bEnableCpp20RecursiveCallstack { true }; // Enables/disables coroutine support, in that the runtime can work with nested IWorkItem::BlockUntilComplete()'s and IThreadPool::[Run/Poll/RunOnce/etc]()'s.
bool bIsApplicationClientSoftwareOnJitteryMemorySystem { false }; // Enable me to use the reserved debug memory as padding against system out-of-memory conditions.
// the catch: usually this memory is reserved for exit callbacks, internal low memory conditions, error reporting, and the like.
//
// generally you should not exploit this without ** acknowledging this thread-local condition via AuDebug::[Add/Dec]MemoryCrunch. ** ( << tl;dr: recommended way of accessing this memory)
//
// setting this flag enables debug buffer memory to be used at any point during any thread's execution, the moment mimalloc runs
// out of pre-reserved and system-mappable memory. I wouldn't use this for anything except monolithic client/user-facing applications
// that are likely to run on low resource systems (low spec or heavy app), with untested/uncaught C++ allocations splattered everywhere.
// this could be VERY useful to end users who are running into `bIsMemoryErrorFatal` crashes.
AuUInt32 uDebugMemoryReserveSize { 3 * 1024 * 1024 }; // Nowadays: a single v8 isolate is low sub-tens of MB of memory, executable file sizes are low MBs, image sizes are much larger. Forget small low-footprint VMs.
// Resets everything, assuming we don't have default initialization (C++14) or we cannot default-initialize bit-fields (C++20).
// This is a struct-local clear flag, applied on init.
#if defined(AU_LANG_CPP_20_)
bool bResetToRuntimeDefaults { false };
#else
bool bResetToRuntimeDefaults { true };
#endif
bool bNoThreadNames { false };
bool bPlatformIsSMPProcessorOptimized { true }; // Whether to attempt using _mm_pause or a similar instruction before yielding into the kernel
AuUInt16 uSpinLoopPowerA { 128 }; // Nudgeable spin-loop power. This is our local userland niceness factor
// This is comparable to Win32's SetCriticalSectionSpinCount applied across every single AuThreadPrimitives try-lock and lock.
// Adjust this value to compensate for longer critical sections when context switching isn't preferable.
// Using 128 as a default (bouncing around 64 and 512)
// Facebook says half this (can't find the source). I used to say about 82 to 512. Windows 7's implementation of CRITICAL_SECTION and SRWLOCK says double that (256).
// For aggressive (and now incorrect) spin mutex examples, I've seen around 2k or less; for not-so-aggressive pause loops, I've seen people use 32-128-ish pauses (also incorrect).
// Dumb shits parroting Win9x documentation and the SetCriticalSectionSpinCount example value think you need >= 4k (StackExchange strikes again).
// Personally, I've seen this tested on 5-12th gen intel, Windows 7 through 11, Linux, and various other configurations.
// Personally, I've seen this run Qt with less CPU resources than every other Qt process on Win7. I've seen this run JavaScript programs dead last on the Task Manager's details panel, on both 10 and 7.
// 128 to 512 is fine, unless you need to start asserting you are a real time application aware of your hardware requirements / have properly matched task affinity / etc, and don't mind shredding old processor power efficiency while chewing thru nop cycles
// <<<<<<<<<<<<<<< (QA:) Each application will probably need its own nudge value
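// Illustrative sketch only (not the runtime's actual code): assuming uSpinLoopPowerA acts as a plain iteration count,
// and with hypothetical TryLock() / SleepInKernel() helpers, a spin-then-block acquire looks roughly like:
//     for (AuUInt16 i = 0; i < uSpinLoopPowerA; i++)
//     {
//         if (TryLock()) return;   // acquired without a context switch
//         _mm_pause();             // only when bPlatformIsSMPProcessorOptimized is set
//     }
//     SleepInKernel();             // spin budget exhausted; block until woken, then retry
// A larger value trades CPU cycles for fewer context switches on short critical sections.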
AuUInt64 bEnableAggressiveScheduling : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // <<<<<<<<<<<<<<< (SHIP:) ENABLE ME FOR AROUND 1MS OR LESS SCHED RESOLUTION
AuUInt64 bPreferNt51XpMutexesOver8 : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // Under modern versions of Windows, do not use keyed events. Use the native WaitOnAddress internals, then WaitOnAddress proper; and don't touch the keyed-event paths.
AuUInt64 bPreferNt51XpCondvarsOver8 : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // Under modern versions of Windows, do not use keyed events. Use the native WaitOnAddress internals, then WaitOnAddress proper; and don't touch the keyed-event paths.
AuUInt64 bPreferNtCondvarModernWinSpin : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // Very modern CPUs have monitor / tpause / etc intrinsics. Sometimes, like us, Microsoft will use them in userspace under WaitOnAddress on very modern Windows builds. I wouldn't rely on that. We implement spinning ourselves for Linux + old Win32, for 2 decades' worth of processors.
AuUInt64 bPreferNtCondvarOlderWinSpin : 1 AU_BIT_FIELD_INIT_AFTER_20(true); // Windows 7 and lower see better CPU + power draw when we implement spinning ourselves on top of the dreaded bidirectionally blocking keyed events. Besides, MSFT refused to backport userland monitor (very modern chipsets) to old versions of 10 and 7.
AuUInt64 bPreferWaitOnAddressAlwaysSpin : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // ..., if emulated! If double-spinning under higher level locks, disable me.
AuUInt64 bPreferWaitOnAddressAlwaysSpinNative : 1 AU_BIT_FIELD_INIT_AFTER_20(!AuBuild::kIsNtDerived); // ..., if not emulated! Note that most kernels and user-schedulers will spin for you. NT users can expect ntdll to spin / pause / monitor / etc, under * modern * Win32 versions.
AuUInt64 bPreferFutexRWLock : 1 AU_BIT_FIELD_INIT_AFTER_20(true); // Win10+ and Linux should use futexes inside the AuRWLock primitive, vs other dumber primitives built on a similar futex abstraction; both will perform about the same regardless.
// Once you take into account other platform-specific member overhead, making this compile time isn't worth it in memory or in CPU overhead. Enjoy the extra compat (incl WinXP, for almost free).
// Considering we beat pthreads, 3 STLs, Win32 primitives in API functionality and in legacy XP compat, we're * hundreds * of bytes less than a bad STL (incl llvm and msvc), I think our RWLock is fine.
// Making it any smaller would require a different API, different tooling assumptions, and different CPU branching overhead assumptions.
// Even the CPU branching implications of a *portable, potentially-relinkable, potentially-asm* thread id check destroys the excuse for a smaller Aurora::Threading::Waitables futex reimplementation.
AuUInt64 bWinXpThrough7BlazeOptimizerPower : 12 AU_BIT_FIELD_INIT_AFTER_20(300); // Don't worry about it. We don't care about old portables. Let's try to make older Win32 targets tweak the scheduling in our favor a bit.
AuUInt64 bPreferFutexEvent : 1 AU_BIT_FIELD_INIT_AFTER_20(true); // Win10+ and Linux should use a futex inside the AuEvent / AuThreadPrimitive event as the hybrid binary-semaphore/cross/event's signal flag.
// It only takes 8k ns to 60k ns depending on the platform to wake a thread, and we can hit AuAsync scheduler without too much error; we can just do this instead of relying on historically shitty IO primitives.