bool bStartSchedularOnStartup { true }; // spawns the scheduler thread during runtime initialization; otherwise the spawn is deferred until the last possible moment.
bool bEnableLegacyTicks { false }; // turn this on to have an async-app/singleton thread pool tick SysPump on worker id zero. Alternatively, use SetMainThreadForSysPumpScheduling once you have a thread pool and worker id.
AuUInt32 dwLegacyMainThreadSystemTickMS { 60 }; // nowadays this is used to dispatch AuConsole commands to a main thread with AuAsync.
bool bEnableCpp20RecursiveCallstack { true }; // enables/disables coroutine support, in that the runtime can work with nested IWorkItem::BlockUntilComplete()s and IThreadPool::[Run/Poll/RunOnce/etc]()s.
bool bIsApplicationClientSoftwareOnJitteryMemorySystem { false }; // enable me to get padding against system out-of-memory conditions.
// the catch: usually this memory is reserved for exit callbacks, internal low memory conditions, error reporting, and the like.
//
// generally you should not exploit this without ** acknowledging this thread-local condition via AuDebug::[Add/Dec]MemoryCrunch. ** ( << tl;dr: recommended way of accessing this memory)
//
// setting this flag allows the debug buffer memory to be used at any point during any thread's execution, from the moment mimalloc runs
// out of pre-reserved and system mappable memory. i wouldn't use this for anything except monolithic client/user-facing applications
// that are likely to run on low resource systems (low spec or heavy app), with untested/uncaught C++ allocations scattered everywhere.
// this could be VERY useful to end users who are running into `bIsMemoryErrorFatal` crashes.
AuUInt32 uDebugMemoryReserveSize { 3 * 1024 * 1024 }; /* nowadays: a single v8 isolate is low sub-tens of MB of memory, executable file sizes are low MBs, image sizes are much larger. forget small low-footprint VMs
// Facebook used to say half this (can't find the source). I used to say about 82 to 512. Windows 7's implementation of CRITICAL_SECTION and SRWLOCK says double that (256).
// For aggressive (and now incorrect) spin mutex examples, I've seen around 2k or less. Some Intel reference material uses 64 as a demo max spin value. For less aggressive pause loops, I've seen people use 32 to 128-ish pauses (also incorrect).
// People parroting Win9x documentation and SetCriticalSectionSpinCount's example value think you need 4k or more (Stack Exchange strikes again).
// Personally, I've seen this tested on 5th to 12th gen Intel, Windows 7 through 11, Linux, and various other configurations.
// Personally, I've seen this run Qt with less CPU resources than every other Qt process on Win7. I've seen this run JavaScript programs dead last on the Task Manager's details panel, on both 10 and 7.
// 128 to 512 is fine. On the upper end, you, the developer, need to start asserting that you are a real-time application aware of your hardware requirements, have properly matched task affinity, etc., and don't mind shredding old processors' power efficiency while chewing through nop cycles.
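// Illustration only: a minimal sketch of the bounded-spin idea the notes above argue about.
// SpinThenYield and uMaxSpins are hypothetical names, not part of this runtime; the default
// of 128 is simply the low end of the "128 to 512" range mentioned above. A real
// implementation would likely issue a pause-style instruction inside the spin phase.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical helper (an assumption for illustration, not the runtime's API):
// spin on a flag for at most uMaxSpins iterations, then fall back to yielding
// the time slice until the flag is set.
// Returns true if the flag was observed during the bounded spin phase.
inline bool SpinThenYield(const std::atomic<bool> &flag, std::uint32_t uMaxSpins = 128)
{
    for (std::uint32_t i = 0; i < uMaxSpins; i++)
    {
        if (flag.load(std::memory_order_acquire))
        {
            return true;
        }
        // Placeholder: production code would typically emit a CPU pause hint here.
    }

    // Spin budget exhausted: stop burning cycles and let the scheduler run others.
    while (!flag.load(std::memory_order_acquire))
    {
        std::this_thread::yield();
    }
    return false;
}
```

// Raising uMaxSpins trades idle CPU and power for lower wake latency, which is the
// trade-off the comment above asks real-time applications to accept knowingly.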
AuUInt64 bPreferNt51XpMutexesOver8 : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // under modern versions of Windows, do not use keyed events. use the native WaitOnAddress internals, then WaitOnAddress proper; and don't touch the keyed event paths.
AuUInt64 bPreferNt51XpCondvarsOver8 : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // under modern versions of Windows, do not use keyed events. use the native WaitOnAddress internals, then WaitOnAddress proper; and don't touch the keyed event paths.
AuUInt64 bPreferNtCondvarModernWinSpin : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // very modern CPUs have monitor / tpause / etc intrinsics. sometimes, like us, Microsoft will use them in userspace under WaitOnAddress on very modern Windows builds. i wouldn't rely on that. we implement spinning ourselves for Linux + old win32, covering 2 decades' worth of processors.
AuUInt64 bPreferNtCondvarOlderWinSpin : 1 AU_BIT_FIELD_INIT_AFTER_20(true); // Windows 7 and lower see better CPU + power draw when we implement spinning ourselves on top of the dreaded bidirectionally blocking keyed events. besides, Microsoft refused to backport userland monitor (very modern chipsets) to old versions of 10 and 7.
AuUInt64 bPreferWaitOnAddressAlwaysSpin : 1 AU_BIT_FIELD_INIT_AFTER_20(false); // ..., if emulated! if double-spinning under higher level locks, disable me.
AuUInt64 bPreferWaitOnAddressAlwaysSpinNative : 1 AU_BIT_FIELD_INIT_AFTER_20(!AuBuild::kIsNtDerived); // ..., if not emulated! noting that most kernels and user-schedulers will spin for you. NT users can expect ntdll to spin / pause / monitor / etc, under * modern * win32 versions.
AuUInt64 bPreferFutexRWLock : 1 AU_BIT_FIELD_INIT_AFTER_20(true); // Win10+ and Linux should use futexes inside the AuRWLock primitive, vs other dumber primitives built on a similar futex abstraction, both of which will perform about the same regardless.
// Once you take into account other platform specific member overhead, making this compile time isn't worth it in memory or in CPU overhead. Enjoy the extra compat (incl WinXP, for almost free).
// Considering we beat pthreads, 3 STLs, and Win32 primitives in API functionality and in legacy XP compat, and we're * hundreds * of bytes smaller than a bad STL (incl llvm and msvc), I think our RWLock is fine.
// Making it any smaller would require a different API, different tooling assumptions, and different CPU branching overhead assumptions.
// Even the CPU branching implications of a *portable, potentially-relinkable, potentially-asm* thread id check destroy the excuse for a smaller Aurora::Threading::Waitables futex reimplementation.
AuUInt64 bWinXpThrough7BlazeOptimizerPower : 12 AU_BIT_FIELD_INIT_AFTER_20(300); // don't worry about it. we don't care about old portables. let's try to make older win32 targets tweak the scheduling in our favor a bit.
AuUInt64 bPreferFutexEvent : 1 AU_BIT_FIELD_INIT_AFTER_20(true); // Win10+ and Linux should use a futex inside the AuEvent / AuThreadPrimitive event as the hybrid binary-semaphore/cross/event's signal flag.
// It only takes 8k ns to 60k ns, depending on the platform, to wake a thread, and we can hit the AuAsync scheduler without too much error; we can just do this instead of relying on historically poor IO primitives.
// ioConfig.bIsVeryLargeIOApplication          | for increased memory overhead, allows servers to open more IO completion contexts. looks bad on Linux client applications.
// fio.optDefaultBrand                         | for application branding. change this to your publisher's name. used for configuration and ~home isolation.
// threadingConfig.bEnableAggressiveScheduling | for real-time applications. required for fine-grained timing resolution. 0.0X ms to 0.3 ms tier resolution is viable on modern PC platforms (say < 250,000 ns).
// threadingConfig.bNoThreadNames              | prevents vendor libraries from reporting their thread names to an attached debugger, if not stripped from the application.
// threadingConfig.uSpinLoopPowerA             | increase me if the *global* context switch rate is too high. use AuThreading:: APIs if it's a per-thread issue.
// async.dwSchedulerRateLimitNS                | for sub-2 ms AuAsync timers and Windows 7 timers.
//                                             | Real-time applications should set this to 0.
//                                             | Interactive applications should lower this to 500'000 ns (0.5 ms) to 1'000'000 ns (1 ms).
//                                             | GUI applications should keep this value high to prevent high idle CPU usage.
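//
// Illustration only: a sketch of how the dwSchedulerRateLimitNS guidance above might be
// applied per application kind. The struct names, the EAppKind enum, MakeConfigFor, and the
// 2'000'000 ns default are all assumptions made for this example; only the field names
// (async.dwSchedulerRateLimitNS, threadingConfig.bEnableAggressiveScheduling) come from the
// table above. The stand-in types keep the example self-contained and are not the runtime's own.

```cpp
#include <cstdint>

// Stand-in config types (hypothetical; field names mirror the table above).
struct AsyncConfigSketch
{
    std::uint32_t dwSchedulerRateLimitNS { 2'000'000 }; // assumed "high" default, not confirmed
};

struct ThreadingConfigSketch
{
    bool bEnableAggressiveScheduling { false };
};

struct RuntimeConfigSketch
{
    AsyncConfigSketch async;
    ThreadingConfigSketch threadingConfig;
};

// Hypothetical classification of applications, following the three cases in the table.
enum class EAppKind
{
    eRealTime,
    eInteractive,
    eGui
};

inline RuntimeConfigSketch MakeConfigFor(EAppKind kind)
{
    RuntimeConfigSketch cfg;
    switch (kind)
    {
    case EAppKind::eRealTime:
        // Real-time: no scheduler rate limit, aggressive scheduling on.
        cfg.async.dwSchedulerRateLimitNS = 0;
        cfg.threadingConfig.bEnableAggressiveScheduling = true;
        break;
    case EAppKind::eInteractive:
        // Interactive: lower the limit into the 0.5 ms to 1 ms range.
        cfg.async.dwSchedulerRateLimitNS = 500'000;
        break;
    case EAppKind::eGui:
        // GUI: keep the high default to avoid high idle CPU usage.
        break;
    }
    return cfg;
}
```

// Usage: MakeConfigFor(EAppKind::eInteractive).async.dwSchedulerRateLimitNS yields 500'000,
// the low end of the interactive range given above.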