Aurora Runtime: cross-platform platform-abstraction library - the 100kloc of /base/*.cx nobody wants to write.

Alpha (since v0.3.80)

AuroraRuntime

The Aurora Runtime is a low level platform abstraction layer for modern cross-platform C++ development targeting powerful embedded and PC systems.

Features

  • Reduced C++ standard template library dependence despite requiring a modern-ish driver (^1) (portable when?)
  • Logging; UTF-8 logger, common sink backends, formatting interface
  • Debug and Telemetry; asserts, panics, exception logging, demangling of symbols, more
  • Crypto ECC/[25519, P-384, P-256], [AES, RSA, X509], CBC[AES, Stinky3DES], HMAC, HashCash, BCrypt, [common digests]
  • Basic cmdline parsing from any module
  • Exit and fatal save condition callbacks
  • Random; secure and fast user-seeded backends
  • Hardware Info; memory and cpu info (including cpu feature bits, core topology, e-core awareness, and basic cache size)
  • Software stack information for retrieving kernel, version, brand, family, build string, etc
  • Compression (deflate, gzip, zstd, LZ4, bzip2, lzma, brotli)
  • Locale and encoding
  • High performance threading and synchronization primitives (os userland sched optimized)
  • Alternative WakeOnAddress implementation for non-aurt/user synchronization primitives when kernel support is missing (polyfill when faced with old and/or locked down private APIs)
  • Async subsystem backed by high performance sync primitives (cv loop) and hybrid switching into IO polling (think userland cv-backed promises + waitmultipleobjects)
  • IO subsystem for standard cross-platform IO loop queues, IPC (mutex with auto-unlock, semaphores, full-duplex single-connection pipes, and shared memory), file (direct uncached access), and network
  • Abstract kernel IO transactions, IPC objects, timers, semaphores, and others in the form of ILoopSources
  • Common IO transaction interface for network, file, and handle async access (with workarounds for platform quirks)
  • IO processor for common network, pipe processing, and general work on any given thread (think of it as an io context)
  • IO pipe processor for the processing of data when invoked by io transactions and other signalable interfaces
  • Protocol stack concept, for implementing low-overhead IO stream processors, where data is streamed through layers of interceptors
  • Builtin support for TLS and compression in the form of protocol stack interceptors
  • Non-locking file system watchers with IO subsystem interoperability
  • Process spawning with stream redirection backed by the IO subsystem
  • Process memory management with IPC and file mapping
  • FIO settings registry
  • C++ utility templates and macros
  • Follows all strings are UTF-8 convention

^1 bring your own types auROXTL

API:

API Docs:
Tests and Examples: Hello Aurora
Build Pipeline: Aurora Build (Lua/Premake)
Donate / Other Links: Reece.SX
Discord: Invite

Support

Platform Support
NT/XP 🕖
NT/Server 2003 ⚠️⚠️
NT/Vista ⚠️⚠️
NT/Server 2008 ⚠️
NT/Win7 ⚠️
NT/Win8.1+ ⚠️
NT/Win10 RS4+
NT/Win11
NT/UWP 🕖
NT/GameOS
Linux
Linux/Android 🕖
OpenBSD
FreeBSD 9
FreeBSD 11 🕖
OpenBSD
XNU/NS-like

Win7/8: memory management (AuProcess) is limited.
Applications that don't need ::mmap-like functionality with pre-reserved address allocations can set their minimum requirement to the Vista era of NT.
See: Windows XP - 7 defects
Earlier NT revisions could be supported; however, enough incompatibilities have crept into various subsystems and libraries that XP does not currently work. In addition, anything older than a fully patched Windows 7 system is untested.
It should be noted most of these Windows XP regressions are more the fault of the precompiled CRT, and third party compression libraries being lazy, not the fault of the Runtime. With a bit of hacking, any reasonable version and subset of Windows should be in reach.

C Runtime Support
MSVC-Static
MSVC-Dynamic
GLIBC
Musl ⚠️
Cygwin
Bionic 🕖
FreeBSD
OpenBSD
MSVC-stl-Dynamic
libc++stl-Dynamic
libstdc++stl-Dynamic
MSVC-stl-Static
libc++stl-Static
libstdc++stl-Static

Musl and alternative CRTs stated as supported may require private build scripts and/or unique build roots.

Performance

Performance of each system should ideally be that of the best implementation on the platform. Due to heavyweight requirements, a handful of known good industry standard libraries have been brought in to achieve compression, crypto, alloc, and formatting objectives. Footprint is expected to be on the heavier side for optimal performance, usability, and flexibility.

Defer to benchmarks

Utilities

Aurora Sugar: Main Header, (*.)Utils.hpp
Aurora Macro Sugar: Main Header
Aurora Overloadable Type Declarations: Main Header

Logging

Logging is implemented through two subsystems, console and logging. Console provides IO abstraction to the logger subsystem sinks. Sinks are user-implementable interfaces that can be either synchronous or asynchronous. A logger is an internal object that takes log messages, applies string sanitization, checks the current filter state map, makes a copy of the message, and then sinks the message object to the relevant subscriber[s].
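
The message path described above (sanitize, filter, copy, dispatch to sinks) can be sketched in standard C++. This is only an illustration of the flow, not the AuLog API: `MiniLogger`, `ISink`, `MemorySink`, `Message`, and `Level` are hypothetical names, and the sanitization rule (control characters become spaces) is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names for illustration; this is not the AuLog interface.
enum class Level : std::uint8_t { Debug, Info, Warn, Error };

struct Message
{
    Level level;
    std::string text;
};

struct ISink
{
    virtual ~ISink() = default;
    virtual void OnMessage(const Message &msg) = 0;
};

// A synchronous sink that stores every message it receives.
struct MemorySink : ISink
{
    std::vector<Message> received;
    void OnMessage(const Message &msg) override { received.push_back(msg); }
};

class MiniLogger
{
public:
    void AddSink(std::shared_ptr<ISink> sink)
    {
        assert(sink != nullptr);
        sinks_.push_back(std::move(sink));
    }

    void SetMinimumLevel(Level level) { minimum_ = level; }

    void Write(Level level, const std::string &text)
    {
        std::string clean = Sanitize(text);       // 1. sanitize
        if (level < minimum_) return;             // 2. filter
        Message copy { level, std::move(clean) }; // 3. private copy
        for (auto &sink : sinks_)                 // 4. dispatch
            sink->OnMessage(copy);
    }

private:
    // Assumed rule: replace control characters (other than tab) with a space.
    static std::string Sanitize(std::string text)
    {
        for (char &c : text)
            if (static_cast<unsigned char>(c) < 0x20 && c != '\t') c = ' ';
        return text;
    }

    std::vector<std::shared_ptr<ISink>> sinks_;
    Level minimum_ { Level::Debug };
};
```

A caller would register one or more sinks with `AddSink` and then call `Write`; an asynchronous sink would queue the copied `Message` instead of handling it inline.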

Flushing occurs at a fixed rate on a low-priority background thread without any configuration requirement. The resources spent on this background thread are shared with the telemetry and debug subsystems to reduce the overall thread count of the runtime. Flushes also occur during panic events and other relevant problematic points to help mitigate the loss of crash telemetry.

Asynchronous logger sinks may double buffer log lines between the asynchronous callback and OnFlush callback, where the latter call is guaranteed after one or more delegated dispatches. The Windows Event Log backend takes advantage of this to group multi-line messages by approximate time and log level.

Additionally, in the console subsystem, consoles that provide an input stream can be used in conjunction with the parse subsystem to provide basic command-based deserialization, tokenization, and dispatch of UTF-8 translated strings regardless of the system locale. Command processing and dispatch comes under the namespace of AuConsole::Commands, not AuLog.

Exceptions

ICanHasStackTraces

Through the use of compiler internal overloads, ELF hooking, and Win32 AddVectoredExceptionHandler, Aurora Runtime hooks exceptions at the time of throw, including some out of ecosystem exceptions, providing detailed telemetry of the object type, object string, and backtrace. In addition, the AuDebug namespace provides TLS based last-error and last-backtrace methods.

EXCEPTIONS ARE NOT CONTROL FLOW...

  • Aurora Runtime WILL attempt to mitigate exceptions in internal logic
  • Aurora Runtime WILL NOT abuse exceptions to communicate failure
  • Aurora Runtime WILL try to decouple internal exceptions from the API
  • Aurora Runtime WILL NOT use anything that automatically crashes on exception catch (no-noexcept)
  • Aurora Runtime WILL provide extended exception information to telemetry backends and through the AuDebug namespace
  • Aurora Runtime WILL NOT make any guarantees of being globally-noexcept; however, it should be a safe assumption in non-critical environments

SysPanic can be used to format a std::terminate-like exit condition, complete with telemetry data and safe cleanup.

Thread Primitives

The Aurora Runtime provides platform optimized threading primitives inheriting from a featureful IWaitable interface. Each method is guaranteed. Each primitive is implemented in userland with lock-less atomic fastpaths on every single platform from Windows XP to Windows 11, to Linux, with nanosecond resolution absolute and relative yielding.

struct IWaitable
{
    void Lock();
    bool LockMS(AuUInt64 qwRelTimeoutInMs /* = 0, infinity - use TryLock to avoid sleeping */);
    bool LockNS(AuUInt64 qwRelTimeoutInNs /* = 0, infinity - use TryLock to avoid sleeping */);
    bool LockAbsMS(AuUInt64 qwAbsTimeoutInMs /* = 0, infinity */);
    bool LockAbsNS(AuUInt64 qwAbsTimeoutInNs /* = 0, infinity */);
    bool TryLock();
    void Unlock();
};
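
To make the timeout convention concrete, here is a minimal sketch of a type with the same method shape built on `std::timed_mutex`. It is not how the Runtime implements its primitives (those use their own userland fast paths); `TimedMutexWaitable` and `ContendedLockTimesOut` are hypothetical names, and the absolute-timeout methods are omitted because the reference clock is not stated here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical adapter for illustration; not an Aurora Runtime type.
class TimedMutexWaitable
{
public:
    void Lock() { mutex_.lock(); }

    // A relative timeout of 0 means "wait indefinitely", mirroring the comments above.
    bool LockMS(std::uint64_t relTimeoutMs)
    {
        if (relTimeoutMs == 0) { mutex_.lock(); return true; }
        return mutex_.try_lock_for(std::chrono::milliseconds(relTimeoutMs));
    }

    bool LockNS(std::uint64_t relTimeoutNs)
    {
        if (relTimeoutNs == 0) { mutex_.lock(); return true; }
        return mutex_.try_lock_for(std::chrono::nanoseconds(relTimeoutNs));
    }

    bool TryLock() { return mutex_.try_lock(); }
    void Unlock() { mutex_.unlock(); }

private:
    std::timed_mutex mutex_;
};

// Demo: a second thread gives up after 5 ms because this thread holds the lock.
inline bool ContendedLockTimesOut()
{
    TimedMutexWaitable mutex;
    mutex.Lock();
    bool acquired = true;
    std::thread other([&] { acquired = mutex.LockMS(5); });
    other.join();
    mutex.Unlock();
    return !acquired;
}
```

The point of the sketch is the calling convention: `TryLock` never sleeps, a non-zero timeout bounds the sleep, and zero asks for an unbounded wait.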

Included high performance primitives

  • arbitrary condition variable ^1
  • condition mutex
  • condition variable
  • critical section ^2
  • event
  • mutex
  • semaphore
  • rwlock ^3
  • spinlocks

In addition, there are user-space (no syscall) monitor primitives for:

  • timeline semaphores ^4
  • wait barriers
  • mutexes
  • condition variables ^1
  • semaphores
  • trigger all when state is zero

See: AuThreading::Futexes for reference implementations

^1 Accepts any IWaitable as the mutex
^2 Reentrant Mutex
^3 Includes extended read to write upgrades and permits write-entrant read-routines to prevent writer deadlocks. Two variants are included to provide further granularity over reentry behavior.
^4 Wait until counter reaches X instead of waiting until counter is non-zero. Think: D3D fences, Vulkan timeline semaphores, etc.
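
The timeline semaphore semantics in note ^4 can be sketched as follows. This version uses a standard mutex and condition variable, so it is not the syscall-free user-space implementation the Runtime describes; `TimelineSemaphore` and `RunTimelineDemo` are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical sketch; not the AuThreading implementation.
class TimelineSemaphore
{
public:
    // Advance the counter and wake every waiter so each can re-check its target.
    void Signal(std::uint64_t increment = 1)
    {
        assert(increment > 0);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            counter_ += increment;
        }
        cv_.notify_all();
    }

    // Block until the counter has reached at least `target`,
    // rather than until it is merely non-zero.
    void WaitUntil(std::uint64_t target)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return counter_ >= target; });
    }

    std::uint64_t Value() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return counter_;
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::uint64_t counter_ { 0 };
};

// Demo: a worker signals three times; the caller waits for the third signal.
inline std::uint64_t RunTimelineDemo()
{
    TimelineSemaphore timeline;
    std::thread worker([&] { for (int i = 0; i < 3; ++i) timeline.Signal(); });
    timeline.WaitUntil(3);
    worker.join();
    return timeline.Value();
}
```

Unlike a counting semaphore, waiting does not consume the counter, so any number of waiters can observe the same point on the timeline.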

Futexes

Every platform supported by the Runtime supports the in-process waitlist implementation for hidden condition variables, aka the bald-faced lie of "futexes" in the Linux world - theirs are neither fast nor in-userspace, unlike other libc/vdso/similar features (eg: steady clocks, rand, cpuid, etc). In addition to the expected interface of Wait(pAddress, pCompare) and Signal(pAddress), there are special methods for handling numeric comparisons and bitmasks, and there is support for up to 32-byte equal/not-equal comparisons of unaligned volatile addresses. WaitForMultipleAddressesAnd and WaitForMultipleAddressesOr are provided as a competing wait-multiple threading API to AuLoop::WaitMultipleLoopSources2. In addition, WaitForMultipleAddressesOrWithIO exists as a slower mechanism to interop HANDLE/FD based IO with an otherwise faster NT:KeyedEvent,Linux:futex,POSIX:condvar-based backend.
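
A rough idea of what an in-process waitlist looks like when no kernel support is available: addresses hash into buckets, and each bucket hides a mutex and condition variable. This is a sketch under assumptions, not the Runtime's implementation or signatures; the `waitlist` namespace, its functions, the 64-bucket table, and the restriction to a 32-bit atomic word are all choices made for the example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical polyfill sketch; not the AuThreading API.
namespace waitlist
{
    struct Bucket
    {
        std::mutex mutex;
        std::condition_variable cv;
    };

    // Unrelated addresses may share a bucket, so waiters always
    // re-check their own comparison after waking.
    inline Bucket &BucketFor(const void *address)
    {
        static std::array<Bucket, 64> buckets;
        return buckets[std::hash<const void *>{}(address) % buckets.size()];
    }

    // Sleep while `word` still holds `expected`.
    inline void Wait(const std::atomic<std::uint32_t> &word, std::uint32_t expected)
    {
        Bucket &bucket = BucketFor(&word);
        std::unique_lock<std::mutex> lock(bucket.mutex);
        bucket.cv.wait(lock, [&] { return word.load() != expected; });
    }

    // Call after changing `word`. Taking the bucket lock first orders this
    // call against a waiter that has checked the value but not yet slept.
    inline void Signal(const std::atomic<std::uint32_t> &word)
    {
        Bucket &bucket = BucketFor(&word);
        { std::lock_guard<std::mutex> lock(bucket.mutex); }
        bucket.cv.notify_all();
    }
}

// Demo: the caller sleeps on `state` until the producer changes it.
inline std::uint32_t RunWaitDemo()
{
    std::atomic<std::uint32_t> state { 0 };
    std::thread producer([&] { state.store(7); waitlist::Signal(state); });
    waitlist::Wait(state, 0);
    producer.join();
    return state.load();
}
```

Because the wait is keyed on an address rather than on an object, any word in the process can act as a synchronization point without storing a handle next to it.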

Fixing problems in other scheduler apis

Problem one (1):
Most STL implementations have generally awful to unnecessarily inefficient abstractions. Defer to libc++'s abuse of spin while (cond) yield loops and msvc/stl's painfully slow std::mutex and semaphore primitives.

Problem Two (2):
Moving to or from linux, macos, bsd, and win32 under various kernels, there is no one standard (even in posix land) for the key thread primitives.

Bonus point NT (3):
The userland CriticalSection/CV set of APIs suck, lacking timeouts and try lock

Bonus point UNIX (4):
No wait multiple mechanism

Bonus point ALL (5):
It's not possible to mix kernel IO primitives (NT: CreateSemaphore, Linux: eventfd, Unix: pipes, etc) with user-space monitors (NT: CRITICAL_SECTION, SRWLOCK, WaitOnAddress, Linux: * NONE *, Unix: * NONE *)

1, 2, 3: Use the high performance AuThreadPrimitives objects

4: Consider using loop sources, perhaps with the async subsystem, in your async application. Performance of loop sources will vary wildly between platforms, always being generally worse than the high performance primitives. They should be used to observe kernel-level signalable resources.

4 ex: Windows developers can use loop sources as a replacement for WaitForMultipleObjects, with more overhead

4 ex2: If you really don't care about efficiency or you're performing an AND operation, you can use AuThreading::[WaitForShared/WaitFor] to yield on an array of generic waitables.

4 ex3: If you actually care about efficiency, as of dec/2024, there is an API for waiting on multiple in-process addresses, with no spurious wakes, and with efficient AND support. If you plan to use WaitForMultipleAddressesOr or WaitForMultipleAddressesAnd, it is recommended that you attempt to implement your own primitives; however, reference primitives exist under AuThreading::Futexes.

5 and 4 ex4: It is possible to mix user-space waitlists / futexes with IO primitives using WaitForMultipleAddressesOrWithIO. It is possible to wait on multiple generic Handle/FDs alongside your own custom user-space primitives.

IO

The Aurora Runtime implements loop, file io, network io, and other sub-subsystems with various adapters and connectors.

An important note about text encoding: stdin, file encoding, text decoders, and other IO resources work with codepage UTF-8 as the internal encoding scheme. String overloads and dedicated string APIs in the IO subsystem will always write BOM prefixed UTF-8, and attempt to read a BOM to translate any other arbitrary user generated text input to UTF-8.

Loop

The Aurora Runtime implements a kernel-scheduler optimized IO loop subsystem for managing GUIs; network, file, ipc AIO; and thread synchronization objects for when these waitables converse.

ILoopSource is an interface defined by the loop subsystem for IO objects with a signalable state. Attached to an ILoopQueue, the ILoopQueue will provide wait-any/wait-all/is-signaled polling with optional subscription functionality. Furthermore, ILoopQueues are thread-safe allowing for cross-thread or mid-wait work scheduling (as in, the addition of new subscribers during sleep or callback).

It is possible to run loop queues like a poll object or with an arbitrary amount of optional subscribers per loop source.

Subscription notifications allow for optimized loop source removal or no-action/non-removal replies from the subscription implementer. If you and all other subscribers want to evict the ILoopSource, the source will be automatically removed from the ILoopQueue. If just a single callback votes to evict the loop source, you will no longer receive updates for the object, but the ILoopSource and other subscribers will remain.
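
The eviction voting rule can be modelled in a few lines. This is a toy model rather than the ILoopQueue interface: `MiniLoopQueue`, `Source`, and `Subscriber` are hypothetical names, and the sketch assumes a source is removed once its last remaining subscriber has voted to evict. Sources added without subscribers are kept, matching the poll-style usage mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Each subscriber returns true to vote for eviction of the source.
using Subscriber = std::function<bool(int sourceId)>;

// Hypothetical model of the voting rule; not the ILoopQueue interface.
struct Source
{
    int id;
    std::vector<Subscriber> subscribers;
};

class MiniLoopQueue
{
public:
    void Add(Source source) { sources_.push_back(std::move(source)); }
    std::size_t SourceCount() const { return sources_.size(); }

    // Deliver one signal for the source identified by `id`.
    void Dispatch(int id)
    {
        for (auto it = sources_.begin(); it != sources_.end();)
        {
            // Sources without subscribers are only polled, never evicted here.
            if (it->id != id || it->subscribers.empty()) { ++it; continue; }

            // A subscriber that votes to evict stops receiving updates.
            std::vector<Subscriber> kept;
            for (auto &fn : it->subscribers)
                if (!fn(id)) kept.push_back(std::move(fn));
            it->subscribers = std::move(kept);

            // Once every subscriber has voted to evict, drop the source.
            it = it->subscribers.empty() ? sources_.erase(it) : it + 1;
        }
    }

private:
    std::vector<Source> sources_;
};
```

With two subscribers where only one votes to evict, the second keeps receiving signals and the source stays queued; when the only subscriber votes to evict, the source disappears from the queue.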

IPC

Included in the IPC subsystem are pipes, as used by AuProcesses; events; mutexes; semaphores; and shared memory views. IPC objects are exported by an internally generated non-standard string which contains the platform-specific information needed to import the object in a compatible application. Aurora IPC is not limited to processes bound by a common worker; instead, UNIX sockets and procfs are used to implement IPC within the application's namespace/sandbox.

FIO

A simple blocking file stream is provided by an open function given an Aurora path string and a file advisory lock level. This object can be used with AuProcess to map regions of the file into the address map. However, everything about this object is blocking.

An alternative asynchronous IAsyncFileStream interface is available which supplies IO transaction objects for scheduling direct disk reads. One should be careful to note each platform has file AIO quirks. For instance...

  • Linux will block on most file systems if metadata has to be poked
  • FS support is limited (NT/NTFS > Linux/XFS > NT/xxxx (w/ caching) > Linux/EXT4 > unsupported)
  • Reads/Writes must be made with respect to sector alignment
  • Removing caching on Linux will mitigate blocking behaviour (~O_DIRECT blocking io_submit)
  • ...but caching on Win32 is sometimes desirable
  • Linux reads might be limited by max_sectors_kb or max_segments

For large block reads:
bDirectIO = true; read directly from fs (recommended)
offsets -> must align

When data is small enough for file caches to be useful:
bDirectIO = !AuBuild::kIsNtDerived (recommended)
offsets -> align for the highest denominator
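
The alignment requirement for direct IO amounts to widening a requested byte range to sector boundaries. A minimal sketch, assuming the sector size is a power of two that the caller has already queried from the platform; `AlignedRange` and `AlignForDirectIO` are hypothetical helpers, not Runtime APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types for illustration.
struct AlignedRange
{
    std::uint64_t offset;  // aligned start, <= requested offset
    std::uint64_t length;  // aligned length covering the requested range
};

// `sectorSize` must be a non-zero power of two (for example 512 or 4096).
inline AlignedRange AlignForDirectIO(std::uint64_t offset,
                                     std::uint64_t length,
                                     std::uint64_t sectorSize)
{
    assert(sectorSize != 0 && (sectorSize & (sectorSize - 1)) == 0);
    const std::uint64_t mask  = sectorSize - 1;
    const std::uint64_t start = offset & ~mask;                    // round down
    const std::uint64_t end   = (offset + length + mask) & ~mask;  // round up
    return { start, end - start };
}
```

The caller reads the widened range into a suitably aligned buffer and then slices out the bytes it actually asked for, starting at `offset - range.offset`.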

Additional utility functions exist outside of the two file interfaces for: stat, directory iteration, UTF-8 string reading and writing, blocking binary read/writes, and more.

Paths

We assume all paths are messy. Incorrect splitters, double splitters, relative paths, and keywords are resolved internally. No URL or path builder, data structure to hold a tokenized URI expression, or similar concept exists in the codebase. All string paths are simply expanded, similar to MSCRT's _fullpath or UNIX's realpath, at time of usage.

Expression         Meaning
Path[0] == '.'     Current Working Directory
Path[0] == '^'     Executable module's Directory
Path[0] == '~'     User Profile Storage + SDK brand
Path[0] == '!'     All User Shared Storage + SDK brand
..                 Go up a directory
/                  Agnostic Directory Splitter
\                  Agnostic Directory Splitter
\\                 Escaped POSIX FS \ character.
. [SPLITTER]       Nothing
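
The expansion rules in the table can be approximated with a small string routine. This is a sketch, not the Runtime's resolver: `PathRoots` and `ExpandPath` are hypothetical, the root directories are supplied by the caller instead of being queried from the OS, output always uses `/`, and the escaped `\\` rule is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical roots; the real directories come from the platform.
struct PathRoots
{
    std::string cwd;      // '.'
    std::string module;   // '^'
    std::string profile;  // '~'
    std::string shared;   // '!'
};

// Expand a leading keyword, unify splitters, and resolve "." and ".." tokens.
inline std::string ExpandPath(const std::string &input, const PathRoots &roots)
{
    std::string path = input;
    if (!path.empty())
    {
        const char key = path[0];
        if (key == '.')      path = roots.cwd + "/" + path;  // dots are resolved below
        else if (key == '^') path = roots.module + "/" + path.substr(1);
        else if (key == '~') path = roots.profile + "/" + path.substr(1);
        else if (key == '!') path = roots.shared + "/" + path.substr(1);
    }

    std::vector<std::string> parts;
    std::string token;
    auto flush = [&] {
        if (token == "..") { if (!parts.empty()) parts.pop_back(); }
        else if (!token.empty() && token != ".") parts.push_back(token);
        token.clear();  // empty tokens (double splitters) are dropped
    };
    for (char c : path)
    {
        if (c == '/' || c == '\\') flush();  // both splitters are accepted
        else token += c;
    }
    flush();

    const bool rooted = !path.empty() && (path[0] == '/' || path[0] == '\\');
    std::string out = rooted ? "/" : "";
    for (std::size_t i = 0; i < parts.size(); ++i)
    {
        if (i != 0) out += '/';
        out += parts[i];
    }
    return out;
}
```

Because expansion happens on the string at time of use, there is no intermediate path object to keep in sync; every API can accept a messy string and normalize it on entry.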

TLS

TLS client and partial server support is provided by protocol stack interceptors, meaning that our implementation is no-socket. It's possible to write into a buffered protocol stack using the provided stream writer, simulating data coming through a socket channel. Once the protocol stack has been ticked, it's possible to fetch the response/translated message either by supplying an end protocol piece to receive the data, or by using the provided stream reader to read the end interceptor's buffer.
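
The write, tick, read cycle of a socket-free protocol stack can be illustrated with a toy model. None of the names below are Runtime APIs: `MiniProtocolStack` and its methods are hypothetical, and each interceptor is reduced to a function from input bytes to output bytes, where a real TLS or compression layer would carry state between ticks.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model; not the Aurora protocol stack interface.
class MiniProtocolStack
{
public:
    using Interceptor = std::function<std::string(const std::string &)>;

    void AppendInterceptor(Interceptor layer)
    {
        assert(layer);
        layers_.push_back(std::move(layer));
    }

    // Stands in for bytes arriving from a socket channel.
    void Write(const std::string &bytes) { pending_ += bytes; }

    // Push all pending input through every layer, in order, into the end buffer.
    void Tick()
    {
        std::string data = std::move(pending_);
        pending_.clear();
        for (auto &layer : layers_) data = layer(data);
        output_ += data;
    }

    // Drain whatever the final layer has produced so far.
    std::string Read()
    {
        std::string out = std::move(output_);
        output_.clear();
        return out;
    }

private:
    std::vector<Interceptor> layers_;
    std::string pending_;
    std::string output_;
};
```

Nothing in the stack knows where the bytes came from, which is what allows the same layers to be driven by a socket, a pipe, or a unit test.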

Resources

The Aurora Runtime provides system, application, and user specific paths under the Aurora::IO::FS subsystem. These include the user's home directory, a per vendor sandboxed application user directory, a per vendor sandboxed application all users directory, the user-installable program directory, the user's real home directory, and other such relevant paths.

Networking

Character IO

Processes

The Aurora Runtime provides child process monitoring, asynchronous child stdin/out/err transactions, child synchronization (via a primitive threading event and an io event), process spawning, file opening, and url opening functionality.

Locale

Encoding and decoding of UTF-8, UTF-16, UTF-32, GBK, GB-2312, and SJIS is supported through OS provided decoders. System localization information, including system codepage, country, and system language, is provided by the available environment variables, OS specific interfaces, or the overload mechanism.

Memory

Allocator

Aurora Runtime (would like to) provide its own global allocator to best manage resources under one best-fit allocator. If the runtime is built with AuroraAlloc.cpp, every module that touches the Runtime should be built alongside AuroraAlloc.cpp and its link-linkage allocator operators. The AuList, AuHashMap<K,V>, AuBST<K,V>, AuString, AuSPtr, and almost-all AuUPtr types are hardened against such global allocator conflicts.

Memory Heap

Aurora provides a heap allocator for dividing up a large preallocated region of memory.

Currently, we use a modified version of O(1) heap that provides a constant worst case allocation and deallocation of any given request.
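
To show what constant-time allocation from a preallocated region means, here is a deliberately simpler technique than a general-purpose O(1) heap: a fixed-size block pool with an intrusive free list. It is not the allocator the Runtime uses, and `FixedBlockPool` is a hypothetical name; it only demonstrates that allocation and release can be a single pointer swap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical fixed-size pool; simpler than a general-purpose heap.
class FixedBlockPool
{
public:
    // Carves `region` into blocks of `blockSize` bytes; the region is borrowed, not owned.
    FixedBlockPool(void *region, std::size_t regionSize, std::size_t blockSize)
    {
        assert(blockSize >= sizeof(Node) && blockSize % alignof(Node) == 0);
        assert(reinterpret_cast<std::uintptr_t>(region) % alignof(Node) == 0);
        auto *base = static_cast<unsigned char *>(region);
        for (std::size_t used = 0; used + blockSize <= regionSize; used += blockSize)
            Release(base + used);
    }

    // O(1): pop the head of the free list, or nullptr when exhausted.
    void *Allocate()
    {
        if (free_ == nullptr) return nullptr;
        Node *node = free_;
        free_ = node->next;
        return node;
    }

    // O(1): push the block back onto the free list.
    void Release(void *block)
    {
        free_ = new (block) Node { free_ };
    }

private:
    struct Node { Node *next; };
    Node *free_ { nullptr };
};
```

A general heap has to serve arbitrary sizes, which is where the bookkeeping of a real O(1) heap comes in; the fixed-size case above is the smallest version of the same idea.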

Shared Pointers

Heap objects, including shared pointers, and the object allocation model are defined by AuROXTL. The AuSPtr class template is backed by the standard std::shared_ptr, extended by #include <auROXTL/auMemoryModel.hpp>, in the default configuration.

AuSPtrs allow for stage/debug build null checking to prevent hard crashing during debugging and QA.

Temp AuSPtrs can be crafted with AuUnsafeRaiiToShared from unique pointers and raw pointers so that C-like applications can access a subset of the API.
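
Since AuSPtr is backed by std::shared_ptr in the default configuration, the idea of a temporary, non-owning shared pointer can be shown with the standard library alone. `BorrowAsShared` below is a hypothetical stand-in, not the auROXTL helper, and its exact behaviour is an assumption based on the description above.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the idea; the real helper lives in auROXTL.
template <typename T>
std::shared_ptr<T> BorrowAsShared(T *raw)
{
    // The no-op deleter means the shared_ptr never frees `raw`. The caller
    // must keep the object alive for as long as any copy of the result exists.
    return std::shared_ptr<T>(raw, [](T *) {});
}

// Same idea for an object already owned by a unique pointer.
template <typename T>
std::shared_ptr<T> BorrowAsShared(const std::unique_ptr<T> &owner)
{
    return BorrowAsShared(owner.get());
}
```

This is what makes the "unsafe" in the name literal: the reference count is real, but it protects nothing, so the borrowed pointer must not outlive the stack frame or owner it came from.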

Defer to auROXTL for more information

Debug

Asserts

[TODO]

Example:

Debug, Release, and Ship (all) assertions:

SysAssert(AuFunction{}, "unexpected default function")

Debug and Release (debug and optimized ship-with-debug) assertions:

SysAssertDbg(AuFunction{}, "unexpected default function")

Error stack

TEST(ErrorStack, A)
{
    AuErrorStack errors;
    
    SysPushErrorIO("Something something IO error");
    SysPushErrorIO("Something something IO error 1");
    SysPushErrorIO("Something something IO error 2 {}", "hello worlds");
    
    ASSERT_TRUE(errors.HasCaptured());
    ASSERT_TRUE(errors.HasMultipleOccurred());
    ASSERT_TRUE(errors.FirstMessage()->pNextThreadMesage);
    ASSERT_TRUE(errors.FirstMessage()->pNextThreadMesage->pNextThreadMesage);
    
    AuLogDbg("{}, {}, {}",
             *errors.ToString(),
             errors.FirstMessage()->pNextThreadMesage->ToString(),
             errors.FirstMessage()->pNextThreadMesage->pNextThreadMesage->ToString());
}

TEST(ErrorStack, B)
{
    AuErrorStack errors;
    try
    {
        AU_THROW_FORMATTED("hello people {}", 23423423);
        //throw "hello modern platforms";
    }
    catch (...)
    {
    }
    ASSERT_TRUE(errors.HasCaptured());
    AuLogDbg("{}", *errors.ToString());
}

More

Binding

Aurora Runtime provides C++ APIs; however, it should be noted that two libraries are used to extend interfaces and enums to help with porting and internal utility access. One, AuroraEnums, wraps basic enumerations and provides value vectors; value strings; look up; iteration; and more. The other, AuroraInterfaces, provides TWO class types for each virtual interface. Each interface can be backed by either a C++ class method overriding a superclass's virtual ...(...) = 0; method, or an AuFunctional-based structure.
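
The two-backings idea can be sketched with std::function standing in for AuFunctional. The types below are hypothetical and hand-written; the classes generated by AuroraInterfaces may be named and shaped differently.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface; not a generated AuroraInterfaces type.
struct IGreeter
{
    virtual ~IGreeter() = default;
    virtual std::string Greet(const std::string &name) = 0;
};

// Backing 1: a subclass overrides the pure virtual method.
struct PoliteGreeter : IGreeter
{
    std::string Greet(const std::string &name) override
    {
        return "Good day, " + name;
    }
};

// Backing 2: a functional structure forwards the call to a stored callable,
// so bindings and lambdas can implement the interface without subclassing.
struct GreeterFunctional : IGreeter
{
    std::function<std::string(const std::string &)> onGreet;

    explicit GreeterFunctional(std::function<std::string(const std::string &)> fn)
        : onGreet(std::move(fn))
    {
        assert(onGreet);
    }

    std::string Greet(const std::string &name) override { return onGreet(name); }
};
```

Consumers only ever see `IGreeter`, so an API that accepts a shared pointer to the interface works the same way with either backing.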

It should be noted that most language bindings and generator libraries (^swig, v8pp, nbind, luabind) work with shared pointers. Other user code may wish to stuff pointers into a machineword-sized space, whether it's a C library, an FFI, or a size constraint. One handle or abstraction layer will be required to integrate the C++ API into the destination platform, and assuming we have a C++ language frontend parsing our API, we can use AuSPtr for all caller-to-method constant reference scenarios. Furthermore, AuSPtrs can be created, without a deleter, using AuUnsafeRaiiToShared(unique/raw pointer). To solve the raw pointer issue, AuSPtrs are created in the public headers with the help of exported/default visibility interface create and destroy functions. These APIs provide raw pointers to public C++ interfaces, and as such, can be bound using virtually any shim generator. Method and API mapping will likely involve manual work from the library developer to reimplement AU concepts under their language runtime instead of using the C++ platform, or at least require manual effort to shim or map each runtime prototype into something more sane across the language barrier.

Memory is generally viewed through a std::span like concept called MemoryViews. MemoryViewRead and MemoryViewWrite provide windows into a defined address range with the possibility of shared ownership. MemoryViewStreamRead and MemoryViewStreamWrite expand upon this concept by accepting an additional offset (AuUInt &: reference) that is used by internal APIs to indicate how many bytes were written or read from a given input region. Such requirement came about from so many APIs, networking, compression, encoding, doing the exact same thing in different not-so-portable ways. Unifying memory access to 4 class types with shared control block storage should aid with FFI prototyping.
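
A stripped-down model of the view and stream-view idea: a window is a pointer plus a length, and a stream view adds a caller-owned counter that the callee advances. These structs are hypothetical miniatures without the shared-ownership control block; they are not the auROXTL MemoryView types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical miniature of the view concept.
struct ReadView  { const void *ptr; std::size_t length; };
struct WriteView { void *ptr;       std::size_t length; };

// A stream view pairs a window with a caller-owned counter that the callee
// advances to report how many bytes it consumed or produced.
struct StreamRead  { ReadView view;  std::size_t &consumed; };
struct StreamWrite { WriteView view; std::size_t &produced; };

// Copy as much as fits and report progress through both counters.
// Returns true when the whole input window was consumed.
inline bool CopyBytes(StreamRead source, StreamWrite destination)
{
    const std::size_t count = std::min(source.view.length, destination.view.length);
    if (count != 0)
        std::memcpy(destination.view.ptr, source.view.ptr, count);
    source.consumed += count;
    destination.produced += count;
    return count == source.view.length;
}
```

Any API shaped like `CopyBytes` (a codec, a socket read, a decompressor) can then report partial progress the same way, which is the portability argument made above.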

Unrelated note: structure interfacing with questionable C++ ABI reimplementations is somewhat sketchy in FFI projects (^ CppSharp) and can lead to some memory leaks.

Strings

The auROXTL header-only library defines an AuString type as a std::string; however, it should be assumed this type represents a binary blob of UTF-8 (or perhaps malformed to include NULs). Further locale processing is delegated to Aurora::Locale[::Encoding]

Dependencies

Aurora

Crypto (third party)

Compression (third party)

Utility (third party)

^1 Include-only macro library
^2 Provides core utilities and stl decoupling
^3 Provides platform information, included by default by the Aurora build pipeline
^4 C++ 20 saw another pathetic adoption attempt of an open source library, this one actually passed, but hardly
anyone implements std::format. Not to mention such is only a subset of the original library.
^5 Public Domain
^6 Potentially STL heavy, still potentially portable w/ a modern-ish toolchain

Philosophies

  • Assume C++17 to C++20-ish (but not quite) language support in the language driver

  • Use AuXXX type bindings for std types, allow customers to overload the std namespace
    We assume some containers and utility APIs exist, but where they come from is up to you

  • Keep the code and build chain simple such that any C++ developer could maintain their own software stack built around aurora components.

  • Dependencies and concepts should be cross-platform, cross-architecture, cross-ring friendly

    It is recommended to fork and replace any legacy OS specific code with equivalent AuroraRuntime concepts, introducing a circular dependency with the Aurora Runtime

    APIs shouldn't be designed around userland, mobile computing, or desktop computing; AuroraRuntime must provide a common backbone for all applications.

    Locale and user-info APIs will be limited due to the assumption userland is not a concept

  • Dependencies, excluding core reference algorithms (eg compression), must be rewritten and phased out over time.

  • Dependencies should not be added if most platforms provide some degree of native support
    Examples:
    -> Don't depend on a pthread shim for windows; implement the best thread
    primitives that lie on the best possible api for them
    -> Don't depend on ICU when POSIX's iconv and Win32's multibyte apis cover
    everything a conservative developer cares about; chinese, utf-16, utf-8,
    utf-32 conversion, on top of all the ancient windows codepages

  • Dependencies should only be added conservatively when it saves development time and provides production hardening
    Examples:
    -> Use embedded crypto libraries; libtomcrypt, libtommath
    ->> While there are some bugs in libtomcrypt and others, none appear to
    cryptographically cripple the library. Could you do better?
    -> Use portable libraries like mbedtls, O(1) heap, mimalloc
    ->> Writing a [D]TLS/allocator stack would take too much time
    ->> Linking against external allocators, small cross-platform utilities, and
    so on is probably fine
    -> Shim libcurl instead of inventing yet another http stack