[wasm] publish TurboFan results in batches
With mprotect-based write protection of the WebAssembly code space, we
switch page protection flags each time (at least) one compilation thread
needs write access. Two such switches happen when TurboFan compilation
results become available in {ExecuteCompilationUnits}: one when calling
{NativeModule::AddCompiledCode} and one more when calling
{NativeModule::PublishCode} via {SchedulePublishCompilationResults} and
{PublishCompilationResults}.

So far, each TurboFan result was published eagerly, i.e., as soon as it
became available. This has the benefit that faster code is available
immediately, and it had no large cost or downside without write
protection. With write protection, however, switching permissions is
expensive (an mprotect syscall) and requires locking
{WasmCodeAllocator::allocation_mutex_} (which causes lock contention
and, on Linux, many futex syscalls). Thus, immediately publishing each
TurboFan result when using write protection can make compilation up to
10x slower than without write protection.

In terms of syscalls, we measured (non-scientifically) with
{sudo perf stat -e 'syscalls:sys_enter*' d8 ...} on the Unity benchmark
(baseline vs. write protection):
- mprotect: 10k vs. 44k syscalls
- futex: 31k vs. 112k syscalls
- sys time: 1.6s vs. 10s
All of those are clearly too high.

The fix here is simply to batch together multiple TurboFan functions
into one publishing step when using write protection. The batching logic
already exists for Liftoff, so we can just disable eager publishing for
TurboFan when using write protection. Additionally, we publish once when
all Liftoff results are available (even if the batch is not complete),
so that time-to-execute does not regress.
R=clemensb@chromium.org
CC=jkummerow@chromium.org

Bug: v8:11663, chromium:932033
Change-Id: Ibf6f28ecf4733b40322e62761e66046dec60a125
Cq-Include-Trybots: luci.v8.try:v8_linux64_fyi_rel_ng
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2922114
Commit-Queue: Daniel Lehmann <dlehmann@google.com>
Reviewed-by: Clemens Backes <clemensb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#74829}
parent ab4986b8e1
commit 990c9386e2
@@ -1343,14 +1343,25 @@ CompilationExecutionResult ExecuteCompilationUnits(
         return yield ? kYield : kNoMoreUnits;
       }
 
-      // Before executing a TurboFan unit, ensure to publish all previous
-      // units. If we compiled Liftoff before, we need to publish them anyway
-      // to ensure fast completion of baseline compilation, if we compiled
-      // TurboFan before, we publish to reduce peak memory consumption.
-      // Also publish after finishing a certain amount of units, to avoid
-      // contention when all threads publish at the end.
-      if (unit->tier() == ExecutionTier::kTurbofan ||
-          queue->ShouldPublish(static_cast<int>(results_to_publish.size()))) {
+      // Publish after finishing a certain amount of units, to avoid contention
+      // when all threads publish at the end.
+      bool batch_full =
+          queue->ShouldPublish(static_cast<int>(results_to_publish.size()));
+
+      // Also publish each time the compilation tier changes from Liftoff to
+      // TurboFan, such that we immediately publish the baseline compilation
+      // results to start execution, and do not wait for a batch to fill up.
+      bool liftoff_finished = unit->tier() != current_tier &&
+                              unit->tier() == ExecutionTier::kTurbofan;
+
+      // Without mprotect-based write protection, publish even more often,
+      // namely every TurboFan unit individually (no batching) to reduce
+      // peak memory consumption. However, with write protection, this results
+      // in a high number of page protection switches (once for each function),
+      // incurring syscalls and lock contention, so don't do it then.
+      bool publish_turbofan_unit = !FLAG_wasm_write_protect_code_memory &&
+                                   unit->tier() == ExecutionTier::kTurbofan;
+      if (batch_full || liftoff_finished || publish_turbofan_unit) {
         std::vector<std::unique_ptr<WasmCode>> unpublished_code =
             compile_scope.native_module()->AddCompiledCode(
                 VectorOf(std::move(results_to_publish)));
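
To make the effect of the three publish conditions concrete, here is a minimal standalone sketch that counts publishing steps for a sequence of compilation units. It is not V8 code: `Tier`, `PublishPolicy`, `ShouldPublishBefore`, and `CountPublishSteps` are hypothetical names, and the behaviour of `queue->ShouldPublish(...)` is assumed to be a simple "pending results >= batch size" threshold, which the change above does not confirm.

```cpp
#include <cassert>  // only needed for the usage checks described below
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ExecutionTier; illustrative only.
enum class Tier { kLiftoff, kTurbofan };

struct PublishPolicy {
  bool write_protect;      // models FLAG_wasm_write_protect_code_memory
  std::size_t batch_size;  // ASSUMPTION: models queue->ShouldPublish(n) as n >= batch_size
};

// Mirrors the three conditions from the patch. `pending` is the number of
// compiled but not yet published results; `next_tier` is the tier of the
// unit about to be executed.
bool ShouldPublishBefore(const PublishPolicy& policy, Tier current_tier,
                         Tier next_tier, std::size_t pending) {
  bool batch_full = pending >= policy.batch_size;
  bool liftoff_finished =
      next_tier != current_tier && next_tier == Tier::kTurbofan;
  bool publish_turbofan_unit =
      !policy.write_protect && next_tier == Tier::kTurbofan;
  return batch_full || liftoff_finished || publish_turbofan_unit;
}

// Counts how many publishing steps a single thread would perform for the
// given unit sequence, including the final flush when no units remain.
std::size_t CountPublishSteps(const PublishPolicy& policy,
                              const std::vector<Tier>& units) {
  std::size_t publishes = 0;
  std::size_t pending = 0;
  Tier current_tier = Tier::kLiftoff;
  for (Tier next : units) {
    if (pending > 0 &&
        ShouldPublishBefore(policy, current_tier, next, pending)) {
      ++publishes;
      pending = 0;
    }
    current_tier = next;
    ++pending;  // "compile" the unit and queue its result
  }
  if (pending > 0) ++publishes;  // flush remaining results at the end
  return publishes;
}
```

Under these assumptions, 4 Liftoff units followed by 8 TurboFan units with a batch size of 4 take 3 publishing steps with write protection and 9 without it, since every TurboFan unit is then published individually. Each publishing step corresponds to the permission switches described in the commit message, so fewer steps means fewer mprotect calls and less contention on the allocation mutex.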