WebAssembly Gets a Performance Boost: Speculative Inlining and Deoptimization in V8

In recent years, WebAssembly has evolved from a low-level compilation target into a platform that also supports garbage-collected languages such as Java, Kotlin, and Dart through the WasmGC extension. To speed up these managed languages, the V8 team introduced two optimizations: speculative call_indirect inlining and deoptimization support. Both shipped with Chrome M137 and let the engine generate better machine code by making assumptions based on observed runtime behavior. This article answers key questions about how these optimizations work, why they matter, and what performance gains they deliver.

What are speculative optimizations and how do they apply to WebAssembly?

Speculative optimizations generate optimized machine code based on assumptions drawn from past execution. For example, if a function has only ever been called with integer arguments, the compiler can emit code specialized for integers instead of generic code that handles all types. If an assumption later proves false, the engine triggers a deoptimization: it discards the optimized code, continues execution in unoptimized code, and collects new feedback for a later recompilation. This technique has long been a cornerstone of JavaScript JIT compilers, but it was historically unnecessary for WebAssembly because WebAssembly 1.0 code is statically typed and is typically produced by ahead-of-time optimizing toolchains from languages like C or Rust. With the arrival of WasmGC, however, bytecode now includes dynamic features such as subtype polymorphism and garbage-collected objects, which makes speculation worthwhile for generating faster machine code.
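
As a rough illustration of the guard-and-fallback pattern described above, the following Python toy models a function that is specialized for integer arguments and deoptimizes when that assumption fails. It is only a sketch: the class SpeculativeAdd and its methods are hypothetical names invented for this example, and a real engine such as V8 does this at the machine-code level rather than with Python objects.

```python
class SpeculativeAdd:
    """Toy model of speculation: a guarded fast path plus a generic fallback.

    All names here are hypothetical; nothing corresponds to a V8 API.
    """

    def __init__(self):
        self.optimized = False   # is the "optimized code" currently installed?
        self.seen_types = set()  # feedback gathered by the unoptimized tier
        self.deopt_count = 0

    def _generic(self, a, b):
        # Unoptimized tier: handles every case and records type feedback.
        self.seen_types.add((type(a), type(b)))
        if isinstance(a, str) or isinstance(b, str):
            return str(a) + str(b)
        return a + b

    def tier_up(self):
        # "Compile" a specialized version only if all feedback was (int, int).
        self.optimized = self.seen_types == {(int, int)}

    def call(self, a, b):
        if self.optimized:
            # Guard: verify the assumption the fast path depends on.
            if type(a) is int and type(b) is int:
                return a + b  # specialized fast path
            # Assumption failed: drop the optimized code and continue
            # in the generic tier, which also records the new types.
            self.optimized = False
            self.deopt_count += 1
        return self._generic(a, b)
```

Calling `call(1, 2)`, then `tier_up()`, installs the fast path; a later `call("a", 1)` fails the guard, increments `deopt_count`, and is answered by the generic tier instead. Because the new feedback now contains a non-integer pair, a second `tier_up()` declines to specialize again.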

Source: v8.dev

Why did WebAssembly traditionally not need speculative optimizations?

WebAssembly 1.0 was designed as a portable, low-level binary format with explicit static types for functions, variables, and instructions. Programs compiled from C, C++, or Rust already undergo extensive ahead-of-time optimization in toolchains like Emscripten (LLVM) or Binaryen, resulting in highly efficient bytecode. There was little room for runtime speculation because the structure of the code was fully known at compile time. Additionally, WebAssembly lacked high-level abstractions such as objects, inheritance, or dynamic dispatch, which are the features that typically benefit from profile-guided optimization. Without that dynamism there was little to speculate on, and therefore no real need for deoptimization of the kind JavaScript relies on. As a result, V8 could use straightforward compilation strategies that delivered good performance without the complexity of speculative techniques.

What changes with WasmGC that makes speculative optimizations beneficial?

The WebAssembly Garbage Collection (WasmGC) proposal introduces rich, high-level types like structs, arrays, and subtyping, along with operations on them. This makes WebAssembly a viable target for managed languages that rely on dynamic behavior, such as method dispatch and type hierarchies. In such code, the exact type of an object may not be known at compile time—for example, a virtual method call could resolve to different implementations. Without speculation, V8 would have to emit slow generic dispatch code. By collecting runtime feedback about which types or functions are most frequently used, the engine can generate specialized inline code that assumes the common case, falling back to a deoptimization routine if the assumption fails. This drastically reduces overhead for object-oriented patterns and is key to making WasmGC performant.
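
To make the idea of runtime feedback more concrete, here is a small Python sketch of a per-call-site profile that counts which implementation each dynamic call actually reached and reports a target worth speculating on. Everything in it is an assumption made for illustration: the class name CallSiteFeedback, the method names, and in particular the thresholds (at least 10 calls, at least 90% going to one target) are invented and say nothing about the heuristics V8 really uses.

```python
from collections import Counter


class CallSiteFeedback:
    """Hypothetical per-call-site profile: counts the targets of dynamic calls."""

    def __init__(self):
        self.targets = Counter()

    def record(self, target_name):
        # Called by the unoptimized tier every time the call site executes.
        self.targets[target_name] += 1

    def speculation_candidate(self, min_calls=10, min_share=0.9):
        """Return the dominant target, or None if no target dominates.

        The default thresholds are illustrative assumptions only.
        """
        total = sum(self.targets.values())
        if total < min_calls:
            return None  # too little data to justify speculation
        name, count = self.targets.most_common(1)[0]
        return name if count / total >= min_share else None
```

A call site that reached a hypothetical `Circle.area` 19 times and `Square.area` once would report `Circle.area` as its candidate, while a site split evenly between two targets would report none and keep using generic dispatch.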

How do speculative call_indirect inlining and deoptimization work together?

Speculative inlining targets indirect function calls (call_indirect in WebAssembly), where the called function can vary at runtime. V8 tracks the most frequent target for each call site and speculatively inlines that function's code directly into the caller. The inlined code is protected by a guard that checks whether the actual target is the expected one. If it is, execution takes the fast path: there is no indirect call, and the compiler can optimize the caller and the inlined callee together. If a different function is invoked, the assumption is violated and deoptimization kicks in: the engine discards the optimized code, reverts to unoptimized (but correct) execution, and starts collecting fresh feedback. Over time, the hot path gets recompiled with updated assumptions. This combination mirrors the approach used in JavaScript JITs and is especially useful for WasmGC programs, where dynamically dispatched calls (e.g., virtual methods) are common.
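
The interplay between the guard, the inlined fast path, and the deoptimization fallback can be sketched with a toy model. The Python below is a simplified illustration under several assumptions: IndirectCallSite, TABLE, and the helper functions are hypothetical names, "inlining" is modeled by caching the callee (a real compiler would splice the callee's body into the caller), and the bounds and signature checks that a real call_indirect performs are left out.

```python
def double(x):
    return x * 2


def negate(x):
    return -x


TABLE = [double, negate]  # stands in for a WebAssembly function table


class IndirectCallSite:
    """Toy call site: the baseline tier records targets, the optimized tier
    speculates on one of them. Hypothetical names throughout."""

    def __init__(self, table):
        self.table = table
        self.counts = {}           # table index -> calls observed
        self.inlined_index = None  # index speculated on, None if unoptimized
        self.inlined_body = None   # stand-in for the inlined callee code
        self.deopts = 0

    def _baseline(self, index, arg):
        # Unoptimized execution: a real indirect call plus feedback collection.
        self.counts[index] = self.counts.get(index, 0) + 1
        return self.table[index](arg)

    def optimize(self):
        # Speculate on the most frequently observed target, if there is one.
        if self.counts:
            self.inlined_index = max(self.counts, key=self.counts.get)
            self.inlined_body = self.table[self.inlined_index]

    def call(self, index, arg):
        if self.inlined_index is not None:
            if index == self.inlined_index:  # guard on the expected target
                return self.inlined_body(arg)  # fast path, no table lookup
            # Guard failed: deoptimize, discard the speculation and the
            # stale feedback, then continue in the baseline tier.
            self.deopts += 1
            self.inlined_index = None
            self.inlined_body = None
            self.counts = {}
        return self._baseline(index, arg)
```

After a few calls through index 0, `optimize()` speculates on `double`. A later call through index 1 fails the guard, counts as one deoptimization, and still returns the correct result from the baseline path, which starts a new round of feedback collection.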

What performance improvements have been observed from these optimizations?

The impact is significant. On a set of Dart microbenchmarks (compiled to WasmGC), the combination of speculative inlining and deoptimization yields an average speedup of more than 50%. For larger, realistic applications and standard benchmarks, the improvement ranges from 1% to 8%. These numbers underscore that the biggest gains come from programs with frequent indirect calls—typical in object-oriented code. Even modest gains of a few percent translate into tangible user experience improvements, especially on mobile or low-power devices. Moreover, these optimizations lay the groundwork for future enhancements: as V8 collects more runtime data, it can apply further speculative transformations, making WasmGC programs progressively faster.

How does deoptimization support for WebAssembly compare to JavaScript?

In JavaScript, deoptimization is a mature, fine-grained mechanism that handles everything from type changes to bailouts from speculative inlining. For WebAssembly, the set of triggers is much narrower because the language lacks JavaScript's extreme dynamism (e.g., prototype chains, eval): a Wasm deoptimization only occurs when a speculative assumption, such as the target of an indirect call or the type of a WasmGC object, fails. The rollback transfers execution to an unoptimized version of the same function. As in JavaScript, the engine must first translate the state of the optimized frame, including any inlined calls, into the form that the unoptimized code expects. The core concept is therefore identical: speculative compilation depends on the ability to revert safely when an assumption is wrong, which enables optimizations that would otherwise be too risky.

What are the future possibilities enabled by deoptimization in WebAssembly?

Deoptimization is not just a one-off feature—it’s a building block for an entire class of speculative optimizations. In the future, V8 could leverage runtime feedback to speculate on array lengths, loop iteration counts, or even object field types in WasmGC. For example, if an array is always accessed with small indices, the engine could elide bounds checks speculatively. Deoptimization provides the safety net: if an assumption fails, execution falls back gracefully. This opens the door to profile-guided recompilation, similar to what state-of-the-art JavaScript engines do. The V8 team expects that as WasmGC matures and more languages target it, speculative techniques will become essential for achieving performance parity with native code.
