提升Rust Workers可靠性:wasm-bindgen中的panic和中止恢复机制
Source: Cloudflare Blog
Rust Workers run on the Cloudflare Workers platform by compiling Rust to WebAssembly, but as we’ve found, WebAssembly has some sharp edges. When things go wrong with a panic or an unexpected abort, the runtime can be left in an undefined state. For users of Rust Workers, panics were historically fatal, poisoning the instance and possibly even bricking the Worker for a period of time.
While we were able to detect and mitigate these issues, there remained a small chance that a Rust Worker would unexpectedly fail and cause other requests to fail along with it. An unhandled Rust abort in a Worker affecting one request might escalate into a broader failure affecting sibling requests or even continue to affect new incoming requests. The root cause of this was in wasm-bindgen, the core project that generates the Rust-to-JavaScript bindings Rust Workers depend on, and its lack of built-in recovery semantics.
In this post, we’ll share how the latest version of Rust Workers handles comprehensive Wasm error recovery that solves this abort-induced sandbox poisoning. This work has been contributed back into wasm-bindgen as part of our collaboration within the wasm-bindgen organization formed last year. First with panic=unwind support, which ensures that a single failed request never poisons other requests, and then with abort recovery mechanisms that guarantee Rust code on Wasm can never re-execute after an abort.
Our initial attempts to address reliability in this area focused on understanding and containing failures caused by Rust panics and aborts in production Rust Workers. We introduced a custom Rust panic handler that tracked failure state within a Worker and triggered full application reinitialization before handling subsequent requests. On the JavaScript side, this required wrapping the Rust-JavaScript call boundary using Proxy‑based indirection to ensure that all entrypoints were consistently encapsulated. We also made targeted modifications to the generated bindings to correctly reinitialize the WebAssembly module after a failure.
While this approach relied on custom JavaScript logic, it demonstrated that reliable recovery was achievable and eliminated the persistent failure modes we were seeing in practice. This solution was shipped by default to all workers‑rs users starting in version 0.6, and it laid the groundwork for the more general, upstreamed abort recovery mechanisms described in the sections that follow.
The abort recovery mechanisms described above ensure that a Worker can survive a failure, but they do so by reinitializing the entire application. For stateless request handlers, this is fine. But for workloads that hold meaningful state in memory, such as Durable Objects, reinitialization means losing that state entirely. A single panic in one request could wipe the in-memory state being used by other concurrent requests.
In most native Rust environments, panics can be unwound, allowing destructors to run and the program to recover without losing state. In WebAssembly, things historically looked very different. Rust compiled to Wasm via wasm32-unknown-unknown defaults to panic=abort, so a panic inside a Rust Worker would abruptly trap with an unreachable instruction and exit Wasm back to JS with a WebAssembly.RuntimeError.
To recover from panics without discarding instance state, we needed panic=unwind support for wasm32-unknown-unknown in wasm-bindgen, made possible by the WebAssembly Exception Handling proposal, which gained wide engine support in 2023.
We start by compiling with RUSTFLAGS='-Cpanic=unwind' cargo build -Zbuild-std, which rebuilds the standard library with unwind support and generates code with proper panic unwinding. For example:
struct HasDropA;
struct HasDropB;
extern "C" {
fn imported_func();
}
fn some_func() {
let a = HasDropA;
let b = HasDropB;
imported_func();
}
compiles to WebAssembly as:
try
call <imported_func>
catch_all
call <drop_b>
call <drop_a>
rethrow
end
call <drop_b>
call <drop_a>
This ensures that even if imported_func() panics, destructors still run. Similarly, std::panic::catch_unwind(|| some_func()) compiles into:
try
call <some_func>
;; set result to Ok(return value)
catch
try
call <std::panicking::catch_unwind::cleanup>
;; set result to Err(panic payload)
catch_all
call <core::panicking::cannot_unwind>
unreachable
end
end
Getting this to work end-to-end required several changes to the wasm-bindgen toolchain. The WebAssembly parser Walrus did not know how to handle try/catch instructions, so we added support for them. The descriptor interpreter also needed to be taught how to evaluate code containing exception handling blocks. At that point, the full application could be built with panic=unwind.
The final step was modifying the exports generated by wasm-bindgen to catch panics at the Rust-JavaScript boundary and surface them as JavaScript PanicError exceptions. One subtlety: Rust will catch foreign exceptions and abort when unwinding through extern "C" functions, so exports needed to be marked extern "C-unwind" to explicitly allow unwinding across the boundary. For futures, a panic rejects the JavaScript Promise with a PanicError.
Closures required special attention to ensure unwind safety was properly checked, via a new MaybeUnwindSafe trait that checks UnwindSafe only when built with panic=unwind. This quickly exposed a problem, though: many closures capture references that remain after an unwind, making them inherently unwind-unsafe. To avoid a situation where users are encouraged to incorrectly wrap closures in AssertUnwindSafe just to satisfy the compiler, we added Closure::new_aborting variants, which terminate on panic instead of unwinding in cases where unwind safety can't be guaranteed.
With panic unwinding enabled:
Panics in exported Rust functions are caught by wasm-bindgen
Panics surface to JavaScript as PanicError exceptions
Async exports reject their returned promises with a PanicError
Rust destructors run correctly
The WebAssembly instance remains valid and reusable
The full details of the approach and how to use it in wasm-bindgen are covered in the latest guide page for Wasm Bindgen: Catching Panics.
Even with panic=unwind support, aborts still happen - out-of-memory errors being one common cause. Because aborts can’t unwind, there is no possibility of state recovery at all, but we can at least detect and recover from aborts for future operations to avoid invalid state erroring subsequent requests.
Panic unwind support introduced a new problem for abort recovery. When we receive an error from Wasm we don’t know if it came from an extern “C-unwind” foreign error, or if it was a genuine abort. Aborts can take many shapes in WebAssembly.
We had two options to solve this technically: either mark all errors which are definitely aborts, or mark all errors which are definitely unwinds. Either could have worked but we chose the latter. Since our foreign exception handling was directly using raw WAT-level (WebAssembly text format) Exception Handling instructions already, we found it easier to implement exception tags for foreign exceptions to distinguish them from aborting non-unwind-safe exceptions.
With the ability to clearly distinguish between recoverable and non-recoverable errors thanks to this Exception.Tag feature in WebAssembly Exception Handling, we were able to then integrate both a new abort handler as well as abort reentrancy guards.
A new abort hook, set_on_abort, can be used at initialization time to attach a handler that recovers accordingly for the platform embedding’s needs.
Hardening panic and abort handling is critical to avoiding invalid execution state. WebAssembly allows deeply interleaved call stacks, where Wasm can call into JavaScript and JavaScript can re-enter Wasm at arbitrary depths, while alongside this, multiple tasks can be functioning in the same instance. Previously, an abort occurring in one task or nested stack was not guaranteed to invalidate higher stacks through JS, leading to undefined behavior. Care was required to ensure we can guarantee the execution model, and contribution in this space remains ongoing.
While aborts are never ideal, and reinitialization on failure is an absolute worst-case scenario, implementing critical error recovery as the last line of defense ensures execution correctness and that future operations will be able to succeed. The invalid state does not persist, ensuring a single failure does not cascade into multiple failures.
While we were working on this, we realized that this is a common problem for libraries used by JS that are built with wasm-bindgen, and that they would also benefit from attaching an abort handler to be able to perform recovery.
But when building Wasm as an ES module and importing it directly (e.g. via import { func } from ‘wasm-dep’), it’s not clear what the recovery mechanism would be for a Wasm abort while calling func() for an already-linked and initialized library that is in a user JS application.
While not strictly a Rust Workers use case, our team also supports JS-based Workers users who run Rust-backed Wasm library dependencies. If we could fix this problem at the same time, that could indirectly also benefit Wasm usage on the Cloudflare Workers platform.
To support automatic abort recovery for Wasm library use cases, we added support for an experimental reinitialization mechanism into wasm‑bindgen, --reset-state-function. This exposes a function that allows the Rust application to effectively request that it reset its internal Wasm instance back to its initial state for the next call, without requiring consumers of the generated bindings to reimport or recreate them. Class instances from the old instance will throw as their handles become orphaned, but new classes can then be constructed. The JS application using a Wasm library is errored but not bricked.
The full technical details of this feature and how to use it in wasm-bindgen are covered in the new wasm-bindgen guide section Wasm Bindgen: Handling Aborts.
Upstream contributions for this work did not stop at the wasm-bindgen project. Building for Wasm with panic=unwind still requires an experimental nightly Rust target, so we’ve also been working to advance Rust’s Wasm support for WebAssembly Exception Handling to help bring this to stable Rust.
During the development of WebAssembly Exception Handling, a late‑stage specification change resulted in two variants: legacy exception handling and the final modern exception handling "with exnref". Today, Rust’s WebAssembly targets still default to emitting code for the legacy variant. While legacy exception handling is widely supported, it is now deprecated.
Modern WebAssembly Exception Handling is supported as of the following JS platform releases:
Runtime | Version | Release Date |
v8 | 13.8.1 | April 28, 2025 |
workerd | v1.20250620.0 | June 19, 2025 |
Chrome | 138 | June 28, 2025 |
Firefox | 131 | October 1, 2024 |
Safari | 18.4 | March 31, 2025 |
Node.js | 25.0.0 | October 15, 2025 |
As we were investigating the support matrix, the largest concern ended up being the Node.js 24 LTS release schedule, which would have left the entire ecosystem stuck on legacy WebAssembly Exception Handling until April 2028.
Having discovered this discrepancy, we were able to backport modern exception handling to the Node.js 24 release, and even backport the fixes needed to make it work on the Node.js 22 release line to ensure support for this target. This should allow the modern Exception Handling proposal to become the default target next year.
Over the coming months, we’ll be working to make the transition to stable panic=unwind and modern Exception Handling as invisible as possible to end users.
While these long‑term investments in the ecosystem take time, they help build a stronger foundation for the Rust WebAssembly community as a whole, and we’re glad to be able to contribute to these improvements.
As of version 0.8.0 of Rust Workers, we have a new --panic-unwind flag, which can be added to the build command, following the instructions here.
With this flag, panics can be fully recovered, and abort recovery will use the new abort classification and recovery hook mechanism. We highly recommend upgrading and trying it out for a more stable Rust Workers experience, and plan to make panic=unwind the default in a subsequent release. Users remaining on panic=abort will still continue to take advantage of the previous custom recovery wrapper handling from 0.6.0.
This work is part of our ongoing effort towards a stable release for Rust Workers. By solving these sharp edges of the Wasm platform foundations at their root, and contributing back to the ecosystem where it makes sense, we build stronger foundations not just for our platform, but the entire Rust, JS, and Wasm ecosystem.
We have a number of future improvements planned for Rust Workers, and we’ll soon be sharing updates on this additional work, including wasm-bindgen generics and automated bindgen, which Guy Bedford from our team previewed in a talk on Rust & JS Interoperability at Wasm.io last month.
Find us in #rust‑on‑workers on the Cloudflare Discord. We also welcome feedback and discussion and especially all new contributors to the workers-rs and wasm-bindgen GitHub projects.