u/minamoto108

Re-ran the wasm-in-JVM and JS-in-JVM benchmarks after maintainers asked to be included — wasmtime4j and chicory-redline numbers inside, same JMH harne
▲ 5 r/WebAssemblyDev+1 crossposts

https://preview.redd.it/gwrsfmf5ve2h1.png?width=2048&format=png&auto=webp&s=fcb3a517d4d8b73b28e4fa4ddb551457ff088744

A couple of weeks back I posted two benchmark write-ups: wasm-in-JVM (six backends, JPEG decode) and JS-in-JVM (Sieve of Eratosthenes). The most useful thing that happened next: Andrea Peruffo (Chicory core contributor) reached out on LinkedIn, and u/Otherwise_Sherbert21 (wasmtime4j author) reached out on Reddit — both pointing out the obvious gap. So I added wasmtime4j and chicory-redline as backends to both harnesses, kept the workloads and JMH config identical, and re-ran the lot. Sharing the updated tables here because the two new rows actually move the discussion forward.

Same host both runs: Apple M2 Max, Oracle GraalVM 25 (25+37-LTS-jvmci-b01), JMH 1.37, 1 fork, single-threaded, AverageTime mode in µs/op. Workloads byte-identical to the original posts.

wasm — proxy.wasm JPEG decode (Rust jpeg-decoder, 320×240 → 230,400 bytes RGB8; SHA-256 of decoded output identical across all eight backends).

| # | Backend | Score (µs/op) | 99.9% CI | vs fastest |
|---|---|---|---|---|
| 1 | nativeFfm | 1,016.205 | ±33.699 | 1.00× |
| 2 | graalwasm | 1,324.282 | ±352.856 | 1.30× |
| 3 | wasmtime4j | 1,419.934 | ±387.601 | 1.40× |
| 4 | chicoryRedline | 1,782.507 | ±33.860 | 1.75× |
| 5 | chicoryAotPlugin | 9,594.229 | ±116.206 | 9.44× |
| 6 | chicoryAot | 9,600.974 | ±296.031 | 9.45× |
| 7 | graalwasmInterp | 72,996.191 | ±2,938.575 | 71.84× |
| 8 | chicory | 252,427.438 | ±8,303.613 | 248.45× |

What each new row actually is:

| backend | engine | codegen | bridge | tier observed |
|---|---|---|---|---|
| wasmtime4j | Wasmtime 44.0.1 | Cranelift JIT | wasmtime4j JNI | JNI (Panama impl unpublished in 0.x) |
| chicoryRedline | Chicory Machine SPI | Cranelift AOT at build time | jffi → native code | redline.isNative() == true |

JS — sieve(1_000_000) = 78,498 (all 7 backends return the same answer).

| # | Backend | Score (µs/op) | 99.9% CI | vs fastest |
|---|---|---|---|---|
| 1 | graaljs | 2,799.644 | ±16.209 | 1.00× |
| 2 | graaljsInterp | 116,971.726 | ±286.025 | 41.78× |
| 3 | rquickjsFfm | 164,382.032 | ±1,129.132 | 58.72× |
| 4 | wasmtime4j | 607,403.997 | ±20,726.650 | 216.96× |
| 5 | chicoryRedline | 2,012,876.770 | ±89,055.397 | 718.96× |
| 6 | quickjs4j | 14,341,999.558 | ±54,122.651 | 5,123.51× |
| 7 | rquickjsChicory | 18,492,288.679 | ±64,727.834 | 6,605.30× |

Both new JS rows execute rquickjs.wasm (the same QuickJS-via-Rust binding used by rquickjsFfm, just delivered as wasm instead of as a cdylib).
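For readers who want to sanity-check the expected answer, here is a minimal Java sketch of the same algorithm. It is not the JavaScript source the harness feeds to the engines, and the names `SieveReference` and `countPrimes` are made up for illustration. It only shows what sieve(1_000_000) computes and why every backend should report 78,498.

```java
import java.util.BitSet;

class SieveReference {
    /** Counts the primes <= limit with a classic Sieve of Eratosthenes. */
    static int countPrimes(int limit) {
        if (limit < 2) return 0;
        BitSet composite = new BitSet(limit + 1);
        for (int p = 2; (long) p * p <= limit; p++) {
            if (composite.get(p)) continue;            // p was crossed out by a smaller prime
            for (int m = p * p; m <= limit; m += p) {  // smaller multiples of p are already marked
                composite.set(m);
            }
        }
        // There are (limit - 1) integers in 2..limit; every one not marked composite is prime.
        return (limit - 1) - composite.cardinality();
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(1_000_000)); // prints 78498
    }
}
```

A native Java loop like this is not comparable to any row in the table; it is only a correctness reference for the result value.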

Things the new rows surface that the original posts couldn't

Same engine, two bridges: FFM vs JNI. nativeFfm (1,016 µs) and wasmtime4j (1,420 µs) both run the same proxy.wasm through Wasmtime + Cranelift. The 40% gap is per-call bridge overhead: JEP 454 FFM vs wasmtime4j's JNI path. wasmtime4j 44.0.1 publishes a wasmtime4j-panama artifact, but the PanamaWasmRuntime class isn't shipped in 0.x yet, so on JDK 25 you still go through JNI. Closing that gap is upstream work, not engine work, and when the Panama impl lands, the expectation is that wasmtime4j and nativeFfm converge.

Cranelift → native vs Cranelift → JVM bytecode, 5.4× apart. chicoryRedline (1,783 µs) and chicoryAotPlugin (9,594 µs) both compile proxy.wasm at build time. The only material difference is the codegen target: native machine code through redline vs JVM bytecode through Chicory's compiler plugin. The native path wins by about 5.4× (9,594 / 1,783) despite paying for the jffi bridge on every call; the 9.44× in the table is chicoryAotPlugin's distance from nativeFfm, not from chicoryRedline. JVM-bytecode AOT is not a substitute for true native compilation on this workload.

Bridge cost depends on workload, not just on the bridge. Same two backends across the two harnesses:

| | JPEG decode | Sieve |
|---|---|---|
| wasmtime4j (JNI) | 1,420 µs | 607,404 µs |
| chicoryRedline (jffi + Chicory Instance scaffold) | 1,783 µs | 2,012,877 µs |
| ratio | 1.26× | 3.31× |

On JPEG decode each benchmark op is one heavy guest call, so bridge cost is paid once and amortised across ~1 ms of native work. On Sieve, each op runs millions of QuickJS-interpreter instructions but the JVM↔guest scaffolding has more to do per op, and Chicory's Instance export call is enough heavier than Wasmtime's call ABI that the 1.26× gap on JPEG decode blows out to 3.31× here. Same engines, same bridge primitives, different workload-to-bridge-cost ratio.

wasm sandbox tax (JS-side). wasmtime4j (607 ms) runs the same upstream rquickjs binding as rquickjsFfm (164 ms), just compiled to wasm and executed by Wasmtime + Cranelift instead of linked as a cdylib. The 3.7× gap is wasm linear-memory bounds checks, indirect calls, plus the JNI hop to enter the QuickJS interpreter. Useful number to keep in mind if you're considering "ship as wasm for portability" for a JS engine.

Floor on both workloads is unchanged. Chicory's tree-walk interpreter on JPEG decode (248×), Chicory bytecode-AOT on a JS interpreter wrapped in wasm (6,600×). These set the lower bound for "no codegen / two interpreters deep on the JVM" — useful context for the new rows but no rank-order change.

Caveats — same as before, repeated for completeness

  • Single host, single fork, wide CIs on graalwasm (±353), wasmtime4j wasm row (±388), and chicoryRedline Sieve row (±89,055 — that 4.4 % CI is the largest absolute error in either table). Rank order is stable; the absolute spread on those rows would tighten with more forks.
  • JDK = Oracle GraalVM 25. Stock OpenJDK 25 reproduces every row except graalwasm / graaljs — those depend on Graal-as-JIT (JVMCI / libgraal). Running them on Temurin / Corretto silently falls back to the Truffle interpreter row, which is the calibration trap from the original posts.
  • Bridge mix is uneven across rows (FFM / JNI / jffi / direct JVM-bytecode call). A perfectly controlled "engine A vs engine B" comparison would hold the bridge constant — currently impossible because not every published artifact ships a Panama impl.
  • One workload per harness. JPEG decode is compute-heavy with substantial memory traffic; Sieve is tight inner loops over arrays. Workloads with more allocation, more dynamic dispatch, more guest↔host round trips will reorder both tables — especially the bridge-overhead rows.
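To put the wide-CI caveat in relative terms, the sketch below divides each CI half-width by its score, using the numbers copied from the two tables. The class and method names are hypothetical. It shows that the two wasm rows sit at roughly ±27% of their mean, while the chicoryRedline Sieve row, despite the largest absolute error, is only about ±4.4%.

```java
import java.util.Locale;

class RelativeCi {
    /** CI half-width expressed as a percentage of the mean score. */
    static double relativeCiPercent(double score, double ciHalfWidth) {
        return 100.0 * ciHalfWidth / score;
    }

    public static void main(String[] args) {
        // Scores and 99.9% CI half-widths in microseconds per op, copied from the tables.
        String[] rows = {"graalwasm (JPEG)", "wasmtime4j (JPEG)", "chicoryRedline (Sieve)"};
        double[] scores = {1_324.282, 1_419.934, 2_012_876.770};
        double[] halfWidths = {352.856, 387.601, 89_055.397};
        for (int i = 0; i < rows.length; i++) {
            System.out.printf(Locale.ROOT, "%-24s +/-%.1f%%%n",
                    rows[i], relativeCiPercent(scores[i], halfWidths[i]));
        }
        // Prints +/-26.6%, +/-27.3% and +/-4.4% respectively.
    }
}
```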

What's next on the Hexana side

The next release will ship tooling that lets you actually see inside one of these integrations from inside a JetBrains IDE — the wasm↔host boundaries, the codegen tier each module is running at, the bridge each call is going through. Whether seeing inside is enough to move numbers like the ones in these tables is the question the follow-up post will try to answer. Numbers, not promises.

Repros

Both repos: mvn package builds the wasm artifacts, the Rust cdylib, and (for the wasm harness) the redline-compiled native code; then java --enable-native-access=ALL-UNNAMED -cp … runs the JMH suite. mvn exec:java will fail — JMH's forked runner can't see the project classpath that way; both READMEs spell out the workaround.

PRs welcome for backends I've missed — wasmer-java, wazero-on-JVM via JNI, additional WASI-heavy workloads. And if you're seeing materially different ratios on a different workload or JDK, I'd love to see the numbers — would help calibrate where these generalise.

Thanks again to u/Otherwise_Sherbert21 (wasmtime4j) and Andrea Peruffo (Chicory) for asking to be included. The two new rows changed how I read the original tables.

reddit.com
u/minamoto108 — 1 day ago
▲ 6 r/WebAssemblyDev+2 crossposts

Hexana 0.9.1 for JetBrains IDEs: `.wit` declarations and `.wasm` exports now surface in Goto Symbol / Search Everywhere

https://preview.redd.it/fg9jnycgoe2h1.png?width=2048&format=png&auto=webp&s=1c74810574a371c7a581e2ecf3a6695711d238ce

Small workflow polish in this release. The JetBrains-platform "Search Everywhere → Symbols" / "Goto Symbol" navigation now indexes Hexana's parsed binary metadata:

  • .wit declarations (functions, types, world / interface names)
  • Component-model exports (from .wasm component binaries)
  • Core .wasm module exports

Picking a .wasm export from the symbol picker opens the file in Hexana's binary editor and selects the matching row in the Exports tab automatically.

Concretely: if your day-to-day was open .wasm → click the Exports tab → scroll-and-eyeball for the right row, that's now ⇧⇧ → type the export name → Enter, the same way you'd jump to a Kotlin / Java / Rust symbol in the same IDE.

This sits alongside what 0.9 shipped a couple of weeks back (experimental WASM debugging via Wasmtime / WAMR + lldb, bring-your-own GraalVM as a run target, Top-tab sortable headers, Chicory and GraalWasm completion for Java embedders). The current focus is making the IDE treat .wasm and .wit as first-class citizens of the symbol space, not as opaque artifacts you have to switch contexts to inspect.

Install / update: https://plugins.jetbrains.com/plugin/29090-hexana
Docs: https://jetbrains.github.io/hexana

Per-version changelog under the "Versions" tab of that same Marketplace page.

u/minamoto108 — 1 day ago
▲ 4 r/WebAssemblyDev+2 crossposts

Hexana 0.1.0 for VS Code: WAMR + GraalVM runtimes, experimental WASM debugging, MCP server

https://preview.redd.it/a76rk5ntle2h1.png?width=2048&format=png&auto=webp&s=068ba9c43efd76dcedd5619090a5434060d18185

The VS Code build of Hexana has been chasing the JetBrains feature set since it first shipped (0.0.2 in early May). 0.1.0 closes most of that headline-feature gap in one bundle.

Three things land in this release:

WAMR and GraalVM as runtimes — previously the VS Code Run command only used Wasmtime. Now you can pick WAMR or GraalVM as well, same Run UI. Useful if you're targeting a runtime your wasm will actually ship into (WAMR for embedded scenarios, GraalVM for JVM-embedding scenarios) and you want to test against that engine, not just Wasmtime.

Experimental WASM debugging — same scope and constraints as the JetBrains 0.9 release: requires LLVM 22.1 or newer, works only with Wasmtime or WAMR (not GraalVM), and only for targets that are debuggable with lldb. Within those bounds you can step through your .wasm, pause, inspect locals, continue, all from inside VS Code. The honest framing is in the label — it's experimental, the constraints are real, but within those constraints it works.

MCP server — the extension now ships a Model Context Protocol server. The point is that MCP-capable AI assistants reading your wasm see what Hexana sees, not just a raw byte stream.

What this means for the cross-IDE story: the JetBrains and VS Code builds are not identical yet — the JetBrains 0.9.1 release that landed today, for example, adds .wit and .wasm symbol contributions to JetBrains' Goto Symbol / Search Everywhere, which is a platform-specific integration. But the headline features — three runtimes, experimental debugging, MCP server, the full structural analysis surface — are now in both builds.

Install / update

VS Code command palette:

ext install JetBrains.hexana-wasm

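Or directly: https://marketplace.visualstudio.com/items?itemName=JetBrains.hexana-wasm (VS Code Marketplace) or https://open-vsx.org/extension/JetBrains/hexana-wasm (Open VSX)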

Per-version changelogs are under the "Versions" tab on each of those listings.

u/minamoto108 — 1 day ago
▲ 13 r/WebAssemblyDev+3 crossposts

Hexana now has documentation — jetbrains.github.io/hexana

https://preview.redd.it/tyusheln6i1h1.png?width=2048&format=png&auto=webp&s=e02f6a992d6903bd0a13ea3ecaff99e8831f47fd

Hexana is a WebAssembly and binary analysis toolkit by JetBrains. Until today, the only places you could go to figure out what it does were:

  • the JetBrains Marketplace listing (per-version changelog), and
  • the VS Code Marketplace / Open VSX listings (release notes only).

That's not enough surface for a tool that does multi-tab .wasm inspection, WAT/WIT language support, structural analysis, an MCP server for AI assistants, run/debug, DWARF source mapping, and Java embedder support across Wasmtime / WAMR / GraalVM. So: https://jetbrains.github.io/hexana

What's there

  • Two flavours, side by side. Hexana ships as a JetBrains IDE plugin (IntelliJ IDEA, RustRover, WebStorm, GoLand, CLion, PyCharm, etc.) and as a VS Code extension. The docs cover both, and there's a "Choosing between the two" page for when it's not obvious which one fits. In practice today: VS Code is the lighter read-only view; the JetBrains plugin is where the debugger, multi-runtime run configs, and MCP live.
  • Wasm coverage documented. Core, Component Model, GC, SIMD, Threads, Tail Call, Reference Types — and what Hexana actually does with each.
  • Runtimes. Wasmtime, WAMR, GraalVM — what's selectable where, and which combinations the debugger supports today (LLVM ≥ 22.1, Wasmtime or WAMR, lldb-debuggable target).
  • Per-flavour pages. Getting Started, Features, Settings, Troubleshooting, Changelog — for each.

What this is not

  • Not a one-size-fits-all docs site pretending the two flavours are at parity. JetBrains plugin is 0.9 (May 7). VS Code extension is 0.0.2 preview (May 7). The docs match reality on the ground — the JetBrains side is broader because the plugin is broader.
  • Not an internal-API reference. It's user-facing.

Links

If anything in the docs is wrong, unclear, or missing — that's exactly the feedback we want right now.

u/minamoto108 — 6 days ago
▲ 8 r/WebAssembly+1 crossposts

https://preview.redd.it/kkoxkrhstpzg1.png?width=2048&format=png&auto=webp&s=e4e72504a50976b170a7c580315c962d33957910

Hexana started life as a plugin for JetBrains IDEs (IntelliJ IDEA, RustRover, WebStorm, GoLand, CLion, PyCharm, etc.) that treats .wasm and .wit as first-class IDE artifacts. It now also ships as a VS Code extension — version 0.0.2 just landed on Open VSX.

Install (VS Code command palette):

ext install JetBrains.hexana-wasm

Or install from the VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=JetBrains.hexana-wasm or from Open VSX: https://open-vsx.org/extension/JetBrains/hexana-wasm

Below: what's in the VS Code release on day one.

Custom binary editor for .wasm

Opens .wasm files in a dedicated read-only editor instead of the default VS Code hex view. The editor auto-detects whether the binary is a Core Wasm module, a Component Model binary, or a generic Wasm file. The structural analysis panel adjusts based on which kind it is.
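How Hexana performs this detection is not described in the post. As a rough sketch of one way it could be done, the 8-byte preamble is enough to tell the two standard kinds apart: a core module is the magic `\0asm` followed by version 1 and layer 0, while the Component Model binary format currently uses a pre-standard version (0x0d) with layer 1. The `WasmKind` class and its enum values are hypothetical names, not Hexana's API.

```java
class WasmKind {
    enum Kind { CORE_MODULE, COMPONENT, OTHER_WASM, NOT_WASM }

    /** Classifies a binary from its first 8 bytes: magic, then two little-endian u16 fields. */
    static Kind detect(byte[] header) {
        if (header.length < 8
                || header[0] != 0x00 || header[1] != 'a' || header[2] != 's' || header[3] != 'm') {
            return Kind.NOT_WASM;
        }
        int version = (header[4] & 0xFF) | ((header[5] & 0xFF) << 8);
        int layer   = (header[6] & 0xFF) | ((header[7] & 0xFF) << 8);
        if (version == 1 && layer == 0) return Kind.CORE_MODULE;
        if (layer == 1) return Kind.COMPONENT;   // version is 0x0d today and may change
        return Kind.OTHER_WASM;                  // right magic, unrecognised version/layer
    }

    public static void main(String[] args) {
        byte[] core = {0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00};
        byte[] component = {0x00, 0x61, 0x73, 0x6D, 0x0D, 0x00, 0x01, 0x00};
        System.out.println(detect(core));       // CORE_MODULE
        System.out.println(detect(component));  // COMPONENT
    }
}
```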

Hex viewer

Virtual-scrolling hex dump. Byte selection via click, shift-click, drag. Keyboard navigation. Text search across the byte stream.

Structural analysis — up to 11 tabbed views

Surfaced based on the binary kind:

  • Summary — section table + binary statistics
  • Exports — kind, name, index, function signature
  • Imports — kind, module, name
  • Functions — index, name, signature
  • Data — data segments
  • Custom — custom sections
  • Top — largest contributors by size
  • Monos — monomorphisation analysis
  • Garbage — unreferenced / dead code detection
  • Modules — clickable nested-module drill-down (component model)
  • WAT — WebAssembly Text rendered in a native VS Code editor tab with syntax highlighting

Every table sorts by column and supports text search.
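For a sense of what feeds a section table like the one in the Summary tab, here is a minimal sketch of walking the sections of a core module: after the 8-byte preamble, each section is a one-byte id followed by an unsigned LEB128 payload size. This is an illustration only, not Hexana's parser; `SectionTable` and its methods are hypothetical, and malformed input is not handled.

```java
import java.util.ArrayList;
import java.util.List;

class SectionTable {
    // Section ids 0..13 of a core module, in id order.
    static final String[] NAMES = {"custom", "type", "import", "function", "table", "memory",
            "global", "export", "start", "element", "code", "data", "data count", "tag"};

    /** Returns one {id, payloadOffset, payloadSize} triple per section. Assumes well-formed input. */
    static List<int[]> list(byte[] wasm) {
        List<int[]> sections = new ArrayList<>();
        int pos = 8;                                   // skip magic + version
        while (pos < wasm.length) {
            int id = wasm[pos++] & 0xFF;
            int size = 0, shift = 0, b;
            do {                                       // unsigned LEB128, 7 payload bits per byte
                b = wasm[pos++] & 0xFF;
                size |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            sections.add(new int[] {id, pos, size});
            pos += size;                               // jump over the payload to the next section
        }
        return sections;
    }

    static String name(int id) {
        return id < NAMES.length ? NAMES[id] : "unknown(" + id + ")";
    }

    public static void main(String[] args) {
        // Hand-built module: empty type section, then a custom section named "x" with one payload byte.
        byte[] wasm = {0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00,
                       0x01, 0x01, 0x00,
                       0x00, 0x03, 0x01, 0x78, 0x2A};
        for (int[] s : list(wasm)) {
            System.out.println(name(s[0]) + " offset=" + s[1] + " size=" + s[2]);
        }
        // Prints "type offset=10 size=1" then "custom offset=13 size=3".
    }
}
```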

Run support

Run a .wasm from the editor toolbar via wasmtime. The Run dialog asks which export to call and what program arguments to pass.

  • Core modules → import stubs auto-generated.
  • Component-Model binaries → dependencies resolved and composed before run.

Component Model

  • Automatic dependency resolution by scanning workspace directories for matching .wasm files, transitively.
  • Open a nested module inside a component binary in its own editor tab — same custom editor, full structural analysis.

Day-one scope

This is the day-one VS Code feature set. The JetBrains plugin has been around longer and currently has additional capabilities not yet in the VS Code extension — experimental WASM debugging (shipped in JetBrains 0.9, also out today), DWARF source mapping, WIT language support, JS↔Wasm type inference, Java embedder support (Chicory, GraalWasm), and additional runtimes for Run (WAMR, GraalVM).

If you need any of those today, use the JetBrains plugin: https://plugins.jetbrains.com/plugin/29090-hexana

File issues if you hit something

If a .wasm should open and doesn't, or a section doesn't parse, the "doesn't load on this binary" reports are exactly what helps right now — ideally with a reproducer.

Install: ext install JetBrains.hexana-wasm
Web listing: https://open-vsx.org/extension/JetBrains/hexana-wasm

VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=JetBrains.hexana-wasm

u/minamoto108 — 14 days ago

https://preview.redd.it/be313tkzmpzg1.png?width=2048&format=png&auto=webp&s=3be561a087277bed40da9f0b3be77c3e1899c730

Hexana is a plugin for JetBrains IDEs (built on the IntelliJ Platform — works in IntelliJ IDEA, RustRover, WebStorm, GoLand, CLion, PyCharm, etc.) that treats `.wasm` and `.wit` as first-class IDE artifacts: explorer tree, hex view, WAT view, navigation, MCP API for AI assistants. Free on the JetBrains Marketplace.

0.9 just shipped. Highlights below; per-version detail on the Marketplace listing: https://plugins.jetbrains.com/plugin/29090-hexana

Experimental WASM debugging

You can step through .wasm from the IDE — pause, inspect, continue. It's experimental and the constraints are explicit:

  • LLVM 22.1 or newer required
  • Works with Wasmtime and WAMR only
  • The target has to be debuggable with lldb

Within those bounds, it works. If you've been doing wasm debugging via printf-into-host-imports, this should feel like a real upgrade. If your toolchain is older than LLVM 22.1, you're out for now.

WAMR support for run + debug

WAMR is now a selectable runtime in run configurations alongside Wasmtime (which shipped in 0.8). Same UI, pick a runtime, hit Run or Debug.

Custom GraalVM home

Until 0.9 the GraalVM run option used the bundled Graal only. You can now point at any GraalVM install on your machine.

UX

  • Information bar across the top of the binary view: file size (hover for stats), module kind, inline Run/Debug buttons.
  • Top tab: proper headers, sortable columns, scrolling.
  • Nested modules: opening one now shows a backreference to the containing module so you can navigate back out.

Java embedder support

If you're embedding wasm in Java:

  • Chicory (Red Hat): Java completion + inspections specific to Chicory APIs
  • GraalWasm (Oracle): same, for GraalWasm

File issues if you hit something

If you've got a .wasm that should debug and doesn't (LLVM ≥ 22.1, wasmtime or WAMR target, lldb-debuggable), the "doesn't work" reports are exactly what helps right now — ideally with a reproducer.

Plugin: https://plugins.jetbrains.com/plugin/29090-hexana

u/minamoto108 — 15 days ago
▲ 14 r/WebAssembly+1 crossposts

https://preview.redd.it/w2x0ka2iinyg1.png?width=2048&format=png&auto=webp&s=1923e39f182efc80179a0f66ffd186269d4ea741

Hexana is a JetBrains IntelliJ plugin that treats .wasm binaries (and .wit definitions) as first-class IDE artifacts: explorer tree, hex view, WAT view, navigation, MCP API for AI assistants. Free on the JetBrains Marketplace. Below is a consolidated changelog from 0.5 → 0.8.2 — six weeks, five releases.

Major features added since 0.5

  • Component Model + WIT support. Component sections, instances, type definitions, imports, exports, interfaces, and worlds all show up in the explorer tree. WIT files get full language support (go-to-definition, find usages, hover docs, keyword completion, formatting) and cross-navigate from WIT into the corresponding .wasm definitions.
  • DWARF source mapping. Hexana detects and parses DWARF in .wasm and maps functions back to source files and lines. Click a function in the binary, land in the source.
  • Code-Size Profiler for WebAssembly. See exactly which functions, sections, and data segments are eating bytes in your .wasm, right in the IDE.
  • JS interop with Wasm awareness. Real code completion and type inference for instance.exports.*, import namespaces, and property names — derived from the actual .wasm module, not a stale .d.ts.
  • Run configurations. Pick Wasmtime or GraalVM, hit Run.
  • WAT view that's actually usable. Offset-based line numbers matching byte positions, IDE zoom, text selection, search, smooth scrolling.
  • Hex view polish. Text selection across hex and text columns, arrow keys behave.
  • Search across imports / exports / functions in any table view (filter-as-you-type).
  • Broader opcode coverage in WAT and MCP. reference-types and bulk-memory instruction families, plus Legacy Exception Handling parsing/rendering.
  • MCP improvements. Tool descriptions tightened for cleaner AI-assisted binary analysis.

Stability picked up alongside this — Go-compiled .wasm modules load, KDoc rendering doesn't break with Hexana enabled, shared-memory limits handled correctly, big WAT files don't lag, run configs work on Windows, and a long-running data race on the shared byte buffer that caused sporadic UnParsedOpcodeExceptions on larger modules is gone.

Chronological breakdown

0.6 — Component Model + WIT (2026-03-18)

Added

  • Component Model binary support: component sections, instances, type definitions, imports, exports, interfaces, worlds — all parsed and shown in the explorer tree
  • WIT language support: code model, go-to-definition, find usages, hover documentation
  • Cross-navigation from WIT to Wasm: click an export in .wit, jump to its definition in .wasm

Fixed

  • Several MCP-side issues affecting AI-assisted analysis

0.7 — WAT usability + search (2026-03-31)

Added

  • WAT files now show offset-based line numbers that match byte positions in the binary — finally makes WAT ↔ hex correlation trivial
  • Search across imports, exports, and functions in any table view (filter-as-you-type, no shortcut)
  • Arrow-key navigation, scrolling, layout fixes across all table views
  • WIT basic editing: keyword completion, code formatting

Fixed

  • KDoc rendering no longer breaks when Hexana is enabled
  • Go-compiled .wasm modules load without crashing
  • Shared memory limits handled correctly

0.7.1 — UX polish (2026-04-09)

Added / improved

  • IDE zoom now works in the WAT tab (presentations, screenshots, blog posts — readable at last)
  • Big WAT files: line numbers, text selection, search, smooth scrolling — proper editor instead of a flat dump
  • Hex view: text selection works across hex and text columns, arrow keys behave

Fixed

  • Unbalanced tree parsing in WAT no longer trips the plugin
  • .wasm/.wat served over HTTP (local debug scenarios) handled correctly
  • WIT folding with empty ranges

0.8 — DWARF + profiler + JS interop + run configs (2026-04-21)

Added

  • DWARF support. Detects and parses DWARF in .wasm, maps functions back to source files and lines. Click a function in the binary, land in the source.
  • Code-Size Profiler. See exactly which functions, sections, and data segments are consuming bytes in your .wasm.
  • JS interop with Wasm awareness. Real code completion and type inference for instance.exports.*, import namespaces, and property names — derived from the actual .wasm module, not a stale .d.ts.
  • Run configurations for Wasmtime and GraalVM. Pick a runtime, hit Run.
  • Explorer integration: Hexana views slot into the Project tool window
  • MCP tool descriptions optimized for cleaner AI-assisted analysis

Fixed

  • IJPL-242167 (Project tool window crash on certain configurations)
  • WIT ClassCastException

0.8.2 — patch (2026-04-30)

Added

  • Legacy EH (exception handling) parsing/rendering — for modules built against the older proposal
  • WAT/MCP rendering of reference-types and bulk-memory instruction families

Fixed

  • Run configurations now work on Windows (Wasmtime / GraalVM run configs in 0.8 didn't actually launch on Windows — they do now)
  • Wasm parser fixes (vector, table)
  • Element segment type 6 now reads the reference-type per WebAssembly 3.0 spec §5.5.12
  • Data race on shared CommonByteBuffer causing sporadic UnParsedOpcodeExceptions on larger modules — fixed

(0.8.1 didn't ship publicly — the Windows fix needed an extra revision before going out.)

Where this is going

Short list of what's actively in progress, in case anyone has opinions to share before it's frozen:

  • WASM debugging via DWARF — read-only inspection works; stepping through wasm in the IntelliJ debugger is next
  • Cross-navigation from Wasm imports back to WIT (the inverse of what shipped in 0.6)
  • More opcodes / proposals coverage in WAT and MCP (threads, tail-call, GC types are the obvious gaps)

Plugin: https://plugins.jetbrains.com/plugin/29090-hexana
Issues / feature requests: https://github.com/JetBrains/hexana/issues

If you've hit something that should be here and isn't — ideally with a .wasm reproducer — file it. The "doesn't load" / "crashes on" tickets get prioritized over feature work.

u/minamoto108 — 20 days ago
▲ 30 r/WebAssembly+2 crossposts

https://preview.redd.it/wqercu2zcnyg1.png?width=2048&format=png&auto=webp&s=370de2ae8a59213447dc5f136bed6dead124ac2f

We've been running wasm modules inside a JVM application (a Rust wasmprinter embedded via GraalWasm) and the obvious follow-up question was: how does this compare to the alternatives, and when should we actually pick something else?

So I built a small JMH harness that runs the same proxy.wasm artifact through six execution paths and wrote up the results. Sharing here because I couldn't find a head-to-head comparison covering all of these in one place, and I'd genuinely like to hear if anyone has reasons to expect different numbers on different workloads.

The workload

A tiny Rust crate compiled to wasm32-wasip1 exposing one export:

#[no_mangle]
pub unsafe extern "C" fn decode_jpeg(
    in_ptr: *const u8, in_len: usize,
    out_ptr: *mut u8, out_cap: usize,
) -> i32 { /* jpeg-decoder → RGB8 */ }

Input: a 320×240 JPEG baked into the wasm via include_bytes!. Output: 230,400 bytes of RGB. Steady-state ~1 ms of native CPU — small enough to expose call/dispatch overhead, big enough that the JIT actually kicks in. Cross-variant correctness check: every backend produces byte-identical output (sha256 matches across all six).
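The cross-variant correctness check can be sketched on the JVM side roughly as follows. This is an assumption about the shape of such a check, not the harness's actual code: `OutputDigestCheck` and its methods are hypothetical, and the zero-filled buffers in `main` merely stand in for real decoder output of the stated 230,400-byte size.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

class OutputDigestCheck {
    /** Lower-case hex SHA-256 of a buffer. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** True when all backends produced byte-identical output (compared via digest). */
    static boolean allIdentical(Map<String, byte[]> outputsByBackend) {
        return outputsByBackend.values().stream()
                .map(OutputDigestCheck::sha256Hex)
                .distinct()
                .count() <= 1;
    }

    public static void main(String[] args) {
        // Placeholder buffers of 320 * 240 * 3 bytes, standing in for two backends' RGB8 output.
        byte[] first = new byte[230_400];
        byte[] second = first.clone();
        Map<String, byte[]> outputs = new LinkedHashMap<>();
        outputs.put("nativeFfm", first);
        outputs.put("graalwasm", second);
        System.out.println(allIdentical(outputs)); // true
        second[0] = 1;                             // simulate a one-byte divergence
        System.out.println(allIdentical(outputs)); // false
    }
}
```

Comparing digests rather than raw buffers keeps the check cheap to log and easy to compare across separate runs.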

The six backends

| Backend | What it actually is |
|---|---|
| chicory | Chicory's pure-Java interpreter |
| chicory-aot | Chicory + MachineFactoryCompiler.compile(...) at JVM startup |
| chicory-aot-plugin | Chicory build-time AOT via chicory-compiler-maven-plugin (wasm → JVM .class at mvn compile) |
| graalwasm | GraalWasm with Truffle JIT enabled (libgraal) |
| graalwasm-interp | GraalWasm with engine.Compilation=false |
| native-ffm | Wasmtime/Cranelift in a Rust cdylib, called via Java's FFM API |

JVM: Oracle GraalVM 25 (25+37-LTS-jvmci-b01), Apple Silicon. JMH 5×1s warmup + 5×2s measurement, 1 fork, single thread.
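To make the warmup/measurement split concrete, here is a hand-rolled sketch of what an average-time measurement does: run the operation in a timed loop, discard the warmup iterations, and average the per-call time over the measurement iterations. It is deliberately not JMH and is no substitute for it (JMH also forks a fresh JVM, guards against dead-code elimination more rigorously, and reports confidence intervals). All names here are hypothetical, and `main` scales the post's durations down 10× so it finishes in about 1.5 s.

```java
import java.util.function.IntSupplier;

class AverageTimeSketch {
    static volatile int sink; // publishing results here keeps the JIT from discarding the work

    /** Calls op repeatedly for at least `nanos` and returns the mean microseconds per call. */
    static double runIteration(IntSupplier op, long nanos) {
        long start = System.nanoTime();
        long calls = 0;
        long elapsed;
        do {
            sink = op.getAsInt();
            calls++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < nanos);
        return (elapsed / 1_000.0) / calls;
    }

    /** Warmup iterations are run and discarded; the score is the mean of the measured ones. */
    static double measure(IntSupplier op, int warmupIters, long warmupNanos,
                          int measureIters, long measureNanos) {
        for (int i = 0; i < warmupIters; i++) runIteration(op, warmupNanos);
        double sum = 0;
        for (int i = 0; i < measureIters; i++) sum += runIteration(op, measureNanos);
        return sum / measureIters;
    }

    public static void main(String[] args) {
        // Stand-in workload; the real harness would call into a wasm backend here.
        IntSupplier work = () -> {
            int acc = 0;
            for (int i = 0; i < 100_000; i++) acc += Integer.bitCount(i);
            return acc;
        };
        // 5 x 100 ms warmup + 5 x 200 ms measurement (the post uses 5 x 1 s and 5 x 2 s).
        double usPerOp = measure(work, 5, 100_000_000L, 5, 200_000_000L);
        System.out.printf("%.3f us/op%n", usPerOp);
    }
}
```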

Results (µs/op, lower is better)

| Backend | Mean | vs Wasmtime |
|---|---|---|
| nativeFfm (Wasmtime/Cranelift via FFM) | 971 ± 10 | 1.00× |
| graalwasm (GraalWasm Truffle JIT) | 1,275 ± 332 | 1.31× |
| chicoryAot (Chicory runtime AOT) | 9,037 ± 118 | 9.31× |
| chicoryAotPlugin (Chicory build-time AOT) | 9,198 ± 131 | 9.47× |
| graalwasmInterp (GraalWasm Truffle no-JIT) | 69,992 ± 1,204 | 72.1× |
| chicory (Chicory pure interpreter) | 240,707 ± 2,560 | 248× |

A few things worth pulling out

GraalWasm JIT is almost native. 1.31× of Wasmtime/Cranelift is genuinely good — I expected a bigger gap given that Truffle goes through partial evaluation while Cranelift goes wasm → CLIF → assembly directly. After warmup, libgraal produces code competitive with Cranelift's output for this workload. The ±25% CI on graalwasm is the only weak number here, probably tier-promotion noise that more forks would smooth out.

Build-time vs runtime AOT in Chicory is a wash. 9,037 vs 9,198 µs/op, CIs overlap. They run identical bytecode — Chicory's compiler produces the same .class content whether invoked at mvn compile or at JVM startup. Choose based on deployment story, not perf.

The calibration trap. graalwasm-interp at 70,000 µs/op is what you get on stock OpenJDK without JVMCI / libgraal. Truffle prints exactly one warning at startup:

>

…and then runs at interpreter speed. If you benchmark GraalWasm on Temurin or Corretto and conclude it's unusable, you're running it without its compiler. The fix on most platforms is to install Oracle GraalVM 25 (or CE) — the Graal compiler ships in the JDK and Truffle picks it up automatically. If you can't change vendor, the "jargraal" path with org.graalvm.compiler:compiler + org.graalvm.truffle:truffle-compiler on --upgrade-module-path and -XX:+EnableJVMCI works but is fiddly.

Pure interpreters aren't benchmarks. 248× slower means Chicory's interpreter isn't a viable production path for non-trivial workloads. It's still the right default for "run untrusted user wasm with a 100 ms budget" sandbox scenarios — instant startup, no codegen step.

Bonus silliness

While I had the harness open: I compiled Cranelift's codegen library itself to wasm32-wasip1, AOT'd that 2.7 MB wasm artifact via chicory-compiler-maven-plugin into a JVM .class file, and used the resulting Chicory-hosted, JVM-resident Cranelift to emit native machine code for all six host triples. Output sizes for an add(i32,i32) -> i32 test function:

| Triple | Object bytes | Format |
|---|---|---|
| aarch64-apple-darwin | 320 | Mach-O |
| aarch64-unknown-linux-gnu | 600 | ELF |
| aarch64-pc-windows-msvc | 126 | COFF |
| x86_64-apple-darwin | 328 | Mach-O |
| x86_64-unknown-linux-gnu | 608 | ELF |
| x86_64-pc-windows-msvc | 130 | COFF |

Six of Cranelift's ~4,000 internal functions exceed the JVM's 64 KB method-size limit and fall back to Chicory's interpreter; the rest AOT cleanly into a single 2.6 MB .class. Not (yet) a wasm-to-CLIF translator inside the sandbox — cranelift-wasm was deprecated at 0.112 and the translator now lives inside Wasmtime, so a real wasm-compiling-wasm pipeline would mean pinning to deprecated 0.112 or hand-rolling it on wasmparser. Separate project.

Caveats

One workload (small JPEG, ~1 ms of native CPU), one platform (Apple Silicon, GraalVM 25), one JMH config. These generalize well for "small to medium pure-compute wasm modules that don't touch WASI on the hot path" but will shift for: large modules (GraalWasm setup cost grows with module size), WASI-heavy workloads (host-call cost differs across runtimes), JIT-cold workloads (you're measuring tier-up, not steady state), and other JVMs (J9, Zing not measured).

Harness

Source: https://github.com/minamoto79/webasm-java-integration-benchmark

Switching backends in the harness is two lines of Kotlin — happy to take PRs adding workloads or runtimes I missed (wasmer-java? wazero-on-JVM via JNI? would love numbers on those if anyone has them). And if you're seeing materially different ratios on a different workload or JDK, please post — would help calibrate where these numbers actually generalize.

u/minamoto108 — 20 days ago