
Conversation

@headius
Member

@headius headius commented Jan 22, 2026

This continues the integration of the Prism parser wrapper libraries from the jruby-prism project by shading them in and including the new WASM support and Chicory libraries.

This is a prototype of how the integration should eventually look, and this PR is intended to iterate on it.

Notable behavior changes so far:

  • If the prism parser is enabled with -Xparser.prism but the native dynamic library is not configured or present, it will fall back to the wasm parser.
  • If the prism parser is enabled with -Xparser.prism.wasm, the wasm parser will be used (see the example invocations after this list).
  • The parser statistics output now prints the actual parser class in use.
  • All dependencies are shaded in and moved to internal packages as appropriate.
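
As a sketch of how the selection flags described above would be used from the command line (the script name here is only illustrative):

# Prefer the native prism library, falling back to the wasm parser if it is unavailable
jruby -Xparser.summary -Xparser.prism script.rb

# Force the wasm-backed prism parser
jruby -Xparser.summary -Xparser.prism.wasm script.rb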

The artifacts this depends on come from:

This integrates jruby-prism 2.0 by shading it into our main JRuby
jar and loading it from there. Incoming packages considered to be
for internal use are moved under `org.jruby.internal`. The output
of parser statistics now shows the actual class of the parser,
since either or both can be activated now.
@headius headius added this to the JRuby 10.1.0.0 milestone Jan 22, 2026
@headius
Member Author

headius commented Jan 22, 2026

Performance of a "gem list" benchmark with the legacy parser versus wasm prism (after sufficient warmup):

legacy:

Parser Statistics:
  Generic:
    parser type: class org.jruby.parser.Parser
    bytes processed: 1451897
    files parsed: 155
    evals parsed: 145
    time spent parsing(s): 0.028649121
    time spend parsing + building: 0.040757128000000004
  IRBuild:
    build time: 0.012108007

wasm prism:

Parser Statistics:
  Generic:
    parser type: class org.jruby.prism.parser.ParserPrismWasm
    bytes processed: 1849537
    files parsed: 155
    evals parsed: 145
    time spent parsing(s): 0.383976125
    time spend parsing + building: 0.39173020000000003
  Prism:
    time C parse+serialize: 0.0
    time deserializing: 0.009623121
    serialized bytes: 1224017
    serialized to source ratio: x0.66179645
  IRBuild:
    build time: 0.007754075

Weirdly, this runs much better with tiered compilation turned off (-J-XX:-TieredCompilation):

Parser Statistics:
  Generic:
    parser type: class org.jruby.prism.parser.ParserPrismWasm
    bytes processed: 1849537
    files parsed: 155
    evals parsed: 145
    time spent parsing(s): 0.07803804
    time spend parsing + building: 0.085966137
  Prism:
    time C parse+serialize: 0.0
    time deserializing: 0.008756323
    serialized bytes: 1224017
    serialized to source ratio: x0.66179645
  IRBuild:
    build time: 0.007928097

@headius
Member Author

headius commented Jan 22, 2026

Benchmark command line:

jruby -J-XX:-TieredCompilation -Xparser.summary -Xparser.prism -e 'loop { t = Time.now; ruby = org.jruby.Ruby.newInstance; ruby.loadService.require("rubygems"); ruby.loadService.require("rubygems/gem_runner"); ruby.evalScriptlet("Gem::GemRunner.new.run [%{list}]"); ruby.tearDown; $stderr.puts Time.now - t }'

Remove -Xparser.prism or the tiered compilation flag to test without them.

@CufeHaco

Try this. It's straight out of my Kernel IPC.

Transport.rb

require 'socket'

# Send an fd through a Unix socket as SCM_RIGHTS ancillary data
socket.sendmsg("x", 0, nil, Socket::AncillaryData.unix_rights(io))

# Receive the fd through the Unix socket
# (scm_rights: true tells Ruby to collect the passed fds and wrap them as IO objects)
msg, sender, rflags, *controls = socket.recvmsg(scm_rights: true)
controls.each do |cmsg|
  if cmsg.level == Socket::SOL_SOCKET && cmsg.type == Socket::SCM_RIGHTS
    received_io = cmsg.unix_rights.first
  end
end

Combine that into a control-message layer for fd passing.

Check to see if

java.nio.channels.UnixDomainSocketChannel.send(buffer)

exists. If it does, try this.

LibC.sendmsg(fd, msg_struct, flags)

The trouble is that the socket send/receive isn't exposed for ancillary data, @headius.
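
For reference, JEP 380 (Java 16+) exposes Unix-domain sockets through the regular SocketChannel/ServerSocketChannel API plus UnixDomainSocketAddress rather than a dedicated UnixDomainSocketChannel class, and it does not expose sendmsg/recvmsg ancillary data, which is exactly the gap noted above. A minimal JRuby sketch of what is available (the socket path is made up):

require 'java'

java_import java.net.StandardProtocolFamily
java_import java.net.UnixDomainSocketAddress
java_import java.nio.channels.SocketChannel
java_import java.nio.ByteBuffer

# Open a Unix-domain stream channel (byte payloads only; no SCM_RIGHTS fd passing)
address = UnixDomainSocketAddress.of("/tmp/example.sock") # hypothetical path
channel = SocketChannel.open(StandardProtocolFamily::UNIX)
channel.connect(address)

# Plain byte data works; ancillary control messages are not available on this API
channel.write(ByteBuffer.wrap("hello".to_java_bytes))
channel.close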

If this is parsed by Prism, it's a ProgramNode contained in a
ParseResult, so just use the ParseResult.
@CufeHaco

Also try out something like this. What I did was use MessagePack binaries as the translator between the two natives.

require 'java'
require 'msgpack'
require 'fcntl'

java_import java.nio.ByteBuffer

def sendmsg(data, flags, dest, controls)
  if controls && controls[1] == :RIGHTS
    # Use the MessagePack framing pattern (controls[2] is expected to be an IO here)
    fd_metadata = extract_fd_metadata(controls[2])
    message = { "data" => data, "control" => { "type" => "RIGHTS", "fd" => fd_metadata } }
    packed = MessagePack.pack(message)
    # Length-prefix and send over the JEP-380 socket channel
    # (getChannel() is assumed to return the underlying java.nio channel)
    getChannel().write(ByteBuffer.wrap(([packed.bytesize].pack('N') + packed).to_java_bytes))
  else
    # Regular send
    getChannel().write(ByteBuffer.wrap(data.to_java_bytes))
  end
end

def recvmsg(maxlen, flags = 0)
  # Read the 4-byte length prefix
  header = ByteBuffer.allocate(4)
  getChannel().read(header)
  header.flip
  length = header.getInt

  # Read exactly that many bytes of MessagePack data
  buffer = ByteBuffer.allocate(length)
  getChannel().read(buffer)
  buffer.flip

  # Unpack and reconstruct (MessagePack round-trips hash keys as strings)
  packed_bytes = String.from_java_bytes(buffer.array)
  message = MessagePack.unpack(packed_bytes)
  sender = nil # no peer address at this framing layer

  if message["control"] && message["control"]["type"] == "RIGHTS"
    # Reconstruct an IO from the metadata rather than passing a raw fd
    io = reconstruct_io_from_metadata(message["control"]["fd"])
    [message["data"], sender, flags, [:SOCKET, :RIGHTS, io]]
  else
    [message["data"], sender, flags]
  end
end

def extract_fd_metadata(io)
  {
    "path" => io.path,
    "position" => io.pos,
    "mode" => io.fcntl(Fcntl::F_GETFL)
  }
end

def reconstruct_io_from_metadata(metadata)
  io = File.open(metadata["path"], metadata["mode"])
  io.seek(metadata["position"])
  io
end

The key is the 4-byte length-prefix framing pattern. That's what makes the binary protocol work reliably over the socket: both sides read the length first, then exactly that many bytes.
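
A minimal, self-contained sketch of just that framing pattern over a plain Unix socket pair (the payload is only illustrative):

require 'socket'

reader, writer = UNIXSocket.pair

# Sender: 4-byte big-endian length prefix, then exactly that many payload bytes
payload = "example payload"
writer.write([payload.bytesize].pack('N') + payload)

# Receiver: read the 4-byte length first, then read exactly that many bytes
length = reader.read(4).unpack1('N')
message = reader.read(length)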

@CufeHaco

@headius You're on the right track with WASM portability, but the benchmark reveals something interesting.
You're measuring Ruby.newInstance → parse → tearDown in a loop. That ~50ms difference between WASM (~78ms) and native (~28ms) isn't just parse time; it's runtime lifecycle overhead compounded by parser switching.
What's missing: Process isolation with IPC.

Instead of:

loop {
  ruby = Ruby.newInstance  # Heavy
  ruby.parse(source)
  ruby.tearDown           # Heavy
}

You get:

parser_process = spawn_persistent_parser  # Once
loop {
  send_source_via_socket(source)
  ast = receive_ast_via_fd
}

This is how my kernel IPC coordinates cross-process operations. The control message layer (SCM_RIGHTS) is the missing piece for passing data efficiently between isolated runtimes.
WASM gets you portability. IPC gets you performance and enables MRI↔JRuby parser coordination.
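
A rough sketch of that "spawn once, reuse" shape, with a thread standing in for the external parser process (only to keep the sketch self-contained) and a placeholder where a real parse + serialize step would go:

require 'socket'

client, server = UNIXSocket.pair

# Persistent worker: started once, then services many length-prefixed requests
worker = Thread.new do
  while (header = server.read(4))
    source = server.read(header.unpack1('N'))
    result = "parsed #{source.bytesize} bytes" # placeholder for a real parse + serialize step
    server.write([result.bytesize].pack('N') + result)
  end
end

3.times do
  source = "puts 1 + 1"
  client.write([source.bytesize].pack('N') + source)
  reply = client.read(client.read(4).unpack1('N'))
end

client.close
worker.join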

I didn't want you scratching your head wondering why I gave you the code I did.

@enebo
Member

enebo commented Jan 23, 2026

@CufeHaco He does not really care too much about how to do this, because it is so artificial it is not useful in the real world. He is looking at how well Chicory will warm up doing the same task over and over. It is more a test of how fast Chicory can get with Prism's WASM build.

Your general idea of a persistent parser which uses IPC is a big idea, but we have gone down the path of external processes before (albeit differently: drip and the other one). It is not the same, but a lesson learned was that managing an external service is a pain. It also tended to create confusion, because something is happening transparently that people are not expecting.

That is not to say it is a bad idea. It really could give amazing parsing performance, but if you look at native prism performance, parsing time is a small percentage of overall execution time. I tend to think a bigger priority for parsing would be to get native prism builds onto installations (which could be precompiled with the dist, built by a post-install hook, or prompted for with a nag warning on install).

@CufeHaco
Copy link

@enebo I do apologize that I don't come across clearly. Don't look at what the thing is; I'm referring to how it's structured. Look at the how. I'm using the JEP-380 structure because it's what I have that shows the cleanest mechanics.

I'm sure Prism has its own bytecode we can build byte arrays with? If so, what I'm proposing would be just like the builtins, just for bytecode interop. In the IPC, I'm just substituting bytecode for MessagePack binaries for the arrays.

@CufeHaco

I do also just want to add that I'm still learning Java and the JVM. I'm still relying on you and Charles to point me in the right direction. I'm trying to use established abstraction placeholders so I can understand and communicate better, @enebo. For best results, focus me on one thing that you need me to hyper-focus on. I learn the stack as I go.
