Use FastDoubleParser where appropriate #9150
CRuby 4.0 numbers for comparison:
@headius So faster than CRuby?
Have you tried whether performance improves even more when JavaDoubleParser accesses the byte array contained in the ByteList? Like this:
I think you can get some speed improvement and pass all tests in JRuby if you revert the changes in class ConvertDouble and change only the last line of method completeCalculation() from this: to this:
This hooks up the Java implementation of Daniel Lemire's fast float parsing algorithm to our internal float parsing logic, excluding cases that are not 7-bit ASCII or which contain underscore characters (not currently allowed by FDP, see wrandelshofer/FastDoubleParser#85 for an attempt to add that feature).
3628c5d to a402496
It doesn't make much difference based on my measurements (ByteList implements CharSequence by just accessing the byte array).
That does indeed avoid the failures, but it isn't as fast as my patch.
My patch:
Your patch:
It's quite a bit better than the original, though!
No patch:
It's possible my patch is faster because it's not handling all those other forms that we need for Ruby support.
@wrandelshofer I really want to figure out how to use your library in JRuby, but Ruby has so many oddities in float parsing.
For just parsing things like "45.6 degrees" by terminating parsing at the first unexpected character:
parsing "NaN", "Infinity", and "-Infinity"
It's pretty close at this point.
The Ruby behavior is to treat strings like "NaN" as unparseable and return 0.0:
I can configure this behavior:
One down!
* Allow underscore as a group separator (essentially ignoring it)
* Treat "NaN", "Infinity", and "+Infinity" as unparsable (0.0)
1470290 to 14a3511
The remaining failures are all due to FDP rejecting any trailing non-float characters and returning 0.0 for such cases. @wrandelshofer I don't see any way to configure this, and it was not clear to me where in the code it decides to bail out for bad input.
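For illustration, the lenient behavior described here (stop at the first unexpected character, and return 0.0 when nothing parses) can be sketched in plain Java. This is only a sketch under assumptions: the class `LenientFloatPrefix`, the method `parsePrefix`, and the simplified regex grammar are hypothetical, `Double.parseDouble` stands in for the real parser, and none of it is JRuby's or FastDoubleParser's actual code. Underscores and other Ruby-specific forms are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LenientFloatPrefix {
    // Simplified grammar (an assumption, not Ruby's full rules):
    // optional leading whitespace, optional sign, digits,
    // optional fraction, optional exponent.
    private static final Pattern FLOAT_PREFIX =
        Pattern.compile("\\s*([+-]?\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?)");

    // Parse the longest leading float-like prefix; return 0.0 if there is none.
    static double parsePrefix(String s) {
        Matcher m = FLOAT_PREFIX.matcher(s);
        if (!m.lookingAt()) {
            return 0.0; // nothing numeric at the start, e.g. "NaN" or "Infinity"
        }
        // Stand-in for the fast parser: only the matched prefix is handed over,
        // so trailing characters never reach it.
        return Double.parseDouble(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(parsePrefix("45.6 degrees")); // 45.6
        System.out.println(parsePrefix("NaN"));          // 0.0
        System.out.println(parsePrefix("-Infinity"));    // 0.0
    }
}
```

Scanning for the prefix first costs an extra pass over the input, so a real integration would more likely want the parser itself to report where it stopped.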
The purpose of this change is to get reliable performance from the parser regardless of the JIT.
Yes. There is currently no API for this in the fast double parser library. |
@wrandelshofer Thank you! |
This hooks up the FastDoubleParser project to our internal float parsing logic, excluding cases that are not 7-bit ASCII or which contain underscore characters (not currently allowed by FDP, see wrandelshofer/FastDoubleParser#85 for an attempt to add that feature).
FastDoubleParser is the Java implementation by @wrandelshofer of Daniel Lemire's fast float parsing algorithm. See the project page here: https://github.com/wrandelshofer/FastDoubleParser
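The exclusion rule above (skip the fast path for input that is not 7-bit ASCII or that contains underscores) amounts to a single pass over the bytes. The following is a minimal sketch of such a check, not the code in this PR: the class `FastPathGuard` and the method `eligibleForFastPath` are hypothetical names, and a raw `byte[]` with a start and length stands in for whatever buffer the real parser receives.

```java
import java.nio.charset.StandardCharsets;

public class FastPathGuard {
    // Hypothetical helper: true only if every byte in the range is 7-bit ASCII
    // and none of them is an underscore.
    static boolean eligibleForFastPath(byte[] bytes, int start, int length) {
        for (int i = start; i < start + length; i++) {
            byte b = bytes[i];
            // Java bytes are signed, so any value with the high bit set is negative.
            if (b < 0 || b == '_') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] plain = "12.5e3".getBytes(StandardCharsets.UTF_8);
        byte[] grouped = "1_000.5".getBytes(StandardCharsets.UTF_8);
        byte[] wide = "\uFF11\uFF12".getBytes(StandardCharsets.UTF_8); // fullwidth digits
        System.out.println(eligibleForFastPath(plain, 0, plain.length));     // true
        System.out.println(eligibleForFastPath(grouped, 0, grouped.length)); // false
        System.out.println(eligibleForFastPath(wide, 0, wide.length));       // false
    }
}
```

Input that fails the check would fall back to the existing parser, so the fast path only ever sees the forms it is known to handle.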
This does not yet pass all float-parsing specs, primarily because it does not reject some numeric forms that Ruby's current parser rejects.
See ruby/ruby#15655 for a similar effort to add the Eisel-Lemire algorithm variant to CRuby, described in a blog post by @mensfeld here: https://mensfeld.pl/2025/12/ruby-string-to-float-optimization/.
Benchmarks show float parsing is significantly faster than with the previous implementation in JRuby and much faster than in CRuby 4.0 (without @mensfeld's improvements).
BEFORE:
AFTER: