FTA: The loop example iterates 1 billion times, utilizing a nested loop:
u = ARGV[0].to_i
r = rand(10_000)
a = Array.new(10_000, 0)
(0...10_000).each do |i|
  (0...100_000).each do |j|
    a[i] += j % u
  end
  a[i] += r
end
puts a[r]
Weird benchmark. Hand-optimized, I guess this benchmark will spend over 99% of its time in the first two lines. If you do liveness analysis on array elements you’ll discover that it is possible to remove the entire outer loop, turning the program into:
u = ARGV[0].to_i
r = rand(10_000)
a = 0
(0...100_000).each do |j|
  a += j % u
end
a += r
puts a
Are there compilers that do this kind of analysis? Even though u isn’t known at compile time, that inner loop can be replaced by a few instructions, too, but that’s a more standard optimization that, I suspect, the likes of clang may be close to making.
Compilers don't do liveness analysis on individual array elements. It's too much data to keep track of and would probably only be useful in incorrect code like this.
I used to work on an AI compiler where liveness analysis of individual tensor elements actually would have been useful. We still didn't do it because the compilation time/memory requirements would be insane.
TruffleRuby could replace this with an O(1) operation, even when it’s part of a C extension.
I think most compilers could do that. That's a separate much easier optimisation.
Closed form that works for most cases:
result = (u * (u - 1) / 2) * (100000 / u) + (100000 % u) * (100000 % u - 1) / 2 + r
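For the skeptical, here's a quick sanity check of that formula against the inner loop (a sketch I'm adding, not from the thread): each full cycle of j % u sums to u*(u-1)/2, there are 100000/u full cycles, and the leftover 100000%u terms sum to (100000%u)*(100000%u - 1)/2.

  # Sketch: check the closed form against the inner loop for a few divisors.
  N = 100_000

  def loop_sum(u, n)
    (0...n).sum { |j| j % u }
  end

  def closed_form(u, n)
    full, rest = n.divmod(u)
    (u * (u - 1) / 2) * full + rest * (rest - 1) / 2
  end

  [1, 7, 40, 9_999].each do |u|
    raise "mismatch for u=#{u}" unless loop_sum(u, N) == closed_form(u, N)
  end
  puts "closed form matches the loop for the sampled divisors"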
The article refers to upcoming versions of Ruby. For the curious, looks[1] like ruby 3.4.0 will be released this Christmas, and ruby 3.5.0 next Christmas.
Also, I'm wondering what effect Python's minimal JIT [2] has coming for this type of loop. Python 3.13 needs to be built with the JIT enabled, so it would be interesting if someone who has built it runs the benchmarks.
[1] https://www.ruby-lang.org/en/downloads/releases/
[2] https://drew.silcock.dev/blog/everything-you-need-to-know-ab...
Ruby is always released on Christmas, it's a predictable and cute schedule.
But perf improvements can and do drop in point releases too, afair.
> There was a PR to improve the performance of `Integer#succ` in early 2024, which helped me understand why anyone would ever use it: “We use `Integer#succ` when we rewrite loop methods in Ruby (e.g. `Integer#times` and `Array#each`) because `opt_succ (i = i.succ)` is faster to dispatch on the interpreter than `putobject 1; opt_plus (i += 1)`.”
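If you want to see the dispatch difference that PR description is talking about, CRuby can disassemble small snippets for you; on builds with the specialized instructions you should see `opt_succ` in the first listing and `putobject 1` / `opt_plus` in the second (a quick sketch, not from the article):

  # Sketch: compare the YARV bytecode for the two increment styles.
  # Expect opt_succ in the first listing and putobject 1 / opt_plus in the
  # second, on CRuby builds with these specialized instructions enabled.
  puts RubyVM::InstructionSequence.compile("i = 0; i = i.succ").disasm
  puts RubyVM::InstructionSequence.compile("i = 0; i += 1").disasm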
I find myself using `#succ` most often for readability reasons, not just for performance. Here's an example where I use it twice in my UUID library's `#bytes` method to keep my brain in “bit slicing mode” when reading the code. I need to loop 16 times (`0xF.succ`) and then within that loop divide things by 256 (`0xFF.succ`): https://github.com/okeeblow/DistorteD/blob/ba48d10/Globe%20G...
Why do you find 0xF.succ better than 0x10 in this case?
Because of how I'm used to thinking of the internal 128-bit UUID/GUID value as a whole:
irb> 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFF.bit_length => 128
... 0 to 127 < 128
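To make the `0xF.succ` / `0xFF.succ` idea concrete, here's a stripped-down sketch of the byte-slicing pattern (mine, not the actual DistorteD `#bytes` implementation):

  # Sketch: slice a 128-bit integer into 16 bytes, most-significant first,
  # staying in "bit slicing mode": 0xF.succ iterations, dividing by 0xFF.succ.
  value = 0x0123456789ABCDEF_0123456789ABCDEF
  bytes = []
  0xF.succ.times do
    value, low = value.divmod(0xFF.succ)   # divide by 256, keep the low byte
    bytes.unshift(low)
  end
  puts bytes.map { |b| format("%02X", b) }.join(" ")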
After all these years, I still love Ruby. Thank you Matz!
Super interesting. Actually I am also a contributor to https://github.com/bddicken/languages, and after I had tried to create a Lua approach I started to think of TruffleRuby, as it was mentioned somewhere. But unfortunately, when I ran main.rb there was virtually no significant difference between TruffleRuby and normal Ruby (sometimes normal Ruby was faster than TruffleRuby).
I am not sure if the benchmark numbers you provided showing the speed of TruffleRuby were taken after the changes that you made.
I would really appreciate it if I could verify the benchmark,
and maybe try to add it to the main https://github.com/bddicken/languages as a commit as well, because the TruffleRuby implementation actually being faster than Node.js and coming close to Bun or even Go is nuts.
This was a fun post to skim through, definitely bookmarking it.
With TruffleRuby you'll need to account for startup time and time to max. performance which vary with the native and JVM runtime configurations. See https://github.com/oracle/truffleruby
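One cheap way to see that warm-up effect (a generic sketch, nothing TruffleRuby-specific) is to time the same workload repeatedly inside one process; on a JIT runtime the early iterations should be noticeably slower than the steady state:

  require "benchmark"

  # Sketch: repeat the same workload in one process so JIT warm-up is visible.
  def workload(u)
    (0...100_000).sum { |j| j % u }
  end

  10.times do |i|
    t = Benchmark.realtime { 100.times { workload(40) } }
    puts format("iteration %2d: %.3fs", i, t)
  end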
Woah, Ruby has become fast, like really fast. What's even more impressive is TruffleRuby, damn!
It's Oracle https://github.com/oracle/truffleruby Double Damn!
It's open source under Eclipse Public License version 2.0, GNU General Public License version 2, or GNU Lesser General Public License version 2.1.
Making it easily fork-able should Oracle choose to do something users dislike.
Holy! I know TruffleRuby is open source but I somehow always thought Graal (which TruffleRuby is based on) wasn't open source.
Note that Rails doesn't work on Truffle and from what I understand, won't anytime soon.
Which is disappointing since it has the highest likelihood of making the biggest impact to Ruby perf.
Huh, what exactly doesn't work? Their own readme says "TruffleRuby runs Rails and is compatible with many gems, including C extensions." (https://github.com/oracle/truffleruby)
Truffle:
TruffleRuby is not 100% compatible with MRI 3.2 yet
Rails: Rails 8 will require Ruby 3.2.0 or newer
https://github.com/oracle/truffleruby
That doesn't mean Rails won't run on TruffleRuby. TruffleRuby may not implement 100% of MRI 3.2, but that doesn't mean it doesn't implement all the parts that Rails needs.
Is it possible that those two statements taken together means truffleruby can run rails 8?
Super interesting. I didn't know that YJIT was written in Rust.
It was initially written in C and then ported to Rust[0], which seems like it was a good idea. The downside is that it may not be enabled at build time if you don't have the right toolchain/platform, but that seems a good trade-off.
0: https://shopify.engineering/porting-yjit-ruby-compiler-to-ru...
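If you're not sure whether your build has it, you can check at runtime; `--yjit` turns it on, and the `RubyVM::YJIT` constant is only defined when support was compiled in (a small sketch):

  # Sketch: report whether this Ruby was built with YJIT and whether it is active.
  # Run with: ruby --yjit <this file>
  if defined?(RubyVM::YJIT)
    puts "YJIT built in, currently enabled: #{RubyVM::YJIT.enabled?}"
  else
    puts "this Ruby was built without YJIT support"
  end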
Another language comparison repo that's been going for longer with more languages https://github.com/niklas-heer/speed-comparison.
Another language comparison repo with hard-to-read presentation.
The chart axis labels and bar labels overlap each other, and there are no vertical grid lines.
Oh for a simple HTML table!
> Python was the slowest language in the benchmark, and yet at the same time it’s the most used language on Github as of October 2024.
Interesting that there seems to be a correlation between a language being slow and it being popular.
Now do it again, but include compile time and amortise across the number of executions expected for that specific build.
I say this as a pretty deep rust fanatic. All languages (and runtimes, interpreters, and compilers) are tools. Different problems and approaches to solving them benefit from having a good set at your disposal.
If you're building something that may only run a handful of times (which a lot of python, R, et al programs include) slow execution doesn't matter.
it's like food, people like it way more when you put sugar on top
by and large, Ruby is slow, but damn is it nice to code with, which is more appealing for newcomers
I think, for being an interpreted language, Ruby is quite fast now.
Because now a JIT is part of the picture, as it should be in any dynamic language that isn't only meant for basic scripting tasks.
Ruby was always faster than people gave it credit for.
Not really.
Work has been done to make faster Ruby language implementations.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
of course sorry, you're right, what I meant is that it's rather slow _in the grand scheme of things_
Slower languages are higher level thus easier to use.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
Program performance is associated with the specific programming language implementation and the specific program implementation.
No one takes that site too seriously when judging real world programming, for a variety of reasons.
Do you really not understand that the exact same Java programs are likely to be 10x slower without JIT?
Different language implementation, so different performance.
> too seriously
Take those measurements just seriously enough.
Does that correlation hold if you look at let's say the top 20 popular languages?
No, because Java is #2, C++ is #4, C# is #7. People just really like Python for what it brings to the table.
Game changing for my advent of code solutions which look surprisingly similar
I'm a little surprised that Node is beating Deno. Interesting that Java would be faster than Kotlin since both run on jvm.
“Faster”.
> Ran each three times and used the lowest timing for each. Timings taken on an M3 Macbook pro with 16 gb RAM using the /usr/bin/time command. Input value of 40 given to each.
Not even using JMH. I highly doubt the accuracy of the “benchmark”.
That is one of the differences between a platform systems language, and guest languages.
You only have to check the additional bytecode that gets generated, to work around the features not natively supported.
Which difference? It is literally the same code, it doesn’t even use any Kotlin std goodies.
Yet, they don't generate the same bytecode, and that matters.
I mean, the JVM's been optimized specifically for Java since the Bronze Ages at this point, it's not that surprising
Although being slow, Python has a saving grace: it doesn't have a huge virtual machine like Java, so it can in many situations provide a better experience.
Does JavaME have a "huge virtual machine" ?
https://www.oracle.com/java/technologies/javameoverview.html
Do you mean CPython or PyPy or MicroPython or ?
> Does JavaME have a "huge virtual machine"
Yes, compared to Python.
> Do you mean CPython or PyPy
Python's standard virtual machine is called CPython; just look at the official web page.
I imagine we need a nuts and bolts definition of "virtual machine" before we can make a comparison.
>This got me thinking that it would be interesting to see a kind of “YJIT standard library” emerge, where core ruby functionality run in C could be swapped out for Ruby implementations for use by people using YJIT.
This actually makes me feel sad because it reminded me of Chris Seaton. The idea isn't new and Chris has been promoting it during his time working on TruffleRuby. I think the idea goes back even further to Rubinius.
It is also nice to see TruffleRuby being very fast and YJIT still has lots of headroom to grow. I remember one obstacle with it running rails was memory usage. I wonder if that is still the case.
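As a toy illustration of the quoted idea (my sketch, not any actual proposal or TruffleRuby code): a core-style method written in plain Ruby that a Ruby-level JIT can see into and specialize, unlike an opaque C builtin.

  # Toy sketch of "core functionality in Ruby instead of C": a pure-Ruby
  # stand-in for a builtin reduction. A Ruby-level JIT can inline and
  # specialize this loop, whereas a C implementation is a black box to it.
  module PureRubyCore
    def self.sum_ints(array)
      total = 0
      i = 0
      while i < array.length
        total += array[i]
        i += 1
      end
      total
    end
  end

  puts PureRubyCore.sum_ints((1..10).to_a)   # => 55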
One of the amazing things TruffleRuby does is handle C extensions like Ruby code, meaning the C is interpreted rather than compiled in the traditional sense.
This opens the way to JIT-compiling the C code so it runs way faster than it would as the author wrote it.
Amazing indeed!
Yup, Rubinius was probably the most widely known implementation of Ruby's standard library in Ruby. Too bad it was slower than MRI.
I think JRuby takes a similar approach.
It’s possible to write gems which will use underlying C on MRI or Java when running on JRuby.
It would be interesting to know if a “pure Ruby” approach would also help JRuby too.
I thought maybe mruby had a mostly Ruby stdlib - but I guess it's C ported over from MRI?
"In most ways, these types of benchmarks are meaningless. Python was the slowest language in the benchmark, and yet at the same time it’s the most used language on Github as of October 2024."
First, this indicates some sort of deep confusion about the purpose of benchmarks in the first place. Benchmarks are performance tests, not popularity tests. And I don't think I'm just jumping on a bit of bad wording, because I see this idea in its various forms a lot poking out in a lot of conversations. Python is popular because there are many aspects to it, among which is the fact that yes, it really is a rather slow language, but the positives outweigh it for many purposes. They don't cancel it. Python's other positive aspects do not speed it up; indeed, they're actually critically tied to why it is slow in the first place. If they were not, Python would not be slow. It has had a lot of work done on it over the years, after all.
Secondly, I think people sort of chant "microbenchmarks are useless", but they aren't useless. I find that microbenchmark actually represents some fairly realistic representation of the relative performance of those various languages. What they are not is totally determinative. You can't divide one language's microbenchmark on this test by another to get a "Python is 160x slower than C". This is, in fact, not an accurate assessment; if you want a single unified number, 40-50 is much closer. But "useless" is way too strong. No language is so wonderful on all other dimensions that it can have something as basic as a function call be dozens of times slower than some other language and yet keep up with that other language in general. (Assuming both languages have had production-quality optimizations applied to them and one of them isn't some very very young language.) It is a real fact about these languages, it is not a huge outlier, and it is a problem I've encountered in real codebases before when I needed to literally optimize out function calls in a dynamic scripting language to speed up certain code to acceptable levels, because function calls in dynamic scripting languages really are expensive in a way that really can matter. It shouldn't be overestimated and used to derive silly "x times faster/slower" values, but at the same time, if you're dismissing these sorts of things, you're throwing away real data. There are no languages that are just as fast as C, except gee golly they just happen to have this one thing where function calls are 1000 times slower for no reason even though everything else is C-speed. These performance differences are reasonably correlated.
> First, this indicates some sort of deep confusion about the purpose of benchmarks in the first place. Benchmarks are performance tests, not popularity tests.
I don't think it indicates a deep confusion. I think it leaves a simple point unsaid because it's so strongly implied (related to what you say):
Python may be very low in benchmarks, but clearly it has acceptable performance for a very large subset of applications. As a result, a whole lot of us can ignore the benchmarks.
Even in domains where one would have shuddered at this before. My students are launching a satellite into low earth orbit that has its primary flight computer running python. Yes, sometimes this does waste a few hundred milliseconds and it wastes several milliwatts on average. But even in the constrained environment of a tiny microcontroller in low earth orbit, language performance doesn't really matter to us.
We wouldn't pay any kind of cost (financial or giving up any features) to make it 10x better.
I wouldn't jump on it except for the number of times I've been discussing this online and people completely seriously counter "Python is a fairly slow language" with "But it's popular!"
Fuzzy one-dimensional thinking that classifies languages on a "good" and "bad" axis is quite endemic in this industry. And for those people, you can counter "X is slow" with "X has good library support", and disprove "X lacks good tooling" with "But X has a good type system", because all they hear is that you said something is "good" but they have a reason why it's "bad", or vice versa.
Keep an eye out for it.
"My students" - so there's really nothing on the line except a grade then, yeah? That's why you wouldn't pay any cost to make it 10x better, because there's no catastrophic consequence if it fails. But sometimes wasting a few milliwatts on average is the difference between success and failure.
I've built an autonomous drone using Matlab. It worked but it was a research project, so when it came down to making the thing real and putting our reputation on the line, we couldn't keep going down that route -- we couldn't afford the interpreter overhead, the GC pauses, and all the other nonsense. That aircraft was designed to be as efficient as possible, so we could literally measure the inefficiency from the choice of language in terms of how much it cost in extra battery weight and therefore decreased range.
If you can afford that, great, you have the freedom to run your satellite in whatever language. If not, then yeah you're going to choose a different language if it means extra performance, more runtime, greater range, etc.
> "My students" - so there's really nothing on the line except a grade then, yeah? That's why you wouldn't pay any cost to make it 10x better, because there's no catastrophic consequence if it fails. But sometimes wasting a few milliwatts on average is the difference between success and failure.
Years of effort from a large team is worth something, as is the tens of thousands of dollars we're spending. We expect a return on that investment of data and mission success. We're spending a lot of money to improve odds of success.
But even in this power constrained application, a few milliwatts is nothing. (Nearly half the time, it's literally nothing, because we'd have to use power to run heaters anyways. Most of the rest of the time, we're in the sun, so there's a lot of power around, too). The marginal benefit to saving a milliwatt is zero, so unless the marginal cost is also zero we're not doing it.
> That aircraft was designed to be as efficient as possible, so we could literally measure the inefficiency from the choice of language in terms of how much it cost in extra battery weight and therefore decreased range
If this is a rotorcraft of some sort, that seems silly. It's hard to waste enough power to be more than rounding error compared to what large brushless motors take.
If you have enough power from the sun and enough compute, are you really that resource constrained?
Let me ask you, why do you think most real-time mission critical projects are not typically done in Python?
> If this is a rotorcraft of some sort, that seems silly. It's hard to waste enough power to be more than rounding error compared to what large brushless motors take.
It was a glider trying to fly as long as possible, so no motors, no solar power either. It got to the point that we could not even execute the motion planner fast enough in Matlab given the performance demands of the craft, we had to resort to Mex, and at that point we might as well have been writing in C. Which we did.
otoh When performance doesn't matter, it doesn't matter.
otoh When the title is "Speeding up Ruby" we are kind-of presuming it matters.
> My students are launching a satellite into low earth orbit that has its primary flight computer running python. Yes, sometimes this does waste a few hundred milliseconds
Never mind performance, would it not be good to at least machine check some static properties? A dynamic language is not a good choice for anything mission critical IMHO.
Python has had Mypy and Pyright since forever.
Even with those retrofits, it's still a language designed for maximum flexibility and maximum ease of use. This has trade offs with regard to reasoning for correctness.
What’s your point? That their type checks are incomplete?
That Python makes the wrong trade-offs for mission critical software. This goes beyond just lacking static types.
Which trade offs do you make when you opt in for full static typing? Except for performance.
It does help that the Python ecosystem sees C and Fortran as being "Python".
> people sort of chant "microbenchmarks are useless", but they aren't useless.
They might be !
(They aren't necessarily useless. It depends. It depends what one is looking for. It depends etc etc)
> You can't divide one language's microbenchmark on this test by another to get a "Python is 160x slower than C".
Sure you can !
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
— and —
Table 4, page 139
https://dl.acm.org/doi/pdf/10.1145/3687997.3695638
— and then one has — "[A] Python is 160x slower than C" not "[THE] Python is 160x slower than C".
Something multiple and tentative not something singular and definitive.
> Benchmarks are performance tests, not popularity tests.
But presumably they're meant to test something that matters. And the popularity suggests that what's being tested in this case doesn't.
> But "useless" is way too strong. No language is so wonderful on all other dimensions that it can have something as basic as a function call be dozens of times slower than some other language and yet keep up with that other language in general.
And yet Python does keep up with C in general. You might object that when a Python-based system outperforms a C-based system it's not running the same algorithm, or it's not really Python, and that would be technically true, but seemingly not in a way that matters.
> if you're dismissing these sorts of things, you're throwing away real data
Everything is data. The most important part of programming is often ignoring the things that aren't important.
very true.
Also, for a lot of the areas where languages like python or ruby aren't great choices because of performance, they would also not be great choices because of the cost of maintaining untyped code, or in python's case the cost of maintaining code in a language that keeps making breaking changes in minor versions.
Script with scripting languages, build other things in other languages
It seems odd to willfully ignore the Crystal language when discussing Ruby and speeding it up. Granted, macro semantics mean something else, more like C macros, but the general syntax and flow of Crystal is basically Ruby. https://crystal-lang.org/
Amber and Lucky are 2 mature frameworks to give Rails a run for their money, and Kemal is your Sinatra.
Crystal is not Ruby. Full stop. It is not useful to anyone with an existing Ruby code base.
Mentioning Crystal would be odd since it has nothing to do with the article.
Will these Crystal frameworks allow me to share a single standalone binary with peers that allows them run the web application locally?
As per this article[0] seems that crystal produces statically linked binaries, so I think the answer is yes.
[0] https://crystal-lang.org/2020/02/02/alpine-based-docker-imag...
Woah, what a luxury
It seems like it's been a while since I've seen one of these language benchmark things.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/... seems like the latest iteration of what used to be a pretty popular one, now with fewer languages and more self-deprecation.
> fewer languages
Maybe you've only noticed the dozen in-your-face on the home page?
The charts have shown ~27 for a decade or so.
There's another half-dozen more in the site map.
> a fun visualization of each language’s performance
The effect is similar to dragging a string past a cat: complete distraction — unable to avoid focusing on the movement — unable to extract any information from the movement.
To understand the measurements, cover the "fun visualization" and read the numbers in the single column data table.
(Unfortunately we aren't able to scan down the column of numbers, because the language implementation name is shown first.)
Previously: <blink>
https://developer.mozilla.org/en-US/docs/Glossary/blink_elem...
It does visualise how big the difference is though
Cover up the single column of lang/secs and then try to read how big the difference is between java and php from the moving circles.
You would have no problem doing that with a bar chart.
Cover the labels on the histogram and try to read how big the difference is between java and php....
We can read the relative difference from the length of the bars because the bars are stable.
I can see the relative difference in speed between the two balls.
"The first principle is that you must not fool yourself and you are the easiest person to fool."
:-)
PHP looks much slower
The question is: How much slower?
We could try to count how many times the java circle crosses left-to-right and right-to-left, in the time it takes for the PHP circle to cross left-to-right once.
That's error prone but should be approximately correct after a couple of attempts.
That's work we're forced to do because the "fun visualization" is uninformative.
That might be your question, but then you can look at the numbers. No chart will be as exact.
If only we could look at the numbers without the uninformative distraction.
I found the animation informative
> I found the animation informative
Java was so fast it glowed orange!
I wonder if the distraction of the animation actually makes people slower at reading the information that is in the text column.
The animation serves its purpose -- it grabs attention.
Dart - I see it mentioned (and perf looks impressive), but is it widely adopted?
Also, would have loved to see LuaJIT (interpreted lang) & Crystal (static Ruby-like language) included just for comparison's sake.
It looks like a more complete breakdown is here. Crystal ranks just below Dart at 0.5413 (Dart was 0.5295). Luajit was 0.8056. I'm surprised Luajit does worse than Dart. Actually I am surprised Dart is beating out languages like C# too.
Dart's VM was designed by the team (I think not just the one guy, but maybe I'm wrong on that and it really is just Lars Bak) that designed most of the truly notable VMs that have ever existed: Self, the Strongtalk Smalltalk VM, Java HotSpot, and JavaScript's V8. It also features an ahead-of-time compiler mode in addition to a world-class JIT and interpreter, allowing for hot reload during development.
https://en.m.wikipedia.org/wiki/Lars_Bak_(computer_programme...
It was stuck with a bad rep for being the language that was never going to replace JavaScript in the browser, and then was merely a transpiler no one was going to use, before it found a new life as the language for Flutter, which has driven a lot of its syntax and semantics improvements since, with built-in VM support for extremely efficient object templating (used by the reactive UI framework).
Maybe that dozen lines of code isn't sufficient to characterize performance differences?
Nearly 25 years ago, nested loops and fibs.
https://web.archive.org/web/20010424150558/http://www.bagley...
https://web.archive.org/web/20010124092800/http://www.bagley...
It's been a long time since the benchmarks game showed those.
This nested loops microbenchmark only measures in-loop integer division optimizations on ARM64 - there are ARM64-specific division fault handling differences that introduce significant variance between compilers of comparable capability.
On x86_64 I expect the numbers would have been much closer and within measurement error. The top half is within 0.5-0.59s - there really isn't much you can do inside such a loop, almost nothing happens there.
As Isaac pointed out in a sibling comment - it's best to pick specific microbenchmarks, a selection of languages and implementations that interest you and dissect those - it will tell you much more.
Runtime startup isn't amortized.
How do you know?
The methodology is documented in the link of the comment I responded to.
Perhaps you mean that "the methodology" does not include an explicit step intended to amortize runtime startup.
Perhaps the tiny tiny programs none-the-less took enough time that startup was amortized.
I wonder why C++ isn't in that list but a bunch of languages no one uses are.
Been using pure Dart since last year, it's a lovely language that has its quirks. I like it.
It's fast and flexible.
Have you used it for anything other than Flutter? I recently did a Flutter project and I'm interested in using dart more now.
Yes, that's what I meant with pure Dart. I've created cli's with it and a little api-only server.
This kind of benchmark doesn't make sense for Python because it measures the speed of pure code written in the language. However, and here is the important point, most Python code relies on compiled libraries to run fast. The heavy lifting in ML code is done in C, and Python is used only as a glue language. Even for web development this is also the case: Python is mostly calling a bunch of libraries, many of them written in C.
That's not true. Sure, many hot-path functions dealing with tensor calculations are done in NumPy, but ETL and args/results are Python objects and functions. And most web development libs are pure Python (Flask, Django, etc.).
For performance, hot paths are the only ones that matter.
Sure, but only a small subset of problems have a hot path. You can easily offload huge tensor operations to C. That's the best possible case. More usually the "hot path" is fairly evenly distributed through your entire codebase. If you offload the hot path to C you'll end up rewriting the whole thing in C.
> "hot path" is fairly evenly distributed
No, hot paths are seldom fairly evenly distributed, even on non-numeric applications. In most cases they will be in a small number of locations.
Not in my experience.
Yeah, this is a benchmark of recursion and tight loops doing integer math on array members. Nontrivial recursion is nonidiomatic in Python, and tight loops doing integer math on array members will probably be done via one of the many libraries that do one or more of optimizing, jitting, or move those to GPU (Numpy, Taichi, Numba, etc.)
aka Python is as fast as C when it is C.
Any language with FFI (which is like all of them, these days) has the same exact issue, the only difference being how common it is to drop into C or other fast compiled language for parts of the code.
And this kind of benchmark is the one that tells you why this is different across different languages.
I don't know. Ruby is able to call C too so it's a wash?
Yet this particular blog post shows how Ruby-written-in-Ruby is faster than Ruby-written-in-C because it's more optimizable.
Yes, if you pull out all the optimization tricks for Python, it will be faster than vanilla Python. And yet it's still 6x slower (by my measurement) than naive code written in a compiled language like Rust without any libraries.