Love it. I've been using cudarc lately; would love to try this since it looks like it can share data structures between host and device (?). I infer that this is a higher-level abstraction.
Where is the Metal love…
It also compiles directly to MSL; that is just missing from the post title.
See also this overview for how it compares to other projects in the Rust and GPU ecosystem: https://rust-gpu.github.io/ecosystem/
Surprised this doesn't mention candle: https://github.com/huggingface/candle
I don't think that fits; that's an ML framework. The others in the link are general GPU frameworks.
Very interesting project! I am wondering how it compares with OpenCL, which I think adopts the same fundamental idea (write once, run everywhere). Is the difference CubeCL's internal optimization for Rust that happens at compile time?
A lot of things happen at compile time. Beyond that, you can write arbitrary code in your kernel that executes at compile time, similar to generics but with more flexibility. It's very natural to branch on a comptime config to select an algorithm.
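To make the "branch on a compile-time config" idea concrete, here is a minimal sketch in plain Rust using a const generic. This is not CubeCL's API: the function `sum` and the flag `UNROLLED` are hypothetical names chosen for illustration, and it only shows the generics-style analog that the comment compares comptime to.

```rust
/// Hypothetical reduction whose strategy is chosen at compile time.
/// `UNROLLED` is a const generic, so each instantiation is monomorphized
/// separately and the branch that is not taken can be removed by the compiler.
pub fn sum<const UNROLLED: bool>(data: &[f32]) -> f32 {
    if UNROLLED {
        // Strategy A: four independent accumulators over chunks of 4,
        // then add whatever elements did not fill a whole chunk.
        let mut acc = [0.0f32; 4];
        let chunks = data.chunks_exact(4);
        let tail = chunks.remainder();
        for c in chunks {
            for i in 0..4 {
                acc[i] += c[i];
            }
        }
        acc.iter().sum::<f32>() + tail.iter().sum::<f32>()
    } else {
        // Strategy B: straightforward sequential sum.
        data.iter().sum()
    }
}
```

A caller picks the algorithm at the type level, for example `sum::<true>(&values)` or `sum::<false>(&values)`. The condition is a constant in each instantiation, so no runtime dispatch is needed.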