> Though APL may strike some as a strange language of choice for deep learning, it offers benefits that are especially suitable for this field: First, the only first-class data type in APL is the multi-dimensional array, which is one of the central objects of deep learning in the form of tensors. This also signifies that APL is by nature data parallel and therefore particularly amenable to parallelization. Notably, the Co-dfns project compiles APL code for CPUs and GPUs, exploiting the data parallel essence of APL to achieve high performance. Second, APL also almost entirely dispenses with the software-specific "noise" that bloats code in other languages, so APL code can be directly mapped to algorithms or mathematical expressions on a blackboard and vice versa, which cannot be said of the majority of programming languages. Finally, APL is extremely terse; its density might be considered by some a defect that renders APL a cryptic write-once, read-never language, but it allows for incredibly concise implementations of most algorithms. Assuming a decent grasp of APL syntax, shorter programs mean less code to maintain, debug, and understand.
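As a rough illustration of that terseness, here is a minimal sketch in Dyalog APL (not code from trap) of a dense layer y = max(0, Wx + b), with toy parameters:

    ⍝ Illustrative sketch only, not from the trap codebase
    relu←{0⌈⍵}                        ⍝ elementwise max(0,⍵)
    dense←{(W b)←⍺ ⋄ relu b+W+.×⍵}    ⍝ ⍺: (weights bias), ⍵: input vector; +.× is the matrix product
    Wts←3 4⍴0.1 ⋄ bias←3⍴0            ⍝ toy 3×4 weights and zero bias
    (Wts bias) dense 1 2 3 4          ⍝ → 1 1 1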
This is really cool. At about 150 lines, terse indeed. And of course it makes sense that APL could work well with GPUs, but I'm kind of surprised there's enough of it still out in the wild that there's already a reliable toolchain for doing this.
> APL could work well with GPUs
I've seen at least an APL implementation running on top of Julia, thanks to macros.
Julia has good GPU support, and it makes it easy to compose that support with any library.
However, kdb+ and q, which are APL descendants, have good GPU support already: https://code.kx.com/q/interfaces/gpus. But licenses are not cheap...
GPUs can even run APL as a higher-level programming language. It's the only abstract language I've ever heard of running on a GPU. Arrays in, arrays out. A GPU is array programming hardware.
I hope one day it's normal, like the thousands of CPU languages. Would be nice to have more than ten GPU languages.
Check out https://github.com/Rust-GPU/rust-gpu if you have not seen it
StarLisp on the Connection Machine, which was a kind of early attempt at what GPUs became.
Futhark is another more recent example.
CM-Lisp was closer to that vision, though it was never (TMK) fully implemented, unlike StarLisp.
There is the famous '17 line' compiler here https://scholarworks.iu.edu/dspace/items/3ab772c9-92c9-4f59-...
(I got the t-shirt)
That's a PhD dissertation with hundreds of pages; wrong link maybe?
No, it appears on page 210 (logical page 220).
Thanks, I forgot to add that!
This looks like an invocation of a C++ CUDA kernel through an FFI. It is not running k or q directly on the GPU.
> APL code can be directly mapped to algorithms or mathematical expressions on a blackboard and vice versa
After looking at the code, I find this claim questionable.
APL was invented by Iverson as a blackboard notation because he felt the existing notation was awkward/insufficient for describing computation/algorithms.
Linear algebra notation is the real notation (on a blackboard), a.k.a. the language of AI. The medium is just the message format.
I agree, linear algebra or pseudocode.
After looking at HN comments for years, I find this low effort dismissal downvoteable.
APL was originally a rewrite and normalisation of traditional math notation for use on blackboards. Before it had anything to do with computers, it was linear algebra without all the bizarre precedence rules and with some common useful operations.
Ok, well, I understand it may have been invented with that goal, but it frankly does not look like blackboard notation at all. This is not lower effort than your rebuttal “yes it does, in fact it was invented that way”.
I agree with a sibling comment, also heavily downvoted, that the real blackboard notation is linear algebra notation. Either that, or pseudocode. Python and Haskell look like pseudocode. This doesn’t, and it doesn’t matter what the developer was targeting, he didn’t hit the target.
Your point appears to be the usual "I glanced at APL and it is unfamiliar and therefore bad. I am going to dismiss it without learning anything about it - and make sure to tell everyone" which isn't enticing to put high effort replies to. Consider the exchange:
"I'm sceptical that one can write +.× on a blackboard to indicate matrix cross product"
"well, one can"
"it doesn't look like one could. This is not a low effort reply. Yours is a low effort reply. The 'real' way to write a matrix cross product by hand in chalk on a blackboard is with nested loops and Python".
The APL creator Ken Iverson's paper Notation as a Tool of Thought[1] talks through an introduction to why it was designed the way it was; to do things Python and Haskell and pseudocode and traditional math notation don't do. Early on he writes:
> "APL, a general-purpose language which originated in an attempt to provide clear and precise expression in writing and teaching, and which was implemented as a programming language only after several years of use and development"
To use a thing for several years for expressing and teaching mathematics is evidence that it can be used for that. It was designed for things code and traditional math notation don't do: code doesn't hide irrelevant details, math notation has inconsistent precedence, and both have instances of wildly different syntax and symbols for closely related concepts, which hinders seeing the connections. Code isn't amenable to formal proofs, and pseudocode isn't good at expressing a problem, only at expressing instructions for solving a problem.
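To make the precedence point concrete, a small illustrative example (not from the paper): APL discards conventional precedence rules entirely and evaluates strictly right to left.

    2×3+4     ⍝ 14 in APL: 3+4 is evaluated first, then ×2
              ⍝ school notation would give 10, since × binds tighter than +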
The APL of modern times has a lot added since 1972, but the core is still there. See also the anecdote in [2]: a demonstration of K. Iverson casually writing APL on a napkin at dinner to solve a problem.
[1] https://www.eecg.utoronto.ca/~jzhu/csc326/readings/iverson.p...
Name checks out.
> Though APL may strike some as a strange language of choice for deep learning
I've actually spent the better part of last year wondering why we _haven't_ been using APL for deep learning. And actually I've been wondering why we don't just use APL for everything that operates over arrays, like data lakes and such.
Honestly, APL is probably a good fit for compilers. I seem to remember a guy who had some tree-wrangling APL scheme, and could execute his compiler on a GPU. But I can't find it now.
I believe the tree-wrangler you mentioned is Aaron Hsu, author of the Co-dfns APL "compiler" that Trap uses.
Here are some videos related to his work: https://www.youtube.com/playlist?list=PLDU0iEj6f8duXzmgnlGX4...
Co-dfns was most recently discussed on Hacker News 3 months ago: https://news.ycombinator.com/item?id=40928450
> I've actually spent the better part of last year wondering why we _haven't_ been using APL for deep learning.
JAX?
Hello everyone,
I am the author of this project. If anyone has any questions concerning trap, I'd be more than happy to address them.
I’m too ignorant on the subject to have smart questions, so I’ll state instead: that’s brilliant. Terrifying, but brilliant. If someone locked me in a box and said I had to use this for everything, I imagine I’d either break down crying or write an AGI in a page.
Well done.
Thanks a ton! It means a lot.
There is this on https://shakti.com (the 'new k' from Arthur):
k-torch llm(61) 14M 2 14 6 288 288 x+l7{l8x{x%1+E-x}l6x}rl5x+:l4@,/(hvi,:l3w)Ss@S''h(ki,:ql2w)mql1w:rl0x (18M 2 32000 288)
which apparently can run on the GPU, someone told me on Discord (but I'm not sure if it's true or not).
> Though APL may strike some as a strange language of choice for deep learning
It sure did to me, even as someone who has written (a trivial amount of) J. But the argument that follows is more than convincing.
It would be good if the APL dialect in which this is implemented were mentioned on the front page. I have implemented some things in GNU APL, which is an (almost) complete implementation of ISO standard 13751, based primarily on APL2. More common and modern is the proprietary Dyalog APL, which I assume is used here (and which is also free for personal use).
It is indeed Dyalog APL (evident from certain features used, and also because it's what Co-dfns requires). And yes, I agree, especially since this uses the ".apl" file extension of GNU APL rather than the ".apls" that Dyalog uses for shell scripts. Oddly enough, the "⎕IO←0" appears outside the ":Namespace", which means it cannot be used by other APL code.
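As a minimal sketch (assuming Dyalog's namespace-scoped system variables; the namespace and function names here are hypothetical), moving the ⎕IO←0 inside the :Namespace would keep the index origin with the code that depends on it:

    :Namespace trap
        ⎕IO←0           ⍝ index origin 0 now applies to code defined in this namespace
        first←{⍵[0]}    ⍝ hypothetical function relying on 0-based indexing
    :EndNamespace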
It is Dyalog APL.