Kinda disappointing that the spherical harmonic data is just totally thrown out. It's a pretty big factor in making gaussian splats look so good, and I'm sure there are interesting ways of compressing the data!
It's definitely possible to compress/palettise the SH bands (see https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-m... for one approach).
We just found that splats encode some view-dependent shininess anyway (for example the guitar body https://playcanvas.com/model-viewer?load=https://code.playca...) and so considered the extra data and rendering cost of SH not worth it for now.
It would be quite straightforward to add SH to the PLY file and we'll definitely consider it in future!
Oh nice! I didn't realize those kinds of effects were possible without SH
Yeah, splats can encode view dependency without SH, but not with correct geometry. E.g. you often see reflections encoded in GS as just a 3D scene on the other side of the mirror. Other view-dependency effects can make the surfaces "fluffy", etc. This is not great when you are also trying to get decent geometry.
That said, I don't think SH are the best way to represent view dependency. Other approaches using MLP or deep features seem more promising.
Are the baked-in reflections technically a bug, or are they fundamental in some way?
They’re fundamental to the scene if you consider light transport and not just geometry. Gaussian splats are just a clever way to sample a lightfield, and reflections are part of the lightfield.
Well it's more that I'm asking if lighting data should live entirely in the spherical harmonics. It's certainly feasible and common in the game world.
...But maybe that doesn't make sense in the GS context.
This blog goes a lot more in depth in different techniques for compressing them (although for the purpose of using less VRAM):
https://aras-p.info/blog/2023/09/13/Making-Gaussian-Splats-s...
https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-m...
Including quantising spherical harmonics, grouping splats into chunks and compressing their position and rotation in groups, etc.
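To make the "compress positions in groups" idea concrete, here is a minimal sketch of chunk-relative quantisation. It is not the code from the linked posts: the function names, the 11-bit default and the per-chunk bounding box layout are all assumptions chosen for illustration.

```python
def quantize_chunk(positions, bits=11):
    """Quantise (x, y, z) positions to integers relative to the chunk's bounding box.

    `bits` per axis is a hypothetical choice, not a value taken from the posts.
    """
    levels = (1 << bits) - 1
    mins = [min(p[i] for p in positions) for i in range(3)]
    maxs = [max(p[i] for p in positions) for i in range(3)]
    # Fall back to 1.0 on a degenerate axis to avoid dividing by zero.
    scales = [(maxs[i] - mins[i]) or 1.0 for i in range(3)]
    codes = [
        tuple(round((p[i] - mins[i]) / scales[i] * levels) for i in range(3))
        for p in positions
    ]
    return mins, scales, codes


def dequantize_chunk(mins, scales, codes, bits=11):
    """Recover approximate float positions from the integer codes."""
    levels = (1 << bits) - 1
    return [
        tuple(mins[i] + c[i] / levels * scales[i] for i in range(3))
        for c in codes
    ]
```

Each chunk stores one bounding box plus small integers per splat, so the reconstruction error is bounded by the chunk size rather than the scene size. That is why grouping nearby splats helps.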
The second of those links was discussed here:
Making Gaussian Splats more smaller - https://news.ycombinator.com/item?id=37687134 - Sept 2023 (18 comments)
Yes! We are using this clustering approach for position and scale data.
(maybe then a small credit for the initial work by Aras in your blog post?)
Did you see that we credit Aras at the end of the post and link to his website?
No. Apologies.
The moment I saw the use of motion vectors with gaussian splats I started wondering if it was a viable way to achieve high performance, high quality spatial video. Instead of capturing from all angles like most current approaches (which seems more geared to create content that is used in place of a 3D model) why not capture from a fixed perspective, using an array of cameras covering about 1m square to allow for slop in head position, providing parallax and perspective correct rendering. I'd presume that'd make it even more compressible too, since you could get rid of any data that isn't visible from outside that frustum.
It would be a true hologram, completely supplanting 180/360 stereo video. Imagine Avatar in such a format. Or, let's be real, porn.
I wonder what style of non-realistic graphics would be best suited to gaussian splats. Like how pixel art and low-poly came from limitations of yesterday's technologies, but became distinctive styles in their own right.
Sounds kind of similar to https://augmentedperception.github.io/deepviewvideo/
That was definitely on my mind, that uses lightfields though. The file sizes are ludicrous.
Yep, though the layered mesh post processing is nice because most consumer hardware at this point is pretty good at rendering plain ol meshes. The big challenge for mass adoption of nerfs/gsplats techniques will be getting them running and viewable on a phone web browser.
Gsplats already are viewable on a phone web browser.
Have you seen this? https://shenhanqian.github.io/gaussian-avatars
So basically what it is now but with view and occlusion culling?
More compression with considerations made to allow for streaming and some kind of equivalent to I/P/B-frames, spatial audio, metadata (chapters, aspect ratios etc) in a standardized file container.
At the moment it's all rather basic and bespoke.
Can anyone point me to some good high level introduction to Gaussian splats?
E.g. what's the benefit of using splats vs traditional polygons? Is it somehow easier for the neural network to create these from the 2D photos? Or what's the magic behind this?
The article links to a page with this video embedded in it, which I found very helpful. I'm not in this field and this was the first time I'd heard of Gaussian splatting; the original article sort of assumes you're not a rando coming to the page from Hacker News, that you are sold on the idea already, and use it so much that you need to worry about optimization.
Polygons are great because you can make a solid, closed surface. But, they are a pain because you have to maintain the connectivity between the vertices. Adding or removing polygons in a mesh is serious work. Especially doing so on a GPU. Most LOD schemes do heavy precomputation at data compilation time and have the GPU just view a sliding window into static results.
Splats however are loose points. It’s harder to get a solid appearance from them. But, you can add/remove/move each point independently.
Splats don’t really use neural techniques. But, they are well set up for neural techniques to work with them. Generating the splats from photos uses some sort of linear algebra optimization/back-propagation. The Gaussian aspect is required to enable that to work because gaussians are differentiable. Hopefully that means GS should fit in well with recent advancements in differentiable rendering.
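As a toy illustration of why differentiability matters, here is a sketch that fits the mean of a single 1D Gaussian to target samples by plain gradient descent. Real splat training optimises millions of 3D Gaussians against photos; this only shows the principle, and every name and hyperparameter here is an assumption.

```python
import math


def gaussian(x, mu, sigma):
    """Unnormalised 1D Gaussian evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def fit_mean(xs, targets, mu=0.0, sigma=1.0, lr=0.5, steps=300):
    """Gradient descent on the Gaussian's mean to minimise mean squared error."""
    for _ in range(steps):
        grad = 0.0
        for x, t in zip(xs, targets):
            g = gaussian(x, mu, sigma)
            # d/dmu (g - t)^2 = 2 (g - t) * g * (x - mu) / sigma^2
            grad += 2.0 * (g - t) * g * (x - mu) / sigma ** 2
        mu -= lr * grad / len(xs)
    return mu
```

Because the Gaussian has a smooth, closed-form derivative with respect to its parameters, the error signal from the rendered result can be pushed straight back into position, scale and colour.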
https://huggingface.co/blog/gaussian-splatting
Think of this as a LiDAR visualization on steroids... A point cloud of information that can be used to render "The Blob".
But the blobs can change color and opacity depending on the direction you are viewing them from (spherical harmonics), which allows them to encode transparency and specular reflections.
Splats change color based on direction, but opacity is constant.
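For anyone wondering what "color depends on direction" looks like in practice, here is a minimal sketch of evaluating degree 0 and degree 1 spherical harmonics for one gaussian. The sign convention and the +0.5 offset follow what I believe common 3DGS implementations use; the function name, the coefficient layout and the clamp to [0, 1] are assumptions. Note that opacity is not an input or an output here: it stays a separate per-gaussian scalar.

```python
import math

SH_C0 = 0.28209479177387814  # constant for the degree-0 (DC) basis function
SH_C1 = 0.4886025119029199   # magnitude of the three degree-1 basis functions


def sh_color(coeffs, view_dir):
    """Evaluate view-dependent RGB from 4 SH coefficients per channel.

    coeffs: four (r, g, b) tuples, one DC term plus three degree-1 terms.
    view_dir: direction from the camera towards the gaussian (any length).
    """
    x, y, z = view_dir
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    basis = (SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x)
    rgb = []
    for ch in range(3):
        value = sum(b * c[ch] for b, c in zip(basis, coeffs)) + 0.5
        rgb.append(min(1.0, max(0.0, value)))  # clamp range is an assumption
    return tuple(rgb)
```

With only the DC term set, the color is the same from every direction, which is roughly what you get when the SH bands are dropped from the file.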
More so than just compressing the data, I feel like the biggest gains for splats will come from a chunked LOD system, and streaming data into memory. Regardless of how efficient you make the representation, you are still going to run into fundamental limitations without this.
On the low hanging fruit side of things, tools should really start integrating a Spherical Harmonics skybox into the training steps to better handle large scale distant details.
That rendering style looks really similar to Dreams on playstation
Anyone know if that game uses the same technique?
Yes, Dreams is actually a huge inspiration for the realtime rendering side of gaussian splatting. Specifically this presentation by Alex Evans: https://www.mediamolecule.com/blog/article/siggraph_2015.
Dreams doesn't use gaussian splats as such, but we still learn a lot about how to compress and render a huge number of particles efficiently. (We're not doing half of this on PlayCanvas... yet).
https://lightgaussian.github.io/ in a similar vein
I see various people online calling the primitive "splats" (e.g. the scene has millions of splats, 248 bytes for each splat, etc.). But the primitives are the 3D gaussians; "splat" just refers to how they are rendered. I wonder why people call them "splats". Because it's catchier?
They're not rendered like simply converting a point to screen space or doing polygonal texture filling, "splatting" to the viewport is actually a pretty accurate description of how they're rendered.
Yes that's exactly what I'm saying. I'm wondering why people call the geometric primitive "splats" instead of "gaussians".
I see people refer to the .ply file as containing millions of "splats". It's just strange to me. You could take these primitives (3D gaussians + other attributes like spherical harmonics coeffs) and render them differently, even if it would be slower. They only become "splats" in the renderer.
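To illustrate the distinction, here is a rough sketch of the step where a stored 3D gaussian becomes a 2D "splat": its covariance is pushed through a local linear approximation of the perspective projection, as in the EWA splatting formulation that 3DGS renderers typically follow. It assumes the mean and covariance are already in camera space (so the view rotation is left out), and the function name and parameters are hypothetical.

```python
def project_covariance(mean_cam, cov3d, fx, fy):
    """Project a camera-space 3x3 covariance to a 2x2 screen-space covariance.

    mean_cam: (x, y, z) gaussian centre in camera space, with z > 0.
    cov3d: 3x3 covariance as nested lists, already rotated into camera space.
    fx, fy: focal lengths in pixels.
    """
    x, y, z = mean_cam
    # Jacobian of the perspective projection (fx * x / z, fy * y / z) at the mean.
    J = [
        [fx / z, 0.0, -fx * x / (z * z)],
        [0.0, fy / z, -fy * y / (z * z)],
    ]
    # cov2d = J * cov3d * J^T
    JC = [
        [sum(J[i][k] * cov3d[k][j] for k in range(3)) for j in range(3)]
        for i in range(2)
    ]
    return [
        [sum(JC[i][k] * J[j][k] for k in range(3)) for j in range(2)]
        for i in range(2)
    ]
```

Everything before this call is "gaussians" in the file; the 2x2 result is the screen-space footprint that gets blended, which is arguably the only point at which "splat" is the accurate word.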