
It really is oversimplified.

"Creating dozens of light sources simultaneously on screen at once is basically not doable unless you have Mantle or DirectX 12. Guess how many light sources most engines support right now? 20? 10? Try 4. Four. Which is fine for a relatively static scene. "

For my Master's degree project at uni I had a demo written in OpenGL with over 500 dynamic lights, running at 60fps on a GTX 580. Without Mantle, or DX12. How? Deferred rendering, that's how. You could probably add a couple thousand more and it would still be fine.

"Every time I hear someone say “but X allows you to get close to the hardware” I want to shake them. None of this has to do with getting close to the hardware. It’s all about the cores"

Also not true. I work with console devkits every single day, and the reason why we can squeeze so much performance out of relatively low-end hardware is that we get to make calls which you can't make on PC. A DirectX call to switch a texture takes a few thousand clock cycles. A low-level hardware call available on the PlayStation platform will do the same texture switch in a few dozen instructions. The numbers are against DirectX, and that's why Microsoft is slowly letting devs access the GPU on the Xbox One without the DirectX overhead.



100% agree. Article seems very misinformed.

> For my Master's degree project at uni I had a demo written in OpenGL with over 500 dynamic lights, running at 60fps on a GTX 580. Without Mantle, or DX12. How? Deferred rendering, that's how.

Indeed. For fixed-function forward rendering you do have those limitations, though. However, it's nowhere near as low as 4: you can expect at least 8 and most often 16 lights. The catch is that DirectX 12 will do nothing for that limitation.

At the end of the day all that this comes down to is decreasing the cost of one thing:

    Draw()
Which has immense amounts of CPU overhead due to abstractions. Anything else DX12 might do is really just a bonus.


This type of direct access is fine on a console, but DirectX is primarily designed as an abstraction layer on top of arbitrary hardware. PC hardware isn't as predictable as the Xbox One, and yeah, it adds clock cycles. On PC you don't want to get that deep into the hardware: it has the potential to break compatibility, and because dev teams generally don't have the ability to test 5,000 different combinations of CPUs, GPUs and drivers, you try to keep as much compatibility as you can.


What he really seems to mean by "real lights" is actual shadow-casting lights. I doubt you were rendering 500 shadow maps along with your 500 lights.
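A quick back-of-the-envelope count shows why 500 shadow-casting lights is a different beast from 500 plain lights: each shadow-casting light needs the scene re-rendered into its shadow map. The 6-faces-per-light figure below is an assumption based on cube-map shadows for point lights, not something from the demo above:

```python
# Rough count of extra GPU render passes needed for shadow-casting
# lights, assuming cube-map shadows (6 faces per point light).

def shadow_passes(num_lights, faces_per_light=6):
    """Each shadow-casting point light re-renders the scene once
    per cube-map face."""
    return num_lights * faces_per_light

print(shadow_passes(4))    # 24 extra scene renders per frame
print(shadow_passes(500))  # 3000 -- far beyond any real-time budget
```

Shadow-less deferred lights, by contrast, add no extra scene passes at all; that is why the two numbers aren't comparable.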


His examples don't really fit that, though. For small-range lights (a lightsaber) or lights that are relatively short-lived and intense (explosions), you can totally get away with not casting shadows. Also, I don't see how threads would really help with rendering more shadows, other than providing a generalized speedup. The cost of a shadow map lies mostly in the GPU shaders rather than in any sort of CPU calculation.

For stencil shadows, threads might help, since the CPU has to calculate and upload the shadow volumes frequently (unless you're doing stencils in a geometry shader). Stencil shadows are pretty niche, though; you only use them when you need pixel-perfect precision. Shadow maps are vastly more popular.


>you can totally get away with not casting shadows.

His entire point is that you CAN do this, but it all adds up to making it not look real.


And yet none of what he talks about does anything to help that. I'm not against oversimplifying, but what he's saying is entirely bogus: shadows being expensive has nothing to do with how many threads the CPU can issue commands with.

He's setting up this notion that the problem with graphics is a lack of threading, which is ridiculous. Graphics code is incredibly parallel, and his assertion that graphics work spends most of its time constrained by waiting for the CPU just does not add up to me. Except in poorly written systems, I just haven't seen this be the case often at all.


Even without deferred rendering, why the limitation? I'm sure there is a limit in the standard pipeline, but can't I just pass as much light info to the shaders as I want now? The result is just a slightly longer shader.


There is no limitation per se, but with per-pixel lighting (and pretty much all lighting is done per-pixel nowadays) you would have to run your shader program exactly 2,073,600 times for a Full HD resolution... and that's every frame, so 60 times a second. Now imagine doing the lighting calculations for one light source: not too complex, a few multiplications and that's all. But if you have several lights, then it's just way too much for any graphics card. A "slightly longer" program that has to run over 2 million times a frame is not really acceptable.
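The arithmetic above can be sketched in a few lines of Python. The ~20-ops-per-light figure is a made-up placeholder, not a measured cost; the point is just how fast the totals grow:

```python
# Back-of-the-envelope cost of per-pixel forward lighting.

width, height = 1920, 1080   # Full HD
pixels = width * height      # the fragment shader runs once per pixel
fps = 60

def shading_ops_per_second(num_lights, ops_per_light=20):
    """Rough lighting-op count per second, assuming an arbitrary
    ~20 ALU ops per light per pixel."""
    return pixels * num_lights * ops_per_light * fps

print(pixels)                      # 2073600 shader runs per frame
print(shading_ops_per_second(1))   # 2488320000 -- ~2.5 billion ops/s
print(shading_ops_per_second(8))   # 19906560000 -- ~20 billion ops/s
```

Even with the constants made up, scaling linearly in both pixels and lights is what makes naive forward lighting blow up.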


I think he meant that most shaders only receive four light sources. I'm not a low-level graphics guy, but my understanding is that, deferred or not, most shaders will only receive four light sources.


This hasn't been true for a long time, and DX12 doesn't really change anything here. DX11 was already able to handle very large numbers of constants. The limit on how many lights you support in a shader is largely a matter of controlling the shader cost rather than one of constant-space limits. Deferred renderers are very popular these days as well and don't really handle lights in the same way anyway.

DX12 does offer some potential CPU side performance benefits when it comes to updating large numbers of constants efficiently which may well help performance when dealing with lots of dynamic lights but it's not adding any new capabilities beyond what DX11 offers.


A "light source" is nothing more than a collection of variables when passed in to the shader: a vec3 for the position, a float for the intensity, a vec3 for the colour, and so on. Each of them can be passed in as a uniform. OpenGL 3.0 guarantees at least 1024 uniform components (individual floats) in the fragment stage, and in most implementations the limit is much higher. So even passing position, colour, and intensity per light, you can fit at least a hundred lights into any conforming shader.
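The budget works out as simple division. This sketch assumes the OpenGL 3.0 guaranteed minimum for GL_MAX_FRAGMENT_UNIFORM_COMPONENTS (1024 floats); real drivers typically report several times that:

```python
# How many lights fit in a fragment shader's uniform budget.
# 1024 is the OpenGL 3.0 minimum for fragment uniform components
# (one component = one float); actual hardware limits are higher.

MIN_UNIFORM_COMPONENTS = 1024

def max_lights(components_per_light):
    """Lights that fit in the guaranteed-minimum uniform budget."""
    return MIN_UNIFORM_COMPONENTS // components_per_light

print(max_lights(3))  # 341 -- position only (vec3 = 3 floats)
print(max_lights(7))  # 146 -- position + colour + intensity (3+3+1)
```

Either way, the four-light figure from the article is off by two orders of magnitude for a shader-based pipeline.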


Is this a direct effect of the design of PC hardware, or could we theoretically build a PC OS that would let you do a texture switch as efficiently as on PS/XBOX/etc?


>For my Masters degree project at uni I had a demo written in OpenGL with over 500 dynamic lights,

I don't know what kind of 3D engine you wrote, but as an FPS gamer I have quite a bit of experience with 3D engines. From Doom 3 to Alan Wake, some of the worst performance hits occur in scenes with heavy use of dynamic lights. Did OpenGL 4/DX11 fix this?

Which of the modern APIs have you actually used? How do Mantle, DX12, Metal, and OpenGL Next compare?

EDIT: "as a non-3D-graphics programmer who's tried to build FPS levels for fun and is familiar with all the modern engines". This article is for those of us with interest but not expertise in the field, right?


No, shaders and deferred rendering fixed this. Shaders meant you could implement any lighting you imagined instead of using the old, deprecated fixed-function OpenGL that only supported a handful of lights.

Someone figured out deferred rendering, a newer technique that allows lots of lights, and lots of games use it. I believe one of the first was Killzone:

http://www.slideshare.net/guerrillagames/the-rendering-techn...

Here's a live demo of deferred rendering

http://threejs.org/examples/webgldeferred_pointlights.html

It's using only OpenGL 2.1 features (which is all that's needed to emulate OpenGL ES 2.0, which WebGL is based on). To do deferred rendering efficiently, all you really need is support for multiple render targets.


Pure OpenGL 4.0. It allowed me to do tessellation and use deferred rendering. This is the key: rendering the lights in a separate pass is crucial to performance, and as I've said, it allows you to have thousands of dynamic lights in the same scene. Neither Doom 3 nor Alan Wake could use this technique. I mostly work on consoles nowadays, so I use none of these APIs. On PS3/PS4 you have to do everything manually: instead of using the nice OpenGL API to bind and send a vertex buffer object, you have to allocate memory for it yourself and copy it over manually to the address that you want. That's where the speed lies. I have done some Xbox One programming, but that's mostly regular DirectX at the moment; I haven't had a chance to play with the DX12 stuff.

There is quite a good explanation of how deferred rendering works here: http://gamedevelopment.tutsplus.com/articles/forward-renderi...


Great link, and thanks for the explanation. So by using a deferred rendering algorithm you can reduce the complexity to O(m+n) (m = number of surfaces, n = number of lights), where forward lighting renderers have complexity O(m*n). The trade-offs seem to be higher memory usage and poor interaction with older hardware, anti-aliasing, and transparent objects, which explains why many modern engines like Unreal 3 and the IW engine don't utilize it.
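The complexity claim can be sketched as a toy cost model. The constants are omitted and the specific m and n are arbitrary; real costs also depend on overdraw, light volumes, and G-buffer bandwidth:

```python
# Toy operation counts for forward vs deferred shading.
# m = surfaces, n = lights; constant factors deliberately ignored.

def forward_cost(m, n):
    # Every surface is shaded against every light: O(m * n).
    return m * n

def deferred_cost(m, n):
    # Geometry pass touches each surface once; the lighting pass
    # then processes each light once over the G-buffer: O(m + n).
    return m + n

m, n = 10_000, 500
print(forward_cost(m, n))   # 5000000
print(deferred_cost(m, n))  # 10500
```

The gap widens multiplicatively as lights are added, which is why the 500-light demo upthread is plausible with deferred rendering and hopeless with naive forward rendering.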


In case you're curious about the downvotes, it's probably because anyone who has ever done engine programming will laugh at the assertion that "as an FPS gamer" you know anything relevant about graphics rendering and technology.

It's like somebody claiming they can comment meaningfully on light bulb manufacturing standards because they've seen the lighting in a bunch of made-for-TV specials.


"As a frequent airplane passenger, I am qualified to both build and pilot large aircraft."


Aren't you mixing forward shading and deferred shading? http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter09...



