codeape Posted February 4, 2014

Read this Nvidia presentation (pages 64-65, "Porting Source to Linux - Valve’s Lessons Learned"): https://developer.nvidia.com/sites/default/files/akamai/gamedev/docs/Porting%20Source%20to%20Linux.pdf

In short: don't switch out the attachments of your framebuffers. Instead, switch framebuffers.

The best explanation I found about this:

"I realize I never actually said what the optimization is, but if anybody out there is a graphics programmer and wants to know, it might be helpful to explain, since this one is a non-obvious but highly-necessary optimization for GL. The key is this: don't switch out the attachments of your framebuffers. Instead, switch framebuffers. E.g., if you need to render to a texture, don't just keep the same framebuffer and switch out the attachment. Create a new framebuffer, attach the texture, use it, then cache the new FBO. Next time you need to render to the same texture, do a cache lookup and use the same FBO. The idea is that when you change out the attachments of an FBO, it forces the drivers to do some kind of stupid validation, and, for whatever reason (probably the fact that everything related to FBOs is Evil and Bad), this validation takes a remarkable chunk of time, leading to noticeable performance degradation whenever attachment-switching is involved."

http://forums.ltheory.com/viewtopic.php?f=12&t=2148#p28788

I do not know if Leadwerks does this, but it could be a good optimisation.
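For anyone who wants to see the "create once, cache, look up next time" idea in code, here is a minimal sketch. It is not how Leadwerks or Source does it. `FramebufferCache`, `Handle`, and `CreateFn` are made-up names for illustration, and the actual GL setup is passed in as a callback so the sketch compiles without a GL context. The comments show the standard GL calls such a callback would probably make.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical helper: keeps one framebuffer object per render-target texture,
// so attachments are never swapped on an existing FBO.
class FramebufferCache {
public:
    using Handle = std::uint32_t;  // stands in for GLuint (assumption)
    using CreateFn = std::function<Handle(Handle texture)>;

    // `create` builds a new FBO for a texture. With real OpenGL it would
    // likely do something like:
    //   glGenFramebuffers(1, &fbo);
    //   glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    //   glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
    //                          GL_TEXTURE_2D, texture, 0);
    explicit FramebufferCache(CreateFn create) : create_(std::move(create)) {}

    // Returns the FBO for `texture`, creating and caching it on first use.
    Handle get(Handle texture) {
        auto it = cache_.find(texture);
        if (it != cache_.end()) {
            return it->second;  // cache hit: reuse the FBO, attachments untouched
        }
        Handle fbo = create_(texture);  // cache miss: build it exactly once
        cache_.emplace(texture, fbo);
        return fbo;
    }

    std::size_t size() const { return cache_.size(); }

private:
    CreateFn create_;
    std::unordered_map<Handle, Handle> cache_;  // texture id -> FBO id
};
```

At render time you would call `get(texture)` and bind whatever it returns, instead of re-attaching the texture to a shared FBO. A real cache would also have to drop the entry when the texture is deleted, and key on the full attachment set (depth, multiple color targets) if you use more than one attachment.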
Josh Posted February 4, 2014

Yep. The validation step pauses the GPU to return info to the CPU, which is something I always avoid in real-time rendering.

My job is to make tools you love, with the features you want, and performance you can't live without.
codeape Posted February 4, 2014

So Leadwerks already avoids the validation step. Awesome!
Josh Posted February 4, 2014

It's just kind of a general rule: never do anything in real time that returns data from the GPU. Occlusion queries actually have a special command that lets you check whether the test is finished yet, and you just keep checking each loop until it's safe to get the result. The reason for this is that the CPU -> GPU flow of data is one-way, out to the monitor, so the GPU is normally a bit behind what the CPU is doing. Reading something back means the CPU has to wait for the GPU to catch up.
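The "keep checking each loop" pattern for occlusion queries could look roughly like the sketch below. It is a guess at the general shape, not Leadwerks code. `QueryBackend` and `pollOcclusionQuery` are made-up names, and the two callbacks stand in for the real GL calls, which would presumably be `glGetQueryObjectuiv` with `GL_QUERY_RESULT_AVAILABLE` and then with `GL_QUERY_RESULT`.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical wrapper around one occlusion query. The callbacks are
// placeholders so the sketch runs without a GL context.
struct QueryBackend {
    // Non-blocking check, e.g.
    //   glGetQueryObjectuiv(query, GL_QUERY_RESULT_AVAILABLE, &available);
    std::function<bool()> resultAvailable;
    // Actual readback, e.g.
    //   glGetQueryObjectuiv(query, GL_QUERY_RESULT, &samples);
    // Only safe to call without stalling once resultAvailable() is true.
    std::function<std::uint32_t()> fetchResult;
};

// Call once per frame. Returns false (and leaves samplesOut alone) while the
// GPU is still working; returns true and writes the sample count once the
// result can be read without making the CPU wait.
bool pollOcclusionQuery(const QueryBackend& q, std::uint32_t& samplesOut) {
    if (!q.resultAvailable()) {
        return false;  // not ready: keep using the old visibility, retry next loop
    }
    samplesOut = q.fetchResult();
    return true;
}
```

Until the poll returns true, the renderer just keeps using the previous visibility result for that object, so the CPU never blocks on the GPU.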
codeape Posted February 4, 2014

Ahhha cool! Thanks for providing fast feedback.