Vulkan Dynamic Rendering
The VK_KHR_dynamic_rendering extension has made its way into Vulkan 1.2.203 and I have implemented this in Ultra Engine. What does it do?
Instead of creating render pass objects ahead of time, dynamic rendering lets you specify the settings you need as you fill in command buffers with rendering instructions. From the Khronos working group:
"When we were designing Vulkan 1.0, we had an idea to embed a task-graph-like object into Vulkan in the form of the render pass object. We knew the first version would be kind of restricted because we had an API to ship, and not long to do the work - but we had plans to extend the initial version, and those extensions would eventually provide significant flexibility to the API. Eventually, render passes would support all kinds of bells and whistles, including larger regions on input attachments, resolve shaders, and compute shaders! The idea was that these features would provide enough motivation to move all rendering to render pass objects and make the small amount of pain setting them up always worth it.
"Fast forward to 2021, and the situation is not quite what we'd envisioned. On tiling GPUs, subpasses provide optimisation opportunities that can translate to impressive performance and efficiency wins. However, for many developers, subpasses either remain too restrictive to use or simply don't provide any practical benefit. For developers not using subpasses, render pass objects largely just get in the way."
In my experience, post-processing effects are where this hurt the most. The engine has a user-defined stack of post-processing effects, so there are many possible configurations. You had to store and cache a lot of render pass objects to cover every combination of settings. It's not impossible, but it made things very complicated. Basically, you have to know every little detail of how the render pass object is going to be used in advance. I had several different functions like the code below, for initializing render passes that were meant to be used at various points in the rendering routine.
bool RenderPass::InitializePostProcess(shared_ptr<GPUDevice> device, const VkFormat depthformat, const int colorComponents, const bool lastpass)
{
    this->clearmode = clearmode;
    VkFormat colorformat = __FramebufferColorFormat;
    this->colorcomponents = colorComponents;
    if (depthformat != 0) this->depthcomponent = true;
    this->device = device;

    std::array< VkSubpassDependency, 2> dependencies;

    dependencies[0] = {};
    dependencies[0].srcSubpass = VK_SUBPASS_EXTERNAL;
    dependencies[0].dstSubpass = 0;
    dependencies[0].srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dependencies[0].srcAccessMask = 0;
    dependencies[0].dstStageMask = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT;
    dependencies[0].dstAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT | VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;

    dependencies[1] = {};
    dependencies[1].srcSubpass = VK_SUBPASS_EXTERNAL;
    dependencies[1].dstSubpass = 0;
    dependencies[1].srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dependencies[1].srcAccessMask = 0;
    dependencies[1].dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dependencies[1].dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;

    renderPassInfo = {};
    renderPassInfo.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
    renderPassInfo.attachmentCount = colorComponents;
    renderPassInfo.dependencyCount = colorComponents;
    if (depthformat == VK_FORMAT_UNDEFINED)
    {
        dependencies[0] = dependencies[1];
    }
    else
    {
        renderPassInfo.attachmentCount++;
        renderPassInfo.dependencyCount++;
    }
    renderPassInfo.pDependencies = dependencies.data();

    colorAttachment[0] = {};
    colorAttachment[0].format = colorformat;
    colorAttachment[0].samples = VK_SAMPLE_COUNT_1_BIT;
    colorAttachment[0].initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    colorAttachment[0].loadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    colorAttachment[0].storeOp = VK_ATTACHMENT_STORE_OP_STORE;
    colorAttachment[0].stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    colorAttachment[0].stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    colorAttachment[0].finalLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    if (lastpass) colorAttachment[0].finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;

    VkAttachmentReference colorAttachmentRef = {};
    colorAttachmentRef.attachment = 0;
    colorAttachmentRef.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;

    depthAttachment = {};
    VkAttachmentReference depthAttachmentRef = {};
    if (depthformat != VK_FORMAT_UNDEFINED)
    {
        colorAttachmentRef.attachment = 1;
        depthAttachment.format = depthformat;
        depthAttachment.samples = VK_SAMPLE_COUNT_1_BIT;
        depthAttachment.loadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
        depthAttachment.initialLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;// VK_IMAGE_LAYOUT_UNDEFINED;
        depthAttachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
        depthAttachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
        depthAttachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
        depthAttachment.finalLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
        depthAttachmentRef.attachment = 0;
        depthAttachmentRef.layout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
    }

    colorAttachment[0].initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    depthAttachment.initialLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;// VK_IMAGE_LAYOUT_UNDEFINED;

    subpasses.push_back( {} );
    subpasses[0].pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
    subpasses[0].colorAttachmentCount = colorComponents;
    subpasses[0].pColorAttachments = &colorAttachmentRef;
    subpasses[0].pDepthStencilAttachment = NULL;
    if (depthformat != VK_FORMAT_UNDEFINED) subpasses[0].pDepthStencilAttachment = &depthAttachmentRef;

    VkAttachmentDescription attachments[2] = { colorAttachment[0], depthAttachment };
    renderPassInfo.subpassCount = subpasses.size();
    renderPassInfo.pAttachments = attachments;
    renderPassInfo.pSubpasses = subpasses.data();

    VkAssert(vkCreateRenderPass(device->device, &renderPassInfo, nullptr, &pass));
    return true;
}
This gives you an idea of just how many render passes I had to create in advance:
// Initialize Render Passes
shadowpass[0] = make_shared<RenderPass>();
shadowpass[0]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), { VK_FORMAT_UNDEFINED }, depthformat, 0, true);//, CLEAR_DEPTH, -1);
shadowpass[1] = make_shared<RenderPass>();
shadowpass[1]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), { VK_FORMAT_UNDEFINED }, depthformat, 0, true, true, true, 0);

if (MULTIPASS_CUBEMAP)
{
    cubeshadowpass[0] = make_shared<RenderPass>();
    cubeshadowpass[0]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), { VK_FORMAT_UNDEFINED }, depthformat, 0, true, true, true, CLEAR_DEPTH, 6);
    cubeshadowpass[1] = make_shared<RenderPass>();
    cubeshadowpass[1]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), { VK_FORMAT_UNDEFINED }, depthformat, 0, true, true, true, 0, 6);
}

//shaderStages[0] = TEMPSHADER->shaderStages[0];
//shaderStages[4] = TEMPSHADER->shaderStages[4];

posteffectspass = make_shared<RenderPass>();
posteffectspass->InitializePostProcess(dynamic_pointer_cast<GPUDevice>(Self()), VK_FORMAT_UNDEFINED, 1, false);

raytracingpass = make_shared<RenderPass>();
raytracingpass->InitializeRaytrace(dynamic_pointer_cast<GPUDevice>(Self()));

lastposteffectspass = make_shared<RenderPass>();
lastposteffectspass->InitializeLastPostProcess(dynamic_pointer_cast<GPUDevice>(Self()), depthformat, 1, false);

lastcameralastposteffectspass = make_shared<RenderPass>();
lastcameralastposteffectspass->InitializeLastPostProcess(dynamic_pointer_cast<GPUDevice>(Self()), depthformat, 1, true);

{
    std::vector<VkFormat> colorformats = { __FramebufferColorFormat ,__FramebufferColorFormat, VK_FORMAT_R8G8B8A8_SNORM, VK_FORMAT_R32_SFLOAT };
    for (int earlyZPass = 0; earlyZPass < 2; ++earlyZPass)
    {
        for (int clearflags = 0; clearflags < 4; ++clearflags)
        {
            renderpass[clearflags][earlyZPass] = make_shared<RenderPass>();
            renderpass[clearflags][earlyZPass]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), { VK_FORMAT_UNDEFINED }, depthformat, 1, false, false, false, clearflags, 1, earlyZPass);

            renderpassRGBA16[clearflags][earlyZPass] = make_shared<RenderPass>();
            renderpassRGBA16[clearflags][earlyZPass]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), colorformats, depthformat, 4, false, false, false, clearflags, 1, earlyZPass);

            firstrenderpass[clearflags][earlyZPass] = make_shared<RenderPass>();
            firstrenderpass[clearflags][earlyZPass]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), { VK_FORMAT_UNDEFINED }, depthformat, 1, false, true, false, clearflags, 1, earlyZPass);

            lastrenderpass[clearflags][earlyZPass] = make_shared<RenderPass>();
            lastrenderpass[clearflags][earlyZPass]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), { VK_FORMAT_UNDEFINED }, depthformat, 1, false, false, true, clearflags, 1, earlyZPass);

            //for (int d = 0; d < 2; ++d)
            {
                for (int n = 0; n < 5; ++n)
                {
                    if (n == 2 or n == 3) continue;

                    rendertotexturepass[clearflags][n][earlyZPass] = make_shared<RenderPass>();
                    rendertotexturepass[clearflags][n][earlyZPass]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), colorformats, depthformat, n, true, false, false, clearflags, 1, earlyZPass);

                    firstrendertotexturepass[clearflags][n][earlyZPass] = make_shared<RenderPass>();
                    firstrendertotexturepass[clearflags][n][earlyZPass]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), colorformats, depthformat, n, true, true, false, clearflags, 1, earlyZPass);

                    // lastrendertotexturepass[clearflags][n] = make_shared<RenderPass>();
                    // lastrendertotexturepass[clearflags][n]->Initialize(dynamic_pointer_cast<GPUDevice>(Self()), depthformat, n, true, false, true, clearflags);
                }
            }
        }
    }
}
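To put a number on it, the loop bounds in the initialization code above can be replayed to tally how many RenderPass objects get created. This is only a counting sketch, not engine code: CountRenderPasses is a hypothetical helper, and it assumes each make_shared<RenderPass>() call in the listing produces exactly one object.

```cpp
#include <cassert>

// Hypothetical helper: replays the loop structure of the initialization
// code above and counts one object per make_shared<RenderPass>() call.
int CountRenderPasses(bool multipassCubemap)
{
    int count = 0;

    count += 2;                        // shadowpass[0], shadowpass[1]
    if (multipassCubemap) count += 2;  // cubeshadowpass[0], cubeshadowpass[1]
    count += 4;                        // posteffectspass, raytracingpass,
                                       // lastposteffectspass, lastcameralastposteffectspass

    for (int earlyZPass = 0; earlyZPass < 2; ++earlyZPass)
    {
        for (int clearflags = 0; clearflags < 4; ++clearflags)
        {
            count += 4;  // renderpass, renderpassRGBA16, firstrenderpass, lastrenderpass
            for (int n = 0; n < 5; ++n)
            {
                if (n == 2 || n == 3) continue;  // same skip as the original loop
                count += 2;  // rendertotexturepass, firstrendertotexturepass
            }
        }
    }
    return count;
}
```

Under those assumptions, CountRenderPasses(false) comes to 86 and CountRenderPasses(true) to 88, all of them built before a single frame is drawn.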
With dynamic rendering, you still have to fill in most of the same information, but you can just do it based on whatever the current state of things is, instead of looking for an object that hopefully matches the exact settings you want:
VkRenderingInfoKHR renderinfo = {};
renderinfo.sType = VK_STRUCTURE_TYPE_RENDERING_INFO_KHR;
renderinfo.renderArea = scissor;
renderinfo.layerCount = 1;
renderinfo.viewMask = 0;
renderinfo.colorAttachmentCount = 1;

targetbuffer->colorAttachmentInfo[0].imageLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
targetbuffer->colorAttachmentInfo[0].clearValue.color.float32[0] = 0.0f;
targetbuffer->colorAttachmentInfo[0].clearValue.color.float32[1] = 0.0f;
targetbuffer->colorAttachmentInfo[0].clearValue.color.float32[2] = 0.0f;
targetbuffer->colorAttachmentInfo[0].clearValue.color.float32[3] = 0.0f;
targetbuffer->colorAttachmentInfo[0].imageView = targetbuffer->imageviews[0];
renderinfo.pColorAttachments = targetbuffer->colorAttachmentInfo.data();

targetbuffer->depthAttachmentInfo.clearValue.depthStencil.depth = 1.0f;
targetbuffer->depthAttachmentInfo.clearValue.depthStencil.stencil = 0;
targetbuffer->depthAttachmentInfo.imageLayout = VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL;
renderinfo.pDepthAttachment = &targetbuffer->depthAttachmentInfo;

device->vkCmdBeginRenderingKHR(cb->commandbuffer, &renderinfo);
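One setting that previously needed its own render pass object per combination is whether each attachment gets cleared. With dynamic rendering that choice can be made at record time and written into the loadOp field of each VkRenderingAttachmentInfoKHR. The sketch below shows only the decision logic, with stand-in types so it compiles without the Vulkan SDK. LoadOp, PassSetup, SetupFromClearFlags and the two flag bit values are hypothetical, not the engine's actual definitions.

```cpp
#include <cassert>

// Stand-in for VK_ATTACHMENT_LOAD_OP_LOAD / VK_ATTACHMENT_LOAD_OP_CLEAR.
enum class LoadOp { Load, Clear };

// Hypothetical clear flag bits; the real engine values are not shown in the post.
constexpr int kClearColor = 1;
constexpr int kClearDepth = 2;

// The per-attachment load operations that would be copied into the
// attachment info structs before beginning rendering.
struct PassSetup
{
    LoadOp colorLoadOp;
    LoadOp depthLoadOp;
};

// Derives the load ops from the current clear flags at record time,
// instead of looking up a prebuilt render pass per flag combination.
PassSetup SetupFromClearFlags(int clearflags)
{
    PassSetup setup;
    setup.colorLoadOp = (clearflags & kClearColor) ? LoadOp::Clear : LoadOp::Load;
    setup.depthLoadOp = (clearflags & kClearDepth) ? LoadOp::Clear : LoadOp::Load;
    return setup;
}
```

All four clear flag combinations are handled by the same few lines, where the old code indexed into arrays of prebuilt render passes.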
Then there is the way render passes affect the image layout state. With the TransitionImageLayout command, it is fairly easy to track the current state of the image layout, but render passes automatically switch the image layout to a predefined state when they complete. Again, this is not impossible to handle in and of itself, but when you add these things to the complexity of designing a full engine, things start to get ugly.
void GPUCommandBuffer::EndRenderPass()
{
    vkCmdEndRenderPass(commandbuffer);
    for (int k = 0; k < currentrenderpass->layers; ++k)
    {
        for (int n = 0; n < currentrenderpass->colorcomponents; ++n)
        {
            if (currentdrawbuffer->colortexture[n]) currentdrawbuffer->colortexture[n]->imagelayout[0][currentdrawbuffer->baseface + k] = currentrenderpass->colorAttachment[n].finalLayout;
        }
        if (currentdrawbuffer->depthtexture != NULL and currentrenderpass->depthcomponent == true) currentdrawbuffer->depthtexture->imagelayout[0][currentdrawbuffer->baseface + k] = currentrenderpass->depthAttachment.finalLayout;
    }
    currentdrawbuffer = NULL;
    currentrenderpass = NULL;
}
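With dynamic rendering, beginning and ending rendering does not change image layouts by itself, so every transition is one the application issued explicitly and the bookkeeping can live in a single place. Here is a minimal sketch of that idea, using stand-in types so it compiles without the Vulkan SDK. Layout, TrackedImage and LayoutTracker are hypothetical names, and a real implementation would record a pipeline barrier where the comment indicates.

```cpp
#include <cassert>
#include <vector>

// Stand-in for VkImageLayout.
enum class Layout { Undefined, ColorAttachment, ShaderReadOnly, PresentSrc };

// Hypothetical image wrapper that remembers its current layout.
struct TrackedImage
{
    Layout layout = Layout::Undefined;
};

// One recorded transition, kept here only so the sketch is observable.
struct Transition
{
    TrackedImage* image;
    Layout from;
    Layout to;
};

class LayoutTracker
{
public:
    // Ensures the image is in the wanted layout. Returns true if a
    // transition was needed. A real engine would record an image memory
    // barrier here rather than just logging the change.
    bool Require(TrackedImage& image, Layout wanted)
    {
        if (image.layout == wanted) return false;
        log.push_back({ &image, image.layout, wanted });
        image.layout = wanted;  // the only place the tracked state changes
        return true;
    }

    std::vector<Transition> log;
};
```

Because nothing else alters the layout behind the tracker's back, there is no equivalent of the finalLayout fix-up loop in EndRenderPass above.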
Another example where this was causing problems was with user-defined texture buffers. One beta tester wanted to implement some interesting effects that required rendering to some HDR color textures, but the system was so static it couldn't handle a user-defined color format in a texture buffer. Again, this is not impossible to overcome, but the practical outcome is I just didn't have enough time because resources are finite.
It's interesting that this extension also removes the need to create a Vulkan framebuffer object. I guess that means you can just start rendering to any combination of textures you want, so long as they use a format that is renderable by the hardware. Vulkan certainly changes a lot of conceptions we had in OpenGL.
So this extension does eliminate a significant source of problems for me, and I am happy it was implemented.