r/vulkan

▲ 6 r/vulkan

Trying Vulkan on iOS

I need to test it more, but it turns out Vulkan can be used on iOS as well. Next, I'm planning to try it on Android.

https://youtu.be/WXFv1oauXYk?si=vnPBsFyqrt8WYyNu

u/innolot — 2 days ago
▲ 25 r/vulkan+2 crossposts

Should I learn Vulkan using vulkan.h or vulkan.hpp as a beginner?

I have some basic OpenGL and D3D11 experience, and I want to learn Vulkan properly from scratch, not just make things compile without understanding them.

The problem: most tutorials I find are either outdated, like https://vulkan-tutorial.com, or use wrappers that abstract things away to avoid boilerplate, like https://vkguide.dev. The official Khronos tutorial looks solid and modern (Vulkan 1.4, dynamic rendering, timeline semaphores) but uses vulkan.hpp with RAII instead of raw vulkan.h:
docs.vulkan.org/tutorial/latest

My concern is that vulkan.hpp will get in the way of learning the Vulkan concepts. Or does it not matter, since it is a thin wrapper?

u/Latter_Relationship5 — 3 days ago
▲ 10 r/vulkan

Any Recommendations on C resources for Learning Vulkan?

I was following this tutorial, which is supposed to be "modern" with "up to date" features and practices.

However, I very quickly stumbled upon this piece of code:

auto unsupportedLayerIt = std::ranges::find_if(requiredLayers,
                                               [&layerProperties](auto const &requiredLayer) {
                                               return std::ranges::none_of(layerProperties,
                                                                           [requiredLayer](auto const &layerProperty) { return strcmp(layerProperty.layerName, requiredLayer) == 0; });
                                               });

I'm just stunned. I can't believe cpp people actually write code like this? This just seems insane to me and I kind of hate it.

So I have 2 questions:

  1. Am I in the wrong here? Should I just accept that this is normal and the industry standard and get over it? Or is it just absurd?
  2. If it IS absurd, does anyone have recommendations for other resources I could learn Vulkan from? I'd prefer something written in plain C, or at least using only very basic C++ features.
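
For comparison, the same check can be written with plain loops. This is only a sketch: `LayerProperty` is a stand-in that mirrors the `layerName` field of `VkLayerProperties`, and `findUnsupportedLayer` is a hypothetical helper name, not something from the tutorial or from Vulkan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for VkLayerProperties; only the field used by the check is mirrored.
struct LayerProperty {
    char layerName[256];
};

// Returns the first required layer name that is missing from `available`,
// or nullptr when every required layer is present.
// Hypothetical helper, not part of Vulkan or the tutorial.
const char* findUnsupportedLayer(const char* const* required, std::size_t requiredCount,
                                 const LayerProperty* available, std::size_t availableCount) {
    assert(requiredCount == 0 || required != nullptr);
    assert(availableCount == 0 || available != nullptr);
    for (std::size_t i = 0; i < requiredCount; ++i) {
        bool found = false;
        for (std::size_t j = 0; j < availableCount; ++j) {
            if (std::strcmp(available[j].layerName, required[i]) == 0) {
                found = true;
                break;
            }
        }
        if (!found) {
            return required[i];
        }
    }
    return nullptr;
}
```

The nested loops do what the `find_if` / `none_of` pair does: the outer loop looks for a required layer, and the inner loop confirms that no available layer matches it.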
u/Undeniable_Dilemma_ — 3 days ago
▲ 25 r/vulkan

New Vulkan sample: Rasterization Order Attachment Access

This sample demonstrates VK_EXT_rasterization_order_attachment_access, which enables framebuffer attachment reads from one fragment to the next in rasterization order, without requiring explicit synchronization or subpass self-dependencies. Techniques like programmable blending become far more practical.

The sample pairs the extension with VK_KHR_dynamic_rendering and VK_KHR_dynamic_rendering_local_read to show how to implement framebuffer fetch with guaranteed fragment ordering using modern Vulkan patterns.

Explore the sample: https://github.com/KhronosGroup/Vulkan-Samples/tree/main/samples/extensions/rasterization_order_attachment_access

u/thekhronosgroup — 4 days ago
▲ 6 r/vulkan

Vulkan tutorial: why is src access null, but dst access write when transitioning image layout?

In Khronos' Vulkan tutorial when transitioning the image layout for rendering the triangle, there is a dst access mask which doesn't seem to do anything since the src access mask is null. As far as I understand memory barriers make src accesses performed in src stage visible to dst accesses in dst stage. Shouldn't both src and dst access be empty by that logic since we are just discarding the image?

The new How to Vulkan guide does the same thing, but it also ORs the write access with read access.


 // Before starting rendering, transition the swapchain image to vk::ImageLayout::eColorAttachmentOptimal
 transition_image_layout(
     imageIndex,
     vk::ImageLayout::eUndefined,
     vk::ImageLayout::eColorAttachmentOptimal,
     {},                                                        // srcAccessMask (no need to wait for previous operations)
     vk::AccessFlagBits2::eColorAttachmentWrite,                // dstAccessMask
     vk::PipelineStageFlagBits2::eColorAttachmentOutput,        // srcStage
     vk::PipelineStageFlagBits2::eColorAttachmentOutput         // dstStage
 );
u/Content_Economist132 — 4 days ago
▲ 28 r/vulkan

step 1: draw a triangle, step 2: vkCmdDrawIndexed the rest of the Bistro

I think I'm having fun. Nothing can describe the feeling of satisfaction after squashing that last bug. Suddenly the entire scene fully renders! (I know it's not yet PBR and ray tracing but those don't feel too far away now)

I'd been stuck for about a week hunting a bug where Bistro would look like a hellscape of windows to nowhere. I had to quadruple-check every single part of the geometry codebase and implement non-ktx2 texture loading so I could test simpler glTF files, only to have Sponza render only half of itself too. I woke up at 5am this morning with the solution fresh in my sleepy head: it turned out to be a faulty ordering of mesh primitives in my custom glTF loader, leading to duplicate vertex and index buffers on the wrong nodes. Five seconds later I had a full scene.

So today, filled with joy, I optimized stuff left and right, and within a couple hours the framerate on Bistro jumped from 8 to 80, not bad for a first pass, and without culling yet!

If I can give other beginners some fresh advice: keep using the raw Vulkan API for as long as possible instead of abstracting it away from the start. It might sound crazy at first, but I truly think it helped me design and architect my app around the features I actually know and control rather than around an API I still hadn't properly learned and used. And don't forget to sleep!

u/AwesomeDewey — 6 days ago
▲ 12 r/vulkan

Abstracting Vulkan

Good evening,

I'm playing around with Vulkan and got all the fun basic stuff up and running. Due to the complexity of Vulkan, I find it really hard to abstract it in a way that makes it more scalable and usable as an API, and my code gets confusing pretty fast. Do you guys have any examples or articles where I could look inside more professional abstractions and architectures?

Thanks in advance

u/Recent_Bug5691 — 7 days ago
▲ 20 r/vulkan+1 crossposts

How to design CommandBufferAllocator in modern graphics API (Vulkan, DirectX12)

Hi, I am building an academic RHI and writing posts about design decisions that I think might be helpful for people learning graphics. Readers might find answers to their questions, and I might reinforce my own knowledge by going over it again.

This post is about CommandBufferAllocator design. Posts about ring allocators, DescriptorStates, multi-API architecture design, and why we need all of this will follow soon.

Everything in my posts is what I personally learned through blood and sweat; I hope you find it useful. Don't forget to comment and share your thoughts.

The post is on my blog and on LinkedIn.

Post in blog
LinkedIn post
Mad-RHI repo (contains release manager, descriptor state, allocators etc)

u/F1oating — 7 days ago
▲ 24 r/vulkan+1 crossposts

First Game with my Vulkan Engine

A few months ago I decided to finally learn Vulkan properly.
At the same time I also realized that my older OpenGL engine written in C# had reached a point where I wanted to rethink a lot of architectural decisions from scratch.

So I started over and began writing a new engine in C++ with Vulkan.

This time I intentionally tried to avoid the “build everything first” trap.
Instead of creating a giant general-purpose engine with hundreds of features I may never use, I decided to build the engine alongside an actual game project: a small space simulation inspired by games like X4.

One thing I learned from previous projects is that maintaining your own engine only makes sense if the engine fits the game you actually want to create. So the current goal is not to compete with Unreal or Unity, but to build a focused codebase that solves the problems my game needs.

At the moment the engine already has:

  • Vulkan renderer with swapchain recreation
  • Dynamic rendering states
  • Offscreen rendering/framebuffers
  • Instanced rendering
  • Raycasting for scene interaction
  • Asset loading/management
  • Assimp model loading with shared mesh resources
  • Simple PBR-style renderer
  • Scene serialization
  • A behavior/component system
  • A small reflection/property system for editor integration
  • ImGui + ImGuizmo integration

One thing that became surprisingly important was tooling.

Originally I wanted the project to stay mostly code-focused, but after implementing scene serialization and behavior reflection, creating a small in-game world editor suddenly became very practical.

So instead of building a completely separate editor application, the editor is just another part of the game itself.

Behaviors expose editable properties, and the editor automatically builds UI controls from them.

For example a behavior can expose enums and editable values like this:

properties.push_back({
    .name = "Faction",
    .type = PropertyType::String,
    .hint = PropertyHint::Enum,
    .data = (void*)&m_faction,
    .metaData = factionMetaData
});

That allows the editor to generate controls dynamically while also making serialization straightforward.
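
As a rough illustration of how such a property list can drive both serialization and editor validation, here is a minimal sketch. The `Property` layout mirrors the fields in the snippet above, but everything else is an assumption: that `m_faction` is a `std::string`, that `metaData` holds the allowed enum values, and the helper names `serializeProperty` and `hasValidEnumValue` are hypothetical rather than taken from the engine.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class PropertyType { String, Float };
enum class PropertyHint { None, Enum };

// Hypothetical layout mirroring the fields used in the post's snippet.
struct Property {
    std::string name;
    PropertyType type;
    PropertyHint hint;
    void* data;                         // points at the behavior's member
    std::vector<std::string> metaData;  // assumed: allowed values for Enum hints
};

// Serializes one property as "name=value". An editor could switch on
// type and hint in the same way to decide which control to build.
std::string serializeProperty(const Property& property) {
    assert(property.data != nullptr);
    switch (property.type) {
        case PropertyType::String:
            return property.name + "=" + *static_cast<const std::string*>(property.data);
        case PropertyType::Float:
            return property.name + "=" +
                   std::to_string(*static_cast<const float*>(property.data));
    }
    return property.name + "=";
}

// Returns false only when an Enum-hinted string holds a value that is
// not listed in metaData; all other properties pass.
bool hasValidEnumValue(const Property& property) {
    if (property.type != PropertyType::String || property.hint != PropertyHint::Enum) {
        return true;
    }
    const std::string& value = *static_cast<const std::string*>(property.data);
    return std::find(property.metaData.begin(), property.metaData.end(), value) !=
           property.metaData.end();
}
```

Because the property only stores a pointer, the editor and the serializer always see the current value of the member without any copying. The cost is that the `void*` cast is only as safe as the `type` tag is accurate.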

Right now I’m at the point where I can finally spend more time building actual gameplay systems instead of only rendering tech.
The latest addition was a station behavior system for interactable space stations and sector-based world loading.

There’s still a lot missing of course:

  • frustum culling
  • better streaming
  • animation improvements
  • more gameplay systems
  • proper UI later on

But for the first time the project feels less like “graphics experiments” and more like an actual game slowly coming together.

I’d honestly be interested in how other engine/gameplay programmers handle the balance between:

  • custom engine tech
  • tooling
  • and actually shipping gameplay.

Also here are some pictures:

https://preview.redd.it/6e04z2u1441h1.png?width=1922&format=png&auto=webp&s=5691776595d50c48091698623d265729cb56a88f

https://preview.redd.it/hye00gy3441h1.png?width=1922&format=png&auto=webp&s=9aeb2f4e6fa651764d9b000972965caf73a8d386

https://preview.redd.it/5rd0whia441h1.png?width=1922&format=png&auto=webp&s=bd01099060af8a043b8808db1e5e93e8979ffad6

Also, here is the source code of the engine itself, if you want to have a look: Andy16823/GFXEngine. It would be nice to get some feedback.

u/Tiraqt — 8 days ago
▲ 5 r/vulkan

On my NVIDIA MX150, my triangle draws (yay!). But on Intel UHD 630, it doesn't. How do I work out what's not working?

So, long story short, but I'm finally at the point where I'm ready to draw a triangle. And I can, at least when I use my NVIDIA GPU (I have an Optimus laptop). But when I use the Intel GPU instead, the triangle doesn't draw.

In both cases, the background clears to the selected colour correctly. It's just that the triangle doesn't appear when using the Intel GPU.

I've tried RenderDoc - although I don't really know much about it - and everything seems identical in what it records between runs of the program with different GPUs selected - except for one thing.

On NVIDIA, the Mesh Viewer looks like this: https://i.ibb.co/jvpttBfM/image.png (hopefully you can see the wireframe triangle).

But on Intel, it looks like this: https://i.ibb.co/qM0pKvvj/image.png

I've got no idea what it means that the triangle is filled red, or what the blue square on the top vertex means.

I don't expect anyone to diagnose the exact problem - although that would be great if you happen to know - but can anyone help me with how I can investigate further? I don't really know what to try next, or what else I can do in RenderDoc to figure it out.

My shaders are simple, by the way - hardcoded vertices in the vertex shader, and returning a constant colour from the fragment shader. I haven't got as far as vertex buffers or attributes yet.

u/wonkey_monkey — 8 days ago
▲ 190 r/vulkan+2 crossposts

Vulkan engine in one year

A year ago, I began working through the Vulkan tutorial with the intention of building a graphics engine for my solo MMO project. I knew that building a graphics engine could be a slippery slope and that I might never start working on the actual game, so I set myself a deadline: one year. This post describes the tech I have built over this year.

The renderer was implemented in C++ using Vulkan 1.3. I used Vulkan-HPP bindings (to make the code idiomatic C++) and the Vulkan Memory Allocator, VMA (to simplify GPU memory management). For input and output, I used SDL3. Procedural mesh and scene generation were implemented in Go.

My build system of choice is Bazel. I repackaged all the Vulkan libraries and SDL3 itself for Bazel, which allowed me to build single-binary, statically linked executables with all unused functions tree-shaken out. I do my development on Linux and occasionally test on Windows. In theory, macOS is also possible, but I have left it out for now. I have a working Bazel build of SDL for macOS, so bolting on MoltenVK should not be a problem in the future. Perhaps macOS will support Vulkan natively before I even get to that.

To generate meshes, I built a DSL that allowed me, as a programmer, to model 3D objects through code. It is similar to OpenSCAD but operates in more artistic terms than in points and vectors. I can express a 3D model as a sequence of commands: create a cube, select the right face, extrude it, rotate, subdivide, attach sub-objects, etc. The parser creates an AST in the form of a Protobuf message that I can serialize and tweak procedurally before passing it to the modeler. The modeler outputs a number of meshes per shader and a recipe on how to assemble the final object from smaller sub-objects and transformations. Some transformations are static, while others are tied to named parameters and enable object animation. All metadata is stored in two formats: one for the server (to understand the semantics of each transformation, for example) and one for the renderer (to upload to the GPU).

The generated meshes then go through the baker. I used the open-source GPU baker "Fornos," but I had to strip out all Windows-specific code, convert it to Vulkan, and make it headless to run on my server. With the help of the baker, I generate base color, normal maps, metalness, roughness, and ambient occlusion textures.

Once models are generated, they are serialized and stored on disk as Protobuf messages.

The scene generator creates the environment as multiple voxel grids. At the highest level, it generates large voxels, each corresponding to a whole room, a tunnel, huge pillars under the building, etc. In principle, I could use wave-function collapse at this level, but it was left out of the prototype. Level generation will be handled separately during actual game development.

Once the coarse grid is ready, a second generation pass kicks in. It creates a fine-grained voxel grid describing where the scene will have concrete, air, windows, etc. The core algorithm is straightforward: if a voxel has contact with both the interior and exterior simultaneously, it becomes a structure. The floor/ceiling boundary checks the "room index" stored in the coarse grid alongside the voxel type. If the voxel below belongs to room X and the voxel above to room Y, the current voxel is turned into a structure as well.
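
The interior/exterior contact rule can be sketched as a neighbour test. This is an illustration only, written in C++ although the post says the generator itself is written in Go. The `VoxelGrid` layout, the six-neighbour definition of "contact", and treating out-of-grid cells as exterior are all assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Voxel { Exterior, Interior, Structure };

// Dense voxel grid; the layout and names are assumptions of this sketch.
struct VoxelGrid {
    int nx, ny, nz;
    std::vector<Voxel> cells;  // index = x + nx * (y + ny * z)

    Voxel at(int x, int y, int z) const {
        assert(cells.size() == static_cast<std::size_t>(nx) * ny * nz);
        // Anything outside the grid counts as exterior (assumption).
        if (x < 0 || y < 0 || z < 0 || x >= nx || y >= ny || z >= nz) {
            return Voxel::Exterior;
        }
        return cells[static_cast<std::size_t>(x + nx * (y + ny * z))];
    }
};

// True when the six face neighbours include both an interior and an
// exterior voxel, i.e. the voxel sits on the boundary between the two.
bool touchesInteriorAndExterior(const VoxelGrid& grid, int x, int y, int z) {
    static constexpr int kOffsets[6][3] = {
        {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}};
    bool interior = false;
    bool exterior = false;
    for (const auto& offset : kOffsets) {
        const Voxel neighbour = grid.at(x + offset[0], y + offset[1], z + offset[2]);
        interior = interior || neighbour == Voxel::Interior;
        exterior = exterior || neighbour == Voxel::Exterior;
    }
    return interior && exterior;
}
```

A generation pass would run this test over a read-only copy of the grid and write `Voxel::Structure` into a second grid, so that voxels converted earlier in the pass do not change the result for their neighbours.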

The third pass “cuts out” windows. It looks at large vertical slabs of concrete, decides how many windows to make, and replaces structural voxels with window voxels where necessary.

Next, the generator iterates over all voxels to recognize small patterns (e.g., if there is concrete on the left and a window on the right, it adds a window frame object at that position oriented accordingly). In some places, it creates corner meshes for window frames. In others, it adds exterior decorations to the building. The pattern-matching engine is very generic; a predicate describes what to check around the current position, and an action performs the task, such as adding an object.
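
A predicate/action rule engine of this kind might look roughly like the sketch below. It is written in C++ for illustration (the post's generator is in Go), it scans a 1D row instead of a 3D grid to stay short, and the names `Rule`, `applyRules` and `makeWindowFrameRule` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

enum class Cell { Air, Concrete, Window };

struct Placement {
    std::size_t position;
    std::string object;
};

// A rule pairs a predicate (what to check around a position) with an
// action (what to do when it matches). Names are hypothetical.
struct Rule {
    std::function<bool(const std::vector<Cell>&, std::size_t)> predicate;
    std::function<void(std::size_t, std::vector<Placement>&)> action;
};

// Runs every rule at every position of a 1D row of voxels. The real
// generator works on a 3D grid; one dimension keeps the sketch short.
std::vector<Placement> applyRules(const std::vector<Cell>& row,
                                  const std::vector<Rule>& rules) {
    std::vector<Placement> placements;
    for (std::size_t x = 0; x < row.size(); ++x) {
        for (const Rule& rule : rules) {
            assert(rule.predicate && rule.action);
            if (rule.predicate(row, x)) {
                rule.action(x, placements);
            }
        }
    }
    return placements;
}

// Example rule (this sketch's reading of the window-frame case): a window
// voxel with concrete directly to its left gets a frame object.
Rule makeWindowFrameRule() {
    return Rule{
        [](const std::vector<Cell>& row, std::size_t x) {
            return x > 0 && row[x] == Cell::Window && row[x - 1] == Cell::Concrete;
        },
        [](std::size_t x, std::vector<Placement>& out) {
            out.push_back({x, "window_frame_left"});
        }};
}
```

Keeping predicates and actions as separate callables means new decorations only need a new rule, not a change to the scanning loop.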

The final stage identifies all continuous planes of concrete and, depending on their orientation and the surrounding voxels, creates scene-level geometry such as walls and floors. The exported scene contains meshes ready to be uploaded to the GPU with minimal preprocessing.

The scene is serialized and stored on disk as a Protobuf message again. A lot of things are stored as protobufs. It’s a pretty compact binary format that is easily accessible from C++ and Go.

Finally, we come to the Vulkan renderer. I use Vulkan 1.3 with dynamic rendering and bindless textures.

I experimented with deferred rendering, but the lack of MSAA was a significant downside. I tried implementing anti-aliasing with TAA but couldn't achieve decent results; it was either too blurry or showed ghosting. Ultimately, I settled on Forward+ rendering and 8x MSAA. The screen is split into small 3D boxes in (x, y, depth) space. For each box, a light clustering shader computes which lights and reflection probes affect it, storing this information in a buffer.

A depth pre-pass fills the depth buffer to reduce the number of fragment shader calls later. The main rendering pass samples the light clustering buffer and applies the lights and reflection probes from the generated lists. Window glass is rendered using weight-based order-independent transparency. It’s very cheap and produces decent results. The shader for large flat surfaces, such as walls and roofs, uses the Hextile algorithm to eliminate repetitive tiling patterns.

Global illumination uses a hierarchy of 3D voxel buffers of increasing size, similar to the technique used in Enlisted. Simplified scene geometry is rasterized to a 3D scene buffer, then a compute shader traces rays from each non-empty voxel to each light to compute voxel illuminance. Finally, another compute shader handles the most expensive part: tracing rays in all directions for each voxel in the scene to update six-sided irradiance. I have not yet implemented a 3D ring buffer to update GI incrementally as the camera moves; this will be done when converting this prototype into a real game client. It should be straightforward.

When sampling reflection probes and GI voxels, I had to fight light leaks through thin walls. If an irradiance voxel ends up right on the wall, it will sense light from both sides, so sampling it will produce incorrect results: light from the left will be visible on the right and vice versa. I found a pretty simple solution. At scene generation time, I insert “sampler repellents” into each large wall. Each repellent is a thin rectangle that repeats the shape of the wall. Before sampling the voxel grid, I look up the repellents in the vicinity of the point, and if one is too close, I shift the sampling position along the normal. To make this efficient, I use an approach similar to clustered lighting: for each (x, y, depth) box I precompute a list of sampling repellents that can affect that box. However, I think in the production version of the engine I’ll probably scrap this clustered approach and implement a simple global BVH lookup.
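
The shift itself can be sketched on the CPU as follows. This is not the engine's shader code. It assumes each repellent can be approximated by an infinite plane (the post describes thin rectangles), that the shift follows the shaded surface's normal, and that `minDistance` is a tunable threshold. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 {
    float x, y, z;
};

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Repellent approximated as an infinite plane (assumption of this sketch).
struct Repellent {
    Vec3 point;   // any point on the wall plane
    Vec3 normal;  // unit-length plane normal
};

// Moves the sampling position along the surface normal whenever it lies
// closer than minDistance to a nearby repellent plane.
Vec3 adjustSamplePoint(Vec3 samplePoint, Vec3 surfaceNormal,
                       const std::vector<Repellent>& nearby, float minDistance) {
    assert(minDistance >= 0.0f);
    Vec3 adjusted = samplePoint;
    for (const Repellent& repellent : nearby) {
        const float distance = std::fabs(dot(adjusted - repellent.point, repellent.normal));
        if (distance < minDistance) {
            adjusted = adjusted + surfaceNormal * (minDistance - distance);
        }
    }
    return adjusted;
}
```

In a clustered setup, `nearby` would be the precomputed repellent list of the (x, y, depth) box containing the shaded point.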

For the UI, I used RmlUi and implemented a custom rendering interface on top of my engine. My original implementation suffered from multi-millisecond latencies for complex UIs because every rectangle and line of text was rendered with a separate draw call. I couldn't batch them together because those calls were a) order-dependent and b) each used a different texture. Eventually, I discovered bindless textures in Vulkan and fell in love with them. Instead of issuing separate draw calls, I now bind all textures as one large array, add the texture index as an instance attribute, and add draw commands to a single draw command buffer. Now the entire UI is rendered with a single draw call without any texture rebinding between draws, and it takes 0.1 ms, regardless of the UI complexity.
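
The CPU side of this batching can be sketched without any Vulkan calls. The struct layouts and the name `buildUiBatch` are assumptions for illustration. In a real renderer `textures` would be written into a descriptor array and `instances` uploaded as per-instance vertex data.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// One UI rectangle as submitted by the UI library, in draw order.
struct UiQuad {
    float x, y, width, height;
    std::uint32_t textureId;  // engine-side texture handle (hypothetical)
};

// Per-instance data: the rectangle plus the slot of its texture in the
// bindless texture array.
struct UiInstance {
    float rect[4];
    std::uint32_t textureIndex;
};

struct UiBatch {
    std::vector<std::uint32_t> textures;  // slot -> texture handle
    std::vector<UiInstance> instances;    // kept in submission order
};

// Collapses an ordered list of quads into one instance buffer. Order is
// preserved, so order-dependent blending still works with a single draw.
UiBatch buildUiBatch(const std::vector<UiQuad>& quads) {
    UiBatch batch;
    std::unordered_map<std::uint32_t, std::uint32_t> slotOf;
    for (const UiQuad& quad : quads) {
        const auto nextSlot = static_cast<std::uint32_t>(batch.textures.size());
        auto [it, inserted] = slotOf.try_emplace(quad.textureId, nextSlot);
        if (inserted) {
            batch.textures.push_back(quad.textureId);
        }
        batch.instances.push_back({{quad.x, quad.y, quad.width, quad.height}, it->second});
    }
    assert(batch.instances.size() == quads.size());
    return batch;
}
```

Each texture appears once in the array no matter how many quads use it, and the shader picks the right one per instance through `textureIndex`.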

The final step does tonemapping, color grading, and compositing. I used a piecewise filmic curve and exposed all settings in a custom debug panel. The LUT is precomputed on the CPU side and sampled in the single compositing shader.

I attached a bunch of screenshots from the last version of the engine to the post.

u/0xSYNAPTOR — 10 days ago
▲ 75 r/vulkan

Life after vulkan-tutorial.com

Just recently stumbled upon this gem from u/SaschaWillems:
https://www.howtovulkan.com/

This is really a game changer for me. I always struggled with multiple textures until I got to know descriptor set indexing. I recently switched from RenderPasses/FrameBuffers to dynamic rendering, and I will never look back. Now a naive render graph implementation is a no-brainer. With that, I can dare to think of implementing more complicated effects instead of fighting an uphill battle with my own code. Thanks for this!

u/inactu — 10 days ago
▲ 7 r/vulkan

Pipeline (Image) Barrier for both layout and queue families

I have a Compute pipeline and a Graphics pipeline that access a VkImage that is created with VK_SHARING_MODE_EXCLUSIVE.

As I was working on the acquire-release mechanism, I've encountered, with the help of Khronos Validation Layer, something I was not expecting:

I have to make two separate vkCmdPipelineBarrier() calls in both the acquire and release phases: one for srcQueueFamilyIndex / dstQueueFamilyIndex and another for oldLayout / newLayout.

I was under the impression that a single VkImageMemoryBarrier would suffice, but to satisfy the Khronos Validation Layer I had to perform family index and layout transitions in the release phase as two separate calls and likewise in the acquire counterpart.

Am I misinterpreting something here?

u/Tensorizer — 8 days ago
▲ 42 r/vulkan+2 crossposts

Little update from Trinity Engine

After more than a month of rewriting my entire renderer, I finally have something to show again. It's not much, just some basic texture loading, but the engine's coming along nicely now.

u/ThatTanishqTak — 10 days ago
▲ 47 r/vulkan

🚀Vulkan SDK 1.4.350.0 is now available!

Highlights:

• Major KosmicKrisp upgrades on macOS (Tessellation shaders + 7 new extensions supported)

• 16 new extensions across all platforms

• GFXReconstruct now ARM64X on Windows on ARM

• Alpha release of GPU Dump tool

• Improved validation coverage

A useful maintenance release that rewards staying current. Full blog + download 👉 https://www.lunarg.com/lunarg-releases-vulkan-sdk-1-4-350-0/

u/LunarGInc — 10 days ago
▲ 13 r/vulkan

Disabling stencil testing with a depth+stencil buffer doesn’t work on AMD/Windows

I'm currently playing with the stencil test in the sample projects for my RHI.

I have a deferred renderer that enables the stencil test for the depth prepass, the g-buffer pass, the skybox pass and the lighting pass.

For the OIT pass, the stencil test is disabled with VkPipelineDepthStencilStateCreateInfo.stencilTestEnable=false (VK_DYNAMIC_STATE_STENCIL_TEST_ENABLE is not enabled in the dynamic states), but the depth/stencil buffer is the same as in the other passes (third image, layout VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL).

On NVIDIA GPUs, the stencil test is correctly disabled in the OIT pass on both Windows and Linux.

On AMD GPUs, the stencil test is correctly disabled on Linux (second image, OIT revealage buffer) but not on Windows (first image).

RenderDoc shows no stencil state on the OIT pass (fourth image), which is correct since the stencil test is disabled, but we can see in the output buffers that the stencil is applied.

Did I miss something in the pipeline config parameters?

This is the pipeline creation code: https://github.com/HenriMichelon/vireo_rhi/blob/main/src/vulkan/VKPipelines.cpp#L255

UPDATE: I tried on a computer with an integrated AMD GPU (my AMD development computer has an RX5700) and it works correctly (both computers have the same driver versions).

u/Kakod123 — 9 days ago
▲ 5 r/vulkan

Clustered Forward with/without compute shader

I've implemented clustered forward shading both on cpu and gpu with a compute shader, based on this guide https://www.aortiz.me/2018/12/21/CG.html.

When using the GPU implementation I see a small drop in fps (10-20 fps), which goes a bit against my expectations, since I thought a compute shader was best suited for this task.

What I'm doing is the following:

  • use a single queue for graphics/compute tasks (no async compute)
  • create an SSBO for light clusters (one per frame in flight) at startup, recreate it when the window resizes
  • for each frame:
    • dispatch the clustered forward compute shader (writes to the clusters SSBO)
    • insert a buffer barrier on the light cluster SSBO of the current frame
    • graphics draw commands (read from the clusters SSBO)

For the CPU implementation I used OpenMP to speed things up; my computer has 12 logical threads. I experimented with up to 500 dynamic point lights.
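
For reference, the CPU-side light assignment boils down to a sphere-versus-box test per cluster and light. The sketch below is an assumption-laden illustration, not the poster's code: clusters are taken to be axis-aligned boxes in the same space as the light positions, and the names `lightTouchesCluster` and `assignLights` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Aabb {
    float min[3];
    float max[3];
};

struct PointLight {
    float position[3];
    float radius;
};

// Sphere-vs-AABB test: clamp the light centre to the box and compare the
// squared distance to the squared radius.
bool lightTouchesCluster(const PointLight& light, const Aabb& box) {
    assert(light.radius >= 0.0f);
    float distanceSquared = 0.0f;
    for (int axis = 0; axis < 3; ++axis) {
        const float closest = std::clamp(light.position[axis], box.min[axis], box.max[axis]);
        const float delta = light.position[axis] - closest;
        distanceSquared += delta * delta;
    }
    return distanceSquared <= light.radius * light.radius;
}

// Builds one light index list per cluster. Each outer iteration writes
// only to its own list, so this is the loop a CPU version could
// parallelise across threads.
std::vector<std::vector<std::uint32_t>> assignLights(const std::vector<Aabb>& clusters,
                                                     const std::vector<PointLight>& lights) {
    std::vector<std::vector<std::uint32_t>> lightLists(clusters.size());
    for (std::size_t c = 0; c < clusters.size(); ++c) {
        for (std::size_t l = 0; l < lights.size(); ++l) {
            if (lightTouchesCluster(lights[l], clusters[c])) {
                lightLists[c].push_back(static_cast<std::uint32_t>(l));
            }
        }
    }
    return lightLists;
}
```

The work is clusters × lights in both the CPU and GPU versions, so with a few hundred lights the gap between them can come down to dispatch and barrier overhead rather than the test itself.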

Do you think this performance gap is OK, or did I mess something up in my shader implementation? What's your experience using clustered forward shading?

u/giomatfois — 10 days ago