And clean up the resulting errors. While some were tricky, there weren't
all that many: just some attachment issues and the multi-stage image
copy for scraps.
Fixing scraps required a barrier between copies. It might be overkill,
but a transfer_dst to transfer_dst image barrier worked.
Fixing attachments was a bit trickier:
- depth needed early and late fragment tests to be treated as one stage
- all attachments that were read later needed storeOp = none (using the
extension)
- and then finalLayout needed to be correct to avoid ghost transitions
- as well, for some reason the deffered gbuffer subpass needed a depth
dependency on the translucent pass even though neither one writes to
the depth attachment (possibly a validation bug, needs more
investigation).
I guess it's kind of UB, but it's handy for images that will be
conditionally written by the GPU but need to be in shader-read-only for
draw calls and the validation layers can't tell that the layers won't be
used.
It's a bit flaky for particles, especially at higher frame rates, but
that's due to supporting only 64 overlapping pixels. A reasonable
solution is probably switching to a priority heap for the "sort" and
upping the limit.
I got tired of writing the same 13 or so lines of code over and over (it
actually put me off experimenting with Vulkan). Thus...
QFV_PacketCopyBuffer does the work of handling barriers and a (full
packet) copy from the staging buffer to a GPU buffer.
QFV_PacketCopyImage does a similar job, but for images. However, it
still needs a lot of work, but it does make getting a basic texture onto
the GPU much less of a hassle.
Both functions should make staging data much less error-prone.
Mostly, this gets the stage flags in with the barrier, but also adds a
couple more barrier templates. It should make for slightly less verbose
code, and one less opportunity for error (mismatched barrier/stages).