They are usually larger images (eg, the main menu graphic) and thus make
a mess of the atlas (thus, making them separate means a smaller atlas
can be used). All sorts of things are in a mess (descriptor management,
subpic rendering not supported, wrong alpha value for the transparent
pixel), but this gets the basic loader going.
This just takes advantage of the dynamic verts for doing subpics. It's
not really the most optimal code as it has to write both the vertices
(64 bytes per quad) and the instances (24 bytes per quad), but that's
still better than the old 128 bytes per quad (and having a single
pipeline is nice).
The problem was that I had mixed up the purpose of the per-frame vertex
buffers and used them for the core quad data when they were meant for
subpic and the like, and forgotten about the static vertex buffer.
This gets at least conchars working (pic in general not tested yet).
Any performance gains will be utterly swamped by the deferred renderer,
but it will allow better control of quad render order by any client
code (and should be slightly better for simpler renderers when I get
support for them working).
Right now, plenty is broken (much of the higher level draw functions are
disabled, and pics don't render correctly), but this gets at least the
basics in so I'm not bouncing diffs around as much.
It turns out the slice pipeline is compatible with the glyph pipeline in
that its vertex attribute data is a superset (just the addition of the
offset attributes). While the queues have yet to be merged, this will
eventually get glyphs, sliced sprites, and general (static) quads into
the one pipeline. Although this is slightly slower for glyph rendering
(due to the need to pass an extra 8 bytes per glyph), this should be
faster for quad rendering (when done) as it will be 24 bytes per quad
instead of 32 bytes per vertex (ie, 128 bytes per quad), but this does
serve as a proof of concept for doing quads, glyphs and sprites in the
one pipeline.
The main reason I had created in the first place was I hadn't thought of
using image view swizzles to handle coverage-alpha textures (for
monochrome glyphs), and for whatever reason also had the texture in a
different binding slot to the twod fragment shader. With both issues out
of the way, there's no reason to have an almost identical (just some
naming) shader just for glyphs.
With an eye towards merging the 2d pipelines as much as possible, I
found that the glyph and basic 2d quad texture descriptors were in
different slots for no reason I can think of. Having them in the same
slot would mean I could use the same fragment shader for all 2d
pipelines (though the plan is to get it down to two: (sliced) quads and
lines).
I hadn't noticed the problem until playing with early fragment tests for
the sprite fragment shaders, but passing data that expects triangle
strips to a pipeline that expects triangle lists doesn't work too well
when drawing quads.
Marking them as cached means that they'll be "uncached" instead of
destroyed when freed, which would not be a particularly good thing. I
have no memory as to how I found this as I found the change in my git
stash.
It seems that the mouse escaping the barriers requires some combination
of hitting two at once, and holding your mouth just right (something
about sliding the mouse up and down one barrier near the other).
However, sending the mouse back to the center of the screen when it
touches a barrier makes such sliding impossible.
This seems to fix#38
I obviously need a better way to test legacy code because the fix for
unsigned-int behavior with clang broke mouse warping when using
XGrabPointer instead of XInput2's XIGrabEnter.
While Draw_Glyph does draw only one glyph at a time, it doesn't shape
the text every time, so is a major win for performance (especially
coupled with pre-shaped text).
And add a function to process a passage into a set of views with glyphs.
The views can be flowed: they have flow gravity and their sizes set to
contain all the glyphs within each view (nominally, words). Nothing is
tested yet, and font rendering is currently broken completely.
Font and text handling is very much part of user interface and at least
partially independent of rendering, but does fit it better with GUI than
genera UI (ie, both graphics and text mode), thus libQFgui as well as
libQFui are built in the ui directory.
The existing font related builtins have been moved into the ruamoko
client library.
I had done the loader for the GPU renderers, so the CPU renderer didn't
draw the characters transparently. Fixes the pink block in my ruamoko
test scene (due to the notify text area).
While it doesn't really make any difference to the texture upload (8-bit
is 8-bit), and the sampler is in control of the interpretation, this
makes vulkan more consistent with the specification of the glyph
texture.
In theory, it supports all the non-palette formats, but only luminance
and alpha (tex_l and tex_a) have been tested. Fixes the rather broken
glyph rendering.
Thanks to the 3d frame buffer output being separate from the swap chain,
it's possible to have a different frame buffer size from the window
size, allowing for a smaller buffer and thus my laptop can cope (mostly)
with the vulkan renderer.
The escape was actually harmless as the buffers would not be read due to
the particle count being 0 (thus why the buffers were at the end of the
staging buffer: no space was allocated for them, only for the system
buffer, but their offsets were just past the system buffer). However,
the validation layers quite rightly did not like that. Thus, the two
buffers are pointed to the system buffer so all three descriptors are
always valid.
Where too far is 1024 units as that is the maximum supported, or the
radius. The change to using unsigned for the distances meant the simple
checks missed the effective max dist going negative, thus leading to a
segfault.
I had debated putting the blending in the compose subpass or a separate
pass but went with the separate pass originally, but it turns out that
removing the separate pass gains 1-3% (5-15/545 fps in a timedemo of
demo1).
It's a bit flaky for particles, especially at higher frame rates, but
that's due to supporting only 64 overlapping pixels. A reasonable
solution is probably switching to a priority heap for the "sort" and
upping the limit.
This required making the texture set accessible to the vertex shader
(instead of using a dedicated palette set), which I don't particularly
like, but I don't feel like dealing with the texture code's hard-coded
use of the texture set. QF style particles need something mostly for the
smoke puffs as they expect a texture.
It doesn't want to work on my nvidia (or more recent sid?) and doesn't
seem to be necessary. The problem may be multiple event sets before the
first wait, but investigation can wait for now.
This is probably the biggest reason I had problems with particles not
updating correctly: the descriptors were generally point pointing to
where the data actually was in the staging buffer.
I don't yet know whether they actually work (not rendering yet), but the
system isn't locking up, and shutdown is clean, so at least resources
are handled correctly.
Although it works as intended (tested via hacking), it's not hooked up
as the current frame buffer handling in r_screen is not readily
compatible with how vulkan output is handled. This will need some
thought to get working.
This splits up render pass creation so that the creation of the various
resources can be tailored to the needs of the actual render pass
sub-system. In addition, it gets window resizing mostly working (just
some problems with incorrect rendering).
It turns out the semaphore used for vkAcquireNextImageKHR may be left in
a signaled state for VK_ERROR_OUT_OF_DATE_KHR. While it seems to be
possible to clear the semaphore using an empty queue submission,
destroying and recreating the semaphore works well.
Still have problems with the frame buffer after window resize, though.
Swap chain acquisition is part of final output handling. However, as the
correct frame buffers are required for the render passes, the
acquisition needs to be performed during the preoutput render pass.
Window resize is still broken, but this is a big step towards fixing it.
This is the minimum maximum count for sampled images, and with layered
shadow maps (with a minimum of 2048 layers supported), that's really way
more than enough.
I guess nvidia gives a non-srgb format as the first in the list, but my
laptop gives an srgb format first, thus the unexpected difference in
rendering brightness. Hard-coding BGRA isn't any better, but it will do
for now.
Things are a bit of a mess with interdependence between sub-module
initialization and render pass initialization, and window resizing is
broken, but the main render pass rendering to an image that is then
post-processed (currently just blitted) is working. This will make it
possible to implement fisheye and water warp (and other effects, of
course).
When working, this will handle the output to the swap-chain images and
any final post-processing effects (gamma correction, screen scaling,
etc). However, currently the screen is just black because the image
for getting the main render pass output isn't hooked up yet.
Now each (high level) render pass can have its own frame buffer. The
current goal is to get the final output render pass to just transfer the
composed output to the swap chain image, potentially with scaling (my
laptop might be able to cope).
While the HUD and status bar don't cut out a lot of screen (normally),
they might start to make a difference when I get transparency working
properly. The main thing is this is a step towards pulling the 2d
rendering into another render pass so the main deferred pass is
world-only.
Using swizzles in an image view allows the same shader to be used with
different image "types" (eg, color vs coverage).
Of course, this needed to abandon QFV_CreateImageView, but that is
likely for the best.
It turns out that nearest filtering doesn't need any offsets to avoid
texel leaks so long as the screen isn't also offset. With this, the 2d
rendering looks good at any scale (minus the inherent blockiness).
It seemed like a good idea at the time, but it exacerbates pixel leakage
in atlas textures that have no border pixels (even in nearest sampling
modes).
The rest of the system won't add one automatically (since entity
creation no longer does), but the alias and iqm rendering code expect
there to be one. Fixes a segfault when starting a scene (demo etc).
There's no API yet as I need to look into the handling of qpic_t before
I can get any of this into the other renderers (or even vulkan, for that
matter).
However, the current design for slice rendering is based on glyphs (ie,
using instances and vertex pulling), with 3 strips of 3 quads, 16 verts,
and 26 indices (2 reset). Hacky testing seems to work, but real tests
need the API.
I don't know why it didn't happen during the demo loop, but going from
the start map to e1m1 caused a segfault due to the efrags for a lava
ball getting double freed (however, I do think it might be because the
ball passed through at least two leafs, while entities in the demos did
not). The double free was because SCR_NewScene (indirectly) freed all
the efrags without removing them from entities, and then the client code
deleting the entities caused the visibility components to get deleted
and thus the efrags freed a second time. Using ECS_RemoveEntities on the
visibility component ensures the entities don't have a visibility
component to remove when they are later deleted.
It's currently used only by the vulkan renderer, as it's the only
renderer that can make good use of it for alias models, but now vulkan
show shirt/pants colors (finally).
This cuts down on the memory requirements for skins by 25%, and
simplifies the shader a bit more, too. While at it, I made alias skins
nominally compatible with bsp textures: layer 0 is color, 1 is emissive,
and 2 is the color map (emissive was on 3).
As the RGB curves for many of the color rows are not linearly related,
my idea of scaling the brightest color in the row just didn't work.
Using a masked palette lookup works much better as it allows any curves.
Also, because the palette is uploaded as a grid and the coordinates are
calculated on the CPU, the system is extendable beyond 8-bit palettes.
This isn't quite complete as the top and bottom colors are still in
separate layers but their indices and masks can fit in just one, but
this requires reworking the texture setup (for another commit).
For whatever reason, I had added an extra 4 bytes to the fragment
shader's push-constants. It took me a while to figure out why renderdoc
wouldn't stop complaining about me not writing enough data.
It turns out my approach to alias skin coloring just doesn't work for
the quake data due to the color curves not having a linear relationship,
especially the bottom colors.
It works on only one layer and one mip, and assumes the provided texture
data is compatible with the image, but does support sub-image updates
(x, y location as parameters, width and height in the texture data).
The bright end of the color map is actually twice the palette value, but
I didn't understand this when I came up with the shirt/pants color
scheme for vulkan. However, the skin texture can store only 0..1, so the
mapping to 0..2 needs to be done in the shader. It looks like it works
at least better: the gold key at the end of demo1 doesn't look as bleh,
though I do get some weird colors still on ogres etc.
Currently only for gl/glsl/vulkan. However, rather than futzing with
con_width and con_height (and trying to guess good values), con_scale
(currently an integer) gives consistent pixel scaling regardless of
window size.
Well, sort of: it's still really in the renderer, but now calling
R_AddEfrags automatically updates the visibility structure as necessary,
and deleting an entity cleans up the efrags automatically. I wanted this
over twenty years ago.