It's a tad bogus as it's the lights close to the camera, but it should
at least be a good start once things are working. There's currently
something very wrong with the state of things.
This makes tex_t more generally useable and probably more portable. The
goal was to be able to use tex_t with data that is in a separate chunk
of memory.
The sky texture is loaded with black's alpha set to 0. While this does
hit both layers, the screen is cleared to black so it shouldn't be a
problem (and will allow having a skybox behind the sheets).
Glow map and sky sheet and cube need to wait until I can get some
default textures going, but the world is rendering correctly otherwise
(though a tad dark: need to do a gamma setting).
It now uses the ring buffer code I wrote for qwaq (and forgot about,
oops) to handle the packets themselves, and the logic for allocating and
freeing space from the buffer is a bit simpler and seems to be more
reliable. The automated test is a bit of a joke now, though, but coming
up with good tests for it... However, nq now cycles through the demos
without obvious issue under the same conditions that caused the light
map update code to segfault.
Needed to use an rgba format to use floats (and optimal layout), but
having to set the alpha to 1 even for full-dark luxels is not very
efficient. Better to just ignore the alpha in the shader. Fixes the
occasional transparent surface in shadowed areas.
Many surfaces are missing (I suspect it's due to transform stage
management in the index emitter), and currently only the light maps are
rendered (still not binding the correct textures), but the basics are
working.
Vulkan validation (quite rightly) doesn't like it when the flush range
goes past the end of the buffer, but also doesn't like it when the flush
range isn't cache-line aligned, so align the size of the buffer, too.
Copying data from the wrong buffer was the cause of the corrupted brush
model vertices, and then lots of little errors (mostly forgetting to
multiply by bpp) for textures.
I had originally planned on mixing the stage management with general
texture support code like I did in glsl, but I think that was a mistake
and I did keep looking for scrap.[ch] when I wanted to edit something to
do with the scrap...
There's still a problem with the vertex data itself not getting sent to
the GPU properly, but vulkan is now happy with my tiny test map (which
required disabling skies entirely until I get null textures working).
This cleans up texture_t and possibly even improves locality of
reference when running through texture chains (not profiled, and not
actually the goal).
It optionally generates mipmaps, and supports the main texture types
(especially for texture packs), including palettes, but is otherwise
rather unsophisticated code. Needs a lot of work, but testing first.
This is more correct as the environment (X11 etc) might provide more
swapchain images than we want: 3 frames in flight is generally
considered a good balance between saturating the hardware and latency.
Cleans up global space and makes it usable in multiple contexts. Also,
max quads dropped to 32k as each frame now has its own vertex buffer to
avoid issues with vertex overwrites (which I have seen). However, all
vertex buffers are in the one memory/buffer object (using offsets) and
the index buffer has been moved into a device-local memory object.
I think I did two as a bit of a ring buffer, but the new ring buffer
system used inside a staging buffer makes it less necessary. Also, the
staging buffer is now a fair bit bigger (4M is probably not really
enough)
This allows the array in which the command buffers are allocated to be
allocated on the stack using alloca and thus remove the need to
malloc/free of relatively small chunks.
The console background is missing, and scaled vs unscaled (currently
always scaled) 2d, but otherwise everything seems to work. Lots of
places to clean up, though.
Draw now has its own staging buffer to use with its scrap. Also, a few
fixes were needed for the staging buffer and scrap flush routines.
Other than some synchronization issues with draw scrap flushing
(currently worked around with a fence-wait) things seem to be working
nicely.
The scrap texture did very good things for the glsl renderer and the
better control over data copying might help it do even better things for
vulkan, especially with lots of little icons.
It's never actually used (the texture can be fetched using
GLSL_ScrapTexture) and gets in the way of sharing the scrap system with
the vulkan renderer.
r_screen because of SCR_UpdateScreen, and r_cvar because the cvars
really should never have been in a plugin in the first place (and
r_screen needed access).
First pixels! This was a nightmare of little issues that the validation
layers couldn't help with: incorrect input assembly, incorrect vertex
attribute specs. Though the layers did help with getting the queues
working. Still, lots of work to go but this is a major breakthrough as
I now have access to visual debugging for textures and the like.
Short wrappers for Draw functins are in vid_render_vulkan.c so the
vulkan context can be passed on to the actual functions. The 2D shaders
are set up similar to those in glsl, but with full 32-bit color (rgba)
support instead of paletted. However, the textures are not loaded yet,
nor is anything bound.
This necessitated hand-writing qfv_swapchain_t's descriptors as I don't
feel like getting that complicated with vkgen at this stage and it's not
really appropriate anyway? qfv_swapchain_t is meant to be read-only and
not parsed from a plist.
The prototypes for handle parsers needed to be changes because it turned
out "single" was inappropriate for handles as "single" allocates memory
for the parsed object, but handles must be written directly.
The way I wound up using the field meant that exprctx should not "own"
the hashlinks chain, but rather just point to it. This fixes the nasty
access errors I had.
Dependencies on vkparse.hinc were spreading through the code which I
didn't want as that removes a lot of the automation from the automake
files. This keeps all parser code internal to vkparse.c's scope, and any
accesses required for enum and struct (not yet) definitions can be
fetched by name.
Array and single type overrides now allow the parsing of the items
themselves to be customized. This makes it easy to handle arrays and
pointers to single items while also using custom specifications, rather
than relying entirely on the custom override.
QC's int type is named "integer" (didn't feel like changing that right
now), so special case it to be "int".
Output the parse func name (instead of "fix me").
Output a parse func for enums (needed for arrays of enums
(VkDynamicState)).
The static variable meant that Fog_GetColor was not thread-safe (though
multiple calls in the one thread look to be ok for now). However, this
change takes it one step closer to being more generally usable.
Patch found in an old stash.
I had missed the array declaration and thus initialized the pointer to
the offset array incorrectly. Didn't show up until I tried using
multiple offsets.
Shaders can be built as spv files and installed into
$libdir/quakeforge/shaders or as spvc files and compiled into the
engine. Loading supports $builtin/name to access builtin shaders,
$shader/path to access external standard shaders and quake filesystem
access for all other paths.
I had forgotten that msaa samples was governed by the driver (as a max)
and the renderpass setup code simply took the max. Thus why 1 vs 8
caused the display to render incorrectly.
It turned out the msaa setting defaulting to 1 instead of 8 was the
problem no idea why at this stage (need to read up on just how that
setting works). Once I understand just how it works, I'll rework the
msaa handling.
This gets renderpass parsing almost working (not hooked up, though). The
missing bits are support for expressions for flags (namely support for
the | operator) and references (eg $swapchain.format). However, this
shows that the basic concept for the parser is working.
Nothing is actually done yet other than parsing the built-in property
list to property list items (the actual parser is just a skeleton), but
everything compiles
The property list specifies the base structures for which parser code
will be generated (along with any structures and enums upon which those
structures depend). It also defines option specialized parsers for
better control.
It worked as a proof of concept, but as the code itself needs to be a
bit smarter, it would be a lot smarter to break up that code to make it
easier to work on the individual parts.
The tables are generated from the enums pulled out of the vulkan headers
using a ruamoko program (thanks to its reflection capabilities). They
will be used for parsing property lists used to create render passes and
pipelines.
There's still some cleanup to do, but everything seems to be working
nicely: `make -j` works, `make distcheck` passes. There is probably
plenty of bitrot in the package directories (RPM, debian), though.
The vc project files have been removed since those versions are way out
of date and quakeforge is pretty much dependent on gcc now anyway.
Most of the old Makefile.am files are now Makemodule.am. This should
allow for new Makefile.am files that allow local building (to be added
on an as-needed bases). The current remaining Makefile.am files are for
standalone sub-projects.a
The installable bins are currently built in the top-level build
directory. This may change if the clutter gets to be too much.
While this does make a noticeable difference in build times, the main
reason for the switch was to take care of the growing dependency issues:
now it's possible to build tools for code generation (eg, using qfcc and
ruamoko programs for code-gen).
They take a pointer to a free-list used for hashlinks so the hashlink
pools can be per-thread. However, hash tables that are not updated are
always thread-safe, so this affects only updates. progs_t has been set
up such that it is easy for multiple progs within one thread can share
hashlinks.
It turned out I needed access to the physical device from a buffer
object, so rather than storing the vulkan logical device directly in
buffer (and other) objects, store the qfv logical device.
It's just a wrapper around hashtab, but it makes checking if a string is
in a set easy. Way overkill when only a few extensions are enabled, but
more might come later.
This paves the way for clean initialization of the Vulkan renderer, and
very much cleans up the older renderer initialization code as gl and sw
are no longer intertwined.
This fixes the segfault and pushes things very much in the desired
direction of proper system independence for rendering and presentation
separation (though things were headed in the right direction before).
Things are still a mess, but a proper cleanup will be a lot of work and
will, really, involve properly splitting quake-specific code* out from
the rest of the renderer.
* data loading and format specific stuff
A single graphics-capable queue should be enough for now. However, I'm
not sure I'm happy with a lot of the code: it's a bit difficult to write
flexibly configured code for Vulkan (or seems to be at this stage),
especially in C.
After messing with SIMD stuff for a little, I think I now understand why
the industry went with xyzw instead of the mathematical wxyz. Anyway, this
will make for less pain in the future (assuming I got everything).
I'm not certain despair actually meant for the break to be there. It
certainly would have sped up the game a bit but at the expense of proper
blood trails in the software renderers.
These are the ones where I could easily make scan-build happy. They do seem
to be potential holes where invalid data in one place could result in use
of uninitialized values.
While scan-build wasn't what I was looking for, it has proven useful
anyway: many of the sizeof errors were just noise, but a few were actual
bugs (allocating too much or too little memory).
It seems mesa still has the bug where non-array attributes don't work
when set as attribute 0, and that the allocation order changed sometime
since I last tested with mesa. This fixes the black world and flickering
alias models on my eeepc.
So far, alias model rendering is the only victim, but things are working,
even if only color map lookup and fog blending are broken out at this
stage.
I expect the effect naming scheme will go through some changes until I'm
happy with it.
Again, based on The OpenGL Shader Wrangler. The wrangling part is not used
yet, but the shader compiler has been modified to take the built up shader.
Just to keep things compiling, a wrapper has been temporarily created.
I'd missed a set of bit->lightnum conversions that resulting in lightnum
becoming much greater than 128 and thus trashing memory when the surface
was marked.
The seed is currently 0xdeadbeef, but I intend on fixing that soon. Now the
particle velocities and origins use fully independent bits (though a big
chunk is wasted right now).
This is a quick fix until I get a random number generator into QF.
Mingw's RAND_MAX is only 0x7fff and so the (((rnd >> 10) & 63) - 31.5) / 63.0
used for the z component of origin and velocity would never go positive.
For now, change the 10 to 9 (reusing another bit from Y). I plan on
implementing a full 32-bit PRNG in QF so we always have a reliable
generator.
This fixes the status bar refresh issues in sw. The problem was that with
two viddef's hanging around, things got a little confused and recalc_refdef
wasn't getting into the renderer.
It was properly cleared after drawing water chains and sky chains, but I
had missed normal surfaces. It took the use of the same texture for both
normal surfaces and water surfaces to trigger the bug. Thanks go to Simon
'Sock' O'Callaghan and his In The Shadows mod.
The depth limits in the gl and glsl renderers and in the trace code really
bothered me, but then the fix hit me: at load-time, recurse the trees
normally and record the depth in the appropriate place. The node stacks can
then be allocated as necessary (I chose to add a paranoia buffer of 2, but
I expect the maximum depth will rarely be used).
The attached patch (against quakeforge git) changes the [con]width,
[con]height, and most importantly the rowbytes members of viddef_t
from unsigned to signed int, like in q2. This allows for a properly
negative vid.rowbytes which may be needed in, e.g. a DIB sections
windows driver if needed. Along with it, I changed a few places
where unsigned int is used along with comparisons against the relevant
vid.* members.
One thing I am not 100% sure is the signedness requirements of
d_zrowbytes and d_zwidth: q2 has them as unsigned but I am not sure
whether that is because they are needed as unsigned or it was just an
oversight of the id developers. They do look like they should be OK
as signed int to me, though: comments?
==
Note from Bill Currie: I had to do some extra changes as many
signed/unsigned comparisons were somehow missed.
It turns out the expected orientation of the sky cube is exactly that of
Blender's default cube looked at from the front view (num-1) and the front
face being the nearest face. This put's Marcher's sun nicely in the view
when exiting the cave.
Rearrange the sky_suffix and sky_coords arrays and remove the sky_target
array such that the faces can be loaded using
GL_TEXTURE_CUBE_MAP_POSITIVE_X + i (apparently certain drivers break if
the faces aren't loaded in the correct order).
Also, the nomalization of the direction vector in the fragment isn't
necessary.
All of the nastiness is hidden in bspfile.c (including the old bsp29
specific data types). However, the conversions between bsp29 and bsp2 are
implemented but not yet hooked up properly. This commit just gets the data
structures in place and the obvious changes necessary to the rest of the
engine to get it to compile, plus a few obvious "make it work" changes.
The setup had been lost at some stage, thus shadows were always directly
under the entity. Unlike the original quake shadow code, the vector is
correctly transformed into the entity's space.
I finally found the cause of Despair's gl shadows non-rendering+segfault...
the shadow code expected triangle fans and strips but was getting simple
triangles. Oops.
LordHavoc had made lighting positive for sw32, but I had done something in
the plugin code that broke that (probably something to do with the
colormap loading). Going back to id's original code fixes the issue.
This reverts commit e170f4ee75.
It turns out I messed up something in the patch. I noticed the problem
while playing digs04.bsp: many sub-model surfaces, particularly those with
animated textures, were not being transformed correctly. As this patch did
not make a large performance difference, it's probably better to just
revert it. I might revisit it later.
Since the backtile is loaded into a scrap and used as a subtexture, we
can't use GL's texture wrapping, thus do the wrapping ourselves. There are
some minor issues with the wrong part of the scrap being drawn: need to
investigate where the bug is (vrect, make_quad, etc).
It turns out glsl, sw and sw32 weren't getting any benefit from R_CullBox
because the frustum wasn't setup :P. Get another 8% out of bigass1
(174->184fps). bigass1 now runs 2x as fast as it did before I started this
optimisation run :)
This severely reduces the calles to BindTexture, and more importantly,
glUseProgram, EnableVertexAttribArray etc. The biggest changes are:
o icons and text are all in the one giant texture
o icons and text are mixed in the one queue
This gave ~9% speedup for bigass1 (159->174fps).
gl, sw and sw32 use blend palettes, so share the code. This also abandons
the optimization for transforming verts in sw (had all sorts of problems
anyway). sw still doesn't work, though.
There are still many issues to sort out, but the basics are working.
Problems:
rendered fullbright (no lighting done)
normals are ignored
extra textures (glow etc) not used/loaded
4 models on the screen don't seem to be a problem.
Though the bsp loader doesn't yet support colored lighting, the ambient
light will be colored when it does. With this, I guess iqm model support is
done for glsl until I figure out how I want to do dual quaternion support.
There's still a problem with his finger tips and feet, but the rest of his
limbs seem to be working well. Much thanks to Spike for encouraging me to
do a dump of the matices that are actually sent to the card.
It turns out that animated joints remain relative right up to the last
moment.
This avoids sending invalid pose data to the renderer. The symptom was a
vertex array offset higher than the vertex array size. Discovered by calim
of nouveau while he was debugging a driver problem found by QF. Many
thanks.
While this particular tigger of the real bug was caused by 659d95221e
(hopefully fix both the "get stuck waiting for 3d" bug and the null
worldmode bug.), the real bug was lurking in the code since the dawn of
time (from sw32's perspective). This fix is as per LordHavoc's suggestion
(heh, despite the years, he knows his code), but I spent the time hunting
down the trigger to understand just what was going on.
It turns out that (0,0,0) is too close to a wall (probably on, but the
slight default offset is too close) and the above commit changed the first
rendered frame to be before the player origin was set rather than after.
This fix feels correct to me because noclipping around with the sw32
renderer would probably hit the same bug with a bit of bad luck. Thus
ensure the index resulting from zi never exceeds 65535.
While checking the shaders to see if there might be anything obvious to
work around the current nouveau shader issues, I found a 1 that should have
been a 1.0. I'm surprised it ever compiled.
It doesn't seem to have any useful effect in QF (even before the plugin
project) other than setting the number of frames to update. I'm not sure if
it's a useless variable or one where the user is supposed to match it to
the system configuration. Anyway, with this, the glsl plugin now works.
This allows the vid module to load the render module and access render
specific functions before the renderer initializes, which happens to need
an initialized vid module...
The renderer now gets initialized and things sort of work (qw-client will
idle, though nothing is displayed). However, as the viddef stuff is broken,
it segs on trying to run the overkill demo.
Still, nothing will work: no plugins are loaded and they're all broken
anyway.
glx, sgl, glslx etc are going away, just the basics will be built: fbdev
(probably go away eventually), sdl, x11 and hopefully someday win. That's
actually the only reason anything links.
Where possible, symbols have been made static, prefixed with glsl_/GLSL_ or
moved into the code shared by all renderers. This will make doing plugins
easier but done now for link testing. The moving was done via the gl
commit.
Where possible, symbols have been made static, prefixed with gl_/GL_ or
moved into the code shared by all renderers. This will make doing plugins
easier but done now for link testing.
QF will now load a single image with (${skyname}_map.*) using blender's
layout. That is, 3x2 with the top row being back, right, front and the
bottom row being bottom, top, left.
o All instances of LIBADD/LDADD have a corresponding DEPENDENCIES
specificatiion.
o libraries now use a lib_ldflags macro to keep things consistent
o duplication of source/lib names has been minimized (particularly in
the libraries; more work needs to be done for the executables)
o automake spec blocks have been organized (again, more work needs to be
done for the executables)
Most subsystems that depend on other subsystems now call the init functions
themselves. This makes for much cleaner client initialization (more work
needs to be done for the server).
I'd actually done this the first time, but then got confused and forgot the
waterchain works with multiple textures. This is actually the right place
as all transparent surfaces need to be sorted irrespective to their
textures. Really, waterchain needs to be renamed.
If the map got reloaded but the current leaf didn't change the world (and
most entities) didn't get drawn. Forcing a vis update by first setting
r_viewleaf to null and marking surfaces does the trick :)
The renderer should now be free of any direct access to client code. Even
3d rendering is now done via a function pointer.
The cshift code is done as a 2d screen function.
This is a rather "evil" hack because GLES doesn't seem to need
GL_VERTEX_PROGRAM_POINT_SIZE, but GL does, and all my work is currently
done in GL rather than GLES. Point particles now work, but the sizes are
all wrong.
Using quads requires 4 elements, but triangles require 6. I'd gotten the
element array setup right, but I'd forgotten to tweak vacount when drawing
the particles.
I'm not sure if there's a bug in mesa, or if I'm doing something wrong, but
GL_POINTS doesn't seem to be working properly. I get the points, but
writing to gl_PointSize doesn't make a difference despite the size range
being 1-255.
Unfortunately, the maximum point size on Intel hardwar seems to be 1, so I
can't tell if the colors are right.
This is largely just a hacked version of GL's particle code.
For now, only the glsl loader disables caching, but it stores the frame
vertices in GL memory, so its hunk usage is relatively lower (and will be
lower still when I get skins sorted out).
Unfortunately, the intel driver on my eeepc doesn't like the mipmas for
plat_top2 or +2floorsw. If I either don't load their mipmaps, or skip
drawing them, things seem to work nicely.
It turns out my complicated plan was just that: complicated. Although there
are currently some bugs, the method I used to build the VBO in the first
place will work equally well for building the index lists.
Just some problems with lightmaps. There also seem to be some issues with
seams (t-junctions?) and far clip, but they're quite separate.
There's also a problem with a segfault when loading a second map.
The entire vertex set from every model is put into one list (not yet
uploaded). chains of elements arrays are build for non-instanced models
(instanced models will have their chains built each frame).
Still nothing being rendered: still in the process of building the display
lists, but I'm making good progress. Get this into git before something
goes wrong :)
After getting in contact with serplord, I now know that the sw alias
loading was correct. Turns out the gl loaders was mostly correct, just a
mistaken subtract rather than add. And with that, I can implement alias-16
support in glsl. better yet, since all the work is done in the loader, the
renderer doesn't know anything about it :) However, I need to create some
16-bit models for testing.
Not all hardware can access a texture sampler from the vertex shader, and I
don't want multiple paths this early in the game. Now, vertex normals are
uploaded as shorts. Should be 14 bytes per vertex (was 10, could have been
8 if I had put the normal index with the vertex rather than st).