I had forgotten that msaa samples was governed by the driver (as a max)
and the renderpass setup code simply took the max. Thus why 1 vs 8
caused the display to render incorrectly.
It turned out the msaa setting defaulting to 1 instead of 8 was the
problem no idea why at this stage (need to read up on just how that
setting works). Once I understand just how it works, I'll rework the
msaa handling.
This gets renderpass parsing almost working (not hooked up, though). The
missing bits are support for expressions for flags (namely support for
the | operator) and references (eg $swapchain.format). However, this
shows that the basic concept for the parser is working.
Nothing is actually done yet other than parsing the built-in property
list to property list items (the actual parser is just a skeleton), but
everything compiles
The property list specifies the base structures for which parser code
will be generated (along with any structures and enums upon which those
structures depend). It also defines option specialized parsers for
better control.
It worked as a proof of concept, but as the code itself needs to be a
bit smarter, it would be a lot smarter to break up that code to make it
easier to work on the individual parts.
The tables are generated from the enums pulled out of the vulkan headers
using a ruamoko program (thanks to its reflection capabilities). They
will be used for parsing property lists used to create render passes and
pipelines.
There's still some cleanup to do, but everything seems to be working
nicely: `make -j` works, `make distcheck` passes. There is probably
plenty of bitrot in the package directories (RPM, debian), though.
The vc project files have been removed since those versions are way out
of date and quakeforge is pretty much dependent on gcc now anyway.
Most of the old Makefile.am files are now Makemodule.am. This should
allow for new Makefile.am files that allow local building (to be added
on an as-needed bases). The current remaining Makefile.am files are for
standalone sub-projects.a
The installable bins are currently built in the top-level build
directory. This may change if the clutter gets to be too much.
While this does make a noticeable difference in build times, the main
reason for the switch was to take care of the growing dependency issues:
now it's possible to build tools for code generation (eg, using qfcc and
ruamoko programs for code-gen).
They take a pointer to a free-list used for hashlinks so the hashlink
pools can be per-thread. However, hash tables that are not updated are
always thread-safe, so this affects only updates. progs_t has been set
up such that it is easy for multiple progs within one thread can share
hashlinks.
It turned out I needed access to the physical device from a buffer
object, so rather than storing the vulkan logical device directly in
buffer (and other) objects, store the qfv logical device.
It's just a wrapper around hashtab, but it makes checking if a string is
in a set easy. Way overkill when only a few extensions are enabled, but
more might come later.
This paves the way for clean initialization of the Vulkan renderer, and
very much cleans up the older renderer initialization code as gl and sw
are no longer intertwined.
This fixes the segfault and pushes things very much in the desired
direction of proper system independence for rendering and presentation
separation (though things were headed in the right direction before).
Things are still a mess, but a proper cleanup will be a lot of work and
will, really, involve properly splitting quake-specific code* out from
the rest of the renderer.
* data loading and format specific stuff
A single graphics-capable queue should be enough for now. However, I'm
not sure I'm happy with a lot of the code: it's a bit difficult to write
flexibly configured code for Vulkan (or seems to be at this stage),
especially in C.
After messing with SIMD stuff for a little, I think I now understand why
the industry went with xyzw instead of the mathematical wxyz. Anyway, this
will make for less pain in the future (assuming I got everything).
I'm not certain despair actually meant for the break to be there. It
certainly would have sped up the game a bit but at the expense of proper
blood trails in the software renderers.
These are the ones where I could easily make scan-build happy. They do seem
to be potential holes where invalid data in one place could result in use
of uninitialized values.
While scan-build wasn't what I was looking for, it has proven useful
anyway: many of the sizeof errors were just noise, but a few were actual
bugs (allocating too much or too little memory).
It seems mesa still has the bug where non-array attributes don't work
when set as attribute 0, and that the allocation order changed sometime
since I last tested with mesa. This fixes the black world and flickering
alias models on my eeepc.
So far, alias model rendering is the only victim, but things are working,
even if only color map lookup and fog blending are broken out at this
stage.
I expect the effect naming scheme will go through some changes until I'm
happy with it.
Again, based on The OpenGL Shader Wrangler. The wrangling part is not used
yet, but the shader compiler has been modified to take the built up shader.
Just to keep things compiling, a wrapper has been temporarily created.
I'd missed a set of bit->lightnum conversions that resulting in lightnum
becoming much greater than 128 and thus trashing memory when the surface
was marked.
The seed is currently 0xdeadbeef, but I intend on fixing that soon. Now the
particle velocities and origins use fully independent bits (though a big
chunk is wasted right now).
This is a quick fix until I get a random number generator into QF.
Mingw's RAND_MAX is only 0x7fff and so the (((rnd >> 10) & 63) - 31.5) / 63.0
used for the z component of origin and velocity would never go positive.
For now, change the 10 to 9 (reusing another bit from Y). I plan on
implementing a full 32-bit PRNG in QF so we always have a reliable
generator.
This fixes the status bar refresh issues in sw. The problem was that with
two viddef's hanging around, things got a little confused and recalc_refdef
wasn't getting into the renderer.
It was properly cleared after drawing water chains and sky chains, but I
had missed normal surfaces. It took the use of the same texture for both
normal surfaces and water surfaces to trigger the bug. Thanks go to Simon
'Sock' O'Callaghan and his In The Shadows mod.
The depth limits in the gl and glsl renderers and in the trace code really
bothered me, but then the fix hit me: at load-time, recurse the trees
normally and record the depth in the appropriate place. The node stacks can
then be allocated as necessary (I chose to add a paranoia buffer of 2, but
I expect the maximum depth will rarely be used).
The attached patch (against quakeforge git) changes the [con]width,
[con]height, and most importantly the rowbytes members of viddef_t
from unsigned to signed int, like in q2. This allows for a properly
negative vid.rowbytes which may be needed in, e.g. a DIB sections
windows driver if needed. Along with it, I changed a few places
where unsigned int is used along with comparisons against the relevant
vid.* members.
One thing I am not 100% sure is the signedness requirements of
d_zrowbytes and d_zwidth: q2 has them as unsigned but I am not sure
whether that is because they are needed as unsigned or it was just an
oversight of the id developers. They do look like they should be OK
as signed int to me, though: comments?
==
Note from Bill Currie: I had to do some extra changes as many
signed/unsigned comparisons were somehow missed.
It turns out the expected orientation of the sky cube is exactly that of
Blender's default cube looked at from the front view (num-1) and the front
face being the nearest face. This put's Marcher's sun nicely in the view
when exiting the cave.
Rearrange the sky_suffix and sky_coords arrays and remove the sky_target
array such that the faces can be loaded using
GL_TEXTURE_CUBE_MAP_POSITIVE_X + i (apparently certain drivers break if
the faces aren't loaded in the correct order).
Also, the nomalization of the direction vector in the fragment isn't
necessary.
All of the nastiness is hidden in bspfile.c (including the old bsp29
specific data types). However, the conversions between bsp29 and bsp2 are
implemented but not yet hooked up properly. This commit just gets the data
structures in place and the obvious changes necessary to the rest of the
engine to get it to compile, plus a few obvious "make it work" changes.
The setup had been lost at some stage, thus shadows were always directly
under the entity. Unlike the original quake shadow code, the vector is
correctly transformed into the entity's space.
I finally found the cause of Despair's gl shadows non-rendering+segfault...
the shadow code expected triangle fans and strips but was getting simple
triangles. Oops.
LordHavoc had made lighting positive for sw32, but I had done something in
the plugin code that broke that (probably something to do with the
colormap loading). Going back to id's original code fixes the issue.
This reverts commit e170f4ee75.
It turns out I messed up something in the patch. I noticed the problem
while playing digs04.bsp: many sub-model surfaces, particularly those with
animated textures, were not being transformed correctly. As this patch did
not make a large performance difference, it's probably better to just
revert it. I might revisit it later.
Since the backtile is loaded into a scrap and used as a subtexture, we
can't use GL's texture wrapping, thus do the wrapping ourselves. There are
some minor issues with the wrong part of the scrap being drawn: need to
investigate where the bug is (vrect, make_quad, etc).
It turns out glsl, sw and sw32 weren't getting any benefit from R_CullBox
because the frustum wasn't setup :P. Get another 8% out of bigass1
(174->184fps). bigass1 now runs 2x as fast as it did before I started this
optimisation run :)
This severely reduces the calles to BindTexture, and more importantly,
glUseProgram, EnableVertexAttribArray etc. The biggest changes are:
o icons and text are all in the one giant texture
o icons and text are mixed in the one queue
This gave ~9% speedup for bigass1 (159->174fps).
gl, sw and sw32 use blend palettes, so share the code. This also abandons
the optimization for transforming verts in sw (had all sorts of problems
anyway). sw still doesn't work, though.
There are still many issues to sort out, but the basics are working.
Problems:
rendered fullbright (no lighting done)
normals are ignored
extra textures (glow etc) not used/loaded
4 models on the screen don't seem to be a problem.
Though the bsp loader doesn't yet support colored lighting, the ambient
light will be colored when it does. With this, I guess iqm model support is
done for glsl until I figure out how I want to do dual quaternion support.