The JACK Audio Connection Kit support is now just an output target
rather than a full duplicate of the renderer (in pull mode). This is
what I wanted to to back when I first added jack support, but I needed
to get the renderer working asynchronously without affecting any of the
other outputs.
Fixes#16.
on_update is for pull-model outpput targets to do periodic synchronous
checks (eg, checking that the connection to the actual output device is
still alive and reviving it if necessary)
Output plugins can use either a push model (synchronous) or a pull
model (asynchronous). The ALSA plugin now uses the pull model. This
paves the way for making jack output a simple output plugin rather than
the combined render/output plugin it currently is (for #16) as now
snd_dma works with both models.
This gets the alsa target working nicely for mmapped outout. I'm not
certain, but I think it will even deal with NPOT buffer sizes (I copied
the code from libasound's sample pcm.c, thus the uncertainty).
Non-mmapped output isn't supported yet, but the alsa target now works
nicely for pull rendering.
However, some work still needs to be done for recovery failure: either
disable the sound system, or restart the driver entirely (preferable).
This brings the alsa driver in line with the jack render (progress
towards #16), but breaks most of the other drivers (for now: one step at
a time). The idea is that once the pull model is working for at least
one other target, the jack renderer can become just another target like
it should have been in the first place (but I needed to get the pull
model working first, then forgot about it).
Correct state checking is not done yet, but testsound does produce what
seems to be fairly good sound when it starts up correctly (part of the
state checking (or lack thereof), I imagine).
This failed with errors such as:
from ./include/QF/simd/vec4d.h:32,
from libs/util/simd.c:37:
./include/QF/simd/vec4d.h: In function ‘qmuld’:
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/include/avx2intrin.h:1049:1: error: inlining failed in call to ‘always_inline’ ‘_mm256_permute4x64_pd’: target specific option mismatch
1049 | _mm256_permute4x64_pd (__m256d __X, const int __M)
and rename the variable since it's not the size of the frame (may be
from the very early days of ALSA development, and I suspect the
terminology changed a bit).
The calculation was including the bits per sample, which makes no sense
as the period size determines the number of samples in a submission
chunk (and thus latency). For now, set it to around 5.5ms (will probably
need a cvar).
If Sys_Shutdown gets called twice, particularly if a shutdown callback
hangs and the program is killed with INT or QUIT, shutdown_list would be
in an invalid state. Thus, get the required data (function pointer and
data pointer) from the list element, then unlink the element before
calling the function. This ensures that a reinvocation of Sys_Shutdown
continues from the next callback or ends cleanly. Fixes a segfault when
killing testsound while using the oss output (it hangs on shutdown).
Fixes#12
However, this is a bit of a band-aid in that the code for global defs
seems redundant (there is very similar code a little above that is
always executed) and the code for field defs should probably be executed
unconditionally: I suspect the problem fixed by
d5454faeb7 still shows with game coded
compiled with recent versions of the compiler, I just haven't tested
any.
Support for finding the first address associated with a source line was
added to the engine, returning 0 if not found.
A temporary breakpoint is set and the progs allowed to run free.
However, better handling of temporary breakpoitns is needed as currently
a "permanent" breakpoint will be cleared without clearing the temporary
breakpoing if the permanent breakpoing is hit while execut-to-cursor is
running.
For now, just bsearch (normal and fuzzy), qsort, and prefixsum (not in
C's stdlib that I know of, but I think having native implementations of
float and int prefix sums will be useful.
Fuzzy bsearch is useful for finding an entry in a prefix sum array
(value is >= ele[0], < ele[1]), and the reentrant version is good when
data needs to be passed to the compare function. Adapted from the code
used in pr_resolve.
A bit of a mess for optimized vs unoptimized, but the tests acknowledge
the differences in precision while checking that the code produces the
right results allowing for that precision.
It seems that i686 code generation is all over the place reguarding sse2
vs fp, with the resulting differences in carried precision. I'm not sure
I'm happy with the situation, but at least it's being tested to a
certain extent. Not sure if this broke basic (no sse) i686 tests.
GCC does a fairly nice job of producing code for vector types when the
hardware doesn't support SIMD, but it seems to break certain math
optimization rules due to excess precision (?). Still, it works well
enough for the core engine, but may not be well suited to the tools.
However, so far, only qfvis uses vector types (and it's not tested yet),
and tools should probably be used on suitable machines anyway (not
forces, of course).
I don't know that the cache line size is 64 bytes on 32 bit systems, but
it should be ok to assume that 64-byte alignment behaves well on systems
with smaller cache lines so long as they are powers of two. This does
mean there is some waste on 32-bit systems, but it should be fairly
minimal (32 bytes per memblock, which manages page sized regions).
Legacy progs do not have the extended defs data (and usually won't have
anything more complicated than a vector), so use the basic type size for
the def size. Fixes broken edict prints.
Standard quake has just linear, but the modding community added inverse,
inverse-square (raw and offset (1/(r^2+1)), infinite (sun), and
ambient (minlight). Other than the lack of shadows, marcher now looks
really good.
Because LoadImage uses Hunk_TempAlloc, the face images need to be copied
individually. Really, what's neeeded is to be able to load the image
data into a pre-allocated buffer (ideally, the staging buffer for
vulkan, but that's for later).
Mostly, this gets the stage flags in with the barrier, but also adds a
couple more barrier templates. It should make for slightly less verbose
code, and one less opportunity for error (mismatched barrier/stages).
This gets the shaders needed for creating shadow maps, and the changes
to the lighting pipeline for binding the shadow maps, but no generation
or reading is done yet. It feels like parts of various systems are
getting a little big for their britches and I need to do an audit of
various things.
The built up "path" name of the handle resource was not always surviving
the intervening call to cexpr_eval_string (in particular, when other
handles were created in the process of creating a handle). Rather than
simply increase the number of va buffers (where would it end?), just
regenerate the path when adding the new handle. It's probably quick
enough, and the code is not usually not on a critical path.
I was reading about multi-pass rendering on mobile devices
(https://developer.oculus.com/blog/loads-stores-passes-and-advanced-gpu-pipelines/)
and discovered that I had used the wrong flags (but then, I think Graham
Sellers had, too, since used his Vulkan Programming Guide as a
reference). Doesn't seem to make any difference on desktop, but as
there's no loss there, but potential gains on mobile, I'd say it's a
win.
QF now uses its own configuration file (quakeforge.cfg for now) rather
than overwriting config.cfg so that people trying out QF in their normal
quake installs don't trash their config.cfg for other quake clients. If
quakeforge.cfg is present, all other config files are ignored except
that quake.rc is scanned for a startdemos command and that is executed.
And improve the generated code for MSG_ReadShort
I suspect gcc didn't like all the excess pointer dereferences and so
couldn't assume that the bytes were being read sequentially.
And improve the generated code as well (ie, use a code sequence that gcc
recognizes and optimizes to a single 32-bit read and a byte-swap).
nq uses big-endian for its packet headers (arg, though it is consistent
with IP, it's not with the rest of quake).
This fixes the textures (and presumably mesh data) being deleted while
still in use. Oddly, the wait was needed in both brush and alias models
(I expected brush to always come first).
I'm not sure that the mismatch between refdef_t and the assembly defines
was a problem (many fields unused), but the main problem was due to
execute permission on the pages: one chunk of asm was in the data
section, and the patched code was not marked as being executable (due to
such a thing not existing when quake was written).
This is a bit of a hack for now (need to look into maybe using cmem),
but it gets 32-bit windows working for all but the software renderer
(probably just refdef (and maybe viddef) getting out of sync with the
assembly code.
This ensures that fov_y is not calculated until after the render view
size is known and thus doesn't become some crazy angle (that happens to
result in a negative tan). Fixes upside-down-quake :)
vid.aspect is removed (for now) as it was not really the right idea (I
really didn't know what I was doing at the time). Nicely, this *almost*
fixes the fov bug on fresh installs: the view is now properly
upside-down rather than just flipped vertically (ie, it's now rotated
180 degrees).
Not only does it makes sense to centralize the setting of viewport and
scissor, but it's actually necessary in order to fix the upside-down
rendering on windows.
This gets the GL and GLSL renderers working for the -win targets... sort
of: they are upside down and GLSL's bsp surfaces are black (same as
Vulkan). However, with this, all 5 renderers at least limp along for
-win, 4/5 work for -sdl.
It turns out the dd and dib "driver" code is very specific to the
software renderer. This does not fix the segfault on changing video
mode, but I do know where the problem lies: the window is being
destroyed and recreated without recreating the buffers. I suspect a
clean solution to this will allow for window resizing in X as well.
Only 64-bit windows is tested, and there are still various failures, but
QF is limping along in windows again.
nq-sdl works for sw, and sw32, gl and glsl are mostly black (but not
entirely for gl?), vulkan is not supported with sdl.
nq-win works for sw and sw32, and sort of for vulkan (very dark and
upside-down?). gl and glsl complain about vid mode,
qw-client-[sdl,win] seem to be the same, but something is wrong with the
console (reading keyboard input).
The merge with the improvements I made while hacking on csqc (still
undecided as to whether to continue that project) resulted in the size
of the progs string area getting mangled when no heap was allocated for
the progs due to a null zone pointer being used in some pointer
arithmetic. Fixes random(!!!) invalid string error in qfprogs.
While this caused some trouble for pr_strings and configurable strftime
(evil hacks abound), it's the result of discovering an ancient (from
maybe as early as 2004, definitely before 2012) bug in qwaq's printing
that somehow got past months of trial-by-fire testing (origin understood
thanks to the warning finding it).
It looks like choosing a visual is not necessary (at least for normal
apps, VR might be another matter). Still no idea if anything works (for
-win support in general, let alone vulkan).