It now processes 4 pixels at a time and uses a bit mask instead of a
conditional to set 3 of the 4 pixels to black. On top of the 4:1 pixel
processing and avoiding inner-loop conditional jumps, gcc unrolls the
loop, so Draw_FadeScreen itself is more than 4x as fast as it was. The
end result is about a 5% (3fps) speedup in timedemo demo1 on my 900MHz
Eee PC when nq has been hacked to always draw the fade-screen.
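As a rough illustration of the mask approach (not the actual Draw_FadeScreen code), the sketch below clears three of every four 8-bit pixels with a single 32-bit AND. The helper name fade_row, the assumption that palette index 0 is black, and the choice of which pixel in each group survives are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: blacks out three of every four 8-bit pixels in a
 * row, assuming palette index 0 is black.  Which pixel of each group is
 * kept is an arbitrary choice for this sketch. */
static void
fade_row (uint8_t *row, size_t count)
{
	/* Build the mask from bytes so the first pixel in memory is the one
	 * kept regardless of endianness. */
	static const uint8_t keep[4] = { 0xff, 0, 0, 0 };
	uint32_t    mask, pix;

	assert (count % 4 == 0);	/* sketch only handles whole groups */
	memcpy (&mask, keep, sizeof (mask));

	for (size_t i = 0; i < count; i += 4) {
		/* memcpy avoids alignment and aliasing problems; compilers
		 * normally turn it into a plain 32-bit load/store. */
		memcpy (&pix, row + i, sizeof (pix));
		pix &= mask;	/* no per-pixel branch */
		memcpy (row + i, &pix, sizeof (pix));
	}
}
```

With no conditional in the loop body, the compiler is free to unroll or vectorize it.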
qwaq-curses has its place, but its use for running vkgen was really a
placeholder because I didn't feel like sorting out the different
initialization requirements at the time. qwaq-cmd has the (currently
unnecessary) threading power of qwaq-curses, but doesn't include any UI
stuff and thus doesn't need curses. The work also paves the way for
qwaq-x11 to become a proper engine (though sorting out its init will be
taken care of later).
Fixes #15.
This refactors (as such) keys.c so that it no longer depends on console
or gib, and pulls keys out of video targets. The eventual plan is to
move all high-level general input handling into libQFinput, and probably
low-level handling as well (eg, /dev/input handling for joysticks etc on
Linux).
Fixes #8
The default is to enable (and autodetect based on lscpu) simd support,
or the mode can be specified via --enable-simd=mode. --disable-simd
disables the support but ensures gcc will still compile the float vector
types.
When moving an identifier label from one node to another, the first node
must be evaluated before the second node, which the edge guarantees.
However, code for swapping two variables
    t = a; a = b; b = t;
creates a dependency cycle. The solution is to create a new leaf node
for the source operand of the assignment. This fixes the swap.r test
without pessimizing postop code.
This takes care of the core problem in #3, but there is still room for
improvement in that the load/store can be combined into a move.
This reverts commit 2fcda44ab0.
Killing the node is not the correct answer as it blocks many
optimization opportunities. The correct answer is adding edges to
describe the temporal dependencies. Of course, this breaks the swap.r
test.
In order to correctly handle swap-style code
    { t = a; a = b; b = t; }
edges need to be created for each of the assignments moving an
identifier label, but the dag must remain acyclic (the above example
wants to create a cycle). Having the reachable nodes recorded makes
checking for potential loops a quick operation.
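A minimal sketch of how recorded reachability can make the cycle check cheap, assuming a small fixed node count and one bitmask per node. The dag_t layout and the dag_init/dag_add_edge names are hypothetical and not taken from the compiler's dag code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 64	/* arbitrary limit so one uint64_t holds a set */

/* Hypothetical structure: bit j of reachable[i] is set when node j can be
 * reached from node i (every node reaches itself). */
typedef struct {
	uint64_t    reachable[MAX_NODES];
	int         num_nodes;
} dag_t;

static void
dag_init (dag_t *dag, int num_nodes)
{
	assert (num_nodes > 0 && num_nodes <= MAX_NODES);
	dag->num_nodes = num_nodes;
	for (int i = 0; i < num_nodes; i++) {
		dag->reachable[i] = UINT64_C (1) << i;
	}
}

/* Add the edge from -> to.  Returns 1 on success, or 0 (leaving the dag
 * untouched) if the edge would create a cycle. */
static int
dag_add_edge (dag_t *dag, int from, int to)
{
	uint64_t    from_bit = UINT64_C (1) << from;

	/* If "from" is already reachable from "to", the new edge closes a
	 * loop.  This is a single bit test. */
	if (dag->reachable[to] & from_bit) {
		return 0;
	}
	/* Everything that reaches "from" now also reaches whatever "to"
	 * reaches. */
	uint64_t    gained = dag->reachable[to];
	for (int i = 0; i < dag->num_nodes; i++) {
		if (dag->reachable[i] & from_bit) {
			dag->reachable[i] |= gained;
		}
	}
	return 1;
}
```

For the swap example, the first two edges succeed and the third, which would close the loop, is rejected, so the caller can fall back to some other strategy for that assignment.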
Identifiers can be constants. I don't remember quite what it fixed other
than some bogus kill relations in the dags (which might have caused
issues later).
I had forgotten to test with shared libs and it turns out jack and alsa
were directly accessing symbols in the renderer (and in jack's case,
linking in a duplicate of the renderer).
Fixes #16.
The JACK Audio Connection Kit support is now just an output target
rather than a full duplicate of the renderer (in pull mode). This is
what I wanted to do back when I first added jack support, but I needed
to get the renderer working asynchronously without affecting any of the
other outputs.
Fixes #16.
on_update is for pull-model output targets to do periodic synchronous
checks (eg, checking that the connection to the actual output device is
still alive and reviving it if necessary).
Output plugins can use either a push model (synchronous) or a pull
model (asynchronous). The ALSA plugin now uses the pull model. This
paves the way for making jack output a simple output plugin rather than
the combined render/output plugin it currently is (for #16) as now
snd_dma works with both models.
This gets the alsa target working nicely for mmapped output. I'm not
certain, but I think it will even deal with NPOT buffer sizes (I copied
the code from libasound's sample pcm.c, thus the uncertainty).
Non-mmapped output isn't supported yet, but the alsa target now works
nicely for pull rendering.
However, some work still needs to be done for recovery failure: either
disable the sound system, or restart the driver entirely (preferable).
This brings the alsa driver in line with the jack render (progress
towards #16), but breaks most of the other drivers (for now: one step at
a time). The idea is that once the pull model is working for at least
one other target, the jack renderer can become just another target like
it should have been in the first place (but I needed to get the pull
model working first, then forgot about it).
Correct state checking is not done yet, but testsound does produce what
seems to be fairly good sound when it starts up correctly (part of the
state checking (or lack thereof), I imagine).
This failed with errors such as:
from ./include/QF/simd/vec4d.h:32,
from libs/util/simd.c:37:
./include/QF/simd/vec4d.h: In function ‘qmuld’:
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/include/avx2intrin.h:1049:1: error: inlining failed in call to ‘always_inline’ ‘_mm256_permute4x64_pd’: target specific option mismatch
1049 | _mm256_permute4x64_pd (__m256d __X, const int __M)
and rename the variable since it's not the size of the frame (may be
from the very early days of ALSA development, and I suspect the
terminology changed a bit).
The calculation was including the bits per sample, which makes no sense
as the period size determines the number of samples in a submission
chunk (and thus latency). For now, set it to around 5.5ms (will probably
need a cvar).
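To illustrate why bits per sample do not belong in the calculation, here is a small sketch that derives a period size in frames from only the sample rate and the desired period duration. The helper name period_frames and the rounding choice are assumptions for the example; the value actually used by the driver is not shown in this message.

```c
#include <assert.h>

/* Hypothetical helper: number of frames in one period of period_us
 * microseconds at rate_hz frames per second, rounded to nearest.  Sample
 * width and channel count do not appear: they change the number of bytes
 * per frame, not the number of frames per period. */
static unsigned
period_frames (unsigned rate_hz, unsigned period_us)
{
	assert (rate_hz > 0);
	return (unsigned) (((unsigned long long) rate_hz * period_us + 500000)
					   / 1000000);
}
```

For a 5.5ms period this gives 243 frames at 44100Hz and 264 frames at 48000Hz.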
If Sys_Shutdown gets called twice, particularly if a shutdown callback
hangs and the program is killed with INT or QUIT, shutdown_list would be
in an invalid state. Thus, get the required data (function pointer and
data pointer) from the list element, then unlink the element before
calling the function. This ensures that a reinvocation of Sys_Shutdown
continues from the next callback or ends cleanly. Fixes a segfault when
killing testsound while using the oss output (it hangs on shutdown).
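The unlink-before-call ordering can be sketched as below. This is an illustration under assumptions rather than the real Sys_Shutdown: the list layout, register_shutdown, run_shutdown and the demo callback are all hypothetical names, and calling free from a signal handler is not async-signal-safe, which the sketch ignores.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list element; the real shutdown_list may differ. */
typedef struct shutdown_list_s {
	struct shutdown_list_s *next;
	void      (*func) (void *data);
	void       *data;
} shutdown_list_t;

static shutdown_list_t *shutdown_list;

static void
register_shutdown (void (*func) (void *data), void *data)
{
	shutdown_list_t *p = malloc (sizeof (*p));
	assert (p);
	p->func = func;
	p->data = data;
	p->next = shutdown_list;	/* last registered runs first */
	shutdown_list = p;
}

static void
run_shutdown (void)
{
	while (shutdown_list) {
		shutdown_list_t *t = shutdown_list;
		/* Copy out what is needed, then unlink BEFORE calling, so a
		 * nested or repeated call sees a valid list that starts at
		 * the next callback. */
		void      (*func) (void *) = t->func;
		void       *data = t->data;
		shutdown_list = t->next;
		free (t);
		func (data);
	}
}

/* Demo callback: records its id; id 2 simulates a second shutdown request
 * arriving while the first is still running. */
static int  order[8], num_calls;

static void
demo_cb (void *data)
{
	int         id = *(int *) data;
	order[num_calls++] = id;
	if (id == 2) {
		run_shutdown ();
	}
}
```

In the demo, the nested call made by callback 2 picks up at callback 1, and the outer loop then finds the list empty and ends cleanly.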
And fix an error in floatview.
The z-transform program is so I can test out my math and eventually
develop some hopefully interesting sound generators.
Fixes #12
However, this is a bit of a band-aid in that the code for global defs
seems redundant (there is very similar code a little above that is
always executed) and the code for field defs should probably be executed
unconditionally: I suspect the problem fixed by
d5454faeb7 still shows with game code
compiled with recent versions of the compiler, I just haven't tested
any.
The editor now uses the vertical scrollbar for handling mouse wheel
scrolling, thus keeping the scrollbar in sync.
Scrollbar index can now cover the full range (not sure why I had that
-1), and the potential divide by zero is avoided properly.
Thumb-tab now positioned properly when the range is 0.
This gets all the basic cursor motion from my old editor working.
arrow keys: left/right/up/down char/line
home/end: beginning/end of line
page up/down
and ctrl versions where left/right are prev/next word, up/down skip over
indents, home/end are beginning/end of screen, and page up/down are
beginning/end of text.