Shaders can be built as spv files and installed into
$libdir/quakeforge/shaders or as spvc files and compiled into the
engine. Loading supports $builtin/name to access builtin shaders,
$shader/path to access external standard shaders and quake filesystem
access for all other paths.
I had forgotten that msaa samples was governed by the driver (as a max)
and the renderpass setup code simply took the max. Thus why 1 vs 8
caused the display to render incorrectly.
It turned out the msaa setting defaulting to 1 instead of 8 was the
problem no idea why at this stage (need to read up on just how that
setting works). Once I understand just how it works, I'll rework the
msaa handling.
The problem is that I needed to support dynamic types on operators (for
bit-field enums), had things working, but a bad edit messed things up
and I had to rebuild that bit of code. Missed one bit :P
It is capable of parsing single expressions with fairly simple
operations. It current supports ints, enums, cvars and (external) data
structs. It is also thread-safe (in theory, needs proper testing) and
the memory it uses can be mass-freed.
This was inspired by
Hoard: A Scalable Memory Allocator
for Multithreaded Applications
Emery D. Berger, Kathryn S. McKinley, Robert D. Blumofe, Paul R.
Wilson,
It's not anywhere near the same implementation, but it did take a few
basic concepts. The idea is twofold:
1) A pool of memory from which blocks can be allocated and then freed
en-mass and is fairly efficient for small (4-16 byte) blocks
2) Tread safety for use with the Vulkan renderer (and any other
multi-threaded tasks).
However, based on the Hoard paper, small allocations are cache-line
aligned. On top of that, larger allocations are page aligned.
I suspect it would help qfvis somewhat if I ever get around to tweaking
qfvis to use cmem.
The calculation fails (produces NaN) if the vectors are anti-parallel,
but works for all other combinations. I came up with this implementation
when I discovered Unity's Quaternion.FromToRotation could did not work
with very small angles. This implementation will produce a usable
quaternion below 0.00255 degrees (though it will be slightly larger than
unit). Unity's failed such that I could see KSP's skybox snap while it
rotated around my test vessel.
The problem was caused by passing the index into the dtables array to
dtable_get which expects a handle. A handle is the ones-compliment
negative of the index which means that handle 0 is invalid (but 0 was
being passed... oops). Fixes the segfault when qw-client-x11 connects to
a server.
The problem was caused by passing the index into the dtables array to
dtable_get which expects a handle. A handle is the ones-compliment
negative of the index which means that handle 0 is invalid (but 0 was
being passed... oops). Fixes the segfault when qw-client-x11 connects to
a server.
This gets renderpass parsing almost working (not hooked up, though). The
missing bits are support for expressions for flags (namely support for
the | operator) and references (eg $swapchain.format). However, this
shows that the basic concept for the parser is working.
The array has to be allocated using byte elements and thus the size of
the array is the number of bytes, but it needs to be the actual number
of elements in the array. Problem caused by not knowing the actual type
(and C not having type variables anyway).
Nothing is actually done yet other than parsing the built-in property
list to property list items (the actual parser is just a skeleton), but
everything compiles
The property list specifies the base structures for which parser code
will be generated (along with any structures and enums upon which those
structures depend). It also defines option specialized parsers for
better control.
It worked as a proof of concept, but as the code itself needs to be a
bit smarter, it would be a lot smarter to break up that code to make it
easier to work on the individual parts.
PL_ParseDictionary itself does only one level, but it takes care of the
key-field mappings and property list item type checking leaving the
actual parsing to a helper specified by the field. That helper is free
to call PL_ParseDictionary recursively.
The first line of the parsed item is stored and can be retrieved using
PL_Line. Line numbers not stored for dictionary keys yet. Will be 0 for
any items generated by code rather than parsed from a file or string.
The tables are generated from the enums pulled out of the vulkan headers
using a ruamoko program (thanks to its reflection capabilities). They
will be used for parsing property lists used to create render passes and
pipelines.
There's still some cleanup to do, but everything seems to be working
nicely: `make -j` works, `make distcheck` passes. There is probably
plenty of bitrot in the package directories (RPM, debian), though.
The vc project files have been removed since those versions are way out
of date and quakeforge is pretty much dependent on gcc now anyway.
Most of the old Makefile.am files are now Makemodule.am. This should
allow for new Makefile.am files that allow local building (to be added
on an as-needed bases). The current remaining Makefile.am files are for
standalone sub-projects.a
The installable bins are currently built in the top-level build
directory. This may change if the clutter gets to be too much.
While this does make a noticeable difference in build times, the main
reason for the switch was to take care of the growing dependency issues:
now it's possible to build tools for code generation (eg, using qfcc and
ruamoko programs for code-gen).
This fixes the segfault when loading the menu progs. I had forgotten
that the menu code doesn't use PR_LoadProgs (I don't remember why.
Obsolete reason?).
When I ported SEB to python, I discovered that I apparently didn't
really understand the paper's description of the end condition and the
usage of the affine and convex sets for center testing. This cleans up
the test and makes SEB more correct for the cases that have less than 4
supporting points (especially when there are less than 4 points total).
Returning a string was a bad idea as it makes str_str difficult to use
with str_mid. (actually, iirc, it was the only reason I moved all
strings into progs memory... hmm).
This allows a debugger to do any symbol lookups and other preparations
between loading progs and the first code execution. .ctors are called as
per normal if debug_handler is not set.
In testing variable fw/precision in PR_Sprintf, I got a nasty reminder
of the limitations of the current progs ABI: passing @args to another QC
function does not work because the args list gets trampled but the
called function's locals. Thus, the need for a va_copy. It's not quite
the same as C's as it returns the destination args instead of copying
like memcpy, but it does copy the list from the source args to a
temporary buffer that is freed when the calling function returns.
This is the first step in reworking PR_Sprintf to use a state machine.
The goal is to make it more robust against errors and easier to extend
(eg, * width and precision).
And rename prd_exit to prd_terminate (the idea is the host will
terminate the VM). This makes it possible for the debugger to pause the
VM before any code, even a builtin function, is executed. Breaks the
debugger source window, but only because it's not updating on file
change (I think).
I decided I want events for VM enter/exit but enter needs to somehow
pass the function which will be executed (even if a builtin). A generic
void * param seemed the best idea, which meant the error string could be
passed via the param instead of a "global" string in the progs struct.
They take a pointer to a free-list used for hashlinks so the hashlink
pools can be per-thread. However, hash tables that are not updated are
always thread-safe, so this affects only updates. progs_t has been set
up such that it is easy for multiple progs within one thread can share
hashlinks.
While there was a breakpoint hook, it was for only breakpoints and more
was needed. Now there's a generic hook that is called for tracing,
breakpoints, watch points, runtime errors and VM errors, with the
"event" type passed as the first parameter and a data pointer in the
second.
This allows for the four combinations of shift and control. Not
bothering with alt because alt-f4 closes my xterm (bbkeys from the looks
of it: it grabs a bunch of Mod1-* keys).
The idea is to find th def that contains the address. Had to write my
own bsearch (well... lifted from wikipedia) because libc's is exact. The
defs are assumed to be sorted (which qfcc now ensures when it writes
progs and sym files).
Type encodings are used whenever they are available. For now, if they
are not, then everything is treated as void (which prints <void>, not
very useful). Most return statements and references to .return are now
very readable (excluding structs), and only params going through "..."
are a messy union.
The memset instructions now match the move* instructions other than the
first operand (always int). Probably breaks much, but fixed in next few
commits.
This returns the character (as an int) at the index. Equivalent to
string[index], but qc code doesn't have char-level access and not having
it means that strings can internally change to wchar without too much
fuss (maybe).
If a temp string is found in the return slot, PR_FreeTempStrings won't
delete the string. However, PR_PopFrame was blindly stomping on the
possibly surviving temp string with the push strings, which would cause
a leak.
This causes the block to be freed when the forward: handler returns
(assuming it's not yet another builtin). This is necessary so calling a
lot of forwarded messages in a loop doesn't leak memory (though it will
get freed eventually).
This "pushes" a temp string onto the callee's stack frame after removing
it from the caller's stack frame. This is so builtins can pass
auto-freed memory to called progs code. No checking is done, but mayhem
is likely to ensue if a string is pushed that was allocated in an
earlier frame.
With this, object's implementing forward:: seem to accept the message
well, including receiving all the original args (not quite sure how to
deal with them in ruamoko code just yet, though).
PR_AllocTempBlock() works the same way as PR_SetTempString(), except
that it takes a size parameter and always allocates (never tries to
merge). This is, in a way, abusing the string system, but I needed a way
to allocate a block of progs memory that would be automatically freed
when the current frame ended. The biggest abuse is the need to cast away
the const of PR_GetString()'s return value.
libr supplies an __obj_forward definition that links to a builtin, but
as it is the only def in its object file, it is readily replaceable by
an alternative Ruamoko implementation.
The builtin version currently simply errors out (rather facetiously),
but only as a stub to allow progs to load.
This should speed up ruamoko code somewhat as hash table lookups have
been replaced with direct array indexing. As a bonus, support for
message forwarding has been added (though not tested).
Move the semi-permanent resource initialisation into the module init and
the cleanup of those resources into cleanup. Makes actual runtime init
much easier to read.
Rather than relying on progs code version, use the string to determine
whether PR_Sprintf should behave as if floats have been promoted through
... I imagine I'll get to the rest of the server code at some stage.
With these two changes, nq-x11 works again (teleporters were the
symptom).
This is one step closer to implementing conformsToProtocol. However,
protocols are not yet initialized correctly: they are not registered,
nor are their selectors.
While the static initializer list pointer was not written previously,
the module struct always came immediately after the symbols struct, and
the module version has so far always been 0. Thus, the list pointer is
correctly 0 for older progs and there's no need for a version bump.
With this, the VA is very close to being safe to use in a threaded
environment (so long as each VM is used by only one thread). Just the
debug file hash and source paths to sort out.
Other than consistency with printf(), I'm not sure why we went with the
printed size as the return value; returning the resultant strings makes
much more sense as dsprintf() (etc) can then be used as a safe va()
Other than its blocking of access to certain files, it really wasn't
that useful compared to the functions in qfs, and pointless with access
to qfs anyway.
The progs execution code will call a breakpoint handler just before
executing an instruction with the flag set. This means there's no need
for the breakpoint handler to mess with execution state or even the
instruction in order to continue past the breakpoint.
The flag being set in a progs file is invalid.
For technical reasons (programmer laziness), qfcc does not fix up local
def type encodings when writing the debug symbols file (type encoding
location not readily accessible).
The debug subsystem now uses the resources system to ensure it cleans
up, and its data is now semi-private. Unfortunately, PR_LoadDebug had to
remain public for qfprogs because using PR_RunLoadFuncs would cause
builtin resolution to complain.
It is now set to 0 when progs are loaded and every time
PR_ExecuteProgram() returns. This takes care of the default case, but
when setting parameters, pr_argc needs to be set correctly in case a
vararg function is called.
PR_SaveParams() is required for implementing the +initialize diversion
used by Objective-QuakeC because builtins do not have local def spaces
(of course, a normal stack calling convention would help). However, it
is entirely possible for a call to +initialize to trigger another call
to +initialize, thus the need for stacking parameter stashes. As a
bonus, this implementation cleans up some fields in progs_t.
The initial code was pretty much a port of the code in the editor I
wrote 25 years ago. Either I didn't think of the optimization back then,
or I tried to implement it, failed, and figured it wasn't worth it
(despite using it on a 386dx33). However, I noticed it now and it was
easy enough to get working, and it's always good to not do something
that's not needed.
When the substring is the tail of the supplied string, return a
"pointer" to within the supplied string rather than a new "return"
string. This means that tail-end substrings of string constants are
themselves constants.
The engine now requires non-v6 progs to store the log2 alignment for the
param struct in .param_alignment.
PR_EnterFunction is clearer and possibly more efficient.
Only as scalars, I still need to think about what to do for vectors and
quaternions due to param size issues. Also, doubles are not yet
guaranteed to be correctly aligned.
It turned out I needed access to the physical device from a buffer
object, so rather than storing the vulkan logical device directly in
buffer (and other) objects, store the qfv logical device.
The better accuracy is for specific cases (90 degree rotations around a
main axis: the matrix element for that axis is now 1 instead of
0.99999994). The speedup comes from doing fewer additions (multiply
seems to be faster than add for fp, at least in this situation).
I added Sys_RegisterShutdown years ago and never really did anything
with it: now any system that needs to be shutdown can ensure it gets
shutdown on program exit, and in the correct order (ie, reverse to init
order).
This makes sure that some unchecked event doesn't cause a lockup.
However, blocking input is really not the way to go: need to implement a
state machine and use non-blocking event reads.
Or really, allow it if the user specifically requests it: the default is
blocked. Modern systems (particularly displays) do not really like
changing resolution, so doing so by default seems rather wrong.
This makes sure that some unchecked event doesn't cause a lockup.
However, blocking input is really not the way to go: need to implement a
state machine and use non-blocking event reads.
Or really, allow it if the user specifically requests it: the default is
blocked. Modern systems (particularly displays) do not really like
changing resolution, so doing so by default seems rather wrong.
It's just a wrapper around hashtab, but it makes checking if a string is
in a set easy. Way overkill when only a few extensions are enabled, but
more might come later.
This paves the way for clean initialization of the Vulkan renderer, and
very much cleans up the older renderer initialization code as gl and sw
are no longer intertwined.
This fixes the segfault and pushes things very much in the desired
direction of proper system independence for rendering and presentation
separation (though things were headed in the right direction before).
Things are still a mess, but a proper cleanup will be a lot of work and
will, really, involve properly splitting quake-specific code* out from
the rest of the renderer.
* data loading and format specific stuff
A single graphics-capable queue should be enough for now. However, I'm
not sure I'm happy with a lot of the code: it's a bit difficult to write
flexibly configured code for Vulkan (or seems to be at this stage),
especially in C.
After messing with SIMD stuff for a little, I think I now understand why
the industry went with xyzw instead of the mathematical wxyz. Anyway, this
will make for less pain in the future (assuming I got everything).
I've decided that setting pr.max_edicts and pr.zone_size as part of the
local progs initialization rather than in PR_LoadProgsFile makes more
sense. For one, it is unlikely for the limits to change every time progs is
reloaded. Also, they seem to be a property of the VM rather than the progs.
However, there is nothing stopping the caller from updating max_edicts and
zone_size every call.
I'm not certain despair actually meant for the break to be there. It
certainly would have sped up the game a bit but at the expense of proper
blood trails in the software renderers.
These are the ones where I could easily make scan-build happy. They do seem
to be potential holes where invalid data in one place could result in use
of uninitialized values.
While scan-build wasn't what I was looking for, it has proven useful
anyway: many of the sizeof errors were just noise, but a few were actual
bugs (allocating too much or too little memory).
This was caused by an out-by one error when setting up the skin: if cmap
was 0 then the gl_skin struct would be taken from index -1 of the array and
thus cause all sorts of grief.
GL sometimes crashes when building skins. This probably isn't the correct
fix (finding the situation where fb->tex can become NULL despite fb being
non-null is), but it does kill the segfault. Luckily, this is git and this
commit can just be reverted when the real fix shows up. :)
Also fix a bug where despite supporting 32 buttons, only 18 were actually
supported, and a similar issue for the number of axes.
My saitek x52 has 34 buttons and 10 axes. Whee.
This fixes the problem of not finding files without a .gz extension when
gzip support is enabled (most of my quake data is compressed, so it took a
while for me to notice the problem).
The search for these files will stop in the vpath that contains the .bsp
file to which they belong. This will prevent problems with
id1/maps/start.lit being used for shadows/maps/start.bsp.
_QFS_VOpenFile is actually _QFS_FOpenFile reimplemented to take vpath start
and end parameters so the search can be limited. QFS_VOpenFile,
_QFS_FOpenFile, and QFS_FOpenFile are all wrappers for _QFS_VOpenFile.
A vpath is the union of all locations searched for a file in a single
gamedir (eg, shadows, id1 etc). This is a necessary step to preventing
problems like id1/maps/start.lit being used for shadows/maps/start.bsp.
However, QFS_FilelistFill still needs to be reworked as it does not compile
yet (testing was done with a gutted QFS_FilelistFill).
It seems mesa still has the bug where non-array attributes don't work
when set as attribute 0, and that the allocation order changed sometime
since I last tested with mesa. This fixes the black world and flickering
alias models on my eeepc.
So far, alias model rendering is the only victim, but things are working,
even if only color map lookup and fog blending are broken out at this
stage.
I expect the effect naming scheme will go through some changes until I'm
happy with it.
Again, based on The OpenGL Shader Wrangler. The wrangling part is not used
yet, but the shader compiler has been modified to take the built up shader.
Just to keep things compiling, a wrapper has been temporarily created.
The idea comes from The OpenGL Shader Wrangler
(http://prideout.net/blog/?p=11). Text files are broken up into chunks via
lines beginning with -- (^-- in regex). The chunks are optionally named
with tags of the form: [0-9A-Za-z._]+. Unnamed chunks cannot be found.
Searching for chunks looks for the longest tag that matches the beginning
of the search tag (eg, a chunk named "Vertex" will be found with a search
tag of "Vertex.foo"). Note that '.' forms the units for the searc
("Vertex.foo" will not find "Vertex.f").
Unlike glsw, this implementation does not have the concept of effects keys
as that will be separate. Also, this implementation takes strings rather
than file names (thus is more generally useful).
The offset to compensate for st++ was missing.
Obviously, the code has never been tested. Found while looking at the
jump code and thinking about using 32-bit addresses for the jump tables.
set_bits_t is now 64 bits for x86_64 machines (in linux, anyway). This gave
qfvis a huge speed boost: from ~815s to ~720s.
Also, expose some of the set internals so custom set operators can be
created.
Now we can get tight (<1e-6 * radius_squared error) bounding spheres. More
importantly (for qfvis, anyway) very quickly: 1.7Mspheres/second for a 5
point cloud on my 2.33GHz Core 2 :)
It "works" for lines, triangles and tetrahedrons. For lines and triangles,
it gives the barycentric coordinates of the perpendicular projection of the
point onto to features. Only tetrahedrons are guaranteed to reproduce the
original point.
Rather than prefixing free_ to the supplied name, suffix _freelist to the
supplied name. The biggest advantage of this is it allows the free-list to
be a structure member. It also cleans up the name-space a little.
I'd forgotten that ED_ConvertToPlist mangled light into light_lev and
single component angle values into a vector. This fixes much of the
breakage in qflight (but not the light levels)
Sys_LongTime returns time in microseconds as a 64-bit int. Sys_DoubleTime
uses Sys_LongTime, converts to double and offsets 0 time by 4G (2**32).
This gives us consistent sub-microsecond precision for a very long time.
See http://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/
Once and for all: remove the default and move the Sys_Error outside the
switch (changing appropriate breaks to returns). Now gcc will let me know
when I forget to update the switch statements.
I'd missed a set of bit->lightnum conversions that resulting in lightnum
becoming much greater than 128 and thus trashing memory when the surface
was marked.
This give much better control over individual joystick axes. They now have
per-axis pre and post amplification, the linear controller mappings are
more intuitive, and axes can now bet setup as buttons using thresholds.
Many thanks to Johnny on Flame for his work on the "user interface".
Johnny's number->J_AXISn mapping is preserved, but I had intended for any
key to be supported (J_AXISn was just to ensure free keys were available).
This gives both methods (and some range checking on the axis button
number).
Thanks to leilei being a diehard sw quake fan, and MH being the hacker he
is, engoo's vid_win.c drops Scitech's MGL :) (I really did not want to
resurrect that). However, I've modified it so it /compiles/ in QF: ripped
out the menu code, ripped out the input handling (that's in in_win.c) and
started trying to get it to work for vid_render. The clients at least link,
but I'm certain they'll segfault (GPF?).
The win clients are the native windows (NOT sdl!! *twitch*). Things are
already looking on the up: only three errors in in_win.c. I'm not looking
forward to vid_win.c (ex vid_wgl.c), though.
First, this completely smashes joystick input: it will not work (though it
doesn't crash). This is because there is, as of yet, no means to configure
the system.
Each joystick axis has:
- per-axis amplification (both pre and post).
- per-axis offset (offset applied after pre-amp but before post amp)
- selectable destination:
- linear delta: position and angles (as before)
- axis button: if the value crosses the threshold, the given key is
pressed or released as appropriate.
The axis amplification still uses joy_amp and joy_pre_amp (and
in_amp/in_pre_amp), but now also has the per-axis settings.
The per-axis offset is most useful for axis buttons. For example, the xbox
360 controller triggers are analong but go "all the way to negative on 0
state". Offsetting the input keeps axis button thresholds simple.
Amplification and offset is applied before anything is done with the axis
value. The formula is:
joy_amp * in_amp * axis-amp *
(offset + value * joy_pre_amp * in_pre_amp * axis-pre_amp)
Axis button thresholds are very simple: if the sign of the value is the
same as the sign of the threshold and abs(value) >= abs(threshold), the
button is pressed. While multiple thresholds and keys can be placed on an
axis, only one can be pressed at a time. The threshold furthest from 0
wins.
The ~ gets expanded to CSIDL_LOCAL_APPDATA, $HOME, $USERPROFILE or just
".", whichever succeeds first. The usual location will be:
"C:\windows\profiles\<user>\Local Settings\Application Data".
"." is now the fallback for *nix systems too.
The seed is currently 0xdeadbeef, but I intend on fixing that soon. Now the
particle velocities and origins use fully independent bits (though a big
chunk is wasted right now).
This gives QF a consistent qualilty PRNG on all platforms. The
implementation is slightly different from the standard, but gives the same
results for the same speed (details in mersenne.c).
This is a quick fix until I get a random number generator into QF.
Mingw's RAND_MAX is only 0x7fff and so the (((rnd >> 10) & 63) - 31.5) / 63.0
used for the z component of origin and velocity would never go positive.
For now, change the 10 to 9 (reusing another bit from Y). I plan on
implementing a full 32-bit PRNG in QF so we always have a reliable
generator.
It was pointed out by Blub\w (gmqcc) that OP_MUL_FV and friends were buggy
when the operands overlapped (eg, x = x.x * x) as the result would become
'x.x*x.x x.y*x.x*x.x x.z*x.x*x.x' (note the x.x squared for y and z). On
testing, sure enough the bug was present (and is a nice demonstration that
QF's VM does NOT have strict-aliasing bugs). As a very nice benefit: the
code produced by the fixes is actually faster than the broken version :).
The ruamoko code used for testing:
void (string fmt, ...) printf = #0;
vector foo (vector x)
{
x = x * x.x;
return x;
}
vector bar (vector x)
{
x = x.x * x;
return x;
}
int main ()
{
vector x = '2 3 4';
vector y = foo (x);
vector z = bar (x);
printf ("x=%v y=%v z=%v 2*x=%v\n", x, y, z, 2*x);
return 0;
}
Now the user can create and destroy IMTs at will, though currently
destroying IMTs is currently all or nothing (imt_drop_all).
An IMT is created via imt_create which takes the keydest name (key_game
etc), the name of the IMT (must be unique for all IMTs) and optionally the
name of the IMT to which the key binding search will fall back if there is
no binding in the current IMT, but must be already defined and on the same
keydest. This means that IMTs now have user determined fallback paths. The
requirements for the fallback IMT prevent loops and other weird behaviour.
Actual key binding via in_bind is unaffected. This is why the IMT name must
be unique across all IMTs.
The "imt" command works with the key_game keydest, but imt_keydest is
provided for specifying the active IMT for a specific keydest.
At startup, default IMTs are setup to emulate the previous static IMTs so
old configs will continue to work (mostly). New config files will be
written with commands to drop all of the current IMTs and build new ones,
with the bindings and active IMT set as well.
This fixes the status bar refresh issues in sw. The problem was that with
two viddef's hanging around, things got a little confused and recalc_refdef
wasn't getting into the renderer.