For most (if not all) maps. The heapsort is needed only if the clustered
leafs are not contiguous, but most bsp compilers output contiguous leaf
clusters, so is just a bit of protection. The difference isn't really
noticeable on a fast machine, but no point in doing more work than
necessary.
Now that only 3852 clusters need to be checked for each cluster, fat-pvs
construction for ad_tears completes in about 0.7s, most of which seems
to be loading, conversion, compression and writing. O(N^3) cuts both
ways (hurts like crazy when N increases, does wonders when N decreases,
especially by a factor of 25). And then throw in improved cache
performance...
I suspect having an off-line compiler is still useful, but even if
qfvis's implementation never actually gets used, if cluster
reconstruction is put in the engine, large maps will be feasible even
for quakeworld. Just the reduced memory requirements alone will be a
huge benefit (~3GB down to 1.8MB).
This is only the first half (vertical) in that the vis bits are still
for the leafs rather than the clusters, but ad_tears goes from 500s to
7s for calculating the fat pvs (3852 clusters).
The output fat-pvs data is the *difference* between the base pvs and fat
pvs. This currently makes for about 64kB savings for marcher.bsp, and
about 233MB savings for ad_tears.bsp (or about 50% (470.7MB->237.1MB)).
I expect using utf-8 encoding for the run lengths to make for even
bigger savings (the second output fat-pvs leaf of marcher.bsp is all 0s,
or 6 bytes in the file, which would reduce to 3 bytes using utf-8).
Extremely large maps take a very long time to process their PVS sets for
PHS or shadows, so having an off-line compiler seems like a good idea.
The data isn't written out yet, and the fat pvs code may not be optimal
for cache access, but it gets through ad_tears in about 500s (12
threads, compared to 2100s single-threaded in the qw server).