These two functions draw vertical lines for 4 neighboring pixel columns at a
time. This gives a significant speed boost on x86_64 (where we have plenty of
registers) for a scene of full-screen solid or masked walls: about 60 --> 76 fps.
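
A rough sketch of the 4-columns-at-a-time idea, for illustration only. The
struct, the function name and the fixed-point layout are assumptions made for
this example and are not taken from a-c.c:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-column state: texture column, fixed-point position, step. */
typedef struct {
    const uint8_t *tex;
    uint32_t pos;
    uint32_t inc;
} column_t;

/* Draw cnt rows (cnt >= 1) of 4 horizontally adjacent columns.
   Each row writes 4 neighboring pixels before moving down by 'pitch'. */
static void vline4_sketch(int32_t cnt, column_t col[4], int shift,
                          uint8_t *dst, ptrdiff_t pitch)
{
    do {
        for (int i = 0; i < 4; i++) {
            /* The top bits of the fixed-point position select the texel. */
            dst[i] = col[i].tex[col[i].pos >> shift];
            col[i].pos += col[i].inc;
        }
        dst += pitch;
    } while (--cnt);
}
```

Keeping four positions, four increments and four texture pointers live at once
is where the larger x86_64 register file would be expected to help.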
git-svn-id: https://svn.eduke32.com/eduke32@2497 1a8010ca-5511-0410-912e-c29ae57300e0
- forgot a glogy --> logy in a-c.c
- comment out stretchhline and slopevlin2 in a.nasm, the former also in a-c.c
- make transmaskvline2 use a uintptr_t where appropriate
git-svn-id: https://svn.eduke32.com/eduke32@2448 1a8010ca-5511-0410-912e-c29ae57300e0
Hlines for masked and translucent masked ceiling/floor (sprites).
- apply the 'for' --> 'do { ... } while (--cnt)' transformation, making these
functions iterate cnt+1 times like the asm version. This also fixes an
off-by-one issue where sprites or masked ceilings/floors had a one-pixel
non-drawn line to the right.
- This time, only make local copies of two 'extern' globals (asm1 and asm2).
It seems that I was too eager with "localing" all file-scoped vars earlier.
GCC is able to remove the loads from memory inside the loop by itself, whereas
clang is not. This is not trivial, since it has to prove that the 'screen'
pointer passed to the functions will never alias these globals.
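
A minimal sketch of the aliasing issue, assuming a byte-wide screen pointer.
The globals g_inc and g_pal and both function names are hypothetical stand-ins,
not the real asm1/asm2 usage:

```c
#include <stdint.h>

/* Hypothetical globals with external linkage, read inside the pixel loop. */
uint32_t g_inc;
const uint8_t *g_pal;

/* Reads the globals directly. A store through 'screen' (a character-type
   pointer) could in principle modify g_inc or g_pal, so unless the compiler
   can prove otherwise it has to reload both on every iteration. */
void fill_global(uint8_t *screen, int32_t cnt, uint32_t pos)
{
    do {
        *screen++ = g_pal[pos >> 24];
        pos += g_inc;
    } while (--cnt);
}

/* Copies the globals into const locals first. The locals never have their
   address taken, so no store through 'screen' can touch them and they can
   stay in registers for the whole loop. */
void fill_local(uint8_t *screen, int32_t cnt, uint32_t pos)
{
    const uint32_t inc = g_inc;
    const uint8_t *const pal = g_pal;

    do {
        *screen++ = pal[pos >> 24];
        pos += inc;
    } while (--cnt);
}
```

Both functions write the same pixels as long as 'screen' really does not
overlap the globals; only the generated loads differ.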
git-svn-id: https://svn.eduke32.com/eduke32@2424 1a8010ca-5511-0410-912e-c29ae57300e0
Affected functions: hlineasm4, vlineasm1, mvlineasm1, tvlineasm1.
Optimizations:
- declare all used variables as possibly const-qualified locals in each
function. This removes unnecessary loads from memory in the loops.
- rewrite "for (; cnt>=0; cnt--) {...}" to "cnt++; do {...} while (--cnt);"
in the last three (yes, these functions iterate cnt+1 times). This
makes them functionally equivalent to the asm versions (madness ensues for
cnt < 0) and allows the compiler to remove one 'test' instruction at the
end of each loop.
- in the translucence function, replace addition with ORing
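
The loop rewrite and the addition-to-OR substitution listed above can be
sketched as follows. This is a simplified illustration: the function names,
the fixed-point stepping and the blend-table index layout are assumptions for
the example, not the actual code in a-c.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Original loop shape: runs cnt+1 times for cnt >= 0. */
static void vline_for(int32_t cnt, uint32_t pos, uint32_t inc,
                      const uint8_t *tex, int shift,
                      uint8_t *dst, ptrdiff_t pitch)
{
    for (; cnt >= 0; cnt--) {
        *dst = tex[pos >> shift];
        dst += pitch;
        pos += inc;
    }
}

/* Rewritten shape: same cnt+1 iterations, but the loop condition is just
   the decrement reaching zero, so no separate comparison is needed. */
static void vline_do(int32_t cnt, uint32_t pos, uint32_t inc,
                     const uint8_t *tex, int shift,
                     uint8_t *dst, ptrdiff_t pitch)
{
    /* Guard added for this sketch only; the equivalence holds for cnt >= 0. */
    assert(cnt >= 0);

    cnt++;
    do {
        *dst = tex[pos >> shift];
        dst += pitch;
        pos += inc;
    } while (--cnt);
}

/* Hypothetical index into a 256x256 blend table. The low 8 bits of
   (dst << 8) are always zero, so adding an 8-bit value and ORing it in
   give the same result. */
static uint32_t blend_index_add(uint8_t src, uint8_t dst)
{
    return ((uint32_t)dst << 8) + src;
}

static uint32_t blend_index_or(uint8_t src, uint8_t dst)
{
    return ((uint32_t)dst << 8) | src;
}
```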
Observations (system: Core2 Duo Linux x86_64):
With a 1680x1050 window fully covered by the respective type of wall (simple,
masked, trans. masked), fps increases by 3-4 from the baseline of approx. 60.
git-svn-id: https://svn.eduke32.com/eduke32@2405 1a8010ca-5511-0410-912e-c29ae57300e0
* Sprite cstat 2048 ('use own shade', [N]) now works more or less. (Issues may arise when combined with sector light effects.)
* Begin work on 'smart' tag labeling system for Mapster32. Right now, it only displays a '+' after tags with linking semantics.
git-svn-id: https://svn.eduke32.com/eduke32@1866 1a8010ca-5511-0410-912e-c29ae57300e0