registers AMD64 provides, this routine still needs to be written as self-
modifying code for maximum performance. The additional registers do allow
for further optimization over the x86 version by allowing all four pixels
to be in flight at the same time. The end result is that AMD64 ASM is about
2.18 times faster than AMD64 C and about 1.06 times faster than x86 ASM.
(For further comparison, AMD64 C and x86 C are practically the same for
this function.) Should I port any more assembly to AMD64, mvlineasm4 is the
most likely candidate, but it's not used enough at this point to bother.
Also, this may or may not work with Linux at the moment, since it doesn't
have the eh_handler metadata. Win64 is easier, since I just need to
structure the function prologue and epilogue properly and use some
assembler directives/macros to automatically generate the metadata. And
that brings up another point: You need YASM to assemble the AMD64 code,
because NASM doesn't support the Win64 metadata directives.
- Added an SSE version of DoBlending. This is strictly C intrinsics.
VC++ still throws around unneccessary register moves. GCC seems to be
pretty close to optimal, requiring only about 2 cycles/color. They're
both faster than my hand-written MMX routine, so I don't need to feel
bad about not hand-optimizing this for x64 builds.
- Removed an extra instruction from DoBlending_MMX, transposed two
instructions, and unrolled it once, shaving off about 80 cycles from the
time required to blend 256 palette entries. Why? Because I tried writing
a C version of the routine using compiler intrinsics and was appalled by
all the extra movq's VC++ added to the code. GCC was better, but still
generated extra instructions. I only wanted a C version because I can't
use inline assembly with VC++'s x64 compiler, and x64 assembly is a bit
of a pain. (It's a pain because Linux and Windows have different calling
conventions, and you need to maintain extra metadata for functions.) So,
the assembly version stays and the C version stays out.
- Removed all the pixel doubling r_detail modes, since the one platform they
were intended to assist (486) actually sees very little benefit from them.
- Rewrote CheckMMX in C and renamed it to CheckCPU.
- Fixed: CPUID function 0x80000005 is specified to return detailed L1 cache
only for AMD processors, so we must not use it on other architectures, or
we end up overwriting the L1 cache line size with 0 or some other number
we don't actually understand.
SVN r1134 (trunk)
surprised if this doesn't build in Linux right now. The CMakeLists.txt
were checked with MinGW and NMake, but how they fair under Linux is an
unknown to me at this time.
- Converted most sprintf (and all wsprintf) calls to either mysnprintf or
FStrings, depending on the situation.
- Changed the strings in the wbstartstruct to be FStrings.
- Changed myvsnprintf() to output nothing if count is greater than INT_MAX.
This is so that I can use a series of mysnprintf() calls and advance the
pointer for each one. Once the pointer goes beyond the end of the buffer,
the count will go negative, but since it's an unsigned type it will be
seen as excessively huge instead. This should not be a problem, as there's
no reason for ZDoom to be using text buffers larger than 2 GB anywhere.
- Ripped out the disabled bit from FGameConfigFile::MigrateOldConfig().
- Changed CalcMapName() to return an FString instead of a pointer to a static
buffer.
- Changed startmap in d_main.cpp into an FString.
- Changed CheckWarpTransMap() to take an FString& as the first argument.
- Changed d_mapname in g_level.cpp into an FString.
- Changed DoSubstitution() in ct_chat.cpp to place the substitutions in an
FString.
- Fixed: The MAPINFO parser wrote into the string buffer to construct a map
name when given a Hexen map number. This was fine with the old scanner
code, but only a happy coincidence prevents it from crashing with the new
code
- Added the 'B' conversion specifier to StringFormat::VWorker() for printing
binary numbers.
- Added CMake support for building with MinGW, MSYS, and NMake. Linux support
is probably broken until I get around to booting into Linux again. Niceties
provided over the existing Makefiles they're replacing:
* All command-line builds can use the same build system, rather than having
a separate one for MinGW and another for Linux.
* Microsoft's NMake tool is supported as a target.
* Progress meters.
* Parallel makes work from a fresh checkout without needing to be primed
first with a single-threaded make.
* Porting to other architectures should be simplified, whenever that day
comes.
- Replaced the makewad tool with zipdir. This handles the dependency tracking
itself instead of generating an external makefile to do it, since I couldn't
figure out how to generate a makefile with an external tool and include it
with a CMake-generated makefile. Where makewad used a master list of files
to generate the package file, zipdir just zips the entire contents of one or
more directories.
- Added the gdtoa package from netlib's fp library so that ZDoom's printf-style
formatting can be entirely independant of the CRT.
SVN r1082 (trunk)
arbitrary point. It has been replaced with a variant that takes a polyobject
as a source, since that was the only use that couldn't be rewritten with the
other variants. This also fixes the bug that polyobject sounds were not
successfully saved and caused a crash when reloading the game. Note that
this is a significant change to how equality of sound sources is determined,
so some things may not behave quite the same as before. (Which would be a
bug, but hopefully everything still sounds the same.)
SVN r1059 (trunk)
the rest of the game is concerned, these sounds will never stop once they
have been started until they are explicitly stopped. If they are evicted
from their channels, the sound code will restart them as soon as possible.
This means that instead of this:
if (!S_IsActorPlayingSomething(actor, CHAN_WEAPON, -1))
{
S_Sound(actor, CHAN_WEAPON|CHAN_LOOP, soundid, 1, ATTN_NORM);
}
The following is now just as effective:
S_Sound(actor, CHAN_WEAPON|CHAN_LOOP, soundid, 1, ATTN_NORM);
There are also a couple of other ramifications presented by this change:
* The full state of the sound system (sans music) is now stored in save
games. Any sounds that were playing when you saved will still be
playing when you load. (Try saving while Korax is making a speech in
Hexen to hear it.)
* Using snd_reset will also preserve any playing sounds.
* Movie playback is disabled, probably forever. I did not want to
update the MovieDisable/ResumeSound stuff for the new eviction
tracking code. A properly updated movie player will use the VMR,
which doesn't need these functions, since it would pipe the sound
straight through the sound system like everything else, so I decided
to dump them now, which leaves the movie player in a totally unworkable
state.
June 26, 2008
- Changed S_Sound() to take the same floating point attenuation that the
internal S_StartSound() uses. Now ambient sounds can use the public
S_Sound() interface.
- Fixed: S_RelinkSound() compared the points of the channels against the
from actor's point, rather than checking the channels' mover.
- Changed Strife's animated doors so that their sounds originate from the
interior of the sector making them and not from the entire vertical height
of the map.
SVN r1055 (trunk)
P_FindFloorCeiling never did that.
- Merged Check_Sides and PIT_CrossLine into A_PainShootSkull.
- Replaced P_BlockLinesIterator with FBlockLinesIterator in all places it was
used. This also allowed to remove all the global variable saving in
P_CreateSecNodeList.
- Added a new FBlockLinesIterator class that doesn't need a callback
function because debugging the previous bug proved to be a bit annoying
because it involved a P_BlockLinesIterator loop.
- Fixed: The MBF code to move monsters away from dropoffs did not work as
intended due to some random decisions in P_DoNewChaseDir. When in the
avoiding dropoff mode these are ignored now. This should cure the problem
that monsters hanging over a dropoff tended to drop down.
SVN r887 (trunk)
- Added .txt files to the list of types (wad, zip, and pk3) that can be
loaded without listing them after -file.
- Fonts that are created by the ACS setfont command to wrap a texture now
support animated textures.
- FON2 fonts can now use their full palette for CR_UNTRANSLATED when drawn
with the hardware 2D path instead of being restricted to the game palette.
- Fixed: Toggling vid_vsync would reset the displayed fullscreen gamma to 1
on a Radeon 9000.
- Added back the off-by-one palette handling, but in a much more limited
scope than before. The skipped entry is assumed to always be at 248, and
it is assumed that all Shader Model 1.4 cards suffer from this. That's
because all SM1.4 cards are based on variants of the ATI R200 core, and the
RV250 in a Radeon 9000 craps up like this. I see no reason to assume that
other flavors of the R200 are any different. (Interesting note: With the
Radeon 9000, D3DTADDRESS_CLAMP is an invalid address mode when using the
debug Direct3D 9 runtime, but it works perfectly fine with the retail
Direct3D 9 runtime.) (Insight: The R200 probably uses bytes for all its
math inside pixel shaders. That would explain perfectly why I can't use
constants greater than 1 with PS1.4 and why it can't do an exact mapping to
every entry in the color palette.
- Fixed: The software shaded drawer did not work for 2D, because its selected
"color"map was replaced with the identitymap before being used.
- Fixed: I cannot use Printf to output messages before the framebuffer was
completely setup, meaning that Shader Model 1.4 cards could not change
resolution.
- I have decided to let remap palettes specify variable alpha values for
their colors. D3DFB no longer forces them to 255.
- Updated re2c to version 0.12.3.
- Fixed: A_Wander used threshold as a timer, when it should have used
reactiontime.
- Fixed: A_CustomRailgun would not fire at all for actors without a target
when the aim parameter was disabled.
- Made the warp command work in multiplayer, again courtesy of Karate Chris.
- Fixed: Trying to spawn a bot while not in a game made for a crashing time.
(Patch courtesy of Karate Chris.)
- Removed some floating point math from hu_scores.cpp that somebody's GCC
gave warnings for (not mine, though).
- Fixed: The SBarInfo drawbar command crashed if the sprite image was
unavailable.
- Fixed: FString::operator=(const char *) did not release its old buffer when
being assigned to the null string.
- The scanner no longer has an upper limit on the length of strings it
accepts, though short strings will be faster than long ones.
- Moved all the text scanning functions into a class. Mainly, this means that
multiple script scanner states can be stored without being forced to do so
recursively. I think I might be taking advantage of that in the near
future. Possibly. Maybe.
- Removed some potential buffer overflows from the decal parser.
- Applied Blzut3's SBARINFO update #9:
* Fixed: When using even length values in drawnumber it would cap to a 98
value instead of a 99 as intended.
* The SBarInfo parser can now accept negatives for coordinates. This
doesn't allow much right now, but later I plan to add better fullscreen
hud support in which the negatives will be more useful. This also cleans
up the source a bit since all calls for (x, y) coordinates are with the
function getCoordinates().
- Added support for stencilling actors.
- Added support for non-black colors specified with DTA_ColorOverlay to the
software renderer.
- Fixed: The inverse, gold, red, and green fixed colormaps each allocated
space for 32 different colormaps, even though each only used the first one.
- Added two new blending flags to make reverse subtract blending more useful:
STYLEF_InvertSource and STYLEF_InvertOverlay. These invert the color that
gets blended with the background, since that seems like a good idea for
reverse subtraction. They also work with the other two blending operations.
- Added subtract and reverse subtract blending operations to the renderer.
Since the ERenderStyle enumeration was getting rather unwieldy, I converted
it into a new FRenderStyle structure that lets each parameter of the
blending equation be set separately. This simplified the set up for the
blend quite a bit, and it means a number of new combinations are available
by setting the parameters properly.
SVN r710 (trunk)
wadsrc/Makefile, nor did it use the makefile for cleaning.
- Added ST_NetMessage() for mixing miscellaneous messages with the network
startup meter, since they get mixed in the same space on the Linux terminal
and must be handled properly to avoid looking bad.
SVN r429 (trunk)
- If you aren't targeting x86, m_fixed.h only includes basicinlines.h now.
- Moved x64inlines.h into basicinlines.h.
- Replaced uses of __int64 with types from doomtype.h.
- The stop console command no longer ends single player games, just the demo
that was being recorded.
- In C mode, the sc_man parser no longer allows multi-line string constants
without using the \ character to preface the newline character. This makes
it much easier to diagnose errors where you forget the closing quote of a
string.
- Fixed: V_BreakLines() added the terminating '\0' to the last line of the
input string.
- Added font as a parameter to V_BreakLines and removed its keepspace
parameter, which was never passed as anything other than the default.
SVN r331 (trunk)
- The stat meters now return an FString instead of sprintfing into a fixed
output buffer.
- NOASM is now automatically defined when compiling for a non-x86 target.
- Some changes have been made to the integral types in doomtype.h:
- For consistancy with the other integral types, byte is no longer a
synonym for BYTE.
- Most uses of BOOL have been change to the standard C++ bool type. Those
that weren't were changed to INTBOOL to indicate they may contain values
other than 0 or 1 but are still used as a boolean.
- Compiler-provided types with explicit bit sizes are now used. In
particular, DWORD is no longer a long so it will work with both 64-bit
Windows and Linux.
- Since some files need to include Windows headers, uint32 is a synonym
for the non-Windows version of DWORD.
- Removed d_textur.h. The pic_t struct it defined was used nowhere, and that
was all it contained.
SVN r326 (trunk)
- Added multiple-choice sound sequences. These overcome one of the major
deficiences of the Hexen-inherited SNDSEQ system while still being Hexen
compatible: Custom door sounds can now use different opening and closing
sequences, for both normal and blazing speeds.
- Added a serializer for TArray.
- Added a countof macro to doomtype.h. See the1's blog to find out why
it's implemented the way it is.
<http://blogs.msdn.com/the1/articles/210011.aspx>
- Added a new method to FRandom for getting random numbers larger than 255,
which lets me:
- Fixed: SNDSEQ delayrand commands could delay for no more than 255 tics.
- Fixed: If you're going to have sector_t.SoundTarget, then they need to
be included in the pointer cleanup scans.
- Ported back newer name code from 2.1.
- Fixed: Using -warp with only one parameter in Doom and Heretic to
select a map on episode 1 no longer worked.
- New: Loading a multiplayer save now restores the players based on
their names rather than on their connection order. Using connection
order was sensible when -net was the only way to start a network game,
but with -host/-join, it's not so nice. Also, if there aren't enough
players in the save, then the extra players will be spawned normally,
so you can continue a saved game with more players than you started it
with.
- Added some new SNDSEQ commands to make it possible to define Heretic's
ambient sounds in SNDSEQ: volumerel, volumerand, slot, randomsequence,
delayonce, and restart. With these, it is basically possible to obsolete
all of the $ambient SNDINFO commands.
- Fixed: Sound sequences would only execute one command each time they were
ticked.
- Fixed: No bounds checking was done on the volume sound sequences played at.
- Fixed: The tic parameter to playloop was useless and caused it to
act like a redundant playrepeat. I have removed all the logic that
caused playloop to play repeating sounds, and now it acts like an
infinite sequence of play/delay commands until the sequence is
stopped.
- Fixed: Sound sequences were ticked every frame, not every tic, so all
the delay commands were timed incorrectly and varied depending on your
framerate. Since this is useful for restarting looping sounds that got
cut off, I have not changed this. Instead, the delay commands now
record the tic when execution should resume, not the number of tics
left to delay.
SVN r57 (trunk)