mirror of
https://github.com/ZDoom/gzdoom-gles.git
synced 2024-11-24 13:11:33 +00:00
- Ported vlinetallasm4 to AMD64 assembly. Even with the increased number of registers AMD64 provides, this routine still needs to be written as self-modifying code for maximum performance. The additional registers do allow for further optimization over the x86 version by allowing all four pixels to be in flight at the same time. The end result is that AMD64 ASM is about 2.18 times faster than AMD64 C and about 1.06 times faster than x86 ASM. (For further comparison, AMD64 C and x86 C are practically the same for this function.) Should I port any more assembly to AMD64, mvlineasm4 is the most likely candidate, but it's not used enough at this point to bother. Also, this may or may not work with Linux at the moment, since it doesn't have the eh_handler metadata. Win64 is easier, since I just need to structure the function prologue and epilogue properly and use some assembler directives/macros to automatically generate the metadata. And that brings up another point: You need YASM to assemble the AMD64 code, because NASM doesn't support the Win64 metadata directives.
- Added an SSE version of DoBlending. This is strictly C intrinsics. VC++ still throws around unnecessary register moves. GCC seems to be pretty close to optimal, requiring only about 2 cycles/color. They're both faster than my hand-written MMX routine, so I don't need to feel bad about not hand-optimizing this for x64 builds.
- Removed an extra instruction from DoBlending_MMX, transposed two instructions, and unrolled it once, shaving off about 80 cycles from the time required to blend 256 palette entries. Why? Because I tried writing a C version of the routine using compiler intrinsics and was appalled by all the extra movq's VC++ added to the code. GCC was better, but still generated extra instructions. I only wanted a C version because I can't use inline assembly with VC++'s x64 compiler, and x64 assembly is a bit of a pain. (It's a pain because Linux and Windows have different calling conventions, and you need to maintain extra metadata for functions.) So, the assembly version stays and the C version stays out.
- Removed all the pixel doubling r_detail modes, since the one platform they were intended to assist (486) actually sees very little benefit from them.
- Rewrote CheckMMX in C and renamed it to CheckCPU.
- Fixed: CPUID function 0x80000005 is specified to return detailed L1 cache information only for AMD processors, so we must not use it on other architectures, or we end up overwriting the L1 cache line size with 0 or some other number we don't actually understand.

SVN r1134 (trunk)
This commit is contained in:
parent
14e94b86e2
commit
dda5ddd3c2
37 changed files with 1158 additions and 1337 deletions
|
@ -1,3 +1,20 @@
|
|||
August 8, 2008
|
||||
- Ported vlinetallasm4 to AMD64 assembly. Even with the increased number of
|
||||
registers AMD64 provides, this routine still needs to be written as self-
|
||||
modifying code for maximum performance. The additional registers do allow
|
||||
for further optimization over the x86 version by allowing all four pixels
|
||||
to be in flight at the same time. The end result is that AMD64 ASM is about
|
||||
2.18 times faster than AMD64 C and about 1.06 times faster than x86 ASM.
|
||||
(For further comparison, AMD64 C and x86 C are practically the same for
|
||||
this function.) Should I port any more assembly to AMD64, mvlineasm4 is the
|
||||
most likely candidate, but it's not used enough at this point to bother.
|
||||
Also, this may or may not work with Linux at the moment, since it doesn't
|
||||
have the eh_handler metadata. Win64 is easier, since I just need to
|
||||
structure the function prologue and epilogue properly and use some
|
||||
assembler directives/macros to automatically generate the metadata. And
|
||||
that brings up another point: You need YASM to assemble the AMD64 code,
|
||||
because NASM doesn't support the Win64 metadata directives.
|
||||
|
||||
August 8, 2008 (Changes by Graf Zahl)
|
||||
- Replaced the ActorInfo definitions of several internal classes with DECORATE definitions
|
||||
- Converted teleport fog and destinations to DECORATE.
|
||||
|
@ -14,6 +31,23 @@ August 8, 2008 (Changes by Graf Zahl)
|
|||
- Added aWeaponGiver class to generalize the standing AssaultGun.
|
||||
- converted a_Strifeweapons.cpp to DECORATE, except for the Sigil.
|
||||
|
||||
August 7, 2008
|
||||
- Added an SSE version of DoBlending. This is strictly C intrinsics.
|
||||
VC++ still throws around unneccessary register moves. GCC seems to be
|
||||
pretty close to optimal, requiring only about 2 cycles/color. They're
|
||||
both faster than my hand-written MMX routine, so I don't need to feel
|
||||
bad about not hand-optimizing this for x64 builds.
|
||||
- Removed an extra instruction from DoBlending_MMX, transposed two
|
||||
instructions, and unrolled it once, shaving off about 80 cycles from the
|
||||
time required to blend 256 palette entries. Why? Because I tried writing
|
||||
a C version of the routine using compiler intrinsics and was appalled by
|
||||
all the extra movq's VC++ added to the code. GCC was better, but still
|
||||
generated extra instructions. I only wanted a C version because I can't
|
||||
use inline assembly with VC++'s x64 compiler, and x64 assembly is a bit
|
||||
of a pain. (It's a pain because Linux and Windows have different calling
|
||||
conventions, and you need to maintain extra metadata for functions.) So,
|
||||
the assembly version stays and the C version stays out.
|
||||
|
||||
August 7, 2008 (Changes by Graf Zahl)
|
||||
- Converted the rest of a_strifestuff.cpp to DECORATE.
|
||||
- Fixed: AStalker::CheckMeleeRange did not perform all checks of AActor::CheckMeleeRange.
|
||||
|
@ -39,6 +73,13 @@ August 7, 2008 (SBARINfO update)
|
|||
- Fixed: Various bugs I noticed in the fullscreenoffsets code.
|
||||
|
||||
August 6, 2008
|
||||
- Removed all the pixel doubling r_detail modes, since the one platform they
|
||||
were intended to assist (486) actually sees very little benefit from them.
|
||||
- Rewrote CheckMMX in C and renamed it to CheckCPU.
|
||||
- Fixed: CPUID function 0x80000005 is specified to return detailed L1 cache
|
||||
only for AMD processors, so we must not use it on other architectures, or
|
||||
we end up overwriting the L1 cache line size with 0 or some other number
|
||||
we don't actually understand.
|
||||
- The x87 precision control is now explicitly set for double precision, since
|
||||
GCC defaults to extended precision instead, unlike Visual C++.
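The CPUID fix in the August 6 entry can be illustrated with a small vendor gate. This is only a sketch of the idea, not the code from x86.cpp: the helper names `IsAmdVendor` and `PickDataCacheLineSize` are hypothetical, and the raw register values are passed in as plain integers so the example stays portable. It relies on two facts: AMD parts report the vendor string "AuthenticAMD", and on AMD the low byte of ECX from function 0x80000005 is the L1 data cache line size in bytes.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true only for the vendor string that AMD
// processors report from CPUID function 0.
inline bool IsAmdVendor(const char* vendor)
{
    return std::strcmp(vendor, "AuthenticAMD") == 0;
}

// Hypothetical helper: choose the L1 data cache line size to record.
// ecx5 is the ECX value returned by CPUID function 0x80000005.
// On non-AMD parts, or when the extended leaf is not available, the
// previously known value is kept instead of being overwritten.
inline unsigned PickDataCacheLineSize(const char* vendor,
                                      std::uint32_t maxExtendedLeaf,
                                      std::uint32_t ecx5,
                                      unsigned previousLineSize)
{
    if (!IsAmdVendor(vendor) || maxExtendedLeaf < 0x80000005u)
    {
        return previousLineSize;
    }
    return ecx5 & 0xFFu; // bits 7:0 = line size in bytes on AMD
}
```

Keeping the previous value is the design choice that avoids the bug described in the entry: a zero or otherwise meaningless ECX on a non-AMD part never reaches the stored line size.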
|
||||
|
||||
|
|
|
@ -173,11 +173,24 @@ endif( FMOD_LIBRARY )
|
|||
|
||||
if( NOT NO_ASM )
|
||||
find_program( NASM_PATH NAMES ${NASM_NAMES} )
|
||||
find_program( YASM_PATH yasm )
|
||||
|
||||
if( YASM_PATH )
|
||||
set( ASSEMBLER ${YASM_PATH} )
|
||||
else( YASM_PATH )
|
||||
if( X64 )
|
||||
message( STATUS "Could not find YASM. Disabling assembly code." )
|
||||
set( NO_ASM ON )
|
||||
else( X64 )
|
||||
if( NOT NASM_PATH )
|
||||
message( STATUS "Could not find YASM or NASM. Disabling assembly code." )
|
||||
set( NO_ASM ON )
|
||||
else( NOT NASM_PATH )
|
||||
set( ASSEMBLER ${NASM_PATH} )
|
||||
endif( NOT NASM_PATH )
|
||||
endif( X64 )
|
||||
endif( YASM_PATH )
|
||||
|
||||
if( NOT NASM_PATH )
|
||||
message( STATUS "Could not find NASM. Disabling assembly code." )
|
||||
set( NO_ASM ON )
|
||||
else( NOT NASM_PATH )
|
||||
# I think the only reason there was a version requirement was because the
|
||||
# executable name for Windows changed from 0.x to 2.0, right? This is
|
||||
# how to do it in case I need to do something similar later.
|
||||
|
@ -188,7 +201,6 @@ if( NOT NO_ASM )
|
|||
# if( NOT NASM_VER LESS 2 )
|
||||
# message( SEND_ERROR "NASM version should be 2 or later. (Installed version is ${NASM_VER}.)" )
|
||||
# endif( NOT NASM_VER LESS 2 )
|
||||
endif( NOT NASM_PATH )
|
||||
endif( NOT NO_ASM )
|
||||
|
||||
if( NOT NO_ASM )
|
||||
|
@ -201,22 +213,31 @@ if( NOT NO_ASM )
|
|||
|
||||
# Tell CMake how to assemble our files
|
||||
if( UNIX )
|
||||
set( NASM_OUTPUT_EXTENSION .o )
|
||||
set( NASM_FLAGS -f elf -DM_TARGET_LINUX )
|
||||
set( ASM_OUTPUT_EXTENSION .o )
|
||||
if( X64 )
|
||||
set( ASM_FLAGS -f elf64 -DM_TARGET_LINUX )
|
||||
else( X64 )
|
||||
set( ASM_FLAGS -f elf -DM_TARGET_LINUX )
|
||||
endif( X64 )
|
||||
else( UNIX )
|
||||
set( NASM_OUTPUT_EXTENSION .obj )
|
||||
set( NASM_FLAGS -f win32 -DWIN32 )
|
||||
set( ASM_OUTPUT_EXTENSION .obj )
|
||||
if( X64 )
|
||||
set( ASM_FLAGS -f win64 -DWIN32 -DWIN64 )
|
||||
else( X64 )
|
||||
set( ASM_FLAGS -f win32 -DWIN32 )
|
||||
endif( X64 )
|
||||
endif( UNIX )
|
||||
if( WIN32 )
|
||||
set( FIXRTEXT fixrtext )
|
||||
endif( WIN32 )
|
||||
message( STATUS "Selected assembler: ${ASSEMBLER}" )
|
||||
MACRO( ADD_ASM_FILE infile )
|
||||
set( ASM_OUTPUT_${infile} "${CMAKE_CURRENT_BINARY_DIR}/CMakeFiles/zdoom.dir/${infile}${NASM_OUTPUT_EXTENSION}" )
|
||||
set( ASM_OUTPUT_${infile} "${CMAKE_CURRENT_BINARY_DIR}/CMakeFiles/zdoom.dir/${infile}${ASM_OUTPUT_EXTENSION}" )
|
||||
if( WIN32 )
|
||||
set( FIXRTEXT_${infile} COMMAND ${FIXRTEXT} "${ASM_OUTPUT_${infile}}" )
|
||||
endif( WIN32 )
|
||||
add_custom_command( OUTPUT ${ASM_OUTPUT_${infile}}
|
||||
COMMAND ${NASM_PATH} ${NASM_FLAGS} -i${CMAKE_CURRENT_SOURCE_DIR}/ -o"${ASM_OUTPUT_${infile}}" "${CMAKE_CURRENT_SOURCE_DIR}/${infile}"
|
||||
COMMAND ${ASSEMBLER} ${ASM_FLAGS} -i${CMAKE_CURRENT_SOURCE_DIR}/ -o"${ASM_OUTPUT_${infile}}" "${CMAKE_CURRENT_SOURCE_DIR}/${infile}"
|
||||
${FIXRTEXT_${infile}}
|
||||
DEPENDS ${infile} ${FIXRTEXT} )
|
||||
set( ASM_SOURCES ${ASM_SOURCES} "${ASM_OUTPUT_${infile}}" )
|
||||
|
@ -320,14 +341,18 @@ else( WIN32 )
|
|||
endif( WIN32 )
|
||||
|
||||
if( NOT NO_ASM )
|
||||
ADD_ASM_FILE( a.nas )
|
||||
ADD_ASM_FILE( misc.nas )
|
||||
ADD_ASM_FILE( tmap.nas )
|
||||
ADD_ASM_FILE( tmap2.nas )
|
||||
ADD_ASM_FILE( tmap3.nas )
|
||||
if( X64 )
|
||||
ADD_ASM_FILE( asm_x86_64/tmap3.asm )
|
||||
else( X64 )
|
||||
ADD_ASM_FILE( asm_ia32/a.asm )
|
||||
ADD_ASM_FILE( asm_ia32/misc.asm )
|
||||
ADD_ASM_FILE( asm_ia32/tmap.asm )
|
||||
ADD_ASM_FILE( asm_ia32/tmap2.asm )
|
||||
ADD_ASM_FILE( asm_ia32/tmap3.asm )
|
||||
endif( X64 )
|
||||
if( WIN32 )
|
||||
if( NOT X64 )
|
||||
ADD_ASM_FILE( win32/wrappers.nas )
|
||||
ADD_ASM_FILE( win32/wrappers.asm )
|
||||
endif( NOT X64 )
|
||||
endif( WIN32 )
|
||||
endif( NOT NO_ASM )
|
||||
|
@ -482,6 +507,7 @@ add_executable( zdoom WIN32
|
|||
v_video.cpp
|
||||
w_wad.cpp
|
||||
wi_stuff.cpp
|
||||
x86.cpp
|
||||
zstrformat.cpp
|
||||
zstring.cpp
|
||||
g_doom/a_arachnotron.cpp
|
||||
|
@ -705,6 +731,9 @@ if( CMAKE_COMPILER_IS_GNUCXX )
|
|||
|
||||
# Compile this one file with SSE2 support.
|
||||
set_source_files_properties( nodebuild_classify_sse2.cpp PROPERTIES COMPILE_FLAGS "-msse2 -mfpmath=sse" )
|
||||
|
||||
# Need to enable intrinsics for this file.
|
||||
set_source_files_properties( x86.cpp PROPERTIES COMPILE_FLAGS "-msse2 -mmmx" )
|
||||
endif( CMAKE_COMPILER_IS_GNUCXX )
|
||||
|
||||
if( MSVC )
|
||||
|
|
|
@ -1766,8 +1766,8 @@ void AM_Drawer ()
|
|||
{
|
||||
f_x = viewwindowx;
|
||||
f_y = viewwindowy;
|
||||
f_w = realviewwidth;
|
||||
f_h = realviewheight;
|
||||
f_w = viewwidth;
|
||||
f_h = viewheight;
|
||||
f_p = screen->GetPitch ();
|
||||
}
|
||||
AM_activateNewScale();
|
||||
|
|
200
src/asm_ia32/misc.asm
Normal file
|
@ -0,0 +1,200 @@
|
|||
;*
|
||||
;* misc.nas
|
||||
;* Miscellaneous assembly functions
|
||||
;*
|
||||
;*---------------------------------------------------------------------------
|
||||
;* Copyright 1998-2006 Randy Heit
|
||||
;* All rights reserved.
|
||||
;*
|
||||
;* Redistribution and use in source and binary forms, with or without
|
||||
;* modification, are permitted provided that the following conditions
|
||||
;* are met:
|
||||
;*
|
||||
;* 1. Redistributions of source code must retain the above copyright
|
||||
;* notice, this list of conditions and the following disclaimer.
|
||||
;* 2. Redistributions in binary form must reproduce the above copyright
|
||||
;* notice, this list of conditions and the following disclaimer in the
|
||||
;* documentation and/or other materials provided with the distribution.
|
||||
;* 3. The name of the author may not be used to endorse or promote products
|
||||
;* derived from this software without specific prior written permission.
|
||||
;*
|
||||
;* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
|
||||
;* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
|
||||
;* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
|
||||
;* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
|
||||
;* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
|
||||
;* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
||||
;* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
||||
;* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
||||
;* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
|
||||
;* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
;*---------------------------------------------------------------------------
|
||||
;*
|
||||
|
||||
BITS 32
|
||||
|
||||
%ifndef M_TARGET_LINUX
|
||||
|
||||
%define DoBlending_MMX _DoBlending_MMX
|
||||
%define BestColor_MMX _BestColor_MMX
|
||||
|
||||
%endif
|
||||
|
||||
%ifdef M_TARGET_WATCOM
|
||||
SEGMENT DATA PUBLIC ALIGN=16 CLASS=DATA USE32
|
||||
SEGMENT DATA
|
||||
%else
|
||||
SECTION .data
|
||||
%endif
|
||||
|
||||
Blending256:
|
||||
dd 0x01000100,0x00000100
|
||||
|
||||
%ifdef M_TARGET_WATCOM
|
||||
SEGMENT CODE PUBLIC ALIGN=16 CLASS=CODE USE32
|
||||
SEGMENT CODE
|
||||
%else
|
||||
SECTION .text
|
||||
%endif
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; DoBlending_MMX
|
||||
;
|
||||
; MMX version of DoBlending
|
||||
;
|
||||
; (DWORD *from, DWORD *to, count, tor, tog, tob, toa)
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL DoBlending_MMX
|
||||
|
||||
DoBlending_MMX:
|
||||
pxor mm0,mm0 ; mm0 = 0
|
||||
mov eax,[esp+4*4]
|
||||
shl eax,16
|
||||
mov edx,[esp+4*5]
|
||||
shl edx,8
|
||||
or eax,[esp+4*6]
|
||||
or eax,edx
|
||||
mov ecx,[esp+4*3] ; ecx = count
|
||||
movd mm1,eax ; mm1 = 00000000 00RRGGBB
|
||||
mov eax,[esp+4*7]
|
||||
shl eax,16
|
||||
mov edx,[esp+4*7]
|
||||
shl edx,8
|
||||
or eax,[esp+4*7]
|
||||
or eax,edx
|
||||
mov edx,[esp+4*2] ; edx = dest
|
||||
movd mm6,eax ; mm6 = 00000000 00AAAAAA
|
||||
punpcklbw mm1,mm0 ; mm1 = 000000RR 00GG00BB
|
||||
movq mm7,[Blending256]
|
||||
punpcklbw mm6,mm0 ; mm6 = 000000AA 00AA00AA
|
||||
mov eax,[esp+4*1] ; eax = source
|
||||
pmullw mm1,mm6 ; mm1 = 000000RR 00GG00BB (multiplied by alpha)
|
||||
psubusw mm7,mm6 ; mm7 = 000000aa 00aa00aa (one minus alpha)
|
||||
nop ; Does this actually pair on a Pentium?
|
||||
|
||||
; Do four colors per iteration: Count must be a multiple of four.
|
||||
|
||||
.loop movq mm2,[eax] ; mm2 = 00r2g2b2 00r1g1b1
|
||||
add eax,8
|
||||
movq mm3,mm2 ; mm3 = 00r2g2b2 00r1g1b1
|
||||
punpcklbw mm2,mm0 ; mm2 = 000000r1 00g100b1
|
||||
punpckhbw mm3,mm0 ; mm3 = 000000r2 00g200b2
|
||||
pmullw mm2,mm7 ; mm2 = 0000r1rr g1ggb1bb
|
||||
add edx,8
|
||||
pmullw mm3,mm7 ; mm3 = 0000r2rr g2ggb2bb
|
||||
sub ecx,2
|
||||
paddusw mm2,mm1
|
||||
psrlw mm2,8
|
||||
paddusw mm3,mm1
|
||||
psrlw mm3,8
|
||||
packuswb mm2,mm3 ; mm2 = 00r2g2b2 00r1g1b1
|
||||
movq [edx-8],mm2
|
||||
|
||||
movq mm2,[eax] ; mm2 = 00r2g2b2 00r1g1b1
|
||||
add eax,8
|
||||
movq mm3,mm2 ; mm3 = 00r2g2b2 00r1g1b1
|
||||
punpcklbw mm2,mm0 ; mm2 = 000000r1 00g100b1
|
||||
punpckhbw mm3,mm0 ; mm3 = 000000r2 00g200b2
|
||||
pmullw mm2,mm7 ; mm2 = 0000r1rr g1ggb1bb
|
||||
add edx,8
|
||||
pmullw mm3,mm7 ; mm3 = 0000r2rr g2ggb2bb
|
||||
sub ecx,2
|
||||
paddusw mm2,mm1
|
||||
psrlw mm2,8
|
||||
paddusw mm3,mm1
|
||||
psrlw mm3,8
|
||||
packuswb mm2,mm3 ; mm2 = 00r2g2b2 00r1g1b1
|
||||
movq [edx-8],mm2
|
||||
|
||||
jnz .loop
|
||||
|
||||
emms
|
||||
ret
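For readers who do not want to trace the MMX registers, the arithmetic of DoBlending_MMX is, per channel, (src * (256 - alpha) + to * alpha) >> 8, with the top byte of each entry cleared. The sketch below is an assumption-laden illustration, not the SSE routine added in this commit: the names `BlendOne`, `BlendPaletteScalar`, `BlendPaletteSSE2` and `BlendVersionsAgree` are hypothetical, entries are assumed to be 0x00RRGGBB, alpha is assumed to lie in 0..256, and the vector path uses SSE2 integer intrinsics on whatever compiler provides `<emmintrin.h>`.

```cpp
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define DEMO_HAVE_SSE2 1
#endif

// Scalar reference: each channel becomes (src*(256-a) + to*a) >> 8.
inline std::uint32_t BlendOne(std::uint32_t src, int r, int g, int b, int a)
{
    const int inv = 256 - a;
    const int sr = (src >> 16) & 0xFF, sg = (src >> 8) & 0xFF, sb = src & 0xFF;
    return (std::uint32_t((sr * inv + r * a) >> 8) << 16) |
           (std::uint32_t((sg * inv + g * a) >> 8) << 8) |
            std::uint32_t((sb * inv + b * a) >> 8);
}

inline void BlendPaletteScalar(const std::uint32_t* from, std::uint32_t* to,
                               int count, int r, int g, int b, int a)
{
    for (int i = 0; i < count; ++i) to[i] = BlendOne(from[i], r, g, b, a);
}

#ifdef DEMO_HAVE_SSE2
// Four entries per iteration; count must be a multiple of four.
inline void BlendPaletteSSE2(const std::uint32_t* from, std::uint32_t* to,
                             int count, int r, int g, int b, int a)
{
    const __m128i zero = _mm_setzero_si128();
    // Pre-multiplied target colour; 16-bit lanes from low to high: B, G, R, 0.
    const short cr = short(r * a), cg = short(g * a), cb = short(b * a);
    const short inv = short(256 - a);
    const __m128i color = _mm_set_epi16(0, cr, cg, cb, 0, cr, cg, cb);
    const __m128i scale = _mm_set_epi16(0, inv, inv, inv, 0, inv, inv, inv);
    for (int i = 0; i < count; i += 4)
    {
        const __m128i px = _mm_loadu_si128(reinterpret_cast<const __m128i*>(from + i));
        __m128i lo = _mm_unpacklo_epi8(px, zero); // entries 0-1 widened to words
        __m128i hi = _mm_unpackhi_epi8(px, zero); // entries 2-3 widened to words
        lo = _mm_srli_epi16(_mm_adds_epu16(_mm_mullo_epi16(lo, scale), color), 8);
        hi = _mm_srli_epi16(_mm_adds_epu16(_mm_mullo_epi16(hi, scale), color), 8);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(to + i), _mm_packus_epi16(lo, hi));
    }
}
#endif

// Usage check: blend a 256-entry test palette both ways and compare.
inline bool BlendVersionsAgree(int r, int g, int b, int a)
{
    std::uint32_t src[256], ref[256], out[256];
    for (std::uint32_t i = 0; i < 256; ++i) src[i] = i * 0x010203u & 0xFFFFFFu;
    BlendPaletteScalar(src, ref, 256, r, g, b, a);
#ifdef DEMO_HAVE_SSE2
    BlendPaletteSSE2(src, out, 256, r, g, b, a);
#else
    BlendPaletteScalar(src, out, 256, r, g, b, a);
#endif
    for (int i = 0; i < 256; ++i) if (ref[i] != out[i]) return false;
    return true;
}
```

The products never exceed 255 * 256, so they fit in an unsigned 16-bit lane, which is why a plain low multiply followed by a shift is enough in both the MMX code and this sketch.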
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; BestColor_MMX
|
||||
;
|
||||
; Picks the closest matching color from a palette
|
||||
;
|
||||
; Passed FFRRGGBB and palette array in same format
|
||||
; FF is the index of the first palette entry to consider
|
||||
;
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL BestColor_MMX
|
||||
GLOBAL @BestColor_MMX@8
|
||||
|
||||
BestColor_MMX:
|
||||
mov ecx,[esp+4]
|
||||
mov edx,[esp+8]
|
||||
@BestColor_MMX@8:
|
||||
pxor mm0,mm0
|
||||
movd mm1,ecx ; mm1 = color searching for
|
||||
mov eax,257*257+257*257+257*257 ;eax = bestdist
|
||||
push ebx
|
||||
punpcklbw mm1,mm0
|
||||
mov ebx,ecx ; ebx = best color
|
||||
shr ecx,24 ; ecx = count
|
||||
and ebx,0xffffff
|
||||
push esi
|
||||
push ebp
|
||||
|
||||
.loop movd mm2,[edx+ecx*4] ; mm2 = color considering now
|
||||
inc ecx
|
||||
punpcklbw mm2,mm0
|
||||
movq mm3,mm1
|
||||
psubsw mm3,mm2
|
||||
pmullw mm3,mm3 ; mm3 = color distance squared
|
||||
|
||||
movd ebp,mm3 ; add the three components
|
||||
psrlq mm3,32 ; into ebp to get the real
|
||||
mov esi,ebp ; (squared) distance
|
||||
shr esi,16
|
||||
and ebp,0xffff
|
||||
add ebp,esi
|
||||
movd esi,mm3
|
||||
add ebp,esi
|
||||
|
||||
jz .perf ; found a perfect match
|
||||
cmp eax,ebp
|
||||
jb .skip
|
||||
mov eax,ebp
|
||||
lea ebx,[ecx-1]
|
||||
.skip cmp ecx,256
|
||||
jne .loop
|
||||
mov eax,ebx
|
||||
pop ebp
|
||||
pop esi
|
||||
pop ebx
|
||||
emms
|
||||
ret
|
||||
|
||||
.perf lea eax,[ecx-1]
|
||||
pop ebp
|
||||
pop esi
|
||||
pop ebx
|
||||
emms
|
||||
ret
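A plain C++ rendering of the search that BestColor_MMX performs may help when reading the routine above. It is a simplified sketch under stated assumptions: `BestColorReference` and `MakeGrayPalette` are hypothetical names, palette entries are assumed to be 0x00RRGGBB, and only the three colour channels enter the distance. As in the assembly, the search starts at a caller-supplied first index, returns immediately on an exact match, and lets a later entry win a tie.

```cpp
#include <array>
#include <cstdint>

// Hypothetical reference: index of the palette entry closest to `color`
// (0x00RRGGBB) among entries [first, 256), by squared RGB distance.
inline int BestColorReference(const std::uint32_t* palette,
                              std::uint32_t color, int first)
{
    int best = first;
    int bestDist = 257 * 257 * 3; // larger than any possible distance
    for (int i = first; i < 256; ++i)
    {
        const int dr = int((color >> 16) & 0xFF) - int((palette[i] >> 16) & 0xFF);
        const int dg = int((color >> 8) & 0xFF) - int((palette[i] >> 8) & 0xFF);
        const int db = int(color & 0xFF) - int(palette[i] & 0xFF);
        const int dist = dr * dr + dg * dg + db * db;
        if (dist == 0) return i;        // perfect match, stop early
        if (dist <= bestDist)           // ties go to the later entry
        {
            bestDist = dist;
            best = i;
        }
    }
    return best;
}

// Hypothetical test palette: entry i is the grey level (i, i, i).
inline std::array<std::uint32_t, 256> MakeGrayPalette()
{
    std::array<std::uint32_t, 256> pal{};
    for (std::uint32_t i = 0; i < 256; ++i) pal[i] = i * 0x010101u;
    return pal;
}
```

With the grey palette, pure red (0xFF0000) maps to grey level 85, and restricting the search to start at index 100 returns 100, since the distance only grows from there.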
|
|
@ -51,7 +51,7 @@ FUZZTABLE equ 50
|
|||
%define fuzzpos _fuzzpos
|
||||
%define fuzzoffset _fuzzoffset
|
||||
%define NormalLight _NormalLight
|
||||
%define realviewheight _realviewheight
|
||||
%define viewheight _viewheight
|
||||
%define fuzzviewheight _fuzzviewheight
|
||||
%define CPU _CPU
|
||||
|
||||
|
@ -103,7 +103,7 @@ EXTERN centery
|
|||
EXTERN fuzzpos
|
||||
EXTERN fuzzoffset
|
||||
EXTERN NormalLight
|
||||
EXTERN realviewheight
|
||||
EXTERN viewheight
|
||||
EXTERN fuzzviewheight
|
||||
EXTERN CPU
|
||||
|
182
src/asm_x86_64/tmap3.asm
Normal file
|
@ -0,0 +1,182 @@
|
|||
%include "valgrind.inc"
|
||||
|
||||
BITS 64
|
||||
DEFAULT REL
|
||||
|
||||
%ifnidn __OUTPUT_FORMAT__,win64
|
||||
|
||||
%macro PROC_FRAME 1
|
||||
%1:
|
||||
%endmacro
|
||||
|
||||
%macro rex_push_reg 1
|
||||
push %1
|
||||
%endmacro
|
||||
|
||||
%macro push_reg 1
|
||||
push %1
|
||||
%endmacro
|
||||
|
||||
%macro alloc_stack 1
|
||||
sub rsp,%1
|
||||
%endmacro
|
||||
|
||||
%define parm1lo dil
|
||||
|
||||
%else
|
||||
|
||||
%define parm1lo cl
|
||||
|
||||
%endif
|
||||
|
||||
SECTION .data
|
||||
|
||||
EXTERN vplce
|
||||
EXTERN vince
|
||||
EXTERN palookupoffse
|
||||
EXTERN bufplce
|
||||
|
||||
EXTERN dc_count
|
||||
EXTERN dc_dest
|
||||
EXTERN dc_pitch
|
||||
|
||||
SECTION .text
|
||||
|
||||
ALIGN 16
|
||||
GLOBAL ASM_PatchPitch
|
||||
ASM_PatchPitch:
|
||||
mov ecx, [dc_pitch]
|
||||
mov [pm+3], ecx
|
||||
mov [vltpitch+3], ecx
|
||||
selfmod pm, vltpitch+6
|
||||
ret
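ASM_PatchPitch writes the current pitch over the 32-bit immediate of two instructions, which is why both stores target label+3: in `imul rcx, 320` (bytes 48 69 C9 imm32) and `add rcx, 320` (bytes 48 81 C1 imm32) the immediate starts three bytes in, assuming the assembler emits the imm32 forms. The sketch below performs the same patch on an ordinary byte array so that it is safe to run anywhere; `PatchImm32` and `MakePatchedImul` are hypothetical names, and nothing here executes the patched bytes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: overwrite a little-endian 32-bit immediate that
// starts `offset` bytes into an encoded instruction.
inline void PatchImm32(std::uint8_t* insn, std::size_t offset, std::uint32_t value)
{
    for (std::size_t i = 0; i < 4; ++i)
    {
        insn[offset + i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
}

// Start from the encoding of "imul rcx, rcx, 320" and patch in a new pitch.
inline std::array<std::uint8_t, 7> MakePatchedImul(std::uint32_t pitch)
{
    // 48 = REX.W, 69 = IMUL r64, r/m64, imm32, C9 = ModRM (rcx, rcx),
    // followed by the immediate 0x00000140 (320).
    std::array<std::uint8_t, 7> insn = {0x48, 0x69, 0xC9, 0x40, 0x01, 0x00, 0x00};
    PatchImm32(insn.data(), 3, pitch);
    return insn;
}
```

Real self-modifying code additionally needs the instructions to live in memory that is both writable and executable, which is what the `.rtext` section in this file is for; the `selfmod` macro from valgrind.inc presumably tells Valgrind that the range changed.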
|
||||
|
||||
ALIGN 16
|
||||
GLOBAL setupvlinetallasm
|
||||
setupvlinetallasm:
|
||||
mov [shifter1+2], parm1lo
|
||||
mov [shifter2+2], parm1lo
|
||||
mov [shifter3+2], parm1lo
|
||||
mov [shifter4+2], parm1lo
|
||||
selfmod shifter1, shifter4+3
|
||||
ret
|
||||
|
||||
%ifidn __OUTPUT_FORMAT__,win64
|
||||
; Yasm can't do progbits alloc exec for win64?
|
||||
; Hmm, looks like it's automatic. No worries, then.
|
||||
SECTION .rtext write ;progbits alloc exec
|
||||
%else
|
||||
SECTION .rtext progbits alloc exec write
|
||||
%endif
|
||||
|
||||
ALIGN 16
|
||||
|
||||
GLOBAL vlinetallasm4
|
||||
PROC_FRAME vlinetallasm4
|
||||
rex_push_reg rbx
|
||||
push_reg rdi
|
||||
push_reg r15
|
||||
push_reg r14
|
||||
push_reg r13
|
||||
push_reg r12
|
||||
push_reg rbp
|
||||
push_reg rsi
|
||||
alloc_stack 8 ; Stack must be 16-byte aligned
|
||||
END_PROLOGUE
|
||||
; rax = bufplce base address
|
||||
; rbx =
|
||||
; rcx = offset from rdi/count (negative)
|
||||
; edx/rdx = scratch
|
||||
; rdi = bottom of columns to write to
|
||||
; r8d-r11d = column offsets
|
||||
; r12-r15 = palookupoffse[0] - palookupoffse[3]
|
||||
|
||||
mov ecx, [dc_count]
|
||||
mov rdi, [dc_dest]
|
||||
test ecx, ecx
|
||||
jle vltepilog ; count must be positive
|
||||
|
||||
mov rax, [bufplce]
|
||||
mov r8, [bufplce+8]
|
||||
sub r8, rax
|
||||
mov r9, [bufplce+16]
|
||||
sub r9, rax
|
||||
mov r10, [bufplce+24]
|
||||
sub r10, rax
|
||||
mov [source2+4], r8d
|
||||
mov [source3+4], r9d
|
||||
mov [source4+4], r10d
|
||||
|
||||
pm: imul rcx, 320
|
||||
|
||||
mov r12, [palookupoffse]
|
||||
mov r13, [palookupoffse+8]
|
||||
mov r14, [palookupoffse+16]
|
||||
mov r15, [palookupoffse+24]
|
||||
|
||||
mov r8d, [vince]
|
||||
mov r9d, [vince+4]
|
||||
mov r10d, [vince+8]
|
||||
mov r11d, [vince+12]
|
||||
mov [step1+3], r8d
|
||||
mov [step2+3], r9d
|
||||
mov [step3+3], r10d
|
||||
mov [step4+3], r11d
|
||||
|
||||
add rdi, rcx
|
||||
neg rcx
|
||||
|
||||
mov r8d, [vplce]
|
||||
mov r9d, [vplce+4]
|
||||
mov r10d, [vplce+8]
|
||||
mov r11d, [vplce+12]
|
||||
selfmod loopit, vltepilog
|
||||
jmp loopit
|
||||
|
||||
ALIGN 16
|
||||
loopit:
|
||||
mov edx, r8d
|
||||
shifter1: shr edx, 24
|
||||
step1: add r8d, 0x88888888
|
||||
movzx rdx, BYTE [rax+rdx]
|
||||
mov ebx, r9d
|
||||
mov dl, [r12+rdx]
|
||||
shifter2: shr ebx, 24
|
||||
step2: add r9d, 0x88888888
|
||||
source2: movzx ebx, BYTE [rax+rbx+0x88888888]
|
||||
mov ebp, r10d
|
||||
mov bl, [r13+rbx]
|
||||
shifter3: shr ebp, 24
|
||||
step3: add r10d, 0x88888888
|
||||
source3: movzx ebp, BYTE [rax+rbp+0x88888888]
|
||||
mov esi, r11d
|
||||
mov bpl, BYTE [r14+rbp]
|
||||
shifter4: shr esi, 24
|
||||
step4: add r11d, 0x88888888
|
||||
source4: movzx esi, BYTE [rax+rsi+0x88888888]
|
||||
mov [rdi+rcx], dl
|
||||
mov [rdi+rcx+1], bl
|
||||
mov sil, BYTE [r15+rsi]
|
||||
mov [rdi+rcx+2], bpl
|
||||
mov [rdi+rcx+3], sil
|
||||
|
||||
vltpitch: add rcx, 320
|
||||
jl loopit
|
||||
|
||||
mov [vplce], r8d
|
||||
mov [vplce+4], r9d
|
||||
mov [vplce+8], r10d
|
||||
mov [vplce+12], r11d
|
||||
|
||||
vltepilog:
|
||||
add rsp, 8
|
||||
pop rsi
|
||||
pop rbp
|
||||
pop r12
|
||||
pop r13
|
||||
pop r14
|
||||
pop r15
|
||||
pop rdi
|
||||
pop rbx
|
||||
ret
|
||||
ENDPROC_FRAME
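Stripped of the patching and register allocation, the loop in vlinetallasm4 does the following for each of four adjacent columns: take the top bits of a 32-bit fixed-point texture coordinate, fetch that texel, translate it through the column's colormap, store it, advance the coordinate, and move down one row. The sketch below is a hedged reference written for this page, not the project's C fallback: the `FourColumns` struct and the function names are hypothetical stand-ins for the `bufplce`, `palookupoffse`, `vplce` and `vince` globals.

```cpp
#include <array>
#include <cstdint>

// Hypothetical bundle of the per-column state the assembly reads.
struct FourColumns
{
    const std::uint8_t* source[4];   // texture columns (bufplce)
    const std::uint8_t* colormap[4]; // light tables (palookupoffse)
    std::uint32_t frac[4];           // fixed-point positions (vplce)
    std::uint32_t step[4];           // per-row increments (vince)
};

// Draw `count` rows of four side-by-side pixels. `bits` is the shift that
// setupvlinetallasm patches into the shr instructions.
inline void DrawFourColumns(FourColumns& c, std::uint8_t* dest,
                            int count, int pitch, int bits)
{
    for (; count > 0; --count, dest += pitch)
    {
        for (int i = 0; i < 4; ++i)
        {
            dest[i] = c.colormap[i][c.source[i][c.frac[i] >> bits]];
            c.frac[i] += c.step[i]; // wraps naturally at 32 bits
        }
    }
}

// Usage example: a 4-texel column, a doubling colormap, two rows.
inline std::array<std::uint8_t, 8> DemoTwoRows()
{
    static const std::uint8_t texels[4] = {5, 6, 7, 8};
    std::uint8_t colormap[256];
    for (int i = 0; i < 256; ++i) colormap[i] = std::uint8_t(i * 2);
    FourColumns c;
    for (int i = 0; i < 4; ++i)
    {
        c.source[i] = texels;
        c.colormap[i] = colormap;
        c.frac[i] = std::uint32_t(i) << 30; // each column starts one texel later
        c.step[i] = 1u << 30;               // one texel per row
    }
    std::array<std::uint8_t, 8> dest{};
    DrawFourColumns(c, dest.data(), 2, 4, 30);
    return dest;
}
```

The assembly interleaves the four columns so that all four lookups are in flight at once; the inner loop here expresses the same data flow sequentially and leaves the scheduling to the compiler.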
|
|
@ -207,7 +207,7 @@ void CT_Drawer (void)
|
|||
int screen_height = con_scaletext > 1? SCREENHEIGHT/2 : SCREENHEIGHT;
|
||||
int st_y = con_scaletext > 1? ST_Y/2 : ST_Y;
|
||||
|
||||
y += ((SCREENHEIGHT == realviewheight && viewactive) || gamestate != GS_LEVEL) ? screen_height : st_y;
|
||||
y += ((SCREENHEIGHT == viewheight && viewactive) || gamestate != GS_LEVEL) ? screen_height : st_y;
|
||||
|
||||
promptwidth = SmallFont->StringWidth (prompt) * scalex;
|
||||
x = screen->Font->GetCharWidth ('_') * scalex * 2 + promptwidth;
|
||||
|
|
|
@ -582,7 +582,7 @@ void D_Display ()
|
|||
StatusBar->BlendView (blend);
|
||||
}
|
||||
screen->SetBlendingRect(viewwindowx, viewwindowy,
|
||||
viewwindowx + realviewwidth, viewwindowy + realviewheight);
|
||||
viewwindowx + viewwidth, viewwindowy + viewheight);
|
||||
P_CheckPlayerSprites();
|
||||
screen->RenderView(&players[consoleplayer]);
|
||||
if ((hw2d = screen->Begin2D(viewactive)))
|
||||
|
@ -593,8 +593,11 @@ void D_Display ()
|
|||
}
|
||||
if (automapactive)
|
||||
{
|
||||
int saved_ST_Y=ST_Y;
|
||||
if (hud_althud && realviewheight == SCREENHEIGHT) ST_Y=realviewheight;
|
||||
int saved_ST_Y = ST_Y;
|
||||
if (hud_althud && viewheight == SCREENHEIGHT)
|
||||
{
|
||||
ST_Y = viewheight;
|
||||
}
|
||||
AM_Drawer ();
|
||||
ST_Y = saved_ST_Y;
|
||||
}
|
||||
|
@ -603,13 +606,13 @@ void D_Display ()
|
|||
R_RefreshViewBorder ();
|
||||
}
|
||||
|
||||
if (hud_althud && realviewheight == SCREENHEIGHT)
|
||||
if (hud_althud && viewheight == SCREENHEIGHT)
|
||||
{
|
||||
if (DrawFSHUD || automapactive) DrawHUD();
|
||||
StatusBar->DrawTopStuff (HUD_None);
|
||||
}
|
||||
else
|
||||
if (realviewheight == SCREENHEIGHT && viewactive)
|
||||
if (viewheight == SCREENHEIGHT && viewactive)
|
||||
{
|
||||
StatusBar->Draw (DrawFSHUD ? HUD_Fullscreen : HUD_None);
|
||||
StatusBar->DrawTopStuff (DrawFSHUD ? HUD_Fullscreen : HUD_None);
|
||||
|
@ -2085,7 +2088,10 @@ void D_DoomMain (void)
|
|||
_FPU_SETCW(cw);
|
||||
}
|
||||
#elif defined(_PC_53)
|
||||
_control87(_PC_53, _MCW_PC);
|
||||
// On the x64 architecture, changing the floating point precision is not supported.
|
||||
#ifndef _WIN64
|
||||
int cfp = _control87(_PC_53, _MCW_PC);
|
||||
#endif
|
||||
#endif
|
||||
|
||||
PClass::StaticInit ();
|
||||
|
|
|
@ -132,10 +132,6 @@ extern int viewwindowy;
|
|||
extern "C" int viewheight;
|
||||
extern "C" int viewwidth;
|
||||
extern "C" int halfviewwidth; // [RH] Half view width, for plane drawing
|
||||
extern "C" int realviewwidth; // [RH] Physical width of view window
|
||||
extern "C" int realviewheight; // [RH] Physical height of view window
|
||||
extern "C" int detailxshift; // [RH] X shift for horizontal detail level
|
||||
extern "C" int detailyshift; // [RH] Y shift for vertical detail level
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -43,24 +43,62 @@
|
|||
|
||||
// Since this file is included by everything, it seems an appropriate place
|
||||
// to check the NOASM/USEASM macros.
|
||||
#if (!defined(_M_IX86) && !defined(__i386__)) || defined(__APPLE__)
|
||||
// The assembly code requires an x86 processor.
|
||||
// And needs to be tweaked for Mach-O before enabled on Macs.
|
||||
#if defined(__APPLE__)
|
||||
// The assembly code needs to be tweaked for Mach-O before enabled on Macs.
|
||||
#ifndef NOASM
|
||||
#define NOASM
|
||||
#endif
|
||||
#endif
|
||||
|
||||
// There are three assembly-related macros:
|
||||
//
|
||||
// NOASM - Assembly code is disabled
|
||||
// X86_ASM - Using ia32 assembly code
|
||||
// X64_ASM - Using amd64 assembly code
|
||||
//
|
||||
// Note that these relate only to using the pure assembly code. Inline
|
||||
// assembly may still be used without respect to these macros, as
|
||||
// deemed appropriate.
|
||||
|
||||
#ifndef NOASM
|
||||
#ifndef USEASM
|
||||
#define USEASM 1
|
||||
// Select the appropriate type of assembly code to use.
|
||||
|
||||
#if defined(_M_IX86) || defined(__i386__)
|
||||
|
||||
#define X86_ASM
|
||||
#ifdef X64_ASM
|
||||
#undef X64_ASM
|
||||
#endif
|
||||
|
||||
#elif defined(_M_X64) || defined(__amd64__)
|
||||
|
||||
#define X64_ASM
|
||||
#ifdef X86_ASM
|
||||
#undef X86_ASM
|
||||
#endif
|
||||
|
||||
#else
|
||||
#ifdef USEASM
|
||||
#undef USEASM
|
||||
|
||||
#define NOASM
|
||||
|
||||
#endif
|
||||
|
||||
#endif
|
||||
|
||||
#ifdef NOASM
|
||||
// Ensure no assembly macros are defined if NOASM is defined.
|
||||
|
||||
#ifdef X86_ASM
|
||||
#undef X86_ASM
|
||||
#endif
|
||||
|
||||
#ifdef X64_ASM
|
||||
#undef X64_ASM
|
||||
#endif
|
||||
|
||||
#endif
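To show how calling code might consume the three macros described in this hunk, here is a small sketch. It repeats the same compiler checks under a `DEMO_` prefix so it cannot collide with the real header; the `DEMO_*` macros and the functions `AsmFlavor` and `UsesPureAssembly` are hypothetical and exist only for this illustration.

```cpp
#include <cstring>

// Same decision as the header, under hypothetical DEMO_ names:
// Macs and unknown architectures get no pure assembly at all.
#if defined(DEMO_NOASM) || defined(__APPLE__)
    // assembly disabled, define nothing
#elif defined(_M_IX86) || defined(__i386__)
    #define DEMO_X86_ASM
#elif defined(_M_X64) || defined(__amd64__)
    #define DEMO_X64_ASM
#endif

// Report which family of pure assembly routines this build would link.
inline const char* AsmFlavor()
{
#if defined(DEMO_X64_ASM)
    return "amd64"; // e.g. asm_x86_64/tmap3.asm
#elif defined(DEMO_X86_ASM)
    return "ia32";  // e.g. asm_ia32/*.asm
#else
    return "none";  // C fallbacks only
#endif
}

// Convenience predicate for code that only cares whether any pure
// assembly is present.
inline bool UsesPureAssembly()
{
    return std::strcmp(AsmFlavor(), "none") != 0;
}
```

Because at most one of the two architecture macros is ever defined, callers can test them independently without worrying about both being set.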
|
||||
|
||||
|
||||
#if defined(_MSC_VER) || defined(__WATCOMC__)
|
||||
#define STACK_ARGS __cdecl
|
||||
#else
|
||||
|
|
|
@ -842,7 +842,7 @@ private:
|
|||
{
|
||||
AWeaponHolder *hold = static_cast<AWeaponHolder*>(inv);
|
||||
|
||||
if (hold->PieceWeapon->TypeName == FourthWeaponNames[FourthWeaponClass])
|
||||
if (hold->PieceWeapon->TypeName == FourthWeaponNames[(int)FourthWeaponClass])
|
||||
{
|
||||
// Weapon Pieces
|
||||
if (oldpieces != hold->PieceMask)
|
||||
|
@ -883,7 +883,7 @@ private:
|
|||
}
|
||||
if (oldpieces != 0)
|
||||
{
|
||||
DrawImage (ClassImages[FourthWeaponClass][imgWEAPONSLOT], 190, 0);
|
||||
DrawImage (ClassImages[(int)FourthWeaponClass][imgWEAPONSLOT], 190, 0);
|
||||
oldpieces = 0;
|
||||
}
|
||||
}
|
||||
|
|
|
@ -203,9 +203,9 @@ void DSBarInfo::Draw (EHudState state)
|
|||
if(SBarInfoScript->completeBorder) //Fill the statusbar with the border before we draw.
|
||||
{
|
||||
FTexture *b = TexMan[gameinfo.border->b];
|
||||
R_DrawBorder(viewwindowx, viewwindowy + realviewheight + b->GetHeight(), viewwindowx + realviewwidth, SCREENHEIGHT);
|
||||
R_DrawBorder(viewwindowx, viewwindowy + viewheight + b->GetHeight(), viewwindowx + viewwidth, SCREENHEIGHT);
|
||||
if(screenblocks == 10)
|
||||
screen->FlatFill(viewwindowx, viewwindowy + realviewheight, viewwindowx + realviewwidth, viewwindowy + realviewheight + b->GetHeight(), b, true);
|
||||
screen->FlatFill(viewwindowx, viewwindowy + viewheight, viewwindowx + viewwidth, viewwindowy + viewheight + b->GetHeight(), b, true);
|
||||
}
|
||||
if(SBarInfoScript->automapbar && automapactive)
|
||||
{
|
||||
|
|
|
@ -1067,8 +1067,8 @@ void DBaseStatusBar::DrawCrosshair ()
|
|||
}
|
||||
|
||||
screen->DrawTexture (CrosshairImage,
|
||||
realviewwidth / 2 + viewwindowx,
|
||||
realviewheight / 2 + viewwindowy,
|
||||
viewwidth / 2 + viewwindowx,
|
||||
viewheight / 2 + viewwindowy,
|
||||
DTA_DestWidth, w,
|
||||
DTA_DestHeight, h,
|
||||
DTA_AlphaChannel, true,
|
||||
|
|
|
@ -450,7 +450,6 @@ static void StartScoreboardMenu (void);
|
|||
static void InitCrosshairsList();
|
||||
|
||||
EXTERN_CVAR (Bool, st_scale)
|
||||
EXTERN_CVAR (Int, r_detail)
|
||||
EXTERN_CVAR (Bool, r_stretchsky)
|
||||
EXTERN_CVAR (Int, r_columnmethod)
|
||||
EXTERN_CVAR (Bool, r_drawfuzz)
|
||||
|
@ -464,14 +463,6 @@ EXTERN_CVAR (Int, screenblocks)
|
|||
|
||||
static TArray<valuestring_t> Crosshairs;
|
||||
|
||||
static value_t DetailModes[] =
|
||||
{
|
||||
{ 0.0, "Normal" },
|
||||
{ 1.0, "Double Horizontally" },
|
||||
{ 2.0, "Double Vertically" },
|
||||
{ 3.0, "Double Horiz and Vert" }
|
||||
};
|
||||
|
||||
static value_t ColumnMethods[] = {
|
||||
{ 0.0, "Original" },
|
||||
{ 1.0, "Optimized" }
|
||||
|
@ -517,7 +508,6 @@ static menuitem_t VideoItems[] = {
|
|||
{ slider, "Brightness", {&Gamma}, {1.0}, {3.0}, {0.1}, {NULL} },
|
||||
{ discretes,"Crosshair", {&crosshair}, {8.0}, {0.0}, {0.0}, {NULL} },
|
||||
{ discrete, "Column render mode", {&r_columnmethod}, {2.0}, {0.0}, {0.0}, {ColumnMethods} },
|
||||
{ discrete, "Detail mode", {&r_detail}, {4.0}, {0.0}, {0.0}, {DetailModes} },
|
||||
{ discrete, "Stretch short skies", {&r_stretchsky}, {2.0}, {0.0}, {0.0}, {OnOff} },
|
||||
{ discrete, "Stretch status bar", {&st_scale}, {2.0}, {0.0}, {0.0}, {OnOff} },
|
||||
{ discrete, "Alternative HUD", {&hud_althud}, {2.0}, {0.0}, {0.0}, {OnOff} },
|
||||
|
|
|
@ -120,7 +120,7 @@ inline int BigLong (int x)
|
|||
| ((((unsigned int)x)<<8) & 0xff0000)
|
||||
| (((unsigned int)x)<<24));
|
||||
}
|
||||
#endif // USEASM
|
||||
#endif
|
||||
|
||||
#endif // WORDS_BIGENDIAN
|
||||
|
||||
|
|
537
src/misc.nas
|
@ -1,537 +0,0 @@
;*
;* misc.nas
;* Miscellaneous assembly functions
;*
;*---------------------------------------------------------------------------
;* Copyright 1998-2006 Randy Heit
;* All rights reserved.
;*
;* Redistribution and use in source and binary forms, with or without
;* modification, are permitted provided that the following conditions
;* are met:
;*
;* 1. Redistributions of source code must retain the above copyright
;* notice, this list of conditions and the following disclaimer.
;* 2. Redistributions in binary form must reproduce the above copyright
;* notice, this list of conditions and the following disclaimer in the
;* documentation and/or other materials provided with the distribution.
;* 3. The name of the author may not be used to endorse or promote products
;* derived from this software without specific prior written permission.
;*
;* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
;* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
;* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
;* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
;* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
;* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
;* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
;* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
;* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
;* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
;*---------------------------------------------------------------------------
;*
|
||||
BITS 32
|
||||
|
||||
%ifndef M_TARGET_LINUX
|
||||
|
||||
%define CheckMMX _CheckMMX
|
||||
%define EndMMX _EndMMX
|
||||
%define DoBlending_MMX _DoBlending_MMX
|
||||
%define BestColor_MMX _BestColor_MMX
|
||||
%define DoubleHoriz_MMX _DoubleHoriz_MMX
|
||||
%define DoubleHorizVert_MMX _DoubleHorizVert_MMX
|
||||
%define DoubleVert_ASM _DoubleVert_ASM
|
||||
|
||||
%endif
|
||||
|
||||
%ifdef M_TARGET_WATCOM
|
||||
SEGMENT DATA PUBLIC ALIGN=16 CLASS=DATA USE32
|
||||
SEGMENT DATA
|
||||
%else
|
||||
SECTION .data
|
||||
%endif
|
||||
|
||||
Blending256:
|
||||
dd 0x01000100,0x00000100
|
||||
|
||||
%ifdef M_TARGET_WATCOM
|
||||
SEGMENT CODE PUBLIC ALIGN=16 CLASS=CODE USE32
|
||||
SEGMENT CODE
|
||||
%else
|
||||
SECTION .text
|
||||
%endif
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; CheckMMX
|
||||
;
|
||||
; Checks for the presence of MMX instructions on the
|
||||
; current processor. This code is adapted from the samples
|
||||
; in AMD's document entitled "AMD-K6™ MMX Processor
|
||||
; Multimedia Extensions." Also fills in the vendor
|
||||
; information string.
|
||||
;
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL CheckMMX
|
||||
|
||||
; void CheckMMX (struct CPUInfo *)
|
||||
|
||||
CheckMMX:
|
||||
xor eax,eax
|
||||
mov ecx,92/4
|
||||
push ebx
|
||||
push edi
|
||||
mov edi,[esp+12]
|
||||
rep stosd
|
||||
sub edi,92
|
||||
|
||||
mov [edi+88],byte 32; Assume a 32-byte cache line
|
||||
|
||||
pushfd ; save EFLAGS
|
||||
pop eax ; store EFLAGS in EAX
|
||||
mov ebx,eax ; save in EBX for later testing
|
||||
xor eax,0x00200000 ; toggle bit 21
|
||||
push eax ; put to stack
|
||||
popfd ; save changed EAX to EFLAGS
|
||||
pushfd ; push EFLAGS to TOS
|
||||
pop eax ; store EFLAGS in EAX
|
||||
cmp eax,ebx ; see if bit 21 has changed
|
||||
jz near .noid ; if no change, then no CPUID
|
||||
|
||||
; Get vendor ID
|
||||
xor eax,eax
|
||||
CPUID
|
||||
mov [edi],ebx
|
||||
mov [edi+4],edx
|
||||
mov [edi+8],ecx
|
||||
|
||||
cmp ebx,0x68747541 ; 'htuA'
|
||||
jne .notamd
|
||||
cmp edx,0x69746e65 ; 'itne'
|
||||
jne .notamd
|
||||
cmp ecx,0x444d4163 ; 'DMAc'
|
||||
jne .notamd
|
||||
inc byte [edi+87]
|
||||
.notamd:
|
||||
|
||||
; Get features flags and other info
|
||||
mov eax,1
|
||||
CPUID
|
||||
mov [edi+68],ebx ; Store brand index and other stuff
|
||||
mov [edi+72],ecx ; Store extended feature flags
|
||||
mov [edi+76],edx ; Store feature flags
|
||||
|
||||
test edx,(1<<19) ; If CLFLUSH instruction is supported,
|
||||
jz .noclf
|
||||
shl bh,3 ; get the real cache line size.
|
||||
mov [edi+88],bh
|
||||
|
||||
.noclf mov bl,al ; Extract stepping
|
||||
and bl,0x0F
|
||||
mov [edi+64],bl
|
||||
|
||||
mov bl,ah ; Extract processor type
|
||||
shr bl,4 ; (Valid for Intel only)
|
||||
and bl,0x03
|
||||
mov [edi+67],bl
|
||||
|
||||
shr al,4 ; Extract model and family
|
||||
and ah,0x0F ; model in al and family in ah
|
||||
cmp ah,15
|
||||
jne .noex
|
||||
|
||||
mov ebx,eax ; Add extended model and family
|
||||
shr ebx,12
|
||||
and bl,0xF0
|
||||
add ah,bh
|
||||
or al,bl
|
||||
|
||||
.noex mov [edi+65],al
|
||||
mov [edi+66],ah
|
||||
|
||||
; Check for processor brand string
|
||||
mov eax,0x80000000
|
||||
CPUID
|
||||
cmp eax,0x80000001
|
||||
je .feat2
|
||||
jb near .noid
|
||||
cmp eax,0x80000004
|
||||
jb .feat2
|
||||
cmp eax,0x80000005
|
||||
jb .brand
|
||||
|
||||
; Get data L1 cache info
|
||||
mov eax,0x80000005
|
||||
CPUID
|
||||
mov [edi+88],ecx
|
||||
|
||||
; Get processor brand string
|
||||
.brand mov eax,0x80000002
|
||||
CPUID
|
||||
mov [edi+16],eax
|
||||
mov [edi+20],ebx
|
||||
mov [edi+24],ecx
|
||||
mov [edi+28],edx
|
||||
mov eax,0x80000003
|
||||
CPUID
|
||||
mov [edi+32],eax
|
||||
mov [edi+36],ebx
|
||||
mov [edi+40],ecx
|
||||
mov [edi+44],edx
|
||||
mov eax,0x80000004
|
||||
CPUID
|
||||
mov [edi+48],eax
|
||||
mov [edi+52],ebx
|
||||
mov [edi+56],ecx
|
||||
mov [edi+60],edx
|
||||
|
||||
; Get AMD-specific feature flags
|
||||
.feat2 cmp byte [edi+87],0
|
||||
jz .noid
|
||||
mov eax,0x80000001
|
||||
CPUID
|
||||
mov [edi+80],edx
|
||||
|
||||
mov bl,al ; Extract stepping
|
||||
and bl,0x0F
|
||||
mov [edi+84],bl
|
||||
|
||||
shr al,4 ; Extract model and family
|
||||
and ah,0x0F ; model in al and family in ah
|
||||
cmp ah,15
|
||||
jne .noex2
|
||||
|
||||
mov ebx,eax ; Add extended model and family
|
||||
shr ebx,12
|
||||
and bl,0xF0
|
||||
add ah,bh
|
||||
or al,bl
|
||||
|
||||
.noex2 mov [edi+85],al
|
||||
mov [edi+86],ah
|
||||
|
||||
.noid pop edi
|
||||
pop ebx
|
||||
ret
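The register decoding that CheckMMX performs can be sketched in portable C++. This is only an illustration of the bit layout the routine above relies on, not code from this commit: `VendorString`, `DecodeSignature`, and `CpuSignature` are hypothetical names, and the sketch takes the CPUID register values as plain arguments instead of executing CPUID itself. Like the assembly, it folds in the extended family and model only when the base family is 15.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: rebuild the 12-character vendor string from the
// registers returned by CPUID leaf 0, stored in the order EBX, EDX, ECX
// (the same order CheckMMX writes them to the CPUInfo structure).
std::string VendorString(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    std::string out;
    for (uint32_t reg : regs)
    {
        for (int i = 0; i < 4; ++i)
        {
            // The lowest byte of each register holds the first character.
            out += static_cast<char>((reg >> (8 * i)) & 0xFF);
        }
    }
    assert(out.size() == 12);
    return out;
}

// Hypothetical container for the fields packed into EAX by CPUID leaf 1.
struct CpuSignature
{
    int stepping;
    int model;
    int family;
    int type;
};

CpuSignature DecodeSignature(uint32_t eax)
{
    CpuSignature s;
    s.stepping = static_cast<int>(eax & 0xF);          // bits 0-3
    s.model    = static_cast<int>((eax >> 4) & 0xF);   // bits 4-7
    s.family   = static_cast<int>((eax >> 8) & 0xF);   // bits 8-11
    s.type     = static_cast<int>((eax >> 12) & 0x3);  // bits 12-13
    if (s.family == 15)
    {
        // Extended family (bits 20-27) is added to the base family;
        // extended model (bits 16-19) becomes the high nibble of the model.
        s.family += static_cast<int>((eax >> 20) & 0xFF);
        s.model  |= static_cast<int>((eax >> 16) & 0xF) << 4;
    }
    return s;
}
```

For example, the constants compared in the vendor check (`0x68747541`, `0x69746e65`, `0x444d4163`) decode to "AuthenticAMD" when passed to `VendorString` in that order.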
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; EndMMX
|
||||
;
|
||||
; Signal the end of MMX code for compilers that can't
|
||||
; do inline assembly. Currently unused.
|
||||
;
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL EndMMX
|
||||
|
||||
EndMMX:
|
||||
emms
|
||||
ret
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; DoBlending_MMX
|
||||
;
|
||||
; MMX version of DoBlending
|
||||
;
|
||||
; (DWORD *from, DWORD *to, count, tor, tog, tob, toa)
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL DoBlending_MMX
|
||||
|
||||
DoBlending_MMX:
|
||||
pxor mm0,mm0 ; mm0 = 0
|
||||
mov eax,[esp+4*4]
|
||||
shl eax,16
|
||||
mov edx,[esp+4*5]
|
||||
shl edx,8
|
||||
or eax,[esp+4*6]
|
||||
or eax,edx
|
||||
mov ecx,[esp+4*3] ; ecx = count
|
||||
movd mm1,eax ; mm1 = 00000000 00RRGGBB
|
||||
mov eax,[esp+4*7]
|
||||
shl eax,16
|
||||
mov edx,[esp+4*7]
|
||||
shl edx,8
|
||||
or eax,[esp+4*7]
|
||||
or eax,edx
|
||||
mov edx,[esp+4*2] ; edx = dest
|
||||
movd mm6,eax ; mm6 = 00000000 00AAAAAA
|
||||
punpcklbw mm1,mm0 ; mm1 = 000000RR 00GG00BB
|
||||
movq mm7,[Blending256]
|
||||
punpcklbw mm6,mm0 ; mm6 = 000000AA 00AA00AA
|
||||
mov eax,[esp+4*1] ; eax = source
|
||||
pmullw mm1,mm6 ; mm1 = 000000RR 00GG00BB (multiplied by alpha)
|
||||
psubusw mm7,mm6 ; mm7 = 000000aa 00aa00aa (one minus alpha)
|
||||
nop ; Does this actually pair on a Pentium?
|
||||
|
||||
; Do two colors per iteration: Count must be even.
|
||||
|
||||
.loop movq mm2,[eax] ; mm2 = 00r2g2b2 00r1g1b1
|
||||
add eax,8
|
||||
movq mm3,mm2 ; mm3 = 00r2g2b2 00r1g1b1
|
||||
punpcklbw mm2,mm0 ; mm2 = 000000r1 00g100b1
|
||||
movq mm4,mm1
|
||||
punpckhbw mm3,mm0 ; mm3 = 000000r2 00g200b2
|
||||
pmullw mm2,mm7 ; mm2 = 0000r1rr g1ggb1bb
|
||||
add edx,8
|
||||
pmullw mm3,mm7 ; mm3 = 0000r2rr g2ggb2bb
|
||||
sub ecx,2
|
||||
paddusw mm2,mm1
|
||||
paddusw mm3,mm1
|
||||
psrlw mm2,8
|
||||
psrlw mm3,8
|
||||
packuswb mm2,mm3 ; mm2 = 00r2g2b2 00r1g1b1
|
||||
movq [edx-8],mm2
|
||||
jnz .loop
|
||||
|
||||
emms
|
||||
ret
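As a reading aid, the arithmetic of DoBlending_MMX can be written as a scalar C++ sketch. It assumes palette entries in 0x00RRGGBB form and an alpha expressed on a 0..256 scale; `BlendOne` and `DoBlendingSketch` are hypothetical names, not the C version that ships with the engine. The assembly premultiplies the target color by alpha once before the loop and handles two entries per iteration; the sketch keeps the same per-channel formula but processes one entry at a time.

```cpp
#include <cassert>
#include <cstdint>

// Blend one 0x00RRGGBB color toward (r, g, b).
// a is the blend weight on a 0..256 scale (an assumption of this sketch).
uint32_t BlendOne(uint32_t src, int r, int g, int b, int a)
{
    assert(a >= 0 && a <= 256);
    const int inv = 256 - a;  // "one minus alpha", as held in mm7

    const int sr = static_cast<int>((src >> 16) & 0xFF);
    const int sg = static_cast<int>((src >> 8) & 0xFF);
    const int sb = static_cast<int>(src & 0xFF);

    // out = (source * (256 - a) + target * a) / 256, per channel
    const int outr = (sr * inv + r * a) >> 8;
    const int outg = (sg * inv + g * a) >> 8;
    const int outb = (sb * inv + b * a) >> 8;

    return (static_cast<uint32_t>(outr) << 16) |
           (static_cast<uint32_t>(outg) << 8) |
            static_cast<uint32_t>(outb);
}

// Apply the blend to a whole palette, one entry at a time.
void DoBlendingSketch(const uint32_t* from, uint32_t* to, int count,
                      int r, int g, int b, int a)
{
    for (int i = 0; i < count; ++i)
    {
        to[i] = BlendOne(from[i], r, g, b, a);
    }
}
```

With `a` at 0 the source passes through unchanged, and with `a` at 256 the result is the target color, which is a quick way to sanity-check any reimplementation.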
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; BestColor_MMX
|
||||
;
|
||||
; Picks the closest matching color from a palette
|
||||
;
|
||||
; Passed FFRRGGBB and palette array in same format
|
||||
; FF is the index of the first palette entry to consider
|
||||
;
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL BestColor_MMX
|
||||
GLOBAL @BestColor_MMX@8
|
||||
|
||||
BestColor_MMX:
|
||||
mov ecx,[esp+4]
|
||||
mov edx,[esp+8]
|
||||
@BestColor_MMX@8:
|
||||
pxor mm0,mm0
|
||||
movd mm1,ecx ; mm1 = color searching for
|
||||
mov eax,257*257+257*257+257*257 ;eax = bestdist
|
||||
push ebx
|
||||
punpcklbw mm1,mm0
|
||||
mov ebx,ecx ; ebx = best color
|
||||
shr ecx,24 ; ecx = count
|
||||
and ebx,0xffffff
|
||||
push esi
|
||||
push ebp
|
||||
|
||||
.loop movd mm2,[edx+ecx*4] ; mm2 = color considering now
|
||||
inc ecx
|
||||
punpcklbw mm2,mm0
|
||||
movq mm3,mm1
|
||||
psubsw mm3,mm2
|
||||
pmullw mm3,mm3 ; mm3 = color distance squared
|
||||
|
||||
movd ebp,mm3 ; add the three components
|
||||
psrlq mm3,32 ; into ebp to get the real
|
||||
mov esi,ebp ; (squared) distance
|
||||
shr esi,16
|
||||
and ebp,0xffff
|
||||
add ebp,esi
|
||||
movd esi,mm3
|
||||
add ebp,esi
|
||||
|
||||
jz .perf ; found a perfect match
|
||||
cmp eax,ebp
|
||||
jb .skip
|
||||
mov eax,ebp
|
||||
lea ebx,[ecx-1]
|
||||
.skip cmp ecx,256
|
||||
jne .loop
|
||||
mov eax,ebx
|
||||
pop ebp
|
||||
pop esi
|
||||
pop ebx
|
||||
emms
|
||||
ret
|
||||
|
||||
.perf lea eax,[ecx-1]
|
||||
pop ebp
|
||||
pop esi
|
||||
pop ebx
|
||||
emms
|
||||
ret
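The search that BestColor_MMX performs is a nearest-neighbor lookup by squared RGB distance. The following C++ sketch shows the idea under a few assumptions: `BestColorSketch` and `kDemoPalette` are hypothetical names, the color is passed as separate components instead of the packed FFRRGGBB argument, and the fallback result when nothing is scanned is simply `first`. Like the assembly, it returns immediately on an exact match and lets a later entry replace an earlier one on a tie.

```cpp
#include <cassert>
#include <cstdint>

// Return the index of the palette entry closest to (r, g, b),
// considering only entries in [first, num).
int BestColorSketch(const uint32_t* palette, int r, int g, int b,
                    int first, int num)
{
    assert(first >= 0 && first <= num);
    int best = first;
    int bestdist = 257 * 257 * 3;  // larger than any possible distance

    for (int i = first; i < num; ++i)
    {
        const int dr = r - static_cast<int>((palette[i] >> 16) & 0xFF);
        const int dg = g - static_cast<int>((palette[i] >> 8) & 0xFF);
        const int db = b - static_cast<int>(palette[i] & 0xFF);
        const int dist = dr * dr + dg * dg + db * db;

        if (dist == 0)
        {
            return i;  // perfect match, stop searching
        }
        if (dist <= bestdist)
        {
            bestdist = dist;
            best = i;
        }
    }
    return best;
}

// Tiny demonstration palette: black, red, green, blue.
static const uint32_t kDemoPalette[4] = {
    0x000000, 0xFF0000, 0x00FF00, 0x0000FF
};
```

Calling `BestColorSketch(kDemoPalette, 250, 10, 5, 0, 4)` picks the red entry, since its squared distance (150) is far smaller than that of the other three.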
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; DoubleHoriz_MMX
|
||||
;
|
||||
; Stretches an image horizontally using MMX instructions.
|
||||
; The source image is assumed to occupy the right half
|
||||
; of the destination image.
|
||||
;
|
||||
; height of source
|
||||
; width of source
|
||||
; dest pointer (at end of row)
|
||||
; pitch
|
||||
;
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL DoubleHoriz_MMX
|
||||
|
||||
DoubleHoriz_MMX:
|
||||
mov edx,[esp+8] ; edx = width
|
||||
push edi
|
||||
|
||||
neg edx ; make edx negative so we can count up
|
||||
mov edi,[esp+16] ; edi = dest pointer
|
||||
|
||||
sar edx,2 ; and make edx count groups of 4 pixels
|
||||
push ebp
|
||||
|
||||
mov ebp,edx ; ebp = # of columns remaining in this row
|
||||
push ebx
|
||||
|
||||
mov ebx,[esp+28] ; ebx = pitch
|
||||
mov ecx,[esp+16] ; ecx = # of rows remaining
|
||||
|
||||
.loop movq mm0,[edi+ebp*4]
|
||||
|
||||
.loop2 movq mm1,mm0
|
||||
punpcklbw mm0,mm0 ; double left 4 pixels
|
||||
|
||||
movq mm2,[edi+ebp*4+8]
|
||||
punpckhbw mm1,mm1 ; double right 4 pixels
|
||||
|
||||
movq [edi+ebp*8],mm0 ; write left pixels
|
||||
movq mm0,mm2
|
||||
|
||||
movq [edi+ebp*8+8],mm1 ; write right pixels
|
||||
|
||||
add ebp,2 ; increment counter
|
||||
jnz .loop2 ; repeat until done with this row
|
||||
|
||||
|
||||
add edi,ebx ; move edi to next row
|
||||
dec ecx ; decrease row counter
|
||||
|
||||
mov ebp,edx ; prep ebp for next row
|
||||
jnz .loop ; repeat until every row is done
|
||||
|
||||
emms
|
||||
pop ebx
|
||||
pop ebp
|
||||
pop edi
|
||||
ret
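The in-place layout DoubleHoriz_MMX depends on (source pixels sitting in the right half of the destination row) can be illustrated with a scalar C++ sketch. `DoubleRowInPlace` is a hypothetical name, and the row is modeled as a `std::vector` instead of a pointer plus pitch. Expanding left to right is safe because the write position `2*x + 1` never passes the read position `width + x`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// row has 2*width bytes; the source pixels occupy the right half.
// Each source pixel is written twice, filling the whole row.
std::vector<uint8_t> DoubleRowInPlace(std::vector<uint8_t> row)
{
    assert(row.size() % 2 == 0);
    const std::size_t width = row.size() / 2;

    for (std::size_t x = 0; x < width; ++x)
    {
        // Read before writing: at x == width - 1 the second write lands
        // on the byte that was just read.
        const uint8_t c = row[width + x];
        row[2 * x] = c;
        row[2 * x + 1] = c;
    }
    return row;
}
```

The MMX routine does the same thing eight source pixels at a time, using `punpcklbw` and `punpckhbw` of a register with itself to duplicate each byte.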
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; DoubleHorizVert_MMX
|
||||
;
|
||||
; Stretches an image horizontally and vertically using
|
||||
; MMX instructions. The source image is assumed to occupy
|
||||
; the right half of the destination image and to leave
|
||||
; every other line unused for expansion.
|
||||
;
|
||||
; height of source
|
||||
; width of source
|
||||
; dest pointer (at end of row)
|
||||
; pitch
|
||||
;
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL DoubleHorizVert_MMX
|
||||
|
||||
DoubleHorizVert_MMX:
|
||||
mov edx,[esp+8] ; edx = width
|
||||
push edi
|
||||
|
||||
neg edx ; make edx negative so we can count up
|
||||
mov edi,[esp+16] ; edi = dest pointer
|
||||
|
||||
sar edx,2 ; and make edx count groups of 4 pixels
|
||||
push ebp
|
||||
|
||||
mov ebp,edx ; ebp = # of columns remaining in this row
|
||||
push ebx
|
||||
|
||||
mov ebx,[esp+28] ; ebx = pitch
|
||||
mov ecx,[esp+16] ; ecx = # of rows remaining
|
||||
|
||||
push esi
|
||||
lea esi,[edi+ebx]
|
||||
|
||||
.loop movq mm0,[edi+ebp*4] ; get 8 pixels
|
||||
|
||||
movq mm1,mm0
|
||||
punpcklbw mm0,mm0 ; double left 4
|
||||
|
||||
punpckhbw mm1,mm1 ; double right 4
|
||||
add ebp,2 ; increment counter
|
||||
|
||||
movq [edi+ebp*8-16],mm0 ; write them back out
|
||||
|
||||
movq [edi+ebp*8-8],mm1
|
||||
|
||||
movq [esi+ebp*8-16],mm0
|
||||
|
||||
movq [esi+ebp*8-8],mm1
|
||||
|
||||
jnz .loop ; repeat until done with this row
|
||||
|
||||
lea edi,[edi+ebx*2] ; move edi and esi to next row
|
||||
lea esi,[esi+ebx*2]
|
||||
|
||||
dec ecx ; decrease row counter
|
||||
mov ebp,edx ; prep ebp for next row
|
||||
|
||||
jnz .loop ; repeat until every row is done
|
||||
|
||||
emms
|
||||
pop esi
|
||||
pop ebx
|
||||
pop ebp
|
||||
pop edi
|
||||
ret
|
||||
|
||||
;-----------------------------------------------------------
|
||||
;
|
||||
; DoubleVert_ASM
|
||||
;
|
||||
; Stretches an image vertically using regular x86
|
||||
; instructions. The source image should be interleaved.
|
||||
;
|
||||
; height of source
|
||||
; width of source
|
||||
; source/dest pointer
|
||||
; pitch
|
||||
;
|
||||
;-----------------------------------------------------------
|
||||
|
||||
GLOBAL DoubleVert_ASM
|
||||
|
||||
DoubleVert_ASM:
|
||||
mov edx,[esp+16] ; edx = pitch
|
||||
mov eax,[esp+4] ; eax = # of rows left
|
||||
|
||||
push esi
|
||||
mov esi,[esp+16]
|
||||
|
||||
push edi
|
||||
lea edi,[esi+edx]
|
||||
|
||||
shl edx,1 ; edx = pitch*2
|
||||
mov ecx,[esp+16]
|
||||
|
||||
sub edx,ecx ; edx = dist from end of one line to start of next
|
||||
shr ecx,2
|
||||
|
||||
.loop rep movsd
|
||||
|
||||
mov ecx,[esp+16]
|
||||
add esi,edx
|
||||
|
||||
add edi,edx
|
||||
shr ecx,2
|
||||
|
||||
dec eax
|
||||
jnz .loop
|
||||
|
||||
pop edi
|
||||
pop esi
|
||||
ret
|
|
@ -794,7 +794,7 @@ AInventory *AActor::FindInventory (FName type)
|
|||
|
||||
AInventory *AActor::GiveInventoryType (const PClass *type)
|
||||
{
|
||||
AInventory *item;
|
||||
AInventory *item = NULL;
|
||||
|
||||
if (type != NULL)
|
||||
{
|
||||
|
|
178
src/r_draw.cpp
|
@ -69,19 +69,6 @@ int scaledviewwidth;
|
|||
int viewwindowx;
|
||||
int viewwindowy;
|
||||
|
||||
extern "C" {
|
||||
int realviewwidth; // [RH] Physical width of view window
|
||||
int realviewheight; // [RH] Physical height of view window
|
||||
int detailxshift; // [RH] X shift for horizontal detail level
|
||||
int detailyshift; // [RH] Y shift for vertical detail level
|
||||
}
|
||||
|
||||
#ifdef USEASM
|
||||
extern "C" void STACK_ARGS DoubleHoriz_MMX (int height, int width, BYTE *dest, int pitch);
|
||||
extern "C" void STACK_ARGS DoubleHorizVert_MMX (int height, int width, BYTE *dest, int pitch);
|
||||
extern "C" void STACK_ARGS DoubleVert_ASM (int height, int width, BYTE *dest, int pitch);
|
||||
#endif
|
||||
|
||||
// [RH] Pointers to the different column drawers.
|
||||
// These get changed depending on the current
|
||||
// screen depth and asm/no asm.
|
||||
|
@ -130,8 +117,6 @@ const BYTE* bufplce[4];
|
|||
int dccount;
|
||||
}
|
||||
|
||||
cycle_t DetailDoubleCycles;
|
||||
|
||||
int dc_fillcolor;
|
||||
BYTE *dc_translation;
|
||||
BYTE shadetables[NUMCOLORMAPS*16*256];
|
||||
|
@ -161,7 +146,7 @@ EXTERN_CVAR (Int, r_columnmethod)
|
|||
/* */
|
||||
/************************************/
|
||||
|
||||
#ifndef USEASM
|
||||
#ifndef X86_ASM
|
||||
//
|
||||
// A column is a vertical slice/span from a wall texture that,
|
||||
// given the DOOM style restrictions on the view orientation,
|
||||
|
@ -212,7 +197,7 @@ void R_DrawColumnP_C (void)
|
|||
} while (--count);
|
||||
}
|
||||
}
|
||||
#endif // USEASM
|
||||
#endif
|
||||
|
||||
// [RH] Just fills a column with a color
|
||||
void R_FillColumnP (void)
|
||||
|
@ -404,7 +389,7 @@ void R_InitFuzzTable (int fuzzoff)
|
|||
}
|
||||
}
|
||||
|
||||
#ifndef USEASM
|
||||
#ifndef X86_ASM
|
||||
//
|
||||
// Creates a fuzzy image by copying pixels from adjacent ones above and below.
|
||||
// Used with an all black colormap, this could create the SHADOW effect,
|
||||
|
@ -480,7 +465,7 @@ void R_DrawFuzzColumnP_C (void)
|
|||
fuzzpos = fuzz;
|
||||
}
|
||||
}
|
||||
#endif // USEASM
|
||||
#endif
|
||||
|
||||
//
|
||||
// R_DrawTranslucentColumn
|
||||
|
@ -976,7 +961,7 @@ int dscount;
|
|||
|
||||
//
|
||||
// Draws the actual span.
|
||||
#if !defined(USEASM)
|
||||
#ifndef X86_ASM
|
||||
void R_DrawSpanP_C (void)
|
||||
{
|
||||
dsfixed_t xfrac;
|
||||
|
@ -1256,14 +1241,21 @@ void R_FillSpan (void)
|
|||
|
||||
// wallscan stuff, in C
|
||||
|
||||
#ifndef USEASM
|
||||
#ifndef X86_ASM
|
||||
static DWORD STACK_ARGS vlinec1 ();
|
||||
static void STACK_ARGS vlinec4 ();
|
||||
static int vlinebits;
|
||||
|
||||
DWORD (STACK_ARGS *dovline1)() = vlinec1;
|
||||
DWORD (STACK_ARGS *doprevline1)() = vlinec1;
|
||||
|
||||
#ifdef X64_ASM
|
||||
extern "C" static void vlinetallasm4();
|
||||
#define dovline4 vlinetallasm4
|
||||
extern "C" void setupvlinetallasm (int);
|
||||
#else
|
||||
static void STACK_ARGS vlinec4 ();
|
||||
void (STACK_ARGS *dovline4)() = vlinec4;
|
||||
#endif
|
||||
|
||||
static DWORD STACK_ARGS mvlinec1();
|
||||
static void STACK_ARGS mvlinec4();
|
||||
|
@ -1281,8 +1273,8 @@ DWORD STACK_ARGS prevlineasm1 ();
|
|||
DWORD STACK_ARGS vlinetallasm1 ();
|
||||
DWORD STACK_ARGS prevlinetallasm1 ();
|
||||
void STACK_ARGS vlineasm4 ();
|
||||
void STACK_ARGS vlinetallasm4 ();
|
||||
void STACK_ARGS vlinetallasmathlon4 ();
|
||||
void STACK_ARGS vlinetallasm4 ();
|
||||
void STACK_ARGS setupvlineasm (int);
|
||||
void STACK_ARGS setupvlinetallasm (int);
|
||||
|
||||
|
@ -1301,7 +1293,7 @@ void (STACK_ARGS *domvline4)() = mvlineasm4;
|
|||
|
||||
void setupvline (int fracbits)
|
||||
{
|
||||
#ifdef USEASM
|
||||
#ifdef X86_ASM
|
||||
if (CPU.Family <= 5)
|
||||
{
|
||||
if (fracbits >= 24)
|
||||
|
@ -1329,10 +1321,13 @@ void setupvline (int fracbits)
|
|||
}
|
||||
#else
|
||||
vlinebits = fracbits;
|
||||
#ifdef X64_ASM
|
||||
setupvlinetallasm(fracbits);
|
||||
#endif
|
||||
#endif
|
||||
}
|
||||
|
||||
#ifndef USEASM
|
||||
#if !defined(X86_ASM)
|
||||
DWORD STACK_ARGS vlinec1 ()
|
||||
{
|
||||
DWORD fracstep = dc_iscale;
|
||||
|
@ -1374,7 +1369,7 @@ void STACK_ARGS vlinec4 ()
|
|||
|
||||
void setupmvline (int fracbits)
|
||||
{
|
||||
#if defined(USEASM)
|
||||
#if defined(X86_ASM)
|
||||
setupmvlineasm (fracbits);
|
||||
domvline1 = mvlineasm1;
|
||||
domvline4 = mvlineasm4;
|
||||
|
@ -1383,7 +1378,7 @@ void setupmvline (int fracbits)
|
|||
#endif
|
||||
}
|
||||
|
||||
#ifndef USEASM
|
||||
#if !defined(X86_ASM)
|
||||
DWORD STACK_ARGS mvlinec1 ()
|
||||
{
|
||||
DWORD fracstep = dc_iscale;
|
||||
|
@ -1863,17 +1858,17 @@ void R_DrawViewBorder (void)
|
|||
SB_state = screen->GetPageCount ();
|
||||
}
|
||||
|
||||
if (realviewwidth == SCREENWIDTH)
|
||||
if (viewwidth == SCREENWIDTH)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
R_DrawBorder (0, 0, SCREENWIDTH, viewwindowy);
|
||||
R_DrawBorder (0, viewwindowy, viewwindowx, realviewheight + viewwindowy);
|
||||
R_DrawBorder (viewwindowx + realviewwidth, viewwindowy, SCREENWIDTH, realviewheight + viewwindowy);
|
||||
R_DrawBorder (0, viewwindowy + realviewheight, SCREENWIDTH, ST_Y);
|
||||
R_DrawBorder (0, viewwindowy, viewwindowx, viewheight + viewwindowy);
|
||||
R_DrawBorder (viewwindowx + viewwidth, viewwindowy, SCREENWIDTH, viewheight + viewwindowy);
|
||||
R_DrawBorder (0, viewwindowy + viewheight, SCREENWIDTH, ST_Y);
|
||||
|
||||
M_DrawFrame (viewwindowx, viewwindowy, realviewwidth, realviewheight);
|
||||
M_DrawFrame (viewwindowx, viewwindowy, viewwidth, viewheight);
|
||||
V_MarkRect (0, 0, SCREENWIDTH, ST_Y);
|
||||
}
|
||||
|
||||
|
@ -1893,7 +1888,7 @@ void R_DrawTopBorder ()
|
|||
FTexture *p;
|
||||
int offset;
|
||||
|
||||
if (realviewwidth == SCREENWIDTH)
|
||||
if (viewwidth == SCREENWIDTH)
|
||||
return;
|
||||
|
||||
offset = gameinfo.border->offset;
|
||||
|
@ -1901,135 +1896,34 @@ void R_DrawTopBorder ()
|
|||
if (viewwindowy < 34)
|
||||
{
|
||||
R_DrawBorder (0, 0, viewwindowx, 34);
|
||||
R_DrawBorder (viewwindowx, 0, viewwindowx+realviewwidth, viewwindowy);
|
||||
R_DrawBorder (viewwindowx+realviewwidth, 0, SCREENWIDTH, 34);
|
||||
R_DrawBorder (viewwindowx, 0, viewwindowx + viewwidth, viewwindowy);
|
||||
R_DrawBorder (viewwindowx + viewwidth, 0, SCREENWIDTH, 34);
|
||||
p = TexMan(gameinfo.border->t);
|
||||
screen->FlatFill(viewwindowx, viewwindowy - p->GetHeight(),
|
||||
viewwindowx + realviewwidth, viewwindowy, p, true);
|
||||
viewwindowx + viewwidth, viewwindowy, p, true);
|
||||
|
||||
p = TexMan(gameinfo.border->l);
|
||||
screen->FlatFill(viewwindowx - p->GetWidth(), viewwindowy,
|
||||
viewwindowx, 35, p, true);
|
||||
p = TexMan(gameinfo.border->r);
|
||||
screen->FlatFill(viewwindowx + realviewwidth, viewwindowy,
|
||||
viewwindowx + realviewwidth + p->GetWidth(), 35, p, true);
|
||||
screen->FlatFill(viewwindowx + viewwidth, viewwindowy,
|
||||
viewwindowx + viewwidth + p->GetWidth(), 35, p, true);
|
||||
|
||||
p = TexMan(gameinfo.border->tl);
|
||||
screen->DrawTexture (p, viewwindowx-offset, viewwindowy - offset, TAG_DONE);
|
||||
screen->DrawTexture (p, viewwindowx - offset, viewwindowy - offset, TAG_DONE);
|
||||
|
||||
p = TexMan(gameinfo.border->tr);
|
||||
screen->DrawTexture (p, viewwindowx+realviewwidth, viewwindowy - offset, TAG_DONE);
|
||||
screen->DrawTexture (p, viewwindowx + viewwidth, viewwindowy - offset, TAG_DONE);
|
||||
}
|
||||
else
|
||||
{
|
||||
R_DrawBorder (0, 0, SCREENWIDTH, 34);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
// [RH] Double pixels in the view window horizontally
|
||||
// and/or vertically (or not at all).
|
||||
void R_DetailDouble ()
|
||||
{
|
||||
if (!viewactive) return;
|
||||
DetailDoubleCycles = 0;
|
||||
clock (DetailDoubleCycles);
|
||||
|
||||
switch ((detailxshift << 1) | detailyshift)
|
||||
{
|
||||
case 1: // y-double
|
||||
#ifdef USEASM
|
||||
DoubleVert_ASM (viewheight, viewwidth, dc_destorg, RenderTarget->GetPitch());
|
||||
#else
|
||||
{
|
||||
int rowsize = realviewwidth;
|
||||
int pitch = RenderTarget->GetPitch();
|
||||
int y;
|
||||
BYTE *line;
|
||||
|
||||
line = dc_destorg;
|
||||
for (y = viewheight; y != 0; --y, line += pitch<<1)
|
||||
{
|
||||
memcpy (line+pitch, line, rowsize);
|
||||
}
|
||||
}
|
||||
#endif
|
||||
break;
|
||||
|
||||
case 2: // x-double
|
||||
#ifdef USEASM
|
||||
if (CPU.bMMX && (viewwidth&15)==0)
|
||||
{
|
||||
DoubleHoriz_MMX (viewheight, viewwidth, dc_destorg+viewwidth, RenderTarget->GetPitch());
|
||||
}
|
||||
else
|
||||
#endif
|
||||
{
|
||||
int rowsize = viewwidth;
|
||||
int pitch = RenderTarget->GetPitch();
|
||||
int y,x;
|
||||
BYTE *linefrom, *lineto;
|
||||
|
||||
linefrom = dc_destorg;
|
||||
for (y = viewheight; y != 0; --y, linefrom += pitch)
|
||||
{
|
||||
lineto = linefrom - viewwidth;
|
||||
for (x = 0; x < rowsize; ++x)
|
||||
{
|
||||
BYTE c = linefrom[x];
|
||||
lineto[x*2] = c;
|
||||
lineto[x*2+1] = c;
|
||||
}
|
||||
}
|
||||
}
|
||||
break;
|
||||
|
||||
case 3: // x- and y-double
|
||||
#ifdef USEASM
|
||||
if (CPU.bMMX && (viewwidth&15)==0 && 0)
|
||||
{
|
||||
DoubleHorizVert_MMX (viewheight, viewwidth, dc_destorg+viewwidth, RenderTarget->GetPitch());
|
||||
}
|
||||
else
|
||||
#endif
|
||||
{
|
||||
int rowsize = viewwidth;
|
||||
int realpitch = RenderTarget->GetPitch();
|
||||
int pitch = realpitch << 1;
|
||||
int y,x;
|
||||
BYTE *linefrom, *lineto;
|
||||
|
||||
linefrom = dc_destorg;
|
||||
for (y = viewheight; y != 0; --y, linefrom += pitch)
|
||||
{
|
||||
lineto = linefrom - viewwidth;
|
||||
for (x = 0; x < rowsize; ++x)
|
||||
{
|
||||
BYTE c = linefrom[x];
|
||||
lineto[x*2] = c;
|
||||
lineto[x*2+1] = c;
|
||||
lineto[x*2+realpitch] = c;
|
||||
lineto[x*2+realpitch+1] = c;
|
||||
}
|
||||
}
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
unclock (DetailDoubleCycles);
|
||||
}
|
||||
|
||||
ADD_STAT(detail)
|
||||
{
|
||||
FString out;
|
||||
out.Format ("doubling = %04.1f ms", (double)DetailDoubleCycles * 1000 * SecondsPerCycle);
|
||||
return out;
|
||||
}
|
||||
|
||||
// [RH] Initialize the column drawer pointers
|
||||
void R_InitColumnDrawers ()
|
||||
{
|
||||
#ifdef USEASM
|
||||
#ifdef X86_ASM
|
||||
R_DrawColumn = R_DrawColumnP_ASM;
|
||||
R_DrawColumnHoriz = R_DrawColumnHorizP_ASM;
|
||||
R_DrawFuzzColumn = R_DrawFuzzColumnP_ASM;
|
||||
|
|
37
src/r_draw.h
|
@ -67,7 +67,12 @@ extern void (*R_DrawColumn)(void);
|
|||
|
||||
extern DWORD (STACK_ARGS *dovline1) ();
|
||||
extern DWORD (STACK_ARGS *doprevline1) ();
|
||||
#ifdef X64_ASM
|
||||
#define dovline4 vlinetallasm4
|
||||
extern "C" void vlinetallasm4();
|
||||
#else
|
||||
extern void (STACK_ARGS *dovline4) ();
|
||||
#endif
|
||||
extern void setupvline (int);
|
||||
|
||||
extern DWORD (STACK_ARGS *domvline1) ();
|
||||
|
@ -151,7 +156,7 @@ void STACK_ARGS rt_addclamp4cols_asm (int sx, int yl, int yh);
|
|||
|
||||
extern void (STACK_ARGS *rt_map4cols)(int sx, int yl, int yh);
|
||||
|
||||
#ifdef USEASM
|
||||
#ifdef X86_ASM
|
||||
#define rt_copy1col rt_copy1col_asm
|
||||
#define rt_copy4cols rt_copy4cols_asm
|
||||
#define rt_map1col rt_map1col_asm
|
||||
|
@ -175,7 +180,19 @@ void rt_initcols (void);
|
|||
void R_DrawFogBoundary (int x1, int x2, short *uclip, short *dclip);
|
||||
|
||||
|
||||
#ifndef USEASM
|
||||
#ifdef X86_ASM
|
||||
|
||||
extern "C" void R_DrawColumnP_Unrolled (void);
|
||||
extern "C" void R_DrawColumnHorizP_ASM (void);
|
||||
extern "C" void R_DrawColumnP_ASM (void);
|
||||
extern "C" void R_DrawFuzzColumnP_ASM (void);
|
||||
void R_DrawTranslatedColumnP_C (void);
|
||||
void R_DrawShadedColumnP_C (void);
|
||||
extern "C" void R_DrawSpanP_ASM (void);
|
||||
extern "C" void R_DrawSpanMaskedP_ASM (void);
|
||||
|
||||
#else
|
||||
|
||||
void R_DrawColumnHorizP_C (void);
|
||||
void R_DrawColumnP_C (void);
|
||||
void R_DrawFuzzColumnP_C (void);
|
||||
|
@ -184,18 +201,6 @@ void R_DrawShadedColumnP_C (void);
|
|||
void R_DrawSpanP_C (void);
|
||||
void R_DrawSpanMaskedP_C (void);
|
||||
|
||||
#else /* USEASM */
|
||||
|
||||
extern "C" void R_DrawColumnP_Unrolled (void);
|
||||
|
||||
extern "C" void R_DrawColumnHorizP_ASM (void);
|
||||
extern "C" void R_DrawColumnP_ASM (void);
|
||||
extern "C" void R_DrawFuzzColumnP_ASM (void);
|
||||
void R_DrawTranslatedColumnP_C (void);
|
||||
void R_DrawShadedColumnP_C (void);
|
||||
extern "C" void R_DrawSpanP_ASM (void);
|
||||
extern "C" void R_DrawSpanMaskedP_ASM (void);
|
||||
|
||||
#endif
|
||||
|
||||
void R_DrawSpanTranslucentP_C (void);
|
||||
|
@ -232,10 +237,6 @@ extern FDynamicColormap ShadeFakeColormap[16];
|
|||
extern BYTE identitymap[256];
|
||||
extern BYTE *dc_translation;
|
||||
|
||||
// [RH] Double view pixels by detail mode
|
||||
void R_DetailDouble (void);
|
||||
|
||||
|
||||
|
||||
// If the view size is not full screen, draws a border around it.
|
||||
void R_DrawViewBorder (void);
|
||||
|
|
|
@ -59,13 +59,13 @@ unsigned int dc_tspans[4][MAXHEIGHT];
|
|||
unsigned int *dc_ctspan[4];
|
||||
unsigned int *horizspan[4];
|
||||
|
||||
#ifdef USEASM
|
||||
#ifdef X86_ASM
|
||||
extern "C" void R_SetupShadedCol();
|
||||
extern "C" void R_SetupAddCol();
|
||||
extern "C" void R_SetupAddClampCol();
|
||||
#endif
|
||||
|
||||
#ifndef USEASM
|
||||
#ifndef X86_ASM
|
||||
// Copies one span at hx to the screen at sx.
|
||||
void rt_copy1col_c (int hx, int sx, int yl, int yh)
|
||||
{
|
||||
|
@ -218,7 +218,7 @@ void STACK_ARGS rt_map4cols_c (int sx, int yl, int yh)
|
|||
dest += pitch*2;
|
||||
} while (--count);
|
||||
}
|
||||
#endif /* !USEASM */
|
||||
#endif
|
||||
|
||||
void rt_Translate1col(const BYTE *translation, int hx, int yl, int yh)
|
||||
{
|
||||
|
@ -850,7 +850,7 @@ void rt_draw4cols (int sx)
|
|||
dc_ctspan[x][1] = screen->GetHeight();
|
||||
}
|
||||
|
||||
#ifdef USEASM
|
||||
#ifdef X86_ASM
|
||||
// Setup assembly routines for changed colormaps or other parameters.
|
||||
if (hcolfunc_post4 == rt_shaded4cols)
|
||||
{
|
||||
|
|
119
src/r_main.cpp
|
@ -191,7 +191,7 @@ bool foggy; // [RH] ignore extralight and fullbright?
|
|||
int r_actualextralight;
|
||||
|
||||
bool setsizeneeded;
|
||||
int setblocks, setdetail = -1;
|
||||
int setblocks;
|
||||
|
||||
fixed_t freelookviewheight;
|
||||
|
||||
|
@ -516,8 +516,8 @@ void R_SetVisibility (float vis)
|
|||
else
|
||||
r_WallVisibility = r_BaseVisibility;
|
||||
|
||||
r_WallVisibility = FixedMul (Scale (InvZtoScale, SCREENWIDTH*(BaseRatioSizes[WidescreenRatio][1]<<detailyshift),
|
||||
(viewwidth<<detailxshift)*SCREENHEIGHT*3), FixedMul (r_WallVisibility, FocalTangent));
|
||||
r_WallVisibility = FixedMul (Scale (InvZtoScale, SCREENWIDTH*BaseRatioSizes[WidescreenRatio][1],
|
||||
viewwidth*SCREENHEIGHT*3), FixedMul (r_WallVisibility, FocalTangent));
|
||||
|
||||
// Prevent overflow on floors/ceilings. Note that the calculation of
|
||||
// MaxVisForFloor means that planes less than two units from the player's
|
||||
|
@ -562,48 +562,6 @@ void R_SetViewSize (int blocks)
|
|||
setblocks = blocks;
|
||||
}
|
||||
|
||||
//==========================================================================
|
||||
//
|
||||
// CVAR r_detail
|
||||
//
|
||||
// Selects a pixel doubling mode
|
||||
//
|
||||
//==========================================================================
|
||||
|
||||
CUSTOM_CVAR (Int, r_detail, 0, CVAR_ARCHIVE|CVAR_GLOBALCONFIG)
|
||||
{
|
||||
static bool badrecovery = false;
|
||||
|
||||
if (badrecovery)
|
||||
{
|
||||
badrecovery = false;
|
||||
return;
|
||||
}
|
||||
|
||||
if (self < 0 || self > 3)
|
||||
{
|
||||
Printf ("Bad detail mode. (Use 0-3)\n");
|
||||
badrecovery = true;
|
||||
self = (detailyshift << 1) | detailxshift;
|
||||
return;
|
||||
}
|
||||
|
||||
setdetail = self;
|
||||
setsizeneeded = true;
|
||||
}
|
||||
|
||||
//==========================================================================
|
||||
//
|
||||
// R_SetDetail
|
||||
//
|
||||
//==========================================================================
|
||||
|
||||
void R_SetDetail (int detail)
|
||||
{
|
||||
detailxshift = detail & 1;
|
||||
detailyshift = (detail >> 1) & 1;
|
||||
}
|
||||
|
||||
//==========================================================================
|
||||
//
|
||||
// R_SetWindow
|
||||
|
@ -616,19 +574,19 @@ void R_SetWindow (int windowSize, int fullWidth, int fullHeight, int stHeight)
|
|||
|
||||
if (windowSize >= 11)
|
||||
{
|
||||
realviewwidth = fullWidth;
|
||||
freelookviewheight = realviewheight = fullHeight;
|
||||
viewwidth = fullWidth;
|
||||
freelookviewheight = viewheight = fullHeight;
|
||||
}
|
||||
else if (windowSize == 10)
|
||||
{
|
||||
realviewwidth = fullWidth;
|
||||
realviewheight = stHeight;
|
||||
viewwidth = fullWidth;
|
||||
viewheight = stHeight;
|
||||
freelookviewheight = fullHeight;
|
||||
}
|
||||
else
|
||||
{
|
||||
realviewwidth = ((setblocks*fullWidth)/10) & (~15);
|
||||
realviewheight = ((setblocks*stHeight)/10)&~7;
|
||||
viewwidth = ((setblocks*fullWidth)/10) & (~15);
|
||||
viewheight = ((setblocks*stHeight)/10)&~7;
|
||||
freelookviewheight = ((setblocks*fullHeight)/10)&~7;
|
||||
}
|
||||
|
||||
|
@ -637,10 +595,7 @@ void R_SetWindow (int windowSize, int fullWidth, int fullHeight, int stHeight)
|
|||
|
||||
DrawFSHUD = (windowSize == 11);
|
||||
|
||||
viewwidth = realviewwidth >> detailxshift;
|
||||
viewheight = realviewheight >> detailyshift;
|
||||
fuzzviewheight = viewheight - 2; // Maximum row the fuzzer can draw to
|
||||
freelookviewheight >>= detailyshift;
|
||||
halfviewwidth = (viewwidth >> 1) - 1;
|
||||
|
||||
if (!bRenderingToCanvas)
|
||||
@@ -659,8 +614,8 @@ void R_SetWindow (int windowSize, int fullWidth, int fullHeight, int stHeight)
 	centerxfrac = centerx<<FRACBITS;
 	centeryfrac = centery<<FRACBITS;
 
-	virtwidth = fullWidth >> detailxshift;
-	virtheight = fullHeight >> detailyshift;
+	virtwidth = fullWidth;
+	virtheight = fullHeight;
 	if (WidescreenRatio & 4)
 	{
 		virtheight = virtheight * BaseRatioSizes[WidescreenRatio][3] / 48;
@@ -692,8 +647,8 @@ void R_SetWindow (int windowSize, int fullWidth, int fullHeight, int stHeight)
 
 	R_InitTextureMapping ();
 
-	MaxVisForWall = FixedMul (Scale (InvZtoScale, SCREENWIDTH*(r_Yaspect<<detailyshift),
-		(viewwidth<<detailxshift)*SCREENHEIGHT), FocalTangent);
+	MaxVisForWall = FixedMul (Scale (InvZtoScale, SCREENWIDTH*r_Yaspect,
+		viewwidth*SCREENHEIGHT), FocalTangent);
 	MaxVisForWall = FixedDiv (0x7fff0000, MaxVisForWall);
 	MaxVisForFloor = Scale (FixedDiv (0x7fff0000, viewheight<<(FRACBITS-2)), FocalLengthY, 160*FRACUNIT);
 
@@ -712,20 +667,13 @@ void R_ExecuteSetViewSize ()
 	setsizeneeded = false;
 	BorderNeedRefresh = screen->GetPageCount ();
 
-	if (setdetail >= 0)
-	{
-		R_SetDetail (setdetail);
-		setdetail = -1;
-	}
-
 	R_SetWindow (setblocks, SCREENWIDTH, SCREENHEIGHT, ST_Y);
 
 	// Handle resize, e.g. smaller view windows with border and/or status bar.
-	viewwindowx = (screen->GetWidth() - (viewwidth<<detailxshift))>>1;
+	viewwindowx = (screen->GetWidth() - viewwidth) >> 1;
 
 	// Same with base row offset.
-	viewwindowy = ((viewwidth<<detailxshift) == screen->GetWidth()) ?
-		0 : (ST_Y-(viewheight<<detailyshift)) >> 1;
+	viewwindowy = (viewwidth == screen->GetWidth()) ? 0 : (ST_Y - viewheight) >> 1;
 }
 
 //==========================================================================
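With the detail shifts gone, the view window placement in the hunk above reduces to plain centering. A minimal sketch of the two formulas, assuming hypothetical free functions (`ViewWindowX`, `ViewWindowY`) that take the screen width and status bar row as parameters instead of reading `screen->GetWidth()` and `ST_Y`:

```cpp
#include <cassert>

// Hypothetical helpers, not from the source; the formulas follow the added
// lines in R_ExecuteSetViewSize.
int ViewWindowX(int screenWidth, int viewWidth)
{
    return (screenWidth - viewWidth) >> 1;   // centre horizontally
}

int ViewWindowY(int screenWidth, int viewWidth, int statusBarY, int viewHeight)
{
    // A full-width view starts at the top row; a smaller view is centred
    // in the space above the status bar.
    return (viewWidth == screenWidth) ? 0 : (statusBarY - viewHeight) >> 1;
}
```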
@@ -762,7 +710,7 @@ CUSTOM_CVAR (Int, r_columnmethod, 1, CVAR_ARCHIVE|CVAR_GLOBALCONFIG)
 	}
 	else
 	{ // Trigger the change
-		r_detail.Callback ();
+		setsizeneeded = true;
 	}
 }
 
@@ -1434,29 +1382,20 @@ void R_EnterMirror (drawseg_t *ds, int depth)
 //
 //==========================================================================
 
-void R_SetupBuffer (bool inview)
+void R_SetupBuffer ()
 {
 	static BYTE *lastbuff = NULL;
 
 	int pitch = RenderTarget->GetPitch();
 	BYTE *lineptr = RenderTarget->GetBuffer() + viewwindowy*pitch + viewwindowx;
 
-	if (inview)
-	{
-		pitch <<= detailyshift;
-	}
-	if (detailxshift)
-	{
-		lineptr += viewwidth;
-	}
-
 	if (dc_pitch != pitch || lineptr != lastbuff)
 	{
 		if (dc_pitch != pitch)
 		{
 			dc_pitch = pitch;
 			R_InitFuzzTable (pitch);
-#ifdef USEASM
+#if defined(X86_ASM) || defined(X64_ASM)
 			ASM_PatchPitch ();
 #endif
 		}
@@ -1478,7 +1417,7 @@ void R_RenderActorView (AActor *actor, bool dontmaplines)
 {
 	WallCycles = PlaneCycles = MaskedCycles = WallScanCycles = 0;
 
-	R_SetupBuffer (true);
+	R_SetupBuffer ();
 	R_SetupFrame (actor);
 
 	// Clear buffers.
@@ -1569,17 +1508,8 @@ void R_RenderActorView (AActor *actor, bool dontmaplines)
 		}
 	}
 	WallMirrors.Clear ();
-
 	interpolator.RestoreInterpolations ();
-
-	// If there is vertical doubling, and the view window is not an even height,
-	// draw a black line at the bottom of the view window.
-	if (detailyshift && viewwindowy == 0 && (realviewheight & 1))
-	{
-		screen->Clear (0, realviewheight-1, realviewwidth, realviewheight, 0, 0);
-	}
-
-	R_SetupBuffer (false);
+	R_SetupBuffer ();
 }
 
 //==========================================================================
@@ -1593,16 +1523,12 @@ void R_RenderActorView (AActor *actor, bool dontmaplines)
 void R_RenderViewToCanvas (AActor *actor, DCanvas *canvas,
 	int x, int y, int width, int height, bool dontmaplines)
 {
-	const int saveddetail = detailxshift | (detailyshift << 1);
 	const bool savedviewactive = viewactive;
 
-	detailxshift = detailyshift = 0;
-	realviewwidth = viewwidth = width;
-
+	viewwidth = width;
 	RenderTarget = canvas;
 	bRenderingToCanvas = true;
 
-	R_SetDetail (0);
 	R_SetWindow (12, width, height, height);
 	viewwindowx = x;
 	viewwindowy = y;
@@ -1612,10 +1538,9 @@ void R_RenderViewToCanvas (AActor *actor, DCanvas *canvas,
 
 	RenderTarget = screen;
 	bRenderingToCanvas = false;
-	R_SetDetail (saveddetail);
 	R_ExecuteSetViewSize ();
 	screen->Lock (true);
-	R_SetupBuffer (false);
+	R_SetupBuffer ();
 	screen->Unlock ();
 	viewactive = savedviewactive;
 }

@@ -128,10 +128,6 @@ extern int fixedlightlev;
 extern lighttable_t* fixedcolormap;
 
 
-// [RH] New detail modes
-extern "C" int detailxshift;
-extern "C" int detailyshift;
-
 //
 // Function pointers to switch refresh/drawing functions.
 // Used to select shadow mode etc.
@@ -190,7 +186,7 @@ void R_SetViewAngle ();
 // Called by G_Drawer.
 void R_RenderActorView (AActor *actor, bool dontmaplines = false);
 void R_RefreshViewBorder ();
-void R_SetupBuffer (bool inview);
+void R_SetupBuffer ();
 
 void R_RenderViewToCanvas (AActor *actor, DCanvas *canvas, int x, int y, int width, int height, bool dontmaplines = false);
 

@@ -134,7 +134,7 @@ static fixed_t xscale, yscale;
 static DWORD xstepscale, ystepscale;
 static DWORD basexfrac, baseyfrac;
 
-#ifdef USEASM
+#ifdef X86_ASM
 extern "C" void R_SetSpanSource_ASM (const BYTE *flat);
 extern "C" void STACK_ARGS R_SetSpanSize_ASM (int xbits, int ybits);
 extern "C" void R_SetSpanColormap_ASM (BYTE *colormap);
@@ -210,7 +210,7 @@ void R_MapPlane (int y, int x1)
 			FixedMul (GlobVis, abs (centeryfrac - (y << FRACBITS))), planeshade) << COLORMAPSHIFT);
 	}
 
-#ifdef USEASM
+#ifdef X86_ASM
 	if (ds_colormap != ds_curcolormap)
 		R_SetSpanColormap_ASM (ds_colormap);
 #endif
@@ -469,7 +469,7 @@ void R_ClearPlanes (bool fullclear)
 	// [RH] clip ceiling to console bottom
 	clearbufshort (ceilingclip, viewwidth,
 		!screen->Accel2D && ConBottom > viewwindowy && !bRenderingToCanvas
-		? ((ConBottom - viewwindowy) >> detailyshift) : 0);
+		? (ConBottom - viewwindowy) : 0);
 
 	lastopening = 0;
 }
@@ -988,7 +988,7 @@ void R_DrawSinglePlane (visplane_t *pl, fixed_t alpha, bool masked)
 	}
 	pl->xscale = MulScale16 (pl->xscale, tex->xScale);
 	pl->yscale = MulScale16 (pl->yscale, tex->yScale);
-#ifdef USEASM
+#ifdef X86_ASM
 	R_SetSpanSize_ASM (ds_xbits, ds_ybits);
 #endif
 	ds_source = tex->GetPixels ();
@@ -1344,7 +1344,7 @@ void R_DrawSkyPlane (visplane_t *pl)
 
 void R_DrawNormalPlane (visplane_t *pl, fixed_t alpha, bool masked)
 {
-#ifdef USEASM
+#ifdef X86_ASM
 	if (ds_source != ds_cursource)
 	{
 		R_SetSpanSource_ASM (ds_source);
@@ -1550,7 +1550,7 @@ void R_DrawTiltedPlane (visplane_t *pl, fixed_t alpha, bool masked)
 		}
 	}
 
-#if defined(USEASM)
+#if defined(X86_ASM)
 	if (ds_source != ds_curtiltedsource)
 		R_SetTiltedSpanSource_ASM (ds_source);
 	R_MapVisPlane (pl, R_DrawTiltedPlane_ASM);

@@ -57,7 +57,6 @@ CUSTOM_CVAR (Bool, r_stretchsky, true, CVAR_ARCHIVE)
 	R_InitSkyMap ();
 }
 
-extern "C" int detailxshift, detailyshift;
 extern fixed_t freelookviewheight;
 
 //==========================================================================
@@ -107,8 +106,8 @@ void R_InitSkyMap ()
 
 	if (viewwidth && viewheight)
 	{
-		skyiscale = (r_Yaspect*FRACUNIT) / (((freelookviewheight<<detailxshift) * viewwidth) / (viewwidth<<detailxshift));
-		skyscale = ((((freelookviewheight<<detailxshift) * viewwidth) / (viewwidth<<detailxshift)) << FRACBITS) /
+		skyiscale = (r_Yaspect*FRACUNIT) / ((freelookviewheight * viewwidth) / viewwidth);
+		skyscale = (((freelookviewheight * viewwidth) / viewwidth) << FRACBITS) /
 			(r_Yaspect);
 
 		skyiscale = Scale (skyiscale, FieldOfView, 2048);

@@ -33,9 +33,7 @@
 //
 
 extern "C" int viewwidth;
-extern "C" int realviewwidth;
 extern "C" int viewheight;
-extern "C" int realviewheight;
 
 // Sprite....
 extern int firstspritelump;

@@ -1583,7 +1583,7 @@ void R_DrawPSprite (pspdef_t* psp, int pspnum, AActor *owner, fixed_t sx, fixed_
 
 
 	if (camera->player && (RenderTarget != screen ||
-		realviewheight == RenderTarget->GetHeight() ||
+		viewheight == RenderTarget->GetHeight() ||
 		(RenderTarget->GetWidth() > 320 && !st_scale)))
 	{ // Adjust PSprite for fullscreen views
 		AWeapon *weapon = NULL;
@@ -1593,7 +1593,7 @@ void R_DrawPSprite (pspdef_t* psp, int pspnum, AActor *owner, fixed_t sx, fixed_
 		}
 		if (pspnum <= ps_flash && weapon != NULL && weapon->YAdjust != 0)
 		{
-			if (RenderTarget != screen || realviewheight == RenderTarget->GetHeight())
+			if (RenderTarget != screen || viewheight == RenderTarget->GetHeight())
 			{
 				vis->texturemid -= weapon->YAdjust;
 			}
@@ -2502,7 +2502,7 @@ void R_DrawParticle (vissprite_t *vis)
 		fg = fg2rgb[color];
 	}
 
-	spacing = (RenderTarget->GetPitch()<<detailyshift) - countbase;
+	spacing = RenderTarget->GetPitch() - countbase;
 	dest = ylookup[yl] + x1 + dc_destorg;
 
 	do

@@ -63,7 +63,7 @@
 
 EXTERN_CVAR (String, language)
 
-#ifdef USEASM
+#if defined(X86_ASM) || defined(X64_ASM)
 extern "C" void STACK_ARGS CheckMMX (CPUInfo *cpu);
 #endif
 
@@ -182,7 +182,7 @@ void SetLanguageIDs ()
 //
 void I_Init (void)
 {
-#ifndef USEASM
+#if !defined(X86_ASM) && !defined(X64_ASM)
 	memset (&CPU, 0, sizeof(CPU));
 #else
 	CheckMMX (&CPU);

@@ -100,14 +100,11 @@ CUSTOM_CVAR (Float, Gamma, 1.f, CVAR_ARCHIVE|CVAR_GLOBALCONFIG)
 /* Palette management stuff */
 /****************************/
 
-extern "C"
-{
-	BYTE BestColor_MMX (DWORD rgb, const DWORD *pal);
-}
+extern "C" BYTE BestColor_MMX (DWORD rgb, const DWORD *pal);
 
 int BestColor (const uint32 *pal_in, int r, int g, int b, int first, int num)
 {
-#ifdef USEASM
+#ifdef X86_ASM
 	if (CPU.bMMX)
 	{
 		int pre = 256 - num - first;
@@ -120,9 +117,10 @@ int BestColor (const uint32 *pal_in, int r, int g, int b, int first, int num)
 
 	for (int color = first; color < num; color++)
 	{
-		int dist = (r-pal[color].r)*(r-pal[color].r)+
-				   (g-pal[color].g)*(g-pal[color].g)+
-				   (b-pal[color].b)*(b-pal[color].b);
+		int x = r - pal[color].r;
+		int y = g - pal[color].g;
+		int z = b - pal[color].b;
+		int dist = x*x + y*y + z*z;
 		if (dist < bestdist)
 		{
 			if (dist == 0)
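The loop in the hunk above picks the palette entry with the smallest squared RGB distance. A minimal standalone sketch of that search follows. The `RGB` struct, the `NearestColor` name, the half-open `[first, last)` range, and the early return on an exact match are assumptions for illustration, since the hunk cuts off before the rest of the loop body:

```cpp
#include <cassert>
#include <climits>

struct RGB { int r, g, b; };   // hypothetical stand-in for the palette entry type

// Returns the index in [first, last) whose colour is closest to (r, g, b)
// by squared Euclidean distance.
int NearestColor(const RGB *pal, int first, int last, int r, int g, int b)
{
    int best = first;
    int bestdist = INT_MAX;

    for (int color = first; color < last; color++)
    {
        int x = r - pal[color].r;
        int y = g - pal[color].g;
        int z = b - pal[color].b;
        int dist = x*x + y*y + z*z;
        if (dist < bestdist)
        {
            if (dist == 0)
                return color;   // exact match, nothing can beat it (assumed behaviour)
            bestdist = dist;
            best = color;
        }
    }
    return best;
}
```

Computing each difference once, as the new code does, avoids repeating the subtraction inside every product.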
@@ -454,10 +452,8 @@ void InitPalette ()
 
 }
 
-extern "C"
-{
-	void STACK_ARGS DoBlending_MMX (const PalEntry *from, PalEntry *to, int count, int r, int g, int b, int a);
-}
+extern "C" void STACK_ARGS DoBlending_MMX (const PalEntry *from, PalEntry *to, int count, int r, int g, int b, int a);
+extern void DoBlending_SSE2 (const PalEntry *from, PalEntry *to, int count, int r, int g, int b, int a);
 
 void DoBlending (const PalEntry *from, PalEntry *to, int count, int r, int g, int b, int a)
 {
@@ -478,29 +474,51 @@ void DoBlending (const PalEntry *from, PalEntry *to, int count, int r, int g, in
 			to[i] = t;
 		}
 	}
-#ifdef USEASM
-	else if (CPU.bMMX && !(count & 1))
+	else if (CPU.bSSE2)
 	{
-		DoBlending_MMX (from, to, count, r, g, b, a);
-	}
-#endif
-	else
-	{
-		int i, ia;
-
-		ia = 256 - a;
-		r *= a;
-		g *= a;
-		b *= a;
-
-		for (i = count; i > 0; i--, to++, from++)
+		if (count >= 4)
 		{
-			to->r = (r + from->r*ia) >> 8;
-			to->g = (g + from->g*ia) >> 8;
-			to->b = (b + from->b*ia) >> 8;
+			int not3count = count & ~3;
+			DoBlending_SSE2 (from, to, not3count, r, g, b, a);
+			count &= 3;
+			if (count <= 0)
+			{
+				return;
+			}
+			from += not3count;
+			to += not3count;
 		}
 	}
+#ifdef X86_ASM
+	else if (CPU.bMMX)
+	{
+		if (count >= 4)
+		{
+			int not3count = count & ~3;
+			DoBlending_MMX (from, to, not3count, r, g, b, a);
+			count &= 3;
+			if (count <= 0)
+			{
+				return;
+			}
+			from += not3count;
+			to += not3count;
+		}
+	}
+#endif
+	int i, ia;
 
+	ia = 256 - a;
+	r *= a;
+	g *= a;
+	b *= a;
+
+	for (i = count; i > 0; i--, to++, from++)
+	{
+		to->r = (r + from->r * ia) >> 8;
+		to->g = (g + from->g * ia) >> 8;
+		to->b = (b + from->b * ia) >> 8;
+	}
 }
 
 void V_SetBlend (int blendr, int blendg, int blendb, int blenda)
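The new `DoBlending` hands the largest multiple-of-four prefix to a SIMD routine and lets the scalar loop finish the remaining zero to three entries. A minimal sketch of that split follows. `Color`, `BlendScalar`, `BlendKernel`, and `BlendSplit` are hypothetical names, and the kernel is passed as a function pointer so that the scalar routine can stand in for `DoBlending_SSE2` or `DoBlending_MMX`, which are not reproduced here:

```cpp
#include <cassert>
#include <cstdint>

struct Color { std::uint8_t r, g, b; };   // hypothetical stand-in for PalEntry

// Scalar blend, as in the tail loop: out = (c*a + in*(256 - a)) >> 8
void BlendScalar(const Color *from, Color *to, int count, int r, int g, int b, int a)
{
    int ia = 256 - a;
    r *= a;
    g *= a;
    b *= a;
    for (int i = count; i > 0; i--, to++, from++)
    {
        to->r = static_cast<std::uint8_t>((r + from->r * ia) >> 8);
        to->g = static_cast<std::uint8_t>((g + from->g * ia) >> 8);
        to->b = static_cast<std::uint8_t>((b + from->b * ia) >> 8);
    }
}

typedef void (*BlendKernel)(const Color *, Color *, int, int, int, int, int);

// Gives the kernel a count that is a multiple of 4, then blends the
// leftover 0..3 entries with the scalar loop.
void BlendSplit(BlendKernel kernel, const Color *from, Color *to, int count,
                int r, int g, int b, int a)
{
    if (count >= 4)
    {
        int not3count = count & ~3;   // largest multiple of 4 <= count
        kernel(from, to, not3count, r, g, b, a);
        count &= 3;                   // entries the kernel did not touch
        if (count <= 0)
        {
            return;
        }
        from += not3count;
        to += not3count;
    }
    BlendScalar(from, to, count, r, g, b, a);
}
```

Compared with the old `!(count & 1)` test, which skipped the MMX path entirely for odd counts, this arrangement lets the fast path handle most of the entries for any count of four or more.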

@@ -1192,7 +1192,6 @@ void DFrameBuffer::PrecacheTexture(FTexture *tex, int cache)
 void DFrameBuffer::RenderView(player_t *player)
 {
 	R_RenderActorView (player->mo);
-	R_DetailDouble (); // [RH] Apply detail mode expansion
 	// [RH] Let cameras draw onto textures that were visible this frame.
 	FCanvasTextureInfo::UpdateAll ();
 }
@@ -1317,7 +1316,7 @@ bool V_DoModeSetup (int width, int height, int bits)
 
 	RenderTarget = screen;
 	screen->Lock (true);
-	R_SetupBuffer (false);
+	R_SetupBuffer ();
 	screen->Unlock ();
 
 	M_RefreshModesList ();

@@ -458,7 +458,7 @@ FString V_GetColorStringByName (const char *name);
 // Tries to get color by name, then by string
 int V_GetColor (const DWORD *palette, const char *str);
 
-#ifdef USEASM
+#if defined(X86_ASM) || defined(X64_ASM)
 extern "C" void ASM_PatchPitch (void);
 #endif
 

@@ -67,9 +67,7 @@
 
 EXTERN_CVAR (String, language)
 
-#ifdef USEASM
-extern "C" void STACK_ARGS CheckMMX (CPUInfo *cpu);
-#endif
+extern void CheckCPUID(CPUInfo *cpu);
 
 extern "C"
 {
@@ -344,12 +342,10 @@ void SetLanguageIDs ()
 //
 // I_Init
 //
+
 void I_Init (void)
 {
-#ifndef USEASM
-	memset (&CPU, 0, sizeof(CPU));
-#else
-	CheckMMX (&CPU);
+	CheckCPUID(&CPU);
 	CalculateCPUSpeed ();
 
 	// Why does Intel right-justify this string?
@@ -367,7 +363,6 @@ void I_Init (void)
 		}
 	}
 
-#endif
 	if (CPU.VendorID[0])
 	{
 		Printf ("CPU Vendor ID: %s\n", CPU.VendorID);
@@ -396,7 +391,6 @@ void I_Init (void)
 		Printf ("\n");
 	}
 
-
 	// Use a timer event if possible
 	NewTicArrived = CreateEvent (NULL, FALSE, FALSE, NULL);
 	if (NewTicArrived)
@@ -484,7 +478,7 @@ void CalculateCPUSpeed ()
 		Printf ("Can't determine CPU speed, so pretending.\n");
 	}
 
-	Printf ("CPU Speed: %f MHz\n", CyclesPerSecond / 1e6);
+	Printf ("CPU Speed: %.0f MHz\n", CyclesPerSecond / 1e6);
 }
 
 //

@@ -52,23 +52,23 @@ extern os_t OSPlatform;
 
 struct CPUInfo // 92 bytes
 {
-	char VendorID[16];
-	char CPUString[48];
+	char VendorID[16]; // 0
+	char CPUString[48]; // 16
 
-	BYTE Stepping;
-	BYTE Model;
-	BYTE Family;
-	BYTE Type;
+	BYTE Stepping; // 64
+	BYTE Model; // 65
+	BYTE Family; // 66
+	BYTE Type; // 67
 
-	BYTE BrandIndex;
-	BYTE CLFlush;
-	BYTE CPUCount;
-	BYTE APICID;
+	BYTE BrandIndex; // 68
+	BYTE CLFlush; // 69
+	BYTE CPUCount; // 70
+	BYTE APICID; // 71
 
-	DWORD bSSE3:1;
+	DWORD bSSE3:1; // 72
 	DWORD DontCare1:31;
 
-	DWORD bFPU:1;
+	DWORD bFPU:1; // 76
 	DWORD bVME:1;
 	DWORD bDE:1;
 	DWORD bPSE:1;
@@ -76,7 +76,7 @@ struct CPUInfo // 92 bytes
 	DWORD bMSR:1;
 	DWORD bPAE:1;
 	DWORD bMCE:1;
-	DWORD bCX8:1;
+	DWORD bCX8:1; // 77
 	DWORD bAPIC:1;
 	DWORD bReserved1:1;
 	DWORD bSEP:1;
@@ -84,7 +84,7 @@ struct CPUInfo // 92 bytes
 	DWORD bPGE:1;
 	DWORD bMCA:1;
 	DWORD bCMOV:1;
-	DWORD bPAT:1;
+	DWORD bPAT:1; // 78
 	DWORD bPSE36:1;
 	DWORD bPSN:1;
 	DWORD bCFLUSH:1;
@@ -92,7 +92,7 @@ struct CPUInfo // 92 bytes
 	DWORD bDS:1;
 	DWORD bACPI:1;
 	DWORD bMMX:1;
-	DWORD bFXSR:1;
+	DWORD bFXSR:1; // 79
 	DWORD bSSE:1;
 	DWORD bSSE2:1;
 	DWORD bSS:1;
@@ -101,22 +101,22 @@ struct CPUInfo // 92 bytes
 	DWORD bReserved3:1;
 	DWORD bPBE:1;
 
-	DWORD DontCare2:22;
+	DWORD DontCare2:22; // 80
 	DWORD bMMXPlus:1; // AMD's MMX extensions
 	DWORD bMMXAgain:1; // Just a copy of bMMX above
 	DWORD DontCare3:6;
 	DWORD b3DNowPlus:1;
 	DWORD b3DNow:1;
 
-	BYTE AMDStepping;
-	BYTE AMDModel;
-	BYTE AMDFamily;
-	BYTE bIsAMD;
+	BYTE AMDStepping; // 84
+	BYTE AMDModel; // 85
+	BYTE AMDFamily; // 86
+	BYTE bIsAMD; // 87
 
-	BYTE DataL1LineSize;
-	BYTE DataL1LinesPerTag;
-	BYTE DataL1Associativity;
-	BYTE DataL1SizeKB;
+	BYTE DataL1LineSize; // 88
+	BYTE DataL1LinesPerTag; // 89
+	BYTE DataL1Associativity;//90
+	BYTE DataL1SizeKB; // 91
 };
 
 
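The offset comments added to `CPUInfo` can be checked at compile time rather than by hand. A minimal sketch follows, assuming a hypothetical mirror struct `CpuInfoSketch` in which each 32-bit group of bit-fields is replaced by one plain `std::uint32_t`, because `offsetof` cannot be applied to a bit-field. It also assumes a typical ABI where `std::uint32_t` is 4-byte aligned and a C++11 compiler for `static_assert`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the CPUInfo layout, not the real struct.
struct CpuInfoSketch
{
    char VendorID[16];                  // 0
    char CPUString[48];                 // 16

    std::uint8_t Stepping;              // 64
    std::uint8_t Model;                 // 65
    std::uint8_t Family;                // 66
    std::uint8_t Type;                  // 67

    std::uint8_t BrandIndex;            // 68
    std::uint8_t CLFlush;               // 69
    std::uint8_t CPUCount;              // 70
    std::uint8_t APICID;                // 71

    std::uint32_t FeatureWord0;         // 72: stands in for bSSE3 + DontCare1
    std::uint32_t FeatureWord1;         // 76: stands in for bFPU .. bPBE
    std::uint32_t FeatureWord2;         // 80: stands in for DontCare2 .. b3DNow

    std::uint8_t AMDStepping;           // 84
    std::uint8_t AMDModel;              // 85
    std::uint8_t AMDFamily;             // 86
    std::uint8_t bIsAMD;                // 87

    std::uint8_t DataL1LineSize;        // 88
    std::uint8_t DataL1LinesPerTag;     // 89
    std::uint8_t DataL1Associativity;   // 90
    std::uint8_t DataL1SizeKB;          // 91
};

static_assert(offsetof(CpuInfoSketch, CPUString) == 16, "CPUString offset");
static_assert(offsetof(CpuInfoSketch, Stepping) == 64, "Stepping offset");
static_assert(offsetof(CpuInfoSketch, FeatureWord0) == 72, "first feature word offset");
static_assert(offsetof(CpuInfoSketch, AMDStepping) == 84, "AMDStepping offset");
static_assert(sizeof(CpuInfoSketch) == 92, "struct size");
```

If a field is added or reordered, the build fails instead of the comments silently going stale.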
840
zdoom.vcproj
File diff suppressed because it is too large