An Improved Quake C Compiler

This is a work-in-progress Quake C compiler. Very few good QC compilers
are available to the open source community; there are plenty of mediocre
ones, but no one wants those. This project aims to fix that by providing
a proper Quake C compiler that is capable of real optimization.

The compiler is intended to follow modern compiler design principles
and to be extensible: users are given a low-level, syntax-specific
language, embedded in the language itself, with which to implement
language functionality.

The design goals of the compiler are ambitious: it is meant to support a
multitude of things. The table below lists them along with how complete
each one is.

+-------------------+-----------------------------+------------------+
|     Feature       |  What's it for?             |  Completeness    |
+-------------------+-----------------------------+------------------+
. Lexical analysis  .  Tokenization               .       90%        .
.-------------------.-----------------------------.------------------.
. Tokenization      .  Parsing                    .       90%        .
.-------------------.-----------------------------.------------------.
. Parsing / SYA     .  AST Generation             .       09%        .
.-------------------.-----------------------------.------------------.
. AST Generation    .  IR  Generation             .       ??%        .
.-------------------.-----------------------------.------------------.
. IR  Generation    .  Code Generation            .       ??%        .
.-------------------.-----------------------------.------------------.
. Code Generation   .  Binary Generation          .       ??%        .
.-------------------.-----------------------------.------------------.
. Binary Generation .  Binary                     .      100%        .
+-------------------+-----------------------------+------------------+

Design tree:
	The compiler is intended to work in the following order:
		Lexical analysis ->
			Tokenization ->
				Parsing:
					Operator precedence:
						Shunting-yard algorithm
					Inline assembly:
						Usage of the assembler subsystem:
							Top-down parsing and assembly, no optimization
					Other parsing:
						Recursive descent
				->
					Abstract syntax tree generation ->
						Intermediate representation (SSA):
							Optimizations:
								Constant propagation
								Value range propagation
								Sparse conditional constant propagation (possibly?)
									Dead code elimination
									Constant folding
								Global value numbering
								Partial redundancy elimination
								Strength reduction
								Common subexpression elimination
								Peephole optimizations
								Loop-invariant code motion
								Inline expansion
								Constant folding
								Induction variable recognition and elimination
								Dead store elimination
								Jump threading
						->
							Code Generation:
								Optimizations:
									Rematerialization
									Code Factoring
									Recursion Elimination
									Loop unrolling
									Deforestation
							->
								Binary Generation
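
The operator-precedence step in the tree above names the shunting-yard
algorithm. The following is only a minimal sketch of that algorithm in C, not
gmqcc's parser: it assumes single-character operands, the left-associative
operators + - * / and parentheses. The names `prec` and `to_rpn` are
hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Precedence of a binary operator; 0 means "not an operator". */
static int prec(char op) {
    switch (op) {
        case '+': case '-': return 1;
        case '*': case '/': return 2;
        default:            return 0;
    }
}

/*
 * Hypothetical helper (not gmqcc code): convert an infix expression made
 * of single-character operands into reverse Polish notation.
 * Returns 0 on success, -1 on mismatched parentheses, an unexpected
 * character, or when `out` (of size `cap`) is too small.
 */
static int to_rpn(const char *in, char *out, size_t cap) {
    char   stack[64];          /* operator stack */
    size_t sp = 0, n = 0;

    if (cap == 0) return -1;
    for (; *in; in++) {
        char c = *in;
        if (isspace((unsigned char)c))
            continue;
        if (isalnum((unsigned char)c)) {
            /* operands go straight to the output */
            if (n + 1 >= cap) return -1;
            out[n++] = c;
        } else if (prec(c)) {
            /* pop operators of higher or equal precedence (left assoc.) */
            while (sp && prec(stack[sp - 1]) >= prec(c)) {
                if (n + 1 >= cap) return -1;
                out[n++] = stack[--sp];
            }
            if (sp == sizeof stack) return -1;
            stack[sp++] = c;
        } else if (c == '(') {
            if (sp == sizeof stack) return -1;
            stack[sp++] = c;
        } else if (c == ')') {
            /* pop until the matching '(' */
            while (sp && stack[sp - 1] != '(') {
                if (n + 1 >= cap) return -1;
                out[n++] = stack[--sp];
            }
            if (!sp) return -1;    /* no matching '(' */
            sp--;                  /* discard the '(' */
        } else {
            return -1;
        }
    }
    /* flush the remaining operators */
    while (sp) {
        if (stack[sp - 1] == '(') return -1;
        if (n + 1 >= cap) return -1;
        out[n++] = stack[--sp];
    }
    out[n] = '\0';
    return 0;
}
```

With this sketch, `to_rpn("a+b*c", buf, sizeof buf)` leaves `abc*+` in `buf`,
and `to_rpn("(a+b)*c", buf, sizeof buf)` leaves `ab+c*`.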

File tree and explanation:
	gmqcc.h
		This is the common header with all definitions, structures, and
		constants for everything.

	error.c
		This is the error subsystem. It will handle the output of good, detailed
		error messages with colors and such (not implemented yet).
	
	lex.c
		This is the lexer, a very small, basic step-seek lexer that can easily be
		changed to add new tokens, which makes it very retargetable.
		
	main.c
		This is the core compiler entry point. It will handle switches that
		toggle certain compiler features on and off (not implemented yet).
		
	parse.c
		This is the parser, which goes over all tokens, generates a parse tree,
		and checks for syntax correctness.
		
	typedef.c
		This is the typedef system. It lives in a separate file because it's a
		lot more complicated than it sounds. It handles all typedefs, even
		recursive typedefs.
		
	util.c
		These are utilities for the compiler, including an allocator used for
		debugging and some string functions.
		
	asm.c
		This implements support for assembling Quake assembler (which did not
		actually exist until now; documentation of the Quake assembler is below).
		It will also implement inline assembly for the C compiler.
		
	README
		This is the file you're currently reading
		
	Makefile
		The makefile. When sources are added, add them to the SRC= line,
		otherwise the build will not pick them up. It is a small, easy to
		manage makefile; there is no need to complicate it.
		Some targets:
			#make gmqcc
				Builds gmqcc, creating a `gmqcc` binary file in the same
				directory as the makefile.
			#make test
				Builds the ir and ast tests, creating `test_ir` and `test_ast`
				binary files in the same directory as the makefile.
			#make test_ir
				Builds the ir test, creating a `test_ir` binary file in the
				same directory as the makefile.
			#make test_ast
				Builds the ast test, creating a `test_ast` binary file in the
				same directory as the makefile.
			#make clean
				Cleans the build files left behind by a previous build, as
				well as all the binary files.
			#make all
				Builds the tests and the compiler binary, all in the same
				directory as the makefile.

////////////////////////////////////////////////////////////////////////
///////////////////// Quake Assembler Documentation ////////////////////
////////////////////////////////////////////////////////////////////////
Quake assembler is quite simple: it's just an annotated version of the binary
produced by any existing QuakeC compiler, made cleaner to use (so that the
locations of globals and strings do not need to be known).

Constants:
	Declare a constant with one of the valid constant typenames
	{FLOAT,VECTOR,STRING,FUNCTION,FIELD,ENTITY}. Every typename is followed
	by a colon, then the name and the value (white space doesn't matter).
	
	Examples:
		FLOAT: foo 1
		VECTOR: bar 1 2 1
		STRING: hello "hello world"
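
To make the declaration shape concrete, here is a minimal sketch in C that
splits one such line into typename, name, and raw value text. It is not taken
from gmqcc's assembler; `struct asm_decl`, `skip_ws`, and `parse_decl` are
hypothetical names, and the buffer sizes are arbitrary.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one declaration; not a gmqcc structure. */
struct asm_decl {
    char type[16];    /* e.g. "FLOAT" */
    char name[64];    /* e.g. "foo"   */
    char value[128];  /* e.g. "1 2 1", kept as raw text */
};

static const char *skip_ws(const char *s) {
    while (*s && isspace((unsigned char)*s)) s++;
    return s;
}

/*
 * Split "TYPE: name value..." into its three parts.
 * Returns 0 on success, -1 if the colon or the name is missing or a
 * part does not fit its buffer.
 */
static int parse_decl(const char *line, struct asm_decl *d) {
    const char *colon, *p, *end;
    size_t len;

    line  = skip_ws(line);
    colon = strchr(line, ':');
    if (!colon) return -1;

    /* typename: text before the colon, trailing whitespace trimmed */
    end = colon;
    while (end > line && isspace((unsigned char)end[-1])) end--;
    len = (size_t)(end - line);
    if (len == 0 || len >= sizeof d->type) return -1;
    memcpy(d->type, line, len);
    d->type[len] = '\0';

    /* name: first whitespace-delimited word after the colon */
    p   = skip_ws(colon + 1);
    end = p;
    while (*end && !isspace((unsigned char)*end)) end++;
    len = (size_t)(end - p);
    if (len == 0 || len >= sizeof d->name) return -1;
    memcpy(d->name, p, len);
    d->name[len] = '\0';

    /* value: the remainder of the line, trailing whitespace trimmed */
    p   = skip_ws(end);
    len = strlen(p);
    while (len && isspace((unsigned char)p[len - 1])) len--;
    if (len >= sizeof d->value) return -1;
    memcpy(d->value, p, len);
    d->value[len] = '\0';
    return 0;
}
```

Under these assumptions, `VECTOR: bar 1 2 1` yields the type `VECTOR`, the
name `bar`, and the value text `1 2 1`; a line without a colon is rejected.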
		
Comments:
	Comments start with either # or ; and make the assembler ignore that
	line. A comment must be on a line of its own; it cannot follow assembly
	on the same line.
	
	Examples:
		; this is allowed
		# as is this
		FLOAT: foo 1 ; this is not allowed
		FLOAT: bar 2 # neither is this
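
A line-oriented reader could apply this rule with a check like the one below.
This is a sketch under the assumption that leading whitespace before the
comment character is tolerated; `is_comment_or_blank` is a hypothetical name,
not a gmqcc function.

```c
#include <ctype.h>

/*
 * Hypothetical helper: returns nonzero if the line can be skipped,
 * i.e. it is blank or its first non-whitespace character is '#' or ';'.
 * A comment character later in the line is NOT treated as a comment,
 * matching the rule that comments cannot follow assembly.
 */
static int is_comment_or_blank(const char *line) {
    while (*line && isspace((unsigned char)*line))
        line++;
    return *line == '\0' || *line == '#' || *line == ';';
}
```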
	
Functions:
	Creating a function is the same as declaring a constant: use FUNCTION
	followed by a colon and the name (white space doesn't matter), then
	start the statements for that function on the next line.
	
	Examples:
		FLOAT: foo 1
		FLOAT: bar 2
		FUNCTION: test1
			ADD foo, bar, OFS_RETURN
			RETURN
			
		FUNCTION: test2
			CALL0 test1
			DONE
			
Internal:
	The Quake engine provides some internal functions such as print. To
	access them you must first declare them and their names. To do this,
	create a FUNCTION as usual and add a $ followed by the number of the
	engine builtin (negated).
	
	Examples:
		FUNCTION: print $4
		FUNCTION: error $3
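
The `$N` token could be read with a small helper like the sketch below. It
assumes that "negated" means the assembler stores the builtin number as a
negative value (so `$4` becomes -4); that reading is an assumption, and
`parse_builtin` is a hypothetical name rather than a gmqcc function.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a builtin token such as "$4".
 * On success stores the negated number (-4 here, an assumed storage
 * convention) in *out and returns 0; returns -1 for anything else.
 */
static int parse_builtin(const char *tok, int *out) {
    char *end;
    long  n;

    /* require '$' immediately followed by a digit (no sign, no space) */
    if (tok[0] != '$' || tok[1] < '0' || tok[1] > '9')
        return -1;

    errno = 0;
    n = strtol(tok + 1, &end, 10);
    if (errno || *end != '\0' || n <= 0 || n > INT_MAX)
        return -1;

    *out = (int)-n;
    return 0;
}
```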

Misc:
	There are some rules for the identifiers of functions and constants.
	Identifiers must not begin with a numeric digit; they cannot include
	spaces or tabs; they cannot contain symbols; and they cannot exceed
	32768 characters. Identifiers cannot be all capitals either, as
	all-capital identifiers are reserved by the assembler.
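
These identifier rules can be sketched as a single check in C. The sketch
reads "symbols" literally as anything that is not a letter or digit (so an
underscore is rejected), which is an assumption; `ident_valid` and
`ASM_IDENT_MAX` are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>

#define ASM_IDENT_MAX 32768  /* length limit stated above */

/*
 * Hypothetical helper: returns 1 if `s` is usable as an identifier
 * under the rules above, 0 otherwise.
 */
static int ident_valid(const char *s) {
    size_t len = 0;
    int    has_lower = 0;

    /* must be non-empty and must not begin with a digit */
    if (!*s || isdigit((unsigned char)*s))
        return 0;

    for (; *s; s++) {
        /* no spaces, tabs, or symbols: letters and digits only */
        if (!isalnum((unsigned char)*s))
            return 0;
        if (islower((unsigned char)*s))
            has_lower = 1;
        if (++len > ASM_IDENT_MAX)
            return 0;
    }
    /* all-capital identifiers are reserved by the assembler */
    return has_lower;
}
```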
	
	Numeric constants cannot use special notation such as `1e-10`; they must
	be plain numbers, although they can contain decimal points and signs
	(+, -).
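
As a sketch of the numeric rule, the check below accepts an optional leading
sign, digits, and at most one decimal point, and rejects everything else
(including exponents). Allowing only one sign, only at the start, and only one
decimal point is an assumption; `number_valid` is a hypothetical name.

```c
/*
 * Hypothetical helper: returns 1 if `s` is a plain numeric constant
 * (optional sign, digits, at most one '.'), 0 otherwise.
 */
static int number_valid(const char *s) {
    int digits = 0, dot = 0;

    if (*s == '+' || *s == '-')
        s++;
    for (; *s; s++) {
        if (*s >= '0' && *s <= '9')
            digits++;
        else if (*s == '.' && !dot)
            dot = 1;
        else
            return 0;          /* exponent, second '.', or other junk */
    }
    return digits > 0;         /* a lone sign or '.' is not a number */
}
```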
	
	Constants cannot be assigned the values of other constants; their value
	must be fully expressed at the point of declaration.
	
	No two identifiers can have the same name. This also applies to variables
	allocated inside a function scope (even though they are considered local).
	
	There is one other keyword, AUTHOR, that is pure sugar. It lets you
	specify the author(s) of the assembly being assembled. The string given
	for each use of AUTHOR is written to the end of the string table. As
	with constants and functions, the AUTHOR keyword must be followed by
	a colon.
	
	Examples:
		AUTHOR: "Dale Weiler"
		AUTHOR: "Wolfgang Bumiller"
		
	Colons exist for the sole reason of not having to use spaces after a
	keyword (spaces are still allowed). To illustrate, the following
	examples are equivalent.
	
	Example 1:
		FLOAT:foo 1
	Example 2:
		FLOAT: foo 1
	Example 3:
		FLOAT:  foo 1
		
	A variable amount of whitespace is allowed anywhere (as it should be).
	Think of `:` as a delimiter (which is what it's used for during assembly).
	
////////////////////////////////////////////////////////////////////////
/////////////////////// Quake C Documentation //////////////////////////
////////////////////////////////////////////////////////////////////////
TODO ....