Following GNU convention, each language in GCC that has its own front end should have a subdirectory of its own. So the first thing GCC expects from us is a separate directory. Let us create a new subdirectory under srcdir/gcc named after our language (say, 'demo').
As explained before, GCC expects a number of files and functions, and some of those files must be present in our directory. The first is a makefile. It is split into two parts, Make-lang.in and Makefile.in. Both are part of the main makefile of GCC.
Make-lang.in is included in the main makefile of GCC, which runs in objdir/gcc and from there calls make in the language directory. File names must therefore be given with full path names; do not use relative ones. This file describes the source files of our language, so it is here that we specify the files our language requires.
Makefile.in is used to create the makefile in the language directory. It is kept in the language directory.
The third file GCC expects is config-lang.in. It is used by srcdir/gcc/configure (a shell script).
The fourth file GCC expects is lang-specs.h. This file modifies the GCC driver program, which has to be told about our new language. Details such as the file extensions of our language are given here, and you can choose them to suit your taste.
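To make the role of lang-specs.h more concrete, here is a minimal sketch of the idea behind it: a table that maps a file extension to the compiler program that handles it. This is a simplified model, not the real lang-specs.h syntax. The struct, the function demo_compiler_for, the extension .demo and the compiler name demo1 are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a driver table entry:
   a file suffix and the compiler program that handles it. */
struct demo_spec {
    const char *suffix;
    const char *compiler;
};

static const struct demo_spec demo_specs[] = {
    { ".c",    "cc1"   },
    { ".demo", "demo1" },  /* assumed extension and compiler name */
};

/* Return the compiler name for FILENAME, or NULL when the suffix
   is missing or unknown. */
const char *demo_compiler_for(const char *filename)
{
    const char *dot = strrchr(filename, '.');
    size_t i;

    if (dot == NULL)
        return NULL;
    for (i = 0; i < sizeof demo_specs / sizeof demo_specs[0]; i++)
        if (strcmp(dot, demo_specs[i].suffix) == 0)
            return demo_specs[i].compiler;
    return NULL;
}
```

With this table, demo_compiler_for("hello.demo") yields "demo1", while a file with an unknown suffix yields NULL. The real driver does much more (it also builds the command line for the chosen compiler), so copy an existing lang-specs.h rather than starting from this model.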
Now the most important point: no one expects you to create all these files from scratch. The best method is to copy them from an existing language directory and make the relevant changes, such as the name of the language, the files we use, and the extensions of our choice.
All the information provided here comes from my own observation, so there may be variations in the details above.
As already explained, we are going to use a major part of GCC to compile our language. It is therefore our responsibility to define some functions and variables that the back end uses. Most of them are of no direct help to us, but the back end expects them, so we must provide them.
As with the files above, it is better to copy these from an existing front end. But let us get a general idea of what each function does. If you find this section boring you can skip it, but do not skip including these routines in your program.
type_for_size(unsigned precision, int unsignedp): Returns a tree for an integer type whose number of bits is given by the argument precision. If unsignedp is nonzero the type is unsigned; otherwise it is signed.
init_parse(char *filename): Initializes parsing.
finish_parse(): Does the parsing cleanup.
lang_init_options(): Language-specific initialization of option processing.
lang_print_xnode(FILE *file, tree t, int i): Required by GCC. I don't know exactly what it does.
type_for_mode(enum machine_mode mode, int unsignedp): Returns a tree type for the requested mode. mode represents a machine data type, such as an integer mode. As usual, a nonzero unsignedp requests an unsigned type; otherwise a signed type is returned.
unsigned_type(tree type_node): Returns the unsigned version of type_node.
signed_type(tree type_node): Returns the signed version of type_node.
signed_or_unsigned_type(int unsignedp, tree type): Returns the signed or unsigned version of type, depending on unsignedp.
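The type helpers above all follow the same pattern: pick a type node by size and signedness. The following is a minimal sketch of that pattern using a toy type table instead of real GCC tree nodes. The struct toy_type, the toy_ function names and the 8/16/32-bit sizes are assumptions for illustration; a real front end works with tree nodes and the sizes of the target machine.

```c
#include <stddef.h>

/* Toy stand-in for an integer type node (not a GCC tree). */
struct toy_type {
    const char *name;
    unsigned precision;   /* width in bits */
    int is_unsigned;
};

/* Assumed sizes; real targets may differ. */
static struct toy_type toy_types[] = {
    { "signed char",    8,  0 }, { "unsigned char",  8,  1 },
    { "short",          16, 0 }, { "unsigned short", 16, 1 },
    { "int",            32, 0 }, { "unsigned int",   32, 1 },
};

/* Model of type_for_size: find a type with the given width and
   signedness, or return NULL if the table has none. */
struct toy_type *toy_type_for_size(unsigned precision, int unsignedp)
{
    size_t i;

    for (i = 0; i < sizeof toy_types / sizeof toy_types[0]; i++)
        if (toy_types[i].precision == precision
            && (toy_types[i].is_unsigned != 0) == (unsignedp != 0))
            return &toy_types[i];
    return NULL;
}

/* Model of signed_or_unsigned_type: same width, requested signedness. */
struct toy_type *toy_signed_or_unsigned_type(int unsignedp,
                                             struct toy_type *type)
{
    if (type == NULL)
        return NULL;
    if ((type->is_unsigned != 0) == (unsignedp != 0))
        return type;
    return toy_type_for_size(type->precision, unsignedp);
}

/* Models of unsigned_type and signed_type. */
struct toy_type *toy_unsigned_type(struct toy_type *type)
{
    return toy_signed_or_unsigned_type(1, type);
}

struct toy_type *toy_signed_type(struct toy_type *type)
{
    return toy_signed_or_unsigned_type(0, type);
}
```

In this sketch unsigned_type and signed_type are just thin wrappers around signed_or_unsigned_type, which is one plausible way to avoid writing the lookup three times.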
global_bindings_p(): Returns nonzero if we are currently in the global binding level.
getdecls(): Returns the list of declarations in the current level, in reverse order.
kept_level_p(): Returns nonzero when a 'BLOCK' must be created for the current level of the symbol table.
pushlevel(int ignore): Enters a new binding level. A symbol name already used in an outer binding level is hidden when it is entered again in the new level.
poplevel(int keep, int reverse, int functionbody): Exits the level most recently entered by pushlevel. The symbol table returns to the state it had before that pushlevel.
insert_block(tree block): Inserts block at the end of the list of subblocks of the current binding level.
set_block(tree block): Sets the block for the current scope.
pushdecl(tree decl): Inserts the declaration decl into the symbol table and returns the tree.
init_decl_processing(): Initializes the symbol table. It sets global variables and inserts other variables into the symbol table.
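The binding-level functions above amount to a stack of scopes. Here is a minimal sketch of that idea using toy structures rather than GCC trees. All names prefixed with toy_ are hypothetical, the argument lists are simplified, and toy_lookup is added only to show how an inner declaration hides an outer one.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy declaration and binding level (not GCC trees). */
struct toy_decl  { const char *name; struct toy_decl *chain; };
struct toy_level { struct toy_decl *names; struct toy_level *outer; };

static struct toy_level toy_global_level = { NULL, NULL };
static struct toy_level *toy_current_level = &toy_global_level;

/* Model of global_bindings_p. */
int toy_global_bindings_p(void)
{
    return toy_current_level == &toy_global_level;
}

/* Model of pushlevel: enter a new, empty binding level. */
void toy_pushlevel(void)
{
    struct toy_level *level = malloc(sizeof *level);

    if (level == NULL)
        abort();
    level->names = NULL;
    level->outer = toy_current_level;
    toy_current_level = level;
}

/* Model of poplevel: discard the current level and its declarations.
   The global level is never popped. */
void toy_poplevel(void)
{
    struct toy_level *level = toy_current_level;

    if (level == &toy_global_level)
        return;
    while (level->names != NULL) {
        struct toy_decl *decl = level->names;
        level->names = decl->chain;
        free(decl);
    }
    toy_current_level = level->outer;
    free(level);
}

/* Model of pushdecl: add NAME to the front of the current level. */
struct toy_decl *toy_pushdecl(const char *name)
{
    struct toy_decl *decl = malloc(sizeof *decl);

    if (decl == NULL)
        abort();
    decl->name = name;
    decl->chain = toy_current_level->names;
    toy_current_level->names = decl;
    return decl;
}

/* Model of getdecls: newest declaration first, i.e. reverse order. */
struct toy_decl *toy_getdecls(void)
{
    return toy_current_level->names;
}

/* Hypothetical helper: search from the innermost level outward. */
struct toy_decl *toy_lookup(const char *name)
{
    struct toy_level *level;
    struct toy_decl *decl;

    for (level = toy_current_level; level != NULL; level = level->outer)
        for (decl = level->names; decl != NULL; decl = decl->chain)
            if (strcmp(decl->name, name) == 0)
                return decl;
    return NULL;
}
```

Because each declaration is pushed onto the front of the list, toy_getdecls naturally returns them in reverse order of declaration, which matches the description of getdecls above. Declarations pushed at the global level are never freed in this sketch.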
lang_decode_option(int a, char **p): Decodes all language-specific options that cannot be decoded by GCC. Returns 1 if successful, otherwise 0.
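As an illustration of the contract described for lang_decode_option, here is a minimal sketch that recognizes one made-up option and rejects everything else. The function name demo_decode_option, the variable demo_flag_verbose and the option -fdemo-verbose are assumptions for illustration, not options that GCC defines.

```c
#include <string.h>

/* Hypothetical language-specific flag set by the option below. */
int demo_flag_verbose = 0;

/* Model of lang_decode_option: look at the first string in ARGV.
   Return 1 if it is one of our options, 0 if we do not know it. */
int demo_decode_option(int argc, char **argv)
{
    if (argc < 1 || argv == NULL || argv[0] == NULL)
        return 0;

    if (strcmp(argv[0], "-fdemo-verbose") == 0) {
        demo_flag_verbose = 1;
        return 1;
    }
    if (strcmp(argv[0], "-fno-demo-verbose") == 0) {
        demo_flag_verbose = 0;
        return 1;
    }
    return 0;   /* not ours */
}
```

Returning 0 for an unknown option matters: it tells the caller that the front end did not recognize the option, so it can be reported as an error instead of being silently accepted.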
lang_init(): Performs all the initialization steps required by the front end, including setting certain global variables.
lang_finish(): Performs all front-end-specific cleanup.
lang_identify(): Returns a short string identifying the language to the debugger.
maybe_build_cleanup(tree decl): Creates a tree node that represents an automatic cleanup action.
incomplete_type_error(tree value, tree type): Prints an error message for invalid use of an incomplete type.
truthvalue_conversion(tree expr): Returns the same expr, but in a type that represents truth values.
mark_addressable(tree expr): Marks expr as a construct that needs an address in storage.
print_lang_statistics(): Prints any language-specific compilation statistics.
copy_lang_decl(tree node): Copies the language-specific part of the declaration if DECL_LANG_SPECIFIC is nonzero.
print_lang_decl(FILE *file, tree node, int indent): Outputs the declaration for node to file, with indentation depth indent.
print_lang_type(FILE *file, tree node, int indent): Outputs the type for node to file, with indentation depth indent.
print_lang_identifier(FILE *file, tree node, int indent): Outputs the identifier for node to file, with indentation depth indent.
init_lex(): Performs whatever initialization steps are required by the language-dependent lexical analyzer.
set_yydebug(): Sets some debug flags for the parser.
yyerror(char *s): Routine to print a parse error message.
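For instance, an error routine in the spirit of yyerror could be sketched as follows. The names demo_yyerror, demo_lineno and demo_errorcount are hypothetical, and the message format is only an assumption; a real front end would normally report errors through GCC's own error routines.

```c
#include <stdio.h>

int demo_lineno = 1;       /* hypothetical: current input line, kept by the lexer */
int demo_errorcount = 0;   /* hypothetical: number of errors reported so far */

/* Print a parse error message with the current line number
   and remember that an error happened. */
void demo_yyerror(const char *s)
{
    fprintf(stderr, "demo:%d: %s\n", demo_lineno, s);
    demo_errorcount++;
}
```

Counting the errors lets the rest of the front end decide, after parsing, whether it makes sense to go on to code generation.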
language_string: A character string holding the name of our language, say demo.
flag_traditional: A variable needed by the file dwarfout.c.
error_mark_node: A tree node used to represent errors. It stands for a partial tree and is of great help when errors occur in the syntax analysis phase.
integer_type_node, char_type_node, void_type_node: Type nodes whose meaning is clear from their names.
integer_zero_node, integer_one_node: Constants of type integer_type_node with values 0 and 1 respectively.
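To show why a node like error_mark_node helps during syntax analysis, here is a minimal sketch with toy nodes instead of GCC trees. The idea is that a tree-building routine which receives the error node simply hands it back, so one mistake in the source does not trigger a chain of follow-up errors. The struct toy_node and all toy_ names are assumptions for illustration.

```c
#include <stddef.h>

/* Toy expression node (not a GCC tree). */
struct toy_node { int is_error; int value; };

static struct toy_node toy_error_mark = { 1, 0 };
static struct toy_node toy_zero       = { 0, 0 };
static struct toy_node toy_one        = { 0, 1 };

/* Toy counterparts of error_mark_node, integer_zero_node
   and integer_one_node. */
struct toy_node *toy_error_mark_node   = &toy_error_mark;
struct toy_node *toy_integer_zero_node = &toy_zero;
struct toy_node *toy_integer_one_node  = &toy_one;

/* Build the sum of A and B into RESULT. If either operand is the
   error node, return the error node and leave RESULT untouched. */
struct toy_node *toy_build_add(struct toy_node *a, struct toy_node *b,
                               struct toy_node *result)
{
    if (a == toy_error_mark_node || b == toy_error_mark_node)
        return toy_error_mark_node;

    result->is_error = 0;
    result->value = a->value + b->value;
    return result;
}
```

A parser action can then pass whatever it gets to the next building routine without checking for errors at every step; the error node travels up the partial tree on its own.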