
C - Compilation Process



C is a compiled language. Compiled languages generally execute faster than interpreted languages. Several compilers can be used to compile a C program, including GCC, Clang, and MSVC. In this chapter, we shall learn how a C program is compiled with the GCC compiler.

What is a Compilation?

Computers execute binary instructions. A sequence of such instructions, made up of 1 and 0 bits, is called machine code. Higher-level languages such as C, C++, and Java, on the other hand, use keywords that are close to human languages such as English. Hence, a program written in C (or any other high-level language) needs to be converted into its equivalent machine code. This process is called compilation.

Note that machine code is specific to the hardware architecture and the operating system. In other words, the machine code of a C program compiled for a computer running Windows is not compatible with a computer running Linux. Hence, we must use a compiler that targets the intended architecture and OS.


C Compilation Process Steps

In this tutorial, we will be using gcc (which stands for GNU Compiler Collection). It is part of the GNU Project, a free software project started by Richard Stallman that gives developers access to powerful tools at no cost.

The gcc compiler supports various programming languages, including C. To use it, we should install a version that is compatible with the target computer.

The compilation process has four different steps −

  • Preprocessing
  • Compiling
  • Assembling
  • Linking

The following diagram illustrates the compilation process −

[Figure: Compilation Process]

To understand this process, let us consider the following source code in the C language (main.c) −

Example

main.c

#include <stdio.h>

int main() {
   /* my first program in C */
   printf("Hello, World! \n");

   return 0;
}

Output

Hello, World!

The ".c" file extension conventionally indicates that the file contains C source code. The first line is the preprocessor directive #include, which tells the preprocessor to insert the contents of the "stdio.h" header file. The text between /* and */ is a comment; comments are useful for remembering what your code does months after you wrote it.

The entry point of the program is the main() function. The program starts by executing the statements inside this function’s block, which is delimited by curly brackets. Here, there are only two statements: one prints the sentence "Hello, World!" on the terminal, and the other returns 0 to the operating system to indicate that the program ended correctly. So once we have compiled the program, running it only displays the phrase "Hello, World!".

What Goes Inside the C Compilation Process?

To turn our main.c code into an executable, we enter the command "gcc main.c", and the compilation goes through all four of the steps described below.

Step 1: Preprocessing

The preprocessor performs the following actions −

  • it removes all the comments in the source file(s)

  • it includes the code of the header file(s); a header is a file with the extension .h that contains C function declarations and macro definitions

  • it replaces all of the macros (fragments of code which have been given a name) with their definitions

The output of this step is preprocessed source code, which conventionally has a ".i" extension, so here it would be main.i.

To stop the compilation right after this step, we can use the "-E" option with the gcc command. By default, gcc then prints the preprocessed code to the terminal; adding "-o main.i" saves it to a file instead.

gcc -E main.c -o main.i

Step 2: Compiling

The compiler translates the preprocessed file into assembly code for the target machine, which is stored in a ".s" file. Internally, the compiler works on an intermediate representation (IR) of the program before it emits the assembly code.

We can stop after this step with the "-S" option on the gcc command.

gcc -S main.c

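This is what the generated assembly file looks like with GCC 8.1.0 (MinGW-W64) on Windows. The .file line shows that the source file was named helloworld.c when this listing was produced, and the exact contents vary with the compiler version and platform. Note that the compiler has replaced the printf() call with a call to puts() −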

.file	"helloworld.c"
	.text
	.def	__main;	.scl	2;	.type	32;	.endef
	.section .rdata,"dr"
.LC0:
	.ascii "Hello, World! \0"
	.text
	.globl	main
	.def	main;	.scl	2;	.type	32;	.endef
	.seh_proc	main
main:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$32, %rsp
	.seh_stackalloc	32
	.seh_endprologue
	call	__main
	leaq	.LC0(%rip), %rcx
	call	puts
	movl	$0, %eax
	addq	$32, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.ident	"GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0"
	.def	puts;	.scl	2;	.type	32;	.endef

Step 3: Assembling

The assembler takes the assembly code and transforms it into object code, that is, code in machine language (i.e. binary). This produces a file ending in ".o".

We can stop the compilation process after this step by using the "-c" option with the gcc command, for example "gcc -c main.c".

The main.o file is not a text file, so its contents are not readable when it is opened with a text editor.

Step 4: Linking

The linker creates the final executable, in binary. It links the object code of all the source files together and resolves calls to library functions such as printf(). The linker knows where to look for the function definitions in static libraries or dynamic libraries. With a static library, the linker copies the code of the library functions that are used into the executable file. With a dynamic library, the code is not copied; only a reference to the library (such as its name) is recorded in the binary file, and the library is loaded when the program runs.

By default, after this fourth and last step, that is, when you type the whole "gcc main.c" command without any options, the compiler creates an executable program called a.out (or a.exe on Windows) that we can run from the command line.

We can also give the executable program a name of our choice by adding the "-o" option to the gcc command, followed by the desired name:

gcc main.c -o hello.out

So now we can type "./a.out" if we did not use the -o option, or "./hello.out" if we did, to execute the compiled code. The output will be "Hello, World!", after which the shell prompt appears again.
