C - Compilation Process
C is a compiled language. Compiled programs generally run faster than interpreted ones because the translation to machine code happens ahead of time. Several compilers can build a C program, including GCC, Clang, and MSVC. This chapter walks through how a C program is compiled with GCC.
What is Compilation?
A processor can execute only binary instructions. A sequence of binary instructions, made up of 1 and 0 bits, is called machine code. Higher-level languages such as C, C++, and Java, on the other hand, use keywords that are closer to human languages such as English. Hence, a program written in C (or any other high-level language) has to be converted into equivalent machine code. This process is called compilation.
Note that machine code is specific to a hardware architecture and an operating system. For example, an executable compiled for Windows will not run on Linux, even on the same processor, because the two systems use different executable formats and system interfaces. Hence, we must use a compiler that targets the intended platform.
C Compilation Process Steps
In this tutorial, we will use GCC, the GNU Compiler Collection. It is part of the GNU Project, a free software project started by Richard Stallman that gives developers access to powerful tools at no cost.
GCC supports several programming languages, including C. To use it, install a version that is compatible with the target computer.
The compilation process has four different steps −
- Preprocessing
- Compiling
- Assembling
- Linking
To understand this process, let us consider the following C source file (main.c) −
Example
main.c
#include <stdio.h>

int main() {
    /* my first program in C */
    printf("Hello, World! \n");
    return 0;
}
Output
Hello, World!
The ".c" file extension conventionally indicates a C source file. The first line is the preprocessor directive #include, which tells the preprocessor to insert the contents of the "stdio.h" header file. The text between /* and */ is a comment; comments are useful for remembering what your code does months after you wrote it.
The entry point of the program is the main() function: execution starts with the statements inside this function's block, which is delimited by curly brackets. Here, there are only two statements. The first prints the sentence "Hello, World!" on the terminal, and the second returns 0 to tell the operating system that the program ended successfully. So when we compile and run this program, we see only the phrase "Hello, World!".
What Goes Inside the C Compilation Process?
To turn main.c into an executable, we enter the command "gcc main.c". GCC then runs all four steps of the compilation process in sequence.
Step 1: Preprocessing
The preprocessor performs the following actions −
- It removes all the comments in the source file(s).
- It includes the code of the header file(s). A header file has the extension .h and contains C function declarations and macro definitions.
- It replaces every macro (a fragment of code that has been given a name) with its definition.
Preprocessed C source conventionally uses the ".i" extension. To stop the compilation right after this step, use the "-E" option with the gcc command on the source file. By default, gcc then prints the preprocessed source to standard output; to store it in main.i, add "-o main.i" to the command.
gcc -E main.c
Step 2: Compiling
The compiler translates the preprocessed source into assembly code for the target architecture and writes it to a ".s" file. Internally, the compiler works through an intermediate representation (IR) of the program before it emits the assembly.
We can stop after this step by using the "-S" option with the gcc command.
gcc -S main.c
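The exact contents depend on the compiler version and the target platform. With MinGW-w64 GCC 8.1.0 on Windows, the generated assembly file looks like this −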
    .file "helloworld.c"
    .text
    .def __main; .scl 2; .type 32; .endef
    .section .rdata,"dr"
.LC0:
    .ascii "Hello, World! \0"
    .text
    .globl main
    .def main; .scl 2; .type 32; .endef
    .seh_proc main
main:
    pushq %rbp
    .seh_pushreg %rbp
    movq %rsp, %rbp
    .seh_setframe %rbp, 0
    subq $32, %rsp
    .seh_stackalloc 32
    .seh_endprologue
    call __main
    leaq .LC0(%rip), %rcx
    call puts
    movl $0, %eax
    addq $32, %rsp
    popq %rbp
    ret
    .seh_endproc
    .ident "GCC: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0"
    .def puts; .scl 2; .type 32; .endef
Step 3: Assembling
The assembler takes the assembly code (the ".s" file) and transforms it into object code, that is, code in machine language (binary). This produces a file ending in ".o".
We can stop the compilation process after this step by using the "-c" option with the gcc command.
gcc -c main.c
The main.o file is a binary file, not a text file, so its contents are not human-readable when opened in a text editor.
Step 4: Linking
The linker creates the final executable, in binary. It combines the object code of all the source files and resolves calls to functions whose definitions live in static or dynamic libraries. With a static library, the linker copies every library function the program uses into the executable file. With a dynamic library, the code is not copied; only the name of the library is recorded in the executable, and the library itself is loaded when the program runs.
By default, after this fourth and last step, that is, when you type the whole "gcc main.c" command without any options, the compiler creates an executable program called a.out (or a.exe on Windows) that we can run from the command line.
We can also give the executable a name of our choice by adding the "-o" option to the gcc command, followed by the name we want:
gcc main.c -o hello.out
So now we can type "./a.out" if we did not use the -o option, or "./hello.out" if we did, to execute the compiled code. The output will be "Hello, World!", after which the shell prompt appears again.