Introduction to C language
Marc-Olivier Buob (Nokia Bell Labs)
LINCS Tools Trick and Tips, April 2024
Outline¶
- Introduction
- Hello world (man, gcc, gdb)
- Precompiler (#define, #ifdef, #include)
- Base types (char, int, float, double) and qualifiers (long, unsigned, const, static)
- Operators (arithmetic, logical, bitwise)
- Basic I/O (printf, scanf)
- Tests (if else, switch case) and loops (while, do while, for)
- Memory (stack, heap, malloc, free)
- Advanced types (struct, union, typedef)
- File descriptors (fopen, fclose)
- Python and C/C++ (poetry)
Introduction¶
- What is C?
- Why C?
What is C?¶
Programming language created by Dennis Ritchie in the early 1970s
- Compiled
- Widely used
- Very efficient
- OS, drivers, and protocol stacks
- Less used for applications
Why C?¶
The C language is significantly more (energy) efficient than Python
Hello world¶
- Source file (.c)
- Source structure
- Compilation (gcc)
- Under the hood
- Documentation (man)
- Debugging (gdb)
Source file (.c)¶
Let's write hello.c:
%%file hello.c
#include <stdio.h>
int main() {
printf("Hello world");
return 0;
}
Overwriting hello.c
Source structure¶
- A source file .c is a list of declarations and functions.
  - Instructions are always inside a function.
  - Declarations introduce functions and types (and global variables, but don't do this).
- The main function is mandatory to code a runnable.
  - It defines where the program starts.
  - It returns 0 if everything is fine, another value otherwise (called error code).
- The instructions (of any function):
  - are wrapped in a scope {...}
  - end with ;
  - may be split in sub-instructions using ,
Compilation (gcc)¶
- A compiler transforms a source code to a binary (e.g., a runnable).
- There are many C compilers. Here, we will consider gcc (the GNU C compiler).
- Let's compile hello.c to produce the runnable hello:
!gcc hello.c -o hello
Now, let's execute hello:
!./hello
Hello world
Under the hood¶
- Precompilation: resolves every directive starting with #
- Compilation: converts each .c source file to a corresponding .o binary object
- Linkage: gathers every .o to build the target binary (runnable, library, kernel module, etc.)
Common extensions
Binary | Windows | GNU/Linux |
---|---|---|
Object | .obj (MSVC) or .o (MinGW gcc) | .o |
Runnable | .exe | No extension |
Static library | .lib | .a |
Dynamic library | .dll | .so |
Kernel module | N/A | .ko |
man¶
- Under Linux, man provides the documentation about shell commands (section 1), C functions (sections 2, 3), etc.
- man is useful to understand how to include and call a C function. Do not forget to indicate the man section if the name exists both as a shell command and as a C function (e.g., printf or write), as in the commands below.
- If you don't use Linux, just do this search on the Internet, the manpages are also publicly available ;)
man 3 printf
man 2 write
gdb¶
- Under Linux, gdb is the C debugger used in a console. IDEs (like kdevelop, anjuta) wrap gdb in a Graphical User Interface (GUI).
gcc -g hello.c -o hello
gdb hello
Shortcut | Meaning |
---|---|
b main | Breakpoint on main function |
b 18 | Breakpoint line 18 |
delete | Delete all breakpoints |
r | Run (until breakpoint) |
c | Continue (until breakpoint) |
n | Next (steps over function calls) |
s | Step (steps into function calls) |
p x | Print value of x |
Precompiler¶
- Precompiler instructions
  - Module inclusion: #include ...
  - Constants and macros: #define
  - Guards: #ifdef/#ifndef ... #else ... #endif
- Compilation in real-life
- Project module example
#include¶
- Two kinds of inclusions:
  - #include <...>: the header is searched in the standard directories (typically standard headers). You can add other directories using gcc -I.
  - #include "...": the header is first searched relative to the directory of the current file (typically your own headers).
- #include copies and pastes a file into the current file.
Example:
- Under Linux, #include <stdio.h> is replaced by the content of /usr/include/stdio.h.
- This file declares many functions, including printf.
#define (constant)¶
- #define substitutes a symbol (here, GREETINGS) by an arbitrary text (here, printf("Hello world")).
  - Note this text could be C code.
%%file hello2.c
#include <stdio.h>
#define GREETINGS printf("Hello world")
int main() {
GREETINGS;
return 0;
}
Overwriting hello2.c
!gcc hello2.c -o hello2
!./hello2
Hello world
#define (macro)¶
#define can be parametrized by one or more arguments, used to perform the substitution.
%%file macro.c
#include <stdio.h>
#define REPEAT_TWICE(a) a "-" a
int main() {
printf("1) " "Hello" "-" "Hello" "\n");
printf("2) " REPEAT_TWICE("Hello") "\n");
return 0;
}
Overwriting macro.c
!gcc macro.c -o macro
!./macro
1) Hello-Hello
2) Hello-Hello
#ifdef ... #else ... #endif¶
- #ifdef (resp. #ifndef) checks whether a symbol is defined (resp. not defined).
- One can define a symbol outside of the program through gcc -D:
%%file hello3.c
#include <stdio.h>
#ifdef FRENCH
# define MESSAGE "Bonjour tout le monde"
#else
# define MESSAGE "Hello world"
#endif
int main() { printf(MESSAGE); return 0; }
Overwriting hello3.c
!gcc hello3.c -o hello3 && ./hello3
Hello world
!gcc hello3.c -o hello3 -DFRENCH && ./hello3
Bonjour tout le monde
Compilation in real-life¶
- Each module .h has a guard:
#ifndef MY_MODULE_H
#define MY_MODULE_H
// ...
#endif
- Compile each .c that has no main (gcc -c module.c):
gcc -c module1.c # produces module1.o
gcc -c module2.c # produces module2.o
- Compile the final binary (e.g. the runnable) and link it with the .o files and the required libraries, if any (e.g. libfoo.so, libbar.so, typically in /usr/lib):
gcc -o programme main.c module1.o module2.o -lfoo -lbar
Usually done through a Makefile (itself usually produced thanks to cmake).
Project module example¶
%%file a.h
#ifndef A_H
#define A_H
void a();
#endif
Overwriting a.h
%%file a.c
#include <stdio.h>
#include "a.h"
void a() {
printf("hello\n");
}
Overwriting a.c
%%file main.c
#include "a.h"
int main() {
a();
return 0;
}
Overwriting main.c
!gcc -c a.c
!gcc main.c a.o -o main
!./main
hello
Base types¶
- Types and type qualifiers
- Good practices
- sizeof
- Function qualifiers
Types¶
In C, every variable must be typed.
- Integral: char, int, size_t
  - bool is not defined (unless you #include <stdbool.h>)
- Floating point: float, double
A type may be qualified:
- unsigned: non-negative values only
- short, long, long long: fewer (short) or more (long, long long) bits, hence a smaller or larger range
- const: read-only value
- static: value preserved between each function call
Example: const unsigned long long int
Good practices¶
- Prefer double over float if you need accurate results.
- The size of an int depends on the platform (architecture and compiler). If you develop low-level code, use a type defined in <stdint.h>:
  - int16_t, uint16_t
  - int32_t, uint32_t
  - int64_t, uint64_t
- Avoid decrementing an unsigned value or subtracting unsigned values: an unsigned cannot be negative, so a result below zero wraps around to a large positive value!
sizeof¶
sizeof returns the size in bytes of a type or a variable.
%%file sizeof.c
#include <stdint.h>
#include <stdio.h>
int main() {
int a;
printf("sizeof(a) = %zu\n", sizeof(a));
printf("sizeof(int) = %zu\n", sizeof(int));
printf("sizeof(int16_t) = %zu\n", sizeof(int16_t));
printf("sizeof(int32_t) = %zu\n", sizeof(int32_t));
return 0;
}
Overwriting sizeof.c
!gcc sizeof.c -o sizeof
!./sizeof
sizeof(a) = 4
sizeof(int) = 4
sizeof(int16_t) = 2
sizeof(int32_t) = 4
Function qualifiers¶
static: the function is only visible inside its .c (local function)
static void detail() {
//...
}
void f() {
detail();
//...
}
inline: suggests that the compiler replace each call by the body of the function. Declared static inline, a (short) function can be implemented in a .h without causing multiple definitions when linking.
#ifndef MYMODULE_H
#define MYMODULE_H
#include <stdio.h>
static inline void hello() {
    printf("hello\n");
}
void f();
#endif
Operators¶
- Preliminaries
- Arithmetic operators
- Logical operators
- Comparison operators
- Bitwise operators
Preliminaries¶
In C:
- Operators only exist for base types and memory addresses.
- Operators cannot be defined or overloaded.
Arithmetic operators¶
Operation | Operator | In place operator |
---|---|---|
Addition | + | += |
Subtraction | - | -= |
Multiplication | * | *= |
Division | / | /= |
Modulo | % | %= |
Remarks:
- No power operator. See pow() provided by <math.h> (with gcc, link using -lm).
- i = i + 1, i += 1, i++, ++i increment i by 1
- i = i - 1, i -= 1, i--, --i decrement i by 1
Result type¶
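- The result type of an operation depends on both operands: if both are integers, the result is an integer (so / is the Euclidean division); if at least one is a floating point number, the other one is converted and the result is a floating point number.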
%%file division.c
#include <stdio.h>
int main() {
int a = 7, b = 2;
// Euclidean division
printf("%d = %d * %d + %d\n", a, b, a / b, a % b);
// Usual division
double x = 7;
printf("%lf = %d * %lf\n", x, b, x / b);
return 0;
}
Overwriting division.c
!gcc division.c -o division
!./division
7 = 2 * 3 + 1
7.000000 = 2 * 3.500000
Logical operators¶
Operation | Operator |
---|---|
Not | ! |
And | && |
Or | || |
Remarks:
- Only 0 (or NULL) is false, everything else is true.
- There is no logical xor operator: ^ is the bitwise xor.
- Lazy evaluation: the operands of a clause are evaluated from left to right, and only if they may still affect the result.
Lazy evaluations¶
%%file lazy.c
#include <stdbool.h>
#include <stdio.h>
bool is_true() {
printf("true");
return true;
}
bool is_false() {
printf("false");
return false;
}
int main() {
bool b = is_false() && is_true();
return 0;
}
Overwriting lazy.c
!gcc lazy.c -o lazy
!./lazy
false
Comparison operators¶
Operation | Operator |
---|---|
Equals | == |
Not equals | != |
Lower than | < |
Lower or equal | <= |
Greater than | > |
Greater or equal | >= |
Warning:
- To test whether i is in $[0, 5)$ you must write:
(0 <= i) && (i < 5)
- Indeed, if you write:
(0 <= i < 5)
... it means, if i = -1, the clause is evaluated as
((0 <= -1) < 5)
((0) < 5)
(1) // non-null, so the test succeeds although i is out of range
Bitwise operators¶
Operation | Operator | In place operator |
---|---|---|
Bitwise not | ~ | N/A (unary operator) |
Bitwise and | & | &= |
Bitwise or | | | |= |
Bitwise xor | ^ | ^= |
Left shift | << | <<= |
Right shift | >> | >>= |
Example: To extract the bit of index $7$ of x (indices start at $0$, by increasing weight):
(x & (1 << 7)) >> 7
Basic I/O¶
We restrict to I/O related to the standard input (read from console) and output (print to console).
- Preliminaries
- printf
- scanf
Preliminaries¶
- Writing a value to the standard output is done using printf, provided by <stdio.h>.
- The variables need to be formatted depending on their type.
- Use \n to start a new line.
- If a string is very long, you can write it on multiple lines as follows:
#include <stdio.h>
int main() {
printf(
"Hello, my name is Bond.\n"
"James Bond\n"
);
return 0;
}
printf¶
%%file printf.c
#include <stdio.h>
int main() {
int x = 7;
float y = 5.67;
double z = 5.67;
printf("1) x = %d\n", x);
printf("2) y = %f\n", y);
printf("3) z = %lf\n", z);
printf("4) x = %d y = %0.2f z = %0.3lf\n", x, y, z);
return 0;
}
Overwriting printf.c
!gcc -Wall printf.c -o printf
!./printf
1) x = 7
2) y = 5.670000
3) z = 5.670000
4) x = 7 y = 5.67 z = 5.670
scanf¶
- Reading a value from the standard input is done using scanf, also provided by <stdio.h>.
  - Variables must also be formatted.
  - For files, use fscanf.
- The syntax looks a bit weird, but I'll explain it later when presenting the pointers.
#include <stdio.h>
int main() {
int x;
printf("Integer? ");
scanf("%d", &x);
printf("x = %d\n", x);
return 0;
}
Remark: You should check the value returned by scanf (the number of values successfully read) to know whether the input is well-formed.
Tests¶
- if ... else if ... else
- switch ... case ... break
if ... else if ... else¶
A test is true if and only if it is evaluated to a non-null value.
#include <stdio.h>
int main() {
int x = 7;
if (x % 2 == 0) {
printf("Even\n");
} else {
printf("Odd\n");
} // Odd
return 0;
}
Remark: If an if (resp. else) block contains a single instruction, you may omit the {...}.
switch ... case ... break¶
- Only for int-like variables (including unsigned int, char, etc.)
- Do not forget to break to exit a case, otherwise the execution continues into the next case!
#include <stdio.h>
#define BLUE 1
#define RED 2
int main() {
int choice = 1;
switch(choice) {
case BLUE:
printf("Blue!");
break;
case RED:
printf("Red!");
break;
default:
printf("Invalid number!");
break;
} // "Blue!"
return 0;
}
Loops¶
- while(...) ...
- do ... while(...)
- for(...; ...; ...)
while(...) ...¶
The while loop repeats as long as its condition is true.
- Ensure the stop condition is eventually reached, otherwise the program is trapped in an endless loop.
int i = 0;
while (i < 10) {
printf("%d\n", i);
i++;
}
do ... while(...)¶
A do ... while loop runs its body once, then repeats it as long as its condition is true.
int i, ret;
do {
    printf("Enter an integer: ");
    ret = scanf("%d", &i);
} while (ret == 1); // Repeat as long as a valid integer is read
for(...; ...; ...)¶
A for loop is made of three instructions:
- Initialization (possibly involving several ,-separated sub-instructions)
- Test
- Post instruction (possibly involving several ,-separated sub-instructions)
Example 1:
int i;
for (i = 0; i < 10; i++) printf("i = %d\n", i);
Example 2:
Print the $10$ first values of the sequence $v$, where $v_0 = 2$ and $v_{n} = 3 \cdot v_{n-1}$:
int i, j;
for (i = 0, j = 2; i < 10; i++, j *= 3) {
printf("v_%d = %d\n", i, j);
}
Memory¶
- Unary operators & and *
- Pointers
- Address types and qualifiers
- Operators
- Strings
- Generic address (void *)
- Stack
  - Passage by copy
  - Passage by pointer
- Arrays
- Heap
  - Dynamic allocation
Unary operators & and *¶
- & gets the address of a (declared) variable. This address is called a pointer. Every pointer has the same size (e.g., 64 bits on 64-bit operating systems).
- * accesses what is at the address of the pointer. The data is of arbitrary size. This size is deduced from the pointer type.
#include <stdio.h>
int main() {
    int i = 7;
    int *pi = &i;
    printf("1) i = %d *pi = %d\n", i, *pi); // 1) i = 7 *pi = 7
    *pi = 8;
    printf("2) i = %d *pi = %d\n", i, *pi); // 2) i = 8 *pi = 8
    return 0;
}
1) i = 7 *pi = 7
2) i = 8 *pi = 8
Pointers of pointers (... of pointers)¶
#include <stdio.h>
int main() {
int i = 7;
int *pi = &i;
printf("1) i = %d *pi = %d\n", i, *pi);
*pi = 8;
printf("2) i = %d *pi = %d\n", i, *pi);
int **ppi = &pi;
**ppi = 9;
printf(
"3) i = %d pi = %p *pi = %d *ppi = %p **ppi = %d\n",
i, (void *) pi, *pi, (void *) *ppi, **ppi
);
return 0;
}
1) i = 7 *pi = 7
2) i = 8 *pi = 8
3) i = 9 pi = 0x7fffa58c4094 *pi = 9 *ppi = 0x7fffa58c4094 **ppi = 9
Address types and qualifiers¶
Type:
- T *: A pointer to a T variable.
- T **: A pointer to a T * pointer.
Qualifiers:
- const T *: The pointed data is constant.
- T * const: The pointer is constant, but not the pointed data (rarely used in practice).
- const T * const: The pointer and the pointed data are constant (rarely used in practice).
Operators¶
Assume T * p, where T is not void.
- p + i and p - i shift the address by i * sizeof(T) bytes.
- *p retrieves sizeof(T) bytes at address p.
- p[i] is a shorthand for *(p + i). In particular p[0] and *p are equivalent.
Applications:
- strings
- "arrays"
Strings¶
#include <stdio.h> // printf
#include <string.h> // strlen
int main() {
const char * s = "hello"; // A string literal is read-only and ends with '\0'
printf("%s\n", s);
for (size_t i = 0; i < strlen(s); i++) {
printf("%zu: %c %c %p + (%zu * %zu) = %p\n",
i, s[i], *(s + i),
(void *) s, sizeof(char), i, (void *) (s + i)
);
}
return 0;
}
hello
0: h h 0x55b415cd4008 + (1 * 0) = 0x55b415cd4008
1: e e 0x55b415cd4008 + (1 * 1) = 0x55b415cd4009
2: l l 0x55b415cd4008 + (1 * 2) = 0x55b415cd400a
3: l l 0x55b415cd4008 + (1 * 3) = 0x55b415cd400b
4: o o 0x55b415cd4008 + (1 * 4) = 0x55b415cd400c
Generic address¶
- A generic address is of type void *.
- As sizeof(void) is not defined, neither are +, -, * and [] on a void *. Among the operators above, only &p is well-defined.
- void **: Pointer to a void * pointer.
- A generic address can be cast to a more specific type.
Stack¶
Let's try to modify an argument in place:
#include <stdio.h>
void increment(int i) {
printf("2) increment: %d\n", i);
i++;
printf("3) increment: %d\n", i);
}
int main() {
int i = 7;
printf("1) main: %d\n", i);
increment(i);
printf("4) main: %d\n", i);
return 0;
}
1) main: 7
2) increment: 7
3) increment: 8
4) main: 7
Passage by copy¶
- C always passes arguments by copy.
- When calling a function, its arguments are copied onto the stack. So inside increment, we manipulate a copy of i.
#include <stdio.h>
void increment(int i) {
    i++;
    // %p is the format dedicated to addresses
    printf("2) increment: %d %p\n", i, (void *) &i);
}
int main() {
    int i = 7;
    printf("1) main: %d %p\n", i, (void *) &i);
    increment(i);
    printf("3) main: %d %p\n", i, (void *) &i);
    return 0;
}
1) main: 7 6700826c
2) increment: 8 6700824c
3) main: 7 6700826c
Passage by pointer¶
A pointer can be used as a reference (we pass a copy of the pointer, but the pointed data is the same). The argument is passed by pointer (the pointer itself is copied onto the stack).
Warning: Contrary to a C++ reference, a pointer is not always initialized to a valid data block.
#include <stdio.h>
void increment(int * i) {
(*i)++; // The parentheses matter: *i++ would increment the pointer, not the pointed value
}
int main() {
int i = 7;
printf("1) main: %d\n", i);
increment(&i);
printf("2) main: %d\n", i);
return 0;
}
1) main: 7
2) main: 8
Arrays¶
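- Static memory block of homogeneous instances (the size is known at compile time).
- sizeof returns the size in bytes of the whole block (number of cells times the size of a cell).
- If passed to a function, the array is not copied: the function receives a pointer to its first cell (so, inside the function, sizeof returns the size of a pointer).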
#include <stdio.h>
int main() {
int array[3] = {10, 11, 12};
printf("array size: %zu %zu\n", sizeof(array), sizeof(int) * 3);
for (size_t i = 0; i < 3; i++) {
printf("%zu: %d, %d\n", i, array[i], *(array + i));
}
return 0;
}
array size: 12 12
0: 10, 10
1: 11, 11
2: 12, 12
Heap¶
- The data can be allocated outside of the stack using dynamic memory allocation.
- Such data is allocated in the heap.
Lifecycle:
- Allocate memory.
- If successfully allocated:
  - Use the allocated memory.
  - Free the memory when no more needed.
- (Else print memory error and exit the program.)
Dynamic allocation¶
malloc: uninitialized memory block (faster)
#include <stdlib.h>
int * array = (int *) malloc(sizeof(int) * n);
if (array) {
    // ...
    free(array);
}
calloc: memory block set to 0
#include <stdlib.h>
int * array = (int *) calloc(n, sizeof(int)); // calloc(number of cells, size of a cell)
if (array) {
    // ...
    free(array);
}
Advanced types¶
- struct
- enum
- union
- typedef
struct¶
A struct packs a list of (typed) attributes.
- Structures are used to define "objects" in C.
- A struct may involve any type (including another struct)
#include <stdio.h>
struct pair_t {
int a;
double b;
}; // <-- do not forget ';' here
int main() {
struct pair_t p = {3, 4.5};
printf("p = (a = %d, b = %lf)\n", p.a, p.b);
p.a = -2;
p.b = -7.8;
printf("p = (a = %d, b = %lf)\n", p.a, p.b);
return 0;
}
p = (a = 3, b = 4.500000)
p = (a = -2, b = -7.800000)
enum¶
An enum is used to name distinct integers.
#include <stdio.h>
enum shape_t {
SQUARE,
CIRCLE,
TRIANGLE,
};
int main() {
enum shape_t shape1 = SQUARE;
enum shape_t shape2 = TRIANGLE,
shape3 = CIRCLE;
printf("shape1 = %d\n", shape1);
printf("shape2 = %d\n", shape2);
printf("shape3 = %d\n", shape3);
return 0;
}
shape1 = 0
shape2 = 2
shape3 = 1
union¶
A union wraps several mutually exclusive attributes. Its size is at least the size of its largest attribute.
#include <stdio.h>
union my_union_t {
int i;
double d;
};
int main() {
union my_union_t u;
u.i = 7;
printf(
"sizeof(u.i) = %zu\n" // sizeof(u.i) = 4
"sizeof(u.d) = %zu\n" // sizeof(u.d) = 8
"sizeof(u) = %zu\n", // sizeof(u) = 8
sizeof(u.i), sizeof(u.d), sizeof(u)
);
printf("1) u.i = %d\n", u.i); // 7
printf("1) u.d = %lf\n", u.d); // 0.00000
u.d = 1.234;
printf("2) u.i = %d\n", u.i); // -927712936
printf("2) u.d = %lf\n", u.d); // 1.234000
return 0;
}
typedef¶
typedef defines an alias to another type.
struct pair {
int a;
double b;
};
typedef struct pair pair_t;
... or in short:
typedef struct pair {
int a;
double b;
} pair_t;
Containers¶
By default, there are no containers in C!
See libparistraceroute for possible implementations of the following containers, inspired from C++:
pair<T1, T2>
list<T>
vector<T>
set<T>
map<K, V>
https://github.com/libparistraceroute/libparistraceroute/tree/master/libparistraceroute/containers
File descriptors¶
- Preliminaries
- fopen ... fclose
- fprintf
- fscanf
- Standard streams
Preliminaries¶
A file descriptor can be anything described by a file in the Linux sense:
- regular file
- device
- socket
- ...
In general, a file descriptor (e.g., regular files, sockets) must be:
- opened
- checked (only if successfully opened)
- used
- closed
A closed/invalid file descriptor must never be used/closed.
fopen ... fclose¶
- fopen is used to create a file descriptor of type FILE *. If it fails, it returns an invalid file descriptor (NULL, equal to 0). Several modes:
  - "r": read;
  - "w": write;
  - ... and others (see man fopen)
- fclose closes a valid file descriptor.
#include <stdio.h>
int main() {
FILE * fp = fopen("/etc/passwd", "r");
if (fp) {
// Do something
fclose(fp);
} else {
// Oops
}
return 0;
}
fprintf¶
fprintf is used to write into a valid "w" file descriptor.
int i = 1, j = 2, k = 3;
FILE * fp = fopen("some_file", "w");
if (fp) {
fprintf(fp, "%d %d %d", i, j, k);
// ....
fclose(fp);
}
fscanf¶
fscanf is used to read values from a valid "r" file descriptor.
int i, j, k;
FILE * fp = fopen("some_file", "r");
if (fp) {
fscanf(fp, "%d %d %d", &i, &j, &k);
// ....
fclose(fp);
}
Standard streams¶
Any process involves three standard streams:
- stdin: read from standard input;
- stdout: write to standard output;
- stderr: write to standard error output.
#include <stdio.h>
int main() {
int i;
printf("Hello"); // or fprintf(stdout, "Hello");
fprintf(stderr, "Hello");
scanf("%d", &i); // or fscanf(stdin, "%d", &i);
return 0;
}
poetry and C¶
- setuptools: former tool to build Python packages.
  - Can wrap C code using "extensions"
  - Abstracts platform-dependent considerations (compiler, etc.).
- poetry: modern tool to build Python packages
  - Template: https://github.com/Lucky-Mano/Poetry_C_Extension_Example
Acknowledgements¶
- Jupyter lab
- RISE
- How to Code in C with a Jupyter Notebook
- Lucky-Mano: poetry and C template: thanks for having accepted my PR ;)