<- Back to my website

Templated dynamic arrays in good old C

Table of contents

  1. Introduction
  2. Dynamic array theory
  3. I love the preprocessor
  4. Preprocessor hell
  5. For each loops?????? In my C programming language????
  6. In practice

Introduction

C is one of my favorite programming languages. There are lots of reasons for that, the main one being that I hate myself. In the process of creating a game engine in C (I told you I hated myself), I asked myself an interesting question: where the fuck are my std::vectors?

In more modern (and arguably better) languages like C++, Java or Rust, the standard library provides nice classes and interfaces to create dynamic arrays easily. C is no such language. C's standard library sucks. C makes you do all the dirty work yourself.

Fine.

I'll do it myself.

Dynamic array theory

A dynamic array is a contiguous area of memory that can expand based on the needs of the situation. For example in C++, you could create an std::vector (which is a the C++ (aka stupid) way of naming a dynamic array) of any initial size, and every time you call the push_back (or any variant) method, the array will grow as needed.

This is trivial to implement honestly, you just need to define a structure that holds a buffer, the capacity of the buffer (initialized to 3) and the length of the array (initialized to 0):

typedef struct cool_array_t {
    int* buffer;
    size_t length;
    size_t capacity;
} cool_array_t;

cool_array_t* array_new() {
    cool_array_t* self = (cool_array_t*)malloc(sizeof(cool_array_t));
    self->capacity = 3;
    self->buffer = (int*)malloc(sizeof(int) * self->capacity);
    self->length = 0;
    return self;
}

void array_delete(cool_array_t* self) {
    free(self->buffer);
    free(self);
}

Alright, that's cool and all, but how do we add an element to the array? Let's declare a function to do that. When we add an element to the array, we want to check if the length of the array is equal or larger than the capacity. If that is the case, it means we need to grow the array (with a simple realloc call that will transfer the memory to a new, bigger buffer). How much we grow the array by is completly arbitrary, so I chose to add half the array size each time we grow it (I thinks it's the way MSVC does it, but I'm really not sure).

void array_add(cool_array_t* self, int element) {
    if (self->length >= self->capacity) {
        self->capacity += self->capacity / 2 + 1;
        self->buffer = (cool_array_t*)realloc(self->buffer, self->capacity * sizeof(int));
    }

    self->buffer[self->length] = element;
    self->length++;
}

That's basically all there is to know about adding an element in a dynamic array. All other insertions follow the same principle. I will leave the removing operations as an exercice for the reader because I'm feeling lazy today and I've decided it's; out of the scope of this article.

But there's a problem: I promised you templates, but there is no template to be seen. Did I lie to you? Did I deceive you? Well, kind of.

I love the preprocessor

Every C developper knows that C offers no kind of polymorphism (it hadn't been invented yet). So I did lie to you when I said I would use templates in C. But we have something else. We have the all mighty preprocessor.

The preprocessor is a powerful tool in the C compiler that allows us to write unreadable code that no one can understand and maintain, but it is able to make every C developper dream come true (I'm talking about templates). So how can we leverage the power of the preprocessor to achieve polymorphism? The answer is simple: just declare more functions for more types.

To achieve this, we will use the concatenation operator ##, which is an operator so obscure that I've only seen it used in StackOverflow answers with less than 5 score. The idea is that you can take a macro parameter, and "glue" it to anything else in your macro. For example:

#define GLUE(x, y) x##y
printf("%d\n", GLUE(6, 9));

will print 69 to the console.

Using this operator allows us to create new function names on the fly with the preprocessor :

#define DECL(prefix) \
    void* prefix##_is_shit() { \
        return NULL; \
    }

DECL(rust);
DECL(java);

will create two functions: one named rust_is_shit() and another named java_is_shit(). You can probably see where I'm going with this.

Preprocessor hell

Prepare yourself, we're about to stop writing C and start writing for the preprocessor (which is still C, but for REAL developers). Instead of declaring our array type and functions as usual, we will declare them in a big ass macro that will take one argument: the type of the data we want to store. We can then use it to generate more functions that will take the specialized array type as an argument :

#define ARRAY_DECL(type) \
    typedef struct type##_array_t { \
        type* buffer; \
        size_t capacity; \
        size_t length; \
    } type##_array_t;

ARRAY_DECL(int);
ARRAY_DECL(float);

typedef struct cool_type_t { int foo; } cool_type_t;
ARRAY_DECL(cool_type_t);

With this bit of code, we are now able to create new array types on the fly using the preprocessor. It even works with custom types!

There's something wrong with this though. The array of cool_type_t is named cool_type_t_array_t, which is ugly and not easy to type. Idealy, we'd want to remove the _t from cool_type_t before glueing the _array_t part. (Note that this is only a problem due to my naming conventions (which are undeniably superior than yours), and if you name your types the Java way, you won't have that problem)

To solve this problem, let's modify the macro:

#define ARRAY_DECL(type, prefix) \
    typedef struct prefix##_array_t { \
        // blablabla
    } prefix##_array_t;

typedef struct cool_type_t { int foo; } cool_type_t;
ARRAY_DECL(cool_type_t, cool_type);

Now, the array is named cool_type_array_t, which is a bit better. The neat thing is that you can also give arbitrary names to your arrays, you're not limited to the name of the type you want to create an array of (which I'm still struggling to find a use case for).

This is very nice, but we still need to declare the functions to actually use the array:

#define ARRAY_DECL(type, prefix) \
    typedef struct prefix##_array_t { \
        // blablabla
    } prefix##_array_t; \
    \
    prefix##_array_t* array_new() { \
        prefix##_array_t* self = (prefix##_array_t*)malloc(sizeof(cool_array_t)); \
        self->capacity = 3; \
        self->buffer = (int*)malloc(sizeof(type) * self->capacity); \
        self->length = 0; \
        return self; \
    } \
    \
    void array_delete(prefix##_array_t* self) { \
        free(self->buffer); \
        free(self); \
    } \
    \
    void array_add(prefix##_array_t* self, int element) { \
        if (self->length >= self->capacity) { \
            self->capacity += self->capacity / 2 + 1; \
            self->buffer = (prefix##_array_t*)realloc(self->buffer, self->capacity * sizeof(type)); \
        } \
        self->buffer[self->length] = element; \
        self->length++; \
    }

I just copy-pasted the code from before, and replaced all occurences of cool_array_t with prefix##_array_t and int with type (the macro argument).

For each loops?????? In my C programming language????

You know what else we can do with the preprocessor? That's right, for each loops! No need to write the old clunky C-style for loop, we can make our own! But before that, we need to talk about the comma operator.

Just like the concatenation operator, the comma operator is a rather obscure functionality of C. You may have used it before, without knowing its existence, always in the shadows.

The way it works is really simple: it evaluates the right-hand expression, then evaluates the left-hand expression and returns it. Take this snippet of code:

int a = 0;
int b = 69;
int c = a++, a += b;

Here, when evaluating the assignment to c, the program first increments a, then adds b to a, and returns the result. That way, in a single line of code, we have incremented a, added b to a, and assigned c to the final value of a, which would be 70.

This operator is rarely used in real code because it creates an unreadable mess of a codebase, but we're not writing real code here. We're doing crimes against the C community.

Using this knowledge, we are now able to create our own for each loop, by replacing it with a real for loop in the preprocessor.

#define ARRAY_FOREACH(self, elem, i) for ((i) = 0, (elem) = (self)->buffer[0]; (i) < (self)->length; (i)++, (elem) = (self)->buffer[i])

int_array_t* a = int_array_new();
int elem;
int i;
ARRAY_FOREACH(a, elem, i) {
    // do stuff with elem, which is the i-th element of the array
}

This is how it works: the C-style for loop is composed of three phases (in order):

In the initializtion phase, we set i to 0, as usual, but we also set the current element (which is named elem here) to array[0]. This is because the incrementation phase is executed after the loop body, so we need to initialize the first element ourselves.

The condition phase is as usual.

In the incrementation phase, we increment i as usual, and we also update elem to be the i-th element of the array.

Our for each loop can now be used as a usual for loop, but better! (depending on what you would define as better)

We are now finished with the preprocessor code (thank god). Let's see what it looks like in action!

In practice

I have put all the preprocessor stuff in a file named array.h.

#include "array.h"
#include <stdio.h>

typedef struct foo_t {
    int foo;
    float bar;
} foo_t;
ARRAY_DECL(foo_t, foo);

int main(int argc, char* argv[]) {
    foo_array_t* array = foo_array_new();
    foo_array_add(array, (foo_t){ .foo = 0, .bar = 0.0f });
    foo_array_add(array, (foo_t){ .foo = 1, .bar = 1.0f });
    foo_array_add(array, (foo_t){ .foo = 2, .bar = 2.0f });

    foo_t elem;
    int i;
    ARRAY_FOREACH(array, elem, i) {
        printf("element %d: foo = %d, bar = %f\n", i, elem.foo, elem.bar);
    }

    foo_array_delete(array);
    return 0;
}

And there we go! We have achieved our goal!

With <3, from Tom

07/01/2025