Skip to main content

Exploring features of C23

The new shiny version of the C language standard ISO/IEC 9899:2024 [1] has been there for a while now. To confuse everyone who's not familiar with standardization processes, it is called C23, not C24, as the year after the standard number may suggest. Let's try to use a few of its new features and check if they are worth being adopted widely.

Many developers, especially juniors, consider the language as a strongly typed one just because of the need to define types for all declared variables, returns from functions, etc. To prove this, they like to compare just the syntax of different languages, like in these misleading C and Python examples:

int my_var = 1; // Variable that holds an integer.
my_var = 2.576; // Type of `my_var` doesn't change,
                // a float is implicitly converted to an integer.
my_var = 125 # Variable that holds an integer value.
my_var = 2.0 # Now the same variable holds a float value.
             # But is it really 'the same' variable?

The issue here is that the typing model is merely about the syntax; it's much more about how a compiler or interpreter treats the numbers under the hood (but this is definitely a topic for another story). So, speaking about operations on numbers in C, they may not be as strict as you might think. First, the standard defines a set of rules for implicit conversions required to perform basic calculations. Second, it also has requirements (less strict, however!) about the byte sizes of the variables stored in the memory. And third, recent versions (starting from C11) give a few quite neat features that allow the compiler to automatically infer the type. Let's see what can happen if we combine the knowledge about these three principles in pure C to mimic one of the Holy Grails of object-oriented programming: polymorphism.

If it quacks, it's a duck

There is nothing more encouraging for developers (especially during life-coding interviews) than writing an array sorting algorithm for a gazillionth time. So we'll start with a simple function implementing the so-called optimized bubble sort, as follows:

void array_sort_bubble(int* arr, unsigned int len)
{
    while (1 < len)
    {
        unsigned int new_len = 0;
        for (unsigned int i = 1; i < len; i++)
        {
            if (arr[i - 1] > arr[i])
            {
                int t = arr[i - 1];
                arr[i - 1] = arr[i];
                arr[i] = t;
                new_len = i;
            }
        }
        len = new_len;
    }
}

Ok, so what types of numbers can we sort with the above function other than predefined integers? The C standard tells us that int is a signed integer number, encoded using two's complement, at least 16-bit wide. It can be wider, but this detail is implementation-defined, and the exact width you can find in <limits.h> header of your standard library. On a majority of 32 and 64-bit machines, our plain integer is 32-bit wide (16-bit int you can find on small 8-bit microcontrollers). Let's assume we are working on some modern hardware, so the integer is 32-bit wide. Are there any other 32-bit numbers that we can safely put into an array and pass to the sorting routine? For sure, another compatible thing here is int32_t because it's just a typedef from <stdint.h>.

But could we pass arrays of integers, but with different sizes, like char or long int? Usually, no (again, it's implementation defined). The compiler will probably yell at us, and even if we silence it with some casts, then all pointer operations make a big mess in the memory. So maybe we can try a small casting cheat and pass an array of unsigned numbers, just keeping 32-bit size? Nah, that's also not possible; the comparison result will be wrong for numbers with MSB equal to 1.

The sad conclusion is that our sorting function is very limited, to only one species of ducks. But other ducks want to quack in order, too!

One type to rule them all

The very well-known technique used in C for passing 'anything' to a function is to use a pointer to void. Then we also need to somehow tell the function what is hidden under such a pointer, if we want to manipulate the data pointed to. Cool, but C doesn't have any built-in type intended for storing the 'type' itself. Problem? Nope, we can create our own type!

typedef enum : unsigned char
{
    TYPE_INT,
    TYPE_UINT,
    TYPE_FLOAT,
    // More types if needed ...
} type_t;

Additionally, a new fancy feature of C23 is used here. Enumerations can have a defined underlying type (the part after the colon), so we may precisely control their sizes and limits.

With the type-of-something type defined, let's start modifying the sorting function to make it more universal. First, we change the parameters:

void array_sort_bubble(void* arr, unsigned int len, type_t type)
{
    // ...
}

Now, looking closely at the function body, we can see that there are two operations where the type of array elements is important: comparison and swapping:

if (arr[i - 1] > arr[i]) // Compare.
{
    // Swap.
    int t = arr[i - 1];
    arr[i - 1] = arr[i];
    arr[i] = t;
    // ...
}

More precisely, the type is needed to calculate the locations of two elements in the array, compare them, and then swap. Note that for swapping the positions and the size have to be known, while the element's internal structure is irrelevant. Having these facts in mind, we can make our sorting function a generic one and move all type-dependent operations to separate procedures:

void array_sort_bubble(void* arr, unsigned int len, type_t type)
{
    while (1 < len)
    {
        unsigned int new_len = 0;
        for (unsigned int i = 1; i < len; i++)
        {
            if (is_less_than_prev(arr, i, type))
            {
                swap_with_prev(arr, i, type);
                new_len = i;
            }
        }
        len = new_len;
    }
}

And now the magic begins: how to implement type-dependent operations? The information about the array type is passed as an enumeration, so naturally, we can do it with a switch:

bool is_less_than_prev(void* arr, unsigned int i, type_t type)
{
    switch (type)
    {
        case TYPE_INT:
            return ((int*)arr)[i - 1] > ((int*)arr)[i];
        case TYPE_UINT:
            return ((unsigned int*)arr)[i - 1] > ((unsigned int*)arr)[i];
        // More cases for other types ...
        default:
            assert(false); // Not supported type.
    }
}

The same rule applies to the swap procedure. Each case could be similar to this:

case TYPE_INT:
    int t = ((int*)arr)[i - 1];
    ((int*)arr)[i - 1] = ((int*)arr)[i];
    ((int*)arr)[i] = t;
    break;
// Next case ...

You now probably see where this solution is going. Each time we do a comparison or a swap, we need to select an appropriate case. More types supported - more cases to add (in two functions already). Also, we omitted a part where we should check if a type passed to the sorting algorithm is supported at all (to exit gracefully and not crash at assertions). Moreover, casting from void might lead to disastrous effects if the provided type is wrong.

Is there something we can improve? Somehow, but this is an architecture-specific hack. As we noted before, the most important information for the swap function is the size of an element. Then, if we know that some types occupy the same number of bytes, we can make a workaround like this:

// 4-byte long types.
case TYPE_INT:
case TYPE_UINT:
case TYPE_FLOAT:
case TYPE_CUSTOM_4_BYTES_LONG:
    int t = ((int*)arr)[i - 1];
    ((int*)arr)[i - 1] = ((int*)arr)[i];
    ((int*)arr)[i] = t;
    break;
// Next case ...

Probably it will work on the majority of 32-bit machines, but it's super ugly, so please, do not code like that.

Guess who I am

At this point, we see that providing a type explicitly bounded with a variable and taking a different path for each one during runtime might not be the best idea when we require computations to be fast. As usual, the speed of the code may be traded off against its size and vice versa, so we can simply create sorting functions for all the types we need:

void array_sort_bubble_int(int* arr, unsigned int len);
void array_sort_bubble_char(char* arr, unsigned int len);
void array_sort_bubble_float(float* arr, unsigned int len);
// More functions for different types...

Now, for the above declarations, we just need to meticulously copy, paste, and then tweak a little bit the implementation we've already seen at the beginning of the article. All types will be covered, and if not, then even the dumbest LLM is able to generate a few more dozen almost identical functions, so we can call it a day, can't we? Note that the bodies of all of these type-dependent functions differ in only one line, which is the definition of a temporary variable required for the swap operation. If we can make it identical, maybe we will be able to somehow use the same code for all the types, too?

With C23, the answer is yes, we can! The magic word is auto. Sounds familiar? Yup, C standard maintainers made a little bit of an awkward choice here. The auto keyword still works as a storage-class specifier (does anyone use this feature nowadays?), but since C23, it can also be used to tell the compiler to evaluate the type from the expression on the right side of an assignment operator:

auto int i = 10;  // Plain old storage-class specifier.
auto a = 2.5 * i; // But here we infer the type and `a` becomes a double.
auto b = &a;      // And therefore `b` is a pointer to a double.

So the temporary variable declaration from the original algorithm might be changed to:

auto t = arr[i - 1];

Or even better, we can move the whole swap operation to a separate macro:

#define SWAP(A, B)  \
do {                \
    auto tmp = (A); \
    (A) = (B);      \
    (B) = tmp;      \
} while (0)

And the whole algorithm can be a macro, too! Moreover, we can make it completely agnostic to parameter types using another neat C23 standard feature, the typeof operator. As the name suggests, it returns the type of the expression (recall sizeof here), so we may use it as a type specifier:

#define ARRAY_SORT_BUBBLE(ARR, LEN)             \
do {                                            \
    while (1 < (LEN))                           \
    {                                           \
        typeof(LEN) new_len = 0;                \
        for (typeof(LEN) i = 1; i < (LEN); i++) \
        {                                       \
            if ((ARR)[i - 1] > (ARR)[i])        \
            {                                   \
                SWAP((ARR)[i - 1], (ARR)[i]);   \
                new_len = i;                    \
            }                                   \
        }                                       \
        (LEN) = new_len;                        \
    }                                           \
} while (0)

Having this one implementation, there is no longer a need to copy-paste the code, so the definitions of our type-dependent sorting functions become as simple as:

void array_sort_bubble_int(int* arr, unsigned int len)
{
    ARRAY_SORT_BUBBLE(arr, len);
};

void array_sort_bubble_char(char* arr, unsigned int len);
{
    ARRAY_SORT_BUBBLE(arr, len);
};

// More functions for different types...

At last, we can use the feature that has been present since C11 (or even earlier, as a non-standard extension of a few compilers): the generic selection. You may already know it, e.g., from <tgmath.h> header from the standard library. However, its widespread usage was quite limited before C23, but now, combined with the power of auto and typeof, it can finally shine on.

#define array_sort_bubble(arr, len) _Generic((arr), \
      int*: array_sort_bubble_int,                  \
    float*: array_sort_bubble_float                 \
    )(arr, len)

// We can call 'the same' function for any kind of array!

float arr_float[3] = {1.5, -5.1, 2.678};
int arr_int[3] = {7, 3245, -67};
array_sort_bubble(arr_float, 3);
array_sort_bubble(arr_int, 3);

How does it work? Wherever we call array_sort_bubble, the compiler evaluates the type of the controlling expression put just after the _Generic keyword (arr argument) and then looks into the association list, which is a set of type : expression pairs. If the type matches, the original call is substituted with the paired expression. In the above example, we have only two cases defined, for int* and float* (note the pointers!); any other type will lead to a compile error. You may also add a special default case to deal with such errors in a controlled way, or just call a function that can work with any other type than the ones defined.

Note that the most important limitation of the generic selection is that it has only one controlling expression, so there is no easy way to check combinations of two or more parameters. Solutions exist, but they involve quite awkward constructions with nested generics and a lot of dark macro magic. Also, according to the C23 standard, type qualifiers are stripped during evaluation of the _Generic, so e.g., both const int and int are treated as a simple int (workarounds are present as compilers' extensions [3]). Moreover, not-so-obvious evaluation rules for pointers and arrays might quickly lead to confusion if not treated properly [4].

Fight!

Enough of the theory, let's check the real numbers! As a test platform, I used the RP2040 microcontroller, which has a 32-bit Cortex-M0+ inside, running at 133MHz. The app is based on Pico SDK and compiled using arm-none-eabi-gcc 13.2.1.

First, I modified the C23-based code a little by adding one more dummy type parameter to sorting functions, which is simply cast to void. The _Generic needed to be modified, too. Thanks to these changes, we have the same API in both 'modern' and 'classic' solutions.

void array_sort_bubble_int(int* arr, unsigned int len, type_t type)
{
    (void) type;
    ARRAY_SORT_BUBBLE(arr, len);
}

// ...

#define array_sort_bubble(arr, len, type) _Generic((arr),   \
      int*: array_sort_bubble_int,                          \
    float*: array_sort_bubble_float                         \
)(arr, len, type)

Next, I created a very simple program which, for the given int and float arrays, draws random numbers and then sorts them, measuring the time spent in the sorting functions. The cycle is repeated, and at the end, the average execution times are printed to the console.

#define ARR_LEN (1000U)
#define REPEATS (50U)

absolute_time_t start, finish;
float average_int = 0, average_float = 0;
int array_int[ARR_LEN];
float array_float[ARR_LEN];

for (int repeat = 0; repeat < REPEATS; repeat++)
{
    for(unsigned int i = 0; i < ARR_LEN; i++)
    {
        array_int[i] = get_rand_32();
        array_float[i] = (float)array_int[i];
    }

    start = get_absolute_time();
    array_sort_bubble(array_int, ARR_LEN, TYPE_INT);
    finish = get_absolute_time();
    average_int += absolute_time_diff_us(start, finish);

    start = get_absolute_time();
    array_sort_bubble(array_float, ARR_LEN, TYPE_FLOAT);
    finish = get_absolute_time();
    average_float += absolute_time_diff_us(start, finish);
}

printf("average int sorting time: %.3f ms\n", average_int / (REPEATS * 1000));
printf("average float sorting time: %.3f ms\n", average_float / (REPEATS * 1000));

Sorting functions were put into two separate libraries and then linked with the main.c file with a bit of CMake glue, with 'modern' and 'classic' solutions selected by the CMake option. Pico SDK types and functions like absolute_time_t and get_rand_32() are HW-dependent versions of the standard time_t and rand() ones. The whole app has been compiled with -O2 optimization level (RelWithDebInfo build type).

Finally, the results were:

// Classic
average int sorting time: 69.736 ms
average float sorting time: 304.292 ms

// Modern
average int sorting time: 55.716 ms
average float sorting time: 258.323 ms

Binary sizes (for sorting 'libraries'):

text    data    bss     dec     hex
 136       0      0     136      88 // Classic
 120       0      0     120      78 // Modern

And the cyclomatic complexity (calculated using mccabe tool, after manually expanding macros for the C23-based solution):

// Classic
swap_with_prev          2
is_less_than_prev       2
array_sort_bubble       4

// Modern
array_sort_bubble_int   4
array_sort_bubble_float 4

Examining the produced assembly further acknowledges the above results. It clearly shows that in the case of C23, the compiler replaces each invocation of array_sort_bubble() with a direct call to either array_sort_bubble_int() or array_sort_bubble_float(). The two functions are quite short with minimal numbers of branch instructions, and easy to follow even in assembly. For the non-C23 option, we always have a call to one type-independent array_sort_bubble(), which then, during the run, decides what kind of compare and swap operations to execute depending on the provided type parameter. All these differences are pictured on the beautiful graphs below (created with radare2 reverse-engineering tool), which show blocks of assembly combined with all the jumps between them.

If anyone wants to play around with this on their own, here is the full source code.

And the winner is...

'The faster one!' said the server software guy. 'No! The smaller one!', replied his colleague from the embedded team. 'Whatever, nerds, how much will we earn from both?', concluded their boss.

As always in software development, there are no universal solutions for all the problems. Both presented ways might be acceptable or not, and it depends heavily on the final application needs, hardware resources, safety and security rules, client's requirements, etc. Let's summarize what we learned so far, having in mind that the goal was to create a type-independent array sorting algorithm.

One thing that both solutions have in common is, of course, a single place in code where we actually write the algorithm's implementation. Other characteristics are more or less opposites:

The 'classic' solution

  • Doesn't use any fancy new features. It will work even with some ancient C99-compatible compilers.

  • No macros, everything is implemented as plain functions, which eases debugging and improves static analysis.

  • Supported types must be explicitly defined, and there are no under-the-hood evaluations. It shall generally be easier to prove it conforms to safety requirements.

  • Extending support for more types requires changes in all type-dependent functions. It will be a copy-pasting for basic ones, but on the other hand, we can easily define a totally different logic for complex struct/union-based types.

  • Each time the type-dependent operation is executed, we need to jump into it and then go back, so consequently, the overall speed is low, stack usage increases, and the complexity metrics worsen.

The 'modern' solution

  • Requires a compiler supporting the new standard, which might not be so easy, especially in the embedded world, or if we are forced to use some proprietary non-free compilers.

  • Defines, defines everywhere. And all the issues that follow, from developers reluctant to use them, through problems with code quality checkers, to nasty macro preprocessing rules.

  • Supports types that are compatible with the ones listed in the generic selection, so more cases are automatically covered, but the developer must take care of the evaluation of expressions and ensure that the final behavior is the desired one.

  • Just a couple more new lines are needed to support another type, but only if this particular type can be manipulated using the same operations as other already covered types (e.g., compare two elements with one < operator), so it might be impossible to mix basic types with complex structures, as we do not have operator overloading in C.

  • No jumping in the run time, speed is high, complexity low, but the compiler needs to create a set of type-dependent functions and put their logic somewhere in the memory, so the size of the application could become an issue.

In the end, are fancy C23 additions worth the effort? In my opinion: yes, they are. I think they should give better results in the majority of cases, similar to the one presented above. But I'm this kind of developer who always likes to try new features and has to have all packages installed in the newest versions... So just try it yourself and remember: YMMV!

References