alexeev-dev notes

How to Speed Up Python Scripts: C Extensions and the Python/C API

During software development, we often face a choice between a language's convenience and its performance. Python has gained popularity due to its simplicity and elegance, but when it comes to low-level operations or tasks requiring high performance and speed, C comes to the rescue.

We will be focusing specifically on integrating extensions at build time, not just loading libraries via ctypes.

In this article, I want to discuss how to integrate C extensions using the Python.h header. I will also explain how to create your own Python library with C extensions. Along the way we will look at how Python works internally; for example, we'll recall that everything is an object. I will use Poetry to manage the project and its environment.

Everything will be demonstrated using my small library for various algorithms and computations. At the end, I will conduct an analysis of pure Python algorithms, our library, and pure C algorithms: execution speed, distributability, pros and cons, and code volume.

Without further ado, let's begin!


So, suppose you want to implement some functionality in your project, but pure Python turns out to be too slow or too high-level for the task. In that case, you can move the speed-critical code into C extensions.

C extensions are only available for CPython — the reference implementation of Python.

C extensions can also release the GIL (Global Interpreter Lock) around code that does not touch Python objects, so that other threads can run in the meantime.

The GIL is a mutex that protects the interpreter's internal state in multithreaded programs by allowing only one thread at a time to execute Python bytecode. As a result, CPU-bound Python threads cannot run on several processor cores simultaneously. Although this mechanism keeps the interpreter's data consistent, it can slow down multithreaded programs.
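The effect is easy to observe from Python itself. The sketch below runs the same CPU-bound work sequentially and in two threads; the function names and the loop size are illustrative choices of mine, not part of the library. On a standard CPython build with the GIL enabled, the threaded version is usually no faster, although the exact timings depend on the machine.

```python
import threading
import time


def count_down(n: int) -> int:
    """CPU-bound busy loop; the thread holds the GIL while it runs."""
    while n > 0:
        n -= 1
    return n


def run_sequential(n: int, jobs: int) -> float:
    """Run the loop `jobs` times one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(jobs):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n: int, jobs: int) -> float:
    """Run the loop in `jobs` threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(jobs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    N = 2_000_000
    print(f"sequential: {run_sequential(N, 2):.3f} s")
    print(f"threaded:   {run_threaded(N, 2):.3f} s")
```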

You can also offload other performance-critical or low-level tasks to C extensions.

It's important to understand that you shouldn't write everything in C. Sometimes ordinary optimizations and profiling are sufficient.

Environment Setup

So, how does creating Python projects usually begin? The typical approach is creating a virtual environment:

python3 -m venv venv
source venv/bin/activate

But in this project, I decided to deviate from that method and instead use Poetry, a project management system. Poetry is a tool for managing dependencies and building packages in Python. Moreover, using Poetry makes it very easy to publish your library on PyPI!

Poetry provides a complete set of tools that may be needed for deterministic management of Python projects, including package building, support for different language versions, testing, and project deployment.

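You can install Poetry via pipx: pipx install poetry, or via pip: pip install poetry --break-system-packages. This will install Poetry globally. The pipx route is the safer one, since --break-system-packages lets pip modify the system Python installation.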

Let's initialize the project in your project directory:

poetry init

But to write C extensions, we need access to the Python C API (the Python.h header file). On Debian and Ubuntu it is provided by the python3-dev package (python-dev on older releases); other distributions use a different name, such as python3-devel on Fedora.

To activate the virtual environment, use poetry shell (in Poetry 2.x this command lives in a separate plugin) or eval $(poetry env activate); poetry env activate on its own only prints the activation command.

That's not all. Since we're using extensions in the compilable C language, we need to create a build script (build.py):

"""Build script."""

from setuptools import Extension
from setuptools.command.build_ext import build_ext

extensions = [
    Extension("libnumerixpy.base", sources=["ext/src/lnpy_base.c"]),
    Extension("libnumerixpy.math.basemath", sources=['ext/src/libbasemath.c', "ext/src/lnpy_basemath.c"], include_dirs=['ext/src']),
]


class BuildFailed(Exception):
    pass


class ExtBuilder(build_ext):
    def run(self):
        try:
            build_ext.run(self)
        except Exception as ex:
            print(f'[run] Error: {ex}')

    def build_extension(self, ext):
        try:
            build_ext.build_extension(self, ext)
        except Exception as ex:
            print(f'[build] Error: {ex}')


def build(setup_kwargs):
    setup_kwargs.update(
        {"ext_modules": extensions, "cmdclass": {"build_ext": ExtBuilder}}
    )

My project is called libnumerixpy, and I will implement some functions for mathematical calculations. Let's break down the code:

extensions = [
    Extension("libnumerixpy.base", sources=["ext/src/lnpy_base.c"]),
    Extension("libnumerixpy.math.basemath", sources=['ext/src/libbasemath.c', "ext/src/lnpy_basemath.c"], include_dirs=['ext/src']),
]

Each Extension entry gives the dotted module name, the paths to its source files, and, where needed, the include directories (so that the compiler can find libbasemath.h, which we will write later). The sources are laid out as follows:

ext
└── src
    ├── libbasemath.c
    ├── libbasemath.h
    ├── lnpy_base.c
    └── lnpy_basemath.c

We'll examine them later. But before we study the Python C API, let's connect our build script to pyproject.toml:

[build-system]
requires = ["poetry-core", 'setuptools']
build-backend = "poetry.core.masonry.api"

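pyproject.toml is our project's file containing metadata, dependencies, and build rules. Poetry creates it automatically. Note that the [build-system] table above only declares setuptools as a build requirement; the build script itself must also be referenced from the Poetry section of this file (in recent Poetry versions this is the script key of the [tool.poetry.build] table; check the documentation for your version).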

Python C API

So, to write Python extensions, we need to learn the Python/C API.

The Python/C API is Python's application programming interface that provides developers with access to the Python interpreter.

The style guide for C code in CPython is PEP 7.

Here are its main guidelines:

  1. C standard version: C11 without optional features for Python 3.11 and newer; Python 3.6–3.10 use C89 with several select C99 features.
  2. Do not use compiler-specific extensions.
  3. All function declarations and definitions must use full prototypes (i.e., specify types of all arguments).
  4. No warnings during compilation (with major compilers).
  5. Use 4-space indents (no tabs). In my opinion, this part is rarely followed.
  6. No line should be longer than 79 characters.
  7. Function definition style: return type on the first line, function name in column 1 on the second, the opening brace alone on the third, and a blank line after local variable declarations:

static PyObject *
calculate_discriminant(PyObject *self, PyObject *args)
{
    double a, b, c;

    if (!PyArg_ParseTuple(args, "ddd", &a, &b, &c)) {
        return NULL;
    }

    double discriminant = b * b - 4 * a * c;

    return Py_BuildValue("d", discriminant);
}

An example code looks like this:

#define PY_SSIZE_T_CLEAN
// #define Py_GIL_DISABLED // Only include if the experimental GIL-disabling feature in Python 3.13 is enabled
#include <Python.h>

/**
 * @brief      Execute a shell command
 *
 * @param      self  The object
 * @param      args  The arguments
 *
 * @return     status code
 */
static PyObject
*lnpy_exec_system(PyObject *self, PyObject *args)
{
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command)) {
        return NULL;
    }
    sts = system(command);

    return PyLong_FromLong(sts);
}

static PyMethodDef LNPYMethods[] = { { "exec_shell_command", lnpy_exec_system, METH_VARARGS,
                                       "Execute a shell command." },
                                     { NULL, NULL, 0, NULL } };

static struct PyModuleDef lnpy_base = { PyModuleDef_HEAD_INIT, "base", NULL, -1, LNPYMethods };

PyMODINIT_FUNC PyInit_base(void) { return PyModule_Create(&lnpy_base); }

Macros

The Python/C API provides several useful macros. Let's look at a few:

PyMODINIT_FUNC is needed to declare the module initialization function (PyInit_name). The function must return a PyObject *.

The initialization function must be named in the format PyInit_name, where name is the module's name, and the function must be the only non-static element.

Usage examples:

static struct PyModuleDef lnpy_base = { PyModuleDef_HEAD_INIT, "base", NULL, -1, LNPYMethods };

PyMODINIT_FUNC PyInit_base(void) { return PyModule_Create(&lnpy_base); }

From the official documentation:

static struct PyModuleDef spam_module = {
    PyModuleDef_HEAD_INIT,
    .m_name = "spam",
    ...
};

PyMODINIT_FUNC
PyInit_spam(void)
{
    return PyModule_Create(&spam_module);
}
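
PyDoc_STRVAR(name, str) creates a variable named name that holds a docstring:

PyDoc_STRVAR(pop_doc, "Remove and return the rightmost element.");

static PyMethodDef deque_methods[] = {
    // ...
    {"pop", (PyCFunction)deque_pop, METH_NOARGS, pop_doc},
    // ...
};

PyDoc_STR(str) creates a docstring from the given string, or an empty string if docstrings are disabled:

static PyMethodDef pysqlite_row_methods[] = {
    {"keys", (PyCFunction)pysqlite_row_keys, METH_NOARGS,
        PyDoc_STR("Returns the keys of the row.")},
    {NULL, NULL}
};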

Exceptions

Python programmers need to deal with exceptions only when special error handling is required; unhandled exceptions are automatically passed to the caller, then to the caller's caller, and so on, until they reach the top-level interpreter, where they are reported to the user along with a stack trace.

However, for C programmers, error checking must always be explicit. All functions in the Python/C API may raise exceptions unless the function's documentation explicitly states otherwise. In general, if a function encounters an error, it sets an exception, discards any object references it owns, and returns an error indicator. Unless otherwise specified, this indicator may be NULL or -1, depending on the function's return type. A few functions return a boolean true/false, with false indicating an error. Very few functions return no explicit error indicator or have an ambiguous return value and require explicit error checking using PyErr_Occurred(). These exceptions are always explicitly documented.

Example in Python:

def incr_item(dict, key):
    try:
        item = dict[key]
    except KeyError:
        item = 0
    dict[key] = item + 1

In C:

int
incr_item(PyObject *dict, PyObject *key)
{
    PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
    int rv = -1;

    item = PyObject_GetItem(dict, key);
    if (item == NULL) {
        /* Handle KeyError */
        if (!PyErr_ExceptionMatches(PyExc_KeyError))
            goto error;

        /* Clear the error and use zero: */
        PyErr_Clear();
        item = PyLong_FromLong(0L);
        if (item == NULL)
            goto error;
    }
    const_one = PyLong_FromLong(1L);
    if (const_one == NULL)
        goto error;

    incremented_item = PyNumber_Add(item, const_one);
    if (incremented_item == NULL)
        goto error;

    if (PyObject_SetItem(dict, key, incremented_item) < 0)
        goto error;
    rv = 0; /* Success */
    /* End and cleanup code */

 error:
    /* Cleanup code */

    /* Use Py_XDECREF() to ignore NULL references. */
    Py_XDECREF(item);
    Py_XDECREF(const_one);
    Py_XDECREF(incremented_item);

    return rv;
}

Python exceptions are very different from C++ exceptions, and C has no exceptions at all. To raise a Python exception from your C extension module, you set it through the Python API, for example with PyErr_SetString(), PyErr_Format(), or PyErr_SetObject(), and then return an error indicator such as NULL.

Let's test this by adding PyErr_SetString():

static PyObject *method_fputs(PyObject *self, PyObject *args) {
    char *str, *filename = NULL;
    int bytes_copied = -1;

    /* Parse arguments */
    if(!PyArg_ParseTuple(args, "ss", &str, &filename)) 
        return NULL;
    
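    if (strlen(str) < 10) {
        PyErr_SetString(PyExc_ValueError, "String length must be at least 10");
        return NULL;
    }

    FILE *fp = fopen(filename, "w");
    if (fp == NULL) {
        /* fopen() failed: raise OSError built from errno */
        return PyErr_SetFromErrnoWithFilename(PyExc_OSError, filename);
    }
    bytes_copied = fputs(str, fp);
    fclose(fp);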

    return PyLong_FromLong(bytes_copied);
}

This code will raise an exception if we try to write a string shorter than 10 characters to a file.

Custom Exceptions

You can also raise custom exceptions in your Python extension module:

static PyObject *StringTooShortError = NULL;

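PyMODINIT_FUNC PyInit_fputs(void) {
    /* Assign module value */
    PyObject *module = PyModule_Create(&fputsmodule);
    if (module == NULL)
        return NULL;

    /* Initialize new exception object */
    StringTooShortError = PyErr_NewException("fputs.StringTooShortError", NULL, NULL);
    if (StringTooShortError == NULL) {
        Py_DECREF(module);
        return NULL;
    }

    /* Add exception object to your module; the reference is stolen only on success */
    Py_INCREF(StringTooShortError);  /* keep one reference for the static pointer */
    if (PyModule_AddObject(module, "StringTooShortError", StringTooShortError) < 0) {
        Py_DECREF(StringTooShortError);
        Py_DECREF(module);
        return NULL;
    }

    return module;
}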

As before, you start by creating a module object. Then you create a new exception object using PyErr_NewException. This takes a string of the form module.classname as the name of the exception class you want to create. Choose something descriptive so the user can easily interpret what actually went wrong.

Then you add this to your module object using PyModule_AddObject. It takes your module object, the name of the new object being added, and the custom exception object itself as arguments. Finally, you return your module object.
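From the Python side, the class created by PyErr_NewException behaves like an ordinary exception class. As a rough pure-Python model (an illustration of the observable result, not what the C code executes), the call corresponds to creating a subclass of Exception whose __module__ is fputs. The helper check_length is hypothetical and only mirrors the length check in method_fputs().

```python
# Pure-Python model of PyErr_NewException("fputs.StringTooShortError", NULL, NULL):
# passing NULL as the base makes the new class derive from Exception.
StringTooShortError = type(
    "StringTooShortError",      # class name: the part after the last dot
    (Exception,),               # default base class
    {"__module__": "fputs"},    # module name: the part before the last dot
)


def check_length(text: str) -> str:
    """Hypothetical helper mirroring the length check in method_fputs()."""
    if len(text) < 10:
        raise StringTooShortError("String length must be at least 10")
    return text


try:
    check_length("short")
except StringTooShortError as exc:
    # The caller can catch the custom class like any other exception.
    print(f"{type(exc).__module__}.{type(exc).__name__}: {exc}")
```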

Now that you've defined a custom exception for your module, you need to update method_fputs() to raise the appropriate exception:

static PyObject *method_fputs(PyObject *self, PyObject *args) {
    char *str, *filename = NULL;
    int bytes_copied = -1;

    /* Parse arguments */
    if(!PyArg_ParseTuple(args, "ss", &str, &filename)) 
        return NULL;
    
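    if (strlen(str) < 10) {
        /* Custom exception */
        PyErr_SetString(StringTooShortError, "String length must be at least 10");
        return NULL;
    }

    FILE *fp = fopen(filename, "w");
    if (fp == NULL) {
        /* fopen() failed: raise OSError built from errno */
        return PyErr_SetFromErrnoWithFilename(PyExc_OSError, filename);
    }
    bytes_copied = fputs(str, fp);
    fclose(fp);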

    return PyLong_FromLong(bytes_copied);
}

Defining Constants

You can define constants you need directly in your C code. For integer constants, you can use PyModule_AddIntConstant:

PyMODINIT_FUNC PyInit_module(void) {
    PyObject *module = PyModule_Create(<your module>);

    /* Add integer constant */
    PyModule_AddIntConstant(module, "INT_PI", 3);

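    /* PyModule_AddIntMacro takes both the name and the value from a C macro.
       Note that this rebinds the module attribute INT_PI from 3 to 256. */
    #define INT_PI 256

    PyModule_AddIntMacro(module, INT_PI);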

    return module;
}

PyModule_AddIntConstant takes the following arguments (PyModule_AddIntMacro needs only the module and the macro, whose name and value are reused):

  1. Your module instance.
  2. The name of the constant.
  3. The value of the constant.

Why Everything in Python Is an Object

We've already mentioned a certain universal PyObject several times in the code. All object types are extensions of this type. It is a type that contains the information Python needs to treat a pointer to an object as an object.

In Python, almost everything is an object, whether it's a number, a function, or a module. Python uses a pure object model where classes are instances of the metaclass type. The terms "type" and "class" are synonyms, and type is the only class that is an instance of itself.
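These statements are easy to check in the interpreter. The short sketch below uses only built-ins and the standard math module; the function greet is a throwaway example of mine.

```python
import math


def greet() -> str:
    return "hello"


# Numbers, functions, modules and classes are all instances of `object`.
for value in (42, 3.14, greet, math, int):
    assert isinstance(value, object)

# Classes are themselves objects: instances of the metaclass `type`.
print(type(42))      # <class 'int'>
print(type(int))     # <class 'type'>
print(type(greet))   # <class 'function'>
print(type(math))    # <class 'module'>

# `type` is an instance of itself, and also a subclass of `object`.
print(type(type) is type)         # True
print(issubclass(type, object))   # True
```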

PyObject is the C structure from which all Python object types are built. It holds the few fields that every object shares: in a standard CPython build, essentially a reference count and a pointer to the object's type. Every other object structure begins with these fields and extends them.

This is why a pointer to any Python object can be treated as a PyObject *, and why extension functions take and return PyObject *: the interpreter only needs these common fields to recognize the value as a valid Python object.

Implementing Functions and Methods

To create functions and methods in the Python/C API, the PyCFunction type is used. This is the type of the C functions that implement most Python callables. Functions of this type take two PyObject * parameters and return one such value.

PyObject *PyCFunction(PyObject *self,
                      PyObject *args);

There are also flags that select the calling convention:

  1. METH_VARARGS
    This is the typical calling convention where methods have the PyCFunction type. The function expects two PyObject* values. The first is the self object for methods; for module functions, it is the module object. The second parameter (often called args) is a tuple object representing all arguments. This parameter is typically handled using PyArg_ParseTuple() or PyArg_UnpackTuple().

  2. METH_KEYWORDS
    Can only be used in certain combinations with other flags: METH_VARARGS | METH_KEYWORDS, METH_FASTCALL | METH_KEYWORDS, and METH_METHOD | METH_FASTCALL | METH_KEYWORDS.

More information can be found in the official documentation.
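The difference between these flags is visible from Python. A function registered without METH_KEYWORDS has no way to receive keyword arguments, so the interpreter rejects them before the C code runs. The sketch uses math.sqrt from the standard library purely as a convenient built-in implemented in C without keyword support (as far as I know it is registered with METH_O, a single-argument convention not covered above); the helper name is my own.

```python
import math


def call_with_keyword() -> str:
    """Try to pass a keyword argument to a C function that accepts none."""
    try:
        math.sqrt(x=4.0)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "no error"


print(math.sqrt(4.0))        # the positional call works
print(call_with_keyword())   # the keyword call is rejected with TypeError
```

To accept keyword arguments, the C function would have to be registered with METH_VARARGS | METH_KEYWORDS and take a third parameter for the keyword dictionary.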

Practical Example

Let's create a few functions to explore the Python/C API through examples.

The environment setup guide is at the top of the article.

First, let's create a small pure C file libbasemath.c in the ext/src directory:

double calculate_discriminant(double a, double b, double c) {
    double discriminant = b * b - 4 * a * c;

    return discriminant;
}

unsigned long factorial(long n) {
    /* 0! = 1! = 1; this also stops the recursion for negative input */
    if (n <= 1)
        return 1;

    return (unsigned long)n * factorial(n-1);
}

unsigned long cfactorial_sum(char num_chars[]) {
    unsigned long fact_num;
    unsigned long sum = 0;

    for (int i = 0; num_chars[i]; i++) {
        int ith_num = num_chars[i] - '0';
        fact_num = factorial(ith_num);
        sum = sum + fact_num;
    }
    return sum;
}

unsigned long ifactorial_sum(long nums[], int size) {
    unsigned long fact_num;
    unsigned long sum = 0;
    for (int i = 0; i < size; i++) {
        fact_num = factorial(nums[i]);
        sum += fact_num;
    }
    return sum;
}

In this file, you can see the calculation of the discriminant of a quadratic equation and factorial computation.
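One caveat of the C version is that unsigned long has a fixed width, so factorials overflow quickly, whereas Python integers grow as needed. The sketch below finds the largest n whose factorial still fits in a given number of bits; the helper name largest_exact_factorial is my own. Which width applies depends on the platform: unsigned long is typically 64 bits on Linux and macOS and 32 bits on Windows.

```python
import math


def largest_exact_factorial(bits: int) -> int:
    """Return the largest n such that n! fits in an unsigned integer of `bits` bits."""
    limit = 2 ** bits - 1
    n = 0
    while math.factorial(n + 1) <= limit:
        n += 1
    return n


print(largest_exact_factorial(32))  # 12
print(largest_exact_factorial(64))  # 20
```

For the digit-based cfactorial_sum this is rarely a concern, since each term is at most 9! = 362880, but ifactorial_sum silently wraps around once a list element exceeds these limits.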

Now let's create the header file libbasemath.h:

#ifndef LIBBASEMATH_H
#define LIBBASEMATH_H

double calculate_discriminant(double a, double b, double c);
unsigned long cfactorial_sum(char num_chars[]);
unsigned long ifactorial_sum(long nums[], int size);
unsigned long factorial(long n);

#endif // LIBBASEMATH_H

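Let's break down the code: the #ifndef / #define / #endif lines form an include guard, which keeps the header from being processed twice in one translation unit, and the four lines in between are prototypes of the functions defined in libbasemath.c.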

Now let's work on lnpy_basemath.c, which will contain the Python/C wrappers for the functions:

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <stdio.h>
#include <libbasemath.h>

/**
 * @brief      Calculates the discriminant.
 *
 * @param      self  The object
 * @param      args  The arguments
 *
 * @return     The discriminant.
 */
static PyObject
*Py_calculate_discriminant(PyObject *self, PyObject *args) {
    double a, b, c;

    if (!PyArg_ParseTuple(args, "ddd", &a, &b, &c)) {
        return NULL;
    }

    double discriminant = calculate_discriminant(a, b, c);

    return Py_BuildValue("d", discriminant);
}

static PyObject
*cFactorial_sum(PyObject *self, PyObject *args) {
    char *char_nums;
    if (!PyArg_ParseTuple(args, "s", &char_nums)) {
        return NULL;
    }

    unsigned long fact_sum;
    fact_sum = cfactorial_sum(char_nums);

    /* fact_sum is an unsigned long, so the "i" (int) format would be wrong here */
    return PyLong_FromUnsignedLong(fact_sum);
}

static PyObject
*iFactorial_sum(PyObject *self, PyObject *args) {
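    PyObject *lst;
    /* "O!" checks that the argument really is a list */
    if (!PyArg_ParseTuple(args, "O!", &PyList_Type, &lst)) {
        return NULL;
    }

    Py_ssize_t n = PyList_Size(lst);

    /* Heap allocation instead of a variable-length array:
       safe for empty and for very long lists */
    long *nums = PyMem_Malloc(n * sizeof(long));
    if (nums == NULL) {
        return PyErr_NoMemory();
    }

    for (Py_ssize_t i = 0; i < n; i++) {
        PyObject *item = PyList_GetItem(lst, i);  /* borrowed reference */
        nums[i] = PyLong_AsLong(item);
        /* -1 is ambiguous, so ask whether an exception was set */
        if (nums[i] == -1 && PyErr_Occurred()) {
            PyMem_Free(nums);
            return NULL;
        }
    }

    unsigned long fact_sum;
    fact_sum = ifactorial_sum(nums, (int)n);
    PyMem_Free(nums);

    /* The result is an unsigned long, so "i" (int) would be the wrong format */
    return PyLong_FromUnsignedLong(fact_sum);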
}


static PyMethodDef LNPYMethods[] = { { "calculate_discriminant", Py_calculate_discriminant, METH_VARARGS,
                                       "Calculate the discriminant by formula: D = b^2 - 4ac" },
                                     { "ifactorial_sum", iFactorial_sum, METH_VARARGS,
                                       "Calculate the iFactorial sum (from list of ints)" },
                                     { "cfactorial_sum", cFactorial_sum, METH_VARARGS,
                                       "Calculate the cFactorial sum (from digits in string of numbers)" },
                                     { NULL, NULL, 0, NULL } };

static struct PyModuleDef lnpy_basemath = { PyModuleDef_HEAD_INIT, "math", "Libnumerixpy - BaseMath", -1, LNPYMethods };

PyMODINIT_FUNC PyInit_basemath(void) { return PyModule_Create(&lnpy_basemath); }

In the first lines, we define macros and include the necessary header files:

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <stdio.h>
#include <libbasemath.h>

We also create static wrapper functions that return PyObject *: Py_calculate_discriminant, cFactorial_sum, iFactorial_sum.

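Let's examine several interesting points. Py_BuildValue builds a Python object from C values according to a format string, much like printf builds a string: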

Py_BuildValue("s", "A") // "A"
Py_BuildValue("i", 10) // 10
Py_BuildValue("(iii)", 1, 2, 3) // (1, 2, 3)
Py_BuildValue("{si,si}", "a", 4, "b", 9) // {"a": 4, "b": 9}
Py_BuildValue("") // None

At the very end, we work with the module: using the PyMethodDef type, we create an array with n+1 elements, where n is the number of functions and the extra element is the all-NULL sentinel that marks the end of the array.

static PyMethodDef LNPYMethods[] = { { "calculate_discriminant", Py_calculate_discriminant, METH_VARARGS,
                                       "Calculate the discriminant by formula: D = b^2 - 4ac" },
                                     { "ifactorial_sum", iFactorial_sum, METH_VARARGS,
                                       "Calculate the iFactorial sum (from list of ints)" },
                                     { "cfactorial_sum", cFactorial_sum, METH_VARARGS,
                                       "Calculate the cFactorial sum (from digits in string of numbers)" },
                                     { NULL, NULL, 0, NULL } };

To call the methods defined in your module, you first need to inform the Python interpreter about them. You can do this using PyMethodDef. This is a structure with 4 members representing a single method in your module.

In practice, a Python C extension module usually exposes several methods to the interpreter. That's why you define an array of PyMethodDef structures.

METH_VARARGS is a flag that tells the interpreter that the function takes two arguments of type PyObject *: self and a tuple of positional arguments.
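The PyMethodDef fields can be inspected from Python on any function implemented in C. The sketch below uses math.sqrt from the standard library as a stand-in, since it does not require building the extension; the same attributes should be available on the functions exported by libnumerixpy.

```python
import math

func = math.sqrt

# The first PyMethodDef member (the name) is exposed as __name__.
print(func.__name__)             # sqrt

# The fourth member (the docstring) is exposed as __doc__.
print(func.__doc__)

# Functions defined this way are of the built-in function type.
print(type(func).__name__)       # builtin_function_or_method

# For module-level functions, the `self` argument is the module object.
print(func.__self__ is math)     # True
```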

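Using calculate_discriminant as an example:

  1. "calculate_discriminant" is the name under which the function is visible from Python.
  2. Py_calculate_discriminant is the C function that implements it.
  3. METH_VARARGS is the calling-convention flag.
  4. The last string is the function's docstring.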

The last lines are module initialization:

static struct PyModuleDef lnpy_basemath = { PyModuleDef_HEAD_INIT, "math", "Libnumerixpy - BaseMath", -1, LNPYMethods };

PyMODINIT_FUNC PyInit_basemath(void) { return PyModule_Create(&lnpy_basemath); }

Just as PyMethodDef contains information about the methods in your Python extension module, the PyModuleDef structure contains information about the module itself. It's not an array of structures but a single structure used to define the module.

These lines will allow us to use the construct from libnumerixpy.math.basemath import calculate_discriminant, cfactorial_sum, ifactorial_sum.

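The first line is a PyModuleDef structure where we specify the module name, its docstring, the size of the per-module state (-1 means the module keeps its state in global variables and does not support sub-interpreters), and the method list. The last line creates and initializes the module.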
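Since the build script shown earlier only prints an error when compilation fails, the compiled module may be missing on some installations. A common pattern (a sketch, not something the library is confirmed to do) is to try the C implementation first and fall back to pure Python. The fallback function below is my own illustration.

```python
try:
    # Fast path: the compiled extension described in this article.
    from libnumerixpy.math.basemath import calculate_discriminant
except ImportError:
    # Fallback: the extension is not installed or failed to build.
    def calculate_discriminant(a: float, b: float, c: float) -> float:
        """Pure-Python discriminant of a*x^2 + b*x + c."""
        return b * b - 4 * a * c


print(calculate_discriminant(1.0, -3.0, 1.0))  # 5.0
```

The calling code stays the same in both cases; only the speed differs.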


Let's look at another, simpler example:

#define PY_SSIZE_T_CLEAN
#include <Python.h>

/**
 * @brief      Execute a shell command
 *
 * @param      self  The object
 * @param      args  The arguments
 *
 * @return     status code
 */
static PyObject
*lnpy_exec_system(PyObject *self, PyObject *args)
{
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command)) {
        return NULL;
    }
    sts = system(command);

    return PyLong_FromLong(sts);
}

static PyMethodDef LNPYMethods[] = { { "lnpy_exec_system", lnpy_exec_system, METH_VARARGS,
                                       "Execute a shell command." },
                                     { NULL, NULL, 0, NULL } };

static struct PyModuleDef lnpy_base = { PyModuleDef_HEAD_INIT, "base", NULL, -1, LNPYMethods };

PyMODINIT_FUNC PyInit_base(void) { return PyModule_Create(&lnpy_base); }

Here it's exactly the same, but there's only one function — for executing a system command (import construct: from libnumerixpy.base import lnpy_exec_system).


You may have noticed functions from the Python/C API such as PyLong_FromLong, PyArg_ParseTuple, and others. I will now discuss them in more detail.

PyArg_ParseTuple() parses the arguments you receive from your Python program into local variables:

    const char *command;

    if (!PyArg_ParseTuple(args, "s", &command)) {
        return NULL;
    }

It takes the argument tuple, a format string describing the expected types, and pointers to the variables that receive the converted values. The format units resemble those of printf. You can find the format specifications in the official documentation.

Now, PyLong_FromLong:

PyLong_FromLong() converts a C long into a new Python integer object (a PyLongObject, returned as PyObject *). You can find it at the very end of the function:

    return PyLong_FromLong(sts);

Benchmark

Let's compare the execution speed of pure Python functions and C extensions.

Let's write our pure Python functions for factorial sum from a list and a string, as well as discriminant calculation:

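def pure_calculate_discriminant(a: float, b: float, c: float) -> float:
    d = b * b - 4 * a * c
    return d

def fac(n: int) -> int:
    # 0! = 1! = 1, matching the base case of the C version
    if n <= 1:
        return 1
    return fac(n - 1) * n

def pure_cfactorial_sum(digits: str) -> int:
    # Sum of the factorials of the digits in a string such as "12345"
    fac_sum = 0
    for ch in digits:
        fac_sum += fac(int(ch))
    return fac_sum

def pure_ifactorial_sum(numbers: list) -> int:
    # Sum of the factorials of a list of integers
    fac_sum = 0
    for n in numbers:
        fac_sum += fac(int(n))
    return fac_sum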

Now let's create the benchmarking code:

from purepython import pure_calculate_discriminant, pure_cfactorial_sum, pure_ifactorial_sum
from libnumerixpy.math import calculate_discriminant, cfactorial_sum, ifactorial_sum
import timeit
from functools import wraps

def timing(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        start_time = timeit.default_timer()
        result = f(*args, **kwargs)
        elapsed_time = timeit.default_timer() - start_time
        return result, elapsed_time
    return wrapper

@timing
def pure_python():
    d = pure_calculate_discriminant(1.0, -3.0, 1.0)
    assert d == 5.0
    assert pure_cfactorial_sum("12345") == 153
    assert pure_ifactorial_sum([1,2,3,4,5]) == 153

@timing
def c_extension():
    d = calculate_discriminant(1.0, -3.0, 1.0)
    assert d == 5.0
    assert cfactorial_sum("12345") == 153
    assert ifactorial_sum([1,2,3,4,5]) == 153

def ppure_python():
    d = pure_calculate_discriminant(1.0, -3.0, 1.0)
    assert d == 5.0
    assert pure_cfactorial_sum("12345") == 153
    assert pure_ifactorial_sum([1,2,3,4,5]) == 153

def pc_extension():
    d = calculate_discriminant(1.0, -3.0, 1.0)
    assert d == 5.0
    assert cfactorial_sum("12345") == 153
    assert ifactorial_sum([1,2,3,4,5]) == 153

_, elapsed_time = pure_python()
_, elapsed_time2 = c_extension()

print(f'[PURE PYTHON] Elapsed time: {elapsed_time}')
print(f'[C EXTENSION] Elapsed time: {elapsed_time2}')

execution_time = timeit.timeit(ppure_python, number=1000)  # number of function runs
print("Average execution time for pure_python:", execution_time)

execution_time2 = timeit.timeit(pc_extension, number=1000)
print("Average execution time for c_extension:", execution_time2)

I got the following output:

[PURE PYTHON] Elapsed time: 0.0001080130004993407
[C EXTENSION] Elapsed time: 2.0455998310353607e-05
Average execution time for pure_python: 0.027656054000544827
Average execution time for c_extension: 0.0061371510000753915

Converting from exponential notation, the single call through the C extension took about 0.0000205 seconds against 0.000108 seconds for pure Python, so it is roughly 5.3 times faster. Keep in mind that a single call is a noisy measurement.

In the second case, note that timeit.timeit returns the total time for all 1,000 runs rather than an average per run, whatever the print label says: 0.00614 seconds against 0.0277 seconds, a speedup of about 4.5x.

If we increase the number of runs to 100,000:

Average execution time for pure_python: 2.7358566370003246
Average execution time for c_extension: 0.5545017490003374

And for one million runs:

Average execution time for pure_python: 39.040380934000495
Average execution time for c_extension: 5.6175804949998565

For 100,000 runs the speedup is about 4.9x, and for one million runs it reaches almost 7x (39.04 / 5.62 ≈ 6.9)!
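The ratios behind these statements can be checked with a few lines of Python. The numbers are copied from the output above; the function name speedup is my own. Because timeit.timeit returns the total time for all runs, the per-call time is the total divided by the number of runs.

```python
# (runs, pure-Python seconds, C-extension seconds), copied from the output above.
measurements = [
    (1, 0.0001080130004993407, 2.0455998310353607e-05),
    (1_000, 0.027656054000544827, 0.0061371510000753915),
    (100_000, 2.7358566370003246, 0.5545017490003374),
    (1_000_000, 39.040380934000495, 5.6175804949998565),
]


def speedup(pure_seconds: float, ext_seconds: float) -> float:
    """How many times faster the extension is than pure Python."""
    return pure_seconds / ext_seconds


for runs, pure_s, ext_s in measurements:
    per_call_us = ext_s / runs * 1e6  # timeit returns the total, not the mean
    print(f"{runs:>9} runs: speedup {speedup(pure_s, ext_s):.2f}x, "
          f"C extension {per_call_us:.2f} us per call")
```

The speedups fall between roughly 4.5x and 7x, so "about 5x" is a fair summary for most of the runs.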

For a more reliable figure, let's compute the average time over 100 repetitions of 10,000 runs each:

sum_1 = []
sum_2 = []

average_1 = 0
average_2 = 0

for i in range(100):
    execution_time = timeit.timeit(ppure_python, number=10000)
    print("Average execution time for pure_python:", execution_time)
    sum_1.append(execution_time)

    execution_time2 = timeit.timeit(pc_extension, number=10000)
    print("Average execution time for c_extension:", execution_time2)
    sum_2.append(execution_time2)

average_1 = sum(sum_1) / len(sum_1)
average_2 = sum(sum_2) / len(sum_2)

print(f'Average pure_python: {average_1}')
print(f'Average c_extension: {average_2}')

I got the following output:

Average pure_python: 0.32199167845032206
Average c_extension: 0.06911946617015928

Again, roughly a 5x speedup (about 4.7x).
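The manual loop above can be written more compactly with timeit.repeat, which returns one total per repetition. The function being timed here, workload, is a placeholder of my own so that the snippet runs without the library installed; in the article's setup you would pass ppure_python and pc_extension instead. The timeit documentation suggests looking at the minimum as well, since higher values usually reflect interference from other processes.

```python
import statistics
import timeit


def workload() -> int:
    """Placeholder for the benchmarked function (hypothetical stand-in)."""
    return sum(i * i for i in range(100))


REPEATS = 5      # the article uses 100
NUMBER = 1_000   # the article uses 10,000

# One total time (in seconds) per repetition, each covering NUMBER calls.
totals = timeit.repeat(workload, repeat=REPEATS, number=NUMBER)

print(f"mean total: {statistics.mean(totals):.6f} s")
print(f"best total: {min(totals):.6f} s")
print(f"best per call: {min(totals) / NUMBER * 1e6:.3f} us")
```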

Conclusion

C extensions are a useful tool. Suppose you need to perform a series of complex computations, whether it's a cryptographic algorithm, machine learning, or processing large volumes of data. C extensions can take a share of the Python load and speed up your application.

Decided to create a low-level interface or work directly with memory from Python? C extensions are at your service, provided you know how to work with raw pointers.

Thinking about improving an existing but poorly performing Python application without rewriting it entirely in another language? There is a solution — C extensions.

Or maybe you're simply a committed advocate of optimization, striving to maximize the execution speed of your code without giving up high-level abstractions for networking, GUI, etc.

In this article, we examined the Python/C API and wrote several of our own extensions. If you believe I made a mistake or expressed something incorrectly — please write in the comments.

We have clearly seen how much program speed can be increased at the cost of readability.

The GitHub repository with all source code and tests is available at this link.

You can install my library via pip:

pip3 install libnumerixpy
