# `cstructimpl`

> A Python package for translating C `struct`s into Python classes.

[![PyPI version](https://img.shields.io/pypi/v/cstructimpl.svg)](https://pypi.org/project/cstructimpl/)
[![License](https://img.shields.io/github/license/Brendon-Mendicino/cstructimpl.svg)](https://github.com/Brendon-Mendicino/cstructimpl/blob/master/LICENSE)
[![Python Versions](https://img.shields.io/pypi/pyversions/cstructimpl.svg)](https://pypi.org/project/cstructimpl/)

---

## Quick Start

Install from PyPI:

```bash
pip install cstructimpl
```

Define your struct and parse raw bytes:

```pycon
>>> from cstructimpl import *
>>> class Info(CStruct):
...     age: Annotated[int, CInt.U8]
...     height: Annotated[int, CInt.U8]
...
>>> class Person(CStruct):
...     info: Info
...     name: Annotated[str, CStr(6)]
...
>>> Person.c_decode(bytes([18, 170]) + b"Pippo\x00")
Person(info=Info(age=18, height=170), name='Pippo')

```

---

## Introduction

`cstructimpl` makes working with binary data in Python simple and intuitive.  
By subclassing `CStruct`, you can define Python classes that map directly to C-style `struct`s and parse raw bytes into fully typed objects.

No manual parsing, no boilerplate — just define your struct and let the library do the heavy lifting.

---

## Type System

At the core of the library is the `BaseType` protocol, which defines how types behave in the C world:

```python
class BaseType(Protocol[T]):

    def c_size(self) -> int: ...
    def c_align(self) -> int: ...

    def c_decode(
        self,
        raw: bytes,
        *,
        is_little_endian: bool = True,
    ) -> T | None: ...

    def c_encode(
        self,
        data: T,
        *,
        is_little_endian: bool = True,
    ) -> bytes: ...
```

Any class that follows this protocol can act as a `BaseType`, controlling its own parsing, size, and alignment.
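
As a rough illustration of what "following the protocol" means, the sketch below defines a standalone class with the same four methods. It does not import `cstructimpl`; the name `U24` and the choice of a 3-byte unsigned integer are assumptions made only for this example.

```python
class U24:
    """Hypothetical 3-byte unsigned integer exposing the four protocol methods."""

    def c_size(self) -> int:
        return 3

    def c_align(self) -> int:
        return 1

    def c_decode(self, raw: bytes, *, is_little_endian: bool = True) -> int:
        byteorder = "little" if is_little_endian else "big"
        # Only the first c_size() bytes belong to this value.
        return int.from_bytes(raw[: self.c_size()], byteorder=byteorder, signed=False)

    def c_encode(self, data: int, *, is_little_endian: bool = True) -> bytes:
        byteorder = "little" if is_little_endian else "big"
        return data.to_bytes(self.c_size(), byteorder=byteorder, signed=False)


u24 = U24()
encoded = u24.c_encode(0x123456)  # little-endian by default
decoded = u24.c_decode(encoded)   # round-trips back to the same integer
```

Because `BaseType` is a protocol, the class only has to provide matching methods; the keyword-only `is_little_endian` flag mirrors the signature shown above.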

When parsing a struct:

- If a field type is itself a `BaseType`, parsing happens automatically.
- Otherwise, annotate the field with `Annotated[..., BaseType]` to tell the parser how to interpret it.
- Types such as `int` are converted to a default `BaseType` when no annotation is provided. To change this behavior, override the corresponding entries in the `cstructimpl.c_lib.DEFAULT_TYPE_TO_BASETYPE` dictionary.

The library comes with a set of ready-to-use type definitions that cover the majority of C primitive types.

| Class / Type       | Description                                                                                                                     |
| ------------------ | ------------------------------------------------------------------------------------------------------------------------------- |
| **`BaseType[T]`**  | Protocol that defines the interface for any encodable/decodable C-compatible type.                                              |
| **`HasBaseType`**  | Protocol for classes that return their own associated `BaseType`.                                                               |
| **`GetType`**      | Wrapper calling `c_get_type()` on classes implementing `HasBaseType`. Enables automatic size, alignment, decode, encode access. |
| **`CInt`**         | Enum covering signed/unsigned C integer types (I8/U8 → I128/U128).                                                              |
| **`CBool`**        | Boolean BaseType with a C-compatible single-byte representation.                                                                |
| **`CFloat`**       | Enum of IEEE‑754‑compliant floating‑point formats (`F32`, `F64`).                                                               |
| **`CArray[T]`**    | Generic BaseType for fixed‑length arrays of a given element type.                                                               |
| **`CPadding`**     | Represents unused/padding bytes between struct fields.                                                                          |
| **`CStr`**         | C‑style null‑terminated string of fixed max length.                                                                             |
| **`CMapper[T,U]`** | Adapts between a `BaseType[U]` and custom Python type `T`. Useful for enums or custom conversions.                              |

---

## Examples

Here are a few practical examples showing how `cstructimpl` works in real-world scenarios.

### Basic Deserialization

Define a simple struct with two fields:

```pycon
>>> class Point(CStruct):
...     x: Annotated[int, CInt.U8]
...     y: Annotated[int, CInt.U8]
...
>>> Point.c_size()
2
>>> Point.c_align()
1
>>> Point.c_decode(bytes([1, 2]))
Point(x=1, y=2)

```

---

### Serializing a Class

Create a class instance and serialize it to raw bytes:

```pycon
>>> class Rect(CStruct):
...     width: Annotated[int, CInt.U8]
...     height: Annotated[int, CInt.U8] = 10
...
>>> rect = Rect(2)
>>> list(rect.c_encode())
[2, 10]

```

---

### Nested Structs

You can embed structs inside other structs:

```pycon
>>> class Dimensions(CStruct):
...     width: Annotated[int, CInt.U8]
...     height: Annotated[int, CInt.U8]
...
>>> class Rectangle(CStruct):
...     id: Annotated[int, CInt.U16]
...     dims: Dimensions
...
>>> Rectangle.c_size()
4
>>> Rectangle.c_align()
2
>>> Rectangle.c_decode(bytes([1, 0, 2, 3]))
Rectangle(id=1, dims=Dimensions(width=2, height=3))

```
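
The size of 4 and alignment of 2 reported above follow the usual C layout rules: each field starts at a multiple of its own alignment, and the struct takes the largest alignment of its fields. The sketch below applies those rules to plain `(size, align)` pairs. The helper name `c_layout` is hypothetical, and this is not necessarily how `cstructimpl` computes layouts internally.

```python
def c_layout(fields: list[tuple[int, int]]) -> tuple[list[int], int, int]:
    """Return (offsets, size, alignment) for fields given as (size, align) pairs."""
    offsets = []
    offset = 0
    struct_align = 1
    for size, align in fields:
        # Round the running offset up to the next multiple of the field alignment.
        offset = (offset + align - 1) // align * align
        offsets.append(offset)
        offset += size
        struct_align = max(struct_align, align)
    # Add trailing padding so that consecutive structs in an array stay aligned.
    total = (offset + struct_align - 1) // struct_align * struct_align
    return offsets, total, struct_align


# Rectangle: a U16 (size 2, align 2) followed by Dimensions (size 2, align 1).
rectangle_layout = c_layout([(2, 2), (2, 1)])
```

Here `rectangle_layout` holds the field offsets `[0, 2]`, a total size of 4, and an alignment of 2, matching `Rectangle.c_size()` and `Rectangle.c_align()`.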

---

### Strings in Structs

C-style null-terminated strings with a fixed maximum length are supported:

```pycon
>>> class Message(CStruct):
...     length: Annotated[int, CInt.U16]
...     text: Annotated[str, CStr(6)]
...
>>> raw = bytes([5, 0]) + b"Hello\x00"
>>> Message.c_decode(raw)
Message(length=5, text='Hello')

```
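
For comparison, the same bytes can be unpacked by hand with the standard-library `struct` module. This sketch assumes ASCII text, since the encoding used by `CStr` is not stated here.

```python
import struct

raw = bytes([5, 0]) + b"Hello\x00"

# "<H6s": a little-endian U16 followed by a 6-byte character buffer.
length, buffer = struct.unpack("<H6s", raw)

# Keep only the bytes before the first null terminator (ASCII is an assumption).
text = buffer.split(b"\x00", 1)[0].decode("ascii")
```

The buffer always occupies 6 bytes in the struct; the null terminator only marks where the text ends inside it.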

---

### Enums with Autocast

Automatically cast numeric values into Python `Enum`s:

```pycon
>>> class Mood(IntEnum):
...     HAPPY = 0
...     SAD = 1
...
>>> class Person(CStruct):
...     age: Annotated[int, CInt.U16]
...     mood: Annotated[Mood, CInt.U8, Autocast()]
...
>>> raw = bytes([18, 0, 1, 0])
>>> Person.c_decode(raw)
Person(age=18, mood=<Mood.SAD: 1>)

```

---

### Arrays of Structs

Define fixed-size arrays of structs inside another struct:

```pycon
>>> class Item(CStruct, align=2):
...     a: Annotated[int, CInt.U8]
...     b: Annotated[int, CInt.U8]
...     c: Annotated[int, CInt.U8]
...
>>> class ItemList(CStruct):
...     items: Annotated[list[Item], CArray(Item, 3)]
...
>>> data = bytes(range(1, 13))  # 3 items x 4 bytes each
>>> parsed = ItemList.c_decode(data)
>>> parsed == ItemList([
...     Item(1, 2, 3),
...     Item(5, 6, 7),
...     Item(9, 10, 11),
... ])
True

```
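
With `align=2`, each `Item` occupies 4 bytes instead of 3 (as the comment in the example notes), so bytes 4, 8, and 12 of the input are padding and never reach a field. A stdlib-only sketch of the same stride, using `struct.iter_unpack` rather than `cstructimpl`:

```python
import struct

data = bytes(range(1, 13))

# "<3Bx": three unsigned bytes followed by one padding byte, 4 bytes per item.
items = list(struct.iter_unpack("<3Bx", data))
```

Each tuple in `items` corresponds to one `Item(a, b, c)` in the parsed list above.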

### Custom BaseType

> > Hey! Is there a type that serializes a hash map of lists of structs of ...?
>
> > Yeah, sure there is! You can do it yourself!

`cstructimpl` lets you define your own `BaseType` implementations to handle any kind of data that the built-in types do not cover.

For example, here's a custom type that interprets a raw integer as a **Unix timestamp**, returning a Python `datetime` object:

```pycon
>>> from datetime import datetime
>>> class UnixTimestamp(BaseType[datetime]):
...     def c_size(self) -> int:
...         return 4
...
...     def c_align(self) -> int:
...         return 4
...
...     def c_decode(self, raw: bytes, *, is_little_endian: bool = True) -> datetime:
...         byteorder = "little" if is_little_endian else "big"
...         ts = int.from_bytes(raw, byteorder=byteorder, signed=False)
...         return datetime.fromtimestamp(ts)
...
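...     def c_encode(self, data: datetime, *, is_little_endian: bool = True) -> bytes:
...         byteorder = "little" if is_little_endian else "big"
...         return int(data.timestamp()).to_bytes(4, byteorder=byteorder, signed=False)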
...
>>> class LogEntry(CStruct):
...     timestamp: Annotated[datetime, UnixTimestamp()]
...     level: Annotated[int, CInt.U8]
...
>>> LogEntry.c_decode(bytes([55, 0, 0, 0, 3, 0, 0, 0]))
LogEntry(timestamp=datetime.datetime(1970, ..., 55), level=3)

```
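
Note that `datetime.fromtimestamp(ts)` without a `tz` argument returns local time, so the decoded value depends on the machine's timezone. If that matters, a timezone-aware conversion is one option. The sketch below shows it as two standalone functions; their names are hypothetical and they are not part of `cstructimpl`.

```python
from datetime import datetime, timezone


def decode_unix_timestamp(raw: bytes, *, is_little_endian: bool = True) -> datetime:
    byteorder = "little" if is_little_endian else "big"
    ts = int.from_bytes(raw[:4], byteorder=byteorder, signed=False)
    # Passing tz makes the result independent of the local timezone.
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def encode_unix_timestamp(data: datetime, *, is_little_endian: bool = True) -> bytes:
    byteorder = "little" if is_little_endian else "big"
    # timestamp() returns seconds since the epoch; sub-second precision is dropped.
    return int(data.timestamp()).to_bytes(4, byteorder=byteorder, signed=False)


moment = decode_unix_timestamp(bytes([55, 0, 0, 0]))
round_trip = encode_unix_timestamp(moment)
```

The same bodies could be placed in the `c_decode` and `c_encode` methods of a custom type like `UnixTimestamp`.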

---

## Bit-Fields

Bit-fields are especially useful in networking, where naming individual
bit ranges makes packet headers much easier to work with. `cstructimpl`
can reinterpret bit ranges through its own type system, so tools such as
autocasting and mapping work on bit-fields as well.

The following header combines bit-fields with an enumeration and a set
of flags:

```pycon
>>> from enum import IntFlag, IntEnum
>>> class Flags(IntFlag):
...     ACK = 1 << 0
...     SYN = 1 << 1
...     URG = 1 << 2
...
>>> class State(IntEnum):
...     PENDING = 0
...     ERROR = 1
...     SUCCESS = 2
...
>>> class Header(CStruct):
...     port: Annotated[int, CInt.U8, BitField(4)]
...     id: Annotated[int, CInt.U8, BitField(4)]
...     state: Annotated[State, CInt.U8, BitField(2), Autocast()]
...     flags: Annotated[Flags, CInt.U8, BitField(3), Autocast()]
...     len: Annotated[int, CInt.U8]
...
>>> raw = 0x101A21.to_bytes(3, byteorder="little", signed=False)
>>> Header.c_decode(raw)
Header(port=1, id=2, state=<State.SUCCESS: 2>, flags=<Flags.SYN|URG: 6>, len=16)

```
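
To see where each value comes from, the same bytes can be taken apart with plain shifts and masks. This sketch assumes fields are allocated starting from the least significant bit of each byte, which is what the output above implies; `ident` is used instead of `id` only to avoid shadowing the Python builtin.

```python
raw = 0x101A21.to_bytes(3, byteorder="little", signed=False)

first, second, length = raw  # 0x21, 0x1A, 0x10

port = first & 0x0F           # bits 0-3 of the first byte
ident = (first >> 4) & 0x0F   # bits 4-7 of the first byte
state = second & 0x03         # bits 0-1 of the second byte
flags = (second >> 2) & 0x07  # bits 2-4 of the second byte
```

The two 4-bit fields share the first byte, the 2-bit and 3-bit fields share the second, and `len` occupies a full byte of its own.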

---

## Autocast

Sometimes raw numeric values carry semantic meaning. In C, this is usually handled with `enum`s.  
With `cstructimpl`, you can automatically reinterpret values into enums (or other types) using `Autocast`.

```pycon
>>> class ResultType(IntEnum):
...     OK = 0
...     ERROR = 1
...
>>> class Person(CStruct):
...     kind: Annotated[ResultType, CInt.U8, Autocast()]
...     error_code: Annotated[int, CInt.I32]
...

```

This is equivalent to writing a custom builder:

```pycon
>>> class ResultType(IntEnum):
...     OK = 0
...     ERROR = 1
...
>>> class Person(CStruct):
...     kind: Annotated[ResultType, CMapper(CInt.U8, ResultType, int)]
...     error_code: Annotated[int, CInt.I32]
...

```

The `Autocast` form is much simpler and less error-prone.
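
Conceptually, both forms decode the underlying integer and then convert it to the enum, and reverse the conversion when encoding. The stdlib-only sketch below shows that idea, assuming the two callables passed to `CMapper` act as the decode-side and encode-side conversions. The function names are hypothetical and not part of `cstructimpl`.

```python
from enum import IntEnum


class ResultType(IntEnum):
    OK = 0
    ERROR = 1


def decode_kind(raw: bytes) -> ResultType:
    # Decode the underlying U8 first, then cast the integer to the enum.
    value = int.from_bytes(raw[:1], byteorder="little", signed=False)
    return ResultType(value)


def encode_kind(kind: ResultType) -> bytes:
    # Convert the enum back to its integer value before encoding it as a U8.
    return int(kind).to_bytes(1, byteorder="little", signed=False)


kind = decode_kind(b"\x01")
```

In plain Python, a value with no matching member, such as `ResultType(7)`, raises `ValueError`; how the library reports that case is not covered here.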

---

## Features

- Define Python classes that map directly to C `struct`s
- Parse raw bytes into typed objects with a single method call
- Serialize class instances to raw bytes using the built-in type system
- Built-in type system for common C primitives
- Support for nested structs
- Flexible extension via the `BaseType` protocol

---

## Use Cases

- Parsing binary network protocols
- Working with binary file formats
- Interfacing with C libraries and data structures
- Replacing boilerplate parsing code with clean, type-safe classes

---

## Documentation

More detailed usage examples and advanced topics are available in the [documentation](https://github.com/Brendon-Mendicino/cstructimpl/wiki).

---

## Contributing

Contributions are welcome!

If you'd like to improve `cstructimpl`, please open an issue or submit a pull request on [GitHub](https://github.com/Brendon-Mendicino/cstructimpl).

---

## License

This project is licensed under the terms of the [Apache-2.0 License](https://github.com/Brendon-Mendicino/cstructimpl/blob/main/LICENSE).
