Avro DBO 🚀

Avro DBO is a robust Python library designed for handling Apache Avro schemas. It facilitates seamless data serialization and schema management, making it ideal for data engineering pipelines and stream processing applications.

✨ Features

  • 🏗️ Schema-First Development: Generate Python classes from Avro schemas.
  • 🔄 Full Type Support: Supports Avro logical types such as decimal and timestamp-millis, as well as complex types such as arrays and enums.
  • 🛠️ Custom Serialization: Offers flexible serializers and deserializers.
  • 🌐 Schema Registry Integration: Integrates natively with Confluent Schema Registry.
  • 🔒 Type Safety: Ensures full static type checking.
  • ⚡ High Performance: Optimized for high-load production environments.

🚀 Quick Start

  1. Install from PyPI

pip install avro-dbo

Example Schemas for Each Avro Logical Type

  • Decimal Type

```python
from attrs import field, define
from decimal import Decimal

# avro_schema: decorator provided by Avro DBO (import not shown).
@define
@avro_schema
class DecimalModel:
    amount: Decimal = field(
        default=Decimal("100.00"),
        metadata={
            "logicalType": "decimal",
            "precision": 10,
            "scale": 2,
        },
    )
```
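
To make `precision` and `scale` concrete: the Avro specification stores a decimal as the big-endian two's-complement bytes of its unscaled integer, so `100.00` at scale 2 is stored as the integer `10000`. The following is a minimal standard-library sketch of that encoding. The helper names `decimal_to_avro_bytes` and `avro_bytes_to_decimal` are hypothetical and are not part of Avro DBO, whose own serializers may work differently.

```python
from decimal import Decimal


def decimal_to_avro_bytes(value: Decimal, scale: int) -> bytes:
    """Encode a Decimal as big-endian two's-complement bytes (hypothetical helper)."""
    # Shift the decimal point right by `scale` digits to get the unscaled integer.
    # Digits beyond the scale are rounded by to_integral_value().
    unscaled = int(value.scaleb(scale).to_integral_value())
    # A byte count large enough to hold the signed integer.
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def avro_bytes_to_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode the bytes back into a Decimal with the given scale (hypothetical helper)."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


encoded = decimal_to_avro_bytes(Decimal("100.00"), scale=2)
decoded = avro_bytes_to_decimal(encoded, scale=2)
print(encoded.hex(), decoded)
```

The scale is not stored in the bytes themselves; a reader recovers it from the schema, which is why it has to be declared in the field metadata.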

  • Timestamp (millis) Type

```python
from attrs import field, define
import datetime

# avro_schema: decorator provided by Avro DBO (import not shown).
@define
@avro_schema
class TimestampModel:
    created_at: datetime.datetime = field(
        metadata={
            "logicalType": "timestamp-millis",
        },
    )
```
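
In the Avro specification, `timestamp-millis` annotates a `long` that counts milliseconds since the Unix epoch (1970-01-01T00:00:00 UTC). The sketch below shows that conversion with the standard library only. The function names are hypothetical, and treating naive datetimes as UTC is an assumption made for this example, not documented Avro DBO behavior.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def to_timestamp_millis(dt: datetime.datetime) -> int:
    """Convert a datetime to milliseconds since the Unix epoch (hypothetical helper)."""
    if dt.tzinfo is None:
        # Assumption for this sketch: naive datetimes are interpreted as UTC.
        dt = dt.replace(tzinfo=datetime.timezone.utc)
    # Integer division by a 1 ms timedelta drops sub-millisecond precision.
    return (dt - EPOCH) // datetime.timedelta(milliseconds=1)


def from_timestamp_millis(millis: int) -> datetime.datetime:
    """Convert epoch milliseconds back to a timezone-aware UTC datetime."""
    return EPOCH + datetime.timedelta(milliseconds=millis)


created_at = datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc)
millis = to_timestamp_millis(created_at)
print(millis)  # 1704067200000
print(from_timestamp_millis(millis))
```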

  • Enum Type

```python
from attrs import field, define
from enum import Enum

class Status(Enum):
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"

# avro_schema: decorator provided by Avro DBO (import not shown).
@define
@avro_schema
class EnumModel:
    status: Status = field(
        default=Status.ACTIVE,
        metadata={
            "logicalType": "enum",
            "symbols": list(Status),
        },
    )
```
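
Note that `list(Status)` yields enum members, while an Avro schema lists enum symbols as plain strings. Whether Avro DBO converts members to names internally is not stated here, so the following standard-library sketch only illustrates the difference and how the string symbols could be derived by hand.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"


# Iterating an Enum class yields its members, not strings.
members = list(Status)

# Avro enum symbols are strings; the member names are the natural choice.
symbol_names = [member.name for member in Status]

print(members)
print(symbol_names)
```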

  • Array Type

```python
from attrs import field, define
from typing import List

# avro_schema: decorator provided by Avro DBO (import not shown).
@define
@avro_schema
class ArrayModel:
    tags: List[str] = field(
        factory=list,
        metadata={
            "logicalType": "array",
            "items": "string",
        },
    )
```

  • Kitchen Sink Example

```python
from attrs import field, define
from decimal import Decimal
from enum import Enum
from typing import List
import datetime

class Status(Enum):
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"

# avro_schema: decorator provided by Avro DBO (import not shown).
# kw_only=True lets created_at (no default) follow fields that have defaults;
# attrs otherwise rejects a mandatory attribute after a defaulted one.
@define(kw_only=True)
@avro_schema
class KitchenSinkModel:
    name: str = field(default="")
    amount: Decimal = field(
        default=Decimal("999.99"),
        metadata={
            "logicalType": "decimal",
            "precision": 10,
            "scale": 2,
        },
    )
    status: Status = field(
        default=Status.ACTIVE,
        metadata={
            "logicalType": "enum",
            "symbols": list(Status),
        },
    )
    created_at: datetime.datetime = field(
        metadata={
            "logicalType": "timestamp-millis",
        },
    )
    tags: List[str] = field(
        factory=list,
        metadata={
            "logicalType": "array",
            "items": "string",
        },
    )
```

Example Avro Schema Output

You can use the export_schema() method to export the schema as a JSON object.

print(KitchenSinkModel.export_schema())

The result is a JSON object that can be registered with a Confluent Schema Registry.

{
  "type": "record",
  "name": "KitchenSinkModel",
  "fields": [
    {"name": "name", "type": "string", "default": ""},
    {"name": "amount", "type": "decimal", "precision": 10, "scale": 2},
    {"name": "status", "type": "enum", "symbols": ["ACTIVE", "INACTIVE"]},
    {"name": "created_at", "type": "long", "logicalType": "timestamp-millis"},
    {"name": "tags", "type": "array", "items": "string"}
  ]
}
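
For comparison, the Avro specification expresses logical and complex types as nested type objects inside each field's `type`, for example a `bytes` type annotated with `logicalType: decimal`. The sketch below hand-writes the same record in that nested form using only the standard library. It is not output from `export_schema()`, and the enum name `Status` is taken from the Python class as an assumption.

```python
import json

# Hand-written, specification-style rendering of the same record.
spec_style_schema = {
    "type": "record",
    "name": "KitchenSinkModel",
    "fields": [
        {"name": "name", "type": "string", "default": ""},
        {
            "name": "amount",
            "type": {
                "type": "bytes",
                "logicalType": "decimal",
                "precision": 10,
                "scale": 2,
            },
        },
        {
            "name": "status",
            # Named types such as enums need a name; "Status" is assumed here.
            "type": {"type": "enum", "name": "Status", "symbols": ["ACTIVE", "INACTIVE"]},
        },
        {
            "name": "created_at",
            "type": {"type": "long", "logicalType": "timestamp-millis"},
        },
        {"name": "tags", "type": {"type": "array", "items": "string"}},
    ],
}

schema_json = json.dumps(spec_style_schema, indent=2)
print(schema_json)
```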

Saving an Avro Schema to a File

Pass a filename to the export_schema() method to write the schema to a JSON file.

KitchenSinkModel.export_schema(filename="kitchen_sink_model.json")
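
Assuming the saved file is plain JSON, it can be read back and inspected with the standard library alone. In the sketch below a small stand-in dictionary is written to a temporary directory in place of a real `export_schema(filename=...)` call, so the example runs without Avro DBO installed.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for an exported schema; a real file would come from export_schema(filename=...).
schema = {
    "type": "record",
    "name": "KitchenSinkModel",
    "fields": [{"name": "name", "type": "string", "default": ""}],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    schema_path = Path(tmp_dir) / "kitchen_sink_model.json"
    schema_path.write_text(json.dumps(schema, indent=2), encoding="utf-8")

    # Read the file back and treat it like any other JSON document.
    loaded = json.loads(schema_path.read_text(encoding="utf-8"))

field_names = [f["name"] for f in loaded["fields"]]
print(loaded["name"], field_names)
```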

Coercing a Python Class Using Avro Schema Model

Avro DBO automatically coerces every field in the schema to its declared type.

Supported coercions include datetime, date, decimal, enum, array, and more.

Example with Decimal

```python
from attrs import field, define
from decimal import Decimal

# avro_schema: decorator provided by Avro DBO (import not shown).
@define
@avro_schema
class DecimalModel:
    amount: Decimal = field(
        default=Decimal("100.00"),
        metadata={
            "logicalType": "decimal",
            "precision": 10,
            "scale": 2,
        },
    )

my_model = DecimalModel()
print(my_model.amount)
# > 100.00

# Extra precision is truncated to the scale
my_model.amount = Decimal("100.00383889328932")
print(my_model.amount)
# > 100.00
```
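
Truncating to a scale can be reproduced with the standard `decimal` module, which helps when checking what value to expect after coercion. This is a minimal sketch that assumes truncation means rounding toward zero (`ROUND_DOWN`). The helper `truncate_to_scale` is hypothetical, and Avro DBO's internal rounding mode may differ.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_to_scale(value: Decimal, scale: int) -> Decimal:
    """Drop digits beyond `scale` decimal places without rounding up (hypothetical helper)."""
    # scale=2 gives a quantum of Decimal("0.01").
    quantum = Decimal(1).scaleb(-scale)
    return value.quantize(quantum, rounding=ROUND_DOWN)


truncated = truncate_to_scale(Decimal("100.00383889328932"), scale=2)
print(truncated)  # 100.00
print(truncate_to_scale(Decimal("1.999"), scale=2))  # 1.99
```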

📚 Documentation

For detailed usage instructions, type hints, and comprehensive examples, please refer to our documentation.

🤝 Contributing

We welcome contributions! To submit issues or propose changes, please visit our GitHub repository. See the CONTRIBUTING.md file for more information on how to contribute.

📜 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.