Getting Started

Welcome to Zuma! 👋 This guide will help you understand what Zuma is and how to use it in your Python projects.

Zuma is a Python framework that helps you build and manage workflow pipelines. Think of it like a recipe book - you can define a series of steps (like a recipe's instructions), and Zuma will help you execute them in the right order, handle any errors, and make sure everything runs smoothly.

What is a Workflow?

A workflow in Zuma is like a set of instructions that need to be followed in a specific order. For example, imagine you're building a data processing application that needs to:

  1. Download data from a server
  2. Clean and format the data
  3. Save the results to a database

Each of these tasks would be a "step" in your workflow, and Zuma helps you organize and run these steps efficiently.

Installation

You can install Zuma using your preferred Python package manager:

# Using pip
pip install zuma-workflow

# Using Poetry
poetry add zuma-workflow

# Using Pipenv
pipenv install zuma-workflow

# Using uv
uv add zuma-workflow

Note on Virtual Environments

It's recommended to always use virtual environments when working with Python projects. Each package manager handles virtual environments differently, but all of the above commands will work within a virtual environment.

Verifying Installation

To verify that Zuma is installed correctly, open a Python shell and try importing it:

from zuma import ZumaWorkflow
print("Zuma is installed successfully!")
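If you would rather check the installation without importing the package, the standard library can look up the installed distribution by name. This is a minimal sketch: the helper name installed_version is made up for illustration, and the distribution name zuma-workflow is taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string when the package is installed, otherwise None
print(installed_version("zuma-workflow"))
```

A None result usually means the package was installed into a different virtual environment than the one running the interpreter.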

Core Concepts

Let's understand the main building blocks of Zuma. Don't worry if it seems complex at first - we'll break it down with examples!

1. Workflow (ZumaWorkflow)

A workflow is the main container that holds all your steps. Think of it as a project manager that keeps your steps together and runs them in order:

from zuma import ZumaWorkflow

# Create a simple workflow
workflow = ZumaWorkflow(
    name="My First Workflow",  # Give your workflow a descriptive name
    steps=[
        # We'll add steps here
    ]
)

graph TD
    A((Start)) --> B[Step 1]
    B --> C[Step 2]
    C --> D[Step 3]
    D --> E((End))
    style A fill:#f4b860,stroke:#f4b860
    style B fill:transparent,stroke:#f4b860
    style C fill:transparent,stroke:#f4b860
    style D fill:transparent,stroke:#f4b860
    style E fill:#6ee7b7,stroke:#6ee7b7

The diagram above shows a simple linear workflow in which steps execute one after another.

2. Steps (ZumaActionStep)

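Steps are the individual tasks in your workflow. Each step subclasses ZumaActionStep, implements an async execute method, and returns a dictionary of data for later steps to use: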

from zuma import ZumaActionStep

class DownloadDataStep(ZumaActionStep):
    """A step that downloads data from somewhere"""
    
    async def execute(self, context):
        # This is where you put your step's logic
        print(f"[{self.name}] Downloading data...")
        
        # Simulate downloading data
        data = {"user": "john", "age": 30}
        
        # Return data for next steps to use
        return {"downloaded_data": data}

class ProcessDataStep(ZumaActionStep):
    """A step that processes the downloaded data"""
    
    async def execute(self, context):
        # Get data from previous step
        data = context.get("downloaded_data")
        
        # Process the data
        processed_data = {
            "name": data["user"].upper(),
            "age_in_months": data["age"] * 12
        }
        
        return {"processed_data": processed_data}

3. Context

The context is like a shared notebook that all steps can read from and write to. It helps steps communicate with each other by passing data along.

For example, if Step A downloads some data and Step B needs to process it:

# Step A puts data in the context
async def execute(self, context):
    data = download_something()
    return {"my_data": data}  # This goes into the context

# Step B reads data from the context
async def execute(self, context):
    data = context.get("my_data")  # Get data from Step A
    result = process_data(data)
    return {"processed": result}
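To make the "shared notebook" idea concrete, here is a small self-contained sketch that does not use Zuma at all. The Step class and run_steps function are hypothetical stand-ins, and the assumption that each returned dictionary is merged into the context is inferred from the examples above rather than taken from Zuma's implementation.

```python
import asyncio


class Step:
    """Hypothetical stand-in for a workflow step (not Zuma's ZumaActionStep)."""

    def __init__(self, name):
        self.name = name

    async def execute(self, context):
        raise NotImplementedError


class StepA(Step):
    async def execute(self, context):
        # Pretend this list came from a download
        return {"my_data": [3, 1, 2]}


class StepB(Step):
    async def execute(self, context):
        # Read what StepA wrote, then add a new key
        data = context.get("my_data", [])
        return {"processed": sorted(data)}


async def run_steps(steps, initial_context=None):
    """Run steps in order, merging each returned dict into the shared context."""
    context = dict(initial_context or {})
    for step in steps:
        result = await step.execute(context)
        context.update(result or {})
    return context


if __name__ == "__main__":
    print(asyncio.run(run_steps([StepA("A"), StepB("B")])))
    # → {'my_data': [3, 1, 2], 'processed': [1, 2, 3]}
```

Because every step's output lands in the same dictionary, two steps that return the same key will overwrite each other, so distinct key names per step are a safer habit.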

Basic Usage

Let's put everything together and create your first complete workflow! We'll build a simple workflow that processes user data.

Step 1: Create Your Steps

from zuma import ZumaWorkflow, ZumaActionStep, ZumaRunner
import asyncio

class FetchUserStep(ZumaActionStep):
    """Gets user data (simulated)"""
    async def execute(self, context):
        # Simulate fetching user data
        print("Fetching user data...")
        await asyncio.sleep(1)  # Simulate network delay
        
        user_data = {
            "name": "Alice",
            "age": 25,
            "city": "New York"
        }
        return {"user": user_data}

class ValidateUserStep(ZumaActionStep):
    """Validates the user data"""
    async def execute(self, context):
        user = context.get("user")
        
        # Simple validation
        if not user.get("name"):
            raise ValueError("User must have a name!")
        
        if not isinstance(user.get("age"), int):
            raise ValueError("Age must be a number!")
        
        return {"validated": True}

class ProcessUserStep(ZumaActionStep):
    """Processes the validated user data"""
    async def execute(self, context):
        user = context.get("user")
        
        # Do some processing
        processed_data = {
            "full_name": user["name"].upper(),
            "age_group": "adult" if user["age"] >= 18 else "minor",
            "location": user["city"]
        }
        
        return {"processed_user": processed_data}

Step 2: Create and Run the Workflow

# Create the workflow
workflow = ZumaWorkflow(
    "User Processing Workflow",
    steps=[
        FetchUserStep("Fetch User"),
        ValidateUserStep("Validate User"),
        ProcessUserStep("Process User")
    ]
)

# Create a runner
runner = ZumaRunner()

# Run the workflow
async def main():
    result = await runner.run_workflow(workflow)
    print("\nWorkflow completed!")
    print("Final result:", result)

# Run it!
if __name__ == "__main__":
    asyncio.run(main())

Step 3: Understanding the Output

When you run this workflow, you'll see output like:

Fetching user data...
[Fetch User] ✓ Completed
[Validate User] ✓ Completed
[Process User] ✓ Completed

Workflow completed!
Final result: {
    'user': {'name': 'Alice', 'age': 25, 'city': 'New York'},
    'validated': True,
    'processed_user': {
        'full_name': 'ALICE',
        'age_group': 'adult',
        'location': 'New York'
    }
}

Workflow Components

Zuma provides several special components to handle common workflow patterns. Let's look at each one:

1. Parallel Steps (ZumaParallelAction)

When you have multiple steps that can run at the same time (like processing different files), use ZumaParallelAction:

from zuma import ZumaParallelAction

# Define steps that can run in parallel
parallel_steps = ZumaParallelAction(
    "Process Files",
    steps=[
        ProcessCSVStep("Process CSV"),
        ProcessJSONStep("Process JSON"),
        ProcessXMLStep("Process XML")
    ],
    max_concurrency=2  # Run at most 2 steps at a time
)

flowchart TD
    S((Start)) --> A[Process CSV]
    S --> B[Process JSON]
    S --> C[Process XML]
    A --> E((End))
    B --> E
    C --> E
    style S fill:#f4b860,stroke:#f4b860
    style E fill:#6ee7b7,stroke:#6ee7b7
    style A fill:#000000,stroke:#f4b860,color:#ffffff
    style B fill:#000000,stroke:#f4b860,color:#ffffff
    style C fill:#000000,stroke:#f4b860,color:#ffffff

The diagram above shows how parallel steps work: the workflow fans out from Start to the three processing steps and reaches End once all of them have finished.
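The effect of a concurrency limit such as max_concurrency can be illustrated with plain asyncio. This sketch is not Zuma's implementation; run_bounded, process, and make_job are hypothetical names, and a semaphore is just one common way to cap how many jobs run at once.

```python
import asyncio


async def run_bounded(job_factories, max_concurrency):
    """Run every job, allowing at most max_concurrency of them at the same time.

    Returns the results in input order plus the peak number of jobs that
    were running simultaneously.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    running = 0
    peak = 0

    async def guarded(factory):
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            try:
                return await factory()
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(f) for f in job_factories))
    return results, peak


async def process(kind):
    await asyncio.sleep(0.01)  # Simulate I/O-bound work
    return f"{kind} done"


def make_job(kind):
    # Return a zero-argument callable so the coroutine is created only when it may run
    return lambda: process(kind)


if __name__ == "__main__":
    jobs = [make_job(kind) for kind in ("csv", "json", "xml")]
    results, peak = asyncio.run(run_bounded(jobs, max_concurrency=2))
    print(results, peak)
```

With three jobs and a limit of two, the third job waits until one of the first two releases its slot.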

2. Conditional Steps (ZumaConditionalStep)

Sometimes you need different steps based on certain conditions. Use ZumaConditionalStep for this:

# Assumes ZumaConditionalStep is exported from the top-level package like the other classes
from zuma import ZumaConditionalStep

def check_data_size(context):
    """Decide which processing path to take"""
    data_size = len(context.get("data", []))
    return data_size > 1000

# Create a conditional step
processing_step = ZumaConditionalStep(
    "Choose Processing Path",
    condition=check_data_size,
    true_component=BatchProcessStep("Batch Process"),  # For large data
    false_component=SimpleProcessStep("Simple Process")  # For small data
)

flowchart TD
    S((Start)) --> D{Size > 1000?}
    D -->|Yes| B[Batch Process]
    D -->|No| P[Simple Process]
    B --> E((End))
    P --> E
    style S fill:#f4b860,stroke:#f4b860
    style E fill:#6ee7b7,stroke:#6ee7b7
    style D fill:#000000,stroke:#3b82f6,color:#ffffff
    style B fill:#000000,stroke:#f4b860,color:#ffffff
    style P fill:#000000,stroke:#f4b860,color:#ffffff

The diagram above illustrates conditional branching: when the condition returns True (more than 1000 items), Batch Process runs; otherwise Simple Process runs.
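Because the condition is an ordinary function of the context, it can be exercised on its own before wiring it into a workflow. The sketch below reuses check_data_size from the example above; choose_component is a hypothetical helper that only mimics the branching decision and is not part of Zuma.

```python
def check_data_size(context):
    """Return True when the context holds more than 1000 data items."""
    return len(context.get("data", [])) > 1000


def choose_component(context, condition, true_component, false_component=None):
    """Return the component that would run for this context (None means skip)."""
    return true_component if condition(context) else false_component


if __name__ == "__main__":
    print(choose_component({"data": list(range(1001))}, check_data_size, "batch", "simple"))
    # → batch
    print(choose_component({"data": list(range(1000))}, check_data_size, "batch", "simple"))
    # → simple
    print(choose_component({}, check_data_size, "batch", "simple"))
    # → simple
```

Note the boundary: exactly 1000 items, or a context with no "data" key at all, takes the false branch.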

3. Error Handling

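Zuma helps you handle errors gracefully. You can configure retries on a step, raise ZumaExecutionError with a descriptive message, and let the workflow continue after a failure: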

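# Assumes ZumaExecutionError is exported from the top-level package like the other classes
from zuma import ZumaActionStep, ZumaExecutionError, ZumaWorkflow

class RetryableStep(ZumaActionStep):
    """A step that might fail but can retry"""
    
    def __init__(self, name):
        super().__init__(
            name=name,
            retries=3,  # Try up to 3 times
            retry_delay=1.0  # Wait 1 second between retries
        )
    
    async def execute(self, context):
        try:
            # some_risky_operation() is a placeholder for your own coroutine
            result = await some_risky_operation()
            return {"data": result}
        except Exception as e:
            # Chain the original exception so the traceback keeps the root cause
            raise ZumaExecutionError(f"Operation failed: {e}") from e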

# Create workflow with error handling
workflow = ZumaWorkflow(
    "Fault Tolerant Workflow",
    steps=[RetryableStep("Risky Step")],
    continue_on_failure=True  # Continue even if steps fail
)

graph TD
    Start((Start)) --> Attempt1[Attempt 1]
    Attempt1 -->|Success| Continue[Continue]
    Attempt1 -->|Fail| Attempt2[Attempt 2]
    Attempt2 -->|Success| Continue
    Attempt2 -->|Fail| Attempt3[Attempt 3]
    Attempt3 -->|Success| Continue
    Attempt3 -->|Fail| HandleError[Handle Error]
    HandleError --> Continue
    Continue --> End((End))
    style Start fill:#f4b860,stroke:#f4b860
    style End fill:#6ee7b7,stroke:#6ee7b7
    style HandleError fill:transparent,stroke:#ef4444
    style Attempt1 fill:transparent,stroke:#f4b860
    style Attempt2 fill:transparent,stroke:#f4b860
    style Attempt3 fill:transparent,stroke:#f4b860
    style Continue fill:transparent,stroke:#f4b860

The diagram above shows how error handling works:

  • Each step can be configured to retry on failure
  • After max retries, error handling logic is triggered
  • The workflow can continue even after failures if configured
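The attempt-then-retry flow in the diagram can be sketched with plain asyncio. This is an illustration of the general pattern, not Zuma's internals: run_with_attempts and make_flaky are hypothetical names, and the sketch assumes three attempts in total, as drawn in the diagram.

```python
import asyncio


async def run_with_attempts(operation, max_attempts=3, retry_delay=1.0):
    """Await operation() up to max_attempts times, pausing between failed attempts."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except Exception as error:
            last_error = error
            if attempt < max_attempts:
                await asyncio.sleep(retry_delay)
    # Every attempt failed: surface one error that keeps the last cause attached
    raise RuntimeError(f"failed after {max_attempts} attempts") from last_error


def make_flaky(failures_before_success):
    """Build an operation that fails a fixed number of times, then returns its call count."""
    calls = {"count": 0}

    async def operation():
        calls["count"] += 1
        if calls["count"] <= failures_before_success:
            raise ConnectionError("temporary failure")
        return calls["count"]

    return operation


if __name__ == "__main__":
    # Fails twice, succeeds on the third attempt
    print(asyncio.run(run_with_attempts(make_flaky(2), retry_delay=0)))
    # → 3
```

If the operation is still failing after the last attempt, the error propagates to the caller, which corresponds to the Handle Error node in the diagram.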

Advanced Features

1. Step Dependencies

You can declare which context keys a step requires, so that it only runs once earlier steps have produced them:

class DependentStep(ZumaActionStep):
    """A step that needs data from specific previous steps"""
    
    def __init__(self, name):
        super().__init__(
            name=name,
            required_contexts=["user_data", "preferences"]  # Names of required data
        )
    
    async def execute(self, context):
        # This will only run if both user_data and preferences exist in context
        user = context.get("user_data")
        prefs = context.get("preferences")
        return {"result": process_user_with_prefs(user, prefs)}
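A check like the one required_contexts implies can be expressed in a few lines. The helper below is hypothetical and only illustrates the idea of comparing required keys against the current context; how Zuma itself reacts to a missing key is not specified in this guide.

```python
def missing_context_keys(context, required_contexts):
    """Return the required keys that are absent from the context, in declared order."""
    return [key for key in required_contexts if key not in context]


if __name__ == "__main__":
    context = {"user_data": {"name": "Alice"}}
    print(missing_context_keys(context, ["user_data", "preferences"]))
    # → ['preferences']
```

Running a check like this in a unit test is a cheap way to catch a misspelled key before a workflow is executed end to end.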

2. Progress Tracking

Monitor your workflow's progress with built-in tracking:

class TrackableStep(ZumaActionStep):
    async def execute(self, context):
        total_items = 100
        
        for i in range(total_items):
            # Update progress
            self.update_progress(
                completed=i + 1,
                total=total_items,
                message=f"Processing item {i + 1}/{total_items}"
            )
            await process_item(i)
        
        return {"completed": True}

3. Custom Context Processors

Transform data between steps automatically:

from zuma import ZumaContextProcessor

class DataNormalizer(ZumaContextProcessor):
    """Normalizes data between steps"""
    
    def process(self, context):
        if "user_data" in context:
            # Convert all string values to lowercase
            data = context["user_data"]
            normalized = {
                k: v.lower() if isinstance(v, str) else v
                for k, v in data.items()
            }
            context["user_data"] = normalized
        return context

# Use the processor in your workflow
workflow = ZumaWorkflow(
    "Normalized Workflow",
    steps=[...],
    context_processors=[DataNormalizer()]
)

Best Practices

Here are comprehensive guidelines to help you write better Zuma workflows:

1. Step Design

  • Single Responsibility: Keep steps focused on one specific task. This makes them easier to test, maintain, and reuse.
  • Descriptive Naming: Use clear, action-oriented names for steps and workflows (e.g., 'ValidateUserData', 'ProcessPayment').
  • Documentation: Add comprehensive docstrings that explain:
    • What the step does
    • Required input context
    • Expected output
    • Possible errors
  • Type Hints: Use Python type hints to make your code more maintainable and catch type-related errors early.

2. Error Handling

  • Anticipate Failures: Always consider what can go wrong and handle edge cases appropriately.
  • Retry Strategy: Configure retry settings based on the operation:
    • Network operations: Multiple retries with exponential backoff
    • Database operations: Short retry window with linear backoff
    • CPU-bound operations: Minimal retries to avoid resource waste
  • Error Messages: Provide detailed error messages that help diagnose issues
  • Resource Cleanup: Implement proper cleanup in case of failures
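To compare the two backoff styles named in the retry strategy above, the delay before each retry can be computed as follows. These helpers are illustrative only; the guide does not say whether Zuma offers built-in backoff beyond the fixed retry_delay, so treat the function names and the 30-second cap as assumptions.

```python
def exponential_backoff(base_delay, attempt, max_delay=30.0):
    """Delay before retry number `attempt` (1-based): doubles each time, capped."""
    return min(max_delay, base_delay * 2 ** (attempt - 1))


def linear_backoff(base_delay, attempt, max_delay=30.0):
    """Delay before retry number `attempt` (1-based): grows by base_delay each time, capped."""
    return min(max_delay, base_delay * attempt)


if __name__ == "__main__":
    print([exponential_backoff(1.0, n) for n in range(1, 7)])
    # → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
    print([linear_backoff(0.5, n) for n in range(1, 5)])
    # → [0.5, 1.0, 1.5, 2.0]
```

Exponential growth backs off quickly from an overloaded remote service, while linear growth keeps the total wait short for operations that usually recover fast.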

3. Performance

  • Parallel Processing: Use ZumaParallelAction for independent operations
  • Resource Management:
    • Use connection pooling for databases
    • Implement proper caching strategies
    • Release resources promptly
  • Batch Processing: Process large datasets in chunks to manage memory usage
  • Progress Tracking: Implement progress tracking for long-running operations
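Chunked processing, as recommended in the batch processing point above, can be sketched with a small generator. The name chunked is hypothetical and the function is independent of Zuma; a step's execute method could loop over its output and report progress after each batch.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items without loading everything at once."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch
        yield batch


if __name__ == "__main__":
    print(list(chunked(range(7), 3)))
    # → [[0, 1, 2], [3, 4, 5], [6]]
```

Since the generator consumes its input lazily, only one batch is held in memory at a time.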

4. Testing

  • Unit Tests: Write comprehensive tests for individual steps:
    import pytest  # The asyncio marker requires the pytest-asyncio plugin

    @pytest.mark.asyncio
    async def test_user_processor():
        # Arrange (UserProcessor stands for one of your own ZumaActionStep subclasses)
        processor = UserProcessor("Test Processor")
        context = {"user_data": {"name": "John", "age": 30}}
        
        # Act
        result = await processor.execute(context)
        
        # Assert
        assert "processed_user" in result
        assert result["processed_user"]["name"] == "john"
  • Integration Tests: Test complete workflows with realistic data.
  • Mock External Services: Use mocking for external dependencies.
  • Error Scenarios: Test error handling and recovery mechanisms.
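For the mocking advice above, the standard library's unittest.mock is usually enough. In this sketch, fetch_and_tag and the client's get_user method are hypothetical stand-ins for step logic that calls an external service; only MagicMock and AsyncMock are real library APIs.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def fetch_and_tag(client, user_id):
    """Code under test: awaits an external client and post-processes the reply."""
    user = await client.get_user(user_id)
    return {"user": user, "fetched": True}


def test_fetch_and_tag_uses_client():
    # Replace the external service with a mock whose async method returns canned data
    client = MagicMock()
    client.get_user = AsyncMock(return_value={"name": "Alice"})

    result = asyncio.run(fetch_and_tag(client, 42))

    assert result == {"user": {"name": "Alice"}, "fetched": True}
    client.get_user.assert_awaited_once_with(42)


if __name__ == "__main__":
    test_fetch_and_tag_uses_client()
    print("mocked test passed")
```

The same pattern applies inside a step: pass the client in through the constructor so that a test can substitute a mock without patching module globals.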

5. Monitoring and Logging

  • Structured Logging: Use consistent log formats:
    self.logger.info("Processing user data", extra={
        "user_id": user.id,
        "step": self.name,
        "batch_size": self.batch_size
    })
  • Metrics Collection: Track important metrics:
    • Step execution time
    • Success/failure rates
    • Resource usage
    • Throughput
  • Alerting: Set up appropriate alerting for critical failures.
  • Debugging: Include sufficient context in logs for debugging.
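The extra argument used in the structured logging example above attaches each key as an attribute on the log record, which a handler or formatter can then read. The sketch below uses only the standard logging module; ListHandler and the logger name workflow.demo are made up for illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Collect log records in memory so their structured fields can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("workflow.demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # Keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

logger.info("Processing user data", extra={"user_id": 7, "step": "Process User"})

record = handler.records[-1]
print(record.getMessage(), record.user_id, record.step)
# → Processing user data 7 Process User
```

Keys passed through extra must not clash with built-in record attributes such as name or message, otherwise logging raises a KeyError.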

API Reference

Core Components

ZumaWorkflow

Main container for organizing workflow steps.

class ZumaWorkflow:
    def __init__(
        self,
        name: str,
        steps: List[ZumaComponent],
        continue_on_failure: bool = False,
        context_processors: List[ZumaContextProcessor] = None,
        description: str = None
    ):
        """Initialize a workflow.

        Args:
            name: Workflow name
            steps: List of workflow steps
            continue_on_failure: Whether to continue if a step fails
            context_processors: List of context processors
            description: Optional workflow description
        """

Attributes:
  • name: Unique identifier for the workflow
  • steps: List of workflow components to execute
  • continue_on_failure: If True, continue executing remaining steps when a step fails
  • context_processors: Optional processors for modifying context between steps

ZumaActionStep

Base class for implementing workflow steps.

class ZumaActionStep:
    def __init__(
        self,
        name: str,
        description: str = None,
        retries: int = 0,
        retry_delay: float = 1.0,
        timeout: float = None,
        required_contexts: List[str] = None
    ):
        """Initialize an action step.

        Args:
            name: Step name
            description: Optional step description
            retries: Number of retry attempts
            retry_delay: Delay between retries in seconds
            timeout: Maximum execution time in seconds
            required_contexts: List of required context keys
        """
    
    async def execute(
        self, 
        context: Dict[str, Any],
        dependencies: Dict[str, Any],
        **kwargs
    ) -> Dict[str, Any]:
        """Execute the step logic.
        
        Override this method in your custom steps.
        """
        raise NotImplementedError()

Key Methods:
  • execute(context, dependencies, **kwargs): Main execution method to override in custom steps
  • update_progress(completed, total, message=None): Update step progress during execution
  • on_retry(attempt: int, error: Exception): Called before each retry attempt

ZumaParallelAction

Execute multiple steps concurrently.

class ZumaParallelAction:
    def __init__(
        self,
        name: str,
        steps: List[ZumaComponent],
        max_concurrency: int = None,
        fail_fast: bool = True,
        description: str = None
    ):
        """Initialize parallel execution.

        Args:
            name: Action name
            steps: List of steps to execute in parallel
            max_concurrency: Maximum concurrent executions
            fail_fast: Stop all steps if one fails
            description: Optional description
        """

ZumaConditionalStep

Conditional branching in workflows.

class ZumaConditionalStep:
    def __init__(
        self,
        name: str,
        condition: Callable[[Dict[str, Any]], bool],
        true_component: ZumaComponent,
        false_component: ZumaComponent = None,
        description: str = None
    ):
        """Initialize conditional step.

        Args:
            name: Step name
            condition: Function that returns True/False
            true_component: Component to execute if True
            false_component: Optional component if False
            description: Optional description
        """

ZumaRunner

Executes and manages workflows.

class ZumaRunner:
    async def run_workflow(
        self,
        workflow: ZumaWorkflow,
        context: Dict[str, Any] = None,
        generate_diagram: bool = False,
        diagram_output: str = None
    ) -> ZumaResult:
        """Execute a workflow.

        Args:
            workflow: Workflow to execute
            context: Initial context data
            generate_diagram: Create visualization
            diagram_output: Diagram output path
        """

Key Methods:
  • run_workflow(workflow, context, **kwargs): Execute a workflow with optional visualization
  • create_workflow_diagram(result, output_file): Generate workflow visualization
  • print_execution_summary(result): Print workflow execution details

Extension Points

ZumaContextProcessor

Custom context modification between steps.

class ZumaContextProcessor:
    def process(self, context: Dict[str, Any]) -> Dict[str, Any]:
        """Process and modify the context.
        
        Override this method to implement custom processing.
        
        Args:
            context: Current workflow context
            
        Returns:
            Modified context dictionary
        """
        return context

ZumaExecutionError

Custom error type for workflow execution failures.

class ZumaExecutionError(Exception):
    def __init__(
        self,
        message: str,
        error_code: str = None,
        details: Dict[str, Any] = None
    ):
        """Initialize execution error.

        Args:
            message: Error message
            error_code: Optional error code
            details: Additional error details
        """

Results and Status

ZumaResult

Contains workflow execution results and metadata.

Attributes:
  • status: Current execution status (PENDING, RUNNING, SUCCESS, FAILED)
  • context: Final workflow context
  • error: Error information if failed
  • duration: Total execution time in seconds
  • metadata: Additional execution metadata

ZumaStatus

Enumeration of possible execution states.

class ZumaStatus(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"
    SKIPPED = "SKIPPED"

Visualization

Diagram Generation

Generate visual workflow representations.

Features:
  • Clear visualization of workflow steps and relationships
  • Support for retry mechanisms visualization
    • Main workflow path at top
    • Retry attempts branching below
    • Success paths rejoining main flow
    • Error handling visualization
  • Visual representation of parallel processing
  • Dark theme support for better readability
  • Automatic diagram generation during workflow execution