
EventCore

Multi-stream event sourcing with dynamic consistency boundaries



Why EventCore?

Traditional event sourcing forces you to choose aggregate boundaries upfront, leading to complex workarounds when business logic spans multiple aggregates. EventCore eliminates this constraint with dynamic consistency boundaries - each command defines exactly which streams it needs, enabling atomic operations across multiple event streams.

🚀 Key Features

🔄 Multi-Stream Atomicity
Read from and write to multiple event streams in a single atomic operation. No more saga patterns for simple cross-aggregate operations.

🎯 Type-Safe Commands
Leverage Rust’s type system to ensure compile-time correctness. Illegal states are unrepresentable.

⚡ High Performance
Optimized for both in-memory and PostgreSQL backends with sophisticated caching and batching strategies.

🔍 Built-in CQRS
First-class support for projections and read models with automatic position tracking and replay capabilities.

🛡️ Production Ready
Battle-tested with comprehensive observability, monitoring, and error recovery mechanisms.

🧪 Testing First
Extensive testing utilities including property-based tests, chaos testing, and deterministic event stores.

Quick Example

#![allow(unused)]
fn main() {
use eventcore::prelude::*;

#[derive(Command)]
#[command(event = "BankingEvent")]
struct TransferMoney {
    from_account: AccountId,
    to_account: AccountId,
    amount: Money,
}

impl TransferMoney {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![
            self.from_account.stream_id(),
            self.to_account.stream_id(),
        ]
    }
}

#[async_trait]
impl CommandLogic for TransferMoney {
    type State = BankingState;
    type Event = BankingEvent;

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Validate business rules
        require!(state.balance(&self.from_account) >= self.amount,
            "Insufficient funds"
        );

        // Emit events - atomically written to both streams
        Ok(vec![
            emit!(self.from_account.stream_id(), 
                BankingEvent::Withdrawn { amount: self.amount }
            ),
            emit!(self.to_account.stream_id(),
                BankingEvent::Deposited { amount: self.amount }
            ),
        ])
    }
}
}

Getting Started

Use Cases

EventCore excels in domains where business operations naturally span multiple entities:

  • 💰 Financial Systems: Atomic transfers, double-entry bookkeeping, complex trading operations
  • 🛒 E-Commerce: Order fulfillment, inventory management, distributed transactions
  • 🏢 Enterprise Applications: Workflow engines, approval processes, resource allocation
  • 🎮 Gaming: Player interactions, economy systems, real-time state synchronization
  • 📊 Analytics Platforms: Event-driven architectures, audit trails, temporal queries

Performance

  • 187,711 ops/sec (in-memory)
  • 83 ops/sec (PostgreSQL)
  • 12ms average latency
  • 820,000 events/sec write throughput

Community

Join our growing community of developers building event-sourced systems.

Resources

Supported By

EventCore is an open-source project supported by the community.


Built with ❤️ by the EventCore community

Released under the MIT License

Quick Start Guide

Get up and running with EventCore in 15 minutes!

Installation

Add EventCore to your Cargo.toml:

[dependencies]
eventcore = "0.1"
eventcore-postgres = "0.1"  # For PostgreSQL backend
# OR
eventcore-memory = "0.1"   # For in-memory backend

# Required dependencies
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
async-trait = "0.1"

Your First Event-Sourced Application

Let’s build a simple task management system to demonstrate EventCore’s key concepts.

1. Define Your Domain Types

#![allow(unused)]
fn main() {
use eventcore::prelude::*;
use serde::{Deserialize, Serialize};

// Domain types with compile-time validation
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 50),
    derive(Debug, Clone, PartialEq, Eq, Hash, Serialize, Deserialize, AsRef)
)]
pub struct TaskId(String);

#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 200),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct TaskTitle(String);

// Events that represent state changes
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum TaskEvent {
    Created { title: TaskTitle },
    Completed,
    Reopened,
}
}

2. Create Your First Command

#![allow(unused)]
fn main() {
#[derive(Clone, Command)]
#[command(event = "TaskEvent")]
pub struct CreateTask {
    pub task_id: TaskId,
    pub title: TaskTitle,
}

impl CreateTask {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![StreamId::from(self.task_id.as_ref())]
    }
}

#[async_trait]
impl CommandLogic for CreateTask {
    type State = Option<TaskState>;
    type Event = TaskEvent;

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Ensure task doesn't already exist
        require!(state.is_none(), "Task already exists");

        // Emit the event
        Ok(vec![emit!(
            StreamId::from(self.task_id.as_ref()),
            TaskEvent::Created {
                title: self.title.clone()
            }
        )])
    }
}
}

3. Define State and Event Application

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct TaskState {
    pub title: TaskTitle,
    pub completed: bool,
}

#[async_trait]
impl CommandLogic for CreateTask {
    // ...the State and Event associated types and handle() from step 2 go here...

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<TaskEvent>) {
        if let Some(task_state) = state {
            match &event.event {
                TaskEvent::Created { title } => {
                    // This shouldn't happen with proper command validation
                    *task_state = TaskState {
                        title: title.clone(),
                        completed: false,
                    };
                }
                TaskEvent::Completed => {
                    task_state.completed = true;
                }
                TaskEvent::Reopened => {
                    task_state.completed = false;
                }
            }
        } else if let TaskEvent::Created { title } = &event.event {
            *state = Some(TaskState {
                title: title.clone(),
                completed: false,
            });
        }
    }
}
}
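
The executor example in step 4 below also uses a CompleteTask command, which this guide doesn't otherwise show. Here is a minimal sketch, assuming it mirrors CreateTask:

#![allow(unused)]
fn main() {
#[derive(Clone, Command)]
#[command(event = "TaskEvent")]
pub struct CompleteTask {
    pub task_id: TaskId,
}

impl CompleteTask {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![StreamId::from(self.task_id.as_ref())]
    }
}

#[async_trait]
impl CommandLogic for CompleteTask {
    type State = Option<TaskState>;
    type Event = TaskEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<TaskEvent>) {
        // Same event-folding logic as shown in step 3
        if let Some(task) = state {
            match &event.event {
                TaskEvent::Completed => task.completed = true,
                TaskEvent::Reopened => task.completed = false,
                TaskEvent::Created { .. } => {}
            }
        } else if let TaskEvent::Created { title } = &event.event {
            *state = Some(TaskState {
                title: title.clone(),
                completed: false,
            });
        }
    }

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // The task must exist and must not already be completed
        require!(
            state.as_ref().is_some_and(|task| !task.completed),
            "Task does not exist or is already completed"
        );

        Ok(vec![emit!(
            StreamId::from(self.task_id.as_ref()),
            TaskEvent::Completed
        )])
    }
}
}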

4. Set Up the Event Store and Execute Commands

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize event store (using in-memory for this example)
    let event_store = InMemoryEventStore::new();
    let executor = CommandExecutor::new(event_store);

    // Create a new task
    let task_id = TaskId::try_new("task-001".to_string())?;
    let title = TaskTitle::try_new("Learn EventCore".to_string())?;
    
    let create_cmd = CreateTask {
        task_id: task_id.clone(),
        title,
    };

    // Execute the command
    let result = executor.execute(create_cmd).await?;
    println!("Task created with {} event(s)", result.events.len());

    // Complete the task (CompleteTask is sketched at the end of step 3)
    let complete_cmd = CompleteTask {
        task_id: task_id.clone(),
    };

    executor.execute(complete_cmd).await?;
    println!("Task completed!");

    Ok(())
}

5. Add a Projection for Queries

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::RwLock;

#[derive(Debug, Clone)]
pub struct TaskListProjection {
    tasks: Arc<RwLock<HashMap<TaskId, TaskSummary>>>,
}

#[derive(Debug, Clone)]
struct TaskSummary {
    title: TaskTitle,
    completed: bool,
}

#[async_trait]
impl Projection for TaskListProjection {
    type Event = TaskEvent;

    async fn handle_event(
        &mut self,
        event: StoredEvent<Self::Event>,
        stream_id: &StreamId,
    ) -> Result<(), ProjectionError> {
        let task_id = TaskId::try_new(stream_id.as_ref().to_string())
            .map_err(|e| ProjectionError::InvalidData(e.to_string()))?;

        let mut tasks = self.tasks.write().await;
        
        match event.event {
            TaskEvent::Created { title } => {
                tasks.insert(task_id, TaskSummary {
                    title,
                    completed: false,
                });
            }
            TaskEvent::Completed => {
                if let Some(task) = tasks.get_mut(&task_id) {
                    task.completed = true;
                }
            }
            TaskEvent::Reopened => {
                if let Some(task) = tasks.get_mut(&task_id) {
                    task.completed = false;
                }
            }
        }

        Ok(())
    }
}
}
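
To serve queries, you can expose simple accessor methods on the projection. For example, a sketch returning all open tasks (the method name is illustrative):

#![allow(unused)]
fn main() {
impl TaskListProjection {
    /// All tasks that are not yet completed, for a task-list view.
    pub async fn open_tasks(&self) -> Vec<(TaskId, TaskTitle)> {
        self.tasks
            .read()
            .await
            .iter()
            .filter(|(_, summary)| !summary.completed)
            .map(|(id, summary)| (id.clone(), summary.title.clone()))
            .collect()
    }
}
}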

Next Steps

Congratulations! You’ve built your first event-sourced application with EventCore. Here’s what to explore next:

  1. Domain Modeling Guide - Learn best practices for modeling your domain with types
  2. Commands Deep Dive - Understand multi-stream operations and dynamic consistency
  3. Building Web APIs - Integrate EventCore with Axum or Actix
  4. Testing Strategies - Property-based testing and chaos testing

Example Projects

Check out these complete examples in the repository:

  • Banking System - Multi-account transfers with ACID guarantees
  • E-Commerce Platform - Order processing with inventory management
  • Saga Orchestration - Long-running business processes

Getting Help

Installation

Requirements

  • Rust 1.70.0 or later
  • PostgreSQL 13+ (for PostgreSQL backend)
  • Tokio async runtime

Adding EventCore to Your Project

Add the following to your Cargo.toml:

[dependencies]
# Core library
eventcore = "0.1"

# Choose your backend (one of these):
eventcore-postgres = "0.1"  # Production-ready PostgreSQL backend
eventcore-memory = "0.1"    # In-memory backend for development/testing

# Required dependencies
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde = { version = "1", features = ["derive"] }
uuid = { version = "1", features = ["v7", "serde"] }
thiserror = "1"

# Optional but recommended
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }

Backend Configuration

PostgreSQL Backend

  1. Database Setup
# Using Docker
docker run -d \
  --name eventcore-postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=eventcore \
  -p 5432:5432 \
  postgres:15-alpine

# Or use the provided docker-compose.yml
docker-compose up -d
  2. Run Migrations

EventCore will automatically create required tables on first use. For manual setup:

-- See eventcore-postgres/migrations/ for schema
  3. Connection Configuration
#![allow(unused)]
fn main() {
use eventcore_postgres::PostgresEventStore;

let database_url = "postgres://postgres:postgres@localhost/eventcore";
let event_store = PostgresEventStore::new(database_url).await?;
}
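
In practice the connection string usually comes from configuration rather than being hard-coded. For example, reading an environment variable with a local fallback and passing it to PostgresEventStore::new as above:

#![allow(unused)]
fn main() {
// DATABASE_URL is a common convention; any configuration source works
let database_url = std::env::var("DATABASE_URL")
    .unwrap_or_else(|_| "postgres://postgres:postgres@localhost/eventcore".to_string());
}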

In-Memory Backend

Perfect for development and testing:

#![allow(unused)]
fn main() {
use eventcore_memory::InMemoryEventStore;

let event_store = InMemoryEventStore::new();
}

Feature Flags

EventCore supports various feature flags:

[dependencies]
eventcore = { version = "0.1", features = ["full"] }

# Individual features:
# - "testing" - Testing utilities and fixtures
# - "chaos" - Chaos testing support
# - "monitoring" - OpenTelemetry integration
# - "cqrs" - CQRS pattern support

Verification

Create a simple test to verify installation:

#![allow(unused)]
fn main() {
use eventcore::prelude::*;

#[tokio::test]
async fn test_eventcore_setup() {
    let event_store = eventcore_memory::InMemoryEventStore::new();
    let executor = CommandExecutor::new(event_store);
    
    // If this compiles, EventCore is properly installed!
    assert!(true);
}
}
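
Run it with:

cargo test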

Next Steps

Your First EventCore Application

Let’s build a complete event-sourced application from scratch: a simple blog engine that demonstrates EventCore’s key concepts.

Project Setup

  1. Create a new Rust project
cargo new blog-engine
cd blog-engine
  2. Update Cargo.toml
[package]
name = "blog-engine"
version = "0.1.0"
edition = "2021"

[dependencies]
eventcore = "0.1"
eventcore-memory = "0.1"
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde = { version = "1", features = ["derive"] }
uuid = { version = "1", features = ["v7", "serde"] }
chrono = { version = "0.4", features = ["serde"] }
thiserror = "1"
nutype = { version = "0.4", features = ["serde"] }

Step 1: Define Domain Types

Create src/types.rs:

#![allow(unused)]
fn main() {
use eventcore::prelude::*;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

// Use nutype for domain validation
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 100),
    derive(Debug, Clone, PartialEq, Serialize, Deserialize, AsRef)
)]
pub struct PostId(String);

#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 200),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct PostTitle(String);

#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 10000),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct PostContent(String);

#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 100),
    derive(Debug, Clone, PartialEq, Serialize, Deserialize)
)]
pub struct AuthorId(String);

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Comment {
    pub id: String,
    pub author: AuthorId,
    pub content: String,
    pub created_at: DateTime<Utc>,
}
}

Step 2: Define Events

Create src/events.rs:

#![allow(unused)]
fn main() {
use crate::types::*;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum BlogEvent {
    PostPublished {
        title: PostTitle,
        content: PostContent,
        author: AuthorId,
        published_at: DateTime<Utc>,
    },
    PostUpdated {
        title: PostTitle,
        content: PostContent,
        updated_at: DateTime<Utc>,
    },
    PostDeleted {
        deleted_at: DateTime<Utc>,
    },
    CommentAdded {
        comment: Comment,
    },
    CommentRemoved {
        comment_id: String,
    },
}
}

Step 3: Define State

Create src/state.rs:

#![allow(unused)]
fn main() {
use crate::types::*;
use crate::events::BlogEvent;
use chrono::{DateTime, Utc};
use std::collections::HashMap;

#[derive(Debug, Clone, Default)]
pub struct PostState {
    pub exists: bool,
    pub title: Option<PostTitle>,
    pub content: Option<PostContent>,
    pub author: Option<AuthorId>,
    pub published_at: Option<DateTime<Utc>>,
    pub updated_at: Option<DateTime<Utc>>,
    pub deleted_at: Option<DateTime<Utc>>,
    pub comments: HashMap<String, Comment>,
}

impl PostState {
    pub fn is_deleted(&self) -> bool {
        self.deleted_at.is_some()
    }

    pub fn apply_event(&mut self, event: &BlogEvent) {
        match event {
            BlogEvent::PostPublished {
                title,
                content,
                author,
                published_at,
            } => {
                self.exists = true;
                self.title = Some(title.clone());
                self.content = Some(content.clone());
                self.author = Some(author.clone());
                self.published_at = Some(*published_at);
            }
            BlogEvent::PostUpdated {
                title,
                content,
                updated_at,
            } => {
                self.title = Some(title.clone());
                self.content = Some(content.clone());
                self.updated_at = Some(*updated_at);
            }
            BlogEvent::PostDeleted { deleted_at } => {
                self.deleted_at = Some(*deleted_at);
            }
            BlogEvent::CommentAdded { comment } => {
                self.comments.insert(comment.id.clone(), comment.clone());
            }
            BlogEvent::CommentRemoved { comment_id } => {
                self.comments.remove(comment_id);
            }
        }
    }
}
}

Step 4: Implement Commands

Create src/commands.rs:

#![allow(unused)]
fn main() {
use crate::events::BlogEvent;
use crate::state::PostState;
use crate::types::*;
use chrono::Utc;
use eventcore::prelude::*;

// Publish a new blog post
#[derive(Clone, Command)]
#[command(event = "BlogEvent")]
pub struct PublishPost {
    pub post_id: PostId,
    pub title: PostTitle,
    pub content: PostContent,
    pub author: AuthorId,
}

impl PublishPost {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![StreamId::from(format!("post-{}", self.post_id.as_ref()))]
    }
}

#[async_trait]
impl CommandLogic for PublishPost {
    type State = PostState;
    type Event = BlogEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        state.apply_event(&event.event);
    }

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Validate business rules
        require!(!state.exists, "Post already exists");

        // Emit event
        Ok(vec![emit!(
            StreamId::from(format!("post-{}", self.post_id.as_ref())),
            BlogEvent::PostPublished {
                title: self.title.clone(),
                content: self.content.clone(),
                author: self.author.clone(),
                published_at: Utc::now(),
            }
        )])
    }
}

// Add a comment to a post
#[derive(Clone, Command)]
#[command(event = "BlogEvent")]
pub struct AddComment {
    pub post_id: PostId,
    pub comment_id: String,
    pub author: AuthorId,
    pub content: String,
}

impl AddComment {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![StreamId::from(format!("post-{}", self.post_id.as_ref()))]
    }
}

#[async_trait]
impl CommandLogic for AddComment {
    type State = PostState;
    type Event = BlogEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        state.apply_event(&event.event);
    }

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Validate
        require!(state.exists, "Post does not exist");
        require!(!state.is_deleted(), "Cannot comment on deleted post");
        require!(!state.comments.contains_key(&self.comment_id), 
                "Comment ID already exists");

        // Emit event
        Ok(vec![emit!(
            StreamId::from(format!("post-{}", self.post_id.as_ref())),
            BlogEvent::CommentAdded {
                comment: Comment {
                    id: self.comment_id.clone(),
                    author: self.author.clone(),
                    content: self.content.clone(),
                    created_at: Utc::now(),
                }
            }
        )])
    }
}
}

Step 5: Create the Application

Update src/main.rs:

mod commands;
mod events;
mod state;
mod types;

use commands::*;
use eventcore::prelude::*;
use eventcore_memory::InMemoryEventStore;
use types::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the event store
    let event_store = InMemoryEventStore::new();
    let executor = CommandExecutor::new(event_store);

    // Create author and post IDs
    let author = AuthorId::try_new("alice".to_string())?;
    let post_id = PostId::try_new("hello-eventcore".to_string())?;

    // Publish a blog post
    let publish_cmd = PublishPost {
        post_id: post_id.clone(),
        title: PostTitle::try_new("Hello EventCore!".to_string())?,
        content: PostContent::try_new(
            "This is my first event-sourced blog post!".to_string()
        )?,
        author: author.clone(),
    };

    let result = executor.execute(publish_cmd).await?;
    println!("Post published with {} event(s)", result.events.len());

    // Add a comment
    let comment_cmd = AddComment {
        post_id: post_id.clone(),
        comment_id: "comment-1".to_string(),
        author: AuthorId::try_new("bob".to_string())?,
        content: "Great post!".to_string(),
    };

    executor.execute(comment_cmd).await?;
    println!("Comment added!");

    // Try to add duplicate comment (will fail)
    let duplicate_comment = AddComment {
        post_id,
        comment_id: "comment-1".to_string(), // Same ID!
        author: AuthorId::try_new("charlie".to_string())?,
        content: "Another comment".to_string(),
    };

    match executor.execute(duplicate_comment).await {
        Ok(_) => println!("This shouldn't happen!"),
        Err(e) => println!("Expected error: {}", e),
    }

    Ok(())
}

Step 6: Run Your Application

cargo run

You should see:

Post published with 1 event(s)
Comment added!
Expected error: Comment ID already exists

What You’ve Learned

In this tutorial, you’ve implemented:

  1. Type-Safe Domain Modeling - Using nutype for validation
  2. Event Sourcing Basics - Events as the source of truth
  3. Command Pattern - Encapsulating business operations
  4. Business Rule Validation - Enforcing invariants
  5. State Reconstruction - Building state from events

Next Steps

Enhance your blog engine with:

  • Projections for querying posts by author or tag
  • Multi-stream operations for author profiles
  • Web API using Axum or Actix
  • PostgreSQL backend for persistence
  • Subscriptions for real-time updates

Continue learning:

Part 1: Introduction

Welcome to EventCore! This section introduces the library, its philosophy, and when to use it.

Chapters in This Part

  1. What is EventCore? - Understanding multi-stream event sourcing
  2. When to Use EventCore - Decision guide for choosing EventCore
  3. Event Modeling Fundamentals - Learn to design systems with events
  4. Architecture Overview - High-level view of EventCore’s design

What You’ll Learn

  • The problems EventCore solves
  • How multi-stream event sourcing differs from traditional approaches
  • When EventCore is the right choice (and when it’s not)
  • How to think in events and model your domain
  • EventCore’s architecture and design principles

Prerequisites

  • Basic Rust knowledge
  • Familiarity with async programming helpful but not required
  • No prior event sourcing experience needed

Time to Complete

  • Reading: ~20 minutes
  • With exercises: ~45 minutes

Ready? Let’s start with What is EventCore?

Chapter 1.1: What is EventCore?

EventCore is a Rust library that implements multi-stream event sourcing - a powerful pattern that eliminates the traditional constraints of aggregate boundaries while maintaining strong consistency guarantees.

The Problem with Traditional Event Sourcing

Traditional event sourcing forces you to define rigid aggregate boundaries upfront:

#![allow(unused)]
fn main() {
// Traditional approach - forced aggregate boundaries
struct BankAccount {
    id: AccountId,
    balance: Money,
    // Can only modify THIS account
}

// Problem: How do you transfer money atomically?
// Option 1: Two separate commands (not atomic!)
// Option 2: Process managers/sagas (complex!)
// Option 3: Eventual consistency (risky!)
}

These boundaries often don’t match real business requirements:

  • Money transfers need to modify two accounts atomically
  • Order fulfillment needs to update inventory, orders, and shipping together
  • User registration might need to create accounts, profiles, and notifications

The EventCore Solution

EventCore introduces dynamic consistency boundaries - each command defines which streams it needs:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct TransferMoney {
    #[stream]
    from_account: StreamId,  // Read and write this stream
    #[stream]  
    to_account: StreamId,    // Read and write this stream too
    amount: Money,
}

// This command atomically:
// 1. Reads both account streams
// 2. Validates the business rules
// 3. Writes events to both streams
// 4. All in ONE atomic transaction!
}

Key Concepts

1. Event Streams

Instead of aggregates, EventCore uses streams - ordered sequences of events identified by a StreamId:

#![allow(unused)]
fn main() {
// Streams are just identifiers
let alice_account = StreamId::from_static("account-alice");
let bob_account = StreamId::from_static("account-bob");
let order_123 = StreamId::from_static("order-123");
}

2. Multi-Stream Commands

Commands can read from and write to multiple streams atomically:

#![allow(unused)]
fn main() {
// A command that involves multiple business entities
#[derive(Command, Clone)]
struct FulfillOrder {
    #[stream]
    order_id: StreamId,       // The order to fulfill
    #[stream]
    inventory_id: StreamId,   // The inventory to deduct from
    #[stream]
    shipping_id: StreamId,    // Create shipping record
}
}

3. Type-Safe Stream Access

The macro system ensures you can only write to streams you declared:

#![allow(unused)]
fn main() {
// In your handle method:
let events = vec![
    StreamWrite::new(
        &read_streams,
        self.order_id.clone(),      // ✅ OK - declared with #[stream]
        OrderEvent::Fulfilled
    )?,
    StreamWrite::new(
        &read_streams,
        some_other_stream,           // ❌ Compile error! Not declared
        SomeEvent::Happened
    )?,
];
}

4. Optimistic Concurrency Control

EventCore tracks stream versions to detect conflicts:

  1. Command reads streams at specific versions
  2. Command produces new events
  3. Write only succeeds if streams haven’t changed
  4. Automatic retry on conflicts
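
To make this concrete, here is a minimal, store-agnostic sketch of the version check (illustrative only, not EventCore's API): a write is rejected when the stream has advanced past the version the command read, and the executor then re-reads and retries.

use std::collections::HashMap;

#[derive(Debug)]
enum WriteError {
    VersionConflict { expected: u64, actual: u64 },
}

#[derive(Default)]
struct Stream {
    version: u64,
    events: Vec<String>,
}

/// Append only if the stream is still at the version the writer observed.
fn append_if_unchanged(
    streams: &mut HashMap<String, Stream>,
    stream_id: &str,
    expected_version: u64,
    event: String,
) -> Result<(), WriteError> {
    let stream = streams.entry(stream_id.to_string()).or_default();
    if stream.version != expected_version {
        // Another command wrote between our read and this write;
        // the executor re-reads the stream and retries the command.
        return Err(WriteError::VersionConflict {
            expected: expected_version,
            actual: stream.version,
        });
    }
    stream.events.push(event);
    stream.version += 1;
    Ok(())
}

fn main() {
    let mut store = HashMap::new();
    append_if_unchanged(&mut store, "account-alice", 0, "MoneyDeposited".into()).unwrap();
    // A writer holding a stale version is rejected and must retry with fresh state.
    let stale = append_if_unchanged(&mut store, "account-alice", 0, "MoneyWithdrawn".into());
    assert!(matches!(stale, Err(WriteError::VersionConflict { .. })));
}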

Benefits

  1. Simplified Architecture

    • No aggregate boundaries to design upfront
    • No process managers for cross-aggregate operations
    • No eventual consistency complexity
  2. Strong Consistency

    • All changes are atomic
    • No partial failures between streams
    • Transactions that match business requirements
  3. Type Safety

    • Commands declare their streams at compile time
    • Illegal operations won’t compile
    • Self-documenting code
  4. Performance

    • ~83 operations/second with PostgreSQL
    • Optimized for correctness over raw throughput
    • Batched operations for better performance

How It Works

  1. Command Declaration: Use #[derive(Command)] to declare which streams you need
  2. State Reconstruction: EventCore reads all requested streams and builds current state
  3. Business Logic: Your command validates rules and produces events
  4. Atomic Write: All events are written in a single transaction
  5. Optimistic Retry: On conflicts, EventCore retries automatically

Example: Complete Money Transfer

#![allow(unused)]
fn main() {
use eventcore::prelude::*;
use eventcore_macros::Command;

#[derive(Command, Clone)]
struct TransferMoney {
    #[stream]
    from_account: StreamId,
    #[stream]
    to_account: StreamId,
    amount: Money,
}

#[async_trait]
impl CommandLogic for TransferMoney {
    type State = AccountBalances;
    type Event = BankingEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        // Update state based on events
        match &event.payload {
            BankingEvent::MoneyWithdrawn { amount, .. } => {
                state.debit(&event.stream_id, *amount);
            }
            BankingEvent::MoneyDeposited { amount, .. } => {
                state.credit(&event.stream_id, *amount);
            }
            _ => {}
        }
    }

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Check balance
        let from_balance = state.balance(&self.from_account);
        require!(
            from_balance >= self.amount.value(),
            "Insufficient funds: balance={}, requested={}",
            from_balance,
            self.amount
        );

        // Create atomic events for both accounts
        Ok(vec![
            StreamWrite::new(
                &read_streams,
                self.from_account.clone(),
                BankingEvent::MoneyWithdrawn {
                    amount: self.amount.value(),
                    to: self.to_account.to_string(),
                }
            )?,
            StreamWrite::new(
                &read_streams,
                self.to_account.clone(),
                BankingEvent::MoneyDeposited {
                    amount: self.amount.value(),
                    from: self.from_account.to_string(),
                }
            )?,
        ])
    }
}
}

Next Steps

Now that you understand what EventCore is, let’s explore when to use it

Chapter 1.2: When to Use EventCore

In the modern age of fast computers and cheap storage, event sourcing should be the default approach for any line-of-business application. This chapter explores why EventCore is the right choice for your next project and addresses common concerns.

Why Event Sourcing Should Be Your Default

Traditional CRUD databases were designed in an era of expensive storage and slow computers. They optimize for storage efficiency by throwing away history - a terrible trade-off in today’s world. Here’s why event sourcing, and specifically EventCore, should be your default choice:

1. History is Free

Storage costs have plummeted. The complete history of your business operations costs pennies to store but provides immense value:

  • Debug production issues by replaying events
  • Satisfy any future audit requirement
  • Build new features on historical data
  • Prove compliance retroactively

2. CRUD Lies About Your Business

CRUD operations (Create, Read, Update, Delete) are technical concepts that don’t match business reality:

  • “Update” erases the reason for change
  • “Delete” pretends things never existed
  • State-based models lose critical business context

Event sourcing captures what actually happened: “CustomerChangedAddress”, “OrderCancelled”, “PriceAdjusted”

3. Future-Proof by Default

With EventCore, you never have to say “we didn’t track that”:

  • New reporting requirements? Replay events into new projections
  • Need to add analytics? The data is already there
  • Compliance rules changed? Full history available

EventCore Makes Event Sourcing Practical

While event sourcing should be the default, EventCore specifically excels by solving traditional event sourcing pain points:

1. Complex Business Transactions

Problem: Your business operations span multiple entities that must change together.

Example: E-commerce order fulfillment

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct FulfillOrder {
    #[stream]
    order: StreamId,         // Update order status
    #[stream]
    inventory: StreamId,     // Deduct items
    #[stream]
    shipping: StreamId,      // Create shipping record
    #[stream]
    customer: StreamId,      // Update loyalty points
}
}

Why EventCore: Traditional systems require distributed transactions or eventual consistency. EventCore makes this atomic and simple.

2. Financial Systems

Problem: Need complete audit trail and strong consistency for money movements.

Example: Payment processing

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ProcessPayment {
    #[stream]
    customer_account: StreamId,
    #[stream]
    merchant_account: StreamId,
    #[stream]
    payment_gateway: StreamId,
    #[stream]
    tax_authority: StreamId,
}
}

Why EventCore:

  • Every state change is recorded
  • Natural audit log for compliance
  • Atomic operations prevent partial payments
  • Easy to replay for reconciliation

3. Collaborative Systems

Problem: Multiple users modifying shared resources with conflict resolution needs.

Example: Project management tool

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct MoveTaskToColumn {
    #[stream]
    task: StreamId,
    #[stream]
    from_column: StreamId,
    #[stream]
    to_column: StreamId,
    #[stream]
    project: StreamId,
}
}

Why EventCore:

  • Event streams enable real-time updates
  • Natural conflict resolution through events
  • Complete history of who did what when

4. Regulatory Compliance

Problem: Regulations require you to show complete history of data changes.

Example: Healthcare records

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct UpdatePatientRecord {
    #[stream]
    patient: StreamId,
    #[stream]
    physician: StreamId,
    #[stream]
    audit_log: StreamId,
}
}

Why EventCore:

  • Immutable event log satisfies auditors
  • Can prove system state at any point in time
  • Natural GDPR compliance (event-level data retention)

5. Domain-Driven Design

Problem: Your domain has complex rules that span multiple aggregates.

Example: Insurance claim processing

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ProcessClaim {
    #[stream]
    claim: StreamId,
    #[stream]
    policy: StreamId,
    #[stream]
    customer: StreamId,
    #[stream]
    adjuster: StreamId,
}
}

Why EventCore:

  • Commands match business operations exactly
  • No artificial aggregate boundaries
  • Domain events become first-class citizens

Addressing Common Concerns

“But Event Sourcing is Complex!”

Myth: Event sourcing adds unnecessary complexity.

Reality: EventCore makes it simpler than CRUD:

  • No O/R mapping impedance mismatch
  • Commands map directly to business operations
  • No “load-modify-save” race conditions
  • Debugging is easier with full history

“What About Performance?”

Myth: Event sourcing is slow because it stores everything.

Reality:

  • EventCore achieves ~83 ops/sec with PostgreSQL - plenty for most business applications
  • Read models can be optimized for any query pattern
  • No complex joins needed - data is pre-projected
  • Scales horizontally by splitting streams

“Storage Costs Will Explode!”

Myth: Storing all events is expensive.

Reality: Let’s do the math:

  • Average event size: ~1KB
  • 1000 events/day = 365K events/year = 365MB/year
  • S3 storage cost: ~$0.023/GB/month = $0.10/year
  • Your complete business history costs less than a coffee

“What About GDPR/Privacy?”

Myth: You can’t delete data with event sourcing.

Reality: EventCore provides better privacy controls:

  • Crypto-shredding: Delete encryption keys to make data unreadable (see the sketch below)
  • Event-level retention policies
  • Selective projection rebuilding
  • Actually know what data you have about someone
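
A conceptual sketch of crypto-shredding (the idea, not an EventCore API): personal data inside events is encrypted with a per-user key, and deleting that key makes the data permanently unreadable while the immutable event log stays intact.

use std::collections::HashMap;

/// Per-user encryption keys held outside the event store.
struct KeyVault {
    keys: HashMap<String, [u8; 32]>, // user id -> encryption key
}

impl KeyVault {
    /// "Forget" a user: without the key, their encrypted event payloads
    /// can never be decrypted again, even though the events remain stored.
    fn shred(&mut self, user_id: &str) {
        self.keys.remove(user_id);
    }

    fn can_read(&self, user_id: &str) -> bool {
        self.keys.contains_key(user_id)
    }
}

fn main() {
    let mut vault = KeyVault { keys: HashMap::new() };
    vault.keys.insert("alice".into(), [0u8; 32]);
    assert!(vault.can_read("alice"));

    vault.shred("alice");
    assert!(!vault.can_read("alice")); // events remain, data is unreadable
}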

Special Considerations

Large Binary Data

For systems with large binary data (images, videos), use a hybrid approach:

  • Store metadata and operations as events
  • Store binaries in object storage (S3)
  • Best of both worlds

Graph-Heavy Queries

For social networks or recommendation engines:

  • Use EventCore for the write side
  • Project into graph databases for queries
  • Maintain consistency through event streams

Cache-Like Workloads

For session storage or caching:

  • These aren’t business operations
  • Use appropriate tools (Redis)
  • EventCore for business logic, Redis for caching

Migration Considerations

From Traditional Database

Good fit if:

  • You need better audit trails
  • Business rules span multiple tables
  • You’re already using event-driven architecture

Poor fit if:

  • Current solution works well
  • No complex business rules
  • Just need basic CRUD

From Microservices

Good fit if:

  • Struggling with distributed transactions
  • Need better consistency guarantees
  • Want to simplify architecture

Poor fit if:

  • True service isolation is required
  • Different teams own different services
  • Services use different tech stacks

Performance Considerations

EventCore is optimized for:

  • ✅ Correctness and consistency
  • ✅ Complex business operations
  • ✅ Audit and compliance needs

EventCore is NOT optimized for:

  • ❌ Maximum throughput (~83 ops/sec with PostgreSQL)
  • ❌ Minimum latency (ms-level operations)
  • ❌ Large binary data

The Right Question

Instead of asking “Do I need event sourcing?”, ask:

“Can I afford to throw away my business history?”

In an era of:

  • Regulatory scrutiny
  • Data-driven decisions
  • Machine learning opportunities
  • Debugging production issues
  • Changing business requirements

The answer is almost always NO.

Decision Framework

Start with EventCore for:

  • Any line-of-business application - Your default choice
  • Multi-entity operations - EventCore’s sweet spot
  • Financial systems - Audit trail included
  • Collaborative tools - Natural conflict resolution
  • Regulated industries - Compliance built-in
  • Domain-driven design - Commands match your domain

Consider Alternatives Only For:

  • 🤔 Pure caching layers - Use Redis alongside EventCore
  • 🤔 Binary blob storage - Hybrid approach with S3
  • 🤔 >1000 ops/sec - Add caching or consider specialized solutions

Summary

In 2024 and beyond, the question isn’t “Why event sourcing?” but “Why would you throw away your business history?”

EventCore makes event sourcing practical by:

  • Eliminating aggregate boundary problems
  • Providing multi-stream atomicity
  • Making it type-safe and simple
  • Scaling to real business needs

Storage is cheap. History is valuable. Make event sourcing your default.

Ready to dive deeper? Let’s explore Event Modeling Fundamentals

Chapter 1.3: Event Modeling Fundamentals

Event modeling is a visual technique for designing event-driven systems. It helps you discover your domain events, commands, and read models before writing any code. This chapter teaches you how to model systems that naturally translate to EventCore implementations.

What is Event Modeling?

Event modeling is a method of describing systems using three core elements:

  1. Events (Orange) - Things that happened
  2. Commands (Blue) - Things users want to do
  3. Read Models (Green) - Views of current state

The genius is in its simplicity: model your system on a timeline showing what happens when.

The Event Modeling Process

Step 1: Brainstorming Events

Start by identifying what happens in your system. Use past-tense language:

Example: Task Management System

Events (what happened):
- Task Created
- Task Assigned
- Task Completed
- Comment Added
- Due Date Changed
- Task Archived

Key principles:

  • Past tense (“Created” not “Create”)
  • Record facts (“Task Completed” not “Complete Task”)
  • Include relevant data in event names

Step 2: Building the Timeline

Arrange events on a timeline to tell the story of your system:

Time →
|
├─ Task Created ──┬─ Task Assigned ──┬─ Comment Added ──┬─ Task Completed
|   (by: Alice)   |   (to: Bob)      |   (by: Bob)      |   (by: Bob)
|   title: "Fix"  |                  |   "Working on it" |
|                 |                  |                   |
└─────────────────┴──────────────────┴───────────────────┴─────────────────

This visual representation helps you:

  • See the flow of your system
  • Identify missing events
  • Understand event relationships

Step 3: Identifying Commands

Commands trigger events. Look at each event and ask “What user action caused this?”

Command (Blue)           →  Event (Orange)
─────────────────────────────────────────
Create Task              →  Task Created
Assign Task              →  Task Assigned  
Complete Task            →  Task Completed
Add Comment              →  Comment Added

In EventCore, these become your command types:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct CreateTask {
    #[stream]
    task_id: StreamId,
    title: TaskTitle,
    description: TaskDescription,
}

#[derive(Command, Clone)]
struct AssignTask {
    #[stream]
    task_id: StreamId,
    #[stream]
    user_id: StreamId,
}
}

Step 4: Designing Read Models

Read models answer questions. Look at your UI/API needs:

Question                     →  Read Model (Green)
────────────────────────────────────────────────
"What tasks do I have?"      →  My Tasks List
"What's the project status?" →  Project Dashboard
"Who worked on what?"        →  Activity Timeline

In EventCore, these become projections:

#![allow(unused)]
fn main() {
// Read model for "My Tasks"
struct MyTasksProjection {
    tasks_by_user: HashMap<UserId, Vec<TaskSummary>>,
}

impl CqrsProjection for MyTasksProjection {
    fn apply(&mut self, event: &StoredEvent<TaskEvent>) {
        match &event.payload {
            TaskEvent::TaskAssigned { user_id, .. } => {
                // Update tasks_by_user
            }
            // ... handle other events
        }
    }
}
}

Event Modeling Patterns

Pattern 1: State Transitions

Many business processes are state machines:

Draft → Published → Archived
  ↓         ↓
Deleted  Unpublished

Events:
- ArticleDrafted
- ArticlePublished  
- ArticleUnpublished
- ArticleArchived
- ArticleDeleted

In EventCore:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct PublishArticle {
    #[stream]
    article_id: StreamId,
    #[stream]
    author_id: StreamId,    // Also track author actions
    scheduled_time: Option<Timestamp>,
}
}

Pattern 2: Collaborative Operations

When multiple entities participate:

Money Transfer Timeline:
                          
Source Account ──────┬──────────────┬─────────
                     ↓              ↑
                Money Withdrawn     │
                                    │
Target Account ──────────────┬──────┴─────────
                             ↓
                        Money Deposited

In EventCore, this is ONE atomic command:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct TransferMoney {
    #[stream]
    from_account: StreamId,
    #[stream]
    to_account: StreamId,
    amount: Money,
}
}

Pattern 3: Process Flows

Complex business processes with multiple steps:

Order Flow:
Order Created → Payment Processed → Inventory Reserved → Order Shipped
      |                |                    |                  |
   Order Stream   Payment Stream    Inventory Stream    Shipping Stream

Each step might be a separate command or one complex command:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct FulfillOrder {
    #[stream]
    order_id: StreamId,
    #[stream]
    payment_id: StreamId,
    #[stream]
    inventory_id: StreamId,
    #[stream]
    shipping_id: StreamId,
}
}

From Model to Implementation

1. Events Become Rust Enums

Your discovered events:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Serialize, Deserialize)]
enum TaskEvent {
    Created { title: String, description: String },
    Assigned { user_id: UserId },
    Completed { completed_at: Timestamp },
    CommentAdded { author: UserId, text: String },
}
}

2. Commands Become EventCore Commands

Your identified commands:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct CreateTask {
    #[stream]
    task_id: StreamId,
    title: TaskTitle,
}

#[async_trait]
impl CommandLogic for CreateTask {
    type Event = TaskEvent;
    type State = TaskState;
    
    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        require!(!state.exists, "Task already exists");
        
        Ok(vec![
            StreamWrite::new(
                &read_streams,
                self.task_id.clone(),
                TaskEvent::Created {
                    title: self.title.as_ref().to_string(),
                    description: String::new(),
                }
            )?
        ])
    }
}
}

3. Read Models Become Projections

Your view requirements:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct TasksByUserProjection {
    index: HashMap<UserId, HashSet<TaskId>>,
}

impl CqrsProjection for TasksByUserProjection {
    fn apply(&mut self, event: &StoredEvent<TaskEvent>) {
        match &event.payload {
            TaskEvent::Assigned { user_id } => {
                self.index
                    .entry(user_id.clone())
                    .or_default()
                    .insert(TaskId::from(&event.stream_id));
            }
            _ => {}
        }
    }
}
}

Workshop: Model a Coffee Shop

Let’s practice with a simple domain:

Step 1: Brainstorm Events

What happens in a coffee shop?

  • Customer Entered
  • Order Placed
  • Payment Received
  • Coffee Prepared
  • Order Completed
  • Customer Left

Step 2: Build Timeline

Customer Entered → Order Placed → Payment Received → Coffee Prepared → Order Completed
     |                 |               |                  |                |
  Customer ID      Order Stream   Payment Stream    Barista Stream    Order Stream

Step 3: Identify Commands

  • Enter Shop → Customer Entered
  • Place Order → Order Placed
  • Process Payment → Payment Received
  • Prepare Coffee → Coffee Prepared
  • Complete Order → Order Completed

Step 4: Design Read Models

  • Queue Display: Shows pending orders for baristas
  • Customer Receipt: Shows order details and status
  • Daily Sales Report: Aggregates all payments

Step 5: Implement in EventCore

#![allow(unused)]
fn main() {
// One command handling the full order flow
#[derive(Command, Clone)]
struct PlaceAndPayOrder {
    #[stream]
    order_id: StreamId,
    #[stream]
    customer_id: StreamId,
    #[stream]
    register_id: StreamId,
    items: Vec<MenuItem>,
    payment: PaymentMethod,
}
}
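
The Queue Display read model from Step 4 can then be sketched as a projection, following the CqrsProjection shape used earlier in this chapter. The CoffeeShopEvent enum and field names here are illustrative assumptions (and StreamId is assumed usable as a map key), not part of EventCore:

#![allow(unused)]
fn main() {
use std::collections::HashMap;

// Illustrative event type for the workshop domain
#[derive(Debug, Clone)]
enum CoffeeShopEvent {
    OrderPlaced { items: Vec<MenuItem> },
    CoffeePrepared,
    OrderCompleted,
}

#[derive(Default)]
struct QueueDisplayProjection {
    // Orders waiting for a barista, keyed by their order stream
    pending: HashMap<StreamId, Vec<MenuItem>>,
}

impl CqrsProjection for QueueDisplayProjection {
    fn apply(&mut self, event: &StoredEvent<CoffeeShopEvent>) {
        match &event.payload {
            CoffeeShopEvent::OrderPlaced { items } => {
                self.pending.insert(event.stream_id.clone(), items.clone());
            }
            CoffeeShopEvent::OrderCompleted => {
                self.pending.remove(&event.stream_id);
            }
            _ => {}
        }
    }
}
}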

Best Practices

  1. Start with Events, Not Structure

    • Don’t design database schemas
    • Focus on what happens in the business
  2. Use Domain Language

    • “InvoiceSent” not “UpdateInvoiceStatus”
    • Match the language your users use
  3. Model Time Explicitly

    • Show the flow of events
    • Understand concurrent vs sequential operations
  4. Keep Events Focused

    • One event = one business fact
    • Don’t combine unrelated changes
  5. Commands Match User Intent

    • “TransferMoney” not “UpdateAccountBalance”
    • Commands are what users want to do

Common Pitfalls

Modeling State Instead of Events

#![allow(unused)]
fn main() {
// Bad: Thinking in state
AccountUpdated { balance: 100 }

// Good: Thinking in events  
MoneyDeposited { amount: 50 }
}

Technical Events

#![allow(unused)]
fn main() {
// Bad: Technical focus
DatabaseRecordInserted

// Good: Business focus
CustomerRegistered
}

Missing the Why

#![allow(unused)]
fn main() {
// Bad: Just the what
PriceChanged { new_price: 100 }

// Good: Including why
PriceReducedForSale { original: 150, sale_price: 100, reason: "Black Friday" }
}

Summary

Event modeling helps you:

  1. Understand your domain before coding
  2. Discover events, commands, and read models
  3. Design systems that map naturally to EventCore
  4. Communicate with stakeholders visually

The key insight: Model what happens, not what is.

Next, let’s look at EventCore’s Architecture to understand how your models become working systems →

Chapter 1.4: Architecture Overview

This chapter provides a high-level view of EventCore’s architecture, showing how commands, events, and projections work together to create robust event-sourced systems.

Core Architecture

EventCore follows a clean, layered architecture:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Application   │     │   Application   │     │   Application   │
│     (Axum)      │     │    (CLI)        │     │   (gRPC)        │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         └───────────────────────┴───────────────────────┘
                                 │
                    ┌────────────▼────────────┐
                    │    Command Executor     │
                    │  (Validation & Retry)   │
                    └────────────┬────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
┌────────▼────────┐   ┌──────────▼──────────┐  ┌────────▼────────┐
│    Commands     │   │   Event Store       │  │  Projections    │
│  (Domain Logic) │   │  (PostgreSQL)       │  │  (Read Models)  │
└─────────────────┘   └─────────────────────┘  └─────────────────┘

Key Components

1. Commands

Commands encapsulate business operations. They declare what streams they need and contain the business logic:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ApproveOrder {
    #[stream]
    order: StreamId,
    #[stream]
    approver: StreamId,
    #[stream]
    inventory: StreamId,
}
}

Responsibilities:

  • Declare stream dependencies via #[stream] attributes
  • Implement business validation rules
  • Generate events representing what happened
  • Ensure consistency within their boundaries

2. Command Executor

The executor orchestrates command execution with automatic retry logic:

#![allow(unused)]
fn main() {
let executor = CommandExecutor::builder()
    .with_store(event_store)
    .with_retry_policy(RetryPolicy::exponential_backoff())
    .build();

let result = executor.execute(&command).await?;
}

Execution Flow:

  1. Read Phase: Fetch all declared streams
  2. Reconstruct State: Apply events to build current state
  3. Execute Command: Run business logic
  4. Write Phase: Atomically write new events
  5. Retry on Conflict: Handle optimistic concurrency

3. Event Store

The event store provides durable, ordered storage of events:

#![allow(unused)]
fn main() {
#[async_trait]
pub trait EventStore: Send + Sync {
    async fn read_stream(&self, stream_id: &StreamId) -> Result<Vec<StoredEvent>>;
    async fn write_events(&self, events: Vec<EventToWrite>) -> Result<()>;
}
}

Guarantees:

  • Atomic multi-stream writes
  • Optimistic concurrency control
  • Global ordering via UUIDv7 event IDs (see the sketch below)
  • Exactly-once semantics
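
On the global-ordering point: UUIDv7 identifiers embed a millisecond timestamp in their most significant bits, so later events compare greater than earlier ones. A quick illustration using the uuid crate (with the v7 feature listed in the installation section):

use std::thread::sleep;
use std::time::Duration;
use uuid::Uuid;

fn main() {
    let first = Uuid::now_v7();
    sleep(Duration::from_millis(2)); // guarantee a later timestamp
    let second = Uuid::now_v7();

    // Byte-wise comparison follows creation time because the timestamp
    // occupies the high bits of a UUIDv7.
    assert!(first < second);
    println!("{first} precedes {second}");
}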

4. Projections

Projections build read models from events:

#![allow(unused)]
fn main() {
impl CqrsProjection for OrderSummaryProjection {
    type Event = OrderEvent;
    type Error = ProjectionError;

    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match &event.payload {
            OrderEvent::Approved { .. } => {
                self.approved_count += 1;
            }
            // Handle other events
        }
        Ok(())
    }
}
}

Capabilities:

  • Real-time updates from event streams
  • Rebuild from any point in time
  • Multiple projections from same events
  • Optimized for specific queries

Data Flow

Write Path (Commands)

User Action
    ↓
HTTP Request
    ↓
Command Creation ──────→ #[derive(Command)] macro generates boilerplate
    ↓
Executor.execute()
    ↓
Read Streams ──────────→ PostgreSQL: SELECT events WHERE stream_id IN (...)
    ↓
Reconstruct State ─────→ Fold events into current state
    ↓
Command.handle() ──────→ Business logic validates and generates events
    ↓
Write Events ──────────→ PostgreSQL: INSERT events (atomic transaction)
    ↓
Return Result

Read Path (Projections)

Events Written
    ↓
Event Notification
    ↓
Projection Runner ─────→ Subscribes to event streams
    ↓
Load Event
    ↓
Projection.apply() ────→ Update read model state
    ↓
Save Checkpoint ───────→ Track position for resume
    ↓
Query Read Model ──────→ Optimized for specific access patterns

Multi-Stream Atomicity

EventCore’s key innovation is atomic operations across multiple streams:

Traditional Event Sourcing

Account A         Account B
    │                 │
    ├─ Withdraw?      │        ❌ Two separate operations
    │                 ├─ Deposit?   (not atomic!)
    ↓                 ↓

EventCore Approach

        TransferMoney Command
               │
    ┌──────────┴──────────┐
    ↓                     ↓
Account A              Account B
    │                     │
    ├─ Withdrawn ←────────┤ Deposited    ✅ One atomic operation!
    ↓                     ↓

Concurrency Model

EventCore uses optimistic concurrency control:

  1. Version Tracking: Each stream has a version number
  2. Read Version: Commands note the version when reading
  3. Conflict Detection: Writes fail if version changed
  4. Automatic Retry: Executor retries with fresh data
#![allow(unused)]
fn main() {
// Internally tracked by EventCore
struct StreamVersion {
    stream_id: StreamId,
    version: EventVersion,
}

// Automatic retry on conflicts
let result = executor
    .execute(&command)
    .await?;  // Retries handled internally
}

Type Safety

EventCore leverages Rust’s type system for correctness:

Stream Access Control

#![allow(unused)]
fn main() {
// Compile-time enforcement
impl TransferMoney {
    fn handle(&self, read_streams: ReadStreams<Self::StreamSet>) {
        // ✅ Can only write to declared streams
        StreamWrite::new(&read_streams, self.from_account, event)?;
        
        // ❌ Compile error - stream not declared!
        StreamWrite::new(&read_streams, other_stream, event)?;
    }
}
}

Validated Types

#![allow(unused)]
fn main() {
// Parse, don't validate
#[nutype(validate(greater = 0))]
struct Money(u64);

// Once created, always valid
let amount = Money::try_new(100)?;  // Validated at boundary
transfer_money(amount);              // No validation needed
}

Deployment Architecture

Simple Deployment

┌─────────────┐     ┌──────────────┐
│  Your App   │────▶│  PostgreSQL  │
└─────────────┘     └──────────────┘

Production Deployment

                    Load Balancer
                         │
        ┌────────────────┼────────────────┐
        ↓                ↓                ↓
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│   App Pod 1   │ │   App Pod 2   │ │   App Pod 3   │
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
        │                 │                 │
        └─────────────────┼─────────────────┘
                          ↓
                ┌─────────────────┐
                │   PostgreSQL    │
                │   (Primary)     │
                └────────┬────────┘
                         │
        ┌────────────────┼────────────────┐
        ↓                                  ↓
┌───────────────┐                 ┌───────────────┐
│  PG Replica 1 │                 │  PG Replica 2 │
└───────────────┘                 └───────────────┘

Performance Characteristics

EventCore is optimized for correctness and developer productivity:

Throughput

  • Single-stream commands: ~83 ops/sec (PostgreSQL), 187,711 ops/sec (in-memory)
  • Multi-stream commands: ~25-50 ops/sec (PostgreSQL)
  • Batch operations: 750,000-820,000 events/sec (in-memory)

Latency

  • Command execution: 10-20ms (typical)
  • Conflict retry: +5-10ms per retry
  • Projection lag: <100ms (typical)

Scaling Strategies

  1. Vertical: Larger PostgreSQL instance
  2. Read Scaling: PostgreSQL read replicas
  3. Stream Sharding: Partition by stream ID
  4. Caching: Read model caching layer

Error Handling

EventCore provides structured error handling:

#![allow(unused)]
fn main() {
pub enum CommandError {
    ValidationFailed(String),      // Business rule violations
    ConcurrencyConflict,          // Version conflicts (retried)
    StreamNotFound(StreamId),     // Missing streams
    EventStoreFailed(String),     // Infrastructure errors
}
}

Errors are categorized for appropriate handling:

  • Retriable: Concurrency conflicts, transient failures
  • Non-retriable: Validation failures, business rule violations
  • Fatal: Infrastructure failures, panic recovery
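
A self-contained sketch of how a caller might branch on these categories. The enum mirrors the one above, with StreamId simplified to a String for brevity; check the crate documentation for the exact error type in your EventCore version.

#[derive(Debug)]
pub enum CommandError {
    ValidationFailed(String),
    ConcurrencyConflict,
    StreamNotFound(String),
    EventStoreFailed(String),
}

/// Decide whether a failed command is worth retrying.
fn is_retriable(err: &CommandError) -> bool {
    match err {
        // Transient: another writer won the race; a retry sees fresh state.
        CommandError::ConcurrencyConflict => true,
        // Possibly transient infrastructure trouble (network, database).
        CommandError::EventStoreFailed(_) => true,
        // Business rule violations and missing streams won't fix themselves.
        CommandError::ValidationFailed(_) | CommandError::StreamNotFound(_) => false,
    }
}

fn main() {
    let err = CommandError::ValidationFailed("insufficient funds".into());
    assert!(!is_retriable(&err));
    assert!(is_retriable(&CommandError::ConcurrencyConflict));
}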

Monitoring and Observability

Built-in instrumentation for production visibility:

#![allow(unused)]
fn main() {
// Automatic metrics
eventcore.commands.executed{command="TransferMoney", status="success"} 
eventcore.events.written{stream="account-123"} 
eventcore.retries{reason="concurrency_conflict"}

// Structured logging
{"level":"info", "command":"TransferMoney", "duration_ms":15, "events_written":2}

// OpenTelemetry traces
TransferMoney
  ├─ read_streams (5ms)
  ├─ reconstruct_state (2ms)
  ├─ handle_command (3ms)
  └─ write_events (5ms)
}
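
To capture logs like these in your own service, install a tracing subscriber once at startup. A minimal sketch using the tracing and tracing-subscriber crates from the installation section (EventCore's actual metric and span names may differ from those shown above):

use tracing_subscriber::EnvFilter;

fn main() {
    // RUST_LOG controls verbosity, e.g. RUST_LOG=eventcore=debug,info
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::from_default_env())
        .init();

    tracing::info!(command = "TransferMoney", duration_ms = 15, "command executed");
}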

Summary

EventCore’s architecture provides:

  1. Clean Separation: Commands, events, and projections have clear responsibilities
  2. Multi-Stream Atomicity: Complex operations remain consistent
  3. Type Safety: Rust’s type system prevents errors
  4. Production Ready: Built-in retry, monitoring, and error handling
  5. Flexible Deployment: From simple to highly-scaled architectures

The architecture is designed to make the right thing easy and the wrong thing impossible.

Ready to build something? Continue to Part 2: Getting Started

Part 2: Getting Started

This comprehensive tutorial walks you through building a complete task management system with EventCore. You’ll learn event modeling, domain design, command implementation, projections, and testing.

What We’ll Build

A task management system with:

  • Creating and managing tasks
  • Assigning tasks to users
  • Comments and activity tracking
  • Real-time task lists and dashboards
  • Complete audit trail

Chapters in This Part

  1. Setting Up Your Project - Create a new Rust project with EventCore
  2. Modeling the Domain - Design events and commands using event modeling
  3. Implementing Commands - Build commands with the macro system
  4. Working with Projections - Create read models for queries
  5. Testing Your Application - Write comprehensive tests

Prerequisites

  • Rust 1.70+ installed
  • Basic Rust knowledge (ownership, traits, async)
  • PostgreSQL 12+ (or use in-memory store for learning)
  • 30-60 minutes to complete

Learning Outcomes

By the end of this tutorial, you’ll understand:

  • How to model domains with events
  • Using EventCore’s macro system
  • Building multi-stream commands
  • Creating and updating projections
  • Testing event-sourced systems

Code Repository

The complete code for this tutorial is available at:

git clone https://github.com/your-org/eventcore-task-tutorial
cd eventcore-task-tutorial

Ready? Let’s set up your project

Chapter 2.1: Setting Up Your Project

Let’s create a new Rust project and add EventCore dependencies. We’ll build a task management system that demonstrates EventCore’s key features.

Create a New Project

cargo new taskmaster --bin
cd taskmaster

Add Dependencies

Edit Cargo.toml to include EventCore and related dependencies:

[package]
name = "taskmaster"
version = "0.1.0"
edition = "2021"

[dependencies]
# EventCore core functionality
eventcore = "0.1"
eventcore-macros = "0.1"

# For development/testing - switch to eventcore-postgres for production
eventcore-memory = "0.1"

# Async runtime
tokio = { version = "1.40", features = ["full"] }
async-trait = "0.1"

# Serialization
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

# Type validation
nutype = { version = "0.6", features = ["serde"] }

# Utilities
uuid = { version = "1.11", features = ["v7", "serde"] }
chrono = { version = "0.4", features = ["serde"] }
thiserror = "2.0"

# For our CLI interface
clap = { version = "4.5", features = ["derive"] }

[dev-dependencies]
# Testing utilities
proptest = "1.6"

Project Structure

Create the following directory structure:

taskmaster/
├── Cargo.toml
├── src/
│   ├── main.rs           # Application entry point
│   ├── domain/
│   │   ├── mod.rs        # Domain module
│   │   ├── types.rs      # Domain types with validation
│   │   ├── events.rs     # Event definitions
│   │   └── commands/     # Command implementations
│   │       ├── mod.rs
│   │       ├── create_task.rs
│   │       ├── assign_task.rs
│   │       └── complete_task.rs
│   ├── projections/
│   │   ├── mod.rs        # Projections module
│   │   ├── task_list.rs  # User task lists
│   │   └── statistics.rs # Task statistics
│   └── api/
│       ├── mod.rs        # API module (we'll add this in Part 4)
│       └── handlers.rs   # HTTP handlers

Create the directories:

mkdir -p src/domain/commands
mkdir -p src/projections
mkdir -p src/api

Initial Setup Code

Let’s create the basic module structure:

src/main.rs

mod domain;
mod projections;

use clap::{Parser, Subcommand};
use eventcore::prelude::*;
use eventcore_memory::InMemoryEventStore;

#[derive(Parser)]
#[command(name = "taskmaster")]
#[command(about = "A task management system built with EventCore")]
struct Cli {
    #[command(subcommand)]
    command: Commands,
}

#[derive(Subcommand)]
enum Commands {
    /// Create a new task
    Create {
        /// Task title
        title: String,
        /// Task description
        description: String,
    },
    /// List all tasks
    List,
    /// Run interactive demo
    Demo,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize event store (in-memory for now)
    let event_store = InMemoryEventStore::new();
    let executor = CommandExecutor::new(event_store);
    
    let cli = Cli::parse();
    
    match cli.command {
        Commands::Create { title, description } => {
            println!("Creating task: {} - {}", title, description);
            // We'll implement this in Chapter 2.3
        }
        Commands::List => {
            println!("Listing tasks...");
            // We'll implement this in Chapter 2.4
        }
        Commands::Demo => {
            println!("Running demo...");
            run_demo(executor).await?;
        }
    }
    
    Ok(())
}

async fn run_demo<ES: EventStore>(executor: CommandExecutor<ES>) 
-> Result<(), Box<dyn std::error::Error>> 
where
    ES::Event: From<domain::events::TaskEvent> + TryInto<domain::events::TaskEvent>,
{
    println!("🚀 EventCore Task Management Demo");
    println!("================================\n");
    
    // We'll add demo code as we build features
    
    Ok(())
}

src/domain/mod.rs

#![allow(unused)]
fn main() {
pub mod types;
pub mod events;
pub mod commands;

// Re-export commonly used items
pub use types::*;
pub use events::*;
}

src/domain/types.rs

#![allow(unused)]
fn main() {
use nutype::nutype;
use serde::{Deserialize, Serialize};
use uuid::Uuid;

/// Validated task title - must be non-empty and reasonable length
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 200),
    derive(
        Debug,
        Clone,
        PartialEq,
        Eq,
        AsRef,
        Serialize,
        Deserialize,
        Display
    )
)]
pub struct TaskTitle(String);

/// Validated task description
#[nutype(
    sanitize(trim),
    validate(len_char_max = 2000),
    derive(
        Debug,
        Clone,
        PartialEq,
        Eq,
        AsRef,
        Serialize,
        Deserialize
    )
)]
pub struct TaskDescription(String);

/// Validated comment text
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 1000),
    derive(
        Debug,
        Clone,
        PartialEq,
        Eq,
        AsRef,
        Serialize,
        Deserialize
    )
)]
pub struct CommentText(String);

/// Validated user name
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 100),
    derive(
        Debug,
        Clone,
        PartialEq,
        Eq,
        Hash,
        AsRef,
        Serialize,
        Deserialize,
        Display
    )
)]
pub struct UserName(String);

/// Strongly-typed task ID
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct TaskId(Uuid);

impl TaskId {
    pub fn new() -> Self {
        Self(Uuid::now_v7())
    }
}

impl Default for TaskId {
    fn default() -> Self {
        Self::new()
    }
}

impl std::fmt::Display for TaskId {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Task priority levels. Ord and Hash are derived so projections can
/// sort by priority and use it as a map key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Serialize, Deserialize)]
pub enum Priority {
    Low,
    Medium,
    High,
    Critical,
}

impl Default for Priority {
    fn default() -> Self {
        Self::Medium
    }
}

/// Task status - note we model this as events, not mutable state.
/// Hash is derived so projections can use status as a map key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub enum TaskStatus {
    Open,
    InProgress,
    Completed,
    Cancelled,
}

impl Default for TaskStatus {
    fn default() -> Self {
        Self::Open
    }
}
}

src/domain/events.rs

#![allow(unused)]
fn main() {
use super::types::*;
use serde::{Deserialize, Serialize};
use chrono::{DateTime, Utc};

/// Events that can occur in our task management system
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum TaskEvent {
    /// A new task was created
    Created {
        task_id: TaskId,
        title: TaskTitle,
        description: TaskDescription,
        creator: UserName,
        created_at: DateTime<Utc>,
    },
    
    /// Task was assigned to a user
    Assigned {
        task_id: TaskId,
        assignee: UserName,
        assigned_by: UserName,
        assigned_at: DateTime<Utc>,
    },
    
    /// Task was unassigned
    Unassigned {
        task_id: TaskId,
        unassigned_by: UserName,
        unassigned_at: DateTime<Utc>,
    },
    
    /// Task priority was changed
    PriorityChanged {
        task_id: TaskId,
        old_priority: Priority,
        new_priority: Priority,
        changed_by: UserName,
        changed_at: DateTime<Utc>,
    },
    
    /// Comment was added to task
    CommentAdded {
        task_id: TaskId,
        comment: CommentText,
        author: UserName,
        commented_at: DateTime<Utc>,
    },
    
    /// Task was completed
    Completed {
        task_id: TaskId,
        completed_by: UserName,
        completed_at: DateTime<Utc>,
    },
    
    /// Task was reopened after completion
    Reopened {
        task_id: TaskId,
        reopened_by: UserName,
        reopened_at: DateTime<Utc>,
        reason: Option<String>,
    },
    
    /// Task was cancelled
    Cancelled {
        task_id: TaskId,
        cancelled_by: UserName,
        cancelled_at: DateTime<Utc>,
        reason: Option<String>,
    },
}

// Required for EventCore's type conversion
impl TryFrom<&TaskEvent> for TaskEvent {
    type Error = std::convert::Infallible;
    
    fn try_from(value: &TaskEvent) -> Result<Self, Self::Error> {
        Ok(value.clone())
    }
}
}

src/domain/commands/mod.rs

#![allow(unused)]
fn main() {
mod create_task;
mod assign_task;
mod complete_task;

pub use create_task::*;
pub use assign_task::*;
pub use complete_task::*;
}

src/projections/mod.rs

#![allow(unused)]
fn main() {
mod task_list;
mod statistics;

pub use task_list::*;
pub use statistics::*;
}

Verify Setup

Let’s make sure everything compiles:

cargo build

You should see output like:

   Compiling taskmaster v0.1.0
    Finished dev [unoptimized + debuginfo] target(s) in X.XXs

Create a Simple Test

Add to src/main.rs:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    use crate::domain::types::*;
    
    #[test]
    fn test_validated_types() {
        // Valid title
        let title = TaskTitle::try_new("Fix the bug").unwrap();
        assert_eq!(title.as_ref(), "Fix the bug");
        
        // Empty title should fail
        assert!(TaskTitle::try_new("").is_err());
        
        // Whitespace is trimmed
        let title = TaskTitle::try_new("  Trimmed  ").unwrap();
        assert_eq!(title.as_ref(), "Trimmed");
    }
    
    #[test]
    fn test_task_id_generation() {
        let id1 = TaskId::new();
        let id2 = TaskId::new();
        
        // IDs should be unique
        assert_ne!(id1, id2);
        
        // IDs should be sortable by creation time (UUIDv7 property);
        // compare string forms since the inner Uuid field is private
        assert!(id1.to_string() < id2.to_string());
    }
}
}

Run the tests:

cargo test
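
Since proptest is already in dev-dependencies, you can also property-test the validated types. A hedged sketch (add it alongside the tests above):

#![allow(unused)]
fn main() {
#[cfg(test)]
mod property_tests {
    use crate::domain::types::*;
    use proptest::prelude::*;

    proptest! {
        #[test]
        fn task_title_is_always_trimmed(s in "[a-zA-Z0-9 ]{1,50}") {
            // Whenever construction succeeds, the stored value is trimmed
            if let Ok(title) = TaskTitle::try_new(s.clone()) {
                prop_assert_eq!(title.as_ref(), s.trim());
            }
        }

        #[test]
        fn task_title_rejects_whitespace_only(s in "[ \t]{0,10}") {
            prop_assert!(TaskTitle::try_new(s).is_err());
        }
    }
}
}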

Environment Setup for PostgreSQL (Optional)

If you want to use PostgreSQL instead of the in-memory store:

  1. Start PostgreSQL with Docker:
docker run -d \
  --name eventcore-postgres \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_DB=taskmaster \
  -p 5432:5432 \
  postgres:17
  2. Update Cargo.toml:
[dependencies]
eventcore-postgres = "0.1"
sqlx = { version = "0.8", features = ["runtime-tokio-rustls", "postgres"] }
  3. Set environment variable:
export DATABASE_URL="postgres://postgres:password@localhost/taskmaster"
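
You can verify connectivity before wiring up the application (assuming the psql client is installed):

psql "$DATABASE_URL" -c "SELECT 1"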

Summary

We’ve set up:

  • ✅ A new Rust project with EventCore dependencies
  • ✅ Domain types with validation using nutype
  • ✅ Event definitions for our task system
  • ✅ Basic project structure
  • ✅ Test infrastructure

Next, we’ll model our domain using event modeling techniques →

Chapter 2.2: Modeling the Domain

Now that our project is set up, let’s use event modeling to design our task management system. We’ll discover the events, commands, and read models that make up our domain.

Step 1: Brainstorm the Events

What happens in a task management system? Let’s think through a typical workflow:

Events (Orange - things that happened):
- Task Created
- Task Assigned 
- Task Started
- Comment Added
- Task Completed
- Task Reopened
- Priority Changed
- Due Date Set
- Task Cancelled

Step 2: Build the Timeline

Let’s visualize how these events flow through time:

Timeline →
            Task Created ──┬── Task Assigned ──┬── Comment Added ──┬── Task Completed
                          │                    │                   │
User: Alice               │   User: Bob       │   User: Bob      │   User: Bob
Title: "Fix login bug"    │   Assignee: Bob   │   "Found issue"  │   
                          │                    │                   │
Stream: task-123          │   Streams:        │   Stream:        │   Streams:
                          │   - task-123      │   - task-123     │   - task-123
                          │   - user-bob      │                  │   - user-bob

Notice how some operations involve multiple streams - this is where EventCore shines!

Step 3: Identify Commands

For each event, what user action triggered it?

Command (Blue)      Events (Orange)      Streams Involved
Create Task         Task Created         task
Assign Task         Task Assigned        task, assignee
Start Task          Task Started         task, user
Add Comment         Comment Added        task
Complete Task       Task Completed       task, user
Reopen Task         Task Reopened        task, user
Change Priority     Priority Changed     task
Cancel Task         Task Cancelled       task, user

Step 4: Design Read Models

What questions do users need answered?

Question                          Read Model (Green)    Updated By Events
“What are my tasks?”              User Task List        Assigned, Completed, Cancelled
“What’s the task status?”         Task Details          All task events
“What’s the team workload?”       Team Dashboard        Created, Assigned, Completed
“What happened to this task?”     Task History          All events (audit log)

Step 5: Discover Business Rules

As we model, we discover rules that our commands must enforce:

  1. Task Creation

    • Title is required and non-empty
    • Description has reasonable length limit
    • Creator must be identified
  2. Task Assignment

    • Can’t assign to non-existent user
    • Should track assignment history
    • Unassigning is explicit action
  3. Task Completion

    • Only assigned user can complete (or admin)
    • Can’t complete cancelled tasks
    • Completion can be undone (reopen)
  4. Comments

    • Must have content
    • Track author and timestamp
    • Comments are immutable

Translating to EventCore

Events Stay Close to Our Model

Our discovered events map directly to code:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum TaskEvent {
    Created {
        task_id: TaskId,
        title: TaskTitle,
        description: TaskDescription,
        creator: UserName,
        created_at: DateTime<Utc>,
    },
    Assigned {
        task_id: TaskId,
        assignee: UserName,
        assigned_by: UserName,
        assigned_at: DateTime<Utc>,
    },
    // ... other events
}
}

Commands Declare Their Streams

Multi-stream operations are explicit:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct AssignTask {
    #[stream]
    task_id: StreamId,      // The task stream
    #[stream]
    user_id: StreamId,      // The assignee's stream
    assigned_by: UserName,
}
}

This command will:

  1. Read both streams atomically
  2. Validate the assignment
  3. Write events to both streams
  4. All in one transaction!

State Models for Each Command

Each command needs different state views:

#![allow(unused)]
fn main() {
// State for task operations
#[derive(Default)]
struct TaskState {
    exists: bool,
    title: String,
    status: TaskStatus,
    assignee: Option<UserName>,
    creator: Option<UserName>,    // Option: UserName has no Default
}

// State for user operations  
#[derive(Default)]
struct UserTasksState {
    user_name: Option<UserName>,  // Option: UserName has no Default
    assigned_tasks: Vec<TaskId>,
    completed_count: u32,
}
}
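
Chapter 2.3 shows how EventCore calls apply() to build these states from events; as a preview, here is a minimal, EventCore-agnostic sketch of folding TaskEvents into the TaskState struct above:

#![allow(unused)]
fn main() {
fn fold_task_state(events: &[TaskEvent]) -> TaskState {
    let mut state = TaskState::default();
    for event in events {
        match event {
            TaskEvent::Created { title, creator, .. } => {
                state.exists = true;
                state.title = title.to_string();
                state.creator = Some(creator.clone());
            }
            TaskEvent::Assigned { assignee, .. } => {
                state.assignee = Some(assignee.clone());
            }
            TaskEvent::Unassigned { .. } => {
                state.assignee = None;
            }
            TaskEvent::Completed { .. } => {
                state.status = TaskStatus::Completed;
            }
            TaskEvent::Cancelled { .. } => {
                state.status = TaskStatus::Cancelled;
            }
            _ => {}
        }
    }
    state
}
}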

Modeling Complex Scenarios

Scenario: Task Handoff

When reassigning a task from Alice to Bob:

Timeline →
        Task Assigned       Task Unassigned      Task Assigned
        (to: Alice)         (from: Alice)        (to: Bob)
             │                    │                   │
             ├────────────────────┴───────────────────┤
             │                                        │
    Streams affected:                        Streams affected:
    - task-123                               - task-123
    - user-alice                             - user-alice  
                                             - user-bob

In EventCore, we can model this as one atomic operation:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ReassignTask {
    #[stream]
    task_id: StreamId,
    #[stream]
    from_user: StreamId,
    #[stream]
    to_user: StreamId,
    reassigned_by: UserName,
}
}

Scenario: Bulk Operations

Assigning multiple tasks to a user:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct BulkAssignTasks {
    #[stream]
    user_id: StreamId,
    #[stream("tasks")]  // Multiple task streams
    task_ids: Vec<StreamId>,
    assigned_by: UserName,
}
}

The beauty of EventCore: this remains atomic across ALL streams!

Visual Domain Model

Here’s our complete domain model:

┌─────────────────────────────────────────────────────────────┐
│                        COMMANDS                              │
├─────────────────────────────────────────────────────────────┤
│ CreateTask │ AssignTask │ CompleteTask │ AddComment │ ...   │
└─────────────┬───────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────┐
│                         EVENTS                               │
├─────────────────────────────────────────────────────────────┤
│ TaskCreated │ TaskAssigned │ TaskCompleted │ CommentAdded   │
└─────────────┬───────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────┐
│                     READ MODELS                              │
├─────────────────────────────────────────────────────────────┤
│ UserTaskList │ TaskDetails │ TeamDashboard │ ActivityFeed   │
└─────────────────────────────────────────────────────────────┘

Key Insights from Modeling

  1. Multi-Stream Operations are Common

    • Task assignment affects task AND user streams
    • Completion updates task AND user statistics
    • EventCore handles this naturally
  2. Events are Business Facts

    • “TaskAssigned” not “UpdateTask”
    • Events capture intent and context
    • Rich events enable better projections
  3. Commands Match User Intent

    • “AssignTask” not “UpdateTaskAssignee”
    • Commands are what users want to do
    • Natural API emerges from modeling
  4. Read Models Serve Specific Needs

    • UserTaskList for “my tasks” view
    • TeamDashboard for manager overview
    • Different projections from same events

Refining Our Event Model

Based on our modeling, let’s update src/domain/events.rs:

#![allow(unused)]
fn main() {
use super::types::*;
use serde::{Deserialize, Serialize};
use chrono::{DateTime, Utc};
use eventcore::StreamId;
use uuid::Uuid;

/// Events that can occur in our task management system
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum TaskEvent {
    // Task lifecycle events
    Created {
        task_id: TaskId,
        title: TaskTitle,
        description: TaskDescription,
        creator: UserName,
        created_at: DateTime<Utc>,
    },
    
    // Assignment events - note these affect multiple streams
    Assigned {
        task_id: TaskId,
        assignee: UserName,
        assigned_by: UserName,
        assigned_at: DateTime<Utc>,
    },
    
    Unassigned {
        task_id: TaskId,
        previous_assignee: UserName,
        unassigned_by: UserName,
        unassigned_at: DateTime<Utc>,
    },
    
    // Work events
    Started {
        task_id: TaskId,
        started_by: UserName,
        started_at: DateTime<Utc>,
    },
    
    Completed {
        task_id: TaskId,
        completed_by: UserName,
        completed_at: DateTime<Utc>,
    },
    
    // Collaboration events
    CommentAdded {
        task_id: TaskId,
        comment_id: Uuid,
        comment: CommentText,
        author: UserName,
        commented_at: DateTime<Utc>,
    },
    
    // Management events
    PriorityChanged {
        task_id: TaskId,
        old_priority: Priority,
        new_priority: Priority,
        changed_by: UserName,
        changed_at: DateTime<Utc>,
    },
    
    DueDateSet {
        task_id: TaskId,
        due_date: DateTime<Utc>,
        set_by: UserName,
        set_at: DateTime<Utc>,
    },
}

/// Events specific to user streams
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum UserEvent {
    /// Track when user is assigned a task
    TaskAssigned {
        user_name: UserName,
        task_id: TaskId,
        assigned_at: DateTime<Utc>,
    },
    
    /// Track when user completes a task
    TaskCompleted {
        user_name: UserName,
        task_id: TaskId,
        completed_at: DateTime<Utc>,
    },
    
    /// Track workload changes
    WorkloadUpdated {
        user_name: UserName,
        active_tasks: u32,
        completed_today: u32,
    },
}

/// Combined event type for our system
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "event_type", rename_all = "snake_case")]
pub enum SystemEvent {
    Task(TaskEvent),
    User(UserEvent),
}

// Required conversions for EventCore
impl TryFrom<&SystemEvent> for SystemEvent {
    type Error = std::convert::Infallible;
    
    fn try_from(value: &SystemEvent) -> Result<Self, Self::Error> {
        Ok(value.clone())
    }
}
}

Summary

Through event modeling, we’ve discovered:

  1. Our Events: Business facts that capture what happened
  2. Our Commands: User intentions that trigger events
  3. Our Read Models: Views that answer user questions
  4. Our Streams: How data is organized (tasks, users)

The key insight: by modeling events first, the rest of the system design follows naturally. EventCore’s multi-stream capabilities mean we can implement our model exactly as designed, without compromise.

Next, let’s implement our commands using EventCore’s macro system →

Chapter 2.3: Implementing Commands

Now we’ll implement the commands we discovered during domain modeling. EventCore’s macro system makes this straightforward while maintaining type safety.

Command Structure

Every EventCore command follows this pattern:

  1. Derive the Command macro - Generates boilerplate
  2. Declare streams with #[stream] - Define what streams you need
  3. Implement CommandLogic - Your business logic
  4. Generate events - What happened as a result

Our First Command: Create Task

Let’s implement task creation:

src/domain/commands/create_task.rs

#![allow(unused)]
fn main() {
use crate::domain::{events::*, types::*};
use async_trait::async_trait;
use chrono::Utc;
use eventcore::{prelude::*, CommandLogic, ReadStreams, StreamResolver, StreamWrite};
use eventcore_macros::Command;

/// Command to create a new task
#[derive(Command, Clone)]
pub struct CreateTask {
    /// The task stream - will contain all task events
    #[stream]
    pub task_id: StreamId,
    
    /// Task details
    pub title: TaskTitle,
    pub description: TaskDescription,
    pub creator: UserName,
    pub priority: Priority,
}

impl CreateTask {
    /// Smart constructor ensures valid StreamId
    pub fn new(
        task_id: TaskId,
        title: TaskTitle,
        description: TaskDescription,
        creator: UserName,
    ) -> Result<Self, CommandError> {
        Ok(Self {
            task_id: StreamId::from_static(&format!("task-{}", task_id)),
            title,
            description,
            creator,
            priority: Priority::default(),
        })
    }
}

/// State for create task command - tracks if task exists
#[derive(Default)]
pub struct CreateTaskState {
    exists: bool,
}

#[async_trait]
impl CommandLogic for CreateTask {
    type State = CreateTaskState;
    type Event = TaskEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        match &event.payload {
            TaskEvent::Created { .. } => {
                state.exists = true;
            }
            _ => {} // Other events don't affect creation
        }
    }

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Business rule: Can't create a task that already exists
        require!(
            !state.exists,
            "Task {} already exists",
            self.task_id
        );

        // Generate the TaskCreated event
        let event = TaskEvent::Created {
            task_id: TaskId::from(&self.task_id),
            title: self.title.clone(),
            description: self.description.clone(),
            creator: self.creator.clone(),
            created_at: Utc::now(),
        };

        // Write to the task stream
        Ok(vec![
            StreamWrite::new(&read_streams, self.task_id.clone(), event)?
        ])
    }
}
}

Key Points

  1. #[derive(Command)] generates:

    • The StreamSet phantom type
    • Implementation of CommandStreams trait
    • The read_streams() method
  2. #[stream] attribute declares which streams this command needs

  3. apply() method reconstructs state from events

  4. handle() method contains your business logic

  5. require! macro provides clean validation with good error messages

  6. StreamWrite::new() ensures type-safe writes to declared streams

Multi-Stream Command: Assign Task

Task assignment affects both the task and the user:

src/domain/commands/assign_task.rs

#![allow(unused)]
fn main() {
use crate::domain::{events::*, types::*};
use async_trait::async_trait;
use chrono::Utc;
use eventcore::{prelude::*, CommandLogic, ReadStreams, StreamResolver, StreamWrite};
use eventcore_macros::Command;

/// Command to assign a task to a user
/// This is a multi-stream command affecting both task and user streams
#[derive(Command, Clone)]
pub struct AssignTask {
    #[stream]
    pub task_id: StreamId,
    
    #[stream]
    pub assignee_id: StreamId,
    
    pub assigned_by: UserName,
}

impl AssignTask {
    pub fn new(
        task_id: TaskId,
        assignee: UserName,
        assigned_by: UserName,
    ) -> Result<Self, CommandError> {
        Ok(Self {
            task_id: StreamId::from_static(&format!("task-{}", task_id)),
            assignee_id: StreamId::from_static(&format!("user-{}", assignee)),
            assigned_by,
        })
    }
}

/// State that combines task and user information
#[derive(Default)]
pub struct AssignTaskState {
    // Task state
    task_exists: bool,
    task_title: String,
    current_assignee: Option<UserName>,
    task_status: TaskStatus,
    
    // User state
    user_exists: bool,
    user_name: Option<UserName>,
    active_task_count: u32,
}

#[async_trait]
impl CommandLogic for AssignTask {
    type State = AssignTaskState;
    type Event = SystemEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        // Apply events from different streams
        match &event.payload {
            SystemEvent::Task(task_event) => {
                match task_event {
                    TaskEvent::Created { title, .. } => {
                        state.task_exists = true;
                        state.task_title = title.to_string();
                    }
                    TaskEvent::Assigned { assignee, .. } => {
                        state.current_assignee = Some(assignee.clone());
                    }
                    TaskEvent::Unassigned { .. } => {
                        state.current_assignee = None;
                    }
                    TaskEvent::Completed { .. } => {
                        state.task_status = TaskStatus::Completed;
                    }
                    _ => {}
                }
            }
            SystemEvent::User(user_event) => {
                match user_event {
                    UserEvent::TaskAssigned { user_name, .. } => {
                        state.user_exists = true;
                        state.user_name = Some(user_name.clone());
                        state.active_task_count += 1;
                    }
                    UserEvent::TaskCompleted { .. } => {
                        state.active_task_count = state.active_task_count.saturating_sub(1);
                    }
                    _ => {}
                }
            }
        }
    }

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Validate business rules
        require!(
            state.task_exists,
            "Cannot assign non-existent task"
        );
        
        require!(
            state.task_status != TaskStatus::Completed,
            "Cannot assign completed task"
        );
        
        require!(
            state.task_status != TaskStatus::Cancelled,
            "Cannot assign cancelled task"
        );

        let now = Utc::now();
        let task_id = TaskId::from(&self.task_id);
        let assignee = UserName::from(&self.assignee_id);

        // The task must not already be assigned to the target user
        if let Some(current) = &state.current_assignee {
            require!(
                current != &assignee,
                "Task is already assigned to this user"
            );
        }

        let mut events = Vec::new();

        // If task is currently assigned, unassign first
        if let Some(previous_assignee) = state.current_assignee {
            events.push(StreamWrite::new(
                &read_streams,
                self.task_id.clone(),
                SystemEvent::Task(TaskEvent::Unassigned {
                    task_id,
                    previous_assignee,
                    unassigned_by: self.assigned_by.clone(),
                    unassigned_at: now,
                })
            )?);
        }

        // Write assignment event to task stream
        events.push(StreamWrite::new(
            &read_streams,
            self.task_id.clone(),
            SystemEvent::Task(TaskEvent::Assigned {
                task_id,
                assignee: assignee.clone(),
                assigned_by: self.assigned_by.clone(),
                assigned_at: now,
            })
        )?);

        // Write assignment event to user stream
        events.push(StreamWrite::new(
            &read_streams,
            self.assignee_id.clone(),
            SystemEvent::User(UserEvent::TaskAssigned {
                user_name: assignee.clone(),
                task_id,
                assigned_at: now,
            })
        )?);

        // Update user workload
        events.push(StreamWrite::new(
            &read_streams,
            self.assignee_id.clone(),
            SystemEvent::User(UserEvent::WorkloadUpdated {
                user_name: assignee,
                active_tasks: state.active_task_count + 1,
                completed_today: 0, // Would calculate from state
            })
        )?);

        Ok(events)
    }
}
}

Multi-Stream Benefits

  1. Atomic Updates: Both task and user streams update together
  2. Consistent State: No partial updates possible
  3. Rich Events: Each stream gets relevant events
  4. Type Safety: Can only write to declared streams

Command with Business Logic: Complete Task

src/domain/commands/complete_task.rs

#![allow(unused)]
fn main() {
use crate::domain::{events::*, types::*};
use async_trait::async_trait;
use chrono::Utc;
use eventcore::{prelude::*, CommandLogic, ReadStreams, StreamResolver, StreamWrite};
use eventcore_macros::Command;

/// Command to complete a task
#[derive(Command, Clone)]
pub struct CompleteTask {
    #[stream]
    pub task_id: StreamId,
    
    #[stream]
    pub user_id: StreamId,
    
    pub completed_by: UserName,
}

#[derive(Default)]
pub struct CompleteTaskState {
    task_exists: bool,
    task_status: TaskStatus,
    assignee: Option<UserName>,
    
    user_name: Option<UserName>,
    completed_count: u32,
}

#[async_trait]
impl CommandLogic for CompleteTask {
    type State = CompleteTaskState;
    type Event = SystemEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        match &event.payload {
            SystemEvent::Task(task_event) => {
                match task_event {
                    TaskEvent::Created { .. } => {
                        state.task_exists = true;
                        state.task_status = TaskStatus::Open;
                    }
                    TaskEvent::Assigned { assignee, .. } => {
                        state.assignee = Some(assignee.clone());
                    }
                    TaskEvent::Started { .. } => {
                        state.task_status = TaskStatus::InProgress;
                    }
                    TaskEvent::Completed { .. } => {
                        state.task_status = TaskStatus::Completed;
                    }
                    _ => {}
                }
            }
            SystemEvent::User(user_event) => {
                match user_event {
                    UserEvent::TaskCompleted { user_name, .. } => {
                        state.user_name = Some(user_name.clone());
                        state.completed_count += 1;
                    }
                    _ => {}
                }
            }
        }
    }

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Business rules
        require!(
            state.task_exists,
            "Cannot complete non-existent task"
        );
        
        require!(
            state.task_status != TaskStatus::Completed,
            "Task is already completed"
        );
        
        require!(
            state.task_status != TaskStatus::Cancelled,
            "Cannot complete cancelled task"
        );

        // Only assigned user can complete (or admin)
        if let Some(assignee) = &state.assignee {
            require!(
                assignee == &self.completed_by || self.completed_by.as_ref() == "admin",
                "Only assigned user or admin can complete task"
            );
        }

        let now = Utc::now();
        let task_id = TaskId::from(&self.task_id);

        Ok(vec![
            // Mark task as completed
            StreamWrite::new(
                &read_streams,
                self.task_id.clone(),
                SystemEvent::Task(TaskEvent::Completed {
                    task_id,
                    completed_by: self.completed_by.clone(),
                    completed_at: now,
                })
            )?,
            
            // Update user's completion stats
            StreamWrite::new(
                &read_streams,
                self.user_id.clone(),
                SystemEvent::User(UserEvent::TaskCompleted {
                    user_name: self.completed_by.clone(),
                    task_id,
                    completed_at: now,
                })
            )?,
        ])
    }
}
}

Helper Functions

Add these to src/domain/types.rs:

#![allow(unused)]
fn main() {
use eventcore::StreamId;

impl From<&StreamId> for TaskId {
    fn from(stream_id: &StreamId) -> Self {
        // Extract TaskId from stream ID like "task-{uuid}"
        let id_str = stream_id.as_ref()
            .strip_prefix("task-")
            .unwrap_or(stream_id.as_ref());
        
        TaskId(Uuid::parse_str(id_str).unwrap_or_else(|_| Uuid::nil()))
    }
}

impl From<&StreamId> for UserName {
    fn from(stream_id: &StreamId) -> Self {
        // Extract UserName from stream ID like "user-{name}"
        let name = stream_id.as_ref()
            .strip_prefix("user-")
            .unwrap_or(stream_id.as_ref());
        
        UserName::try_new(name).unwrap_or_else(|_| 
            UserName::try_new("unknown").unwrap()
        )
    }
}
}
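
These conversions fall back to a nil UUID or "unknown" on malformed input, so a quick round-trip test is worthwhile. A hedged sketch for src/domain/types.rs, using the same StreamId::from_static construction the commands use:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod stream_id_conversion_tests {
    use super::*;
    use eventcore::StreamId;

    #[test]
    fn task_id_round_trips_through_stream_id() {
        let task_id = TaskId::new();
        let stream_id = StreamId::from_static(&format!("task-{}", task_id));
        assert_eq!(TaskId::from(&stream_id), task_id);
    }

    #[test]
    fn user_name_round_trips_through_stream_id() {
        let user = UserName::try_new("bob").unwrap();
        let stream_id = StreamId::from_static(&format!("user-{}", user));
        assert_eq!(UserName::from(&stream_id), user);
    }
}
}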

Testing Our Commands

Add to src/main.rs:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod command_tests {
    use super::*;
    use crate::domain::commands::*;
    use crate::domain::types::*;
    use eventcore_memory::InMemoryEventStore;

    #[tokio::test]
    async fn test_create_task() {
        // Setup
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store);
        
        // Create command
        let task_id = TaskId::new();
        let command = CreateTask::new(
            task_id,
            TaskTitle::try_new("Write tests").unwrap(),
            TaskDescription::try_new("Add unit tests").unwrap(),
            UserName::try_new("alice").unwrap(),
        ).unwrap();
        
        // Execute
        let result = executor.execute(&command).await.unwrap();
        
        // Verify
        assert_eq!(result.events_written.len(), 1);
        assert_eq!(result.streams_affected.len(), 1);
        
        // Try to create again - should fail
        let result = executor.execute(&command).await;
        assert!(result.is_err());
    }

    #[tokio::test]
    async fn test_assign_task() {
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store);
        
        // First create a task
        let task_id = TaskId::new();
        let create = CreateTask::new(
            task_id,
            TaskTitle::try_new("Test task").unwrap(),
            TaskDescription::try_new("Description").unwrap(),
            UserName::try_new("alice").unwrap(),
        ).unwrap();
        
        executor.execute(&create).await.unwrap();
        
        // Now assign it
        let assign = AssignTask::new(
            task_id,
            UserName::try_new("bob").unwrap(),
            UserName::try_new("alice").unwrap(),
        ).unwrap();
        
        let result = executor.execute(&assign).await.unwrap();
        
        // Should write to both task and user streams
        assert_eq!(result.events_written.len(), 3); // Assigned + UserAssigned + Workload
        assert_eq!(result.streams_affected.len(), 2); // task and user streams
    }
}
}
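
The authorization rule in CompleteTask deserves its own test. A hedged sketch in the same style as the tests above (add it inside the command_tests module):

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_only_assignee_or_admin_can_complete() {
    let store = InMemoryEventStore::new();
    let executor = CommandExecutor::new(store);

    // Create a task and assign it to Bob
    let task_id = TaskId::new();
    let create = CreateTask::new(
        task_id,
        TaskTitle::try_new("Guarded task").unwrap(),
        TaskDescription::try_new("Only Bob may complete this").unwrap(),
        UserName::try_new("alice").unwrap(),
    ).unwrap();
    executor.execute(&create).await.unwrap();

    let assign = AssignTask::new(
        task_id,
        UserName::try_new("bob").unwrap(),
        UserName::try_new("alice").unwrap(),
    ).unwrap();
    executor.execute(&assign).await.unwrap();

    // Carol is neither the assignee nor admin, so completion is rejected
    let complete_as_carol = CompleteTask {
        task_id: StreamId::from_static(&format!("task-{}", task_id)),
        user_id: StreamId::from_static("user-carol"),
        completed_by: UserName::try_new("carol").unwrap(),
    };
    assert!(executor.execute(&complete_as_carol).await.is_err());

    // Bob is the assignee, so completion succeeds
    let complete_as_bob = CompleteTask {
        task_id: StreamId::from_static(&format!("task-{}", task_id)),
        user_id: StreamId::from_static("user-bob"),
        completed_by: UserName::try_new("bob").unwrap(),
    };
    assert!(executor.execute(&complete_as_bob).await.is_ok());
}
}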

Running the Demo

Update the demo in src/main.rs:

#![allow(unused)]
fn main() {
use crate::domain::{commands::*, events::SystemEvent, types::*};

async fn run_demo<ES: EventStore>(executor: CommandExecutor<ES>)
-> Result<(), Box<dyn std::error::Error>> 
where
    ES::Event: From<SystemEvent> + TryInto<SystemEvent>,
{
    println!("🚀 EventCore Task Management Demo");
    println!("================================\n");
    
    // Create a task
    let task_id = TaskId::new();
    println!("1. Creating task {}...", task_id);
    
    let create = CreateTask::new(
        task_id,
        TaskTitle::try_new("Build awesome features").unwrap(),
        TaskDescription::try_new("Use EventCore to build great things").unwrap(),
        UserName::try_new("alice").unwrap(),
    )?;
    
    let result = executor.execute(&create).await?;
    println!("   ✅ Task created with {} event(s)\n", result.events_written.len());
    
    // Assign the task
    println!("2. Assigning task to Bob...");
    
    let assign = AssignTask::new(
        task_id,
        UserName::try_new("bob").unwrap(),
        UserName::try_new("alice").unwrap(),
    )?;
    
    let result = executor.execute(&assign).await?;
    println!("   ✅ Task assigned, {} stream(s) updated\n", result.streams_affected.len());
    
    // Complete the task
    println!("3. Bob completes the task...");
    
    let complete = CompleteTask {
        task_id: StreamId::from_static(&format!("task-{}", task_id)),
        user_id: StreamId::from_static("user-bob"),
        completed_by: UserName::try_new("bob").unwrap(),
    };
    
    executor.execute(&complete).await?;
    println!("   ✅ Task completed!\n");
    
    println!("Demo complete! 🎉");
    Ok(())
}
}
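
Running cargo run -- demo should print output along these lines (with your own task UUID in place of <task-uuid>):

🚀 EventCore Task Management Demo
================================

1. Creating task <task-uuid>...
   ✅ Task created with 1 event(s)

2. Assigning task to Bob...
   ✅ Task assigned, 2 stream(s) updated

3. Bob completes the task...
   ✅ Task completed!

Demo complete! 🎉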

Key Takeaways

  1. Macro Magic: #[derive(Command)] eliminates boilerplate
  2. Stream Declaration: #[stream] attributes declare what you need
  3. Type Safety: Can only write to declared streams
  4. Multi-Stream: Natural support for operations across entities
  5. Business Logic: Clear separation in handle() method
  6. State Building: apply() reconstructs state from events

Common Patterns

Conditional Stream Access

Sometimes you need streams based on runtime data:

#![allow(unused)]
fn main() {
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver,  // Note: not unused
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Discover we need another stream
    if state.requires_manager_approval {
        let manager_stream = StreamId::from_static("user-manager");
        stream_resolver.add_streams(vec![manager_stream]);
        // EventCore will re-execute with the additional stream
    }
    
    // Continue with logic...
}
}

Batch Operations

For operations on multiple items:

#![allow(unused)]
fn main() {
let mut events = Vec::new();

for task_id in &self.task_ids {
    events.push(StreamWrite::new(
        &read_streams,
        task_id.clone(),
        TaskEvent::BatchUpdated { /* ... */ }
    )?);
}

Ok(events)
}

Summary

We’ve implemented our core commands using EventCore’s macro system:

  • ✅ Single-stream commands (CreateTask)
  • ✅ Multi-stream commands (AssignTask)
  • ✅ Complex business logic (CompleteTask)
  • ✅ Type-safe stream access
  • ✅ Comprehensive testing

Next, let’s build projections to query our data →

Chapter 2.4: Working with Projections

Projections transform your event streams into read models optimized for queries. This chapter shows how to build projections that answer specific questions about your data.

What Are Projections?

Projections are read-side views built from events. They:

  • Listen to event streams
  • Apply events to build state
  • Optimize for specific queries
  • Can be rebuilt from scratch

Think of projections as materialized views that are kept up-to-date by processing events.
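
In code terms, a projection is just a left fold over the event log; a minimal, EventCore-agnostic sketch:

#![allow(unused)]
fn main() {
/// Rebuildable by construction: start from the default read model
/// and apply every event in order.
fn replay<P: Default, E>(events: &[E], apply: impl Fn(&mut P, &E)) -> P {
    let mut projection = P::default();
    for event in events {
        apply(&mut projection, event);
    }
    projection
}
}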

Our First Projection: User Task List

Let’s build a projection that answers: “What tasks does each user have?”

src/projections/task_list.rs

#![allow(unused)]
fn main() {
use crate::domain::{events::*, types::*};
use eventcore::prelude::*;
use eventcore::cqrs::{CqrsProjection, ProjectionError};
use async_trait::async_trait;
use std::collections::HashMap;
use serde::{Serialize, Deserialize};
use chrono::{DateTime, Utc};

/// A summary of a task for display
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TaskSummary {
    pub id: TaskId,
    pub title: String,
    pub status: TaskStatus,
    pub priority: Priority,
    pub assigned_at: DateTime<Utc>,
    pub completed_at: Option<DateTime<Utc>>,
}

/// Projection that maintains task lists for each user
#[derive(Default, Clone, Serialize, Deserialize)]
pub struct UserTaskListProjection {
    /// Tasks indexed by user
    tasks_by_user: HashMap<UserName, HashMap<TaskId, TaskSummary>>,
    
    /// Reverse index: task to user
    task_assignments: HashMap<TaskId, UserName>,
    
    /// Task details cache
    task_details: HashMap<TaskId, TaskDetails>,
}

#[derive(Clone, Serialize, Deserialize)]
struct TaskDetails {
    title: String,
    created_at: DateTime<Utc>,
    priority: Priority,
}

impl UserTaskListProjection {
    /// Get all tasks for a user
    pub fn get_user_tasks(&self, user: &UserName) -> Vec<TaskSummary> {
        self.tasks_by_user
            .get(user)
            .map(|tasks| {
                let mut list: Vec<_> = tasks.values().cloned().collect();
                // Sort by priority (high to low) then by assigned date
                list.sort_by(|a, b| {
                    b.priority.cmp(&a.priority)
                        .then_with(|| a.assigned_at.cmp(&b.assigned_at))
                });
                list
            })
            .unwrap_or_default()
    }
    
    /// Get active task count for a user
    pub fn get_active_task_count(&self, user: &UserName) -> usize {
        self.tasks_by_user
            .get(user)
            .map(|tasks| {
                tasks.values()
                    .filter(|t| matches!(t.status, TaskStatus::Open | TaskStatus::InProgress))
                    .count()
            })
            .unwrap_or(0)
    }
    
    /// Get task by ID
    pub fn get_task(&self, task_id: &TaskId) -> Option<&TaskSummary> {
        self.task_assignments
            .get(task_id)
            .and_then(|user| {
                self.tasks_by_user
                    .get(user)?
                    .get(task_id)
            })
    }
}

#[async_trait]
impl CqrsProjection for UserTaskListProjection {
    type Event = SystemEvent;
    type Error = ProjectionError;

    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match &event.payload {
            SystemEvent::Task(task_event) => {
                self.apply_task_event(task_event, &event.occurred_at)?;
            }
            SystemEvent::User(_) => {
                // User events handled separately if needed
            }
        }
        Ok(())
    }
    
    fn name(&self) -> &str {
        "user_task_list"
    }
}

impl UserTaskListProjection {
    fn apply_task_event(
        &mut self, 
        event: &TaskEvent, 
        occurred_at: &DateTime<Utc>
    ) -> Result<(), ProjectionError> {
        match event {
            TaskEvent::Created { task_id, title, creator, .. } => {
                // Cache task details for later use
                self.task_details.insert(
                    *task_id,
                    TaskDetails {
                        title: title.to_string(),
                        created_at: *occurred_at,
                        priority: Priority::default(),
                    }
                );
            }
            
            TaskEvent::Assigned { task_id, assignee, assigned_at, .. } => {
                // Remove from previous assignee if any
                if let Some(previous_user) = self.task_assignments.get(task_id) {
                    if let Some(user_tasks) = self.tasks_by_user.get_mut(previous_user) {
                        user_tasks.remove(task_id);
                    }
                }
                
                // Add to new assignee
                let task_details = self.task_details.get(task_id)
                    .ok_or_else(|| ProjectionError::InvalidState(
                        format!("Task {} not found in cache", task_id)
                    ))?;
                
                let summary = TaskSummary {
                    id: *task_id,
                    title: task_details.title.clone(),
                    status: TaskStatus::Open,
                    priority: task_details.priority,
                    assigned_at: *assigned_at,
                    completed_at: None,
                };
                
                self.tasks_by_user
                    .entry(assignee.clone())
                    .or_default()
                    .insert(*task_id, summary);
                
                self.task_assignments.insert(*task_id, assignee.clone());
            }
            
            TaskEvent::Unassigned { task_id, previous_assignee, .. } => {
                // Remove from assignee
                if let Some(user_tasks) = self.tasks_by_user.get_mut(previous_assignee) {
                    user_tasks.remove(task_id);
                }
                self.task_assignments.remove(task_id);
            }
            
            TaskEvent::Started { task_id, .. } => {
                // Update status
                if let Some(user) = self.task_assignments.get(task_id) {
                    if let Some(task) = self.tasks_by_user
                        .get_mut(user)
                        .and_then(|tasks| tasks.get_mut(task_id)) 
                    {
                        task.status = TaskStatus::InProgress;
                    }
                }
            }
            
            TaskEvent::Completed { task_id, completed_at, .. } => {
                // Update status and completion time
                if let Some(user) = self.task_assignments.get(task_id) {
                    if let Some(task) = self.tasks_by_user
                        .get_mut(user)
                        .and_then(|tasks| tasks.get_mut(task_id)) 
                    {
                        task.status = TaskStatus::Completed;
                        task.completed_at = Some(*completed_at);
                    }
                }
            }
            
            TaskEvent::PriorityChanged { task_id, new_priority, .. } => {
                // Update priority in cache and summary
                if let Some(details) = self.task_details.get_mut(task_id) {
                    details.priority = *new_priority;
                }
                
                if let Some(user) = self.task_assignments.get(task_id) {
                    if let Some(task) = self.tasks_by_user
                        .get_mut(user)
                        .and_then(|tasks| tasks.get_mut(task_id)) 
                    {
                        task.priority = *new_priority;
                    }
                }
            }
            
            _ => {} // Handle other events as needed
        }
        
        Ok(())
    }
}
}

Statistics Projection

Let’s build another projection for team statistics:

src/projections/statistics.rs

#![allow(unused)]
fn main() {
use crate::domain::{events::*, types::*};
use eventcore::prelude::*;
use eventcore::cqrs::{CqrsProjection, ProjectionError};
use async_trait::async_trait;
use std::collections::HashMap;
use serde::{Serialize, Deserialize};
use chrono::{DateTime, Utc};

/// Team statistics projection
#[derive(Default, Clone, Serialize, Deserialize)]
pub struct TeamStatisticsProjection {
    /// Total tasks created
    pub total_tasks_created: u64,
    
    /// Tasks by status
    pub tasks_by_status: HashMap<TaskStatus, u64>,
    
    /// Tasks by priority
    pub tasks_by_priority: HashMap<Priority, u64>,
    
    /// User statistics
    pub user_stats: HashMap<UserName, UserStatistics>,
    
    /// Daily completion rates
    pub daily_completions: HashMap<String, u64>, // Date string -> count
    
    /// Average completion time in hours
    pub avg_completion_hours: f64,
    
    /// Completion times for average calculation
    completion_times: Vec<f64>,
}

#[derive(Default, Clone, Serialize, Deserialize)]
pub struct UserStatistics {
    pub tasks_assigned: u64,
    pub tasks_completed: u64,
    pub tasks_in_progress: u64,
    pub total_comments: u64,
    pub avg_completion_hours: f64,
    completion_times: Vec<f64>,
}

impl TeamStatisticsProjection {
    /// Get completion rate percentage
    pub fn completion_rate(&self) -> f64 {
        if self.total_tasks_created == 0 {
            return 0.0;
        }
        
        let completed = self.tasks_by_status
            .get(&TaskStatus::Completed)
            .copied()
            .unwrap_or(0);
            
        (completed as f64 / self.total_tasks_created as f64) * 100.0
    }
    
    /// Get most productive user
    pub fn most_productive_user(&self) -> Option<(&UserName, u64)> {
        self.user_stats
            .iter()
            .max_by_key(|(_, stats)| stats.tasks_completed)
            .map(|(user, stats)| (user, stats.tasks_completed))
    }
    
    /// Get workload distribution
    pub fn workload_distribution(&self) -> Vec<(UserName, f64)> {
        let total_active: u64 = self.user_stats
            .values()
            .map(|s| s.tasks_in_progress)
            .sum();
            
        if total_active == 0 {
            return vec![];
        }
        
        self.user_stats
            .iter()
            .filter(|(_, stats)| stats.tasks_in_progress > 0)
            .map(|(user, stats)| {
                let percentage = (stats.tasks_in_progress as f64 / total_active as f64) * 100.0;
                (user.clone(), percentage)
            })
            .collect()
    }
}

#[async_trait]
impl CqrsProjection for TeamStatisticsProjection {
    type Event = SystemEvent;
    type Error = ProjectionError;

    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match &event.payload {
            SystemEvent::Task(task_event) => {
                self.apply_task_event(task_event, &event.occurred_at)?;
            }
            SystemEvent::User(user_event) => {
                self.apply_user_event(user_event)?;
            }
        }
        Ok(())
    }
    
    fn name(&self) -> &str {
        "team_statistics"
    }
}

impl TeamStatisticsProjection {
    fn apply_task_event(
        &mut self, 
        event: &TaskEvent, 
        occurred_at: &DateTime<Utc>
    ) -> Result<(), ProjectionError> {
        match event {
            TaskEvent::Created { .. } => {
                self.total_tasks_created += 1;
                *self.tasks_by_status.entry(TaskStatus::Open).or_insert(0) += 1;
                *self.tasks_by_priority.entry(Priority::default()).or_insert(0) += 1;
            }
            
            TaskEvent::Assigned { assignee, .. } => {
                let stats = self.user_stats.entry(assignee.clone()).or_default();
                stats.tasks_assigned += 1;
                stats.tasks_in_progress += 1;
            }
            
            TaskEvent::Completed { task_id, completed_by, completed_at, .. } => {
                // Update status counts
                if let Some(open) = self.tasks_by_status.get_mut(&TaskStatus::Open) {
                    *open = open.saturating_sub(1);
                }
                *self.tasks_by_status.entry(TaskStatus::Completed).or_insert(0) += 1;
                
                // Update user stats
                let stats = self.user_stats.entry(completed_by.clone()).or_default();
                stats.tasks_completed += 1;
                stats.tasks_in_progress = stats.tasks_in_progress.saturating_sub(1);
                
                // Track daily completions
                let date_key = completed_at.format("%Y-%m-%d").to_string();
                *self.daily_completions.entry(date_key).or_insert(0) += 1;
                
                // Calculate completion time (would need task creation time)
                // For demo, using a placeholder
                let completion_hours = 24.0; // In real app, calculate from creation
                self.completion_times.push(completion_hours);
                stats.completion_times.push(completion_hours);
                
                // Update averages
                self.avg_completion_hours = self.completion_times.iter().sum::<f64>() 
                    / self.completion_times.len() as f64;
                stats.avg_completion_hours = stats.completion_times.iter().sum::<f64>() 
                    / stats.completion_times.len() as f64;
            }
            
            TaskEvent::CommentAdded { author, .. } => {
                let stats = self.user_stats.entry(author.clone()).or_default();
                stats.total_comments += 1;
            }
            
            TaskEvent::PriorityChanged { old_priority, new_priority, .. } => {
                if let Some(count) = self.tasks_by_priority.get_mut(old_priority) {
                    *count = count.saturating_sub(1);
                }
                *self.tasks_by_priority.entry(*new_priority).or_insert(0) += 1;
            }
            
            _ => {}
        }
        
        Ok(())
    }
    
    fn apply_user_event(&mut self, event: &UserEvent) -> Result<(), ProjectionError> {
        // Handle user-specific events if needed
        Ok(())
    }
}
}
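
As a quick usage sketch, the helper methods defined above can drive a simple console dashboard (assuming a projection instance that has already processed some events):

#![allow(unused)]
fn main() {
fn print_dashboard(stats: &TeamStatisticsProjection) {
    println!("Completion rate: {:.1}%", stats.completion_rate());

    if let Some((user, completed)) = stats.most_productive_user() {
        println!("Most productive: {} ({} tasks completed)", user, completed);
    }

    println!("Current workload:");
    for (user, percentage) in stats.workload_distribution() {
        println!("  {:<20} {:>5.1}%", user.to_string(), percentage);
    }
}
}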

Running Projections

EventCore provides infrastructure for running projections:

Setting Up Projection Runner

#![allow(unused)]
fn main() {
use eventcore::prelude::*;
use eventcore::cqrs::{
    CqrsProjectionRunner, 
    InMemoryCheckpointStore, 
    InMemoryReadModelStore,
    ProjectionRunnerConfig,
};
use eventcore_memory::InMemoryEventStore;

async fn setup_projections() -> Result<(), Box<dyn std::error::Error>> {
    // Event store
    let event_store = InMemoryEventStore::<SystemEvent>::new();
    
    // Projection infrastructure
    let checkpoint_store = InMemoryCheckpointStore::new();
    let read_model_store = InMemoryReadModelStore::new();
    
    // Create projection
    let mut task_list_projection = UserTaskListProjection::default();
    
    // Configure runner
    let config = ProjectionRunnerConfig::default()
        .with_batch_size(100)
        .with_checkpoint_frequency(50);
    
    // Create and start runner
    let runner = CqrsProjectionRunner::new(
        event_store.clone(),
        checkpoint_store,
        read_model_store.clone(),
        config,
    );
    
    // Run projection
    runner.run_projection(&mut task_list_projection).await?;
    
    // Query the projection
    let alice_tasks = task_list_projection.get_user_tasks(
        &UserName::try_new("alice").unwrap()
    );
    
    println!("Alice has {} tasks", alice_tasks.len());
    
    Ok(())
}
}

Querying Projections

EventCore's cqrs module also ships a query builder for complex read-model queries; for many cases, though, you can query a projection's data directly with plain Rust:

#![allow(unused)]
fn main() {

async fn query_tasks(
    projection: &UserTaskListProjection,
) -> Result<(), Box<dyn std::error::Error>> {
    let alice = UserName::try_new("alice").unwrap();
    
    // Get all tasks for Alice
    let all_tasks = projection.get_user_tasks(&alice);
    
    // Filter high priority tasks
    let high_priority: Vec<_> = all_tasks
        .iter()
        .filter(|t| t.priority == Priority::High)
        .collect();
    
    // Get active tasks only
    let active_tasks: Vec<_> = all_tasks
        .iter()
        .filter(|t| matches!(t.status, TaskStatus::Open | TaskStatus::InProgress))
        .collect();
    
    println!("Alice's tasks:");
    println!("- Total: {}", all_tasks.len());
    println!("- High priority: {}", high_priority.len());
    println!("- Active: {}", active_tasks.len());
    
    Ok(())
}
}

Real-time Updates

Projections can be updated in real-time as events are written:

#![allow(unused)]
fn main() {
use tokio::sync::RwLock;
use std::sync::Arc;

struct ProjectionService {
    projection: Arc<RwLock<UserTaskListProjection>>,
    event_store: Arc<dyn EventStore>,
}

impl ProjectionService {
    async fn start_real_time_updates(self) {
        let mut last_position = EventId::default();
        
        loop {
            // Poll for new events
            let events = self.event_store
                .read_all_events(ReadOptions::default().after(last_position))
                .await
                .unwrap_or_default();
            
            if !events.is_empty() {
                let mut projection = self.projection.write().await;
                
                for event in &events {
                    if let Err(e) = projection.apply(event).await {
                        eprintln!("Projection error: {}", e);
                    }
                    last_position = event.id.clone();
                }
            }
            
            // Sleep before next poll
            tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
        }
    }
}
}
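
To keep the poller from blocking command handling, run it as a background task. A small usage sketch (it assumes a Tokio runtime and the ProjectionService defined above):

#![allow(unused)]
fn main() {
fn spawn_projection_service(service: ProjectionService) {
    // The polling loop runs forever, so hand it off to its own task
    tokio::spawn(service.start_real_time_updates());
}
}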

Rebuilding Projections

One of the powerful features of event sourcing is the ability to rebuild projections:

#![allow(unused)]
fn main() {
use eventcore::cqrs::{RebuildCoordinator, RebuildStrategy};

async fn rebuild_projection(
    event_store: Arc<dyn EventStore>,
    projection: &mut UserTaskListProjection,
) -> Result<(), Box<dyn std::error::Error>> {
    let coordinator = RebuildCoordinator::new(event_store);
    
    // Clear existing state
    *projection = UserTaskListProjection::default();
    
    // Rebuild from beginning
    let strategy = RebuildStrategy::FromBeginning;
    
    coordinator.rebuild(projection, strategy).await?;
    
    println!("Projection rebuilt successfully");
    Ok(())
}
}

Testing Projections

Testing projections is straightforward:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    use eventcore::testing::prelude::*;
    
    #[tokio::test]
    async fn test_user_task_list_projection() {
        let mut projection = UserTaskListProjection::default();
        
        // Create test events
        let task_id = TaskId::new();
        let alice = UserName::try_new("alice").unwrap();
        
        // Apply created event
        let created_event = create_test_event(
            StreamId::from_static("task-123"),
            SystemEvent::Task(TaskEvent::Created {
                task_id,
                title: TaskTitle::try_new("Test").unwrap(),
                description: TaskDescription::try_new("").unwrap(),
                creator: alice.clone(),
                created_at: Utc::now(),
            })
        );
        
        projection.apply(&created_event).await.unwrap();
        
        // Apply assigned event
        let assigned_event = create_test_event(
            StreamId::from_static("task-123"),
            SystemEvent::Task(TaskEvent::Assigned {
                task_id,
                assignee: alice.clone(),
                assigned_by: alice.clone(),
                assigned_at: Utc::now(),
            })
        );
        
        projection.apply(&assigned_event).await.unwrap();
        
        // Verify
        let tasks = projection.get_user_tasks(&alice);
        assert_eq!(tasks.len(), 1);
        assert_eq!(tasks[0].id, task_id);
        assert_eq!(tasks[0].status, TaskStatus::Open);
    }
    
    #[tokio::test]
    async fn test_statistics_projection() {
        let mut projection = TeamStatisticsProjection::default();
        
        // Apply multiple events
        for i in 0..10 {
            let event = create_test_event(
                StreamId::try_new(format!("task-{}", i)).unwrap(),
                SystemEvent::Task(TaskEvent::Created {
                    task_id: TaskId::new(),
                    title: TaskTitle::try_new("Task").unwrap(),
                    description: TaskDescription::try_new("").unwrap(),
                    creator: UserName::try_new("alice").unwrap(),
                    created_at: Utc::now(),
                })
            );
            projection.apply(&event).await.unwrap();
        }
        
        assert_eq!(projection.total_tasks_created, 10);
        assert_eq!(projection.completion_rate(), 0.0);
    }
}
}

Performance Considerations

1. Batch Processing

Process events in batches for better performance:

#![allow(unused)]
fn main() {
let config = ProjectionRunnerConfig::default()
    .with_batch_size(1000)  // Process 1000 events at a time
    .with_checkpoint_frequency(100);  // Checkpoint every 100 events
}

2. Selective Projections

Only process relevant streams:

#![allow(unused)]
fn main() {
impl CqrsProjection for UserTaskListProjection {
    fn relevant_streams(&self) -> Vec<&str> {
        vec!["task-*", "user-*"]  // Only process task and user streams
    }
}
}

3. Caching

Use in-memory caching for frequently accessed data:

#![allow(unused)]
fn main() {
struct CachedProjection {
    inner: UserTaskListProjection,
    cache: HashMap<UserName, Vec<TaskSummary>>,
    cache_ttl: Duration,
}
}
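
A possible read path for that wrapper, shown only as a sketch: TTL handling is omitted, get_user_tasks_cached is an illustrative name rather than an EventCore API, and it assumes get_user_tasks returns owned TaskSummary values as the cache type above suggests.

#![allow(unused)]
fn main() {
impl CachedProjection {
    fn get_user_tasks_cached(&mut self, user: &UserName) -> Vec<TaskSummary> {
        // Serve from the cache when we already have an entry for this user
        if let Some(tasks) = self.cache.get(user) {
            return tasks.clone();
        }
        // Otherwise fall back to the projection and remember the result
        let tasks = self.inner.get_user_tasks(user);
        self.cache.insert(user.clone(), tasks.clone());
        tasks
    }
}
}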

Common Patterns

1. Denormalized Views

Projections often denormalize data for query performance:

#![allow(unused)]
fn main() {
// Instead of joins, store everything needed
struct TaskView {
    task_id: TaskId,
    title: String,
    assignee_name: String,      // Denormalized
    assignee_email: String,     // Denormalized
    creator_name: String,       // Denormalized
    // ... all data needed for display
}
}

2. Multiple Projections

Create different projections for different query needs:

  • UserTaskListProjection - For user-specific views
  • TeamDashboardProjection - For manager overview
  • SearchIndexProjection - For full-text search
  • ReportingProjection - For analytics
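
For example, the same batch of stored events can be folded into several read models side by side. A minimal sketch, assuming the projection types defined earlier in this chapter and an apply method that returns Result<(), ProjectionError> as in the helpers above:

#![allow(unused)]
fn main() {
async fn update_all_projections(
    events: &[StoredEvent<SystemEvent>],
    task_list: &mut UserTaskListProjection,
    statistics: &mut TeamStatisticsProjection,
) -> Result<(), ProjectionError> {
    // Each projection folds the same events into its own query-optimized model
    for event in events {
        task_list.apply(event).await?;
        statistics.apply(event).await?;
    }
    Ok(())
}
}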

3. Event Enrichment

Projections can enrich events with additional context:

#![allow(unused)]
fn main() {
async fn enrich_event(&self, event: &TaskEvent) -> EnrichedTaskEvent {
    // Look up user details, attach timestamps and correlation data, etc.,
    // then build the enriched event from `event` plus that context
    todo!()
}
}

Summary

Projections in EventCore:

  • ✅ Transform events into query-optimized read models
  • ✅ Can be rebuilt from events at any time
  • ✅ Support real-time updates
  • ✅ Enable complex queries without affecting write performance
  • ✅ Allow multiple views of the same data

Key benefits:

  • Flexibility: Change read models without touching events
  • Performance: Optimized for specific queries
  • Evolution: Add new projections as needs change
  • Testing: Easy to test with synthetic events

Next, let’s look at testing your application

Chapter 2.5: Testing Your Application

Testing event-sourced systems is actually easier than testing traditional CRUD applications. With EventCore, you can test commands, projections, and entire workflows using deterministic event streams.

Testing Philosophy

EventCore testing follows these principles:

  1. Test Behavior, Not Implementation - Focus on what events are produced
  2. Use Real Events - Test with actual domain events, not mocks
  3. Deterministic Tests - Events provide repeatable test scenarios
  4. Fast Feedback - In-memory event store for rapid testing

Testing Commands

Basic Command Testing

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    use eventcore::prelude::*;
    use eventcore::testing::prelude::*;
    use eventcore_memory::InMemoryEventStore;
    
    #[tokio::test]
    async fn test_create_task_success() {
        // Arrange
        let store = InMemoryEventStore::<SystemEvent>::new();
        let executor = CommandExecutor::new(store);
        
        let task_id = TaskId::new();
        let command = CreateTask::new(
            task_id,
            TaskTitle::try_new("Write tests").unwrap(),
            TaskDescription::try_new("Add comprehensive test coverage").unwrap(),
            UserName::try_new("alice").unwrap(),
        ).unwrap();
        
        // Act
        let result = executor.execute(&command).await;
        
        // Assert
        assert!(result.is_ok());
        let execution_result = result.unwrap();
        assert_eq!(execution_result.events_written.len(), 1);
        
        // Verify the event
        match &execution_result.events_written[0] {
            SystemEvent::Task(TaskEvent::Created { title, creator, .. }) => {
                assert_eq!(title.as_ref(), "Write tests");
                assert_eq!(creator.as_ref(), "alice");
            }
            _ => panic!("Expected TaskCreated event"),
        }
    }
    
    #[tokio::test]
    async fn test_create_duplicate_task_fails() {
        // Arrange
        let store = InMemoryEventStore::<SystemEvent>::new();
        let executor = CommandExecutor::new(store);
        
        let task_id = TaskId::new();
        let command = CreateTask::new(
            task_id,
            TaskTitle::try_new("Task").unwrap(),
            TaskDescription::try_new("").unwrap(),
            UserName::try_new("alice").unwrap(),
        ).unwrap();
        
        // Act - Create first time
        executor.execute(&command).await.unwrap();
        
        // Act - Try to create again
        let result = executor.execute(&command).await;
        
        // Assert
        assert!(result.is_err());
        match result.unwrap_err() {
            CommandError::ValidationFailed(msg) => {
                assert!(msg.contains("already exists"));
            }
            _ => panic!("Expected ValidationFailed error"),
        }
    }
}
}

Testing Multi-Stream Commands

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_assign_task_multi_stream() {
    // Arrange
    let store = InMemoryEventStore::<SystemEvent>::new();
    let executor = CommandExecutor::new(store.clone());
    
    // Create a task first
    let task_id = TaskId::new();
    let create = CreateTask::new(
        task_id,
        TaskTitle::try_new("Multi-stream test").unwrap(),
        TaskDescription::try_new("").unwrap(),
        UserName::try_new("alice").unwrap(),
    ).unwrap();
    
    executor.execute(&create).await.unwrap();
    
    // Assign the task
    let assign = AssignTask::new(
        task_id,
        UserName::try_new("bob").unwrap(),
        UserName::try_new("alice").unwrap(),
    ).unwrap();
    
    // Act
    let result = executor.execute(&assign).await.unwrap();
    
    // Assert - Should affect both task and user streams
    assert_eq!(result.streams_affected.len(), 2);
    assert!(result.streams_affected.contains(&StreamId::try_new(format!("task-{}", task_id)).unwrap()));
    assert!(result.streams_affected.contains(&StreamId::from_static("user-bob")));
    
    // Verify events in both streams
    let task_events = store.read_stream(
        &StreamId::try_new(format!("task-{}", task_id)).unwrap(),
        ReadOptions::default()
    ).await.unwrap();
    
    let user_events = store.read_stream(
        &StreamId::from_static("user-bob"),
        ReadOptions::default()
    ).await.unwrap();
    
    assert_eq!(task_events.events.len(), 2); // Created + Assigned
    assert_eq!(user_events.events.len(), 2); // TaskAssigned + WorkloadUpdated
}
}

Testing Projections

Unit Testing Projections

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_user_task_list_projection() {
    use eventcore::testing::builders::*;
    
    // Arrange
    let mut projection = UserTaskListProjection::default();
    let task_id = TaskId::new();
    let alice = UserName::try_new("alice").unwrap();
    
    // Build test events
    let events = vec![
        StoredEventBuilder::new()
            .with_stream_id(StreamId::from_static("task-123"))
            .with_payload(SystemEvent::Task(TaskEvent::Created {
                task_id,
                title: TaskTitle::try_new("Test task").unwrap(),
                description: TaskDescription::try_new("").unwrap(),
                creator: alice.clone(),
                created_at: Utc::now(),
            }))
            .build(),
        StoredEventBuilder::new()
            .with_stream_id(StreamId::from_static("task-123"))
            .with_payload(SystemEvent::Task(TaskEvent::Assigned {
                task_id,
                assignee: alice.clone(),
                assigned_by: alice.clone(),
                assigned_at: Utc::now(),
            }))
            .build(),
    ];
    
    // Act
    for event in events {
        projection.apply(&event).await.unwrap();
    }
    
    // Assert
    let tasks = projection.get_user_tasks(&alice);
    assert_eq!(tasks.len(), 1);
    assert_eq!(tasks[0].id, task_id);
    assert_eq!(tasks[0].status, TaskStatus::Open);
}
}

Testing Projection Accuracy

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_statistics_projection_accuracy() {
    let mut projection = TeamStatisticsProjection::default();
    
    // Create a series of events
    let events = create_test_scenario(TestScenario {
        tasks_created: 10,
        tasks_assigned: 8,
        tasks_completed: 5,
        users: vec!["alice", "bob", "charlie"],
    });
    
    // Apply all events
    for event in events {
        projection.apply(&event).await.unwrap();
    }
    
    // Verify statistics
    assert_eq!(projection.total_tasks_created, 10);
    assert_eq!(projection.tasks_by_status[&TaskStatus::Completed], 5);
    assert_eq!(projection.tasks_by_status[&TaskStatus::Open], 2); // 10 - 8 assigned
    assert_eq!(projection.tasks_by_status[&TaskStatus::InProgress], 3); // 8 - 5 completed
    
    // Verify completion rate
    assert_eq!(projection.completion_rate(), 50.0); // 5/10 * 100
}
}

Property-Based Testing

EventCore works well with property-based testing:

#![allow(unused)]
fn main() {
use proptest::prelude::*;

proptest! {
    #[test]
    fn task_assignment_maintains_consistency(
        task_count in 1..50usize,
        user_count in 1..10usize,
        assignment_ratio in 0.0..1.0f64,
    ) {
        // Property: Total assigned tasks equals sum of user assignments
        let runtime = tokio::runtime::Runtime::new().unwrap();
        runtime.block_on(async {
            let mut projection = UserTaskListProjection::default();
            let users = generate_users(user_count);
            let tasks = generate_tasks(task_count);
            
            // Assign tasks based on ratio
            let assignments = assign_tasks_to_users(&tasks, &users, assignment_ratio);
            
            // Apply events
            for event in assignments {
                projection.apply(&event).await.unwrap();
            }
            
            // Verify consistency
            let total_assigned: usize = users.iter()
                .map(|u| projection.get_user_tasks(u).len())
                .sum();
                
            let expected_assigned = (task_count as f64 * assignment_ratio) as usize;
            assert_eq!(total_assigned, expected_assigned);
        });
    }
}
}

Integration Testing

Testing Complete Workflows

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_complete_task_workflow() {
    // Setup
    let store = InMemoryEventStore::<SystemEvent>::new();
    let executor = CommandExecutor::new(store.clone());
    let mut projection = UserTaskListProjection::default();
    
    // Execute workflow
    let task_id = TaskId::new();
    let alice = UserName::try_new("alice").unwrap();
    let bob = UserName::try_new("bob").unwrap();
    
    // 1. Create task
    let create = CreateTask::new(
        task_id,
        TaskTitle::try_new("Complete workflow").unwrap(),
        TaskDescription::try_new("Test the entire flow").unwrap(),
        alice.clone(),
    ).unwrap();
    executor.execute(&create).await.unwrap();
    
    // 2. Assign to Bob
    let assign = AssignTask::new(task_id, bob.clone(), alice.clone()).unwrap();
    executor.execute(&assign).await.unwrap();
    
    // 3. Bob completes the task
    let complete = CompleteTask {
        task_id: StreamId::try_new(format!("task-{}", task_id)).unwrap(),
        user_id: StreamId::try_new(format!("user-{}", bob)).unwrap(),
        completed_by: bob.clone(),
    };
    executor.execute(&complete).await.unwrap();
    
    // Update projection with all events
    let all_events = store.read_all_events(ReadOptions::default()).await.unwrap();
    for event in all_events {
        projection.apply(&event).await.unwrap();
    }
    
    // Verify end state
    let bob_tasks = projection.get_user_tasks(&bob);
    assert_eq!(bob_tasks.len(), 1);
    assert_eq!(bob_tasks[0].status, TaskStatus::Completed);
    assert!(bob_tasks[0].completed_at.is_some());
}
}

Testing Helpers

EventCore provides testing utilities:

Event Builders

#![allow(unused)]
fn main() {
use eventcore::testing::builders::*;

fn create_test_event(payload: SystemEvent) -> StoredEvent<SystemEvent> {
    StoredEventBuilder::new()
        .with_id(EventId::new())
        .with_stream_id(StreamId::from_static("test-stream"))
        .with_version(EventVersion::new(1))
        .with_payload(payload)
        .with_metadata(
            EventMetadataBuilder::new()
                .with_user_id(UserId::from("test-user"))
                .build()
        )
        .build()
}
}

Test Scenarios

#![allow(unused)]
fn main() {
use eventcore::testing::fixtures::*;

struct TaskScenario;

impl TestScenario for TaskScenario {
    type Event = SystemEvent;
    
    fn events(&self) -> Vec<EventToWrite<Self::Event>> {
        vec![
            // Series of events that create a test scenario
            create_task_event("task-1", "Test Task 1"),
            assign_task_event("task-1", "alice"),
            complete_task_event("task-1", "alice"),
        ]
    }
}
}

Assertion Helpers

#![allow(unused)]
fn main() {
use eventcore::testing::assertions::*;

#[tokio::test]
async fn test_event_ordering() {
    let events = vec![/* ... */];
    
    // Assert events are properly ordered
    assert_events_ordered(&events);
    
    // Assert no duplicate event IDs
    assert_unique_event_ids(&events);
    
    // Assert version progression
    assert_stream_version_progression(&events, &StreamId::from_static("test"));
}
}

Testing Error Cases

Command Validation Errors

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_invalid_command_inputs() {
    let executor = CommandExecutor::new(InMemoryEventStore::<SystemEvent>::new());
    
    // Test empty title
    let result = TaskTitle::try_new("");
    assert!(result.is_err());
    
    // Test whitespace-only title
    let result = TaskTitle::try_new("   ");
    assert!(result.is_err());
    
    // Test overly long description
    let long_desc = "x".repeat(3000);
    let result = TaskDescription::try_new(&long_desc);
    assert!(result.is_err());
}
}

Concurrency Conflicts

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_concurrent_modifications() {
    let store = InMemoryEventStore::<SystemEvent>::new();
    let executor = CommandExecutor::new(store);
    
    // Create a task
    let task_id = TaskId::new();
    let create = CreateTask::new(
        task_id,
        TaskTitle::try_new("Concurrent test").unwrap(),
        TaskDescription::try_new("").unwrap(),
        UserName::try_new("alice").unwrap(),
    ).unwrap();
    executor.execute(&create).await.unwrap();
    
    // Simulate concurrent updates
    let assign1 = AssignTask::new(task_id, UserName::try_new("bob").unwrap(), UserName::try_new("alice").unwrap()).unwrap();
    let assign2 = AssignTask::new(task_id, UserName::try_new("charlie").unwrap(), UserName::try_new("alice").unwrap()).unwrap();
    
    // Execute both concurrently
    let (result1, result2) = tokio::join!(
        executor.execute(&assign1),
        executor.execute(&assign2)
    );
    
    // At least one must succeed; on a version conflict the executor retries
    // the losing command, so typically both complete successfully
    assert!(result1.is_ok() || result2.is_ok());
}
}

Performance Testing

#![allow(unused)]
fn main() {
#[tokio::test]
#[ignore] // Run with --ignored flag
async fn test_high_volume_event_processing() {
    use std::time::Instant;
    
    let mut projection = UserTaskListProjection::default();
    let event_count = 10_000;
    
    // Generate events
    let events: Vec<_> = (0..event_count)
        .map(|i| create_task_assigned_event(i))
        .collect();
    
    // Measure processing time
    let start = Instant::now();
    
    for event in events {
        projection.apply(&event).await.unwrap();
    }
    
    let duration = start.elapsed();
    let events_per_second = event_count as f64 / duration.as_secs_f64();
    
    println!("Processed {} events in {:?}", event_count, duration);
    println!("Rate: {:.2} events/second", events_per_second);
    
    // Assert reasonable performance
    assert!(events_per_second > 1000.0, "Projection too slow");
}
}

Test Organization

Structure your tests for clarity:

tests/
├── unit/
│   ├── commands/
│   │   ├── create_task_test.rs
│   │   ├── assign_task_test.rs
│   │   └── complete_task_test.rs
│   └── projections/
│       ├── task_list_test.rs
│       └── statistics_test.rs
├── integration/
│   ├── workflows/
│   │   └── task_lifecycle_test.rs
│   └── projections/
│       └── real_time_updates_test.rs
└── performance/
    └── high_volume_test.rs

Debugging Tests

When a test fails, enable logging and dump the stored events to see exactly what happened:

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_with_debugging() {
    // Enable debug logging
    let _ = env_logger::builder()
        .filter_level(log::LevelFilter::Debug)
        .try_init();
    
    let store = InMemoryEventStore::<SystemEvent>::new();
    
    // ... execute commands against `store` here ...
    
    // Then print every stored event to inspect what was written
    let events = store.read_all_events(ReadOptions::default()).await.unwrap();
    
    for event in &events {
        println!("Event: {:?}", event);
        println!("  Stream: {}", event.stream_id);
        println!("  Version: {}", event.version);
        println!("  Payload: {:?}", event.payload);
        println!("  Metadata: {:?}", event.metadata);
        println!();
    }
}
}

Summary

Testing EventCore applications is straightforward because:

  • Events are deterministic - Same events always produce same state
  • No mocking needed - Use real events and in-memory stores
  • Fast feedback - In-memory testing is instantaneous
  • Complete scenarios - Test entire workflows easily
  • Time travel - Test any historical state

Best practices:

  1. Test commands by verifying produced events
  2. Test projections by applying known events
  3. Use property-based testing for invariants
  4. Test complete workflows for integration
  5. Keep tests fast with in-memory stores

You’ve now completed the Getting Started tutorial! You can:

  • Model domains with events
  • Implement type-safe commands
  • Build projections for queries
  • Test everything thoroughly

Continue to Part 3: Core Concepts for deeper understanding →

Part 3: Core Concepts

This part provides a deep dive into EventCore’s core concepts and design principles. Understanding these concepts will help you build robust, scalable event-sourced systems.

Chapters in This Part

  1. Commands and the Macro System - Deep dive into command implementation
  2. Events and Event Stores - Understanding events and storage
  3. State Reconstruction - How EventCore rebuilds state from events
  4. Multi-Stream Atomicity - The key innovation of EventCore
  5. Error Handling - Comprehensive error handling strategies

What You’ll Learn

  • How the #[derive(Command)] macro works internally
  • Event design principles and best practices
  • The state reconstruction algorithm
  • How multi-stream atomicity is guaranteed
  • Error handling patterns for production systems

Prerequisites

  • Completed Part 2: Getting Started
  • Basic understanding of Rust macros helpful
  • Familiarity with database transactions

Time to Complete

  • Reading: ~30 minutes
  • With examples: ~1 hour

Ready to dive deep? Let’s start with Commands and the Macro System

Chapter 3.1: Commands and the Macro System

This chapter explores how EventCore’s command system works, focusing on the #[derive(Command)] macro that eliminates boilerplate while maintaining type safety.

The Command Pattern

Commands in EventCore represent user intentions - things that should happen in your system. They:

  1. Declare required streams - What data they need access to
  2. Validate business rules - Ensure operations are allowed
  3. Generate events - Record what actually happened
  4. Maintain consistency - All changes are atomic

Anatomy of a Command

Let’s dissect a command to understand each part:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]         // 1. Derive macro generates boilerplate
struct TransferMoney {
    #[stream]                     // 2. Declares this field is a stream
    from_account: StreamId,       
    
    #[stream] 
    to_account: StreamId,
    
    amount: Money,                // 3. Regular fields for command data
    reference: String,
}
}

What the Macro Generates

The #[derive(Command)] macro generates several things:

#![allow(unused)]
fn main() {
// 1. A phantom type for compile-time stream tracking
#[derive(Debug, Clone, Copy, Default)]
pub struct TransferMoneyStreamSet;

// 2. Implementation of CommandStreams trait
impl CommandStreams for TransferMoney {
    type StreamSet = TransferMoneyStreamSet;
    
    fn read_streams(&self) -> Vec<StreamId> {
        vec![
            self.from_account.clone(),
            self.to_account.clone(),
        ]
    }
}

// 3. Blanket implementation gives you Command trait
// (because TransferMoney also implements CommandLogic)
}

The Two-Trait Design

EventCore splits the Command pattern into two traits:

CommandStreams (Generated)

Handles infrastructure concerns:

#![allow(unused)]
fn main() {
pub trait CommandStreams: Send + Sync + Clone {
    /// Phantom type for compile-time stream access control
    type StreamSet: Send + Sync;
    
    /// Returns the streams this command needs to read
    fn read_streams(&self) -> Vec<StreamId>;
}
}

CommandLogic (You Implement)

Contains your domain logic:

#![allow(unused)]
fn main() {
#[async_trait]
pub trait CommandLogic: CommandStreams {
    /// State type that will be reconstructed from events
    type State: Default + Send + Sync;
    
    /// Event type this command produces
    type Event: Send + Sync;
    
    /// Apply an event to update state (event sourcing fold)
    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>);
    
    /// Business logic that validates and produces events
    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>>;
}
}

Stream Declaration Patterns

Basic Stream Declaration

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct UpdateProfile {
    #[stream]
    user_id: StreamId,  // Single stream
}
}

Multiple Streams

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ProcessOrder {
    #[stream]
    order_id: StreamId,
    
    #[stream]
    customer_id: StreamId,
    
    #[stream]
    inventory_id: StreamId,
    
    #[stream]
    payment_id: StreamId,
}
}

Stream Arrays (Planned Feature)

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct BulkUpdate {
    #[stream("items")]
    item_ids: Vec<StreamId>,  // Multiple streams of same type
}
}

Conditional Streams

For streams discovered at runtime:

#![allow(unused)]
fn main() {
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Discover we need another stream based on state
    if state.requires_approval {
        let approver_stream = StreamId::from_static("approver-stream");
        stream_resolver.add_streams(vec![approver_stream]);
        // EventCore will re-execute with the additional stream
    }
    
    // Continue with logic...
}
}

Type-Safe Stream Access

The ReadStreams type ensures you can only write to declared streams:

#![allow(unused)]
fn main() {
// In your handle method:
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    _stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // ✅ This works - from_account was declared with #[stream]
    let withdraw_event = StreamWrite::new(
        &read_streams,
        self.from_account.clone(),
        BankEvent::MoneyWithdrawn { amount: self.amount }
    )?;
    
    // ❌ This won't compile - random_stream wasn't declared
    let invalid = StreamWrite::new(
        &read_streams,
        StreamId::from_static("random-stream"),
        SomeEvent {}
    )?; // Compile error!
    
    Ok(vec![withdraw_event])
}
}

State Reconstruction

The apply method builds state by folding events:

#![allow(unused)]
fn main() {
fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        BankEvent::AccountOpened { balance, .. } => {
            state.exists = true;
            state.balance = *balance;
        }
        BankEvent::MoneyDeposited { amount, .. } => {
            state.balance += amount;
        }
        BankEvent::MoneyWithdrawn { amount, .. } => {
            state.balance = state.balance.saturating_sub(*amount);
        }
    }
}
}

This is called for each event in sequence to rebuild current state.

Command Validation Patterns

Using the require! Macro

#![allow(unused)]
fn main() {
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    _stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Business rule validation with good error messages
    require!(
        state.balance >= self.amount,
        "Insufficient funds: balance={}, requested={}",
        state.balance,
        self.amount
    );
    
    require!(
        self.amount > 0,
        "Transfer amount must be positive"
    );
    
    require!(
        self.from_account != self.to_account,
        "Cannot transfer to same account"
    );
    
    // Generate events after validation passes
    Ok(vec![/* events */])
}
}

Custom Validation Functions

#![allow(unused)]
fn main() {
impl TransferMoney {
    fn validate_transfer_limits(&self, state: &AccountState) -> CommandResult<()> {
        const DAILY_LIMIT: u64 = 10_000;
        
        let daily_total = state.transfers_today + self.amount;
        require!(
            daily_total <= DAILY_LIMIT,
            "Daily transfer limit exceeded: {} > {}",
            daily_total,
            DAILY_LIMIT
        );
        
        Ok(())
    }
}
}

Advanced Macro Features

Custom Stream Names

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ComplexCommand {
    #[stream(name = "primary")]
    main_stream: StreamId,
    
    #[stream(name = "secondary", optional = true)]
    optional_stream: Option<StreamId>,
}
}

Computed Streams

#![allow(unused)]
fn main() {
impl ComplexCommand {
    fn compute_streams(&self) -> Vec<StreamId> {
        let mut streams = vec![self.main_stream.clone()];
        
        if let Some(ref optional) = self.optional_stream {
            streams.push(optional.clone());
        }
        
        streams
    }
}
}

Command Composition

Commands can be composed for complex operations:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct CompleteOrderWorkflow {
    #[stream]
    order_id: StreamId,
    
    // Sub-commands to execute
    payment: ProcessPayment,
    fulfillment: FulfillOrder,
    notification: SendNotification,
}

impl CommandLogic for CompleteOrderWorkflow {
    // ... implementation delegates to sub-commands
}
}

Performance Optimizations

Pre-computed State

For expensive computations:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct PrecomputedState {
    balance: u64,
    transaction_count: u64,
    daily_totals: HashMap<Date, u64>,  // Pre-aggregated
}

fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    // Update pre-computed values incrementally
    match &event.payload {
        BankEvent::MoneyTransferred { amount, date, .. } => {
            state.balance -= amount;
            *state.daily_totals.entry(*date).or_insert(0) += amount;
        }
        // ...
    }
}
}

Lazy State Loading

For large states:

#![allow(unused)]
fn main() {
struct LazyState {
    core: AccountCore,           // Always loaded
    history: Option<Box<TransactionHistory>>,  // Load on demand
}

async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    mut state: Self::State,
    _stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Load history only if needed
    if self.requires_history_check() {
        state.load_history().await?;
    }
    
    // Continue...
}
}

Testing Commands

Unit Testing

#![allow(unused)]
fn main() {
#[test]
fn test_command_stream_declaration() {
    let cmd = TransferMoney {
        from_account: StreamId::from_static("account-1"),
        to_account: StreamId::from_static("account-2"),
        amount: 100,
        reference: "test".to_string(),
    };
    
    let streams = cmd.read_streams();
    assert_eq!(streams.len(), 2);
    assert!(streams.contains(&StreamId::from_static("account-1")));
    assert!(streams.contains(&StreamId::from_static("account-2")));
}
}

Testing State Reconstruction

#![allow(unused)]
fn main() {
#[test]
fn test_apply_events() {
    let cmd = TransferMoney { /* ... */ };
    let mut state = AccountState::default();
    
    let event = create_test_event(BankEvent::AccountOpened {
        balance: 1000,
        owner: "alice".to_string(),
    });
    
    cmd.apply(&mut state, &event);
    
    assert_eq!(state.balance, 1000);
    assert!(state.exists);
}
}

Common Patterns

Idempotent Commands

Make commands idempotent by checking for duplicate operations:

#![allow(unused)]
fn main() {
async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Check if operation was already performed
    if state.transfers.contains(&self.reference) {
        // Already processed - return success with no new events
        return Ok(vec![]);
    }
    
    // Process normally...
}
}

Command Versioning

Handle command evolution:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
#[command(version = 2)]
struct TransferMoneyV2 {
    #[stream]
    from_account: StreamId,
    
    #[stream]
    to_account: StreamId,
    
    amount: Money,
    reference: String,
    
    // New in V2
    category: TransferCategory,
}
}

Summary

The EventCore command system provides:

  • Zero boilerplate through #[derive(Command)]
  • Type-safe stream access preventing invalid writes
  • Clear separation between infrastructure and domain logic
  • Flexible validation with the require! macro
  • Extensibility through the two-trait design

Key takeaways:

  1. Use #[derive(Command)] to eliminate boilerplate
  2. Declare streams with #[stream] attributes
  3. Implement business logic in CommandLogic
  4. Leverage type safety for compile-time guarantees
  5. Commands are just data - easy to test and reason about

Next, let’s explore Events and Event Stores

Chapter 3.2: Events and Event Stores

Events are the heart of EventCore - immutable records of things that happened in your system. This chapter explores event design, storage, and the guarantees EventCore provides.

What Makes a Good Event?

Events should be:

  1. Past Tense - They record what happened, not what should happen
  2. Immutable - Once written, events never change
  3. Self-Contained - Include all necessary data
  4. Business-Focused - Represent domain concepts, not technical details

Event Design Principles

#![allow(unused)]
fn main() {
// ❌ Bad: Technical focus, present tense, missing context
#[derive(Serialize, Deserialize)]
struct UpdateUser {
    id: String,
    data: HashMap<String, Value>,
}

// ✅ Good: Business focus, past tense, complete information
#[derive(Serialize, Deserialize)]
struct CustomerEmailChanged {
    customer_id: CustomerId,
    old_email: Email,
    new_email: Email,
    changed_by: UserId,
    changed_at: DateTime<Utc>,
    reason: EmailChangeReason,
}
}

Event Structure in EventCore

Core Event Types

#![allow(unused)]
fn main() {
/// Your domain event
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct OrderShipped {
    pub order_id: OrderId,
    pub tracking_number: TrackingNumber,
    pub carrier: Carrier,
    pub shipped_at: DateTime<Utc>,
}

/// Event ready to be written
pub struct EventToWrite<E> {
    pub stream_id: StreamId,
    pub payload: E,
    pub metadata: Option<EventMetadata>,
    pub expected_version: ExpectedVersion,
}

/// Event as stored in the event store
pub struct StoredEvent<E> {
    pub id: EventId,                  // UUIDv7 for global ordering
    pub stream_id: StreamId,          // Which stream this belongs to
    pub version: EventVersion,        // Position in the stream
    pub payload: E,                   // Your domain event
    pub metadata: EventMetadata,      // Who, when, why
    pub occurred_at: DateTime<Utc>,   // When it happened
}
}

Event IDs and Ordering

EventCore uses UUIDv7 for event IDs, providing:

#![allow(unused)]
fn main() {
// UUIDv7 properties:
// - Globally unique
// - Time-ordered (sortable)
// - Millisecond precision timestamp
// - No coordination required

let event1 = EventId::new();
let event2 = EventId::new();

// Later events have higher IDs
assert!(event2 > event1);

// Extract timestamp
let timestamp = event1.timestamp();
}

Event Metadata

Every event carries metadata for auditing and debugging:

#![allow(unused)]
fn main() {
pub struct EventMetadata {
    /// Who triggered this event
    pub user_id: Option<UserId>,
    
    /// Correlation ID for tracking across services
    pub correlation_id: CorrelationId,
    
    /// What caused this event (previous event ID)
    pub causation_id: Option<CausationId>,
    
    /// Custom metadata
    pub custom: HashMap<String, Value>,
}

// Building metadata
let metadata = EventMetadata::new()
    .with_user_id(UserId::from("alice@example.com"))
    .with_correlation_id(CorrelationId::new())
    .caused_by(&previous_event)
    .with_custom("ip_address", "192.168.1.1")
    .with_custom("user_agent", "MyApp/1.0");
}

Event Store Abstraction

EventCore defines a trait that storage adapters implement:

#![allow(unused)]
fn main() {
#[async_trait]
pub trait EventStore: Send + Sync {
    type Event: Send + Sync;
    type Error: Error + Send + Sync;

    /// Read events from a specific stream
    async fn read_stream(
        &self,
        stream_id: &StreamId,
        options: ReadOptions,
    ) -> Result<StreamEvents<Self::Event>, Self::Error>;

    /// Read events from multiple streams
    async fn read_streams(
        &self,
        stream_ids: &[StreamId],
        options: ReadOptions,
    ) -> Result<Vec<StreamEvents<Self::Event>>, Self::Error>;

    /// Write events atomically to multiple streams
    async fn write_events(
        &self,
        events: Vec<EventToWrite<Self::Event>>,
    ) -> Result<WriteResult, Self::Error>;

    /// Subscribe to real-time events
    async fn subscribe(
        &self,
        options: SubscriptionOptions,
    ) -> Result<Box<dyn EventSubscription<Self::Event>>, Self::Error>;
}
}

Stream Versioning

Streams maintain version numbers for optimistic concurrency:

#![allow(unused)]
fn main() {
pub struct StreamEvents<E> {
    pub stream_id: StreamId,
    pub version: EventVersion,    // Current version after these events
    pub events: Vec<StoredEvent<E>>,
}

// Version control options
pub enum ExpectedVersion {
    /// Stream must not exist
    NoStream,
    
    /// Stream must be at this exact version
    Exact(EventVersion),
    
    /// Stream must exist but any version is OK
    Any,
    
    /// No version check (dangerous!)
    NoCheck,
}
}

Using Version Control

#![allow(unused)]
fn main() {
// First write - stream shouldn't exist
let first_event = EventToWrite {
    stream_id: stream_id.clone(),
    payload: AccountOpened { /* ... */ },
    metadata: None,
    expected_version: ExpectedVersion::NoStream,
};

// Subsequent writes - check version
let next_event = EventToWrite {
    stream_id: stream_id.clone(),
    payload: MoneyDeposited { /* ... */ },
    metadata: None,
    expected_version: ExpectedVersion::Exact(EventVersion::new(1)),
};
}

Storage Adapters

PostgreSQL Adapter

The production-ready adapter with ACID guarantees:

#![allow(unused)]
fn main() {
use eventcore_postgres::{PostgresEventStore, PostgresConfig};

let config = PostgresConfig::new("postgresql://localhost/eventcore")
    .with_pool_size(20)
    .with_schema("eventcore");

let event_store = PostgresEventStore::new(config).await?;

// Initialize schema (one time)
event_store.initialize().await?;
}

PostgreSQL schema:

-- Events table with optimal indexing
CREATE TABLE events (
    id UUID PRIMARY KEY DEFAULT gen_uuidv7(),
    stream_id VARCHAR(255) NOT NULL,
    version BIGINT NOT NULL,
    event_type VARCHAR(255) NOT NULL,
    payload JSONB NOT NULL,
    metadata JSONB NOT NULL,
    occurred_at TIMESTAMPTZ NOT NULL,
    
    -- Ensure stream version uniqueness
    UNIQUE(stream_id, version)
);

-- Indexes for common queries (PostgreSQL requires separate CREATE INDEX statements)
CREATE INDEX idx_stream_id ON events (stream_id);
CREATE INDEX idx_occurred_at ON events (occurred_at);
CREATE INDEX idx_event_type ON events (event_type);

In-Memory Adapter

Perfect for testing and development:

#![allow(unused)]
fn main() {
use eventcore_memory::InMemoryEventStore;

let event_store = InMemoryEventStore::<MyEvent>::new();

// Optionally add chaos for testing
let chaotic_store = event_store
    .with_chaos(ChaosConfig {
        failure_probability: 0.1,  // 10% chance of failure
        latency_ms: Some(50..200), // Random latency
    });
}

Event Design Patterns

Event Granularity

Choose the right level of detail:

#![allow(unused)]
fn main() {
// ❌ Too coarse - loses important details
struct OrderUpdated {
    order_id: OrderId,
    new_state: OrderState,  // What actually changed?
}

// ❌ Too fine - creates event spam
struct OrderFieldUpdated {
    order_id: OrderId,
    field_name: String,
    old_value: Value,
    new_value: Value,
}

// ✅ Just right - meaningful business events
enum OrderEvent {
    OrderPlaced { customer: CustomerId, items: Vec<Item> },
    PaymentReceived { amount: Money, method: PaymentMethod },
    OrderShipped { tracking: TrackingNumber },
    OrderDelivered { signed_by: String },
}
}

Event Evolution

Design events to evolve gracefully:

#![allow(unused)]
fn main() {
// Version 1
#[derive(Serialize, Deserialize)]
struct UserRegistered {
    user_id: UserId,
    email: Email,
}

// Version 2 - Added field with default
#[derive(Serialize, Deserialize)]
struct UserRegistered {
    user_id: UserId,
    email: Email,
    #[serde(default)]
    referral_code: Option<String>,  // New field
}

// Version 3 - Structural change
#[derive(Serialize, Deserialize)]
#[serde(tag = "version")]
enum UserRegisteredVersioned {
    #[serde(rename = "1")]
    V1 { user_id: UserId, email: Email },
    
    #[serde(rename = "2")]
    V2 { 
        user_id: UserId, 
        email: Email, 
        referral_code: Option<String>,
    },
    
    #[serde(rename = "3")]
    V3 {
        user_id: UserId,
        email: Email,
        referral: Option<ReferralInfo>,  // Richer type
    },
}
}
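
When reading historical events, the versioned enum can be upcast to the newest shape in one place. A sketch built on the types above; ReferralInfo::from_code is a hypothetical constructor, not part of EventCore:

#![allow(unused)]
fn main() {
impl UserRegisteredVersioned {
    /// Normalize any stored version to the latest set of fields.
    fn upcast(self) -> (UserId, Email, Option<ReferralInfo>) {
        match self {
            Self::V1 { user_id, email } => (user_id, email, None),
            Self::V2 { user_id, email, referral_code } => {
                (user_id, email, referral_code.map(ReferralInfo::from_code))
            }
            Self::V3 { user_id, email, referral } => (user_id, email, referral),
        }
    }
}
}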

Event Enrichment

Add context to events:

#![allow(unused)]
fn main() {
trait EventEnricher {
    fn enrich<E>(&self, event: E) -> EnrichedEvent<E>;
}

struct EnrichedEvent<E> {
    pub event: E,
    pub context: EventContext,
}

struct EventContext {
    pub session_id: SessionId,
    pub request_id: RequestId,
    pub feature_flags: HashMap<String, bool>,
    pub environment: Environment,
}
}

Querying Events

Read Options

Control how events are read:

#![allow(unused)]
fn main() {
let options = ReadOptions::default()
    .from_version(EventVersion::new(10))    // Start from version 10
    .to_version(EventVersion::new(20))      // Up to version 20
    .max_events(100)                        // Limit results
    .backwards();                           // Read in reverse

let events = event_store
    .read_stream(&stream_id, options)
    .await?;
}

Reading Multiple Streams

For multi-stream operations:

#![allow(unused)]
fn main() {
let stream_ids = vec![
    StreamId::from_static("order-123"),
    StreamId::from_static("inventory-abc"),
    StreamId::from_static("payment-xyz"),
];

let all_events = event_store
    .read_streams(&stream_ids, ReadOptions::default())
    .await?;

// Events from all streams, ordered by EventId (time)
}

Global Event Feed

Read all events across all streams:

#![allow(unused)]
fn main() {
let all_events = event_store
    .read_all_events(
        ReadOptions::default()
            .after(last_known_event_id)  // For pagination
            .max_events(1000)
    )
    .await?;
}

Event Store Guarantees

1. Atomicity

All events in a write operation succeed or fail together:

#![allow(unused)]
fn main() {
let events = vec![
    EventToWrite { /* withdraw from account A */ },
    EventToWrite { /* deposit to account B */ },
];

// Both events written atomically
event_store.write_events(events).await?;
}

2. Consistency

Version checks prevent conflicting writes:

#![allow(unused)]
fn main() {
// Two concurrent commands read version 5
let command1_events = vec![/* ... */];
let command2_events = vec![/* ... */];

// First write succeeds
event_store.write_events(command1_events).await?;  // OK

// Second write fails - version conflict
event_store.write_events(command2_events).await?;  // Error: Version conflict
}

3. Durability

Events are persisted before returning success:

#![allow(unused)]
fn main() {
// After this returns, events are durable
let result = event_store.write_events(events).await?;

// Even if the process crashes, events are safe
}

4. Ordering

Events maintain both stream order and global order:

#![allow(unused)]
fn main() {
// Stream order: versions increase within a stream
assert!(stream_events.events[0].version < stream_events.events[1].version);

// Global order: EventIds increase across all streams
assert!(all_events[0].id < all_events[1].id);
}

Performance Optimization

Batch Writing

Write multiple events efficiently:

#![allow(unused)]
fn main() {
// Batch events for better performance
let mut batch = Vec::with_capacity(1000);

for item in large_dataset {
    batch.push(EventToWrite {
        stream_id: compute_stream_id(&item),
        payload: process_item(item),
        metadata: None,
        expected_version: ExpectedVersion::Any,
    });
    
    // Write in batches
    if batch.len() >= 100 {
        event_store.write_events(batch.drain(..).collect()).await?;
    }
}

// Write remaining
if !batch.is_empty() {
    event_store.write_events(batch).await?;
}
}

Stream Partitioning

Distribute load across streams:

#![allow(unused)]
fn main() {
// Instead of one hot stream
let stream_id = StreamId::from_static("orders");

// Partition by hashing the order id into one of 16 buckets
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

let mut hasher = DefaultHasher::new();
order_id.hash(&mut hasher);
let stream_id = StreamId::try_new(format!("orders-{}", hasher.finish() % 16)).unwrap();
}

Caching Strategies

Cache recent events for read performance:

#![allow(unused)]
fn main() {
struct CachedEventStore<ES: EventStore> {
    inner: ES,
    cache: Arc<RwLock<LruCache<StreamId, StreamEvents<ES::Event>>>>,
}

impl<ES: EventStore> CachedEventStore<ES> {
    async fn read_stream_cached(
        &self,
        stream_id: &StreamId,
        options: ReadOptions,
    ) -> Result<StreamEvents<ES::Event>, ES::Error> {
        // Check cache first
        if options.is_from_start() {
            if let Some(cached) = self.cache.read().await.get(stream_id) {
                return Ok(cached.clone());
            }
        }
        
        // Read from store
        let events = self.inner.read_stream(stream_id, options).await?;
        
        // Update cache
        self.cache.write().await.insert(stream_id.clone(), events.clone());
        
        Ok(events)
    }
}
}

Testing with Events

Event Fixtures

Create test events easily:

#![allow(unused)]
fn main() {
use eventcore::testing::builders::*;

fn create_account_opened_event() -> StoredEvent<BankEvent> {
    StoredEventBuilder::new()
        .with_stream_id(StreamId::from_static("account-123"))
        .with_version(EventVersion::new(1))
        .with_payload(BankEvent::AccountOpened {
            owner: "Alice".to_string(),
            initial_balance: 1000,
        })
        .with_metadata(
            EventMetadataBuilder::new()
                .with_user_id(UserId::from("alice@example.com"))
                .build()
        )
        .build()
}
}

Event Assertions

Test event properties:

#![allow(unused)]
fn main() {
use eventcore::testing::assertions::*;

#[test]
fn test_events_are_ordered() {
    let events = vec![/* ... */];
    
    assert_events_ordered(&events);
    assert_unique_event_ids(&events);
    assert_stream_version_progression(&events, &stream_id);
}
}

Summary

Events in EventCore are:

  • Immutable records of business facts
  • Time-ordered with UUIDv7 IDs
  • Version-controlled for consistency
  • Atomically written across streams
  • Rich with metadata for auditing

Best practices:

  1. Design events around business concepts
  2. Include all necessary data in events
  3. Plan for event evolution
  4. Use version control for consistency
  5. Optimize storage with partitioning

Next, let’s explore State Reconstruction

Chapter 3.3: State Reconstruction

State reconstruction is the heart of event sourcing - rebuilding current state by replaying historical events. EventCore makes this process efficient, type-safe, and predictable.

The Concept

Instead of storing current state in a database, event sourcing:

  1. Stores events - The facts about what happened
  2. Rebuilds state - By replaying events in order
  3. Guarantees consistency - Same events always produce same state

Think of it like a bank account:

  • Traditional: Store balance = $1000
  • Event Sourcing: Store deposits and withdrawals, calculate balance
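
To make the analogy concrete, here is a tiny, self-contained fold over ledger events, independent of EventCore's types:

#![allow(unused)]
fn main() {
enum LedgerEvent {
    Deposited { amount: u64 },
    Withdrawn { amount: u64 },
}

// The balance is never stored; it is always derived from the event history
fn balance(events: &[LedgerEvent]) -> u64 {
    events.iter().fold(0, |balance, event| match event {
        LedgerEvent::Deposited { amount } => balance + amount,
        LedgerEvent::Withdrawn { amount } => balance.saturating_sub(*amount),
    })
}
}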

How EventCore Reconstructs State

The Apply Function

Every command defines how events modify state:

#![allow(unused)]
fn main() {
impl CommandLogic for TransferMoney {
    type State = AccountState;
    type Event = BankEvent;
    
    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        match &event.payload {
            BankEvent::AccountOpened { initial_balance, owner } => {
                state.exists = true;
                state.balance = *initial_balance;
                state.owner = owner.clone();
                state.opened_at = event.occurred_at;
            }
            BankEvent::MoneyDeposited { amount, .. } => {
                state.balance += amount;
                state.transaction_count += 1;
                state.last_activity = event.occurred_at;
            }
            BankEvent::MoneyWithdrawn { amount, .. } => {
                state.balance = state.balance.saturating_sub(*amount);
                state.transaction_count += 1;
                state.last_activity = event.occurred_at;
            }
        }
    }
}
}

The Reconstruction Process

When a command executes, EventCore:

  1. Reads declared streams - Gets all events from specified streams
  2. Creates default state - Starts with State::default()
  3. Applies events in order - Calls apply() for each event
  4. Passes state to handle - Your business logic receives reconstructed state
#![allow(unused)]
fn main() {
// EventCore does this automatically:
let mut state = AccountState::default();
for event in events_from_streams {
    command.apply(&mut state, &event);
}
// Your handle() method receives the final state
}

State Design Patterns

Accumulator Pattern

Build up state incrementally:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct OrderState {
    exists: bool,
    items: Vec<OrderItem>,
    total: Money,
    status: OrderStatus,
    customer: Option<CustomerId>,
}

fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        OrderEvent::Created { customer_id } => {
            state.exists = true;
            state.customer = Some(*customer_id);
            state.status = OrderStatus::Draft;
        }
        OrderEvent::ItemAdded { item, price } => {
            state.items.push(item.clone());
            state.total += price;
        }
        OrderEvent::Placed { .. } => {
            state.status = OrderStatus::Placed;
        }
    }
}
}

Snapshot Pattern

For expensive computations, pre-calculate during apply:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct AnalyticsState {
    total_revenue: Money,
    transactions_by_day: HashMap<Date, Vec<TransactionSummary>>,
    customer_lifetime_values: HashMap<CustomerId, Money>,
    // Pre-computed aggregates
    daily_averages: HashMap<Date, Money>,
    top_customers: BTreeSet<(Money, CustomerId)>,
}

fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        AnalyticsEvent::Purchase { customer, amount, date } => {
            // Update raw data
            state.total_revenue += amount;
            state.transactions_by_day
                .entry(*date)
                .or_default()
                .push(TransactionSummary { customer: *customer, amount: *amount });
            
            // Update pre-computed values
            *state.customer_lifetime_values.entry(*customer).or_default() += amount;
            
            // Maintain sorted top customers
            state.top_customers.insert((*amount, *customer));
            if state.top_customers.len() > 100 {
                state.top_customers.pop_first();
            }
            
            // Recalculate daily average for this date
            let daily_total: Money = state.transactions_by_day[date]
                .iter()
                .map(|t| t.amount)
                .sum();
            let tx_count = state.transactions_by_day[date].len();
            state.daily_averages.insert(*date, daily_total / tx_count as u64);
        }
    }
}
}

State Machine Pattern

Track valid transitions:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct WorkflowState {
    current_phase: WorkflowPhase,
    completed_phases: HashSet<WorkflowPhase>,
    phase_durations: HashMap<WorkflowPhase, Duration>,
    last_transition: DateTime<Utc>,
}

fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        WorkflowEvent::PhaseCompleted { phase, started_at } => {
            // Record phase duration
            let duration = event.occurred_at - started_at;
            state.phase_durations.insert(*phase, duration);
            
            // Mark as completed
            state.completed_phases.insert(*phase);
            
            // Transition to next phase
            state.current_phase = phase.next_phase();
            state.last_transition = event.occurred_at;
        }
    }
}
}
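
The reconstructed WorkflowState can then back a transition guard in handle(). A minimal sketch, assuming the command knows which phase it is trying to complete (the check_transition helper is hypothetical):

#![allow(unused)]
fn main() {
use eventcore::require;

// Only the currently active phase may be completed, and never twice
fn check_transition(state: &WorkflowState, phase: &WorkflowPhase) -> CommandResult<()> {
    require!(
        &state.current_phase == phase,
        "Cannot complete an inactive phase"
    );
    require!(
        !state.completed_phases.contains(phase),
        "Phase already completed"
    );
    Ok(())
}
}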

Multi-Stream State Reconstruction

When commands read multiple streams, state combines data from all:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ProcessPayment {
    #[stream]
    order_id: StreamId,
    
    #[stream]
    customer_id: StreamId,
    
    #[stream]
    payment_method_id: StreamId,
    
    amount: Money,
}

#[derive(Default)]
struct PaymentState {
    // From order stream
    order: OrderInfo,
    
    // From customer stream  
    customer: CustomerInfo,
    customer_payment_history: Vec<PaymentRecord>,
    
    // From payment method stream
    payment_method: PaymentMethodInfo,
    recent_charges: Vec<ChargeAttempt>,
}

fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    // Events from different streams update different parts of state
    match (&event.stream_id, &event.payload) {
        (stream_id, PaymentEvent::Order(order_event)) 
            if stream_id == &self.order_id => {
            // Update order portion of state
            apply_order_event(&mut state.order, order_event);
        }
        (stream_id, PaymentEvent::Customer(customer_event)) 
            if stream_id == &self.customer_id => {
            // Update customer portion of state
            apply_customer_event(&mut state.customer, customer_event);
        }
        (stream_id, PaymentEvent::PaymentMethod(pm_event)) 
            if stream_id == &self.payment_method_id => {
            // Update payment method portion of state
            apply_payment_method_event(&mut state.payment_method, pm_event);
        }
        _ => {} // Ignore events from other streams
    }
}
}

Performance Optimization

Selective State Loading

Only reconstruct what you need:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct AccountState {
    // Core fields - always loaded
    exists: bool,
    balance: Money,
    status: AccountStatus,
    
    // Optional expensive data
    transaction_history: Option<Vec<Transaction>>,
    statistics: Option<AccountStatistics>,
}

fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    // Always update core fields
    match &event.payload {
        BankEvent::MoneyDeposited { amount, .. } => {
            state.balance += amount;
        }
        // ...
    }
    
    // Only build history if requested
    if state.transaction_history.is_some() {
        if let Some(tx) = event_to_transaction(&event) {
            state.transaction_history
                .as_mut()
                .unwrap()
                .push(tx);
        }
    }
}

// Note: handle() receives state that has already been reconstructed, so the
// "load history?" decision must be visible before events are applied - for
// example, carry a flag on the command and let apply() initialize the Option:
//
//     if self.requires_history() && state.transaction_history.is_none() {
//         state.transaction_history = Some(Vec::new());
//     }
//
// handle() then simply uses whatever was (or was not) loaded:
async fn handle(&self, /* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    if let Some(history) = state.transaction_history.as_ref() {
        // Use the detailed history for additional validation...
    }
    // ...
}
}

Event Filtering

Skip irrelevant events during reconstruction:

#![allow(unused)]
fn main() {
fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    // Skip old events for performance.
    // Caveat: using Utc::now() makes apply() time-dependent; if strict
    // determinism matters, derive the cutoff from data carried on the command.
    let cutoff_date = Utc::now() - Duration::days(90);
    if event.occurred_at < cutoff_date {
        return; // Skip events older than 90 days
    }
    
    match &event.payload {
        // Process only recent events
    }
}
}

Memoization

Cache expensive calculations:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct MemoizedState {
    balance: Money,
    // Cache expensive calculations
    #[serde(skip)]
    cached_risk_score: Option<(DateTime<Utc>, RiskScore)>,
}

impl MemoizedState {
    fn risk_score(&mut self) -> RiskScore {
        let now = Utc::now();
        
        // Check cache validity (1 hour)
        if let Some((cached_at, score)) = self.cached_risk_score {
            if now - cached_at < Duration::hours(1) {
                return score;
            }
        }
        
        // Calculate expensive risk score
        let score = calculate_risk_score(self);
        self.cached_risk_score = Some((now, score));
        score
    }
}
}

Testing State Reconstruction

Unit Testing Apply Functions

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    use eventcore::testing::builders::*;
    
    #[test]
    fn test_balance_calculation() {
        let command = TransferMoney { /* ... */ };
        let mut state = AccountState::default();
        
        // Create test events
        let events = vec![
            create_event(BankEvent::AccountOpened { 
                initial_balance: 1000,
                owner: "Alice".to_string(),
            }),
            create_event(BankEvent::MoneyDeposited { 
                amount: 500,
                reference: "Salary".to_string(),
            }),
            create_event(BankEvent::MoneyWithdrawn { 
                amount: 200,
                reference: "Rent".to_string(),
            }),
        ];
        
        // Apply events
        for event in events {
            command.apply(&mut state, &event);
        }
        
        // Verify final state
        assert_eq!(state.balance, 1300); // 1000 + 500 - 200
        assert_eq!(state.transaction_count, 2);
        assert!(state.exists);
    }
}
}

Property-Based Testing

#![allow(unused)]
fn main() {
use proptest::prelude::*;

proptest! {
    #[test]
    fn balance_never_negative_with_saturating_sub(
        deposits in prop::collection::vec(1..1000u64, 0..10),
        withdrawals in prop::collection::vec(1..2000u64, 0..20),
    ) {
        let command = TransferMoney { /* ... */ };
        let mut state = AccountState::default();
        
        // Open account
        let open_event = create_event(BankEvent::AccountOpened {
            initial_balance: 0,
            owner: "Test".to_string(),
        });
        command.apply(&mut state, &open_event);
        
        // Apply deposits
        for amount in deposits {
            let event = create_event(BankEvent::MoneyDeposited {
                amount,
                reference: "Deposit".to_string(),
            });
            command.apply(&mut state, &event);
        }
        
        // Apply withdrawals
        for amount in withdrawals {
            let event = create_event(BankEvent::MoneyWithdrawn {
                amount,
                reference: "Withdrawal".to_string(),
            });
            command.apply(&mut state, &event);
        }
        
        // Balance should never be negative due to saturating_sub
        prop_assert!(state.balance >= 0);
    }
}
}

Testing Event Order Independence

Some state calculations should be order-independent:

#![allow(unused)]
fn main() {
#[test]
fn test_commutative_operations() {
    let events = vec![
        create_tag_added_event("rust"),
        create_tag_added_event("async"),
        create_tag_added_event("eventstore"),
    ];
    
    // Apply in different orders
    let mut state1 = TagState::default();
    for event in &events {
        apply_tag_event(&mut state1, event);
    }
    
    let mut state2 = TagState::default();
    for event in events.iter().rev() {
        apply_tag_event(&mut state2, event);
    }
    
    // Final state should be the same
    assert_eq!(state1.tags, state2.tags);
}
}

Common Pitfalls and Solutions

1. Mutable External State

Wrong: Depending on external state

#![allow(unused)]
fn main() {
fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        OrderEvent::Created { .. } => {
            // DON'T DO THIS - external dependency!
            state.tax_rate = fetch_current_tax_rate();
        }
    }
}
}

Right: Store everything in events

#![allow(unused)]
fn main() {
fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        OrderEvent::Created { tax_rate, .. } => {
            // Tax rate was captured when event was created
            state.tax_rate = *tax_rate;
        }
    }
}
}

2. Non-Deterministic Operations

Wrong: Using current time

#![allow(unused)]
fn main() {
fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        OrderEvent::Created { .. } => {
            // DON'T DO THIS - non-deterministic!
            state.age_in_days = (Utc::now() - event.occurred_at).num_days();
        }
    }
}
}

Right: Calculate in handle() if needed

#![allow(unused)]
fn main() {
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    _stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Calculate age here, not in apply()
    let age_in_days = (Utc::now() - state.created_at).num_days();
    
    // Use for business logic...
}
}

3. Unbounded State Growth

Wrong: Keeping everything forever

#![allow(unused)]
fn main() {
fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        LogEvent::Entry { message } => {
            // DON'T DO THIS - unbounded growth!
            state.all_log_entries.push(message.clone());
        }
    }
}
}

Right: Keep bounded state

#![allow(unused)]
fn main() {
fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        LogEvent::Entry { message, level } => {
            // Keep only recent errors
            if *level == LogLevel::Error {
                state.recent_errors.push(message.clone());
                if state.recent_errors.len() > 100 {
                    state.recent_errors.remove(0);
                }
            }
            
            // Track counts instead of full data
            *state.entries_by_level.entry(*level).or_default() += 1;
        }
    }
}
}

Advanced Patterns

Temporal State

Track state changes over time:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct TemporalState {
    current_value: i32,
    history: BTreeMap<DateTime<Utc>, i32>,
    transitions: Vec<StateTransition>,
}

fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    let old_value = state.current_value;
    
    match &event.payload {
        ValueEvent::Changed { new_value } => {
            state.current_value = *new_value;
            state.history.insert(event.occurred_at, *new_value);
            state.transitions.push(StateTransition {
                at: event.occurred_at,
                from: old_value,
                to: *new_value,
                event_id: event.id,
            });
        }
    }
}

impl TemporalState {
    /// Get value at a specific point in time
    fn value_at(&self, timestamp: DateTime<Utc>) -> Option<i32> {
        self.history
            .range(..=timestamp)
            .next_back()
            .map(|(_, &value)| value)
    }
}
}
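
Point-in-time queries then become simple lookups against the reconstructed history. A small usage sketch (the value_yesterday helper is illustrative):

#![allow(unused)]
fn main() {
// What was the value 24 hours ago?
fn value_yesterday(state: &TemporalState) -> Option<i32> {
    let yesterday = Utc::now() - Duration::days(1);
    state.value_at(yesterday)
}
}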

Derived State

Calculate derived values efficiently:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct DerivedState {
    // Raw data
    orders: Vec<Order>,
    
    // Derived data (calculated in apply)
    total_revenue: Money,
    average_order_value: Option<Money>,
    orders_by_status: HashMap<OrderStatus, usize>,
}

fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
    match &event.payload {
        OrderEvent::Placed { order } => {
            // Update raw data
            state.orders.push(order.clone());
            
            // Update derived data incrementally
            state.total_revenue += order.total;
            state.average_order_value = Some(
                state.total_revenue / state.orders.len() as u64
            );
            *state.orders_by_status
                .entry(OrderStatus::Placed)
                .or_default() += 1;
        }
    }
}
}

Summary

State reconstruction in EventCore:

  • Deterministic - Same events always produce same state
  • Type-safe - State structure defined by types
  • Efficient - Only reconstruct what you need
  • Testable - Easy to verify with known events
  • Flexible - Support any state structure

Best practices:

  1. Keep apply() functions pure and deterministic (see the sketch after this list)
  2. Pre-calculate expensive derived data
  3. Design state for your command’s needs
  4. Test state reconstruction thoroughly
  5. Optimize for your access patterns
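
For practices 1 and 4, one quick check is to apply the same events twice and compare the results. A minimal sketch, assuming AccountState derives PartialEq and Debug:

#![allow(unused)]
fn main() {
use eventcore::testing::builders::*;

#[test]
fn apply_is_deterministic() {
    let command = TransferMoney { /* ... */ };
    let events = vec![
        create_event(BankEvent::AccountOpened {
            initial_balance: 1000,
            owner: "Alice".to_string(),
        }),
        create_event(BankEvent::MoneyDeposited {
            amount: 500,
            reference: "Salary".to_string(),
        }),
    ];

    // Replay the same events into two fresh states
    let mut first = AccountState::default();
    let mut second = AccountState::default();
    for event in &events {
        command.apply(&mut first, event);
        command.apply(&mut second, event);
    }

    // Same events, same state - apply() has no hidden inputs
    assert_eq!(first, second);
}
}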

Next, let’s explore Multi-Stream Atomicity

Chapter 3.4: Multi-Stream Atomicity

Multi-stream atomicity is EventCore’s key innovation. Traditional event sourcing forces you to choose aggregate boundaries upfront. EventCore lets each command define its own consistency boundary dynamically.

The Problem with Traditional Aggregates

In traditional event sourcing:

#![allow(unused)]
fn main() {
// Traditional approach - rigid boundaries
struct BankAccount {
    id: AccountId,
    balance: Money,
    // Can only modify THIS account atomically
}

// ❌ Cannot atomically transfer between accounts!
// Must use sagas, process managers, or eventual consistency
}

This leads to:

  • Complex workflows for operations spanning aggregates
  • Eventual consistency where immediate consistency is needed
  • Race conditions between related operations
  • Difficult refactoring when boundaries need to change

EventCore’s Solution

EventCore allows atomic operations across multiple streams:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct TransferMoney {
    #[stream]
    from_account: StreamId,   // Read and write this stream
    
    #[stream]
    to_account: StreamId,     // Read and write this stream too
    
    amount: Money,
}

// ✅ Both accounts updated atomically or not at all!
}

How It Works

1. Stream Declaration

Commands declare all streams they need:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ProcessOrder {
    #[stream]
    order: StreamId,
    
    #[stream]
    inventory: StreamId,
    
    #[stream]
    customer: StreamId,
    
    #[stream]
    payment: StreamId,
}
}

2. Atomic Read Phase

EventCore reads all declared streams with version tracking:

#![allow(unused)]
fn main() {
// EventCore does this internally:
let mut stream_data = HashMap::new();

for stream_id in command.read_streams() {
    let events = event_store.read_stream(&stream_id).await?;
    stream_data.insert(stream_id, StreamData {
        version: events.version,
        events: events.events,
    });
}
}

3. State Reconstruction

State is built from all streams:

#![allow(unused)]
fn main() {
let mut state = OrderProcessingState::default();

for (stream_id, data) in &stream_data {
    for event in &data.events {
        command.apply(&mut state, event);
    }
}
}

4. Command Execution

Your business logic runs with full state:

#![allow(unused)]
fn main() {
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Validate across all streams
    require!(state.order.is_valid(), "Invalid order");
    require!(state.inventory.has_stock(&self.items), "Insufficient stock");
    require!(state.customer.can_purchase(), "Customer not authorized");
    require!(state.payment.has_funds(self.total), "Insufficient funds");
    
    // Generate events for multiple streams
    Ok(vec![
        StreamWrite::new(&read_streams, self.order.clone(), 
            OrderEvent::Confirmed { /* ... */ })?,
        StreamWrite::new(&read_streams, self.inventory.clone(),
            InventoryEvent::Reserved { /* ... */ })?,
        StreamWrite::new(&read_streams, self.customer.clone(),
            CustomerEvent::OrderPlaced { /* ... */ })?,
        StreamWrite::new(&read_streams, self.payment.clone(),
            PaymentEvent::Charged { /* ... */ })?,
    ])
}
}

5. Atomic Write Phase

All events written atomically with version checks:

#![allow(unused)]
fn main() {
// EventCore ensures all-or-nothing write
event_store.write_events(vec![
    EventToWrite {
        stream_id: order_stream,
        payload: order_event,
        expected_version: ExpectedVersion::Exact(order_version),
    },
    EventToWrite {
        stream_id: inventory_stream,
        payload: inventory_event,
        expected_version: ExpectedVersion::Exact(inventory_version),
    },
    // ... more events
]).await?;
}

Consistency Guarantees

Version Checking

EventCore prevents concurrent modifications:

#![allow(unused)]
fn main() {
// Command A reads order v5, inventory v10
// Command B reads order v5, inventory v10

// Command A writes first - succeeds
// Order → v6, Inventory → v11

// Command B tries to write - FAILS
// Version conflict detected!
}

Automatic Retry

On version conflicts, EventCore:

  1. Re-reads all streams
  2. Rebuilds state with new events
  3. Re-executes command logic
  4. Attempts write again
#![allow(unused)]
fn main() {
// This happens automatically:
loop {
    let (state, versions) = read_and_build_state().await?;
    let events = command.handle(state).await?;
    
    match write_with_version_check(events, versions).await {
        Ok(_) => return Ok(()),
        Err(VersionConflict) => continue, // Retry
        Err(e) => return Err(e),
    }
}
}
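
In application code you normally just tune the retry budget rather than writing this loop yourself; the chaos-testing example later in this guide uses the same builder. A small sketch (the build_executor helper is illustrative):

#![allow(unused)]
fn main() {
// Give contended commands more retry headroom before giving up
fn build_executor(event_store: PostgresEventStore) -> CommandExecutor<PostgresEventStore> {
    CommandExecutor::new(event_store).with_max_retries(10)
}
}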

Dynamic Stream Discovery

Commands can discover additional streams during execution:

#![allow(unused)]
fn main() {
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Discover we need product streams based on order items
    let product_streams: Vec<StreamId> = state.order.items
        .iter()
        .map(|item| StreamId::from(format!("product-{}", item.product_id)))
        .collect();
    
    // Request these additional streams
    stream_resolver.add_streams(product_streams);
    
    // EventCore will re-execute with all streams
    Ok(vec![])
}
}

Real-World Examples

E-Commerce Checkout

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct CheckoutCart {
    #[stream]
    cart: StreamId,
    
    #[stream]
    customer: StreamId,
    
    #[stream]
    payment_method: StreamId,
    
    // Product streams discovered dynamically
}

async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Add product streams for inventory check
    let product_streams: Vec<StreamId> = state.cart.items
        .keys()
        .map(|id| StreamId::from(format!("product-{}", id)))
        .collect();
    
    stream_resolver.add_streams(product_streams);
    
    // Validate everything atomically
    for (product_id, quantity) in &state.cart.items {
        let product_state = &state.products[product_id];
        require!(
            product_state.available_stock >= *quantity,
            "Insufficient stock for product {}", product_id
        );
    }
    
    // Generate events for all affected streams
    let mut events = vec![
        // Convert cart to order
        StreamWrite::new(&read_streams, self.cart.clone(),
            CartEvent::CheckedOut { order_id })?,
            
        // Create order
        StreamWrite::new(&read_streams, order_stream,
            OrderEvent::Created { /* ... */ })?,
            
        // Charge payment
        StreamWrite::new(&read_streams, self.payment_method.clone(),
            PaymentEvent::Charged { amount: state.cart.total })?,
    ];
    
    // Reserve inventory from each product
    for (product_id, quantity) in &state.cart.items {
        let product_stream = StreamId::from(format!("product-{}", product_id));
        events.push(StreamWrite::new(&read_streams, product_stream,
            ProductEvent::StockReserved { quantity: *quantity })?);
    }
    
    Ok(events)
}
}

Distributed Ledger

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct RecordTransaction {
    #[stream]
    ledger: StreamId,
    
    #[stream]
    account_a: StreamId,
    
    #[stream]
    account_b: StreamId,
    
    entry: LedgerEntry,
}

async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Ensure double-entry bookkeeping consistency
    require!(
        self.entry.debits == self.entry.credits,
        "Debits must equal credits"
    );
    
    // Validate account states
    require!(
        state.account_a.is_active && state.account_b.is_active,
        "Both accounts must be active"
    );
    
    // Record atomically in all streams
    Ok(vec![
        StreamWrite::new(&read_streams, self.ledger.clone(),
            LedgerEvent::EntryRecorded { entry: self.entry.clone() })?,
            
        StreamWrite::new(&read_streams, self.account_a.clone(),
            AccountEvent::Debited { 
                amount: self.entry.debit_amount,
                reference: self.entry.id,
            })?,
            
        StreamWrite::new(&read_streams, self.account_b.clone(),
            AccountEvent::Credited {
                amount: self.entry.credit_amount,
                reference: self.entry.id,
            })?,
    ])
}
}

Workflow Orchestration

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct CompleteWorkflowStep {
    #[stream]
    workflow: StreamId,
    
    #[stream]
    current_step: StreamId,
    
    // Next step stream discovered dynamically
    
    step_result: StepResult,
}

async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Determine next step based on current state and result
    let next_step_id = match (&state.current_step.step_type, &self.step_result) {
        (StepType::Approval, StepResult::Approved) => state.workflow.next_step,
        (StepType::Approval, StepResult::Rejected) => state.workflow.rejection_step,
        (StepType::Processing, StepResult::Success) => state.workflow.next_step,
        (StepType::Processing, StepResult::Error) => state.workflow.error_step,
        _ => None,
    };
    
    // Add next step stream if needed
    if let Some(next_id) = next_step_id {
        let next_stream = StreamId::from(format!("step-{}", next_id));
        stream_resolver.add_streams(vec![next_stream.clone()]);
    }
    
    // Atomic update across workflow and steps
    let mut events = vec![
        StreamWrite::new(&read_streams, self.workflow.clone(),
            WorkflowEvent::StepCompleted {
                step_id: state.current_step.id,
                result: self.step_result.clone(),
            })?,
            
        StreamWrite::new(&read_streams, self.current_step.clone(),
            StepEvent::Completed {
                result: self.step_result.clone(),
            })?,
    ];
    
    // Activate next step
    if let Some(next_id) = next_step_id {
        let next_stream = StreamId::from(format!("step-{}", next_id));
        events.push(StreamWrite::new(&read_streams, next_stream,
            StepEvent::Activated {
                workflow_id: state.workflow.id,
                activation_time: Utc::now(),
            })?);
    }
    
    Ok(events)
}
}

Performance Considerations

Stream Count Impact

Reading more streams has costs:

#![allow(unused)]
fn main() {
// Benchmark results (example):
// 1 stream:    5ms   average latency
// 5 streams:   12ms  average latency  
// 10 streams:  25ms  average latency
// 50 streams:  150ms average latency

// Design commands to read only necessary streams
}

Optimization Strategies

  1. Stream Partitioning
#![allow(unused)]
fn main() {
// Instead of one hot stream
let stream = StreamId::from_static("orders");

// Partition by customer segment (hash the id into 16 buckets)
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

let mut hasher = DefaultHasher::new();
customer_id.hash(&mut hasher);
let stream = StreamId::from(format!("orders-{}", hasher.finish() % 16));
}
  2. Lazy Stream Loading
#![allow(unused)]
fn main() {
async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Only load detail streams if needed
    if state.requires_detailed_check() {
        let detail_streams = compute_detail_streams(&state);
        stream_resolver.add_streams(detail_streams);
    }
    
    // Continue with basic validation...
}
}
  3. Read Filtering
#![allow(unused)]
fn main() {
// EventCore may support filtered reads (future feature)
let options = ReadOptions::default()
    .from_version(EventVersion::new(1000))  // Skip old events
    .event_types(&["OrderPlaced", "OrderShipped"]); // Only specific types
}

Testing Multi-Stream Commands

Integration Tests

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_multi_stream_atomicity() {
    let store = InMemoryEventStore::<BankEvent>::new();
    let executor = CommandExecutor::new(store.clone());
    
    // Setup initial state
    create_account(&executor, "account-1", 1000).await;
    create_account(&executor, "account-2", 500).await;
    
    // Execute transfer
    let transfer = TransferMoney {
        from_account: StreamId::from_static("account-1"),
        to_account: StreamId::from_static("account-2"),
        amount: 300,
    };
    
    executor.execute(&transfer).await.unwrap();
    
    // Verify both accounts updated atomically
    let account1 = get_balance(&store, "account-1").await;
    let account2 = get_balance(&store, "account-2").await;
    
    assert_eq!(account1, 700);  // 1000 - 300
    assert_eq!(account2, 800);  // 500 + 300
    assert_eq!(account1 + account2, 1500); // Total preserved
}
}

Concurrent Modification Tests

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_concurrent_transfers() {
    let store = InMemoryEventStore::<BankEvent>::new();
    let executor = CommandExecutor::new(store);
    
    // Setup accounts
    create_account(&executor, "A", 1000).await;
    create_account(&executor, "B", 1000).await;
    create_account(&executor, "C", 1000).await;
    
    // Concurrent transfers forming a cycle
    let transfer_ab = TransferMoney {
        from_account: StreamId::from_static("A"),
        to_account: StreamId::from_static("B"),
        amount: 100,
    };
    
    let transfer_bc = TransferMoney {
        from_account: StreamId::from_static("B"),
        to_account: StreamId::from_static("C"),
        amount: 100,
    };
    
    let transfer_ca = TransferMoney {
        from_account: StreamId::from_static("C"),
        to_account: StreamId::from_static("A"),
        amount: 100,
    };
    
    // Execute concurrently
    let (r1, r2, r3) = tokio::join!(
        executor.execute(&transfer_ab),
        executor.execute(&transfer_bc),
        executor.execute(&transfer_ca),
    );
    
    // All should succeed (with retries)
    assert!(r1.is_ok());
    assert!(r2.is_ok());
    assert!(r3.is_ok());
    
    // Total balance preserved
    let total = get_balance(&store, "A").await +
                get_balance(&store, "B").await +
                get_balance(&store, "C").await;
    assert_eq!(total, 3000);
}
}

Common Patterns

Read-Only Streams

Some streams are read but not written:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct ValidateTransaction {
    #[stream]
    transaction: StreamId,
    
    #[stream]
    rules_engine: StreamId,  // Read-only for validation rules
    
    #[stream]
    fraud_history: StreamId, // Read-only for risk assessment
}

async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Use read-only streams for validation
    let risk_score = calculate_risk(&state.fraud_history);
    let applicable_rules = state.rules_engine.rules_for(&self.transaction);
    
    // Only write to transaction stream
    Ok(vec![
        StreamWrite::new(&read_streams, self.transaction.clone(),
            TransactionEvent::Validated { risk_score })?
    ])
}
}

Conditional Stream Writes

Write to streams based on business logic:

#![allow(unused)]
fn main() {
async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    let mut events = vec![];
    
    // Always update the main stream
    events.push(StreamWrite::new(&read_streams, self.order.clone(),
        OrderEvent::Processed { /* ... */ })?);
    
    // Conditionally update other streams
    if state.customer.is_vip {
        events.push(StreamWrite::new(&read_streams, self.customer.clone(),
            CustomerEvent::VipPointsEarned { points: calculate_points() })?);
    }
    
    if state.requires_fraud_check() {
        events.push(StreamWrite::new(&read_streams, fraud_stream,
            FraudEvent::CheckRequested { /* ... */ })?);
    }
    
    Ok(events)
}
}

Summary

Multi-stream atomicity in EventCore provides:

  • Dynamic boundaries - Each command defines its consistency needs
  • True atomicity - All streams updated together or not at all
  • Automatic retries - Handle concurrent modifications gracefully
  • Stream discovery - Add streams dynamically during execution
  • Type safety - Compile-time guarantees about stream access

Best practices:

  1. Declare minimal required streams upfront
  2. Use dynamic discovery for conditional streams
  3. Design for retry-ability (idempotent operations)
  4. Test concurrent scenarios thoroughly
  5. Monitor retry rates in production

Next, let’s explore Error Handling

Chapter 3.5: Error Handling

Error handling in EventCore is designed to be explicit, recoverable, and informative. This chapter covers error types, handling strategies, and best practices for building resilient event-sourced systems.

Error Philosophy

EventCore follows these principles:

  1. Errors are values - Use Result<T, E> everywhere
  2. Be specific - Different error types for different failures
  3. Fail fast - Validate early in the command pipeline
  4. Recover gracefully - Automatic retries for transient errors
  5. Provide context - Rich error messages for debugging

Error Types

Command Errors

The main error type for command execution:

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum CommandError {
    #[error("Validation failed: {0}")]
    ValidationFailed(String),
    
    #[error("Business rule violation: {0}")]
    BusinessRuleViolation(String),
    
    #[error("Stream not found: {0}")]
    StreamNotFound(StreamId),
    
    #[error("Concurrency conflict on streams: {0:?}")]
    ConcurrencyConflict(Vec<StreamId>),
    
    #[error("Event store error: {0}")]
    EventStore(#[from] EventStoreError),
    
    #[error("Serialization error: {0}")]
    Serialization(#[from] serde_json::Error),
    
    #[error("Maximum retries exceeded: {0}")]
    MaxRetriesExceeded(String),
}
}
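
Because commands return these errors as plain values, call sites branch on the variant instead of catching exceptions. A minimal sketch of one possible handling policy (the submit helper and its choices are illustrative):

#![allow(unused)]
fn main() {
// Hypothetical call site: treat the error as data and choose a response per variant
async fn submit(
    executor: &CommandExecutor<PostgresEventStore>,
    command: &TransferMoney,
) {
    match executor.execute(command).await {
        Ok(_) => {
            // All events were committed atomically
        }
        Err(CommandError::ValidationFailed(msg))
        | Err(CommandError::BusinessRuleViolation(msg)) => {
            // Caller error - surface it, do not retry
            eprintln!("rejected: {msg}");
        }
        Err(CommandError::ConcurrencyConflict(streams)) => {
            // EventCore already retried; note which streams kept conflicting
            eprintln!("gave up after conflicts on {streams:?}");
        }
        Err(other) => {
            // Infrastructure failure - log and alert
            eprintln!("command failed: {other}");
        }
    }
}
}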

Event Store Errors

Storage-specific errors:

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum EventStoreError {
    #[error("Version conflict in stream {stream_id}: expected {expected:?}, actual {actual}")]
    VersionConflict {
        stream_id: StreamId,
        expected: ExpectedVersion,
        actual: EventVersion,
    },
    
    #[error("Stream {0} not found")]
    StreamNotFound(StreamId),
    
    #[error("Database error: {0}")]
    Database(String),
    
    #[error("Connection error: {0}")]
    Connection(String),
    
    #[error("Timeout after {0:?}")]
    Timeout(Duration),
    
    #[error("Transaction rolled back: {0}")]
    TransactionRollback(String),
}
}

Validation Patterns

Using the require! Macro

The require! macro makes validation concise:

#![allow(unused)]
fn main() {
use eventcore::require;

async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    _stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Simple validation
    require!(self.amount > 0, "Amount must be positive");
    
    // Validation with formatting
    require!(
        state.balance >= self.amount,
        "Insufficient balance: have {}, need {}",
        state.balance,
        self.amount
    );
    
    // Complex validation
    require!(
        state.account.is_active && !state.account.is_frozen,
        "Account must be active and not frozen"
    );
    
    // Continue with business logic...
    Ok(vec![/* events */])
}
}

Custom Validation Functions

For complex validations:

#![allow(unused)]
fn main() {
impl TransferMoney {
    fn validate_business_rules(&self, state: &AccountState) -> CommandResult<()> {
        // Daily limit check
        self.validate_daily_limit(state)?;
        
        // Fraud check
        self.validate_fraud_rules(state)?;
        
        // Compliance check
        self.validate_compliance(state)?;
        
        Ok(())
    }
    
    fn validate_daily_limit(&self, state: &AccountState) -> CommandResult<()> {
        const DAILY_LIMIT: Money = Money::from_cents(50_000_00);
        
        let today_total = state.transfers_today() + self.amount;
        require!(
            today_total <= DAILY_LIMIT,
            "Daily transfer limit exceeded: {} > {}",
            today_total,
            DAILY_LIMIT
        );
        
        Ok(())
    }
}

// In handle()
async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Run all validations
    self.validate_business_rules(&state)?;
    
    // Generate events...
}
}

Type-Safe Validation

Use types to make invalid states unrepresentable:

#![allow(unused)]
fn main() {
use nutype::nutype;

// Email validation at type level
#[nutype(
    sanitize(lowercase, trim),
    validate(regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct Email(String);

// Money that can't be negative
#[nutype(
    validate(greater_or_equal = 0),
    derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)
)]
pub struct Money(u64);

// Now these validations happen at construction
let email = Email::try_new("invalid-email")?; // Fails at parse time
let amount = Money::try_new(-100)?; // Compile error - u64 can't be negative
}
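
Once the newtypes exist, use them directly as command fields so an invalid command cannot even be constructed. A sketch with a hypothetical RegisterCustomer command:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct RegisterCustomer {
    #[stream]
    customer: StreamId,

    // Already validated at construction - handle() never re-checks the format
    email: Email,

    // Non-negative by construction
    opening_balance: Money,
}
}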

Handling Transient Errors

Automatic Retries

EventCore automatically retries on version conflicts:

#![allow(unused)]
fn main() {
// This happens inside EventCore:
pub async fn execute_with_retry<C: Command>(
    command: &C,
    max_retries: usize,
) -> CommandResult<ExecutionResult> {
    let mut attempts = 0;
    
    loop {
        attempts += 1;
        
        match execute_once(command).await {
            Ok(result) => return Ok(result),
            
            Err(CommandError::ConcurrencyConflict(_)) if attempts < max_retries => {
                // Exponential backoff
                let delay = Duration::from_millis(100 * 2_u64.pow(attempts as u32));
                tokio::time::sleep(delay).await;
                continue;
            }
            
            Err(e) => return Err(e),
        }
    }
}
}

Circuit Breaker Pattern

Protect against cascading failures:

#![allow(unused)]
fn main() {
use std::sync::Arc;
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};
use std::time::{Duration, SystemTime, UNIX_EPOCH};

pub struct CircuitBreaker {
    failure_count: AtomicU32,
    last_failure_time: AtomicU64,
    threshold: u32,
    timeout: Duration,
}

impl CircuitBreaker {
    pub fn call<F, T, E>(&self, f: F) -> Result<T, CircuitBreakerError<E>>
    where
        F: FnOnce() -> Result<T, E>,
    {
        // Check if circuit is open
        if self.is_open() {
            return Err(CircuitBreakerError::Open);
        }
        
        // Try the operation
        match f() {
            Ok(result) => {
                self.on_success();
                Ok(result)
            }
            Err(e) => {
                self.on_failure();
                Err(CircuitBreakerError::Failed(e))
            }
        }
    }
    
    fn is_open(&self) -> bool {
        let failures = self.failure_count.load(Ordering::Relaxed);
        if failures >= self.threshold {
            let last_failure = self.last_failure_time.load(Ordering::Relaxed);
            let elapsed = Duration::from_millis(
                SystemTime::now()
                    .duration_since(UNIX_EPOCH)
                    .unwrap()
                    .as_millis() as u64 - last_failure
            );
            elapsed < self.timeout
        } else {
            false
        }
    }
}

// Usage in the event store: the read is async, so instead of passing an
// async body to the synchronous `call` helper above, check the breaker and
// record the outcome around the awaited call.
impl PostgresEventStore {
    pub async fn read_stream_with_circuit_breaker(
        &self,
        stream_id: &StreamId,
    ) -> Result<StreamEvents, EventStoreError> {
        if self.circuit_breaker.is_open() {
            return Err(EventStoreError::Connection(
                "circuit breaker open".to_string(),
            ));
        }

        match self.read_stream_internal(stream_id).await {
            Ok(events) => {
                self.circuit_breaker.on_success();
                Ok(events)
            }
            Err(e) => {
                self.circuit_breaker.on_failure();
                Err(e)
            }
        }
    }
}
}

Error Recovery Strategies

Compensating Commands

When things go wrong, emit compensating events:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct RefundPayment {
    #[stream]
    payment: StreamId,
    
    #[stream]
    account: StreamId,
    
    reason: RefundReason,
}

async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Validate refund is possible
    require!(
        state.payment.status == PaymentStatus::Completed,
        "Can only refund completed payments"
    );
    
    require!(
        !state.payment.is_refunded,
        "Payment already refunded"
    );
    
    // Compensating events
    Ok(vec![
        StreamWrite::new(&read_streams, self.payment.clone(),
            PaymentEvent::Refunded {
                amount: state.payment.amount,
                reason: self.reason.clone(),
            })?,
            
        StreamWrite::new(&read_streams, self.account.clone(),
            AccountEvent::Credited {
                amount: state.payment.amount,
                reference: format!("Refund for payment {}", state.payment.id),
            })?,
    ])
}
}

Dead Letter Queues

Handle permanently failed commands:

#![allow(unused)]
fn main() {
pub struct DeadLetterQueue<C: Command> {
    failed_commands: Vec<FailedCommand<C>>,
}

#[derive(Debug)]
pub struct FailedCommand<C> {
    pub command: C,
    pub error: CommandError,
    pub attempts: usize,
    pub first_attempted: DateTime<Utc>,
    pub last_attempted: DateTime<Utc>,
}

impl CommandExecutor<PostgresEventStore> {
    pub async fn execute_with_dlq<C: Command>(
        &self,
        command: C,
        dlq: &mut DeadLetterQueue<C>,
    ) -> CommandResult<ExecutionResult> {
        match self.execute_with_retry(&command, 5).await {
            Ok(result) => Ok(result),
            Err(e) if e.is_permanent() => {
                // Add to DLQ for manual intervention
                dlq.add(FailedCommand {
                    command,
                    error: e.clone(),
                    attempts: 5,
                    first_attempted: Utc::now(),
                    last_attempted: Utc::now(),
                });
                Err(e)
            }
            Err(e) => Err(e),
        }
    }
}
}
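
Operationally, the queue is drained by an operator or a scheduled job once the underlying fault is fixed. A sketch of such a pass, assuming the queue exposes its entries as shown above (the drain_dlq helper is illustrative):

#![allow(unused)]
fn main() {
async fn drain_dlq<C: Command>(
    executor: &CommandExecutor<PostgresEventStore>,
    dlq: &DeadLetterQueue<C>,
) {
    for failed in &dlq.failed_commands {
        tracing::warn!(
            attempts = failed.attempts,
            error = %failed.error,
            "re-submitting dead-lettered command"
        );

        // Re-submit now that the underlying issue is resolved
        if let Err(e) = executor.execute(&failed.command).await {
            tracing::error!(error = %e, "command still failing; leaving it in the DLQ");
        }
    }
}
}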

Error Context and Debugging

Rich Error Context

Add context to errors:

#![allow(unused)]
fn main() {
use std::fmt;

#[derive(Debug)]
pub struct ErrorContext {
    pub command_type: &'static str,
    pub stream_ids: Vec<StreamId>,
    pub correlation_id: CorrelationId,
    pub user_id: Option<UserId>,
    pub additional_context: HashMap<String, String>,
}

impl fmt::Display for ErrorContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Command: {}, Streams: {:?}, Correlation: {}", 
            self.command_type,
            self.stream_ids,
            self.correlation_id
        )?;
        
        if let Some(user) = &self.user_id {
            write!(f, ", User: {}", user)?;
        }
        
        for (key, value) in &self.additional_context {
            write!(f, ", {}: {}", key, value)?;
        }
        
        Ok(())
    }
}

// Wrap errors with context
pub type ContextualResult<T> = Result<T, ContextualError>;

#[derive(Debug, thiserror::Error)]
#[error("{context}\nError: {source}")]
pub struct ContextualError {
    #[source]
    source: CommandError,
    context: ErrorContext,
}
}
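
Wrapping happens at the edge where a command is executed. A sketch of attaching context to a failure, assuming execute returns a CommandResult<ExecutionResult> as in the retry example earlier (the execute_transfer helper is illustrative):

#![allow(unused)]
fn main() {
async fn execute_transfer(
    executor: &CommandExecutor<PostgresEventStore>,
    command: &TransferMoney,
    correlation_id: CorrelationId,
) -> ContextualResult<ExecutionResult> {
    executor.execute(command).await.map_err(|source| ContextualError {
        source,
        context: ErrorContext {
            command_type: "TransferMoney",
            stream_ids: vec![command.from_account.clone(), command.to_account.clone()],
            correlation_id,
            user_id: None,
            additional_context: HashMap::new(),
        },
    })
}
}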

Structured Logging

Log errors with full context:

#![allow(unused)]
fn main() {
use tracing::{error, warn, info, instrument};

#[instrument(skip(self, read_streams, state, stream_resolver))]
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver,
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    info!(
        amount = %self.amount,
        from = %self.from_account,
        to = %self.to_account,
        "Processing transfer"
    );
    
    if let Err(e) = self.validate_business_rules(&state) {
        error!(
            error = %e,
            balance = %state.balance,
            daily_total = %state.transfers_today(),
            "Transfer validation failed"
        );
        return Err(e);
    }
    
    // Continue...
}
}

Testing Error Scenarios

Unit Tests for Validation

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    #[tokio::test]
    async fn test_insufficient_balance_error() {
        let command = TransferMoney {
            from_account: StreamId::from_static("account-1"),
            to_account: StreamId::from_static("account-2"),
            amount: Money::from_cents(1000),
        };
        
        let state = AccountState {
            balance: Money::from_cents(500),
            ..Default::default()
        };
        
        let result = command.validate_business_rules(&state);
        
        assert!(matches!(
            result,
            Err(CommandError::ValidationFailed(msg)) if msg.contains("Insufficient balance")
        ));
    }
    
    #[tokio::test]
    async fn test_daily_limit_exceeded() {
        let command = TransferMoney {
            from_account: StreamId::from_static("account-1"),
            to_account: StreamId::from_static("account-2"),
            amount: Money::from_cents(10_000),
        };
        
        let mut state = AccountState::default();
        state.add_todays_transfer(Money::from_cents(45_000));
        
        let result = command.validate_business_rules(&state);
        
        assert!(matches!(
            result,
            Err(CommandError::BusinessRuleViolation(msg)) if msg.contains("Daily transfer limit")
        ));
    }
}
}

Integration Tests for Concurrency

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_concurrent_modification_handling() {
    let store = InMemoryEventStore::new();
    let executor = CommandExecutor::new(store);
    
    // Setup
    create_account(&executor, "account-1", 1000).await;
    
    // Create two conflicting commands
    let withdraw1 = WithdrawMoney {
        account: StreamId::from_static("account-1"),
        amount: Money::from_cents(600),
    };
    
    let withdraw2 = WithdrawMoney {
        account: StreamId::from_static("account-1"),
        amount: Money::from_cents(700),
    };
    
    // Execute concurrently
    let (result1, result2) = tokio::join!(
        executor.execute(&withdraw1),
        executor.execute(&withdraw2)
    );
    
    // One should succeed, one should fail due to insufficient funds after retry
    let successes = [&result1, &result2]
        .iter()
        .filter(|r| r.is_ok())
        .count();
    
    assert_eq!(successes, 1, "Exactly one withdrawal should succeed");
    
    // Check final balance
    let balance = get_account_balance(&store, "account-1").await;
    assert!(balance == 400 || balance == 300); // 1000 - 600 or 1000 - 700
}
}

Chaos Testing

#![allow(unused)]
fn main() {
use eventcore::testing::chaos::ChaosConfig;

#[tokio::test]
async fn test_resilience_under_chaos() {
    let base_store = InMemoryEventStore::new();
    let chaos_store = base_store.with_chaos(ChaosConfig {
        failure_probability: 0.1,  // 10% chance of failure
        latency_ms: Some(50..200), // Random latency
        version_conflict_probability: 0.2, // 20% chance of conflicts
    });
    
    let executor = CommandExecutor::new(chaos_store)
        .with_max_retries(10);
    
    // Run many operations
    let mut handles = vec![];
    for i in 0..100 {
        let executor = executor.clone();
        let handle = tokio::spawn(async move {
            let command = CreateTask {
                title: format!("Task {}", i),
                // ...
            };
            executor.execute(&command).await
        });
        handles.push(handle);
    }
    
    // Collect results
    let results: Vec<_> = futures::future::join_all(handles).await;
    
    // Despite chaos, most should succeed due to retries
    let success_rate = results.iter()
        .filter(|r| r.as_ref().unwrap().is_ok())
        .count() as f64 / results.len() as f64;
    
    assert!(success_rate > 0.95, "Success rate too low: {}", success_rate);
}
}

Production Error Handling

Monitoring and Alerting

#![allow(unused)]
fn main() {
use prometheus::{Counter, Histogram, register_counter, register_histogram};

lazy_static! {
    static ref COMMAND_ERRORS: Counter = register_counter!(
        "eventcore_command_errors_total",
        "Total number of command errors"
    ).unwrap();
    
    static ref RETRY_COUNT: Histogram = register_histogram!(
        "eventcore_command_retries",
        "Number of retries per command"
    ).unwrap();
}

impl CommandExecutor {
    async fn execute_with_metrics(&self, command: &impl Command) -> CommandResult<ExecutionResult> {
        let start = Instant::now();
        let mut retries = 0;
        
        loop {
            match self.execute_once(command).await {
                Ok(result) => {
                    RETRY_COUNT.observe(retries as f64);
                    return Ok(result);
                }
                Err(e) => {
                    COMMAND_ERRORS.inc();
                    
                    if e.is_retriable() && retries < self.max_retries {
                        retries += 1;
                        continue;
                    }
                    
                    return Err(e);
                }
            }
        }
    }
}
}

Error Recovery Procedures

Document recovery procedures:

#![allow(unused)]
fn main() {
/// Recovery procedure for payment processing failures
/// 
/// 1. Check payment provider status
/// 2. Verify account balances match event history
/// 3. Look for orphaned payments in provider but not in events
/// 4. Run reconciliation command if discrepancies found
/// 5. Contact support if automated recovery fails
#[derive(Command, Clone)]
struct ReconcilePayments {
    #[stream]
    payment_provider: StreamId,
    
    #[stream]
    reconciliation_log: StreamId,
    
    provider_transactions: Vec<ProviderTransaction>,
}
}

Best Practices

1. Fail Fast

Validate as early as possible:

#![allow(unused)]
fn main() {
// ✅ Good - validate at construction
impl TransferMoney {
    pub fn new(
        from: StreamId,
        to: StreamId,
        amount: Money,
    ) -> Result<Self, ValidationError> {
        if from == to {
            return Err(ValidationError::SameAccount);
        }
        
        Ok(Self {
            from_account: from,
            to_account: to,
            amount,
        })
    }
}

// ❌ Bad - validate late in handle()
}

2. Be Specific

Use specific error types:

#![allow(unused)]
fn main() {
// ✅ Good - specific errors
#[derive(Debug, thiserror::Error)]
pub enum TransferError {
    #[error("Insufficient balance: available {available}, requested {requested}")]
    InsufficientBalance { available: Money, requested: Money },
    
    #[error("Daily limit exceeded: limit {limit}, attempted {attempted}")]
    DailyLimitExceeded { limit: Money, attempted: Money },
    
    #[error("Account {0} is frozen")]
    AccountFrozen(AccountId),
}

// ❌ Bad - generic errors
Err("Transfer failed".into())
}

3. Make Errors Actionable

Provide enough context to fix issues:

#![allow(unused)]
fn main() {
// ✅ Good - actionable error
require!(
    state.account.kyc_verified,
    "Account KYC verification required. Please complete verification at: https://example.com/kyc/{}", 
    state.account.id
);

// ❌ Bad - vague error
require!(state.account.kyc_verified, "KYC required");
}

Summary

Error handling in EventCore:

  • Type-safe - Errors encoded in function signatures
  • Recoverable - Automatic retries for transient failures
  • Informative - Rich context for debugging
  • Testable - Easy to test error scenarios
  • Production-ready - Monitoring and recovery built-in

Best practices:

  1. Use require! macro for concise validation
  2. Create specific error types for your domain
  3. Add context to errors for debugging
  4. Test error scenarios thoroughly
  5. Monitor errors in production

You’ve completed Part 3! Continue to Part 4: Building Web APIs

Part 4: Building Web APIs

This part shows how to expose your EventCore application through HTTP APIs. We’ll cover command handlers, query endpoints, authentication, and API design best practices.

Chapters in This Part

  1. Setting Up HTTP Endpoints - Web framework integration
  2. Command Handlers - Exposing commands via HTTP
  3. Query Endpoints - Building read APIs with projections
  4. Authentication and Authorization - Securing your API
  5. API Versioning - Evolving APIs without breaking clients

What You’ll Learn

  • Integrate EventCore with popular Rust web frameworks
  • Design RESTful and GraphQL APIs for event-sourced systems
  • Handle authentication and authorization
  • Build efficient query endpoints using projections
  • Version your API as your system evolves

Prerequisites

  • Completed Part 3: Core Concepts
  • Basic understanding of HTTP and REST APIs
  • Familiarity with at least one Rust web framework helpful

Framework Examples

This part includes examples for:

  • Axum - Modern, ergonomic web framework
  • Actix Web - High-performance actor-based framework
  • Rocket - Type-safe, developer-friendly framework

Time to Complete

  • Reading: ~45 minutes
  • With implementation: ~2 hours

Ready to build APIs? Let’s start with Setting Up HTTP Endpoints

Chapter 4.1: Setting Up HTTP Endpoints

EventCore is framework-agnostic - you can use it with any Rust web framework. This chapter shows how to integrate EventCore with popular frameworks and structure your API.

Architecture Overview

HTTP Request → Web Framework → Command/Query → EventCore → Response

Your web layer should be thin, focusing on:

  1. Request parsing - Convert HTTP to domain types
  2. Authentication - Verify caller identity
  3. Authorization - Check permissions
  4. Command/Query execution - Delegate to EventCore
  5. Response formatting - Convert results to HTTP

Axum Integration

Axum is a modern web framework that pairs well with EventCore:

Setup

[dependencies]
eventcore = "1.0"
axum = "0.7"
tokio = { version = "1", features = ["full"] }
tower = "0.4"
serde = { version = "1", features = ["derive"] }
serde_json = "1"

Basic Application Structure

use axum::{
    extract::{State, Json},
    http::StatusCode,
    response::IntoResponse,
    routing::{get, post},
    Router,
};
use eventcore::prelude::*;
use std::sync::Arc;
use tokio::sync::RwLock;

// Application state shared across handlers
#[derive(Clone)]
struct AppState {
    executor: CommandExecutor<PostgresEventStore>,
    projections: Arc<RwLock<ProjectionManager>>,
}

#[tokio::main]
async fn main() {
    // Initialize EventCore
    let event_store = PostgresEventStore::new(
        "postgresql://localhost/eventcore"
    ).await.unwrap();
    
    let executor = CommandExecutor::new(event_store);
    let projections = Arc::new(RwLock::new(ProjectionManager::new()));
    
    let state = AppState {
        executor,
        projections,
    };
    
    // Build routes
    let app = Router::new()
        .route("/api/v1/tasks", post(create_task))
        .route("/api/v1/tasks/:id", get(get_task))
        .route("/api/v1/tasks/:id/assign", post(assign_task))
        .route("/api/v1/tasks/:id/complete", post(complete_task))
        .route("/api/v1/users/:id/tasks", get(get_user_tasks))
        .route("/health", get(health_check))
        .with_state(state);
    
    // Start server
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000")
        .await
        .unwrap();
        
    axum::serve(listener, app).await.unwrap();
}

Command Handler Example

#![allow(unused)]
fn main() {
#[derive(Debug, Deserialize)]
struct CreateTaskRequest {
    title: String,
    description: String,
}

#[derive(Debug, Serialize)]
struct CreateTaskResponse {
    task_id: String,
    message: String,
}

async fn create_task(
    State(state): State<AppState>,
    Json(request): Json<CreateTaskRequest>,
) -> Result<Json<CreateTaskResponse>, ApiError> {
    // Validate input
    let title = TaskTitle::try_new(request.title)
        .map_err(|e| ApiError::validation(e))?;
    let description = TaskDescription::try_new(request.description)
        .map_err(|e| ApiError::validation(e))?;
    
    // Create command
    let task_id = TaskId::new();
    let command = CreateTask {
        task_id: StreamId::from(format!("task-{}", task_id)),
        title,
        description,
    };
    
    // Execute command
    state.executor
        .execute(&command)
        .await
        .map_err(|e| ApiError::from_command_error(e))?;
    
    // Return response
    Ok(Json(CreateTaskResponse {
        task_id: task_id.to_string(),
        message: "Task created successfully".to_string(),
    }))
}
}

Error Handling

#![allow(unused)]
fn main() {
#[derive(Debug)]
struct ApiError {
    status: StatusCode,
    message: String,
    details: Option<serde_json::Value>,
}

impl ApiError {
    fn validation<E: std::error::Error>(error: E) -> Self {
        Self {
            status: StatusCode::BAD_REQUEST,
            message: error.to_string(),
            details: None,
        }
    }
    
    fn from_command_error(error: CommandError) -> Self {
        match error {
            CommandError::ValidationFailed(msg) => Self {
                status: StatusCode::BAD_REQUEST,
                message: msg,
                details: None,
            },
            CommandError::BusinessRuleViolation(msg) => Self {
                status: StatusCode::UNPROCESSABLE_ENTITY,
                message: msg,
                details: None,
            },
            CommandError::StreamNotFound(_) => Self {
                status: StatusCode::NOT_FOUND,
                message: "Resource not found".to_string(),
                details: None,
            },
            CommandError::ConcurrencyConflict(_) => Self {
                status: StatusCode::CONFLICT,
                message: "Resource was modified by another request".to_string(),
                details: None,
            },
            _ => Self {
                status: StatusCode::INTERNAL_SERVER_ERROR,
                message: "An internal error occurred".to_string(),
                details: None,
            },
        }
    }
}

impl IntoResponse for ApiError {
    fn into_response(self) -> axum::response::Response {
        let body = serde_json::json!({
            "error": {
                "message": self.message,
                "details": self.details,
            }
        });
        
        (self.status, Json(body)).into_response()
    }
}
}

Actix Web Integration

Actix Web offers high performance and an actor-based architecture:

Setup

[dependencies]
eventcore = "1.0"
actix-web = "4"
actix-rt = "2"

Application Structure

use actix_web::{web, App, HttpServer, HttpResponse, Result};
use eventcore::prelude::*;

struct AppData {
    executor: CommandExecutor<PostgresEventStore>,
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    let event_store = PostgresEventStore::new(
        "postgresql://localhost/eventcore"
    ).await.unwrap();
    
    let app_data = web::Data::new(AppData {
        executor: CommandExecutor::new(event_store),
    });
    
    HttpServer::new(move || {
        App::new()
            .app_data(app_data.clone())
            .service(
                web::scope("/api/v1")
                    .route("/tasks", web::post().to(create_task))
                    .route("/tasks/{id}", web::get().to(get_task))
                    .route("/tasks/{id}/assign", web::post().to(assign_task))
            )
    })
    .bind("127.0.0.1:8080")?
    .run()
    .await
}

async fn create_task(
    data: web::Data<AppData>,
    request: web::Json<CreateTaskRequest>,
) -> Result<HttpResponse> {
    // Validate, build the command, and execute it exactly as in the Axum example
    Ok(HttpResponse::Created().json(CreateTaskResponse {
        task_id: "...".to_string(),
        message: "...".to_string(),
    }))
}
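
Actix Web maps errors through its ResponseError trait rather than Axum's IntoResponse. A minimal sketch that reuses the ApiError shape from the Axum section (it assumes ApiError also implements std::fmt::Display, which ResponseError requires, and that the stored status code is the same http StatusCode Actix uses):

#![allow(unused)]
fn main() {
use actix_web::{http::StatusCode, HttpResponse, ResponseError};

impl ResponseError for ApiError {
    fn status_code(&self) -> StatusCode {
        self.status
    }

    fn error_response(&self) -> HttpResponse {
        // Mirror the JSON error body used in the Axum integration
        HttpResponse::build(self.status_code()).json(serde_json::json!({
            "error": {
                "message": self.message,
                "details": self.details,
            }
        }))
    }
}
}

With this in place, Actix handlers can return Result<HttpResponse, ApiError> and the error body is rendered consistently with the Axum version.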

Rocket Integration

Rocket provides a declarative, type-safe approach:

Setup

[dependencies]
eventcore = "1.0"
rocket = { version = "0.5", features = ["json"] }

Application Structure

#![allow(unused)]
fn main() {
use rocket::{State, serde::json::Json};
use eventcore::prelude::*;

struct AppState {
    executor: CommandExecutor<PostgresEventStore>,
}

#[rocket::post("/tasks", data = "<request>")]
async fn create_task(
    state: &State<AppState>,
    request: Json<CreateTaskRequest>,
) -> Result<Json<CreateTaskResponse>, ApiError> {
    // Implementation similar to the Axum handler
    todo!()
}

#[rocket::launch]
fn rocket() -> _ {
    let event_store = /* initialize */;
    
    rocket::build()
        .manage(AppState {
            executor: CommandExecutor::new(event_store),
        })
        .mount("/api/v1", rocket::routes![
            create_task,
            get_task,
            assign_task,
        ])
}
}

Request/Response Design

Command Requests

Design your API requests to map cleanly to commands:

#![allow(unused)]
fn main() {
// HTTP Request
#[derive(Deserialize)]
struct TransferMoneyRequest {
    from_account: String,
    to_account: String,
    amount: Decimal,
    reference: Option<String>,
}

// Convert to command
impl TryFrom<TransferMoneyRequest> for TransferMoney {
    type Error = ValidationError;
    
    fn try_from(req: TransferMoneyRequest) -> Result<Self, Self::Error> {
        Ok(TransferMoney {
            from_account: StreamId::try_new(req.from_account)?,
            to_account: StreamId::try_new(req.to_account)?,
            amount: Money::try_from_decimal(req.amount)?,
            reference: req.reference.unwrap_or_default(),
        })
    }
}
}
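
In a handler, the conversion then becomes a single fallible step before execution. A minimal sketch (it assumes ValidationError implements std::error::Error so it can feed ApiError::validation, and reuses the executor and error type from earlier examples):

#![allow(unused)]
fn main() {
use axum::http::StatusCode;

async fn transfer_money(
    State(state): State<AppState>,
    Json(request): Json<TransferMoneyRequest>,
) -> Result<StatusCode, ApiError> {
    // All parsing and validation lives in TryFrom, keeping the handler thin.
    let command = TransferMoney::try_from(request)
        .map_err(ApiError::validation)?;

    state.executor
        .execute(&command)
        .await
        .map_err(ApiError::from_command_error)?;

    Ok(StatusCode::OK)
}
}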

Response Design

Return minimal, useful information:

#![allow(unused)]
fn main() {
#[derive(Serialize)]
#[serde(tag = "status")]
enum CommandResponse {
    #[serde(rename = "success")]
    Success {
        message: String,
        #[serde(skip_serializing_if = "Option::is_none")]
        resource_id: Option<String>,
        #[serde(skip_serializing_if = "Option::is_none")]
        resource_url: Option<String>,
    },
    #[serde(rename = "accepted")]
    Accepted {
        message: String,
        tracking_id: String,
    },
}
}
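
With the internally tagged representation, each variant serializes to a compact JSON object with a status discriminator and only the populated optional fields:

#![allow(unused)]
fn main() {
let response = CommandResponse::Success {
    message: "Task created".to_string(),
    resource_id: Some("task-123".to_string()),
    resource_url: None,
};

// Prints: {"status":"success","message":"Task created","resource_id":"task-123"}
println!("{}", serde_json::to_string(&response).unwrap());
}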

Middleware and Interceptors

Request ID Middleware

Track requests through your system:

#![allow(unused)]
fn main() {
use axum::middleware::{self, Next};
use axum::extract::Request;
use uuid::Uuid;

// Newtype stored in request extensions so downstream handlers can read the ID.
#[derive(Debug, Clone)]
struct RequestId(String);

async fn request_id_middleware(
    mut request: Request,
    next: Next,
) -> impl IntoResponse {
    let request_id = Uuid::new_v4().to_string();
    
    // Add to request extensions
    request.extensions_mut().insert(RequestId(request_id.clone()));
    
    // Add to response headers
    let mut response = next.run(request).await;
    response.headers_mut().insert(
        "X-Request-ID",
        request_id.parse().unwrap(),
    );
    
    response
}

// Use in router
let app = Router::new()
    .route("/api/v1/tasks", post(create_task))
    .layer(middleware::from_fn(request_id_middleware));
}
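
Downstream handlers (or other middleware) can read the ID back out of the request extensions, for example to tag log lines. A sketch using Axum's Extension extractor with a hypothetical handler:

#![allow(unused)]
fn main() {
use axum::Extension;

async fn create_task_traced(
    Extension(RequestId(request_id)): Extension<RequestId>,
    State(state): State<AppState>,
    Json(request): Json<CreateTaskRequest>,
) -> Result<Json<CreateTaskResponse>, ApiError> {
    tracing::info!(request_id = %request_id, "Creating task");
    // ... delegate to the same logic as create_task above
    todo!()
}
}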

Timing Middleware

Monitor performance:

#![allow(unused)]
fn main() {
use std::time::Instant;

async fn timing_middleware(
    request: Request,
    next: Next,
) -> impl IntoResponse {
    let start = Instant::now();
    let path = request.uri().path().to_owned();
    let method = request.method().clone();
    
    let response = next.run(request).await;
    
    let duration = start.elapsed();
    tracing::info!(
        method = %method,
        path = %path,
        duration_ms = %duration.as_millis(),
        status = %response.status(),
        "Request completed"
    );
    
    response
}
}

Configuration

Use environment variables for configuration:

#![allow(unused)]
fn main() {
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Config {
    #[serde(default = "default_port")]
    port: u16,
    
    #[serde(default = "default_host")]
    host: String,
    
    database_url: String,
    
    #[serde(default = "default_max_connections")]
    max_connections: u32,
}

fn default_port() -> u16 { 3000 }
fn default_host() -> String { "0.0.0.0".to_string() }
fn default_max_connections() -> u32 { 20 }

impl Config {
    fn from_env() -> Result<Self, config::ConfigError> {
        let mut cfg = config::Config::default();
        
        // Load from environment
        cfg.merge(config::Environment::default())?;
        
        // Load from file if exists
        if std::path::Path::new("config.toml").exists() {
            cfg.merge(config::File::with_name("config"))?;
        }
        
        cfg.try_into()
    }
}
}
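
Wiring the configuration into startup might look like this (a sketch: it reuses the AppState shape from the test helper later in this chapter, assumes PostgresEventStore::new takes a connection string as in the earlier examples, and keeps error handling minimal):

#![allow(unused)]
fn main() {
use std::sync::Arc;
use axum::{routing::post, Router};
use tokio::sync::RwLock;

async fn run() -> Result<(), Box<dyn std::error::Error>> {
    let config = Config::from_env()?;

    let event_store = PostgresEventStore::new(config.database_url.as_str()).await?;

    let state = AppState {
        executor: CommandExecutor::new(event_store),
        projections: Arc::new(RwLock::new(ProjectionManager::new())),
    };

    let app = Router::new()
        .route("/api/v1/tasks", post(create_task))
        .with_state(state);

    let listener = tokio::net::TcpListener::bind((config.host.as_str(), config.port)).await?;
    axum::serve(listener, app).await?;
    Ok(())
}
}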

Health Checks

Expose system health:

#![allow(unused)]
fn main() {
#[derive(Serialize)]
struct HealthResponse {
    status: HealthStatus,
    version: &'static str,
    checks: HashMap<String, CheckResult>,
}

#[derive(Serialize)]
#[serde(rename_all = "lowercase")]
enum HealthStatus {
    Healthy,
    Degraded,
    Unhealthy,
}

async fn health_check(State(state): State<AppState>) -> Json<HealthResponse> {
    let mut checks = HashMap::new();
    
    // Check event store
    match state.executor.event_store().health_check().await {
        Ok(_) => checks.insert("event_store".to_string(), CheckResult::healthy()),
        Err(e) => checks.insert("event_store".to_string(), CheckResult::unhealthy(e)),
    };
    
    // Check projections
    let projections = state.projections.read().await;
    for (name, health) in projections.health_status() {
        checks.insert(format!("projection_{}", name), health);
    }
    
    // Overall status
    let status = if checks.values().all(|c| c.is_healthy()) {
        HealthStatus::Healthy
    } else if checks.values().any(|c| c.is_unhealthy()) {
        HealthStatus::Unhealthy
    } else {
        HealthStatus::Degraded
    };
    
    Json(HealthResponse {
        status,
        version: env!("CARGO_PKG_VERSION"),
        checks,
    })
}
}
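
CheckResult isn't defined elsewhere in this guide; a minimal illustrative version (not part of EventCore's API) that matches the calls above could look like this:

#![allow(unused)]
fn main() {
#[derive(Serialize)]
struct CheckResult {
    healthy: bool,
    #[serde(skip_serializing_if = "Option::is_none")]
    error: Option<String>,
}

impl CheckResult {
    fn healthy() -> Self {
        Self { healthy: true, error: None }
    }

    fn unhealthy(error: impl std::fmt::Display) -> Self {
        Self { healthy: false, error: Some(error.to_string()) }
    }

    fn is_healthy(&self) -> bool {
        self.healthy
    }

    fn is_unhealthy(&self) -> bool {
        !self.healthy
    }
}
}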

Graceful Shutdown

Handle shutdown gracefully:

#![allow(unused)]
fn main() {
use tokio::signal;

async fn shutdown_signal() {
    let ctrl_c = async {
        signal::ctrl_c()
            .await
            .expect("failed to install Ctrl+C handler");
    };

    #[cfg(unix)]
    let terminate = async {
        signal::unix::signal(signal::unix::SignalKind::terminate())
            .expect("failed to install signal handler")
            .recv()
            .await;
    };

    #[cfg(not(unix))]
    let terminate = std::future::pending::<()>();

    tokio::select! {
        _ = ctrl_c => {},
        _ = terminate => {},
    }
}

// In main
let app = /* build app */;

axum::serve(listener, app)
    .with_graceful_shutdown(shutdown_signal())
    .await
    .unwrap();
}

Testing HTTP Endpoints

Test your API endpoints:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    use axum::http::StatusCode;
    use tower::ServiceExt;
    
    #[tokio::test]
    async fn test_create_task_success() {
        let app = create_test_app().await;
        
        let response = app
            .oneshot(
                Request::builder()
                    .method("POST")
                    .uri("/api/v1/tasks")
                    .header("content-type", "application/json")
                    .body(Body::from(r#"{
                        "title": "Test Task",
                        "description": "Test Description"
                    }"#))
                    .unwrap(),
            )
            .await
            .unwrap();
        
        assert_eq!(response.status(), StatusCode::CREATED);
        
        let body: CreateTaskResponse = serde_json::from_slice(
            &hyper::body::to_bytes(response.into_body()).await.unwrap()
        ).unwrap();
        
        assert!(!body.task_id.is_empty());
    }
    
    async fn create_test_app() -> Router {
        let event_store = InMemoryEventStore::new();
        let state = AppState {
            executor: CommandExecutor::new(event_store),
            projections: Arc::new(RwLock::new(ProjectionManager::new())),
        };
        
        Router::new()
            .route("/api/v1/tasks", post(create_task))
            .with_state(state)
    }
}
}

Best Practices

  1. Keep handlers thin - Delegate business logic to commands
  2. Use proper status codes - 201 for creation, 202 for accepted, etc.
  3. Version your API - Use URL versioning (/api/v1/)
  4. Document with OpenAPI - Generate from code when possible
  5. Use correlation IDs - Track requests across services
  6. Log appropriately - Info for requests, error for failures
  7. Handle errors gracefully - Never expose internal details

Summary

Setting up HTTP endpoints for EventCore:

  • Framework agnostic - Works with any Rust web framework
  • Thin HTTP layer - Focus on translation, not business logic
  • Type-safe - Leverage Rust’s type system
  • Error handling - Map domain errors to HTTP responses
  • Testable - Easy to test endpoints in isolation

Key patterns:

  1. Parse and validate requests early
  2. Convert to domain commands
  3. Execute with EventCore
  4. Map results to HTTP responses
  5. Handle errors appropriately

Next, let’s explore Command Handlers

Chapter 4.2: Command Handlers

Command handlers are the bridge between HTTP requests and your EventCore commands. This chapter covers patterns for building robust, maintainable command handlers.

Command Handler Architecture

HTTP Request
    ↓
Parse & Validate
    ↓
Authenticate & Authorize
    ↓
Create Command
    ↓
Execute Command
    ↓
Format Response

Basic Command Handler Pattern

The Handler Function

#![allow(unused)]
fn main() {
use axum::{
    extract::{State, Path, Json},
    http::StatusCode,
    response::IntoResponse,
};
use serde::{Deserialize, Serialize};
use eventcore::prelude::*;

#[derive(Debug, Deserialize)]
struct AssignTaskRequest {
    assignee_id: String,
}

#[derive(Debug, Serialize)]
struct AssignTaskResponse {
    message: String,
    task_id: String,
    assignee_id: String,
    assigned_at: DateTime<Utc>,
}

async fn assign_task(
    State(state): State<AppState>,
    Path(task_id): Path<String>,
    user: AuthenticatedUser,  // From middleware
    Json(request): Json<AssignTaskRequest>,
) -> Result<Json<AssignTaskResponse>, ApiError> {
    // 1. Parse and validate input
    let task_stream = StreamId::try_new(format!("task-{}", task_id))
        .map_err(|e| ApiError::validation("Invalid task ID"))?;
        
    let assignee_stream = StreamId::try_new(format!("user-{}", request.assignee_id))
        .map_err(|e| ApiError::validation("Invalid assignee ID"))?;
    
    // 2. Create command
    let command = AssignTask {
        task_id: task_stream,
        assignee_id: assignee_stream,
        assigned_by: user.id.clone(),
    };
    
    // 3. Execute with context
    let result = state.executor
        .execute_with_context(
            &command,
            ExecutionContext::new()
                .with_user_id(user.id)
                .with_correlation_id(extract_correlation_id(&request))
        )
        .await
        .map_err(ApiError::from_command_error)?;
    
    // 4. Format response
    Ok(Json(AssignTaskResponse {
        message: "Task assigned successfully".to_string(),
        task_id: task_id.clone(),
        assignee_id: request.assignee_id,
        assigned_at: Utc::now(),
    }))
}
}

Authentication and Authorization

Authentication Middleware

#![allow(unused)]
fn main() {
use axum::{
    extract::{Request, FromRequestParts},
    http::{header, StatusCode},
    response::Response,
    middleware::Next,
};
use jsonwebtoken::{decode, DecodingKey, Validation};

#[derive(Debug, Clone, Serialize, Deserialize)]
struct Claims {
    sub: String,  // User ID
    exp: usize,   // Expiration time
    roles: Vec<String>,
}

#[derive(Debug, Clone)]
struct AuthenticatedUser {
    id: UserId,
    roles: Vec<String>,
}

#[async_trait]
impl<S> FromRequestParts<S> for AuthenticatedUser
where
    S: Send + Sync,
{
    type Rejection = ApiError;

    async fn from_request_parts(
        parts: &mut http::request::Parts,
        _state: &S,
    ) -> Result<Self, Self::Rejection> {
        // Extract token from Authorization header
        let token = parts
            .headers
            .get(header::AUTHORIZATION)
            .and_then(|auth| auth.to_str().ok())
            .and_then(|auth| auth.strip_prefix("Bearer "))
            .ok_or_else(|| ApiError::unauthorized("Missing authentication token"))?;
        
        // Decode and validate token
        let token_data = decode::<Claims>(
            token,
            &DecodingKey::from_secret(JWT_SECRET.as_ref()),
            &Validation::default(),
        )
        .map_err(|_| ApiError::unauthorized("Invalid authentication token"))?;
        
        Ok(AuthenticatedUser {
            id: UserId::try_new(token_data.claims.sub)?,
            roles: token_data.claims.roles,
        })
    }
}
}

Authorization in Handlers

#![allow(unused)]
fn main() {
impl AuthenticatedUser {
    fn has_role(&self, role: &str) -> bool {
        self.roles.contains(&role.to_string())
    }
    
    fn can_manage_tasks(&self) -> bool {
        self.has_role("admin") || self.has_role("manager")
    }
    
    fn can_assign_tasks(&self) -> bool {
        self.has_role("admin") || self.has_role("manager") || self.has_role("lead")
    }
}

async fn delete_task(
    State(state): State<AppState>,
    Path(task_id): Path<String>,
    user: AuthenticatedUser,
) -> Result<StatusCode, ApiError> {
    // Check authorization
    if !user.can_manage_tasks() {
        return Err(ApiError::forbidden("Insufficient permissions to delete tasks"));
    }
    
    let command = DeleteTask {
        task_id: StreamId::try_new(format!("task-{}", task_id))?,
        deleted_by: user.id,
    };
    
    state.executor.execute(&command).await?;
    
    Ok(StatusCode::NO_CONTENT)
}
}

Input Validation

Request Validation

#![allow(unused)]
fn main() {
use validator::{Validate, ValidationError};

#[derive(Debug, Deserialize, Validate)]
struct CreateProjectRequest {
    #[validate(length(min = 3, max = 100))]
    name: String,
    
    #[validate(length(max = 1000))]
    description: Option<String>,
    
    #[validate(email)]
    owner_email: String,
    
    #[validate(range(min = 1, max = 365))]
    duration_days: u32,
    
    #[validate(custom = "validate_start_date")]
    start_date: Option<DateTime<Utc>>,
}

fn validate_start_date(date: &DateTime<Utc>) -> Result<(), ValidationError> {
    if *date < Utc::now() {
        return Err(ValidationError::new("Start date cannot be in the past"));
    }
    Ok(())
}

async fn create_project(
    State(state): State<AppState>,
    user: AuthenticatedUser,
    Json(request): Json<CreateProjectRequest>,
) -> Result<Json<CreateProjectResponse>, ApiError> {
    // Validate request
    request.validate()
        .map_err(|e| ApiError::validation_errors(e))?;
    
    // Create command with validated data
    let command = CreateProject {
        project_id: StreamId::from(format!("project-{}", ProjectId::new())),
        name: ProjectName::try_new(request.name)?,
        description: request.description
            .map(|d| ProjectDescription::try_new(d))
            .transpose()?,
        owner: UserId::try_new(request.owner_email)?,
        duration: Duration::days(request.duration_days as i64),
        start_date: request.start_date.unwrap_or_else(Utc::now),
        created_by: user.id,
    };
    
    // Execute and return response
    // ...
}
}

Custom Validation Rules

#![allow(unused)]
fn main() {
mod validators {
    use super::*;
    
    pub fn validate_business_hours(time: &NaiveTime) -> Result<(), ValidationError> {
        const BUSINESS_START: NaiveTime = NaiveTime::from_hms_opt(9, 0, 0).unwrap();
        const BUSINESS_END: NaiveTime = NaiveTime::from_hms_opt(17, 0, 0).unwrap();
        
        if *time < BUSINESS_START || *time > BUSINESS_END {
            return Err(ValidationError::new("Outside business hours"));
        }
        Ok(())
    }
    
    pub fn validate_future_date(date: &NaiveDate) -> Result<(), ValidationError> {
        if *date <= Local::now().naive_local().date() {
            return Err(ValidationError::new("Date must be in the future"));
        }
        Ok(())
    }
    
    pub fn validate_currency_code(code: &str) -> Result<(), ValidationError> {
        const VALID_CURRENCIES: &[&str] = &["USD", "EUR", "GBP", "JPY"];
        
        if !VALID_CURRENCIES.contains(&code) {
            return Err(ValidationError::new("Invalid currency code"));
        }
        Ok(())
    }
}
}

Idempotency

Ensure commands can be safely retried:

Idempotency Keys

#![allow(unused)]
fn main() {
use axum::extract::FromRequestParts;

#[derive(Debug, Clone)]
struct IdempotencyKey(String);

#[async_trait]
impl<S> FromRequestParts<S> for IdempotencyKey
where
    S: Send + Sync,
{
    type Rejection = ApiError;

    async fn from_request_parts(
        parts: &mut http::request::Parts,
        _state: &S,
    ) -> Result<Self, Self::Rejection> {
        parts
            .headers
            .get("Idempotency-Key")
            .and_then(|v| v.to_str().ok())
            .map(|s| IdempotencyKey(s.to_string()))
            .ok_or_else(|| ApiError::bad_request("Idempotency-Key header required"))
    }
}

// Store for idempotency
#[derive(Clone)]
struct IdempotencyStore {
    cache: Arc<RwLock<HashMap<String, CachedResponse>>>,
}

#[derive(Clone)]
struct CachedResponse {
    status: StatusCode,
    body: Vec<u8>,
    created_at: DateTime<Utc>,
}

async fn idempotent_handler<F, Fut>(
    key: IdempotencyKey,
    store: State<IdempotencyStore>,
    handler: F,
) -> Response
where
    F: FnOnce() -> Fut,
    Fut: Future<Output = Response>,
{
    // Check cache
    let cache = store.cache.read().await;
    if let Some(cached) = cache.get(&key.0) {
        // Return cached response
        return Response::builder()
            .status(cached.status)
            .body(Body::from(cached.body.clone()))
            .unwrap();
    }
    drop(cache);
    
    // Execute handler
    let response = handler().await;
    
    // Cache successful responses
    if response.status().is_success() {
        let (parts, body) = response.into_parts();
        let body_bytes = hyper::body::to_bytes(body).await.unwrap().to_vec();
        
        let mut cache = store.cache.write().await;
        cache.insert(key.0, CachedResponse {
            status: parts.status,
            body: body_bytes.clone(),
            created_at: Utc::now(),
        });
        
        Response::from_parts(parts, Body::from(body_bytes))
    } else {
        response
    }
}
}

Command-Level Idempotency

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct TransferMoney {
    #[stream]
    from_account: StreamId,
    
    #[stream]
    to_account: StreamId,
    
    amount: Money,
    
    // Idempotency key embedded in command
    transfer_id: TransferId,
}

impl CommandLogic for TransferMoney {
    // ... other implementations
    
    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Check if transfer already processed
        if state.processed_transfers.contains(&self.transfer_id) {
            // Already processed - return success with no new events
            return Ok(vec![]);
        }
        
        // Process transfer...
        Ok(vec![
            StreamWrite::new(
                &read_streams,
                self.from_account.clone(),
                BankEvent::TransferProcessed {
                    transfer_id: self.transfer_id,
                    amount: self.amount,
                }
            )?,
            // ... other events
        ])
    }
}
}

Error Response Formatting

Provide consistent, helpful error responses:

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize)]
struct ErrorResponse {
    error: ErrorDetails,
    #[serde(skip_serializing_if = "Option::is_none")]
    request_id: Option<String>,
}

#[derive(Debug, Serialize)]
struct ErrorDetails {
    code: String,
    message: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    field_errors: Option<HashMap<String, Vec<String>>>,
    #[serde(skip_serializing_if = "Option::is_none")]
    help: Option<String>,
}

impl ApiError {
    fn to_response(&self, request_id: Option<String>) -> Response {
        let (status, error_details) = match self {
            ApiError::Validation { errors } => (
                StatusCode::BAD_REQUEST,
                ErrorDetails {
                    code: "VALIDATION_ERROR".to_string(),
                    message: "Invalid request data".to_string(),
                    field_errors: Some(errors.clone()),
                    help: Some("Check the field_errors for specific validation issues".to_string()),
                }
            ),
            ApiError::BusinessRule { message } => (
                StatusCode::UNPROCESSABLE_ENTITY,
                ErrorDetails {
                    code: "BUSINESS_RULE_VIOLATION".to_string(),
                    message: message.clone(),
                    field_errors: None,
                    help: None,
                }
            ),
            ApiError::NotFound { resource } => (
                StatusCode::NOT_FOUND,
                ErrorDetails {
                    code: "RESOURCE_NOT_FOUND".to_string(),
                    message: format!("{} not found", resource),
                    field_errors: None,
                    help: None,
                }
            ),
            ApiError::Conflict { message } => (
                StatusCode::CONFLICT,
                ErrorDetails {
                    code: "CONFLICT".to_string(),
                    message: message.clone(),
                    field_errors: None,
                    help: Some("The resource was modified. Please refresh and try again.".to_string()),
                }
            ),
            // ... other error types
        };
        
        let response = ErrorResponse {
            error: error_details,
            request_id,
        };
        
        (status, Json(response)).into_response()
    }
}
}

Batch Command Handlers

Handle multiple commands efficiently:

#![allow(unused)]
fn main() {
#[derive(Debug, Deserialize)]
struct BatchRequest<T> {
    operations: Vec<T>,
    #[serde(default)]
    stop_on_error: bool,
}

#[derive(Debug, Serialize)]
struct BatchResponse<T> {
    results: Vec<BatchResult<T>>,
    successful: usize,
    failed: usize,
}

#[derive(Debug, Serialize)]
#[serde(tag = "status")]
enum BatchResult<T> {
    #[serde(rename = "success")]
    Success { result: T },
    #[serde(rename = "error")]
    Error { error: ErrorDetails },
}

async fn batch_create_tasks(
    State(state): State<AppState>,
    user: AuthenticatedUser,
    Json(batch): Json<BatchRequest<CreateTaskRequest>>,
) -> Result<Json<BatchResponse<CreateTaskResponse>>, ApiError> {
    let mut results = Vec::new();
    let mut successful = 0;
    let mut failed = 0;
    
    for request in batch.operations {
        match create_single_task(&state, &user, request).await {
            Ok(response) => {
                successful += 1;
                results.push(BatchResult::Success { result: response });
            }
            Err(error) => {
                failed += 1;
                results.push(BatchResult::Error { 
                    error: error.to_error_details() 
                });
                
                if batch.stop_on_error {
                    break;
                }
            }
        }
    }
    
    Ok(Json(BatchResponse {
        results,
        successful,
        failed,
    }))
}
}

Async Command Processing

For long-running commands:

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize)]
struct AsyncCommandResponse {
    tracking_id: String,
    status_url: String,
    message: String,
}

async fn import_large_dataset(
    State(state): State<AppState>,
    user: AuthenticatedUser,
    Json(request): Json<ImportDatasetRequest>,
) -> Result<Json<AsyncCommandResponse>, ApiError> {
    // Validate request
    request.validate()?;
    
    // Create tracking ID
    let tracking_id = TrackingId::new();
    
    // Queue command for async processing
    let command = ImportDataset {
        dataset_id: StreamId::from(format!("dataset-{}", DatasetId::new())),
        source_url: request.source_url,
        import_options: request.options,
        initiated_by: user.id,
        tracking_id: tracking_id.clone(),
    };
    
    // Submit to background queue
    state.command_queue
        .submit(command)
        .await
        .map_err(|_| ApiError::service_unavailable("Import service temporarily unavailable"))?;
    
    // Return tracking information
    Ok(Json(AsyncCommandResponse {
        tracking_id: tracking_id.to_string(),
        status_url: format!("/api/v1/imports/{}/status", tracking_id),
        message: "Import queued for processing".to_string(),
    }))
}

// Status endpoint
async fn get_import_status(
    State(state): State<AppState>,
    Path(tracking_id): Path<String>,
) -> Result<Json<ImportStatus>, ApiError> {
    let status = state.import_tracker
        .get_status(&TrackingId::try_new(tracking_id)?)
        .await?
        .ok_or_else(|| ApiError::not_found("Import"))?;
    
    Ok(Json(status))
}
}

Command Handler Testing

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    use eventcore::testing::prelude::*;
    
    #[tokio::test]
    async fn test_assign_task_authorization() {
        let state = create_test_state().await;
        
        // User without permission
        let user = AuthenticatedUser {
            id: UserId::try_new("user@example.com").unwrap(),
            roles: vec!["member".to_string()],
        };
        
        let request = AssignTaskRequest {
            assignee_id: "assignee@example.com".to_string(),
        };
        
        let result = assign_task(
            State(state),
            Path("task-123".to_string()),
            user,
            Json(request),
        ).await;
        
        assert!(matches!(
            result,
            Err(ApiError::Forbidden { .. })
        ));
    }
    
    #[tokio::test]
    async fn test_idempotent_transfer() {
        let state = create_test_state().await;
        let transfer_id = TransferId::new();
        
        let request = TransferMoneyRequest {
            from_account: "account-1".to_string(),
            to_account: "account-2".to_string(),
            amount: 100.0,
            transfer_id: transfer_id.to_string(),
        };
        
        // First call
        let response1 = transfer_money(
            State(state.clone()),
            Json(request.clone()),
        ).await.unwrap();
        
        // Second call with same transfer_id
        let response2 = transfer_money(
            State(state),
            Json(request),
        ).await.unwrap();
        
        // Should return same response
        assert_eq!(response1.0.transfer_id, response2.0.transfer_id);
        assert_eq!(response1.0.status, response2.0.status);
    }
}
}

Monitoring and Metrics

Track command handler performance:

#![allow(unused)]
fn main() {
use lazy_static::lazy_static;
use prometheus::{register_histogram, register_int_counter, Histogram, IntCounter};

lazy_static! {
    static ref COMMAND_COUNTER: IntCounter = register_int_counter!(
        "api_commands_total",
        "Total number of commands processed"
    ).unwrap();
    
    static ref COMMAND_DURATION: Histogram = register_histogram!(
        "api_command_duration_seconds",
        "Command processing duration"
    ).unwrap();
    
    static ref COMMAND_ERRORS: IntCounter = register_int_counter!(
        "api_command_errors_total",
        "Total number of command errors"
    ).unwrap();
}

async fn measured_handler<F, Fut, T>(
    command_type: &str,
    handler: F,
) -> Result<T, ApiError>
where
    F: FnOnce() -> Fut,
    Fut: Future<Output = Result<T, ApiError>>,
{
    COMMAND_COUNTER.inc();
    let timer = COMMAND_DURATION.start_timer();
    
    let result = handler().await;
    
    // `stop_and_record` consumes the timer, records the observation once,
    // and returns the elapsed time in seconds.
    let duration_secs = timer.stop_and_record();
    
    if result.is_err() {
        COMMAND_ERRORS.inc();
    }
    
    // Log with structured data
    match &result {
        Ok(_) => {
            tracing::info!(
                command_type = %command_type,
                duration_ms = %(duration_secs * 1000.0),
                "Command completed successfully"
            );
        }
        Err(e) => {
            tracing::error!(
                command_type = %command_type,
                error = %e,
                "Command failed"
            );
        }
    }
    
    result
}
}
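
Wrapping an existing handler is then a one-liner around the call (a sketch reusing the assign_task handler from earlier in this chapter):

#![allow(unused)]
fn main() {
async fn assign_task_measured(
    State(state): State<AppState>,
    Path(task_id): Path<String>,
    user: AuthenticatedUser,
    Json(request): Json<AssignTaskRequest>,
) -> Result<Json<AssignTaskResponse>, ApiError> {
    // The closure defers the real handler so the wrapper can time it.
    measured_handler("assign_task", move || {
        assign_task(State(state), Path(task_id), user, Json(request))
    })
    .await
}
}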

Best Practices

  1. Validate early - Check inputs before creating commands
  2. Use strong types - Convert strings to domain types ASAP
  3. Handle all errors - Map domain errors to appropriate HTTP responses
  4. Be idempotent - Design for safe retries
  5. Authenticate first - Verify identity before any processing
  6. Authorize actions - Check permissions for each operation
  7. Log appropriately - Include context for debugging
  8. Monitor everything - Track success rates and latencies

Summary

Command handlers in EventCore APIs:

  • Type-safe - Leverage Rust’s type system
  • Validated - Check inputs thoroughly
  • Authenticated - Know who’s making requests
  • Authorized - Enforce permissions
  • Idempotent - Safe to retry
  • Monitored - Track performance and errors

Key patterns:

  1. Parse and validate input
  2. Check authentication and authorization
  3. Create strongly-typed commands
  4. Execute with proper context
  5. Handle errors gracefully
  6. Return appropriate responses

Next, let’s explore Query Endpoints

Chapter 4.3: Query Endpoints

Query endpoints serve read requests from your projections. Unlike commands that modify state, queries are side-effect free and can be cached, making them perfect for high-performance read operations.

Query Architecture

HTTP Request → Authenticate → Authorize → Query Projection → Format Response
                                                ↑
                                         Read Model Store

Basic Query Pattern

Simple Query Endpoint

#![allow(unused)]
fn main() {
use axum::{
    extract::{State, Path, Query as QueryParams},
    Json,
};
use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize)]
struct ListTasksQuery {
    #[serde(default)]
    status: Option<TaskStatus>,
    
    #[serde(default)]
    assigned_to: Option<String>,
    
    #[serde(default = "default_page")]
    page: u32,
    
    #[serde(default = "default_page_size")]
    page_size: u32,
}

fn default_page() -> u32 { 1 }
fn default_page_size() -> u32 { 20 }

#[derive(Debug, Serialize)]
struct ListTasksResponse {
    tasks: Vec<TaskSummary>,
    pagination: PaginationInfo,
}

#[derive(Debug, Serialize)]
struct TaskSummary {
    id: String,
    title: String,
    status: TaskStatus,
    assigned_to: Option<String>,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
}

#[derive(Debug, Serialize)]
struct PaginationInfo {
    page: u32,
    page_size: u32,
    total_items: u64,
    total_pages: u32,
}

async fn list_tasks(
    State(state): State<AppState>,
    QueryParams(query): QueryParams<ListTasksQuery>,
) -> Result<Json<ListTasksResponse>, ApiError> {
    // Get projection
    let projection = state.projections
        .read()
        .await
        .get::<TaskListProjection>()
        .ok_or_else(|| ApiError::internal("Task projection not available"))?;
    
    // Apply filters
    let mut tasks = projection.get_all_tasks();
    
    if let Some(status) = query.status {
        tasks.retain(|t| t.status == status);
    }
    
    if let Some(assigned_to) = query.assigned_to {
        tasks.retain(|t| t.assigned_to.as_ref() == Some(&assigned_to));
    }
    
    // Calculate pagination
    let total_items = tasks.len() as u64;
    let total_pages = ((total_items as f32) / (query.page_size as f32)).ceil() as u32;
    
    // Apply pagination (clamp to the collection bounds so an out-of-range
    // page yields an empty list instead of panicking on the slice)
    let start = ((query.page.saturating_sub(1)) * query.page_size) as usize;
    let start = start.min(tasks.len());
    let end = (start + query.page_size as usize).min(tasks.len());
    let page_tasks = tasks[start..end].to_vec();
    
    Ok(Json(ListTasksResponse {
        tasks: page_tasks.into_iter().map(Into::into).collect(),
        pagination: PaginationInfo {
            page: query.page,
            page_size: query.page_size,
            total_items,
            total_pages,
        },
    }))
}
}

Advanced Query Patterns

Filtering and Sorting

#![allow(unused)]
fn main() {
#[derive(Debug, Deserialize)]
#[serde(rename_all = "snake_case")]
enum SortField {
    CreatedAt,
    UpdatedAt,
    Title,
    Priority,
    DueDate,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "snake_case")]
enum SortOrder {
    Asc,
    Desc,
}

#[derive(Debug, Deserialize)]
struct AdvancedTaskQuery {
    // Filters
    #[serde(default)]
    status: Option<Vec<TaskStatus>>,
    
    #[serde(default)]
    assigned_to: Option<Vec<String>>,
    
    #[serde(default)]
    created_after: Option<DateTime<Utc>>,
    
    #[serde(default)]
    created_before: Option<DateTime<Utc>>,
    
    #[serde(default)]
    search: Option<String>,
    
    // Sorting
    #[serde(default = "default_sort_field")]
    sort_by: SortField,
    
    #[serde(default = "default_sort_order")]
    sort_order: SortOrder,
    
    // Pagination
    #[serde(default)]
    cursor: Option<String>,
    
    #[serde(default = "default_limit")]
    limit: u32,
}

fn default_sort_field() -> SortField { SortField::CreatedAt }
fn default_sort_order() -> SortOrder { SortOrder::Desc }
fn default_limit() -> u32 { 50 }

async fn search_tasks(
    State(state): State<AppState>,
    QueryParams(query): QueryParams<AdvancedTaskQuery>,
) -> Result<Json<CursorPaginatedResponse<TaskSummary>>, ApiError> {
    let projection = state.projections
        .read()
        .await
        .get::<TaskSearchProjection>()
        .ok_or_else(|| ApiError::internal("Search projection not available"))?;
    
    // Build query
    let mut search_query = SearchQuery::new();
    
    if let Some(statuses) = query.status {
        search_query = search_query.with_status_in(statuses);
    }
    
    if let Some(assignees) = query.assigned_to {
        search_query = search_query.with_assignee_in(assignees);
    }
    
    if let Some(after) = query.created_after {
        search_query = search_query.created_after(after);
    }
    
    if let Some(before) = query.created_before {
        search_query = search_query.created_before(before);
    }
    
    if let Some(search_text) = query.search {
        search_query = search_query.with_text_search(search_text);
    }
    
    // Apply sorting
    search_query = match query.sort_by {
        SortField::CreatedAt => search_query.sort_by_created_at(query.sort_order),
        SortField::UpdatedAt => search_query.sort_by_updated_at(query.sort_order),
        SortField::Title => search_query.sort_by_title(query.sort_order),
        SortField::Priority => search_query.sort_by_priority(query.sort_order),
        SortField::DueDate => search_query.sort_by_due_date(query.sort_order),
    };
    
    // Apply cursor pagination
    if let Some(cursor) = query.cursor {
        search_query = search_query.after_cursor(Cursor::decode(&cursor)?);
    }
    
    search_query = search_query.limit(query.limit);
    
    // Execute query
    let results = projection.search(search_query).await?;
    
    Ok(Json(results))
}
}

Aggregation Queries

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize)]
struct TaskStatistics {
    total_tasks: u64,
    tasks_by_status: HashMap<TaskStatus, u64>,
    tasks_by_assignee: Vec<AssigneeStats>,
    completion_rate: f64,
    average_completion_time: Option<Duration>,
    overdue_tasks: u64,
}

#[derive(Debug, Serialize)]
struct AssigneeStats {
    assignee_id: String,
    assignee_name: String,
    total_tasks: u64,
    completed_tasks: u64,
    in_progress_tasks: u64,
}

async fn get_task_statistics(
    State(state): State<AppState>,
    QueryParams(query): QueryParams<DateRangeQuery>,
) -> Result<Json<TaskStatistics>, ApiError> {
    let projection = state.projections
        .read()
        .await
        .get::<TaskAnalyticsProjection>()
        .ok_or_else(|| ApiError::internal("Analytics projection not available"))?;
    
    let stats = projection.calculate_statistics(
        query.start_date,
        query.end_date,
    ).await?;
    
    Ok(Json(stats))
}

// Time-series data
#[derive(Debug, Serialize)]
struct TimeSeriesData {
    period: String,
    data_points: Vec<DataPoint>,
}

#[derive(Debug, Serialize)]
struct DataPoint {
    timestamp: DateTime<Utc>,
    value: f64,
    metadata: Option<serde_json::Value>,
}

async fn get_task_completion_trend(
    State(state): State<AppState>,
    QueryParams(query): QueryParams<TimeSeriesQuery>,
) -> Result<Json<TimeSeriesData>, ApiError> {
    let projection = state.projections
        .read()
        .await
        .get::<TaskMetricsProjection>()
        .ok_or_else(|| ApiError::internal("Metrics projection not available"))?;
    
    let data = projection.get_completion_trend(
        query.start_date,
        query.end_date,
        query.granularity,
    ).await?;
    
    Ok(Json(data))
}
}

GraphQL Integration

For complex queries, GraphQL can be more efficient:

#![allow(unused)]
fn main() {
use async_graphql::{
    Context, Object, Schema, EmptyMutation, EmptySubscription,
    ID, Result as GraphQLResult,
};

struct QueryRoot;

#[Object]
impl QueryRoot {
    async fn task(&self, ctx: &Context<'_>, id: ID) -> GraphQLResult<Option<Task>> {
        let projection = ctx.data::<Arc<TaskProjection>>()?;
        
        Ok(projection.get_task(&id.to_string()).await?)
    }
    
    async fn tasks(
        &self,
        ctx: &Context<'_>,
        filter: Option<TaskFilter>,
        sort: Option<TaskSort>,
        pagination: Option<PaginationInput>,
    ) -> GraphQLResult<TaskConnection> {
        let projection = ctx.data::<Arc<TaskProjection>>()?;
        
        let query = build_query(filter, sort, pagination);
        let results = projection.query(query).await?;
        
        Ok(TaskConnection::from(results))
    }
    
    async fn user(&self, ctx: &Context<'_>, id: ID) -> GraphQLResult<Option<User>> {
        let projection = ctx.data::<Arc<UserProjection>>()?;
        
        Ok(projection.get_user(&id.to_string()).await?)
    }
}

// GraphQL types
#[derive(async_graphql::SimpleObject)]
struct Task {
    id: ID,
    title: String,
    description: String,
    status: TaskStatus,
    assigned_to: Option<User>,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
}

#[derive(async_graphql::InputObject)]
struct TaskFilter {
    status: Option<Vec<TaskStatus>>,
    assigned_to: Option<Vec<ID>>,
    created_after: Option<DateTime<Utc>>,
    search: Option<String>,
}

// Axum handler
async fn graphql_handler(
    State(state): State<AppState>,
    user: Option<AuthenticatedUser>,
    req: GraphQLRequest,
) -> GraphQLResponse {
    let schema = Schema::build(QueryRoot, EmptyMutation, EmptySubscription)
        .data(state.projections.clone())
        .data(user)
        .finish();
    
    schema.execute(req.into_inner()).await.into()
}
}
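
Building the schema on every request works, but the schema only needs to be constructed once. A sketch of the more typical setup, where the schema is built at startup and handed to the handler as Axum state (GraphQLRequest and GraphQLResponse are the same async-graphql Axum integration types used above):

#![allow(unused)]
fn main() {
type AppSchema = Schema<QueryRoot, EmptyMutation, EmptySubscription>;

async fn graphql_handler_shared(
    State(schema): State<AppSchema>,
    req: GraphQLRequest,
) -> GraphQLResponse {
    // Per-request context (e.g. the authenticated user) can typically be
    // attached with `req.into_inner().data(user)` before execution.
    schema.execute(req.into_inner()).await.into()
}

// At startup (sketch):
// let schema: AppSchema = Schema::build(QueryRoot, EmptyMutation, EmptySubscription)
//     .data(projections.clone())
//     .finish();
// let app = Router::new()
//     .route("/graphql", post(graphql_handler_shared))
//     .with_state(schema);
}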

Caching Strategies

Response Caching

#![allow(unused)]
fn main() {
use axum::http::header::{CACHE_CONTROL, CONTENT_TYPE, ETAG, IF_NONE_MATCH};
use sha2::{Sha256, Digest};

#[derive(Clone)]
struct CacheConfig {
    public_max_age: Duration,
    private_max_age: Duration,
    stale_while_revalidate: Duration,
}

async fn cached_query_handler<F, Fut, T>(
    headers: HeaderMap,
    cache_config: CacheConfig,
    query_fn: F,
) -> Response
where
    F: FnOnce() -> Fut,
    Fut: Future<Output = Result<T, ApiError>>,
    T: Serialize,
{
    // Execute query
    let result = match query_fn().await {
        Ok(data) => data,
        Err(e) => return e.into_response(),
    };
    
    // Serialize response
    let body = match serde_json::to_vec(&result) {
        Ok(bytes) => bytes,
        Err(_) => return ApiError::internal("Serialization failed").into_response(),
    };
    
    // Calculate ETag
    let mut hasher = Sha256::new();
    hasher.update(&body);
    let etag = format!("\"{}\"", hex::encode(hasher.finalize()));
    
    // Check If-None-Match
    if let Some(if_none_match) = headers.get(IF_NONE_MATCH) {
        if if_none_match.to_str().ok() == Some(etag.as_str()) {
            return StatusCode::NOT_MODIFIED.into_response();
        }
    }
    
    // Build response with caching headers
    Response::builder()
        .status(StatusCode::OK)
        .header(CONTENT_TYPE, "application/json")
        .header(ETAG, &etag)
        .header(
            CACHE_CONTROL,
            format!(
                "public, max-age={}, stale-while-revalidate={}",
                cache_config.public_max_age.as_secs(),
                cache_config.stale_while_revalidate.as_secs()
            )
        )
        .body(Body::from(body))
        .unwrap()
}

// Usage
async fn get_public_statistics(
    State(state): State<AppState>,
    headers: HeaderMap,
) -> Response {
    cached_query_handler(
        headers,
        CacheConfig {
            public_max_age: Duration::from_secs(300), // 5 minutes
            private_max_age: Duration::from_secs(0),
            stale_while_revalidate: Duration::from_secs(60),
        },
        || async {
            let projection = state.projections
                .read()
                .await
                .get::<PublicStatsProjection>()
                .ok_or_else(|| ApiError::internal("Stats not available"))?;
            
            projection.get_current_stats().await
        },
    ).await
}
}

Query Result Caching

#![allow(unused)]
fn main() {
use moka::future::Cache;
use serde::de::DeserializeOwned;

#[derive(Clone)]
struct QueryCache {
    cache: Cache<String, CachedResult>,
}

#[derive(Clone)]
struct CachedResult {
    data: Vec<u8>,
    cached_at: DateTime<Utc>,
    ttl: Duration,
}

impl QueryCache {
    fn new() -> Self {
        Self {
            cache: Cache::builder()
                .max_capacity(10_000)
                .time_to_live(Duration::from_secs(300))
                .build(),
        }
    }
    
    async fn get_or_compute<F, Fut, T>(
        &self,
        key: &str,
        ttl: Duration,
        compute_fn: F,
    ) -> Result<T, ApiError>
    where
        F: FnOnce() -> Fut,
        Fut: Future<Output = Result<T, ApiError>>,
        T: Serialize + DeserializeOwned,
    {
        // Check cache
        if let Some(cached) = self.cache.get(key).await {
            if (Utc::now() - cached.cached_at).to_std().unwrap_or_default() < cached.ttl {
                return serde_json::from_slice(&cached.data)
                    .map_err(|_| ApiError::internal("Cache deserialization failed"));
            }
        }
        
        // Compute result
        let result = compute_fn().await?;
        
        // Cache result
        let data = serde_json::to_vec(&result)
            .map_err(|_| ApiError::internal("Cache serialization failed"))?;
        
        self.cache.insert(
            key.to_string(),
            CachedResult {
                data,
                cached_at: Utc::now(),
                ttl,
            }
        ).await;
        
        Ok(result)
    }
}
}
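
Using the cache from a handler might look like this (a sketch: state.query_cache is an assumed field holding the QueryCache, the key must capture every parameter that affects the result, and TaskStatistics would also need to derive Deserialize):

#![allow(unused)]
fn main() {
async fn get_task_statistics_cached(
    State(state): State<AppState>,
    QueryParams(query): QueryParams<DateRangeQuery>,
) -> Result<Json<TaskStatistics>, ApiError> {
    // Include every query parameter in the cache key.
    let key = format!("task-stats:{:?}:{:?}", query.start_date, query.end_date);

    let stats = state.query_cache
        .get_or_compute(&key, Duration::from_secs(60), || async {
            let projections = state.projections.read().await;
            let projection = projections
                .get::<TaskAnalyticsProjection>()
                .ok_or_else(|| ApiError::internal("Analytics projection not available"))?;

            let stats = projection
                .calculate_statistics(query.start_date, query.end_date)
                .await?;
            Ok(stats)
        })
        .await?;

    Ok(Json(stats))
}
}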

Real-time Queries with SSE

Server-Sent Events for live updates:

#![allow(unused)]
fn main() {
use axum::response::sse::{Event, Sse};
use futures::stream::Stream;
use tokio_stream::StreamExt;

async fn task_updates_stream(
    State(state): State<AppState>,
    user: AuthenticatedUser,
) -> Sse<impl Stream<Item = Result<Event, ApiError>>> {
    let stream = async_stream::stream! {
        let mut subscription = state.projections
            .read()
            .await
            .get::<TaskProjection>()
            .unwrap()
            .subscribe_to_updates(user.id)
            .await;
        
        while let Some(update) = subscription.next().await {
            let event = match update {
                TaskUpdate::Created(task) => {
                    Event::default()
                        .event("task-created")
                        .json_data(task)
                        .unwrap()
                }
                TaskUpdate::Updated(task) => {
                    Event::default()
                        .event("task-updated")
                        .json_data(task)
                        .unwrap()
                }
                TaskUpdate::Deleted(task_id) => {
                    Event::default()
                        .event("task-deleted")
                        .data(task_id)
                }
            };
            
            yield Ok(event);
        }
    };
    
    Sse::new(stream).keep_alive(
        axum::response::sse::KeepAlive::new()
            .interval(Duration::from_secs(30))
            .text("keep-alive")
    )
}
}

Query Performance Optimization

N+1 Query Prevention

#![allow(unused)]
fn main() {
// Bad: N+1 queries
async fn get_tasks_with_assignees_bad(
    projection: &TaskProjection,
) -> Result<Vec<TaskWithAssignee>, ApiError> {
    let tasks = projection.get_all_tasks().await?;
    let mut results = Vec::new();
    
    for task in tasks {
        // This makes a separate query for each task!
        let assignee = if let Some(assignee_id) = &task.assigned_to {
            projection.get_user(assignee_id).await?
        } else {
            None
        };
        
        results.push(TaskWithAssignee {
            task,
            assignee,
        });
    }
    
    Ok(results)
}

// Good: Batch loading
async fn get_tasks_with_assignees_good(
    projection: &TaskProjection,
) -> Result<Vec<TaskWithAssignee>, ApiError> {
    let tasks = projection.get_all_tasks().await?;
    
    // Collect all assignee IDs
    let assignee_ids: HashSet<_> = tasks
        .iter()
        .filter_map(|t| t.assigned_to.as_ref())
        .cloned()
        .collect();
    
    // Load all assignees in one query
    let assignees = projection
        .get_users_by_ids(assignee_ids.into_iter().collect())
        .await?;
    
    // Build results
    let assignee_map: HashMap<_, _> = assignees
        .into_iter()
        .map(|u| (u.id.clone(), u))
        .collect();
    
    Ok(tasks.into_iter().map(|task| {
        let assignee = task.assigned_to
            .as_ref()
            .and_then(|id| assignee_map.get(id))
            .cloned();
        
        TaskWithAssignee { task, assignee }
    }).collect())
}
}

Query Complexity Limits

#![allow(unused)]
fn main() {

struct QueryComplexity;

impl QueryComplexity {
    fn calculate_complexity(query: &GraphQLQuery) -> u32 {
        // Simple heuristic: count fields and multiply by depth
        let field_count = count_fields(query);
        let max_depth = calculate_max_depth(query);
        
        field_count * max_depth
    }
}

// In the GraphQL schema: async-graphql exposes limits on the schema builder
let schema = Schema::build(QueryRoot, EmptyMutation, EmptySubscription)
    .limit_complexity(1000) // Maximum query complexity
    .limit_depth(10)        // Maximum nesting depth
    .finish();

// For REST endpoints
#[derive(Debug)]
struct QueryComplexityGuard {
    max_items: u32,
    max_depth: u32,
}

impl QueryComplexityGuard {
    fn validate(&self, query: &AdvancedTaskQuery) -> Result<(), ApiError> {
        // Check pagination limits
        if query.limit > self.max_items {
            return Err(ApiError::bad_request(
                format!("Limit cannot exceed {}", self.max_items)
            ));
        }
        
        // Check filter complexity
        let filter_count = 
            query.status.as_ref().map(|s| s.len()).unwrap_or(0) +
            query.assigned_to.as_ref().map(|a| a.len()).unwrap_or(0);
        
        if filter_count > 100 {
            return Err(ApiError::bad_request(
                "Too many filter values"
            ));
        }
        
        Ok(())
    }
}
}
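
For the REST variant, the guard runs at the top of the handler, before any projection work happens (the limit values here are purely illustrative):

#![allow(unused)]
fn main() {
const COMPLEXITY_GUARD: QueryComplexityGuard = QueryComplexityGuard {
    max_items: 200,
    max_depth: 5,
};

async fn search_tasks_guarded(
    State(state): State<AppState>,
    QueryParams(query): QueryParams<AdvancedTaskQuery>,
) -> Result<Json<CursorPaginatedResponse<TaskSummary>>, ApiError> {
    // Reject overly expensive queries before touching the projection.
    COMPLEXITY_GUARD.validate(&query)?;

    search_tasks(State(state), QueryParams(query)).await
}
}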

Security Considerations

Query Authorization

#![allow(unused)]
fn main() {
#[async_trait]
trait QueryAuthorizer {
    async fn can_view_task(&self, user: &AuthenticatedUser, task_id: &str) -> bool;
    async fn can_view_user_tasks(&self, user: &AuthenticatedUser, target_user_id: &str) -> bool;
    async fn can_view_statistics(&self, user: &AuthenticatedUser) -> bool;
}

struct RoleBasedAuthorizer;

#[async_trait]
impl QueryAuthorizer for RoleBasedAuthorizer {
    async fn can_view_task(&self, user: &AuthenticatedUser, task_id: &str) -> bool {
        // Admin can see all
        if user.has_role("admin") {
            return true;
        }
        
        // Others can only see their own tasks or tasks they created
        // Would need to check task details...
        true
    }
    
    async fn can_view_user_tasks(&self, user: &AuthenticatedUser, target_user_id: &str) -> bool {
        // Users can see their own tasks
        if user.id.to_string() == target_user_id {
            return true;
        }
        
        // Managers can see their team's tasks
        user.has_role("manager") || user.has_role("admin")
    }
    
    async fn can_view_statistics(&self, user: &AuthenticatedUser) -> bool {
        user.has_role("manager") || user.has_role("admin")
    }
}

// Use in handlers
async fn get_user_tasks(
    State(state): State<AppState>,
    Path(user_id): Path<String>,
    user: AuthenticatedUser,
) -> Result<Json<Vec<TaskSummary>>, ApiError> {
    // Check authorization
    if !state.authorizer.can_view_user_tasks(&user, &user_id).await {
        return Err(ApiError::forbidden("Cannot view tasks for this user"));
    }
    
    // Continue with query...
}
}

Rate Limiting

#![allow(unused)]
fn main() {
use governor::{DefaultKeyedRateLimiter, Quota, RateLimiter};

#[derive(Clone)]
struct RateLimitConfig {
    anonymous_quota: Quota,
    authenticated_quota: Quota,
    admin_quota: Quota,
}

async fn rate_limit_middleware(
    State(limiter): State<Arc<DefaultKeyedRateLimiter<String>>>,
    user: Option<AuthenticatedUser>,
    request: Request,
    next: Next,
) -> Result<Response, ApiError> {
    let key = match &user {
        Some(u) => u.id.to_string(),
        None => request
            .headers()
            .get("x-forwarded-for")
            .and_then(|h| h.to_str().ok())
            .unwrap_or("anonymous")
            .to_string(),
    };
    
    limiter
        .check_key(&key)
        .map_err(|_| ApiError::too_many_requests("Rate limit exceeded"))?;
    
    Ok(next.run(request).await)
}
}
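
Constructing the keyed limiter and attaching the middleware might look like this (a sketch: the quota is illustrative, and state, list_tasks, and the router pieces come from earlier sections):

#![allow(unused)]
fn main() {
use std::num::NonZeroU32;
use std::sync::Arc;
use governor::{DefaultKeyedRateLimiter, Quota, RateLimiter};

// 120 requests per minute per key.
let limiter: Arc<DefaultKeyedRateLimiter<String>> = Arc::new(RateLimiter::keyed(
    Quota::per_minute(NonZeroU32::new(120).unwrap()),
));

// Wiring (sketch):
// let app = Router::new()
//     .route("/api/v1/tasks", get(list_tasks))
//     .layer(middleware::from_fn_with_state(limiter.clone(), rate_limit_middleware))
//     .with_state(state);
}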

Testing Query Endpoints

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    #[tokio::test]
    async fn test_pagination() {
        let state = create_test_state_with_tasks(100).await;
        
        // First page
        let response = list_tasks(
            State(state.clone()),
            QueryParams(ListTasksQuery {
                page: 1,
                page_size: 20,
                ..Default::default()
            }),
        ).await.unwrap();
        
        assert_eq!(response.0.tasks.len(), 20);
        assert_eq!(response.0.pagination.total_items, 100);
        assert_eq!(response.0.pagination.total_pages, 5);
        
        // Last page
        let response = list_tasks(
            State(state),
            QueryParams(ListTasksQuery {
                page: 5,
                page_size: 20,
                ..Default::default()
            }),
        ).await.unwrap();
        
        assert_eq!(response.0.tasks.len(), 20);
    }
    
    #[tokio::test]
    async fn test_caching_headers() {
        let state = create_test_state().await;
        
        let response = get_public_statistics(
            State(state),
            HeaderMap::new(),
        ).await;
        
        assert_eq!(response.status(), StatusCode::OK);
        assert!(response.headers().contains_key(ETAG));
        assert!(response.headers().contains_key(CACHE_CONTROL));
        
        let cache_control = response.headers()
            .get(CACHE_CONTROL)
            .unwrap()
            .to_str()
            .unwrap();
        
        assert!(cache_control.contains("max-age=300"));
    }
}
}

Best Practices

  1. Use projections - Don’t query event streams directly
  2. Paginate results - Never return unbounded lists
  3. Cache aggressively - Read queries are perfect for caching
  4. Validate query parameters - Prevent resource exhaustion
  5. Monitor performance - Track slow queries
  6. Use appropriate protocols - REST for simple, GraphQL for complex
  7. Implement authorization - Check permissions for all queries
  8. Version your API - Queries can evolve independently

Summary

Query endpoints in EventCore applications:

  • Projection-based - Read from optimized projections
  • Performant - Caching and optimization built-in
  • Flexible - Support REST, GraphQL, and real-time
  • Secure - Authorization and rate limiting
  • Testable - Easy to test in isolation

Key patterns:

  1. Read from projections, not event streams
  2. Implement proper pagination
  3. Cache responses appropriately
  4. Validate and limit query complexity
  5. Authorize access to data
  6. Monitor query performance

Next, let’s explore Authentication and Authorization

Chapter 4.4: Authentication and Authorization

Security is critical for event-sourced systems. This chapter covers authentication (who you are) and authorization (what you can do) patterns for EventCore APIs.

Authentication Strategies

JWT Authentication

JSON Web Tokens are stateless and work well with EventCore:

#![allow(unused)]
fn main() {
use jsonwebtoken::{encode, decode, Header, Algorithm, Validation, EncodingKey, DecodingKey};
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct Claims {
    sub: String,          // Subject (user ID)
    exp: usize,           // Expiration time
    iat: usize,           // Issued at
    iss: String,          // Issuer (checked by validate_token below)
    aud: String,          // Audience (checked by validate_token below)
    roles: Vec<String>,   // User roles
    permissions: Vec<String>, // Specific permissions
}

#[derive(Clone)]
struct JwtConfig {
    secret: String,
    issuer: String,
    audience: String,
    access_token_duration: Duration,
    refresh_token_duration: Duration,
}

impl JwtConfig {
    fn create_access_token(&self, user: &User) -> Result<String, ApiError> {
        let now = Utc::now();
        let exp = now + self.access_token_duration;
        
        let claims = Claims {
            sub: user.id.to_string(),
            exp: exp.timestamp() as usize,
            iat: now.timestamp() as usize,
            iss: self.issuer.clone(),
            aud: self.audience.clone(),
            roles: user.roles.clone(),
            permissions: user.permissions.clone(),
        };
        
        encode(
            &Header::default(),
            &claims,
            &EncodingKey::from_secret(self.secret.as_ref()),
        )
        .map_err(|_| ApiError::internal("Failed to create token"))
    }
    
    fn validate_token(&self, token: &str) -> Result<Claims, ApiError> {
        let mut validation = Validation::new(Algorithm::HS256);
        validation.set_issuer(&[&self.issuer]);
        validation.set_audience(&[&self.audience]);
        
        decode::<Claims>(
            token,
            &DecodingKey::from_secret(self.secret.as_ref()),
            &validation,
        )
        .map(|data| data.claims)
        .map_err(|e| match e.kind() {
            jsonwebtoken::errors::ErrorKind::ExpiredSignature => {
                ApiError::unauthorized("Token expired")
            }
            _ => ApiError::unauthorized("Invalid token"),
        })
    }
}
}
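
A quick round trip through these helpers (a sketch; the User type comes from the surrounding examples and the config values from application configuration):

#![allow(unused)]
fn main() {
fn issue_and_validate(config: &JwtConfig, user: &User) -> Result<(), ApiError> {
    // Issue a short-lived access token for the user...
    let token = config.create_access_token(user)?;

    // ...and validate it the same way the authentication middleware will.
    let claims = config.validate_token(&token)?;
    assert_eq!(claims.sub, user.id.to_string());

    Ok(())
}
}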

Login Endpoint

#![allow(unused)]
fn main() {
#[derive(Debug, Deserialize)]
struct LoginRequest {
    email: String,
    password: String,
}

#[derive(Debug, Serialize)]
struct LoginResponse {
    access_token: String,
    refresh_token: String,
    token_type: String,
    expires_in: u64,
}

async fn login(
    State(state): State<AppState>,
    Json(request): Json<LoginRequest>,
) -> Result<Json<LoginResponse>, ApiError> {
    // Validate credentials
    let email = Email::try_new(request.email)
        .map_err(|_| ApiError::bad_request("Invalid email"))?;
    
    // Execute authentication command
    let command = AuthenticateUser {
        email: email.clone(),
        password: Password::from(request.password),
    };
    
    let result = state.executor
        .execute(&command)
        .await
        .map_err(|_| ApiError::unauthorized("Invalid credentials"))?;
    
    // Get user from projection
    let user = state.projections
        .read()
        .await
        .get::<UserProjection>()
        .unwrap()
        .get_user_by_email(&email)
        .await?
        .ok_or_else(|| ApiError::unauthorized("Invalid credentials"))?;
    
    // Create tokens
    let access_token = state.jwt_config.create_access_token(&user)?;
    let refresh_token = state.jwt_config.create_refresh_token(&user)?;
    
    // Store refresh token (for revocation)
    let store_command = StoreRefreshToken {
        user_id: user.id.clone(),
        token_hash: hash_token(&refresh_token),
        expires_at: Utc::now() + state.jwt_config.refresh_token_duration,
    };
    
    state.executor.execute(&store_command).await?;
    
    Ok(Json(LoginResponse {
        access_token,
        refresh_token,
        token_type: "Bearer".to_string(),
        expires_in: state.jwt_config.access_token_duration.as_secs(),
    }))
}
}
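
The login handler stores a refresh token hash, but the exchange back to a new access token is not shown. A minimal sketch of a refresh endpoint, assuming a hypothetical validate_refresh_token helper that checks the stored hash and revocation state and returns the owning user:

#![allow(unused)]
fn main() {
#[derive(Debug, Deserialize)]
struct RefreshRequest {
    refresh_token: String,
}

async fn refresh_token(
    State(state): State<AppState>,
    Json(request): Json<RefreshRequest>,
) -> Result<Json<LoginResponse>, ApiError> {
    // Hypothetical helper: looks up the token hash, rejects revoked/expired tokens,
    // and returns the owning user
    let user = validate_refresh_token(&state, &request.refresh_token)
        .await?
        .ok_or_else(|| ApiError::unauthorized("Invalid refresh token"))?;

    // Issue a fresh access token; rotating the refresh token here is also common
    let access_token = state.jwt_config.create_access_token(&user)?;

    Ok(Json(LoginResponse {
        access_token,
        refresh_token: request.refresh_token,
        token_type: "Bearer".to_string(),
        expires_in: state.jwt_config.access_token_duration.as_secs(),
    }))
}
}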

Authentication Middleware

#![allow(unused)]
fn main() {
use axum::{
    extract::{FromRequestParts, Request},
    middleware::{self, Next},
    response::Response,
};

#[derive(Debug, Clone)]
pub struct AuthenticatedUser {
    pub id: UserId,
    pub roles: Vec<String>,
    pub permissions: Vec<String>,
}

#[async_trait]
impl<S> FromRequestParts<S> for AuthenticatedUser
where
    S: Send + Sync,
{
    type Rejection = ApiError;

    async fn from_request_parts(
        parts: &mut http::request::Parts,
        state: &S,
    ) -> Result<Self, Self::Rejection> {
        // Get JWT config from extensions (set by middleware)
        let jwt_config = parts
            .extensions
            .get::<JwtConfig>()
            .ok_or_else(|| ApiError::internal("JWT config not found"))?;
        
        // Extract token from Authorization header
        let token = extract_bearer_token(&parts.headers)?;
        
        // Validate token
        let claims = jwt_config.validate_token(token)?;
        
        Ok(AuthenticatedUser {
            id: UserId::try_new(claims.sub)?,
            roles: claims.roles,
            permissions: claims.permissions,
        })
    }
}

fn extract_bearer_token(headers: &HeaderMap) -> Result<&str, ApiError> {
    headers
        .get(AUTHORIZATION)
        .and_then(|v| v.to_str().ok())
        .and_then(|v| v.strip_prefix("Bearer "))
        .ok_or_else(|| ApiError::unauthorized("Missing or invalid Authorization header"))
}

// Optional authentication extractor
pub struct OptionalAuth(pub Option<AuthenticatedUser>);

#[async_trait]
impl<S> FromRequestParts<S> for OptionalAuth
where
    S: Send + Sync,
{
    type Rejection = Infallible;

    async fn from_request_parts(
        parts: &mut http::request::Parts,
        state: &S,
    ) -> Result<Self, Self::Rejection> {
        Ok(OptionalAuth(
            AuthenticatedUser::from_request_parts(parts, state)
                .await
                .ok()
        ))
    }
}
}

Authorization Patterns

Role-Based Access Control (RBAC)

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Role {
    Admin,
    Manager,
    Employee,
    Guest,
}

impl AuthenticatedUser {
    pub fn has_role(&self, role: &str) -> bool {
        self.roles.contains(&role.to_string())
    }
    
    pub fn has_any_role(&self, roles: &[&str]) -> bool {
        roles.iter().any(|role| self.has_role(role))
    }
    
    pub fn has_all_roles(&self, roles: &[&str]) -> bool {
        roles.iter().all(|role| self.has_role(role))
    }
}

// Authorization guard
async fn require_role(
    user: &AuthenticatedUser,
    role: &str,
) -> Result<(), ApiError> {
    if !user.has_role(role) {
        return Err(ApiError::forbidden(
            format!("Requires {} role", role)
        ));
    }
    Ok(())
}

// In handlers
async fn admin_endpoint(
    user: AuthenticatedUser,
    // other params...
) -> Result<Json<AdminData>, ApiError> {
    require_role(&user, "admin").await?;
    
    // Admin-only logic...
}
}

Permission-Based Access Control

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Permission {
    // Task permissions
    CreateTask,
    ReadTask,
    UpdateTask,
    DeleteTask,
    AssignTask,
    
    // User permissions
    CreateUser,
    ReadUser,
    UpdateUser,
    DeleteUser,
    
    // Admin permissions
    ViewAnalytics,
    ManageSystem,
}

impl AuthenticatedUser {
    pub fn has_permission(&self, permission: &str) -> bool {
        self.permissions.contains(&permission.to_string())
    }
    
    pub fn can(&self, action: Permission) -> bool {
        // Uses the variant's Debug name; a custom Display impl works equally well
        self.has_permission(&format!("{:?}", action))
    }
}

// Permission checking in handlers
async fn create_task_handler(
    user: AuthenticatedUser,
    Json(request): Json<CreateTaskRequest>,
) -> Result<Json<CreateTaskResponse>, ApiError> {
    if !user.can(Permission::CreateTask) {
        return Err(ApiError::forbidden("Cannot create tasks"));
    }
    
    // Create task...
}
}

Resource-Based Access Control

#![allow(unused)]
fn main() {
#[async_trait]
trait ResourceAuthorizer {
    async fn can_read(&self, user: &AuthenticatedUser, resource_id: &str) -> bool;
    async fn can_write(&self, user: &AuthenticatedUser, resource_id: &str) -> bool;
    async fn can_delete(&self, user: &AuthenticatedUser, resource_id: &str) -> bool;
}

struct TaskAuthorizer {
    projection: Arc<TaskProjection>,
}

#[async_trait]
impl ResourceAuthorizer for TaskAuthorizer {
    async fn can_read(&self, user: &AuthenticatedUser, task_id: &str) -> bool {
        // Admins can read all
        if user.has_role("admin") {
            return true;
        }
        
        // Check if user owns or is assigned to task
        if let Ok(Some(task)) = self.projection.get_task(task_id).await {
            return task.created_by == user.id || 
                   task.assigned_to == Some(user.id.clone());
        }
        
        false
    }
    
    async fn can_write(&self, user: &AuthenticatedUser, task_id: &str) -> bool {
        // Similar logic for write permissions
        if user.has_role("admin") || user.has_role("manager") {
            return true;
        }
        
        // Check ownership or assignment
        if let Ok(Some(task)) = self.projection.get_task(task_id).await {
            return task.assigned_to == Some(user.id.clone());
        }
        
        false
    }
    
    async fn can_delete(&self, user: &AuthenticatedUser, task_id: &str) -> bool {
        // Only admins and creators can delete
        if user.has_role("admin") {
            return true;
        }
        
        if let Ok(Some(task)) = self.projection.get_task(task_id).await {
            return task.created_by == user.id;
        }
        
        false
    }
}
}
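
In handlers, the authorizer runs before any data is returned. A sketch, assuming the TaskAuthorizer is exposed through application state and the usual Path/State extractors (the wiring is hypothetical):

#![allow(unused)]
fn main() {
async fn get_task_endpoint(
    user: AuthenticatedUser,
    Path(task_id): Path<String>,
    State(authorizer): State<Arc<TaskAuthorizer>>,
) -> Result<Json<TaskResponse>, ApiError> {
    // Deny early if the user may not read this task
    if !authorizer.can_read(&user, &task_id).await {
        return Err(ApiError::forbidden("Cannot read this task"));
    }

    // Load the task from the projection and return it...
}
}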

Command Authorization

Embed authorization in commands:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct UpdateTask {
    #[stream]
    task_id: StreamId,
    
    title: Option<TaskTitle>,
    description: Option<TaskDescription>,
    
    // Who is making the change
    updated_by: UserId,
}

impl CommandLogic for UpdateTask {
    // ... other implementations
    
    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Check authorization within command
        require!(
            state.can_user_update_task(&self.updated_by),
            "User {} cannot update task {}",
            self.updated_by,
            self.task_id
        );
        
        // Proceed with update...
    }
}

// State includes authorization data
impl TaskState {
    fn can_user_update_task(&self, user_id: &UserId) -> bool {
        // Task creator can always update
        if self.created_by == *user_id {
            return true;
        }
        
        // Assigned user can update
        if self.assigned_to == Some(user_id.clone()) {
            return true;
        }
        
        // Check roles (would need to be passed in state)
        false
    }
}
}

API Key Authentication

For service-to-service communication:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
struct ApiKey {
    key: String,
    service_name: String,
    permissions: Vec<String>,
    rate_limit: Option<u32>,
    expires_at: DateTime<Utc>, // checked in validate_api_key below
}

#[async_trait]
impl<S> FromRequestParts<S> for ApiKey
where
    S: Send + Sync,
{
    type Rejection = ApiError;

    async fn from_request_parts(
        parts: &mut http::request::Parts,
        state: &S,
    ) -> Result<Self, Self::Rejection> {
        let key = parts
            .headers
            .get("X-API-Key")
            .and_then(|v| v.to_str().ok())
            .ok_or_else(|| ApiError::unauthorized("Missing API key"))?;
        
        // Look up API key (from cache/database)
        let api_key = validate_api_key(key).await?;
        
        Ok(api_key)
    }
}

async fn validate_api_key(key: &str) -> Result<ApiKey, ApiError> {
    // Hash the key for lookup
    let key_hash = hash_api_key(key);
    
    // Look up in projection/cache
    let api_key = get_api_key_by_hash(&key_hash)
        .await?
        .ok_or_else(|| ApiError::unauthorized("Invalid API key"))?;
    
    // Check if expired
    if api_key.expires_at < Utc::now() {
        return Err(ApiError::unauthorized("API key expired"));
    }
    
    Ok(api_key)
}
}
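
The hash_api_key and get_api_key_by_hash helpers above are assumed. One possible hash_api_key, sketched with the sha2 and hex crates so only a digest is ever stored or compared:

#![allow(unused)]
fn main() {
use sha2::{Digest, Sha256};

fn hash_api_key(key: &str) -> String {
    // Hash the raw key; the lookup table stores only these digests
    let digest = Sha256::digest(key.as_bytes());
    hex::encode(digest)
}
}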

OAuth2 Integration

For third-party authentication:

#![allow(unused)]
fn main() {
use oauth2::{
    basic::BasicClient,
    AuthorizationCode, AuthUrl, ClientId, ClientSecret, CsrfToken,
    PkceCodeChallenge, RedirectUrl, Scope, TokenResponse, TokenUrl,
};

#[derive(Clone)]
struct OAuth2Config {
    client_id: ClientId,
    client_secret: ClientSecret,
    auth_url: AuthUrl,
    token_url: TokenUrl,
    redirect_url: RedirectUrl,
}

async fn oauth_login(
    State(oauth): State<OAuth2Config>,
    Query(params): Query<HashMap<String, String>>,
) -> Result<Redirect, ApiError> {
    let client = BasicClient::new(
        oauth.client_id,
        Some(oauth.client_secret),
        oauth.auth_url,
        Some(oauth.token_url),
    )
    .set_redirect_uri(oauth.redirect_url);
    
    // Generate PKCE challenge
    let (pkce_challenge, pkce_verifier) = PkceCodeChallenge::new_random_sha256();
    
    // Generate authorization URL
    let (auth_url, csrf_token) = client
        .authorize_url(CsrfToken::new_random)
        .add_scope(Scope::new("read:user".to_string()))
        .set_pkce_challenge(pkce_challenge)
        .url();
    
    // Store CSRF token and PKCE verifier (in session/cache)
    store_oauth_state(&csrf_token, &pkce_verifier).await?;
    
    Ok(Redirect::to(auth_url.as_str()))
}

async fn oauth_callback(
    State(state): State<AppState>,
    Query(params): Query<OAuthCallbackParams>,
) -> Result<Json<LoginResponse>, ApiError> {
    // Verify CSRF token
    let (stored_csrf, pkce_verifier) = get_oauth_state(&params.state).await?;
    
    if stored_csrf != params.state {
        return Err(ApiError::bad_request("Invalid state parameter"));
    }
    
    // Exchange code for token
    let token_result = exchange_code_for_token(
        &state.oauth_config,
        &params.code,
        &pkce_verifier,
    ).await?;
    
    // Get user info from provider
    let user_info = fetch_user_info(&token_result.access_token()).await?;
    
    // Create or update user in EventCore
    let command = CreateOrUpdateOAuthUser {
        provider: "github".to_string(),
        provider_user_id: user_info.id,
        email: user_info.email,
        name: user_info.name,
    };
    
    state.executor.execute(&command).await?;
    
    // Create JWT tokens
    let user = get_user_by_email(&user_info.email).await?;
    let access_token = state.jwt_config.create_access_token(&user)?;
    
    Ok(Json(LoginResponse {
        access_token,
        // ... other fields
    }))
}
}

Session Management

Track active sessions:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct CreateSession {
    #[stream]
    user_id: StreamId,
    
    #[stream]
    session_id: StreamId,
    
    ip_address: IpAddr,
    user_agent: String,
    expires_at: DateTime<Utc>,
}

#[derive(Command, Clone)]
struct RevokeSession {
    #[stream]
    session_id: StreamId,
    
    #[stream]
    user_id: StreamId,
    
    reason: RevocationReason,
}

// Session validation middleware
async fn validate_session(
    State(state): State<AppState>,
    user: AuthenticatedUser,
    request: Request,
    next: Next,
) -> Result<Response, ApiError> {
    let session_id = extract_session_id(&request)?;
    
    // Check if session is valid
    let session = state.projections
        .read()
        .await
        .get::<SessionProjection>()
        .unwrap()
        .get_session(&session_id)
        .await?
        .ok_or_else(|| ApiError::unauthorized("Invalid session"))?;
    
    // Verify session belongs to user
    if session.user_id != user.id {
        return Err(ApiError::unauthorized("Session mismatch"));
    }
    
    // Check expiration
    if session.expires_at < Utc::now() {
        return Err(ApiError::unauthorized("Session expired"));
    }
    
    // Check if revoked
    if session.revoked {
        return Err(ApiError::unauthorized("Session revoked"));
    }
    
    Ok(next.run(request).await)
}
}
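
extract_session_id is not shown above. A minimal sketch that reads a hypothetical X-Session-Id header (a cookie would work the same way); SessionId::try_new is an assumed validated newtype like the other IDs in this guide:

#![allow(unused)]
fn main() {
fn extract_session_id(request: &Request) -> Result<SessionId, ApiError> {
    request
        .headers()
        .get("X-Session-Id")
        .and_then(|v| v.to_str().ok())
        .and_then(|v| SessionId::try_new(v.to_string()).ok())
        .ok_or_else(|| ApiError::unauthorized("Missing or invalid session ID"))
}
}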

Security Headers

Add security headers to all responses:

#![allow(unused)]
fn main() {
async fn security_headers_middleware(
    request: Request,
    next: Next,
) -> Response {
    let mut response = next.run(request).await;
    
    let headers = response.headers_mut();
    
    // Prevent clickjacking
    headers.insert(
        "X-Frame-Options",
        HeaderValue::from_static("DENY"),
    );
    
    // Prevent MIME type sniffing
    headers.insert(
        "X-Content-Type-Options",
        HeaderValue::from_static("nosniff"),
    );
    
    // CSP
    headers.insert(
        "Content-Security-Policy",
        HeaderValue::from_static("default-src 'self'"),
    );
    
    // HSTS
    headers.insert(
        "Strict-Transport-Security",
        HeaderValue::from_static("max-age=31536000; includeSubDomains"),
    );
    
    response
}
}
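
Attach the middleware as a router layer so every response gets the headers (a sketch; create_routes stands in for your router construction):

#![allow(unused)]
fn main() {
let app = create_routes()
    .layer(middleware::from_fn(security_headers_middleware));
}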

Testing Authentication

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    fn create_test_token(user_id: &str, roles: Vec<&str>) -> String {
        let claims = Claims {
            sub: user_id.to_string(),
            exp: (Utc::now() + Duration::hours(1)).timestamp() as usize,
            iat: Utc::now().timestamp() as usize,
            roles: roles.into_iter().map(|s| s.to_string()).collect(),
            permissions: vec![],
        };
        
        encode(
            &Header::default(),
            &claims,
            &EncodingKey::from_secret(TEST_SECRET.as_ref()),
        ).unwrap()
    }
    
    #[tokio::test]
    async fn test_authentication_required() {
        let app = create_test_app();
        
        // No token
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/api/v1/protected")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        
        assert_eq!(response.status(), StatusCode::UNAUTHORIZED);
    }
    
    #[tokio::test]
    async fn test_role_authorization() {
        let app = create_test_app();
        
        // User token without admin role
        let token = create_test_token("user123", vec!["user"]);
        
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/api/v1/admin/users")
                    .header("Authorization", format!("Bearer {}", token))
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        
        assert_eq!(response.status(), StatusCode::FORBIDDEN);
    }
}
}

Best Practices

  1. Use HTTPS always - Never send tokens over unencrypted connections
  2. Short token lifetimes - Access tokens should expire quickly
  3. Refresh tokens - Use refresh tokens for long-lived sessions
  4. Store hashes - Never store plaintext tokens or passwords
  5. Audit everything - Log all authentication/authorization events
  6. Principle of least privilege - Grant minimal necessary permissions
  7. Defense in depth - Layer multiple security mechanisms
  8. Regular reviews - Audit permissions and access regularly

Summary

Authentication and authorization in EventCore:

  • Flexible strategies - JWT, API keys, OAuth2
  • Strong typing - Type-safe user and permission models
  • Event sourced - Authentication events provide audit trail
  • Performance - Caching for fast authorization checks
  • Testable - Easy to test security rules

Key patterns:

  1. Authenticate early in the request pipeline
  2. Embed authorization in commands
  3. Use projections for fast permission lookups
  4. Audit all security events
  5. Test security thoroughly

Next, let’s explore API Versioning

Chapter 4.5: API Versioning

APIs evolve over time. This chapter covers strategies for versioning your EventCore APIs while maintaining backward compatibility and providing a smooth migration path for clients.

Versioning Strategies

URL Path Versioning

The most explicit and commonly used approach:

#![allow(unused)]
fn main() {
use axum::{Router, routing::post};

fn create_versioned_routes() -> Router {
    Router::new()
        // Version 1 endpoints
        .nest("/api/v1", v1_routes())
        // Version 2 endpoints
        .nest("/api/v2", v2_routes())
        // Latest version alias (optional)
        .nest("/api/latest", v2_routes())
}

fn v1_routes() -> Router {
    Router::new()
        .route("/tasks", post(v1::create_task))
        .route("/tasks/:id", get(v1::get_task))
        .route("/tasks/:id/assign", post(v1::assign_task))
}

fn v2_routes() -> Router {
    Router::new()
        .route("/tasks", post(v2::create_task))
        .route("/tasks/:id", get(v2::get_task))
        .route("/tasks/:id/assign", post(v2::assign_task))
        // New in v2
        .route("/tasks/:id/subtasks", get(v2::get_subtasks))
        .route("/tasks/bulk", post(v2::bulk_create_tasks))
}
}

Header-Based Versioning

More RESTful but less discoverable:

#![allow(unused)]
fn main() {
use axum::{
    extract::{FromRequestParts, Request},
    http::HeaderValue,
};

#[derive(Debug, Clone, Copy)]
enum ApiVersion {
    V1,
    V2,
}

impl Default for ApiVersion {
    fn default() -> Self {
        ApiVersion::V2 // Latest version
    }
}

#[async_trait]
impl<S> FromRequestParts<S> for ApiVersion
where
    S: Send + Sync,
{
    type Rejection = ApiError;

    async fn from_request_parts(
        parts: &mut http::request::Parts,
        _state: &S,
    ) -> Result<Self, Self::Rejection> {
        let version = parts
            .headers
            .get("API-Version")
            .and_then(|v| v.to_str().ok())
            .map(|v| match v {
                "1" | "v1" => ApiVersion::V1,
                "2" | "v2" => ApiVersion::V2,
                _ => ApiVersion::default(),
            })
            .unwrap_or_default();
        
        Ok(version)
    }
}

// Use in handlers
async fn create_task(
    version: ApiVersion,
    Json(request): Json<serde_json::Value>,
) -> Result<Response, ApiError> {
    match version {
        ApiVersion::V1 => v1::create_task_handler(request).await,
        ApiVersion::V2 => v2::create_task_handler(request).await,
    }
}
}

Content Type Versioning

Using vendor-specific media types:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
enum ContentVersion {
    V1,
    V2,
}

impl ContentVersion {
    fn from_content_type(content_type: &str) -> Self {
        if content_type.contains("vnd.eventcore.v1+json") {
            ContentVersion::V1
        } else if content_type.contains("vnd.eventcore.v2+json") {
            ContentVersion::V2
        } else {
            ContentVersion::V2 // Default to latest
        }
    }
    
    fn to_content_type(&self) -> &'static str {
        match self {
            ContentVersion::V1 => "application/vnd.eventcore.v1+json",
            ContentVersion::V2 => "application/vnd.eventcore.v2+json",
        }
    }
}
}
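
Wiring this in means reading the Accept (or Content-Type) header, choosing the version, and replying with the matching media type. A sketch, where load_v1_task and load_v2_task are hypothetical lookups:

#![allow(unused)]
fn main() {
async fn get_task_negotiated(
    headers: HeaderMap,
    Path(id): Path<String>,
) -> Result<Response, ApiError> {
    // Pick the representation the client asked for, defaulting to the latest
    let version = headers
        .get(ACCEPT)
        .and_then(|v| v.to_str().ok())
        .map(ContentVersion::from_content_type)
        .unwrap_or(ContentVersion::V2);

    let body = match version {
        ContentVersion::V1 => serde_json::to_vec(&load_v1_task(&id).await?)?,
        ContentVersion::V2 => serde_json::to_vec(&load_v2_task(&id).await?)?,
    };

    // Echo the chosen vendor media type back to the client
    Ok(([(CONTENT_TYPE, version.to_content_type())], body).into_response())
}
}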

Request/Response Evolution

Backward Compatible Changes

These changes don’t require a new version:

#![allow(unused)]
fn main() {
// Original V1 request
#[derive(Debug, Deserialize)]
struct CreateTaskRequestV1 {
    title: String,
    description: String,
}

// Backward compatible V1 with optional field
#[derive(Debug, Deserialize)]
struct CreateTaskRequestV1Enhanced {
    title: String,
    description: String,
    #[serde(default)]
    priority: Option<Priority>, // New optional field
}

// Response expansion is also backward compatible
#[derive(Debug, Serialize)]
struct TaskResponseV1 {
    id: String,
    title: String,
    description: String,
    created_at: DateTime<Utc>,
    #[serde(skip_serializing_if = "Option::is_none")]
    priority: Option<Priority>, // New optional field
}
}

Breaking Changes

These require a new API version:

#![allow(unused)]
fn main() {
mod v1 {
    #[derive(Debug, Deserialize)]
    struct CreateTaskRequest {
        title: String,
        description: String,
        assigned_to: String, // Single assignee
    }
}

mod v2 {
    #[derive(Debug, Deserialize)]
    struct CreateTaskRequest {
        title: String,
        description: String,
        assigned_to: Vec<String>, // Breaking: Now multiple assignees
        #[serde(default)]
        tags: Vec<String>, // New field
    }
}

// Adapter to support both versions
async fn create_task_adapter(
    version: ApiVersion,
    Json(value): Json<serde_json::Value>,
) -> Result<Json<TaskResponse>, ApiError> {
    match version {
        ApiVersion::V1 => {
            let request: v1::CreateTaskRequest = serde_json::from_value(value)?;
            // Convert V1 to internal command
            let command = CreateTask {
                title: request.title,
                description: request.description,
                assigned_to: vec![request.assigned_to], // Adapt single to vec
                tags: vec![], // Default for V1
            };
            execute_create_task(command).await
        }
        ApiVersion::V2 => {
            let request: v2::CreateTaskRequest = serde_json::from_value(value)?;
            let command = CreateTask {
                title: request.title,
                description: request.description,
                assigned_to: request.assigned_to,
                tags: request.tags,
            };
            execute_create_task(command).await
        }
    }
}
}

Command Versioning

Version commands to handle different API versions:

#![allow(unused)]
fn main() {
// Internal command representation (latest version)
#[derive(Command, Clone)]
struct CreateTask {
    #[stream]
    task_id: StreamId,
    
    title: TaskTitle,
    description: TaskDescription,
    assigned_to: Vec<UserId>,
    tags: Vec<Tag>,
    priority: Priority,
}

// Version-specific command builders
mod command_builders {
    use super::*;
    
    pub fn from_v1_request(req: v1::CreateTaskRequest) -> Result<CreateTask, ApiError> {
        Ok(CreateTask {
            task_id: StreamId::from(format!("task-{}", TaskId::new())),
            title: TaskTitle::try_new(req.title)?,
            description: TaskDescription::try_new(req.description)?,
            assigned_to: vec![UserId::try_new(req.assigned_to)?],
            tags: vec![], // V1 doesn't support tags
            priority: Priority::Normal, // Default for V1
        })
    }
    
    pub fn from_v2_request(req: v2::CreateTaskRequest) -> Result<CreateTask, ApiError> {
        Ok(CreateTask {
            task_id: StreamId::from(format!("task-{}", TaskId::new())),
            title: TaskTitle::try_new(req.title)?,
            description: TaskDescription::try_new(req.description)?,
            assigned_to: req.assigned_to
                .into_iter()
                .map(|a| UserId::try_new(a))
                .collect::<Result<Vec<_>, _>>()?,
            tags: req.tags
                .into_iter()
                .map(|t| Tag::try_new(t))
                .collect::<Result<Vec<_>, _>>()?,
            priority: req.priority.unwrap_or(Priority::Normal),
        })
    }
}
}

Response Transformation

Transform internal data to version-specific responses:

#![allow(unused)]
fn main() {
// Internal projection data
#[derive(Debug, Clone)]
struct TaskData {
    id: TaskId,
    title: String,
    description: String,
    assigned_to: Vec<UserId>,
    tags: Vec<Tag>,
    priority: Priority,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
    subtasks: Vec<SubtaskData>, // Added in V2
}

// Response transformers
mod response_transformers {
    use super::*;
    
    pub fn to_v1_response(task: TaskData) -> v1::TaskResponse {
        v1::TaskResponse {
            id: task.id.to_string(),
            title: task.title,
            description: task.description,
            assigned_to: task.assigned_to.first()
                .map(|u| u.to_string())
                .unwrap_or_default(), // V1 only supports single assignee
            created_at: task.created_at,
            updated_at: task.updated_at,
        }
    }
    
    pub fn to_v2_response(task: TaskData) -> v2::TaskResponse {
        v2::TaskResponse {
            id: task.id.to_string(),
            title: task.title,
            description: task.description,
            assigned_to: task.assigned_to
                .into_iter()
                .map(|u| u.to_string())
                .collect(),
            tags: task.tags
                .into_iter()
                .map(|t| t.to_string())
                .collect(),
            priority: task.priority,
            created_at: task.created_at,
            updated_at: task.updated_at,
            subtask_count: task.subtasks.len(),
            _links: v2::Links {
                self_: format!("/api/v2/tasks/{}", task.id),
                subtasks: format!("/api/v2/tasks/{}/subtasks", task.id),
            },
        }
    }
}
}

Deprecation Strategy

Communicate deprecation clearly:

#![allow(unused)]
fn main() {
async fn deprecated_middleware(
    request: Request,
    next: Next,
) -> Response {
    let mut response = next.run(request).await;
    
    // Add deprecation headers
    response.headers_mut().insert(
        "Sunset",
        HeaderValue::from_static("Sat, 31 Dec 2024 23:59:59 GMT"),
    );
    
    response.headers_mut().insert(
        "Deprecation",
        HeaderValue::from_static("true"),
    );
    
    response.headers_mut().insert(
        "Link",
        HeaderValue::from_static(
            "</api/v2/docs>; rel=\"successor-version\""
        ),
    );
    
    response
}

// Apply to V1 routes
let v1_routes = Router::new()
    .route("/tasks", post(v1::create_task))
    .layer(middleware::from_fn(deprecated_middleware));
}

Deprecation Notices in Responses

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize)]
struct DeprecatedResponse<T> {
    #[serde(flatten)]
    data: T,
    _deprecation: DeprecationNotice,
}

#[derive(Debug, Serialize)]
struct DeprecationNotice {
    message: &'static str,
    sunset_date: &'static str,
    migration_guide: &'static str,
}

impl<T> DeprecatedResponse<T> {
    fn new(data: T) -> Self {
        Self {
            data,
            _deprecation: DeprecationNotice {
                message: "This API version is deprecated",
                sunset_date: "2024-12-31",
                migration_guide: "https://docs.eventcore.io/migration/v1-to-v2",
            },
        }
    }
}
}

Version Discovery

Help clients discover available versions:

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize)]
struct ApiVersionInfo {
    version: String,
    status: VersionStatus,
    deprecated: bool,
    sunset_date: Option<String>,
    endpoints: Vec<EndpointInfo>,
}

#[derive(Debug, Serialize)]
#[serde(rename_all = "lowercase")]
enum VersionStatus {
    Stable,
    Beta,
    Deprecated,
    Sunset,
}

async fn get_api_versions() -> Json<Vec<ApiVersionInfo>> {
    Json(vec![
        ApiVersionInfo {
            version: "v1".to_string(),
            status: VersionStatus::Deprecated,
            deprecated: true,
            sunset_date: Some("2024-12-31".to_string()),
            endpoints: vec![
                EndpointInfo {
                    path: "/api/v1/tasks",
                    methods: vec!["GET", "POST"],
                },
                // ... other endpoints
            ],
        },
        ApiVersionInfo {
            version: "v2".to_string(),
            status: VersionStatus::Stable,
            deprecated: false,
            sunset_date: None,
            endpoints: vec![
                EndpointInfo {
                    path: "/api/v2/tasks",
                    methods: vec!["GET", "POST"],
                },
                EndpointInfo {
                    path: "/api/v2/tasks/bulk",
                    methods: vec!["POST"],
                },
                // ... other endpoints
            ],
        },
    ])
}
}
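
EndpointInfo is referenced above but not defined in this chapter; a minimal definition consistent with how it is used:

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize)]
struct EndpointInfo {
    path: &'static str,
    methods: Vec<&'static str>,
}
}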

Migration Support

Help clients migrate between versions:

#![allow(unused)]
fn main() {
// Migration endpoint that accepts V1 format and returns V2
async fn migrate_task_format(
    Json(v1_task): Json<v1::TaskResponse>,
) -> Result<Json<v2::TaskResponse>, ApiError> {
    // Transform V1 to V2 format
    let v2_task = v2::TaskResponse {
        id: v1_task.id,
        title: v1_task.title,
        description: v1_task.description,
        assigned_to: vec![v1_task.assigned_to], // Convert single to array
        tags: vec![], // Default empty
        priority: Priority::Normal, // Default
        created_at: v1_task.created_at,
        updated_at: v1_task.updated_at,
        subtask_count: 0, // Default
        _links: v2::Links {
            self_: format!("/api/v2/tasks/{}", v1_task.id),
            subtasks: format!("/api/v2/tasks/{}/subtasks", v1_task.id),
        },
    };
    
    Ok(Json(v2_task))
}

// Bulk migration endpoint
async fn migrate_tasks_bulk(
    Json(request): Json<BulkMigrationRequest>,
) -> Result<Json<BulkMigrationResponse>, ApiError> {
    let mut migrated = Vec::new();
    let mut errors = Vec::new();
    
    for task_id in request.task_ids {
        match migrate_single_task(&task_id).await {
            Ok(task) => migrated.push(task),
            Err(e) => errors.push(MigrationError {
                task_id,
                error: e.to_string(),
            }),
        }
    }
    
    Ok(Json(BulkMigrationResponse {
        migrated_count: migrated.len(),
        error_count: errors.len(),
        errors: if errors.is_empty() { None } else { Some(errors) },
    }))
}
}
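
The bulk migration types referenced above are not defined in this chapter; a minimal sketch that matches how the handler uses them:

#![allow(unused)]
fn main() {
#[derive(Debug, Deserialize)]
struct BulkMigrationRequest {
    task_ids: Vec<String>,
}

#[derive(Debug, Serialize)]
struct MigrationError {
    task_id: String,
    error: String,
}

#[derive(Debug, Serialize)]
struct BulkMigrationResponse {
    migrated_count: usize,
    error_count: usize,
    #[serde(skip_serializing_if = "Option::is_none")]
    errors: Option<Vec<MigrationError>>,
}
}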

Testing Multiple Versions

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    #[tokio::test]
    async fn test_v1_compatibility() {
        let app = create_app();
        
        // V1 request format
        let v1_request = serde_json::json!({
            "title": "Test Task",
            "description": "Test Description",
            "assigned_to": "user123"
        });
        
        let response = app
            .clone()
            .oneshot(
                Request::builder()
                    .uri("/api/v1/tasks")
                    .method("POST")
                    .header("Content-Type", "application/json")
                    .body(Body::from(v1_request.to_string()))
                    .unwrap(),
            )
            .await
            .unwrap();
        
        assert_eq!(response.status(), StatusCode::CREATED);
        
        // Verify deprecation headers
        assert_eq!(
            response.headers().get("Deprecation").unwrap(),
            "true"
        );
    }
    
    #[tokio::test]
    async fn test_v2_enhancements() {
        let app = create_app();
        
        // V2 request with new features
        let v2_request = serde_json::json!({
            "title": "Test Task",
            "description": "Test Description",
            "assigned_to": ["user123", "user456"],
            "tags": ["urgent", "backend"],
            "priority": "high"
        });
        
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/api/v2/tasks")
                    .method("POST")
                    .header("Content-Type", "application/json")
                    .body(Body::from(v2_request.to_string()))
                    .unwrap(),
            )
            .await
            .unwrap();
        
        assert_eq!(response.status(), StatusCode::CREATED);
        
        let body: v2::TaskResponse = serde_json::from_slice(
            &hyper::body::to_bytes(response.into_body()).await.unwrap()
        ).unwrap();
        
        assert_eq!(body.assigned_to.len(), 2);
        assert_eq!(body.tags.len(), 2);
    }
    
    #[tokio::test]
    async fn test_version_negotiation() {
        let app = create_app();
        
        // Test header-based versioning
        let response = app
            .clone()
            .oneshot(
                Request::builder()
                    .uri("/api/tasks/123")
                    .header("API-Version", "v1")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        
        // Should return V1 format
        let body: v1::TaskResponse = serde_json::from_slice(
            &hyper::body::to_bytes(response.into_body()).await.unwrap()
        ).unwrap();
        
        assert!(!body.assigned_to.is_empty()); // V1 uses a single string assignee
    }
}
}

Documentation

Generate version-specific documentation:

#![allow(unused)]
fn main() {
use utoipa::{OpenApi, ToSchema};

#[derive(OpenApi)]
#[openapi(
    paths(
        v1::create_task,
        v1::get_task,
    ),
    components(
        schemas(v1::CreateTaskRequest, v1::TaskResponse)
    ),
    tags(
        (name = "tasks", description = "Task management API v1")
    ),
    info(
        title = "EventCore API v1",
        version = "1.0.0",
        description = "Legacy API version - deprecated"
    )
)]
struct ApiDocV1;

#[derive(OpenApi)]
#[openapi(
    paths(
        v2::create_task,
        v2::get_task,
        v2::bulk_create_tasks,
    ),
    components(
        schemas(v2::CreateTaskRequest, v2::TaskResponse)
    ),
    tags(
        (name = "tasks", description = "Task management API v2")
    ),
    info(
        title = "EventCore API v2",
        version = "2.0.0",
        description = "Current stable API version"
    )
)]
struct ApiDocV2;

// Serve version-specific docs
async fn serve_api_docs(version: ApiVersion) -> impl IntoResponse {
    match version {
        ApiVersion::V1 => Json(ApiDocV1::openapi()),
        ApiVersion::V2 => Json(ApiDocV2::openapi()),
    }
}
}

Best Practices

  1. Plan for versioning from day one - Even if you start with v1
  2. Use semantic versioning - Major.Minor.Patch
  3. Maintain backward compatibility - When possible
  4. Communicate changes clearly - Use headers and documentation
  5. Set deprecation timelines - Give clients time to migrate
  6. Version at the right level - Not every change needs a new version
  7. Test all versions - Maintain test suites for each supported version
  8. Monitor version usage - Track which versions clients use (see the sketch after this list)
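
A minimal sketch of version-usage tracking, using a simple in-process counter keyed by the version segment of the path (an assumption; production systems would emit a proper metric instead). It would be attached with middleware::from_fn_with_state:

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct VersionUsage {
    counts: Mutex<HashMap<String, u64>>,
}

async fn track_version_middleware(
    State(usage): State<Arc<VersionUsage>>,
    request: Request,
    next: Next,
) -> Response {
    // "/api/v1/tasks" -> "v1"; anything else is bucketed as "unknown"
    let version = request
        .uri()
        .path()
        .split('/')
        .nth(2)
        .unwrap_or("unknown")
        .to_string();

    *usage.counts.lock().unwrap().entry(version).or_insert(0) += 1;

    next.run(request).await
}
}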

Summary

API versioning in EventCore applications:

  • Multiple strategies - URL, header, content-type versioning
  • Smooth migration - Tools to help clients upgrade
  • Clear deprecation - Sunset dates and migration guides
  • Version discovery - Clients can explore available versions
  • Backward compatibility - Maintain old versions gracefully

Key patterns:

  1. Choose a versioning strategy and stick to it
  2. Transform between versions at API boundaries
  3. Keep internal representations version-agnostic
  4. Communicate deprecation clearly
  5. Provide migration tools and guides
  6. Test all supported versions

Congratulations! You’ve completed Part 4. Continue to Part 5: Advanced Topics

Part 5: Advanced Topics

This part covers advanced EventCore patterns and techniques for building sophisticated event-sourced systems. These topics build on the foundations from previous parts.

Chapters in This Part

  1. Schema Evolution - Evolving events and commands over time
  2. Event Versioning - Managing event format changes
  3. Long-Running Processes - Sagas and process managers
  4. Distributed Systems - Multi-service event sourcing
  5. Performance Optimization - Scaling EventCore applications

What You’ll Learn

  • Handle schema changes gracefully
  • Version events and commands safely
  • Implement complex business processes
  • Scale across multiple services
  • Optimize for high performance

Prerequisites

  • Completed Parts 1-4
  • Production experience with EventCore recommended
  • Understanding of distributed systems concepts helpful

Complexity Level

These topics are advanced and assume solid understanding of event sourcing principles and EventCore fundamentals.

Time to Complete

  • Reading: ~60 minutes
  • With implementation: ~4 hours

Ready for advanced topics? Let’s start with Schema Evolution

Chapter 5.1: Schema Evolution

Schema evolution is the process of changing event and command structures over time while maintaining backward compatibility. EventCore provides powerful tools for handling schema changes gracefully.

The Challenge

Your system evolves. Business requirements change. Data structures need to adapt. But in event sourcing, you can never change historical events - they’re immutable facts about what happened.

#![allow(unused)]
fn main() {
// Day 1: Simple user registration
#[derive(Serialize, Deserialize)]
struct UserRegistered {
    user_id: UserId,
    email: String,
}

// 6 months later: Need more fields
#[derive(Serialize, Deserialize)]
struct UserRegistered {
    user_id: UserId,
    email: String,
    // New fields - but old events don't have them!
    first_name: String,
    last_name: String,
    preferences: UserPreferences,
}
}

EventCore’s Schema Evolution Approach

EventCore uses a combination of:

  1. Serde defaults - Handle missing fields gracefully
  2. Event versioning - Explicit version tracking
  3. Migration functions - Transform old formats to new
  4. Schema registry - Central type management

Backward Compatible Changes

These changes don’t break existing events:

Adding Optional Fields

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize, Deserialize)]
struct UserRegistered {
    user_id: UserId,
    email: String,
    
    // New optional fields with defaults
    #[serde(default)]
    first_name: Option<String>,
    
    #[serde(default)]
    last_name: Option<String>,
    
    #[serde(default)]
    preferences: UserPreferences,
}

impl Default for UserPreferences {
    fn default() -> Self {
        Self {
            newsletter: false,
            notifications: true,
            theme: Theme::Light,
        }
    }
}
}

Adding Fields with Sensible Defaults

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize, Deserialize)]
struct OrderPlaced {
    order_id: OrderId,
    customer_id: CustomerId,
    items: Vec<OrderItem>,
    
    // New field with computed default
    #[serde(default = "default_currency")]
    currency: Currency,
    
    // New field with timestamp default
    #[serde(default = "Utc::now")]
    placed_at: DateTime<Utc>,
}

fn default_currency() -> Currency {
    Currency::USD
}
}

Adding Enum Variants

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
enum PaymentMethod {
    CreditCard { last_four: String },
    BankTransfer { account: String },
    PayPal { email: String },
    
    // New variants - old events still deserialize
    ApplePay { device_id: String },
    GooglePay { account_id: String },
    
    // Unknown variant fallback
    #[serde(other)]
    Unknown,
}
}

Breaking Changes

These require explicit versioning:

Removing Fields

#![allow(unused)]
fn main() {
// V1: Has deprecated field
#[derive(Debug, Serialize, Deserialize)]
struct UserRegisteredV1 {
    user_id: UserId,
    email: String,
    username: String, // Being removed
}

// V2: Field removed
#[derive(Debug, Serialize, Deserialize)]
struct UserRegisteredV2 {
    user_id: UserId,
    email: String,
    // username removed - breaking change!
}
}

Changing Field Types

#![allow(unused)]
fn main() {
// V1: String user ID
#[derive(Debug, Serialize, Deserialize)]
struct UserRegisteredV1 {
    user_id: String, // String
    email: String,
}

// V2: Structured user ID
#[derive(Debug, Serialize, Deserialize)]
struct UserRegisteredV2 {
    user_id: UserId, // Custom type - breaking change!
    email: String,
}
}

Restructuring Data

#![allow(unused)]
fn main() {
// V1: Flat structure
#[derive(Debug, Serialize, Deserialize)]
struct OrderPlacedV1 {
    order_id: OrderId,
    billing_street: String,
    billing_city: String,
    billing_state: String,
    shipping_street: String,
    shipping_city: String,
    shipping_state: String,
}

// V2: Nested structure
#[derive(Debug, Serialize, Deserialize)]
struct OrderPlacedV2 {
    order_id: OrderId,
    billing_address: Address,  // Restructured - breaking change!
    shipping_address: Address,
}
}

Versioned Events

EventCore supports explicit event versioning:

#![allow(unused)]
fn main() {
use eventcore::serialization::VersionedEvent;

#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "version")]
enum UserRegisteredVersioned {
    #[serde(rename = "1")]
    V1 {
        user_id: String,
        email: String,
        username: String,
    },
    
    #[serde(rename = "2")]
    V2 {
        user_id: UserId,
        email: String,
        first_name: String,
        last_name: String,
    },
    
    #[serde(rename = "3")]
    V3 {
        user_id: UserId,
        email: String,
        profile: UserProfile, // Further evolution
    },
}

impl VersionedEvent for UserRegisteredVersioned {
    const EVENT_TYPE: &'static str = "UserRegistered";
    
    fn current_version() -> u32 {
        3
    }
    
    fn migrate_to_current(self) -> Self {
        match self {
            UserRegisteredVersioned::V1 { user_id, email, username } => {
                // V1 → V2: Convert string ID, extract names from username
                let (first_name, last_name) = split_username(&username);
                let user_id = UserId::try_new(user_id).unwrap_or_else(|_| UserId::new());
                
                UserRegisteredVersioned::V2 {
                    user_id,
                    email,
                    first_name,
                    last_name,
                }
            }
            UserRegisteredVersioned::V2 { user_id, email, first_name, last_name } => {
                // V2 → V3: Create profile from names
                UserRegisteredVersioned::V3 {
                    user_id,
                    email,
                    profile: UserProfile {
                        first_name,
                        last_name,
                        bio: None,
                        avatar_url: None,
                    },
                }
            }
            v3 => v3, // Already current version
        }
    }
}
}

Migration Functions

For complex transformations, use migration functions:

#![allow(unused)]
fn main() {
use eventcore::serialization::{Migration, MigrationError};

struct UserRegisteredV1ToV2;

impl Migration<UserRegisteredV1, UserRegisteredV2> for UserRegisteredV1ToV2 {
    fn migrate(&self, v1: UserRegisteredV1) -> Result<UserRegisteredV2, MigrationError> {
        // Complex migration logic
        let user_id = parse_legacy_user_id(&v1.user_id)?;
        let (first_name, last_name) = extract_names_from_username(&v1.username)?;
        
        // Validate converted data
        if first_name.is_empty() {
            return Err(MigrationError::InvalidData("Empty first name".to_string()));
        }
        
        Ok(UserRegisteredV2 {
            user_id,
            email: v1.email,
            first_name,
            last_name,
        })
    }
}

fn parse_legacy_user_id(legacy_id: &str) -> Result<UserId, MigrationError> {
    // Handle legacy ID formats
    if legacy_id.starts_with("user_") {
        let numeric_part = legacy_id.strip_prefix("user_")
            .ok_or_else(|| MigrationError::InvalidData("Invalid legacy ID format".to_string()))?;
        
        let uuid = Uuid::new_v5(&Uuid::NAMESPACE_OID, numeric_part.as_bytes());
        Ok(UserId::from(uuid))
    } else if let Ok(uuid) = Uuid::parse_str(legacy_id) {
        Ok(UserId::from(uuid))
    } else {
        Err(MigrationError::InvalidData(format!("Cannot parse user ID: {}", legacy_id)))
    }
}
}
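
The name-splitting helpers used above (split_username, extract_names_from_username) are assumed. One possible sketch, splitting on the first underscore or dot:

#![allow(unused)]
fn main() {
fn extract_names_from_username(username: &str) -> Result<(String, String), MigrationError> {
    // Legacy usernames looked like "first_last" or "first.last"; anything without a
    // separator keeps the whole value as the first name
    let mut parts = username.splitn(2, |c: char| c == '_' || c == '.');
    let first = parts.next().unwrap_or_default().to_string();
    let last = parts.next().unwrap_or_default().to_string();

    if first.is_empty() {
        return Err(MigrationError::InvalidData(format!(
            "Cannot derive names from username: {username}"
        )));
    }

    Ok((first, last))
}
}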

Schema Registry

EventCore provides a schema registry for managing types:

#![allow(unused)]
fn main() {
use eventcore::serialization::{SchemaRegistry, TypeInfo};

#[derive(Default)]
struct MySchemaRegistry {
    registry: SchemaRegistry,
}

impl MySchemaRegistry {
    fn new() -> Self {
        let mut registry = SchemaRegistry::new();
        
        // Register event types with versions
        registry.register::<UserRegisteredV1>("UserRegistered", 1);
        registry.register::<UserRegisteredV2>("UserRegistered", 2);
        registry.register::<UserRegisteredV3>("UserRegistered", 3);
        
        // Register migrations
        registry.add_migration::<UserRegisteredV1, UserRegisteredV2>(
            UserRegisteredV1ToV2
        );
        registry.add_migration::<UserRegisteredV2, UserRegisteredV3>(
            UserRegisteredV2ToV3
        );
        
        Self { registry }
    }
    
    fn deserialize_event(&self, event_type: &str, version: u32, data: &[u8]) -> Result<Box<dyn Any>, SerializationError> {
        self.registry.deserialize_and_migrate(event_type, version, data)
    }
}
}

Command Evolution

Commands evolve differently than events because they don’t need historical compatibility:

#![allow(unused)]
fn main() {
// Commands can change more freely
#[derive(Command, Clone)]
struct CreateUser {
    // V1 fields
    email: Email,
    
    // V2 additions - no historical constraint
    first_name: FirstName,
    last_name: LastName,
    
    // V3 additions
    initial_preferences: UserPreferences,
    referral_code: Option<ReferralCode>,
}

// Use builder pattern for backward compatibility
impl CreateUser {
    pub fn builder() -> CreateUserBuilder {
        CreateUserBuilder::default()
    }
    
    // V1-style constructor
    pub fn from_email(email: Email) -> Self {
        Self {
            email,
            first_name: FirstName::default(),
            last_name: LastName::default(),
            initial_preferences: UserPreferences::default(),
            referral_code: None,
        }
    }
    
    // V2-style constructor
    pub fn with_name(email: Email, first_name: FirstName, last_name: LastName) -> Self {
        Self {
            email,
            first_name,
            last_name,
            initial_preferences: UserPreferences::default(),
            referral_code: None,
        }
    }
}

#[derive(Default)]
pub struct CreateUserBuilder {
    email: Option<Email>,
    first_name: Option<FirstName>,
    last_name: Option<LastName>,
    initial_preferences: Option<UserPreferences>,
    referral_code: Option<ReferralCode>,
}

impl CreateUserBuilder {
    pub fn email(mut self, email: Email) -> Self {
        self.email = Some(email);
        self
    }
    
    pub fn name(mut self, first: FirstName, last: LastName) -> Self {
        self.first_name = Some(first);
        self.last_name = Some(last);
        self
    }
    
    pub fn preferences(mut self, prefs: UserPreferences) -> Self {
        self.initial_preferences = Some(prefs);
        self
    }
    
    pub fn referral_code(mut self, code: ReferralCode) -> Self {
        self.referral_code = Some(code);
        self
    }
    
    pub fn build(self) -> Result<CreateUser, ValidationError> {
        Ok(CreateUser {
            email: self.email.ok_or(ValidationError::MissingField("email"))?,
            first_name: self.first_name.unwrap_or_default(),
            last_name: self.last_name.unwrap_or_default(),
            initial_preferences: self.initial_preferences.unwrap_or_default(),
            referral_code: self.referral_code,
        })
    }
}
}
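
Call sites then stay source-compatible across versions: V1-era code keeps the simple constructor while newer code opts into the builder. A short example, assuming already-validated values are in hand:

#![allow(unused)]
fn main() {
fn build_commands(email: Email, first: FirstName, last: LastName) -> Result<(), ValidationError> {
    // V1-era call site keeps working unchanged
    let _basic = CreateUser::from_email(email.clone());

    // Newer call sites opt into the extra fields via the builder
    let _full = CreateUser::builder()
        .email(email)
        .name(first, last)
        .preferences(UserPreferences::default())
        .build()?;

    Ok(())
}
}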

State Evolution

State structures also need to evolve with events:

#![allow(unused)]
fn main() {
#[derive(Default)]
struct UserState {
    exists: bool,
    email: String,
    
    // V2 fields with defaults
    first_name: Option<String>,
    last_name: Option<String>,
    
    // V3 fields
    profile: Option<UserProfile>,
    preferences: UserPreferences,
}

impl CommandLogic for CreateUser {
    type State = UserState;
    type Event = UserEvent;
    
    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        match &event.payload {
            UserEvent::RegisteredV1 { user_id, email, username } => {
                state.exists = true;
                state.email = email.clone();
                // Legacy events don't have separate names
                state.first_name = None;
                state.last_name = None;
            }
            UserEvent::RegisteredV2 { user_id, email, first_name, last_name } => {
                state.exists = true;
                state.email = email.clone();
                state.first_name = Some(first_name.clone());
                state.last_name = Some(last_name.clone());
            }
            UserEvent::RegisteredV3 { user_id, email, profile } => {
                state.exists = true;
                state.email = email.clone();
                state.first_name = Some(profile.first_name.clone());
                state.last_name = Some(profile.last_name.clone());
                state.profile = Some(profile.clone());
            }
            // Handle other events...
        }
    }
}
}

Projection Evolution

Projections need to handle schema changes too:

#![allow(unused)]
fn main() {
#[async_trait]
impl Projection for UserListProjection {
    type Event = UserEvent;
    type Error = ProjectionError;
    
    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match &event.payload {
            // Handle all versions of user registration
            UserEvent::RegisteredV1 { user_id, email, username } => {
                let user = UserSummary {
                    id: user_id.clone(),
                    email: email.clone(),
                    display_name: username.clone(), // Use username as display name
                    first_name: None,
                    last_name: None,
                    created_at: event.occurred_at,
                };
                self.users.insert(user_id.clone(), user);
            }
            UserEvent::RegisteredV2 { user_id, email, first_name, last_name } => {
                let user = UserSummary {
                    id: user_id.clone(),
                    email: email.clone(),
                    display_name: format!("{} {}", first_name, last_name),
                    first_name: Some(first_name.clone()),
                    last_name: Some(last_name.clone()),
                    created_at: event.occurred_at,
                };
                self.users.insert(user_id.clone(), user);
            }
            UserEvent::RegisteredV3 { user_id, email, profile } => {
                let user = UserSummary {
                    id: user_id.clone(),
                    email: email.clone(),
                    display_name: profile.display_name(),
                    first_name: Some(profile.first_name.clone()),
                    last_name: Some(profile.last_name.clone()),
                    created_at: event.occurred_at,
                };
                self.users.insert(user_id.clone(), user);
            }
        }
        Ok(())
    }
}
}

Migration Strategies

Forward-Only Evolution

The simplest approach - only add fields, never remove:

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize, Deserialize)]
struct ProductCreated {
    product_id: ProductId,
    name: String,
    price: Money,
    
    // V2 additions
    #[serde(default)]
    category: Option<Category>,
    #[serde(default)]
    tags: Vec<Tag>,
    
    // V3 additions
    #[serde(default)]
    metadata: ProductMetadata,
    #[serde(default)]
    variants: Vec<ProductVariant>,
    
    // V4 additions
    #[serde(default)]
    seo_info: Option<SeoInfo>,
    #[serde(default = "default_status")]
    status: ProductStatus,
}

fn default_status() -> ProductStatus {
    ProductStatus::Active
}
}

Event Splitting

Split large events into focused ones:

#![allow(unused)]
fn main() {
// V1: Monolithic event
struct OrderProcessedV1 {
    order_id: OrderId,
    payment_method: PaymentMethod,
    payment_amount: Money,
    shipping_address: Address,
    items: Vec<OrderItem>,
    discount: Option<Discount>,
    tax_amount: Money,
}

// V2: Split into focused events
enum OrderEventV2 {
    PaymentProcessed {
        order_id: OrderId,
        payment_method: PaymentMethod,
        amount: Money,
    },
    ShippingAddressSet {
        order_id: OrderId,
        address: Address,
    },
    ItemsAdded {
        order_id: OrderId,
        items: Vec<OrderItem>,
    },
    DiscountApplied {
        order_id: OrderId,
        discount: Discount,
    },
    TaxCalculated {
        order_id: OrderId,
        amount: Money,
    },
}
}

Lazy Migration

Migrate events only when needed:

#![allow(unused)]
fn main() {
use eventcore::serialization::LazyMigration;

#[derive(Clone)]
struct LazyUserEvent {
    raw_data: Vec<u8>,
    version: u32,
    migrated: Option<UserEvent>,
}

impl LazyUserEvent {
    fn get(&mut self) -> Result<&UserEvent, MigrationError> {
        if self.migrated.is_none() {
            let migrated = match self.version {
                1 => {
                    let v1: UserRegisteredV1 = serde_json::from_slice(&self.raw_data)?;
                    UserEvent::from_v1(v1)
                }
                2 => {
                    let v2: UserRegisteredV2 = serde_json::from_slice(&self.raw_data)?;
                    UserEvent::from_v2(v2)
                }
                3 => {
                    serde_json::from_slice(&self.raw_data)?
                }
                _ => return Err(MigrationError::UnsupportedVersion(self.version)),
            };
            self.migrated = Some(migrated);
        }
        Ok(self.migrated.as_ref().unwrap())
    }
}
}

Testing Schema Evolution

Migration Tests

#![allow(unused)]
fn main() {
#[cfg(test)]
mod migration_tests {
    use super::*;
    
    #[test]
    fn test_v1_to_v2_migration() {
        let v1_event = UserRegisteredV1 {
            user_id: "user_123".to_string(),
            email: "john.doe@example.com".to_string(),
            username: "john_doe".to_string(),
        };
        
        let migration = UserRegisteredV1ToV2;
        let v2_event = migration.migrate(v1_event).unwrap();
        
        // The legacy numeric ID is hashed into a UUID, so just check the conversion produced one
        assert!(!v2_event.user_id.to_string().is_empty());
        assert_eq!(v2_event.email, "john.doe@example.com");
        assert_eq!(v2_event.first_name, "john");
        assert_eq!(v2_event.last_name, "doe");
    }
    
    #[test]
    fn test_serialization_roundtrip() {
        let v2_event = UserRegisteredV2 {
            user_id: UserId::new(),
            email: "test@example.com".to_string(),
            first_name: "Test".to_string(),
            last_name: "User".to_string(),
        };
        
        // Serialize
        let json = serde_json::to_string(&v2_event).unwrap();
        
        // Deserialize
        let deserialized: UserRegisteredV2 = serde_json::from_str(&json).unwrap();
        
        assert_eq!(v2_event.user_id, deserialized.user_id);
        assert_eq!(v2_event.email, deserialized.email);
    }
    
    #[test]
    fn test_backward_compatibility() {
        // V1 JSON without new fields
        let v1_json = r#"{
            "user_id": "550e8400-e29b-41d4-a716-446655440000",
            "email": "legacy@example.com"
        }"#;
        
        // Should deserialize into V2 with defaults
        // (assumes first_name/last_name are annotated with #[serde(default)])
        let v2_event: UserRegisteredV2 = serde_json::from_str(v1_json).unwrap();
        
        assert_eq!(v2_event.email, "legacy@example.com");
        assert!(v2_event.first_name.is_empty()); // Default
        assert!(v2_event.last_name.is_empty()); // Default
    }
}
}

Property-Based Migration Tests

#![allow(unused)]
fn main() {
use proptest::prelude::*;

proptest! {
    #[test]
    fn migration_preserves_core_data(
        user_id in any::<String>(),
        email in "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}",
        username in "[a-zA-Z0-9_]{3,20}",
    ) {
        let v1 = UserRegisteredV1 {
            user_id: user_id.clone(),
            email: email.clone(),
            username,
        };
        
        let migration = UserRegisteredV1ToV2;
        let v2 = migration.migrate(v1).unwrap();
        
        // Core data should be preserved
        prop_assert_eq!(v2.email, email);
        
        // User ID should be convertible
        prop_assert!(v2.user_id.to_string().len() > 0);
    }
}
}

Best Practices

  1. Plan for evolution - Design events with future changes in mind
  2. Use optional fields - Default to optional for new fields (see the serde sketch after this list)
  3. Never remove fields - Mark as deprecated instead
  4. Version breaking changes - Use explicit versioning for major changes
  5. Test migrations thoroughly - Especially edge cases
  6. Document schema changes - Keep a changelog
  7. Migrate lazily - Only when events are read
  8. Monitor migration performance - Large migrations can be slow
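
A minimal sketch of practice 2 using plain serde attributes; the `OrderPlaced` struct and its fields are purely illustrative:

#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct OrderPlaced {
    order_id: String,
    amount_cents: u64,

    // Added after the first release: optional + default, so old JSON
    // written before the field existed still deserializes cleanly.
    #[serde(default)]
    loyalty_tier: Option<String>,
}

let legacy_json = r#"{"order_id":"order-1","amount_cents":2500}"#;
let event: OrderPlaced = serde_json::from_str(legacy_json).unwrap();
assert_eq!(event.loyalty_tier, None);
}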

Summary

Schema evolution in EventCore:

  • Backward compatible - Old events still work
  • Versioned explicitly - Track breaking changes
  • Migration support - Transform old formats
  • Type-safe - Compile-time guarantees
  • Testable - Comprehensive test support

Key patterns:

  1. Use serde defaults for backward compatibility
  2. Version events explicitly for breaking changes
  3. Write migration functions for complex transformations
  4. Test all migration paths thoroughly
  5. Plan for evolution from day one

Next, let’s explore Event Versioning

Chapter 5.2: Event Versioning

Event versioning is a systematic approach to managing changes in event schemas while preserving the ability to read historical data. This chapter covers EventCore’s versioning strategies and implementation patterns.

Versioning Strategies

Semantic Versioning for Events

Apply semantic versioning principles to events:

#![allow(unused)]
fn main() {
use eventcore::serialization::EventVersion;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct EventSchemaVersion {
    major: u32,
    minor: u32,
    patch: u32,
}

impl EventSchemaVersion {
    const fn new(major: u32, minor: u32, patch: u32) -> Self {
        Self { major, minor, patch }
    }
    
    // Breaking changes
    const V1_0_0: Self = Self::new(1, 0, 0);
    const V2_0_0: Self = Self::new(2, 0, 0);
    
    // Backward compatible additions
    const V1_1_0: Self = Self::new(1, 1, 0);
    const V1_2_0: Self = Self::new(1, 2, 0);
    
    // Bug fixes/clarifications
    const V1_0_1: Self = Self::new(1, 0, 1);
}

trait VersionedEvent {
    const EVENT_TYPE: &'static str;
    const VERSION: EventSchemaVersion;
    
    fn is_compatible_with(version: &EventSchemaVersion) -> bool;
}
}
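
Because `PartialOrd`/`Ord` are derived and the fields are declared in major, minor, patch order, comparisons follow semantic-version precedence:

#![allow(unused)]
fn main() {
// Derived ordering compares major first, then minor, then patch.
assert!(EventSchemaVersion::V1_0_0 < EventSchemaVersion::V1_0_1);
assert!(EventSchemaVersion::V1_1_0 > EventSchemaVersion::V1_0_1);
assert!(EventSchemaVersion::V1_2_0 < EventSchemaVersion::V2_0_0);
}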

Linear Versioning

Simpler approach with incremental versions:

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "version")]
enum UserEvent {
    #[serde(rename = "1")]
    V1(UserEventV1),
    
    #[serde(rename = "2")]
    V2(UserEventV2),
    
    #[serde(rename = "3")]
    V3(UserEventV3),
}

#[derive(Debug, Serialize, Deserialize)]
struct UserEventV1 {
    pub user_id: String,
    pub email: String,
    pub username: String,
}

#[derive(Debug, Serialize, Deserialize)]
struct UserEventV2 {
    pub user_id: UserId,
    pub email: Email,
    pub first_name: String,
    pub last_name: String,
}

#[derive(Debug, Serialize, Deserialize)]
struct UserEventV3 {
    pub user_id: UserId,
    pub email: Email,
    pub profile: UserProfile,
    pub preferences: UserPreferences,
}
}
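
With the internally tagged representation above, serde embeds the version tag alongside the variant's own fields; a quick sketch of the wire format:

#![allow(unused)]
fn main() {
let event = UserEvent::V1(UserEventV1 {
    user_id: "user_123".to_string(),
    email: "legacy@example.com".to_string(),
    username: "jane_doe".to_string(),
});

let json = serde_json::to_string(&event).unwrap();
// {"version":"1","user_id":"user_123","email":"legacy@example.com","username":"jane_doe"}
assert!(json.contains(r#""version":"1""#));
}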

Version-Aware Serialization

EventCore provides automatic version handling:

#![allow(unused)]
fn main() {
use eventcore::serialization::{VersionedSerializer, SerializationFormat};

#[derive(Clone)]
struct EventSerializer {
    format: SerializationFormat,
    registry: TypeRegistry,
}

impl EventSerializer {
    fn new() -> Self {
        let mut registry = TypeRegistry::new();
        
        // Register all versions
        registry.register_versioned::<UserEventV1>("UserEvent", 1);
        registry.register_versioned::<UserEventV2>("UserEvent", 2);
        registry.register_versioned::<UserEventV3>("UserEvent", 3);
        
        Self {
            format: SerializationFormat::Json,
            registry,
        }
    }
    
    fn serialize_event<T>(&self, event: &T) -> Result<VersionedPayload, SerializationError>
    where
        T: Serialize + VersionedEvent,
    {
        let data = self.format.serialize(event)?;
        
        Ok(VersionedPayload {
            event_type: T::EVENT_TYPE.to_string(),
            version: T::VERSION.to_string(),
            format: self.format,
            data,
        })
    }
    
    fn deserialize_event<T>(&self, payload: &VersionedPayload) -> Result<T, SerializationError>
    where
        T: DeserializeOwned + VersionedEvent,
    {
        // Check version compatibility
        let payload_version = EventSchemaVersion::parse(&payload.version)?;
        if !T::is_compatible_with(&payload_version) {
            return Err(SerializationError::IncompatibleVersion {
                expected: T::VERSION,
                found: payload_version,
            });
        }
        
        self.format.deserialize(&payload.data)
    }
}

#[derive(Debug, Clone)]
struct VersionedPayload {
    event_type: String,
    version: String,
    format: SerializationFormat,
    data: Vec<u8>,
}
}

Migration Chains

Handle complex version transitions:

#![allow(unused)]
fn main() {
use eventcore::serialization::{MigrationChain, Migration};

struct UserEventMigrationChain {
    migrations: Vec<Box<dyn Migration<UserEvent, UserEvent>>>,
}

impl UserEventMigrationChain {
    fn new() -> Self {
        let migrations: Vec<Box<dyn Migration<UserEvent, UserEvent>>> = vec![
            Box::new(V1ToV2Migration),
            Box::new(V2ToV3Migration),
        ];
        
        Self { migrations }
    }
    
    fn migrate_to_latest(&self, event: UserEvent, from_version: u32) -> Result<UserEvent, MigrationError> {
        let mut current_event = event;
        let mut current_version = from_version;
        
        // Apply migrations in sequence
        while current_version < UserEvent::LATEST_VERSION {
            let migration = self.migrations
                .get((current_version - 1) as usize)
                .ok_or(MigrationError::NoMigrationPath { 
                    from: current_version, 
                    to: UserEvent::LATEST_VERSION 
                })?;
            
            current_event = migration.migrate(current_event)?;
            current_version += 1;
        }
        
        Ok(current_event)
    }
}

struct V1ToV2Migration;

impl Migration<UserEvent, UserEvent> for V1ToV2Migration {
    fn migrate(&self, event: UserEvent) -> Result<UserEvent, MigrationError> {
        match event {
            UserEvent::V1(v1) => {
                // Convert V1 to V2
                let user_id = UserId::try_from(v1.user_id)
                    .map_err(|e| MigrationError::ConversionFailed(e.to_string()))?;
                
                let email = Email::try_from(v1.email)
                    .map_err(|e| MigrationError::ConversionFailed(e.to_string()))?;
                
                // Extract names from username
                let (first_name, last_name) = split_username(&v1.username);
                
                Ok(UserEvent::V2(UserEventV2 {
                    user_id,
                    email,
                    first_name,
                    last_name,
                }))
            }
            other => Ok(other), // Already V2 or later
        }
    }
}

fn split_username(username: &str) -> (String, String) {
    let parts: Vec<&str> = username.split('_').collect();
    match parts.len() {
        1 => (parts[0].to_string(), String::new()),
        2 => (parts[0].to_string(), parts[1].to_string()),
        _ => (parts[0].to_string(), parts[1..].join("_")),
    }
}
}
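
A usage sketch of the chain, assuming `UserEvent::LATEST_VERSION` is `3` and that `V2ToV3Migration` is implemented along the same lines as `V1ToV2Migration`:

#![allow(unused)]
fn main() {
let chain = UserEventMigrationChain::new();

let legacy = UserEvent::V1(UserEventV1 {
    user_id: "user_123".to_string(),
    email: "legacy@example.com".to_string(),
    username: "jane_doe".to_string(),
});

// migrations[0] upgrades version 1 -> 2, migrations[1] upgrades 2 -> 3.
let latest = chain.migrate_to_latest(legacy, 1).expect("migration chain");
assert!(matches!(latest, UserEvent::V3(_)));
}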

Event Store Integration

Integrate versioning with the event store:

#![allow(unused)]
fn main() {
#[async_trait]
impl EventStore for VersionedEventStore {
    type Event = VersionedEvent;
    type Error = EventStoreError;
    
    async fn write_events(
        &self,
        events: Vec<EventToWrite<Self::Event>>,
    ) -> Result<WriteResult, Self::Error> {
        let versioned_events: Result<Vec<_>, _> = events
            .into_iter()
            .map(|event| {
                let payload = self.serializer.serialize_event(&event.payload)?;
                Ok(EventToWrite {
                    stream_id: event.stream_id,
                    payload,
                    metadata: event.metadata,
                    expected_version: event.expected_version,
                })
            })
            .collect();
        
        self.inner.write_events(versioned_events?).await
    }
    
    async fn read_stream(
        &self,
        stream_id: &StreamId,
        options: ReadOptions,
    ) -> Result<StreamEvents<Self::Event>, Self::Error> {
        let raw_events = self.inner.read_stream(stream_id, options).await?;
        
        let events: Result<Vec<_>, _> = raw_events
            .events
            .into_iter()
            .map(|event| {
                let payload = self.serializer.deserialize_event(&event.payload)?;
                Ok(StoredEvent {
                    id: event.id,
                    stream_id: event.stream_id,
                    version: event.version,
                    payload,
                    metadata: event.metadata,
                    occurred_at: event.occurred_at,
                })
            })
            .collect();
        
        Ok(StreamEvents {
            stream_id: raw_events.stream_id,
            version: raw_events.version,
            events: events?,
        })
    }
}
}

Version-Aware Projections

Projections that handle multiple event versions:

#![allow(unused)]
fn main() {
#[async_trait]
impl Projection for UserProjection {
    type Event = VersionedEvent;
    type Error = ProjectionError;
    
    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match &event.payload {
            VersionedEvent::User(user_event) => {
                self.apply_user_event(user_event, event.occurred_at).await?;
            }
            _ => {} // Ignore other event types
        }
        Ok(())
    }
}

impl UserProjection {
    async fn apply_user_event(
        &mut self, 
        event: &UserEvent, 
        occurred_at: DateTime<Utc>
    ) -> Result<(), ProjectionError> {
        match event {
            UserEvent::V1(v1) => {
                // Handle V1 events
                let user = User {
                    id: UserId::try_from(v1.user_id.clone())?,
                    email: v1.email.clone(),
                    display_name: v1.username.clone(),
                    first_name: None,
                    last_name: None,
                    profile: None,
                    preferences: UserPreferences::default(),
                    created_at: occurred_at,
                    updated_at: occurred_at,
                };
                self.users.insert(user.id.clone(), user);
            }
            UserEvent::V2(v2) => {
                // Handle V2 events
                let user = User {
                    id: v2.user_id.clone(),
                    email: v2.email.to_string(),
                    display_name: format!("{} {}", v2.first_name, v2.last_name),
                    first_name: Some(v2.first_name.clone()),
                    last_name: Some(v2.last_name.clone()),
                    profile: None,
                    preferences: UserPreferences::default(),
                    created_at: occurred_at,
                    updated_at: occurred_at,
                };
                self.users.insert(user.id.clone(), user);
            }
            UserEvent::V3(v3) => {
                // Handle V3 events
                let user = User {
                    id: v3.user_id.clone(),
                    email: v3.email.to_string(),
                    display_name: v3.profile.display_name(),
                    first_name: Some(v3.profile.first_name.clone()),
                    last_name: Some(v3.profile.last_name.clone()),
                    profile: Some(v3.profile.clone()),
                    preferences: v3.preferences.clone(),
                    created_at: occurred_at,
                    updated_at: occurred_at,
                };
                self.users.insert(user.id.clone(), user);
            }
        }
        Ok(())
    }
}
}

Version Compatibility Rules

Define clear compatibility rules:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq)]
enum CompatibilityLevel {
    FullyCompatible,    // Can read/write without issues
    ReadOnly,           // Can read but not write
    RequiresMigration,  // Need migration to use
    Incompatible,       // Cannot use
}

trait VersionCompatibility {
    fn check_compatibility(reader_version: &str, event_version: &str) -> CompatibilityLevel;
}

struct UserEventCompatibility;

impl VersionCompatibility for UserEventCompatibility {
    fn check_compatibility(reader_version: &str, event_version: &str) -> CompatibilityLevel {
        use CompatibilityLevel::*;
        
        match (reader_version, event_version) {
            // Same version - fully compatible
            (r, e) if r == e => FullyCompatible,
            
            // Reader newer than event - usually compatible
            ("2", "1") | ("3", "1") | ("3", "2") => FullyCompatible,
            
            // Reader older than event - may need migration
            ("1", "2") | ("1", "3") | ("2", "3") => RequiresMigration,
            
            // Special compatibility rules
            ("1.1", "1.0") => FullyCompatible, // Minor versions compatible
            
            _ => Incompatible,
        }
    }
}

// Usage in deserialization
fn deserialize_with_compatibility_check<T>(
    payload: &VersionedPayload,
    reader_version: &str,
) -> Result<T, SerializationError>
where
    T: DeserializeOwned + VersionCompatibility,
{
    let compatibility = T::check_compatibility(reader_version, &payload.version);
    
    match compatibility {
        CompatibilityLevel::FullyCompatible => {
            // Direct deserialization
            serde_json::from_slice(&payload.data)
                .map_err(SerializationError::Deserialization)
        }
        CompatibilityLevel::ReadOnly => {
            // Deserialize as usual; the caller must treat the value as read-only
            // (e.g. never re-serialize it in this older format).
            let event: T = serde_json::from_slice(&payload.data)
                .map_err(SerializationError::Deserialization)?;
            Ok(event)
        }
        CompatibilityLevel::RequiresMigration => {
            // Apply migration
            let migrated = migrate_to_version(&payload.data, &payload.version, reader_version)?;
            serde_json::from_slice(&migrated)
                .map_err(SerializationError::Deserialization)
        }
        CompatibilityLevel::Incompatible => {
            Err(SerializationError::IncompatibleVersion {
                reader: reader_version.to_string(),
                event: payload.version.clone(),
            })
        }
    }
}
}
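
A couple of concrete checks against the rules above:

#![allow(unused)]
fn main() {
// A version-3 reader can consume version-1 events directly...
assert_eq!(
    UserEventCompatibility::check_compatibility("3", "1"),
    CompatibilityLevel::FullyCompatible
);

// ...but a version-1 reader must migrate version-3 events first.
assert_eq!(
    UserEventCompatibility::check_compatibility("1", "3"),
    CompatibilityLevel::RequiresMigration
);
}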

Event Archival and Compression

Handle old event versions efficiently:

#![allow(unused)]
fn main() {
use eventcore::archival::{EventArchiver, CompressionLevel};

struct VersionedEventArchiver {
    archiver: EventArchiver,
    retention_policy: RetentionPolicy,
}

#[derive(Debug, Clone)]
struct RetentionPolicy {
    pub keep_latest_versions: u32,
    pub archive_after_days: u32,
    pub compress_after_days: u32,
    pub delete_after_years: u32,
}

impl VersionedEventArchiver {
    async fn archive_old_versions(&self, stream_id: &StreamId) -> Result<ArchiveResult, ArchiveError> {
        let events = self.read_all_events(stream_id).await?;
        let mut archive_stats = ArchiveResult::default();
        
        for event in events {
            let age_days = (Utc::now() - event.occurred_at).num_days() as u32;
            
            match event.payload.version() {
                v if v < (CURRENT_VERSION - self.retention_policy.keep_latest_versions) => {
                    if age_days > self.retention_policy.delete_after_years * 365 {
                        // Delete very old events
                        self.archiver.delete_event(&event.id).await?;
                        archive_stats.deleted += 1;
                    } else if age_days > self.retention_policy.compress_after_days {
                        // Compress old events
                        self.archiver.compress_event(&event.id, CompressionLevel::High).await?;
                        archive_stats.compressed += 1;
                    } else if age_days > self.retention_policy.archive_after_days {
                        // Move to cold storage
                        self.archiver.archive_event(&event.id).await?;
                        archive_stats.archived += 1;
                    }
                }
                _ => {
                    // Keep recent versions in hot storage
                    archive_stats.retained += 1;
                }
            }
        }
        
        Ok(archive_stats)
    }
}

#[derive(Debug, Default)]
struct ArchiveResult {
    pub retained: u32,
    pub archived: u32,
    pub compressed: u32,
    pub deleted: u32,
}
}
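
A concrete policy for illustration (the thresholds here are arbitrary; pick values that match your own compliance and storage requirements):

#![allow(unused)]
fn main() {
let policy = RetentionPolicy {
    keep_latest_versions: 2,  // keep events at the two newest schema versions hot
    archive_after_days: 90,   // older than ~3 months: move to cold storage
    compress_after_days: 365, // older than a year: compress in place
    delete_after_years: 7,    // end of the retention window: delete
};
}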

Version Monitoring

Monitor version usage in production:

#![allow(unused)]
fn main() {
use prometheus::{CounterVec, Histogram, IntGauge};

lazy_static! {
    static ref EVENT_VERSION_COUNTER: CounterVec = register_counter_vec!(
        "eventcore_event_versions_total",
        "Total events by version",
        &["event_type", "version"]
    ).unwrap();
    
    static ref MIGRATION_DURATION: Histogram = register_histogram!(
        "eventcore_migration_duration_seconds",
        "Time spent migrating events"
    ).unwrap();
    
    static ref ACTIVE_VERSIONS: IntGauge = register_int_gauge!(
        "eventcore_active_event_versions",
        "Number of active event versions"
    ).unwrap();
}

struct VersionMetrics {
    version_counts: HashMap<String, u64>,
    migration_stats: HashMap<(String, String), MigrationStats>,
}

#[derive(Debug, Default)]
struct MigrationStats {
    pub total_migrations: u64,
    pub successful_migrations: u64,
    pub failed_migrations: u64,
    pub average_duration: Duration,
}

impl VersionMetrics {
    fn record_event_version(&mut self, event_type: &str, version: &str) {
        *self.version_counts
            .entry(format!("{}:{}", event_type, version))
            .or_insert(0) += 1;
        
        EVENT_VERSION_COUNTER
            .with_label_values(&[event_type, version])
            .inc();
    }
    
    fn record_migration(&mut self, from: &str, to: &str, duration: Duration, success: bool) {
        let key = (from.to_string(), to.to_string());
        let stats = self.migration_stats.entry(key).or_default();
        
        stats.total_migrations += 1;
        if success {
            stats.successful_migrations += 1;
        } else {
            stats.failed_migrations += 1;
        }
        
        // Update average duration
        let total_time = stats.average_duration * (stats.total_migrations - 1) as u32 + duration;
        stats.average_duration = total_time / stats.total_migrations as u32;
        
        MIGRATION_DURATION.observe(duration.as_secs_f64());
    }
    
    fn update_active_versions(&self) {
        let active_count = self.version_counts
            .keys()
            .map(|key| key.split(':').nth(1).unwrap_or("unknown"))
            .collect::<HashSet<_>>()
            .len();
        
        ACTIVE_VERSIONS.set(active_count as i64);
    }
}
}

Testing Event Versions

Comprehensive testing for versioned events:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod version_tests {
    use super::*;
    use proptest::prelude::*;
    
    #[test]
    fn test_version_serialization_roundtrip() {
        let v3_event = UserEventV3 {
            user_id: UserId::new(),
            email: Email::try_new("test@example.com").unwrap(),
            profile: UserProfile {
                first_name: "Test".to_string(),
                last_name: "User".to_string(),
                bio: Some("Test bio".to_string()),
                avatar_url: None,
            },
            preferences: UserPreferences::default(),
        };
        
        let serializer = EventSerializer::new();
        
        // Serialize
        let payload = serializer.serialize_event(&v3_event).unwrap();
        assert_eq!(payload.version, "3");
        
        // Deserialize
        let deserialized: UserEventV3 = serializer.deserialize_event(&payload).unwrap();
        assert_eq!(v3_event.user_id, deserialized.user_id);
        assert_eq!(v3_event.email, deserialized.email);
    }
    
    #[test]
    fn test_migration_chain() {
        let v1_event = UserEvent::V1(UserEventV1 {
            user_id: "user_123".to_string(),
            email: "test@example.com".to_string(),
            username: "test_user".to_string(),
        });
        
        let migration_chain = UserEventMigrationChain::new();
        let v3_event = migration_chain.migrate_to_latest(v1_event, 1).unwrap();
        
        match v3_event {
            UserEvent::V3(v3) => {
                assert_eq!(v3.email.to_string(), "test@example.com");
                assert_eq!(v3.profile.first_name, "test");
                assert_eq!(v3.profile.last_name, "user");
            }
            _ => panic!("Expected V3 event after migration"),
        }
    }
    
    proptest! {
        #[test]
        fn version_compatibility_is_transitive(
            v1 in 1u32..10,
            v2 in 1u32..10,
            v3 in 1u32..10,
        ) {
            let mut versions = [v1, v2, v3];
            versions.sort();
            let [min_v, mid_v, max_v] = versions;
            
            // If min compatible with mid, and mid compatible with max,
            // then migration chain should work
            if UserEventCompatibility::check_compatibility(
                &mid_v.to_string(), &min_v.to_string()
            ) != CompatibilityLevel::Incompatible &&
            UserEventCompatibility::check_compatibility(
                &max_v.to_string(), &mid_v.to_string()
            ) != CompatibilityLevel::Incompatible {
                // Migration from min to max should be possible
                prop_assert!(can_migrate_between_versions(min_v, max_v));
            }
        }
    }
    
    fn can_migrate_between_versions(from: u32, to: u32) -> bool {
        // Implementation depends on your migration chain
        to >= from && (to - from) <= MAX_MIGRATION_DISTANCE
    }
}
}

Best Practices

  1. Version everything explicitly - Don’t rely on implicit versioning
  2. Plan migration paths - Design how old versions become new ones
  3. Test all paths - Test reading old events with new code
  4. Monitor version usage - Track which versions are in production
  5. Clean up old versions - Archive or delete very old events
  6. Document changes - Keep detailed changelogs
  7. Gradual rollouts - Deploy new versions incrementally
  8. Backward compatibility - Maintain as long as practical

Summary

Event versioning in EventCore:

  • Explicit versioning - Clear version tracking
  • Migration support - Transform between versions
  • Compatibility checking - Know what works together
  • Performance monitoring - Track version usage
  • Testing support - Comprehensive test patterns

Key patterns:

  1. Use semantic or linear versioning consistently
  2. Define clear compatibility rules
  3. Implement migration chains for complex changes
  4. Monitor version usage in production
  5. Test all migration paths thoroughly

Next, let’s explore Long-Running Processes

Chapter 5.3: Long-Running Processes

Long-running processes, also known as sagas or process managers, coordinate complex business workflows that span multiple commands and may take significant time to complete. EventCore provides patterns for implementing these reliably.

What Are Long-Running Processes?

Long-running processes are stateful workflows that:

  • React to events
  • Execute commands
  • Maintain state across time
  • Handle failures and compensations
  • May run for days, weeks, or months

Examples include:

  • Order fulfillment workflows
  • User onboarding sequences
  • Financial transaction processing
  • Document approval chains

Process Manager Pattern

EventCore implements the process manager pattern:

#![allow(unused)]
fn main() {
use eventcore::process::{ProcessManager, ProcessState, ProcessResult};

#[derive(Command, Clone)]
struct OrderFulfillmentProcess {
    #[stream]
    process_id: StreamId,
    
    #[stream]
    order_id: StreamId,
    
    current_step: FulfillmentStep,
    timeout_at: Option<DateTime<Utc>>,
}

#[derive(Debug, Clone, PartialEq, Default)]
enum FulfillmentStep {
    #[default]
    PaymentPending,
    PaymentConfirmed,
    InventoryReserved,
    Shipped,
    Delivered,
    Completed,
    Failed(String),
}

#[derive(Default)]
struct OrderFulfillmentState {
    order_id: Option<OrderId>,
    current_step: FulfillmentStep,
    payment_confirmed: bool,
    inventory_reserved: bool,
    shipping_info: Option<ShippingInfo>,
    timeout_at: Option<DateTime<Utc>>,
    retry_count: u32,
    created_at: Option<DateTime<Utc>>,
}

#[async_trait]
impl CommandLogic for OrderFulfillmentProcess {
    type State = OrderFulfillmentState;
    type Event = ProcessEvent;
    
    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        match &event.payload {
            ProcessEvent::Started { order_id, timeout_at } => {
                state.order_id = Some(*order_id);
                state.current_step = FulfillmentStep::PaymentPending;
                state.timeout_at = *timeout_at;
                state.created_at = Some(event.occurred_at);
            }
            ProcessEvent::StepCompleted { step } => {
                state.current_step = step.clone();
            }
            ProcessEvent::PaymentConfirmed => {
                state.payment_confirmed = true;
            }
            ProcessEvent::InventoryReserved => {
                state.inventory_reserved = true;
            }
            ProcessEvent::ShippingInfoUpdated { info } => {
                state.shipping_info = Some(info.clone());
            }
            ProcessEvent::Failed { reason } => {
                state.current_step = FulfillmentStep::Failed(reason.clone());
            }
            ProcessEvent::RetryAttempted => {
                state.retry_count += 1;
            }
        }
    }
    
    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Check for timeout
        if let Some(timeout) = state.timeout_at {
            if Utc::now() > timeout {
                return Ok(vec![
                    StreamWrite::new(
                        &read_streams,
                        self.process_id.clone(),
                        ProcessEvent::Failed {
                            reason: "Process timed out".to_string(),
                        }
                    )?
                ]);
            }
        }
        
        // Execute current step
        match state.current_step {
            FulfillmentStep::PaymentPending => {
                self.handle_payment_step(&read_streams, &state).await
            }
            FulfillmentStep::PaymentConfirmed => {
                self.handle_inventory_step(&read_streams, &state).await
            }
            FulfillmentStep::InventoryReserved => {
                self.handle_shipping_step(&read_streams, &state).await
            }
            FulfillmentStep::Shipped => {
                self.handle_delivery_step(&read_streams, &state).await
            }
            FulfillmentStep::Delivered => {
                self.handle_completion_step(&read_streams, &state).await
            }
            FulfillmentStep::Completed | FulfillmentStep::Failed(_) => {
                // Process finished - no more events
                Ok(vec![])
            }
        }
    }
}

impl OrderFulfillmentProcess {
    async fn handle_payment_step(
        &self,
        read_streams: &ReadStreams<OrderFulfillmentProcessStreamSet>,
        state: &OrderFulfillmentState,
    ) -> CommandResult<Vec<StreamWrite<OrderFulfillmentProcessStreamSet, ProcessEvent>>> {
        if !state.payment_confirmed {
            // Check if payment was confirmed by external event
            // This would typically listen to payment events
            Ok(vec![])
        } else {
            // Move to next step
            Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.process_id.clone(),
                    ProcessEvent::StepCompleted {
                        step: FulfillmentStep::PaymentConfirmed,
                    }
                )?
            ])
        }
    }
    
    async fn handle_inventory_step(
        &self,
        read_streams: &ReadStreams<OrderFulfillmentProcessStreamSet>,
        state: &OrderFulfillmentState,
    ) -> CommandResult<Vec<StreamWrite<OrderFulfillmentProcessStreamSet, ProcessEvent>>> {
        if !state.inventory_reserved {
            // Reserve inventory
            Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.process_id.clone(),
                    ProcessEvent::InventoryReserved,
                )?
            ])
        } else {
            // Move to shipping
            Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.process_id.clone(),
                    ProcessEvent::StepCompleted {
                        step: FulfillmentStep::InventoryReserved,
                    }
                )?
            ])
        }
    }
    
    // Similar implementations for other steps...
}
}

Event-Driven Process Coordination

Processes react to events from other parts of the system:

#![allow(unused)]
fn main() {
#[async_trait]
impl EventHandler<SystemEvent> for OrderFulfillmentProcess {
    async fn handle_event(
        &self,
        event: &StoredEvent<SystemEvent>,
        executor: &CommandExecutor,
    ) -> Result<(), ProcessError> {
        match &event.payload {
            SystemEvent::Payment(PaymentEvent::Confirmed { order_id, .. }) => {
                // Payment confirmed - advance process
                let process_command = AdvanceOrderProcess {
                    process_id: derive_process_id(order_id),
                    trigger: ProcessTrigger::PaymentConfirmed,
                };
                executor.execute(&process_command).await?;
            }
            SystemEvent::Inventory(InventoryEvent::Reserved { order_id, .. }) => {
                let process_command = AdvanceOrderProcess {
                    process_id: derive_process_id(order_id),
                    trigger: ProcessTrigger::InventoryReserved,
                };
                executor.execute(&process_command).await?;
            }
            SystemEvent::Shipping(ShippingEvent::Dispatched { order_id, tracking, .. }) => {
                let process_command = AdvanceOrderProcess {
                    process_id: derive_process_id(order_id),
                    trigger: ProcessTrigger::Shipped { tracking_number: tracking.clone() },
                };
                executor.execute(&process_command).await?;
            }
            _ => {} // Ignore other events
        }
        Ok(())
    }
}

#[derive(Command, Clone)]
struct AdvanceOrderProcess {
    #[stream]
    process_id: StreamId,
    
    trigger: ProcessTrigger,
}

#[derive(Debug, Clone)]
enum ProcessTrigger {
    PaymentConfirmed,
    InventoryReserved,
    Shipped { tracking_number: String },
    Delivered,
    Failed { reason: String },
}
}
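
The handler above assumes a deterministic mapping from an order to its fulfillment process stream. The `derive_process_id` helper is not shown; a minimal sketch of what it could look like, assuming `OrderId` implements `Display`:

#![allow(unused)]
fn main() {
fn derive_process_id(order_id: &OrderId) -> StreamId {
    // One fulfillment process per order, so every event that carries the
    // order id routes to the same process stream.
    StreamId::from(format!("order-fulfillment-{}", order_id))
}
}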

Saga Pattern Implementation

For distributed transactions, implement the saga pattern:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct BookingSaga {
    #[stream]
    saga_id: StreamId,
    
    #[stream]
    reservation_id: StreamId,
    
    steps: Vec<SagaStep>,
    current_step: usize,
    compensation_mode: bool,
}

#[derive(Debug, Clone)]
struct SagaStep {
    name: String,
    command: Box<dyn SerializableCommand>,
    compensation: Box<dyn SerializableCommand>,
    status: StepStatus,
}

#[derive(Debug, Clone, PartialEq)]
enum StepStatus {
    Pending,
    Completed,
    Failed,
    Compensated,
}

#[async_trait]
impl CommandLogic for BookingSaga {
    type State = SagaState;
    type Event = SagaEvent;
    
    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        if state.compensation_mode {
            self.handle_compensation(&read_streams, &state).await
        } else {
            self.handle_forward_execution(&read_streams, &state).await
        }
    }
}

impl BookingSaga {
    async fn handle_forward_execution(
        &self,
        read_streams: &ReadStreams<BookingSagaStreamSet>,
        state: &SagaState,
    ) -> CommandResult<Vec<StreamWrite<BookingSagaStreamSet, SagaEvent>>> {
        if state.current_step >= state.steps.len() {
            // All steps completed
            return Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.saga_id.clone(),
                    SagaEvent::Completed,
                )?
            ]);
        }
        
        let current_step = &state.steps[state.current_step];
        
        match current_step.status {
            StepStatus::Pending => {
                // Execute current step
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::StepStarted {
                            step_index: state.current_step,
                            step_name: current_step.name.clone(),
                        }
                    )?
                ])
            }
            StepStatus::Completed => {
                // Move to next step
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::StepAdvanced {
                            next_step: state.current_step + 1,
                        }
                    )?
                ])
            }
            StepStatus::Failed => {
                // Start compensation
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::CompensationStarted {
                            failed_step: state.current_step,
                        }
                    )?
                ])
            }
            StepStatus::Compensated => unreachable!("Cannot be compensated in forward mode"),
        }
    }
    
    async fn handle_compensation(
        &self,
        read_streams: &ReadStreams<BookingSagaStreamSet>,
        state: &SagaState,
    ) -> CommandResult<Vec<StreamWrite<BookingSagaStreamSet, SagaEvent>>> {
        // Compensate completed steps in reverse order
        let compensation_step = state.steps
            .iter()
            .rposition(|step| step.status == StepStatus::Completed);
        
        match compensation_step {
            Some(index) => {
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::CompensationStepStarted {
                            step_index: index,
                            step_name: state.steps[index].name.clone(),
                        }
                    )?
                ])
            }
            None => {
                // All compensations completed
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::CompensationCompleted,
                    )?
                ])
            }
        }
    }
}

// Example saga for hotel + flight + car booking
fn create_travel_booking_saga(
    hotel_booking: BookHotelCommand,
    flight_booking: BookFlightCommand,
    car_booking: BookCarCommand,
) -> BookingSaga {
    let steps = vec![
        SagaStep {
            name: "book_hotel".to_string(),
            command: Box::new(hotel_booking.clone()),
            compensation: Box::new(CancelHotelCommand {
                booking_id: hotel_booking.booking_id,
            }),
            status: StepStatus::Pending,
        },
        SagaStep {
            name: "book_flight".to_string(),
            command: Box::new(flight_booking.clone()),
            compensation: Box::new(CancelFlightCommand {
                booking_id: flight_booking.booking_id,
            }),
            status: StepStatus::Pending,
        },
        SagaStep {
            name: "book_car".to_string(),
            command: Box::new(car_booking.clone()),
            compensation: Box::new(CancelCarCommand {
                booking_id: car_booking.booking_id,
            }),
            status: StepStatus::Pending,
        },
    ];
    
    BookingSaga {
        saga_id: StreamId::from(format!("booking-saga-{}", SagaId::new())),
        reservation_id: StreamId::from(format!("reservation-{}", ReservationId::new())),
        steps,
        current_step: 0,
        compensation_mode: false,
    }
}
}

Timeout and Retry Handling

Long-running processes need robust timeout and retry logic:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
struct ProcessTimeout {
    timeout_at: DateTime<Utc>,
    retry_policy: RetryPolicy,
    max_retries: u32,
    current_retries: u32,
}

#[derive(Debug, Clone)]
enum RetryPolicy {
    FixedDelay { delay: Duration },
    ExponentialBackoff { base_delay: Duration, max_delay: Duration },
    LinearBackoff { initial_delay: Duration, increment: Duration },
}

impl ProcessTimeout {
    fn should_retry(&self) -> bool {
        self.current_retries < self.max_retries
    }
    
    fn next_retry_delay(&self) -> Duration {
        match &self.retry_policy {
            RetryPolicy::FixedDelay { delay } => *delay,
            RetryPolicy::ExponentialBackoff { base_delay, max_delay } => {
                let delay = *base_delay * 2_u32.pow(self.current_retries);
                std::cmp::min(delay, *max_delay)
            }
            RetryPolicy::LinearBackoff { initial_delay, increment } => {
                *initial_delay + (*increment * self.current_retries)
            }
        }
    }
    
    fn next_timeout(&self) -> DateTime<Utc> {
        Utc::now() + self.next_retry_delay()
    }
}

// Timeout scheduler for processes
#[async_trait]
trait ProcessTimeoutScheduler {
    async fn schedule_timeout(
        &self,
        process_id: StreamId,
        timeout_at: DateTime<Utc>,
    ) -> Result<(), TimeoutError>;
    
    async fn cancel_timeout(
        &self,
        process_id: StreamId,
    ) -> Result<(), TimeoutError>;
}

struct InMemoryTimeoutScheduler {
    timeouts: Arc<RwLock<BTreeMap<DateTime<Utc>, Vec<StreamId>>>>,
    executor: CommandExecutor,
}

impl InMemoryTimeoutScheduler {
    async fn run_timeout_checker(&self) {
        let mut interval = tokio::time::interval(Duration::from_secs(10));
        
        loop {
            interval.tick().await;
            self.check_timeouts().await;
        }
    }
    
    async fn check_timeouts(&self) {
        let now = Utc::now();
        let mut timeouts = self.timeouts.write().await;
        
        // Find expired timeouts
        let expired: Vec<_> = timeouts
            .range(..=now)
            .flat_map(|(_, process_ids)| process_ids.clone())
            .collect();
        
        // Remove expired timeouts
        timeouts.retain(|&timeout_time, _| timeout_time > now);
        
        // Trigger timeout commands
        for process_id in expired {
            let timeout_command = ProcessTimeoutCommand {
                process_id,
                timed_out_at: now,
            };
            
            if let Err(e) = self.executor.execute(&timeout_command).await {
                tracing::error!("Failed to execute timeout command: {}", e);
            }
        }
    }
}

#[derive(Command, Clone)]
struct ProcessTimeoutCommand {
    #[stream]
    process_id: StreamId,
    
    timed_out_at: DateTime<Utc>,
}

#[async_trait]
impl CommandLogic for ProcessTimeoutCommand {
    type State = ProcessState;
    type Event = ProcessEvent;
    
    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Check if process should retry or fail
        let should_retry = state.timeout.as_ref()
            .map(|t| t.should_retry())
            .unwrap_or(false);
        
        if should_retry {
            let next_timeout = state.timeout.as_ref().unwrap().next_timeout();
            
            Ok(vec![
                StreamWrite::new(
                    &read_streams,
                    self.process_id.clone(),
                    ProcessEvent::RetryScheduled {
                        retry_at: next_timeout,
                        attempt: state.timeout.as_ref().unwrap().current_retries + 1,
                    }
                )?
            ])
        } else {
            Ok(vec![
                StreamWrite::new(
                    &read_streams,
                    self.process_id.clone(),
                    ProcessEvent::Failed {
                        reason: "Process timed out after maximum retries".to_string(),
                    }
                )?
            ])
        }
    }
}
}
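
A quick check of the backoff arithmetic (values chosen purely for illustration):

#![allow(unused)]
fn main() {
let timeout = ProcessTimeout {
    timeout_at: Utc::now(),
    retry_policy: RetryPolicy::ExponentialBackoff {
        base_delay: Duration::from_secs(1),
        max_delay: Duration::from_secs(30),
    },
    max_retries: 6,
    current_retries: 5,
};

// 1s * 2^5 = 32s, clamped to the 30s cap.
assert_eq!(timeout.next_retry_delay(), Duration::from_secs(30));
assert!(timeout.should_retry()); // 5 < 6, so one more attempt is allowed
}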

Process Monitoring and Observability

Monitor long-running processes in production:

#![allow(unused)]
fn main() {
use prometheus::{CounterVec, Histogram, Gauge};

lazy_static! {
    static ref PROCESS_STARTED: CounterVec = register_counter_vec!(
        "eventcore_processes_started_total",
        "Total number of processes started",
        &["process_type"]
    ).unwrap();
    
    static ref PROCESS_COMPLETED: CounterVec = register_counter_vec!(
        "eventcore_processes_completed_total",
        "Total number of processes completed",
        &["process_type"]
    ).unwrap();
    
    static ref PROCESS_FAILED: CounterVec = register_counter_vec!(
        "eventcore_processes_failed_total",
        "Total number of processes failed",
        &["process_type"]
    ).unwrap();
    
    static ref PROCESS_DURATION: Histogram = register_histogram!(
        "eventcore_process_duration_seconds",
        "Process execution duration"
    ).unwrap();
    
    static ref ACTIVE_PROCESSES: Gauge = register_gauge!(
        "eventcore_active_processes",
        "Number of currently active processes"
    ).unwrap();
}

#[derive(Clone)]
struct ProcessMetrics {
    process_counts: HashMap<String, ProcessCounts>,
    active_processes: HashSet<StreamId>,
}

#[derive(Debug, Default)]
struct ProcessCounts {
    started: u64,
    completed: u64,
    failed: u64,
    average_duration: Duration,
}

impl ProcessMetrics {
    fn record_process_started(&mut self, process_type: &str, process_id: StreamId) {
        PROCESS_STARTED.with_label_values(&[process_type]).inc();
        
        self.process_counts
            .entry(process_type.to_string())
            .or_default()
            .started += 1;
        
        self.active_processes.insert(process_id);
        ACTIVE_PROCESSES.set(self.active_processes.len() as f64);
    }
    
    fn record_process_completed(
        &mut self, 
        process_type: &str, 
        process_id: StreamId, 
        duration: Duration
    ) {
        PROCESS_COMPLETED.with_label_values(&[process_type]).inc();
        PROCESS_DURATION.observe(duration.as_secs_f64());
        
        let counts = self.process_counts
            .entry(process_type.to_string())
            .or_default();
        counts.completed += 1;
        
        // Update average duration
        let total_completed = counts.completed;
        counts.average_duration = (counts.average_duration * (total_completed - 1) as u32 + duration) 
            / total_completed as u32;
        
        self.active_processes.remove(&process_id);
        ACTIVE_PROCESSES.set(self.active_processes.len() as f64);
    }
    
    fn record_process_failed(&mut self, process_type: &str, process_id: StreamId) {
        PROCESS_FAILED.with_label_values(&[process_type]).inc();
        
        self.process_counts
            .entry(process_type.to_string())
            .or_default()
            .failed += 1;
        
        self.active_processes.remove(&process_id);
        ACTIVE_PROCESSES.set(self.active_processes.len() as f64);
    }
}

// Process health monitoring
#[derive(Debug)]
struct ProcessHealthCheck {
    max_process_age: Duration,
    max_retry_count: u32,
    warning_thresholds: HealthThresholds,
}

#[derive(Debug)]
struct HealthThresholds {
    failure_rate: f64,        // 0.0-1.0
    average_duration: Duration,
    stuck_process_age: Duration,
}

impl ProcessHealthCheck {
    async fn check_process_health(&self, metrics: &ProcessMetrics) -> HealthStatus {
        let mut issues = Vec::new();
        
        for (process_type, counts) in &metrics.process_counts {
            // Check failure rate
            let total = counts.started;
            if total > 0 {
                let failure_rate = counts.failed as f64 / total as f64;
                if failure_rate > self.warning_thresholds.failure_rate {
                    issues.push(format!(
                        "High failure rate for {}: {:.1}%", 
                        process_type, 
                        failure_rate * 100.0
                    ));
                }
            }
            
            // Check average duration
            if counts.average_duration > self.warning_thresholds.average_duration {
                issues.push(format!(
                    "Slow processes for {}: {:?}", 
                    process_type, 
                    counts.average_duration
                ));
            }
        }
        
        // Check for stuck processes
        let stuck_count = self.count_stuck_processes(&metrics.active_processes).await;
        if stuck_count > 0 {
            issues.push(format!("{} processes appear stuck", stuck_count));
        }
        
        if issues.is_empty() {
            HealthStatus::Healthy
        } else {
            HealthStatus::Warning { issues }
        }
    }
    
    async fn count_stuck_processes(&self, active_processes: &HashSet<StreamId>) -> usize {
        // This would query the event store to check process ages
        // Implementation depends on your monitoring setup
        0
    }
}

#[derive(Debug)]
enum HealthStatus {
    Healthy,
    Warning { issues: Vec<String> },
    Critical { issues: Vec<String> },
}
}

Testing Long-Running Processes

Test processes thoroughly:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod process_tests {
    use super::*;
    use eventcore::testing::prelude::*;
    
    #[tokio::test]
    async fn test_order_fulfillment_happy_path() {
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store);
        
        let order_id = OrderId::new();
        let process = OrderFulfillmentProcess::start(order_id).unwrap();
        
        // Start process
        executor.execute(&process).await.unwrap();
        
        // Simulate payment confirmation
        let payment_event = PaymentConfirmed {
            order_id,
            amount: Money::from_cents(1000),
        };
        
        // Process should advance
        let advance_command = AdvanceOrderProcess {
            process_id: process.process_id,
            trigger: ProcessTrigger::PaymentConfirmed,
        };
        executor.execute(&advance_command).await.unwrap();
        
        // Continue with inventory, shipping, etc.
        // Verify process reaches completion
    }
    
    #[tokio::test]
    async fn test_process_timeout_and_retry() {
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store);
        let scheduler = InMemoryTimeoutScheduler::new(executor.clone());
        
        let order_id = OrderId::new();
        let mut process = OrderFulfillmentProcess::start(order_id).unwrap();
        process.timeout_at = Some(Utc::now() + Duration::from_secs(1));
        
        // Start process
        executor.execute(&process).await.unwrap();
        
        // Wait for timeout
        tokio::time::sleep(Duration::from_secs(2)).await;
        
        // Verify timeout was triggered
        // Check retry logic works
    }
    
    #[tokio::test]
    async fn test_saga_compensation() {
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store);
        
        // Create booking saga
        let saga = create_travel_booking_saga(
            create_hotel_booking(),
            create_flight_booking(),
            create_car_booking(),
        );
        
        // Start saga
        executor.execute(&saga).await.unwrap();
        
        // Simulate hotel booking success
        simulate_step_success(&executor, &saga.saga_id, 0).await;
        
        // Simulate flight booking failure
        simulate_step_failure(&executor, &saga.saga_id, 1, "No availability").await;
        
        // Verify compensation started
        // Check hotel booking was cancelled
    }
}
}

Best Practices

  1. Design for failure - Always plan compensation strategies
  2. Use timeouts - Prevent processes from hanging forever
  3. Implement retries - Handle transient failures gracefully
  4. Monitor actively - Track process health in production
  5. Keep state minimal - Only store what’s needed for decisions
  6. Test thoroughly - Include failure scenarios and edge cases
  7. Document workflows - Make process logic clear
  8. Version processes - Handle schema evolution like events

Summary

Long-running processes in EventCore:

  • Stateful workflows - Coordinate complex business processes
  • Event-driven - React to events from other parts of the system
  • Fault tolerant - Handle failures and compensations
  • Monitorable - Track health and performance
  • Testable - Comprehensive testing support

Key patterns:

  1. Use process managers for complex workflows
  2. Implement saga pattern for distributed transactions
  3. Handle timeouts and retries robustly
  4. Monitor process health actively
  5. Test all failure scenarios

Next, let’s explore Distributed Systems

Chapter 5.4: Distributed Systems

EventCore excels in distributed systems where multiple services need to coordinate while maintaining consistency. This chapter covers patterns for building resilient, scalable distributed event-sourced architectures.

Distributed EventCore Architecture

Service Boundaries

Each service owns its event streams and commands:

#![allow(unused)]
fn main() {
// User Service
#[derive(Command, Clone)]
struct CreateUser {
    #[stream]
    user_id: StreamId,
    
    email: Email,
    profile: UserProfile,
}

// Order Service  
#[derive(Command, Clone)]
struct CreateOrder {
    #[stream]
    order_id: StreamId,
    
    #[stream]
    customer_id: StreamId, // References user from User Service
    
    items: Vec<OrderItem>,
}

// Payment Service
#[derive(Command, Clone)]
struct ProcessPayment {
    #[stream]
    payment_id: StreamId,
    
    #[stream]
    order_id: StreamId, // References order from Order Service
    
    amount: Money,
    method: PaymentMethod,
}
}

Event Publishing

Services publish events for other services to consume:

#![allow(unused)]
fn main() {
use eventcore::distributed::{EventPublisher, EventSubscriber};

#[async_trait]
trait EventPublisher {
    async fn publish(&self, event: &StoredEvent) -> Result<(), PublishError>;
}

struct MessageBusPublisher {
    bus: MessageBus,
    topic_mapping: HashMap<String, String>,
}

impl MessageBusPublisher {
    async fn publish_event<E>(&self, event: &StoredEvent<E>) -> Result<(), PublishError>
    where
        E: Serialize,
    {
        let topic = self.topic_mapping
            .get(&E::event_type())
            .ok_or(PublishError::UnknownEventType)?;
        
        let message = DistributedEvent {
            event_id: event.id,
            event_type: E::event_type(),
            stream_id: event.stream_id.clone(),
            version: event.version,
            payload: serde_json::to_value(&event.payload)?,
            metadata: event.metadata.clone(),
            occurred_at: event.occurred_at,
            published_at: Utc::now(),
            service_id: self.service_id(),
        };
        
        self.bus.publish(topic, &message).await?;
        Ok(())
    }
    
    fn service_id(&self) -> String {
        std::env::var("SERVICE_ID").unwrap_or_else(|_| "unknown".to_string())
    }
}

#[derive(Debug, Serialize, Deserialize)]
struct DistributedEvent {
    event_id: EventId,
    event_type: String,
    stream_id: StreamId,
    version: EventVersion,
    payload: serde_json::Value,
    metadata: EventMetadata,
    occurred_at: DateTime<Utc>,
    published_at: DateTime<Utc>,
    service_id: String,
}
}

Event Subscription

Services subscribe to events from other services:

#![allow(unused)]
fn main() {
#[async_trait]
trait EventSubscriber {
    async fn subscribe<F>(&self, topic: &str, handler: F) -> Result<(), SubscribeError>
    where
        F: Fn(DistributedEvent) -> BoxFuture<'static, Result<(), HandleError>> + Send + Sync + 'static;
}

#[derive(Clone)]
struct OrderEventHandler {
    executor: CommandExecutor,
}

impl OrderEventHandler {
    async fn handle_user_events(&self, event: DistributedEvent) -> Result<(), HandleError> {
        match event.event_type.as_str() {
            "UserRegistered" => {
                let user_registered: UserRegisteredEvent = serde_json::from_value(event.payload)?;
                
                // Create customer profile in order service
                let command = CreateCustomerProfile {
                    customer_id: StreamId::from(format!("customer-{}", user_registered.user_id)),
                    user_id: user_registered.user_id,
                    email: user_registered.email,
                    preferences: CustomerPreferences::default(),
                };
                
                self.executor.execute(&command).await?;
            }
            "UserUpdated" => {
                // Handle user updates
                let user_updated: UserUpdatedEvent = serde_json::from_value(event.payload)?;
                
                let command = UpdateCustomerProfile {
                    customer_id: StreamId::from(format!("customer-{}", user_updated.user_id)),
                    email: user_updated.email,
                    profile_updates: user_updated.profile_changes,
                };
                
                self.executor.execute(&command).await?;
            }
            _ => {
                // Unknown event type - log and ignore
                tracing::debug!("Ignoring unknown event type: {}", event.event_type);
            }
        }
        Ok(())
    }
}

// Setup subscription
async fn setup_event_subscriptions(
    subscriber: &impl EventSubscriber,
    handler: OrderEventHandler,
) -> Result<(), SubscribeError> {
    // Subscribe to user events
    let user_handler = handler.clone();
    subscriber.subscribe("user-events", move |event| {
        let handler = user_handler.clone();
        Box::pin(async move {
            handler.handle_user_events(event).await
        })
    }).await?;
    
    // Subscribe to payment events
    let payment_handler = handler;
    subscriber.subscribe("payment-events", move |event| {
        let handler = payment_handler.clone();
        Box::pin(async move {
            handler.handle_payment_events(event).await
        })
    }).await?;
    
    Ok(())
}
}

Distributed Transactions

Handle distributed transactions with the saga pattern:

#![allow(unused)]
fn main() {
#[derive(Command, Clone)]
struct DistributedOrderSaga {
    #[stream]
    saga_id: StreamId,
    
    order_details: OrderDetails,
    customer_id: UserId,
}

#[derive(Default)]
struct DistributedSagaState {
    order_created: bool,
    payment_reserved: bool,
    inventory_reserved: bool,
    shipping_scheduled: bool,
    completed: bool,
    compensation_needed: bool,
    failed_step: Option<String>,
}

#[async_trait]
impl CommandLogic for DistributedOrderSaga {
    type State = DistributedSagaState;
    type Event = SagaEvent;
    
    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        if state.compensation_needed {
            self.handle_compensation(&read_streams, &state).await
        } else {
            self.handle_forward_flow(&read_streams, &state).await
        }
    }
}

impl DistributedOrderSaga {
    async fn handle_forward_flow(
        &self,
        read_streams: &ReadStreams<DistributedOrderSagaStreamSet>,
        state: &DistributedSagaState,
    ) -> CommandResult<Vec<StreamWrite<DistributedOrderSagaStreamSet, SagaEvent>>> {
        match (state.order_created, state.payment_reserved, state.inventory_reserved, state.shipping_scheduled) {
            (false, _, _, _) => {
                // Step 1: Create order
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::OrderCreationRequested {
                            order_details: self.order_details.clone(),
                        }
                    )?
                ])
            }
            (true, false, _, _) => {
                // Step 2: Reserve payment
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::PaymentReservationRequested {
                            customer_id: self.customer_id,
                            amount: self.order_details.total_amount(),
                        }
                    )?
                ])
            }
            (true, true, false, _) => {
                // Step 3: Reserve inventory
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::InventoryReservationRequested {
                            items: self.order_details.items.clone(),
                        }
                    )?
                ])
            }
            (true, true, true, false) => {
                // Step 4: Schedule shipping
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::ShippingScheduleRequested {
                            order_id: self.order_details.order_id,
                            shipping_address: self.order_details.shipping_address.clone(),
                        }
                    )?
                ])
            }
            (true, true, true, true) => {
                // All steps completed
                Ok(vec![
                    StreamWrite::new(
                        read_streams,
                        self.saga_id.clone(),
                        SagaEvent::SagaCompleted,
                    )?
                ])
            }
        }
    }
    
    async fn handle_compensation(
        &self,
        read_streams: &ReadStreams<DistributedOrderSagaStreamSet>,
        state: &DistributedSagaState,
    ) -> CommandResult<Vec<StreamWrite<DistributedOrderSagaStreamSet, SagaEvent>>> {
        // Compensate in reverse order
        if state.shipping_scheduled {
            Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.saga_id.clone(),
                    SagaEvent::ShippingCancellationRequested,
                )?
            ])
        } else if state.inventory_reserved {
            Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.saga_id.clone(),
                    SagaEvent::InventoryReleaseRequested,
                )?
            ])
        } else if state.payment_reserved {
            Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.saga_id.clone(),
                    SagaEvent::PaymentReleaseRequested,
                )?
            ])
        } else if state.order_created {
            Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.saga_id.clone(),
                    SagaEvent::OrderCancellationRequested,
                )?
            ])
        } else {
            Ok(vec![
                StreamWrite::new(
                    read_streams,
                    self.saga_id.clone(),
                    SagaEvent::CompensationCompleted,
                )?
            ])
        }
    }
}

// External service integration
struct ExternalServiceClient {
    http_client: reqwest::Client,
    service_url: String,
    timeout: Duration,
}

impl ExternalServiceClient {
    async fn create_order(&self, order: &OrderDetails) -> Result<OrderId, ServiceError> {
        let response = self.http_client
            .post(&format!("{}/orders", self.service_url))
            .json(order)
            .timeout(self.timeout)
            .send()
            .await?;
        
        if response.status().is_success() {
            let result: CreateOrderResponse = response.json().await?;
            Ok(result.order_id)
        } else {
            Err(ServiceError::RequestFailed {
                status: response.status(),
                body: response.text().await.unwrap_or_default(),
            })
        }
    }
    
    async fn cancel_order(&self, order_id: OrderId) -> Result<(), ServiceError> {
        let response = self.http_client
            .delete(&format!("{}/orders/{}", self.service_url, order_id))
            .timeout(self.timeout)
            .send()
            .await?;
        
        if !response.status().is_success() {
            return Err(ServiceError::RequestFailed {
                status: response.status(),
                body: response.text().await.unwrap_or_default(),
            });
        }
        
        Ok(())
    }
}
}

Event Sourcing Across Services

Cross-Service Projections

Build projections that consume events from multiple services:

#![allow(unused)]
fn main() {
struct CrossServiceOrderProjection {
    orders: HashMap<OrderId, OrderView>,
    event_store: Arc<dyn EventStore>,
    user_service_client: UserServiceClient,
    payment_service_client: PaymentServiceClient,
}

#[derive(Debug, Clone)]
struct OrderView {
    order_id: OrderId,
    customer_info: CustomerInfo,
    items: Vec<OrderItem>,
    payment_status: PaymentStatus,
    shipping_status: ShippingStatus,
    total_amount: Money,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
}

#[async_trait]
impl Projection for CrossServiceOrderProjection {
    type Event = DistributedEvent;
    type Error = ProjectionError;
    
    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match event.payload.event_type.as_str() {
            "OrderCreated" => {
                let order_created: OrderCreatedEvent = 
                    serde_json::from_value(event.payload.payload.clone())?;
                
                // Get customer info from user service
                let customer_info = self.user_service_client
                    .get_customer_info(order_created.customer_id)
                    .await?;
                
                let order_view = OrderView {
                    order_id: order_created.order_id,
                    customer_info,
                    items: order_created.items,
                    payment_status: PaymentStatus::Pending,
                    shipping_status: ShippingStatus::NotStarted,
                    total_amount: order_created.total_amount,
                    created_at: event.occurred_at,
                    updated_at: event.occurred_at,
                };
                
                self.orders.insert(order_created.order_id, order_view);
            }
            "PaymentProcessed" => {
                let payment_processed: PaymentProcessedEvent = 
                    serde_json::from_value(event.payload.payload.clone())?;
                
                if let Some(order) = self.orders.get_mut(&payment_processed.order_id) {
                    order.payment_status = PaymentStatus::Completed;
                    order.updated_at = event.occurred_at;
                }
            }
            "ShipmentDispatched" => {
                let shipment_dispatched: ShipmentDispatchedEvent = 
                    serde_json::from_value(event.payload.payload.clone())?;
                
                if let Some(order) = self.orders.get_mut(&shipment_dispatched.order_id) {
                    order.shipping_status = ShippingStatus::Dispatched;
                    order.updated_at = event.occurred_at;
                }
            }
            _ => {} // Ignore other events
        }
        
        Ok(())
    }
}
}

Event Federation

Federate events across service boundaries:

#![allow(unused)]
fn main() {
struct EventFederationHub {
    publishers: HashMap<String, Box<dyn EventPublisher>>,
    subscribers: HashMap<String, Vec<Box<dyn EventSubscriber>>>,
    routing_rules: RoutingRules,
}

#[derive(Debug, Clone)]
struct RoutingRules {
    routes: Vec<RoutingRule>,
}

#[derive(Debug, Clone)]
struct RoutingRule {
    source_service: String,
    event_pattern: String,
    target_services: Vec<String>,
    transformation: Option<String>,
}

impl EventFederationHub {
    async fn route_event(&self, event: &DistributedEvent) -> Result<(), FederationError> {
        let applicable_rules = self.routing_rules
            .routes
            .iter()
            .filter(|rule| {
                rule.source_service == event.service_id &&
                self.matches_pattern(&event.event_type, &rule.event_pattern)
            });
        
        for rule in applicable_rules {
            let transformed_event = if let Some(ref transformation) = rule.transformation {
                self.transform_event(event, transformation)?
            } else {
                event.clone()
            };
            
            for target_service in &rule.target_services {
                if let Some(publisher) = self.publishers.get(target_service) {
                    publisher.publish_federated_event(&transformed_event).await?;
                }
            }
        }
        
        Ok(())
    }
    
    fn matches_pattern(&self, event_type: &str, pattern: &str) -> bool {
        // Simple pattern matching - could be more sophisticated
        pattern == "*" || 
        pattern == event_type ||
        (pattern.ends_with("*") && event_type.starts_with(&pattern[..pattern.len()-1]))
    }
    
    fn transform_event(&self, event: &DistributedEvent, transformation: &str) -> Result<DistributedEvent, FederationError> {
        // Apply transformation rules
        match transformation {
            "user_to_customer" => {
                let mut transformed = event.clone();
                transformed.event_type = transformed.event_type.replace("User", "Customer");
                Ok(transformed)
            }
            "anonymize_pii" => {
                let mut transformed = event.clone();
                // Remove PII from payload
                if let Some(email) = transformed.payload.get_mut("email") {
                    *email = serde_json::Value::String("***@***.***".to_string());
                }
                Ok(transformed)
            }
            _ => Err(FederationError::UnknownTransformation(transformation.to_string())),
        }
    }
}
}
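
For example, a single routing rule can forward all user events to the order service and apply the user_to_customer transformation defined above. A minimal usage sketch; the service names are illustrative, not EventCore defaults:

#![allow(unused)]
fn main() {
// Illustrative routing configuration for the federation hub above.
let rules = RoutingRules {
    routes: vec![RoutingRule {
        source_service: "user-service".to_string(),
        event_pattern: "User*".to_string(),
        target_services: vec!["order-service".to_string()],
        transformation: Some("user_to_customer".to_string()),
    }],
};
}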

Service Discovery and Health

Service Registry

#![allow(unused)]
fn main() {
#[async_trait]
trait ServiceRegistry {
    async fn register_service(&self, service: ServiceInfo) -> Result<(), RegistryError>;
    async fn discover_services(&self, service_type: &str) -> Result<Vec<ServiceInfo>, RegistryError>;
    async fn health_check(&self, service_id: &str) -> Result<HealthStatus, RegistryError>;
}

#[derive(Debug, Clone)]
struct ServiceInfo {
    id: String,
    name: String,
    service_type: String,
    version: String,
    endpoints: HashMap<String, String>,
    health_check_url: String,
    capabilities: Vec<String>,
    metadata: HashMap<String, String>,
    registered_at: DateTime<Utc>,
}

struct ConsulServiceRegistry {
    consul_client: ConsulClient,
}

impl ConsulServiceRegistry {
    async fn register_eventcore_service(&self) -> Result<(), RegistryError> {
        let service = ServiceInfo {
            id: format!("eventcore-{}", uuid::Uuid::new_v4()),
            name: "order-service".to_string(),
            service_type: "eventcore".to_string(),
            version: env!("CARGO_PKG_VERSION").to_string(),
            endpoints: hashmap! {
                "http".to_string() => "http://localhost:8080".to_string(),
                "grpc".to_string() => "grpc://localhost:8081".to_string(),
                "events".to_string() => "kafka://localhost:9092/order-events".to_string(),
            },
            health_check_url: "http://localhost:8080/health".to_string(),
            capabilities: vec![
                "event-sourcing".to_string(),
                "order-management".to_string(),
                "payment-processing".to_string(),
            ],
            metadata: hashmap! {
                "environment".to_string() => "production".to_string(),
                "region".to_string() => "us-east-1".to_string(),
            },
            registered_at: Utc::now(),
        };
        
        self.register_service(service).await
    }
}
}
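
The same trait can drive a periodic health poll of dependencies. A sketch under stated assumptions: the 30-second interval and log messages are illustrative choices, and HealthStatus and RegistryError are assumed to implement Debug.

#![allow(unused)]
fn main() {
// Sketch of a periodic dependency health poll built on the ServiceRegistry trait above.
async fn monitor_dependencies(
    registry: &impl ServiceRegistry,
    service_type: &str,
) -> Result<(), RegistryError> {
    let mut ticker = tokio::time::interval(std::time::Duration::from_secs(30));
    
    loop {
        ticker.tick().await;
        
        for service in registry.discover_services(service_type).await? {
            match registry.health_check(&service.id).await {
                Ok(status) => tracing::debug!("{} is {:?}", service.name, status),
                Err(e) => tracing::warn!("health check failed for {}: {:?}", service.name, e),
            }
        }
    }
}
}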

Circuit Breaker for Service Calls

#![allow(unused)]
fn main() {
struct ServiceCircuitBreaker {
    state: Arc<RwLock<CircuitBreakerState>>,
    config: CircuitBreakerConfig,
}

#[derive(Debug)]
struct CircuitBreakerConfig {
    failure_threshold: u32,
    timeout: Duration,
    retry_timeout: Duration,
}

#[derive(Debug)]
enum CircuitBreakerState {
    Closed { failure_count: u32 },
    Open { failed_at: DateTime<Utc> },
    HalfOpen,
}

impl ServiceCircuitBreaker {
    async fn call<F, T, E>(&self, operation: F) -> Result<T, CircuitBreakerError<E>>
    where
        F: Future<Output = Result<T, E>>,
    {
        // Check circuit state
        {
            let state = self.state.read().await;
            match *state {
                CircuitBreakerState::Open { failed_at } => {
                    let elapsed = (Utc::now() - failed_at).to_std().unwrap_or_default();
                    if elapsed < self.config.retry_timeout {
                        return Err(CircuitBreakerError::CircuitOpen);
                    }
                    // Retry window elapsed - fall through and transition to half-open below
                }
                _ => {}
            }
        }
        
        // Update to half-open if we were open
        {
            let mut state = self.state.write().await;
            if matches!(*state, CircuitBreakerState::Open { .. }) {
                *state = CircuitBreakerState::HalfOpen;
            }
        }
        
        // Execute operation with timeout
        match tokio::time::timeout(self.config.timeout, operation).await {
            Ok(Ok(result)) => {
                // Success - reset circuit
                let mut state = self.state.write().await;
                *state = CircuitBreakerState::Closed { failure_count: 0 };
                Ok(result)
            }
            Ok(Err(e)) => {
                // Operation failed
                self.record_failure().await;
                Err(CircuitBreakerError::OperationFailed(e))
            }
            Err(_) => {
                // Timeout
                self.record_failure().await;
                Err(CircuitBreakerError::Timeout)
            }
        }
    }
    
    async fn record_failure(&self) {
        let mut state = self.state.write().await;
        match *state {
            CircuitBreakerState::Closed { failure_count } => {
                let new_count = failure_count + 1;
                if new_count >= self.config.failure_threshold {
                    *state = CircuitBreakerState::Open { failed_at: Utc::now() };
                } else {
                    *state = CircuitBreakerState::Closed { failure_count: new_count };
                }
            }
            CircuitBreakerState::HalfOpen => {
                *state = CircuitBreakerState::Open { failed_at: Utc::now() };
            }
            _ => {}
        }
    }
}

#[derive(Debug, thiserror::Error)]
enum CircuitBreakerError<E> {
    #[error("Circuit breaker is open")]
    CircuitOpen,
    
    #[error("Operation timed out")]
    Timeout,
    
    #[error("Operation failed: {0}")]
    OperationFailed(E),
}
}
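
Combined with the ExternalServiceClient from the saga example, the breaker wraps each outbound call. A usage sketch, not an EventCore API; the breaker and client are assumed to be constructed elsewhere:

#![allow(unused)]
fn main() {
// Protect the order-service call from the saga example with the circuit breaker above.
async fn create_order_protected(
    breaker: &ServiceCircuitBreaker,
    client: &ExternalServiceClient,
    order: &OrderDetails,
) -> Result<OrderId, CircuitBreakerError<ServiceError>> {
    breaker.call(client.create_order(order)).await
}
}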

Distributed Monitoring

Distributed Tracing

#![allow(unused)]
fn main() {
use opentelemetry::{global, trace::{TraceContextExt, Tracer}};
use tracing_opentelemetry::OpenTelemetrySpanExt;

#[derive(Clone)]
struct DistributedCommandExecutor {
    inner: CommandExecutor,
    tracer: Box<dyn Tracer + Send + Sync>,
}

impl DistributedCommandExecutor {
    async fn execute_with_tracing<C: Command>(
        &self,
        command: &C,
        parent_context: Option<SpanContext>,
    ) -> CommandResult<ExecutionResult> {
        let span = self.tracer
            .span_builder(format!("execute_command_{}", std::any::type_name::<C>()))
            .with_kind(SpanKind::Internal)
            .start(&self.tracer);
        
        if let Some(parent) = parent_context {
            span.set_parent(parent);
        }
        
        let _guard = span.enter();
        
        span.set_attribute("command.type", std::any::type_name::<C>());
        span.set_attribute("service.name", self.service_name());
        
        match self.inner.execute(command).await {
            Ok(result) => {
                span.set_attribute("command.success", true);
                span.set_attribute("events.written", result.events_written.len() as i64);
                Ok(result)
            }
            Err(e) => {
                span.set_attribute("command.success", false);
                span.set_attribute("error.message", e.to_string());
                Err(e)
            }
        }
    }
}

// Distributed event with trace context
#[derive(Debug, Serialize, Deserialize)]
struct TracedDistributedEvent {
    #[serde(flatten)]
    event: DistributedEvent,
    trace_id: String,
    span_id: String,
}

impl From<(&StoredEvent, &SpanContext)> for TracedDistributedEvent {
    fn from((event, context): (&StoredEvent, &SpanContext)) -> Self {
        Self {
            event: event.into(),
            trace_id: context.trace_id().to_string(),
            span_id: context.span_id().to_string(),
        }
    }
}
}

Metrics Collection

#![allow(unused)]
fn main() {
use prometheus::{CounterVec, Gauge, GaugeVec, HistogramOpts, HistogramVec, Opts, Registry};

#[derive(Clone)]
struct DistributedMetrics {
    registry: Registry,
    // Command metrics
    commands_total: CounterVec,
    command_duration: HistogramVec,
    command_errors: CounterVec,
    // Event metrics
    events_published: CounterVec,
    events_consumed: CounterVec,
    event_lag: GaugeVec,
    // Service metrics
    service_health: Gauge,
    active_connections: Gauge,
}

impl DistributedMetrics {
    fn new(service_name: &str) -> Self {
        let registry = Registry::new();
        
        let commands_total = Counter::new(
            "eventcore_commands_total",
            "Total commands executed"
        ).unwrap();
        
        let command_duration = Histogram::new(
            "eventcore_command_duration_seconds",
            "Command execution duration"
        ).unwrap();
        
        let command_errors = Counter::new(
            "eventcore_command_errors_total", 
            "Total command errors"
        ).unwrap();
        
        let events_published = Counter::new(
            "eventcore_events_published_total",
            "Total events published"
        ).unwrap();
        
        let events_consumed = Counter::new(
            "eventcore_events_consumed_total",
            "Total events consumed"
        ).unwrap();
        
        let event_lag = Gauge::new(
            "eventcore_event_lag_seconds",
            "Event processing lag"
        ).unwrap();
        
        let service_health = Gauge::new(
            "eventcore_service_health",
            "Service health status (0=down, 1=up)"
        ).unwrap();
        
        let active_connections = Gauge::new(
            "eventcore_active_connections",
            "Number of active connections"
        ).unwrap();
        
        // Register all metrics
        registry.register(Box::new(commands_total.clone())).unwrap();
        registry.register(Box::new(command_duration.clone())).unwrap();
        registry.register(Box::new(command_errors.clone())).unwrap();
        registry.register(Box::new(events_published.clone())).unwrap();
        registry.register(Box::new(events_consumed.clone())).unwrap();
        registry.register(Box::new(event_lag.clone())).unwrap();
        registry.register(Box::new(service_health.clone())).unwrap();
        registry.register(Box::new(active_connections.clone())).unwrap();
        
        Self {
            registry,
            commands_total,
            command_duration,
            command_errors,
            events_published,
            events_consumed,
            event_lag,
            service_health,
            active_connections,
        }
    }
    
    fn record_command_executed(&self, command_type: &str, duration: Duration, success: bool) {
        self.commands_total
            .with_label_values(&[command_type])
            .inc();
        
        self.command_duration
            .with_label_values(&[command_type])
            .observe(duration.as_secs_f64());
        
        if !success {
            self.command_errors
                .with_label_values(&[command_type])
                .inc();
        }
    }
    
    fn record_event_published(&self, event_type: &str) {
        self.events_published
            .with_label_values(&[event_type])
            .inc();
    }
    
    fn record_event_consumed(&self, event_type: &str, lag: Duration) {
        self.events_consumed
            .with_label_values(&[event_type])
            .inc();
        
        self.event_lag
            .with_label_values(&[event_type])
            .set(lag.as_secs_f64());
    }
    
    async fn export_metrics(&self) -> String {
        use prometheus::Encoder;
        let encoder = prometheus::TextEncoder::new();
        let metric_families = self.registry.gather();
        encoder.encode_to_string(&metric_families).unwrap()
    }
}
}
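
To wire these into command execution, time each call and record the outcome. A sketch; the executor and metrics handles are assumed to be constructed elsewhere:

#![allow(unused)]
fn main() {
// Usage sketch: record throughput, latency, and errors around each command.
async fn execute_and_record<C: Command>(
    executor: &CommandExecutor,
    metrics: &DistributedMetrics,
    command: &C,
) -> CommandResult<ExecutionResult> {
    let start = std::time::Instant::now();
    let result = executor.execute(command).await;
    
    metrics.record_command_executed(
        std::any::type_name::<C>(),
        start.elapsed(),
        result.is_ok(),
    );
    
    result
}
}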

Testing Distributed Systems

#![allow(unused)]
fn main() {
#[cfg(test)]
mod distributed_tests {
    use super::*;
    use testcontainers::*;
    
    #[tokio::test]
    async fn test_distributed_saga() {
        // Setup test environment with multiple services
        let docker = clients::Cli::default();
        let kafka_container = docker.run(images::kafka::Kafka::default());
        let postgres_container = docker.run(images::postgres::Postgres::default());
        
        // Start services
        let user_service = start_user_service(&postgres_container).await;
        let order_service = start_order_service(&postgres_container).await;
        let payment_service = start_payment_service(&postgres_container).await;
        
        // Setup event routing
        let event_hub = EventFederationHub::new(&kafka_container);
        
        // Execute distributed saga
        let saga = DistributedOrderSaga {
            saga_id: StreamId::from(format!("saga-{}", uuid::Uuid::new_v4())),
            order_details: create_test_order(),
            customer_id: create_test_customer(&user_service).await,
        };
        
        let result = order_service.execute_saga(&saga).await;
        
        // Verify all services were coordinated correctly
        assert!(result.is_ok());
        
        // Verify final state across services
        let order = order_service.get_order(saga.order_details.order_id).await.unwrap();
        assert_eq!(order.status, OrderStatus::Completed);
        
        let payment = payment_service.get_payment(saga.order_details.order_id).await.unwrap();
        assert_eq!(payment.status, PaymentStatus::Completed);
    }
    
    #[tokio::test]
    async fn test_service_failure_compensation() {
        // Similar setup but simulate payment service failure
        // Verify compensation is triggered
        // Verify order is cancelled
        // Verify inventory is released
    }
}
}

Best Practices

  1. Design for independence - Services should be loosely coupled
  2. Use event-driven communication - Prefer async events over sync calls
  3. Implement circuit breakers - Protect against cascading failures
  4. Monitor everything - Comprehensive observability is critical
  5. Plan for failure - Design compensation strategies upfront
  6. Version everything - Events, services, and APIs (see the schema versioning sketch after this list)
  7. Test across services - Include distributed testing
  8. Document service contracts - Clear event schemas and APIs
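
For practice 6, a minimal sketch of explicit event schema versioning using a serde-tagged enum; the UserRegistered variants and the upcasting helper are illustrative, not part of EventCore's API:

#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};

// Illustrative versioned event schema: old versions stay deserializable and
// are upcast to the latest shape when consumed.
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "schema_version")]
enum UserRegistered {
    #[serde(rename = "1")]
    V1 { user_id: String, email: String },
    #[serde(rename = "2")]
    V2 { user_id: String, email: String, marketing_opt_in: bool },
}

impl UserRegistered {
    fn into_latest(self) -> (String, String, bool) {
        match self {
            UserRegistered::V1 { user_id, email } => (user_id, email, false),
            UserRegistered::V2 { user_id, email, marketing_opt_in } => {
                (user_id, email, marketing_opt_in)
            }
        }
    }
}
}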

Summary

Distributed EventCore systems:

  • Service boundaries - Clear ownership of streams and commands
  • Event-driven - Async communication between services
  • Fault tolerant - Circuit breakers and compensation
  • Observable - Distributed tracing and metrics
  • Scalable - Independent scaling of services

Key patterns:

  1. Own your streams - each service owns its event streams
  2. Publish events - share state changes via events
  3. Use sagas - coordinate distributed transactions
  4. Monitor health - track service health and performance
  5. Plan for failure - implement circuit breakers and compensation

Next, let’s explore Performance Optimization

Chapter 5.5: Performance Optimization

EventCore is designed for performance, but complex event-sourced systems need careful optimization. This chapter covers patterns and techniques for maximizing performance in production.

Performance Fundamentals

Key Metrics

Monitor these critical metrics:

#![allow(unused)]
fn main() {
use lazy_static::lazy_static;
use prometheus::{Counter, Histogram, Gauge, register_counter, register_histogram, register_gauge};

lazy_static! {
    // Throughput metrics
    static ref COMMANDS_PER_SECOND: Counter = register_counter!(
        "eventcore_commands_per_second",
        "Commands executed per second"
    ).unwrap();
    
    static ref EVENTS_PER_SECOND: Counter = register_counter!(
        "eventcore_events_per_second", 
        "Events written per second"
    ).unwrap();
    
    // Latency metrics
    static ref COMMAND_LATENCY: Histogram = register_histogram!(
        "eventcore_command_latency_seconds",
        "Command execution latency"
    ).unwrap();
    
    static ref EVENT_STORE_LATENCY: Histogram = register_histogram!(
        "eventcore_event_store_latency_seconds",
        "Event store operation latency"
    ).unwrap();
    
    // Resource usage
    static ref ACTIVE_STREAMS: Gauge = register_gauge!(
        "eventcore_active_streams",
        "Number of active event streams"
    ).unwrap();
    
    static ref MEMORY_USAGE: Gauge = register_gauge!(
        "eventcore_memory_usage_bytes",
        "Memory usage in bytes"
    ).unwrap();
}

#[derive(Debug, Clone)]
struct PerformanceMetrics {
    pub commands_per_second: f64,
    pub events_per_second: f64,
    pub avg_command_latency: Duration,
    pub p95_command_latency: Duration,
    pub p99_command_latency: Duration,
    pub memory_usage_mb: f64,
    pub active_streams: u64,
}

impl PerformanceMetrics {
    fn record_command_executed(&self, duration: Duration) {
        COMMANDS_PER_SECOND.inc();
        COMMAND_LATENCY.observe(duration.as_secs_f64());
    }
    
    fn record_events_written(&self, count: usize) {
        EVENTS_PER_SECOND.inc_by(count as f64);
    }
}
}

Performance Targets

Typical performance targets for EventCore applications:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
struct PerformanceTargets {
    // Throughput targets
    pub min_commands_per_second: f64,      // 100+ commands/sec
    pub min_events_per_second: f64,        // 1000+ events/sec
    
    // Latency targets
    pub max_p50_latency: Duration,         // <10ms
    pub max_p95_latency: Duration,         // <50ms
    pub max_p99_latency: Duration,         // <100ms
    
    // Resource targets
    pub max_memory_usage_mb: f64,          // <1GB per service
    pub max_cpu_usage_percent: f64,        // <70%
}

impl PerformanceTargets {
    fn production() -> Self {
        Self {
            min_commands_per_second: 100.0,
            min_events_per_second: 1000.0,
            max_p50_latency: Duration::from_millis(10),
            max_p95_latency: Duration::from_millis(50),
            max_p99_latency: Duration::from_millis(100),
            max_memory_usage_mb: 1024.0,
            max_cpu_usage_percent: 70.0,
        }
    }
    
    fn development() -> Self {
        Self {
            min_commands_per_second: 10.0,
            min_events_per_second: 100.0,
            max_p50_latency: Duration::from_millis(50),
            max_p95_latency: Duration::from_millis(200),
            max_p99_latency: Duration::from_millis(500),
            max_memory_usage_mb: 512.0,
            max_cpu_usage_percent: 50.0,
        }
    }
}
}
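
A small helper sketch tying these targets to the PerformanceMetrics struct above; the comparison set is illustrative (CPU usage, for example, would come from a separate collector):

#![allow(unused)]
fn main() {
// Illustrative check: compare a metrics snapshot against the configured targets.
impl PerformanceTargets {
    fn is_met_by(&self, metrics: &PerformanceMetrics) -> bool {
        metrics.commands_per_second >= self.min_commands_per_second
            && metrics.events_per_second >= self.min_events_per_second
            && metrics.p95_command_latency <= self.max_p95_latency
            && metrics.p99_command_latency <= self.max_p99_latency
            && metrics.memory_usage_mb <= self.max_memory_usage_mb
    }
}
}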

Event Store Optimization

Connection Pooling

Optimize database connections for high throughput:

#![allow(unused)]
fn main() {
use sqlx::{Pool, Postgres, ConnectOptions};
use std::time::Duration;

#[derive(Debug, Clone)]
struct OptimizedPostgresConfig {
    pub database_url: String,
    pub max_connections: u32,
    pub min_connections: u32,
    pub acquire_timeout: Duration,
    pub idle_timeout: Duration,
    pub max_lifetime: Duration,
    pub connect_timeout: Duration,
    pub command_timeout: Duration,
}

impl OptimizedPostgresConfig {
    fn production() -> Self {
        Self {
            database_url: "postgresql://user:pass@host/db".to_string(),
            max_connections: 20,           // Higher for production
            min_connections: 5,            // Always keep minimum ready
            acquire_timeout: Duration::from_secs(30),
            idle_timeout: Duration::from_secs(600),     // 10 minutes
            max_lifetime: Duration::from_secs(1800),    // 30 minutes
            connect_timeout: Duration::from_secs(5),
            command_timeout: Duration::from_secs(30),
        }
    }
    
    async fn create_pool(&self) -> Result<Pool<Postgres>, sqlx::Error> {
        let options = self.database_url
            .parse::<sqlx::postgres::PgConnectOptions>()?
            .application_name("eventcore-optimized");
        
        sqlx::postgres::PgPoolOptions::new()
            .max_connections(self.max_connections)
            .min_connections(self.min_connections)
            .acquire_timeout(self.acquire_timeout)
            .idle_timeout(self.idle_timeout)
            .max_lifetime(self.max_lifetime)
            .connect_with(options)
            .await
    }
}

struct OptimizedPostgresEventStore {
    pool: Pool<Postgres>,
    config: OptimizedPostgresConfig,
    batch_size: usize,
}

impl OptimizedPostgresEventStore {
    async fn new(config: OptimizedPostgresConfig) -> Result<Self, sqlx::Error> {
        let pool = config.create_pool().await?;
        
        Ok(Self {
            pool,
            config,
            batch_size: 1000, // Optimal batch size for PostgreSQL
        })
    }
}
}
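
Bringing it together at startup, a short usage sketch; in practice the connection string comes from configuration rather than the hard-coded production() default:

#![allow(unused)]
fn main() {
// Usage sketch: build the optimized store with the production pool settings.
async fn build_event_store() -> Result<OptimizedPostgresEventStore, sqlx::Error> {
    let config = OptimizedPostgresConfig::production();
    OptimizedPostgresEventStore::new(config).await
}
}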

Batch Operations

Batch database operations for better throughput:

#![allow(unused)]
fn main() {
#[async_trait]
impl EventStore for OptimizedPostgresEventStore {
    type Event = serde_json::Value;
    type Error = EventStoreError;
    
    async fn write_events_batch(
        &self,
        events: Vec<EventToWrite<Self::Event>>,
    ) -> Result<WriteResult, Self::Error> {
        if events.is_empty() {
            return Ok(WriteResult { events_written: 0 });
        }
        
        // Batch events by stream for version checking
        let mut stream_batches: HashMap<StreamId, Vec<_>> = HashMap::new();
        for event in events {
            stream_batches.entry(event.stream_id.clone()).or_default().push(event);
        }
        
        let mut transaction = self.pool.begin().await?;
        let mut total_written = 0;
        
        for (stream_id, batch) in stream_batches {
            let written = self.write_stream_batch(&mut transaction, stream_id, batch).await?;
            total_written += written;
        }
        
        transaction.commit().await?;
        
        Ok(WriteResult { events_written: total_written })
    }
    
    async fn write_stream_batch(
        &self,
        transaction: &mut sqlx::Transaction<'_, Postgres>,
        stream_id: StreamId,
        events: Vec<EventToWrite<Self::Event>>,
    ) -> Result<usize, EventStoreError> {
        if events.is_empty() {
            return Ok(0);
        }
        
        // Check current version
        let current_version = self.get_stream_version(&mut *transaction, &stream_id).await?;
        
        // Validate expected versions
        let expected_version = events[0].expected_version;
        if expected_version != current_version {
            return Err(EventStoreError::VersionConflict {
                stream: stream_id,
                expected: expected_version,
                actual: current_version,
            });
        }
        
        // Prepare batch insert
        let mut values = Vec::new();
        let mut parameters: Vec<String> = Vec::new();
        let mut param_index = 1;
        
        for (i, event) in events.iter().enumerate() {
            let version = current_version.0 + i as u64 + 1;
            let event_id = EventId::new_v7();
            
            values.push(format!(
                "(${}, ${}, ${}, ${}, ${}, ${}, ${})",
                param_index, param_index + 1, param_index + 2, param_index + 3,
                param_index + 4, param_index + 5, param_index + 6
            ));
            
            parameters.extend([
                event_id.to_string(),
                stream_id.as_ref().to_string(),
                version.to_string(),
                event.event_type.to_string(),
                serde_json::to_string(&event.payload)?,
                serde_json::to_string(&event.metadata)?,
                Utc::now().to_rfc3339(),
            ]);
            
            param_index += 7;
        }
        
        let query = format!(
            "INSERT INTO events (id, stream_id, version, event_type, payload, metadata, occurred_at) VALUES {}",
            values.join(", ")
        );
        
        let mut query_builder = sqlx::query(&query);
        for param in parameters {
            query_builder = query_builder.bind(param);
        }
        
        let rows_affected = query_builder.execute(&mut **transaction).await?.rows_affected();
        
        Ok(rows_affected as usize)
    }
}
}

Read Optimization

Optimize reading patterns:

#![allow(unused)]
fn main() {
impl OptimizedPostgresEventStore {
    // Optimized stream reading with pagination
    async fn read_stream_paginated(
        &self,
        stream_id: &StreamId,
        from_version: EventVersion,
        page_size: usize,
    ) -> Result<StreamEvents<serde_json::Value>, EventStoreError> {
        let query = "
            SELECT id, stream_id, version, event_type, payload, metadata, occurred_at
            FROM events 
            WHERE stream_id = $1 AND version >= $2
            ORDER BY version ASC
            LIMIT $3
        ";
        
        let rows = sqlx::query(query)
            .bind(stream_id.as_ref())
            .bind(from_version.as_ref())
            .bind(page_size as i64)
            .fetch_all(&self.pool)
            .await?;
        
        let events = rows.into_iter()
            .map(|row| self.row_to_event(row))
            .collect::<Result<Vec<_>, _>>()?;
        
        let version = events.last()
            .map(|e| e.version)
            .unwrap_or(from_version);
        
        Ok(StreamEvents {
            stream_id: stream_id.clone(),
            version,
            events,
        })
    }
    
    // Multi-stream reading with parallel queries
    async fn read_multiple_streams(
        &self,
        stream_ids: Vec<StreamId>,
        options: ReadOptions,
    ) -> Result<Vec<StreamEvents<serde_json::Value>>, EventStoreError> {
        let futures = stream_ids.into_iter().map(|stream_id| {
            self.read_stream(&stream_id, options.clone())
        });
        
        let results = futures::future::try_join_all(futures).await?;
        Ok(results)
    }
    
    // Optimized subscription reading
    async fn read_all_events_from(
        &self,
        position: EventPosition,
        batch_size: usize,
    ) -> Result<Vec<StoredEvent<serde_json::Value>>, EventStoreError> {
        let query = "
            SELECT id, stream_id, version, event_type, payload, metadata, occurred_at
            FROM events 
            WHERE occurred_at > $1
            ORDER BY occurred_at ASC
            LIMIT $2
        ";
        
        let rows = sqlx::query(query)
            .bind(position.timestamp)
            .bind(batch_size as i64)
            .fetch_all(&self.pool)
            .await?;
        
        rows.into_iter()
            .map(|row| self.row_to_event(row))
            .collect()
    }
}
}

Memory Optimization

State Management

Optimize memory usage in command state:

#![allow(unused)]
fn main() {
use lru::LruCache;
use std::any::Any;
use std::num::NonZeroUsize;

#[derive(Clone)]
struct OptimizedCommandExecutor {
    event_store: Arc<dyn EventStore<Event = serde_json::Value>>,
    state_cache: Arc<RwLock<LruCache<StreamId, Arc<dyn Any + Send + Sync>>>>,
    cache_size: usize,
}

impl OptimizedCommandExecutor {
    fn new(event_store: Arc<dyn EventStore<Event = serde_json::Value>>) -> Self {
        Self {
            event_store,
            state_cache: Arc::new(RwLock::new(LruCache::new(NonZeroUsize::new(1000).unwrap()))),
            cache_size: 1000,
        }
    }
    
    async fn execute_with_caching<C: Command>(
        &self,
        command: &C,
    ) -> CommandResult<ExecutionResult> {
        let read_streams = self.read_streams_for_command(command).await?;
        
        // Try to get cached state
        let cached_state = self.get_cached_state::<C>(&read_streams).await;
        
        let state = match cached_state {
            Some(state) => state,
            None => {
                // Reconstruct state and cache it
                let state = self.reconstruct_state::<C>(&read_streams).await?;
                self.cache_state(&read_streams, &state).await;
                state
            }
        };
        
        // Execute command
        let mut stream_resolver = StreamResolver::new();
        let events = command.handle(read_streams, state, &mut stream_resolver).await?;
        
        // Write events and invalidate cache
        let result = self.write_events(events).await?;
        self.invalidate_cache_for_streams(&result.affected_streams).await;
        
        Ok(result)
    }
    
    async fn get_cached_state<C: Command>(&self, read_streams: &ReadStreams<C::StreamSet>) -> Option<C::State> {
        let cache = self.state_cache.read().await;
        
        // Check if all streams are cached and up-to-date
        for stream_data in read_streams.iter() {
            if let Some(cached) = cache.peek(&stream_data.stream_id) {
                // Verify cache is current
                if !self.is_cache_current(&stream_data, cached).await {
                    return None;
                }
            } else {
                return None;
            }
        }
        
        // All streams cached - reconstruct state from cache
        self.reconstruct_from_cache(read_streams).await
    }
    
    async fn cache_state<C: Command>(&self, read_streams: &ReadStreams<C::StreamSet>, state: &C::State) {
        let mut cache = self.state_cache.write().await;
        
        for stream_data in read_streams.iter() {
            let cached_data = CachedStreamData {
                stream_id: stream_data.stream_id.clone(),
                version: stream_data.version,
                events: stream_data.events.clone(),
                cached_at: Utc::now(),
            };
            
            cache.put(stream_data.stream_id.clone(), Arc::new(cached_data));
        }
    }
}

#[derive(Debug, Clone)]
struct CachedStreamData {
    stream_id: StreamId,
    version: EventVersion,
    events: Vec<StoredEvent<serde_json::Value>>,
    cached_at: DateTime<Utc>,
}
}

Event Streaming

Stream events instead of loading everything into memory:

#![allow(unused)]
fn main() {
use tokio_stream::{Stream, StreamExt};
use futures::stream::TryStreamExt;

trait StreamingEventStore {
    fn stream_events(
        &self,
        stream_id: &StreamId,
        from_version: EventVersion,
    ) -> impl Stream<Item = Result<StoredEvent<serde_json::Value>, EventStoreError>>;
    
    fn stream_all_events(
        &self,
        from_position: EventPosition,
    ) -> impl Stream<Item = Result<StoredEvent<serde_json::Value>, EventStoreError>>;
}

impl StreamingEventStore for OptimizedPostgresEventStore {
    fn stream_events(
        &self,
        stream_id: &StreamId,
        from_version: EventVersion,
    ) -> impl Stream<Item = Result<StoredEvent<serde_json::Value>, EventStoreError>> {
        let pool = self.pool.clone();
        let stream_id = stream_id.clone();
        let page_size = 100;
        
        async_stream::try_stream! {
            let mut current_version = from_version;
            
            loop {
                let query = "
                    SELECT id, stream_id, version, event_type, payload, metadata, occurred_at
                    FROM events 
                    WHERE stream_id = $1 AND version >= $2
                    ORDER BY version ASC
                    LIMIT $3
                ";
                
                let rows = sqlx::query(query)
                    .bind(stream_id.as_ref())
                    .bind(current_version.as_ref())
                    .bind(page_size as i64)
                    .fetch_all(&pool)
                    .await?;
                
                if rows.is_empty() {
                    break;
                }
                
                let row_count = rows.len();
                
                for row in rows {
                    let event = self.row_to_event(row)?;
                    current_version = EventVersion::from(event.version.as_u64() + 1);
                    yield event;
                }
                
                if row_count < page_size {
                    break;
                }
            }
        }
    }
}

// Usage in projections
#[async_trait]
impl Projection for StreamingProjection {
    type Event = serde_json::Value;
    type Error = ProjectionError;
    
    async fn rebuild_from_stream(
        &mut self,
        event_stream: impl Stream<Item = Result<StoredEvent<Self::Event>, EventStoreError>>,
    ) -> Result<(), Self::Error> {
        let mut stream = std::pin::pin!(event_stream);
        
        while let Some(event_result) = stream.next().await {
            let event = event_result?;
            self.apply(&event).await?;
            
            // Checkpoint every 1000 events
            if event.version.as_u64() % 1000 == 0 {
                self.save_checkpoint(event.version).await?;
            }
        }
        
        Ok(())
    }
}
}

Concurrency Optimization

Parallel Command Execution

Execute independent commands in parallel:

#![allow(unused)]
fn main() {
use std::collections::{BTreeSet, HashMap};
use std::sync::Arc;
use tokio::sync::{Mutex, RwLock, Semaphore};

#[derive(Clone)]
struct ParallelCommandExecutor {
    inner: CommandExecutor,
    concurrency_limit: Arc<Semaphore>,
    stream_locks: Arc<RwLock<HashMap<StreamId, Arc<Mutex<()>>>>>,
}

impl ParallelCommandExecutor {
    fn new(inner: CommandExecutor, max_concurrency: usize) -> Self {
        Self {
            inner,
            concurrency_limit: Arc::new(Semaphore::new(max_concurrency)),
            stream_locks: Arc::new(RwLock::new(HashMap::new())),
        }
    }
    
    async fn execute_batch<C: Command + Clone>(
        &self,
        commands: Vec<C>,
    ) -> Vec<CommandResult<ExecutionResult>> {
        // Group commands by affected streams
        let stream_groups = self.group_by_streams(&commands).await;
        
        let futures = stream_groups.into_iter().map(|(streams, commands)| {
            self.execute_stream_group(streams, commands)
        });
        
        let results = futures::future::join_all(futures).await;
        
        // Flatten results
        results.into_iter().flatten().collect()
    }
    
    async fn execute_stream_group<C: Command>(
        &self,
        affected_streams: BTreeSet<StreamId>,
        commands: Vec<C>,
    ) -> Vec<CommandResult<ExecutionResult>> {
        // Acquire locks for all streams in this group
        let _locks = self.acquire_stream_locks(&affected_streams).await;
        
        // Execute commands sequentially within the group
        let mut results = Vec::new();
        
        for command in commands {
            let _permit = self.concurrency_limit.acquire().await.unwrap();
            let result = self.inner.execute(&command).await;
            results.push(result);
        }
        
        results
    }
    
    async fn group_by_streams<C: Command + Clone>(
        &self,
        commands: &[C],
    ) -> HashMap<BTreeSet<StreamId>, Vec<C>> {
        let mut groups = HashMap::new();
        
        for command in commands {
            let streams: BTreeSet<StreamId> = command.read_streams().into_iter().collect();
            groups.entry(streams).or_insert_with(Vec::new).push(command.clone());
        }
        
        groups
    }
    
    async fn acquire_stream_locks(
        &self,
        stream_ids: &BTreeSet<StreamId>,
    ) -> Vec<tokio::sync::OwnedMutexGuard<()>> {
        let mut locks = Vec::new();
        
        // BTreeSet iteration is already ordered, so overlapping batches always
        // acquire locks in the same order, preventing deadlocks.
        for stream_id in stream_ids {
            let existing = {
                let stream_locks = self.stream_locks.read().await;
                stream_locks.get(stream_id).cloned()
            };
            
            let lock = match existing {
                Some(lock) => lock,
                None => {
                    let mut stream_locks = self.stream_locks.write().await;
                    stream_locks.entry(stream_id.clone())
                        .or_insert_with(|| Arc::new(Mutex::new(())))
                        .clone()
                }
            };
            
            // lock_owned keeps the guard valid after the local Arc goes out of scope.
            locks.push(lock.lock_owned().await);
        }
        
        locks
    }
}
}
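
A usage sketch: fan a batch of independent commands through the parallel executor and count successes. CreateUser is reused from earlier examples and is assumed to implement Clone; the concurrency limit of 32 is an arbitrary illustrative choice.

#![allow(unused)]
fn main() {
// Execute a batch with bounded concurrency and report how many commands succeeded.
async fn import_users(base_executor: CommandExecutor, commands: Vec<CreateUser>) -> usize {
    let parallel = ParallelCommandExecutor::new(base_executor, 32);
    let results = parallel.execute_batch(commands).await;
    results.iter().filter(|result| result.is_ok()).count()
}
}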

Async Batching

Batch operations automatically:

#![allow(unused)]
fn main() {
use std::future::Future;
use tokio::select;
use tokio::sync::{mpsc, oneshot};
use tokio::time::{interval, Duration};

struct BatchProcessor<T, R> {
    sender: mpsc::UnboundedSender<BatchItem<T, R>>,
    batch_size: usize,
    batch_timeout: Duration,
}

struct BatchItem<T, R> {
    item: T,
    response_sender: oneshot::Sender<R>,
}

impl<T, R> BatchProcessor<T, R>
where
    T: Send + 'static,
    R: Send + 'static,
{
    fn new<F, Fut>(
        batch_size: usize,
        batch_timeout: Duration,
        processor: F,
    ) -> Self
    where
        F: Fn(Vec<T>) -> Fut + Send + 'static,
        Fut: Future<Output = Vec<R>> + Send,
    {
        let (sender, receiver) = mpsc::unbounded_channel();
        
        tokio::spawn(Self::batch_worker(receiver, batch_size, batch_timeout, processor));
        
        Self {
            sender,
            batch_size,
            batch_timeout,
        }
    }
    
    async fn process(&self, item: T) -> Result<R, BatchError> {
        let (response_sender, response_receiver) = oneshot::channel();
        
        self.sender.send(BatchItem {
            item,
            response_sender,
        })?;
        
        response_receiver.await.map_err(BatchError::Cancelled)
    }
    
    async fn batch_worker<F, Fut>(
        mut receiver: mpsc::UnboundedReceiver<BatchItem<T, R>>,
        batch_size: usize,
        batch_timeout: Duration,
        processor: F,
    )
    where
        F: Fn(Vec<T>) -> Fut + Send + 'static,
        Fut: Future<Output = Vec<R>> + Send,
    {
        let mut batch = Vec::new();
        let mut senders = Vec::new();
        let mut timer = interval(batch_timeout);
        
        loop {
            select! {
                item = receiver.recv() => {
                    match item {
                        Some(BatchItem { item, response_sender }) => {
                            batch.push(item);
                            senders.push(response_sender);
                            
                            if batch.len() >= batch_size {
                                Self::process_batch(&processor, &mut batch, &mut senders).await;
                            }
                        }
                        None => break, // Channel closed
                    }
                }
                _ = timer.tick() => {
                    if !batch.is_empty() {
                        Self::process_batch(&processor, &mut batch, &mut senders).await;
                    }
                }
            }
        }
    }
    
    async fn process_batch<F, Fut>(
        processor: &F,
        batch: &mut Vec<T>,
        senders: &mut Vec<oneshot::Sender<R>>,
    )
    where
        F: Fn(Vec<T>) -> Fut,
        Fut: Future<Output = Vec<R>>,
    {
        if batch.is_empty() {
            return;
        }
        
        let items = std::mem::take(batch);
        let response_senders = std::mem::take(senders);
        
        let results = processor(items).await;
        
        for (sender, result) in response_senders.into_iter().zip(results) {
            let _ = sender.send(result); // Ignore send errors
        }
    }
}

// Usage for batched event writing
type EventBatch = BatchProcessor<EventToWrite<serde_json::Value>, Result<(), EventStoreError>>;

impl OptimizedPostgresEventStore {
    async fn new_with_batching(config: OptimizedPostgresConfig) -> Result<(Self, EventBatch), sqlx::Error> {
        let store = Self::new(config).await?;
        let store_clone = store.clone();
        
        let batch_processor = BatchProcessor::new(
            100,                           // Batch size
            Duration::from_millis(10),     // Batch timeout
            move |events| {
                let store = store_clone.clone();
                async move {
                    let count = events.len();
                    match store.write_events_batch(events).await {
                        Ok(_) => vec![Ok(()); count],
                        Err(e) => vec![Err(e); count],
                    }
                }
            }
        );
        
        Ok((store, batch_processor))
    }
}
}

Projection Optimization

Incremental Updates

Update projections incrementally:

#![allow(unused)]
fn main() {
#[async_trait]
trait IncrementalProjection {
    type Event;
    type State;
    type Error;
    
    async fn apply_incremental(
        &mut self,
        event: &StoredEvent<Self::Event>,
        previous_state: Option<&Self::State>,
    ) -> Result<Self::State, Self::Error>;
    
    async fn get_checkpoint(&self) -> Result<EventVersion, Self::Error>;
    async fn save_checkpoint(&self, version: EventVersion) -> Result<(), Self::Error>;
}

struct OptimizedUserProjection {
    users: HashMap<UserId, UserSummary>,
    last_processed_version: EventVersion,
    checkpoint_interval: u64,
}

#[async_trait]
impl IncrementalProjection for OptimizedUserProjection {
    type Event = UserEvent;
    type State = HashMap<UserId, UserSummary>;
    type Error = ProjectionError;
    
    async fn apply_incremental(
        &mut self,
        event: &StoredEvent<Self::Event>,
        previous_state: Option<&Self::State>,
    ) -> Result<Self::State, Self::Error> {
        // Clone state if provided, otherwise start fresh
        let mut state = previous_state.cloned().unwrap_or_default();
        
        // Apply only this event
        match &event.payload {
            UserEvent::Registered { user_id, email, profile } => {
                state.insert(*user_id, UserSummary {
                    id: *user_id,
                    email: email.clone(),
                    display_name: profile.display_name(),
                    created_at: event.occurred_at,
                    updated_at: event.occurred_at,
                });
            }
            UserEvent::ProfileUpdated { user_id, profile } => {
                if let Some(user) = state.get_mut(user_id) {
                    user.display_name = profile.display_name();
                    user.updated_at = event.occurred_at;
                }
            }
        }
        
        // Update checkpoint
        self.last_processed_version = event.version;
        
        // Save checkpoint periodically
        if event.version.as_u64() % self.checkpoint_interval == 0 {
            self.save_checkpoint(event.version).await?;
        }
        
        Ok(state)
    }
    
    async fn get_checkpoint(&self) -> Result<EventVersion, Self::Error> {
        Ok(self.last_processed_version)
    }
    
    async fn save_checkpoint(&self, version: EventVersion) -> Result<(), Self::Error> {
        // Save to persistent storage
        // Implementation depends on your checkpoint store
        Ok(())
    }
}
}

Materialized Views

Use database materialized views for query optimization:

-- Create materialized view for user summaries
CREATE MATERIALIZED VIEW user_summaries AS
SELECT 
    (payload->>'user_id')::uuid as user_id,
    payload->>'email' as email,
    payload->'profile'->>'display_name' as display_name,
    occurred_at as created_at,
    occurred_at as updated_at
FROM events 
WHERE event_type = 'UserRegistered'
UNION ALL
SELECT 
    (payload->>'user_id')::uuid as user_id,
    NULL as email,
    payload->'profile'->>'display_name' as display_name,
    NULL as created_at,
    occurred_at as updated_at
FROM events 
WHERE event_type = 'UserProfileUpdated';

-- Create indexes for fast queries
CREATE INDEX idx_user_summaries_user_id ON user_summaries(user_id);
CREATE INDEX idx_user_summaries_email ON user_summaries(email);

-- Refresh materialized view (can be automated)
REFRESH MATERIALIZED VIEW user_summaries;
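
Read models can then query the view directly and trigger refreshes from the application instead of replaying events. A sketch using sqlx; UserSummaryRow and the pool handle are illustrative assumptions, and the uuid column type assumes sqlx's uuid feature:

#![allow(unused)]
fn main() {
use sqlx::{Pool, Postgres};

// Illustrative row type matching the columns of the materialized view above.
#[derive(Debug, sqlx::FromRow)]
struct UserSummaryRow {
    user_id: uuid::Uuid,
    email: Option<String>,
    display_name: Option<String>,
}

async fn find_user_summary(
    pool: &Pool<Postgres>,
    user_id: uuid::Uuid,
) -> Result<Option<UserSummaryRow>, sqlx::Error> {
    sqlx::query_as::<_, UserSummaryRow>(
        "SELECT user_id, email, display_name FROM user_summaries WHERE user_id = $1",
    )
    .bind(user_id)
    .fetch_optional(pool)
    .await
}

async fn refresh_user_summaries(pool: &Pool<Postgres>) -> Result<(), sqlx::Error> {
    sqlx::query("REFRESH MATERIALIZED VIEW user_summaries")
        .execute(pool)
        .await?;
    Ok(())
}
}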

Benchmarking and Profiling

Performance Testing

Create comprehensive benchmarks:

#![allow(unused)]
fn main() {
use criterion::{black_box, criterion_group, criterion_main, Criterion, BenchmarkId};
use tokio::runtime::Runtime;

fn benchmark_command_execution(c: &mut Criterion) {
    let rt = Runtime::new().unwrap();
    let store = rt.block_on(async {
        InMemoryEventStore::new()
    });
    let executor = CommandExecutor::new(store);
    
    let mut group = c.benchmark_group("command_execution");
    
    for concurrency in [1, 10, 50, 100].iter() {
        group.bench_with_input(
            BenchmarkId::new("create_user", concurrency),
            concurrency,
            |b, &concurrency| {
                b.to_async(&rt).iter(|| async {
                    let commands: Vec<_> = (0..concurrency)
                        .map(|i| CreateUser {
                            email: Email::try_new(format!("user{}@example.com", i)).unwrap(),
                            first_name: FirstName::try_new(format!("User{}", i)).unwrap(),
                            last_name: LastName::try_new("Test".to_string()).unwrap(),
                        })
                        .collect();
                    
                    let futures = commands.into_iter().map(|cmd| executor.execute(&cmd));
                    let results = futures::future::join_all(futures).await;
                    
                    black_box(results);
                });
            }
        );
    }
    
    group.finish();
}

fn benchmark_event_store_operations(c: &mut Criterion) {
    let rt = Runtime::new().unwrap();
    let store = rt.block_on(async {
        PostgresEventStore::new("postgresql://localhost/eventcore_bench").await.unwrap()
    });
    
    let mut group = c.benchmark_group("event_store");
    
    for batch_size in [1, 10, 100, 1000].iter() {
        group.bench_with_input(
            BenchmarkId::new("write_events", batch_size),
            batch_size,
            |b, &batch_size| {
                b.to_async(&rt).iter(|| async {
                    let events: Vec<_> = (0..batch_size)
                        .map(|i| EventToWrite {
                            stream_id: StreamId::try_new(format!("test-{}", i)).unwrap(),
                            payload: json!({ "test": i }),
                            metadata: EventMetadata::default(),
                            expected_version: EventVersion::from(0),
                        })
                        .collect();
                    
                    let result = store.write_events(events).await;
                    black_box(result);
                });
            }
        );
    }
    
    group.finish();
}

criterion_group!(benches, benchmark_command_execution, benchmark_event_store_operations);
criterion_main!(benches);
}

Memory Profiling

Profile memory usage patterns:

#![allow(unused)]
fn main() {
use memory_profiler::{Allocator, ProfiledAllocator};

#[global_allocator]
static PROFILED_ALLOCATOR: ProfiledAllocator<std::alloc::System> = ProfiledAllocator::new(std::alloc::System);

#[derive(Debug)]
struct MemoryUsageReport {
    pub allocated_bytes: usize,
    pub deallocated_bytes: usize,
    pub peak_memory: usize,
    pub current_memory: usize,
}

impl MemoryUsageReport {
    fn capture() -> Self {
        let stats = PROFILED_ALLOCATOR.stats();
        Self {
            allocated_bytes: stats.allocated,
            deallocated_bytes: stats.deallocated,
            peak_memory: stats.peak,
            current_memory: stats.current,
        }
    }
}

#[cfg(test)]
mod memory_tests {
    use super::*;
    
    #[tokio::test]
    async fn test_memory_usage_during_batch_execution() {
        let initial_memory = MemoryUsageReport::capture();
        
        // Execute large batch of commands
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store);
        
        let commands: Vec<_> = (0..10000)
            .map(|i| CreateUser {
                email: Email::try_new(format!("user{}@example.com", i)).unwrap(),
                first_name: FirstName::try_new(format!("User{}", i)).unwrap(),
                last_name: LastName::try_new("Test".to_string()).unwrap(),
            })
            .collect();
        
        for command in commands {
            executor.execute(&command).await.unwrap();
        }
        
        // Capture after execution so the allocator's running peak covers the whole batch
        let peak_memory = MemoryUsageReport::capture();
        let final_memory = MemoryUsageReport::capture();
        
        println!("Initial memory: {:?}", initial_memory);
        println!("Peak memory: {:?}", peak_memory);
        println!("Final memory: {:?}", final_memory);
        
        // Assert memory doesn't grow unbounded
        let memory_growth = final_memory.current_memory.saturating_sub(initial_memory.current_memory);
        assert!(memory_growth < 100 * 1024 * 1024, "Memory growth too large: {} bytes", memory_growth);
    }
}
}

Production Monitoring

Performance Dashboards

Create monitoring dashboards:

#![allow(unused)]
fn main() {
use prometheus::{Opts, Registry, TextEncoder, Encoder};
use axum::{response::Html, routing::get, Router};

#[derive(Clone)]
struct PerformanceMonitor {
    registry: Registry,
    metrics: PerformanceMetrics,
}

impl PerformanceMonitor {
    fn new() -> Self {
        let registry = Registry::new();
        let metrics = PerformanceMetrics::new(&registry);
        
        Self { registry, metrics }
    }
    
    async fn metrics_handler(&self) -> String {
        let encoder = TextEncoder::new();
        let metric_families = self.registry.gather();
        encoder.encode_to_string(&metric_families).unwrap()
    }
    
    fn dashboard_routes(&self) -> Router {
        let monitor = self.clone();
        
        Router::new()
            .route("/metrics", get(move || monitor.metrics_handler()))
            .route("/health", get(|| async { "OK" }))
            .route("/dashboard", get(|| async {
                Html(include_str!("performance_dashboard.html"))
            }))
    }
}

// HTML dashboard template
const DASHBOARD_HTML: &str = r#"
<!DOCTYPE html>
<html>
<head>
    <title>EventCore Performance Dashboard</title>
    <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
</head>
<body>
    <h1>EventCore Performance Metrics</h1>
    
    <div style="display: flex; flex-wrap: wrap;">
        <div style="width: 50%; padding: 10px;">
            <canvas id="throughputChart"></canvas>
        </div>
        <div style="width: 50%; padding: 10px;">
            <canvas id="latencyChart"></canvas>
        </div>
        <div style="width: 50%; padding: 10px;">
            <canvas id="memoryChart"></canvas>
        </div>
        <div style="width: 50%; padding: 10px;">
            <canvas id="streamsChart"></canvas>
        </div>
    </div>
    
    <script>
        // Real-time dashboard implementation
        async function updateMetrics() {
            const response = await fetch('/metrics');
            const text = await response.text();
            // Parse Prometheus metrics and update charts
            parseAndUpdateCharts(text);
        }
        
        setInterval(updateMetrics, 5000); // Update every 5 seconds
        updateMetrics(); // Initial load
    </script>
</body>
</html>
"#;
}

Alerting

Set up performance alerts:

#![allow(unused)]
fn main() {
use chrono::{DateTime, Duration, Utc};
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use tokio::sync::Mutex;

#[derive(Clone)]
struct PerformanceAlerting {
    thresholds: PerformanceTargets,
    alert_cooldown: Duration,
    last_alert: Arc<Mutex<HashMap<String, DateTime<Utc>>>>,
    alert_enabled: Arc<AtomicBool>,
}

impl PerformanceAlerting {
    fn new(thresholds: PerformanceTargets) -> Self {
        Self {
            thresholds,
            alert_cooldown: Duration::minutes(5),
            last_alert: Arc::new(Mutex::new(HashMap::new())),
            alert_enabled: Arc::new(AtomicBool::new(true)),
        }
    }
    
    async fn check_metrics(&self, metrics: &PerformanceMetrics) {
        if !self.alert_enabled.load(Ordering::Relaxed) {
            return;
        }
        
        // Check command latency
        if metrics.p95_command_latency > self.thresholds.max_p95_latency {
            self.send_alert(
                "high_latency",
                &format!(
                    "P95 latency is {}ms, threshold is {}ms",
                    metrics.p95_command_latency.as_millis(),
                    self.thresholds.max_p95_latency.as_millis()
                )
            ).await;
        }
        
        // Check throughput
        if metrics.commands_per_second < self.thresholds.min_commands_per_second {
            self.send_alert(
                "low_throughput",
                &format!(
                    "Throughput is {:.1} commands/sec, threshold is {:.1}",
                    metrics.commands_per_second,
                    self.thresholds.min_commands_per_second
                )
            ).await;
        }
        
        // Check memory usage
        if metrics.memory_usage_mb > self.thresholds.max_memory_usage_mb {
            self.send_alert(
                "high_memory",
                &format!(
                    "Memory usage is {:.1}MB, threshold is {:.1}MB",
                    metrics.memory_usage_mb,
                    self.thresholds.max_memory_usage_mb
                )
            ).await;
        }
    }
    
    async fn send_alert(&self, alert_type: &str, message: &str) {
        let mut last_alerts = self.last_alert.lock().await;
        let now = Utc::now();
        
        // Check cooldown
        if let Some(last_time) = last_alerts.get(alert_type) {
            if now.signed_duration_since(*last_time) < self.alert_cooldown {
                return; // Still in cooldown
            }
        }
        
        // Send alert (implement your alerting system)
        self.dispatch_alert(alert_type, message).await;
        
        // Update last alert time
        last_alerts.insert(alert_type.to_string(), now);
    }
    
    async fn dispatch_alert(&self, alert_type: &str, message: &str) {
        // Implementation depends on your alerting system
        // Examples: Slack, PagerDuty, email, etc.
        tracing::error!("PERFORMANCE ALERT [{}]: {}", alert_type, message);
        
        // Example: Send to Slack
        if let Ok(webhook_url) = std::env::var("SLACK_WEBHOOK_URL") {
            let payload = json!({
                "text": format!("🚨 EventCore Performance Alert: {}", message),
                "channel": "#alerts",
                "username": "EventCore Monitor"
            });
            
            let client = reqwest::Client::new();
            let _ = client.post(&webhook_url)
                .json(&payload)
                .send()
                .await;
        }
    }
}
}

Best Practices

  1. Measure first - Always profile before optimizing
  2. Optimize bottlenecks - Focus on the slowest operations
  3. Batch operations - Reduce round trips to storage
  4. Cache wisely - Cache expensive computations, not everything
  5. Stream large datasets - Don’t load everything into memory
  6. Monitor continuously - Track performance in production
  7. Set alerts - Get notified when performance degrades
  8. Test under load - Use realistic workloads in testing

Summary

Performance optimization in EventCore:

  • Comprehensive monitoring - Track all key metrics
  • Database optimization - Connection pooling and batching
  • Memory efficiency - Streaming and caching strategies
  • Concurrency optimization - Parallel execution patterns
  • Production monitoring - Dashboards and alerting

Key strategies:

  1. Optimize the event store with connection pooling and batching
  2. Use streaming for large datasets to minimize memory usage
  3. Implement parallel execution for independent commands
  4. Monitor performance continuously with metrics and alerts
  5. Profile and benchmark to identify bottlenecks

Performance is a journey, not a destination. Measure, optimize, and monitor continuously to ensure your EventCore applications scale effectively in production.

Next, let’s explore the Operations Guide

Security

This section covers security best practices for building secure applications with EventCore.

Key Principles

  1. Defense in Depth - Multiple layers of security
  2. Least Privilege - Grant minimal necessary access
  3. Fail Secure - Default to denying access
  4. Audit Everything - Log security events
  5. Encrypt Sensitive Data - Protect data at rest and in transit

Security Guide

This guide covers security best practices when building applications with EventCore.

Overview

EventCore provides a solid foundation for secure applications through:

  • Strong type safety that prevents many common vulnerabilities
  • Immutable event storage providing natural audit trails
  • Built-in concurrency control preventing data races
  • Configurable resource limits preventing DoS attacks

However, EventCore is a library, not a complete application framework. Security responsibilities are shared between EventCore and your application code.

What EventCore Provides

Type Safety

  • Validated domain types using nutype prevent injection attacks
  • Exhaustive pattern matching eliminates undefined behavior
  • Memory safety guaranteed by Rust

Concurrency Control

  • Optimistic locking prevents lost updates
  • Version checking ensures consistency
  • Atomic multi-stream operations maintain integrity

Resource Protection

  • Configurable timeouts prevent runaway operations
  • Batch size limits prevent memory exhaustion
  • Retry limits prevent infinite loops

What You Must Implement

Authentication & Authorization

EventCore does not provide:

  • User authentication
  • Stream-level access control
  • Command authorization
  • Read model security

You must implement these at the application layer.

Data Protection

EventCore stores events as-is. You must:

  • Encrypt sensitive data before storing
  • Implement key management
  • Handle data retention/deletion
  • Ensure compliance with regulations

Input Validation

While EventCore validates its own types, you must:

  • Validate all user input
  • Sanitize data before processing
  • Implement rate limiting
  • Prevent abuse patterns

Security Layers

┌─────────────────────────────────────┐
│         Application Layer           │
│  • Authentication                   │
│  • Authorization                    │
│  • Input Validation                 │
│  • Rate Limiting                    │
├─────────────────────────────────────┤
│         EventCore Layer             │
│  • Type Safety                      │
│  • Concurrency Control              │
│  • Resource Limits                  │
│  • Audit Trail                      │
├─────────────────────────────────────┤
│         Storage Layer               │
│  • Encryption at Rest               │
│  • Access Control                   │
│  • Backup Security                  │
│  • Network Security                 │
└─────────────────────────────────────┘

Authentication & Authorization

EventCore is authentication-agnostic but provides hooks for integrating your auth system.

Authentication Integration

Capturing User Identity

EventCore’s metadata system captures user identity for audit trails:

#![allow(unused)]
fn main() {
use eventcore::{CommandExecutor, UserId};

// Execute command with authenticated user
let user_id = UserId::try_new("user@example.com")?;
let result = executor
    .execute_as_user(command, user_id)
    .await?;
}

Middleware Pattern

Implement authentication as middleware:

#![allow(unused)]
fn main() {
use axum::{
    extract::{Request, State},
    http::{HeaderMap, StatusCode},
    middleware::Next,
    response::Response,
};

async fn auth_middleware(
    State(auth): State<AuthService>,
    headers: HeaderMap,
    mut req: Request,
    next: Next,
) -> Result<Response, StatusCode> {
    // Extract and verify token
    let token = headers
        .get("Authorization")
        .and_then(|h| h.to_str().ok())
        .ok_or(StatusCode::UNAUTHORIZED)?;
    
    let user = auth
        .verify_token(token)
        .await
        .map_err(|_| StatusCode::UNAUTHORIZED)?;
    
    // Add user to request extensions
    req.extensions_mut().insert(user);
    
    Ok(next.run(req).await)
}
}
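
Wiring the middleware into a router might look like this. A sketch only: handle_transfer is a placeholder handler, and the AuthService and auth_middleware from the block above are assumed to be in scope.

#![allow(unused)]
fn main() {
use axum::{http::StatusCode, middleware, routing::post, Router};

// Placeholder handler; a real one would execute an EventCore command.
async fn handle_transfer() -> StatusCode {
    StatusCode::ACCEPTED
}

fn app(auth: AuthService) -> Router {
    Router::new()
        .route("/commands/transfer", post(handle_transfer))
        // The middleware wraps every route registered above this layer,
        // so each handler can read the authenticated `User` extension
        .layer(middleware::from_fn_with_state(auth, auth_middleware))
}
}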

Authorization Patterns

Stream-Level Authorization

Implement fine-grained access control:

#![allow(unused)]
fn main() {
#[async_trait]
trait StreamAuthorization {
    async fn can_read(&self, user: &User, stream_id: &StreamId) -> bool;
    async fn can_write(&self, user: &User, stream_id: &StreamId) -> bool;
}

struct CommandAuthorizationLayer<A: StreamAuthorization> {
    auth: A,
}

impl<A: StreamAuthorization> CommandAuthorizationLayer<A> {
    async fn authorize_command(
        &self,
        command: &impl Command,
        user: &User,
    ) -> Result<(), AuthError> {
        // Check read permissions
        for stream_id in command.read_streams() {
            if !self.auth.can_read(user, &stream_id).await {
                return Err(AuthError::Forbidden(stream_id));
            }
        }
        
        // Check write permissions
        for stream_id in command.write_streams() {
            if !self.auth.can_write(user, &stream_id).await {
                return Err(AuthError::Forbidden(stream_id));
            }
        }
        
        Ok(())
    }
}
}

Role-Based Access Control (RBAC)

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq)]
enum Role {
    Admin,
    User,
    ReadOnly,
}

#[derive(Debug, Clone)]
struct User {
    id: UserId,
    roles: Vec<Role>,
}

impl User {
    fn has_role(&self, role: &Role) -> bool {
        self.roles.contains(role)
    }
    
    fn can_execute_command(&self, command_type: &str) -> bool {
        match command_type {
            "CreateAccount" => self.has_role(&Role::Admin),
            "UpdateAccount" => {
                self.has_role(&Role::Admin) || self.has_role(&Role::User)
            }
            "ViewAccount" => true, // All authenticated users
            _ => false,
        }
    }
}
}

Attribute-Based Access Control (ABAC)

#![allow(unused)]
fn main() {
#[derive(Debug)]
struct AccessContext {
    user: User,
    resource: Resource,
    action: Action,
    environment: Environment,
}

#[async_trait]
trait AccessPolicy {
    async fn evaluate(&self, context: &AccessContext) -> Decision;
}

struct AbacAuthorizer {
    policies: Vec<Box<dyn AccessPolicy>>,
}

impl AbacAuthorizer {
    async fn authorize(&self, context: AccessContext) -> Result<(), AuthError> {
        for policy in &self.policies {
            match policy.evaluate(&context).await {
                Decision::Deny(reason) => {
                    return Err(AuthError::PolicyDenied(reason));
                }
                Decision::Allow => continue,
            }
        }
        Ok(())
    }
}
}
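
A concrete policy is just another implementation of the trait. The sketch below assumes an Action::Write variant, a String payload in Decision::Deny, and a current_hour field on Environment; none of these are EventCore types.

#![allow(unused)]
fn main() {
struct BusinessHoursPolicy;

#[async_trait]
impl AccessPolicy for BusinessHoursPolicy {
    async fn evaluate(&self, context: &AccessContext) -> Decision {
        // Assumed: Action::Write variant and Environment::current_hour field
        let during_business_hours = (8..18).contains(&context.environment.current_hour);
        if matches!(context.action, Action::Write) && !during_business_hours {
            Decision::Deny("writes are only permitted during business hours".to_string())
        } else {
            Decision::Allow
        }
    }
}
}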

Projection Security

Row-Level Security

Filter projections based on user permissions:

#![allow(unused)]
fn main() {
#[async_trait]
impl ReadModelStore for SecureAccountStore {
    async fn get_account(
        &self,
        account_id: &AccountId,
        user: &User,
    ) -> Result<Option<AccountReadModel>> {
        let account = self.inner.get_account(account_id).await?;
        
        // Apply row-level security
        match account {
            Some(acc) if self.user_can_view(&acc, user) => Ok(Some(acc)),
            _ => Ok(None),
        }
    }
    
    async fn list_accounts(
        &self,
        user: &User,
        filter: AccountFilter,
    ) -> Result<Vec<AccountReadModel>> {
        let accounts = self.inner.list_accounts(filter).await?;
        
        // Filter based on permissions
        Ok(accounts
            .into_iter()
            .filter(|acc| self.user_can_view(acc, user))
            .collect())
    }
}
}
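
The user_can_view helper referenced above can be as simple as an ownership-or-role check. A sketch; the owner_id field on the read model is an assumption:

#![allow(unused)]
fn main() {
impl SecureAccountStore {
    fn user_can_view(&self, account: &AccountReadModel, user: &User) -> bool {
        // Admins see everything; other users only see accounts they own
        user.has_role(&Role::Admin) || account.owner_id == user.id
    }
}
}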

Field-Level Security

Redact sensitive fields:

#![allow(unused)]
fn main() {
impl AccountReadModel {
    fn redact_for_user(&self, user: &User) -> Self {
        let mut redacted = self.clone();
        
        if !user.has_role(&Role::Admin) {
            redacted.ssn = None;
            redacted.tax_id = None;
        }
        
        if !user.has_role(&Role::Financial) {
            redacted.balance = None;
            redacted.credit_limit = None;
        }
        
        redacted
    }
}
}

Best Practices

  1. Fail Secure: Default to denying access
  2. Audit Everything: Log all authorization decisions
  3. Minimize Privileges: Grant only necessary permissions
  4. Separate Concerns: Keep auth logic separate from business logic
  5. Token Expiry: Implement short-lived tokens with refresh (see the sketch after this list)
  6. Rate Limiting: Prevent brute force attacks
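
For token expiry, checking the exp claim on every request is enough to reject stale tokens. A sketch using the jsonwebtoken crate; the Claims shape is an assumption, not an EventCore type:

#![allow(unused)]
fn main() {
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Claims {
    sub: String,
    exp: usize, // expiry as a unix timestamp
}

fn verify_token(token: &str, secret: &[u8]) -> Result<Claims, jsonwebtoken::errors::Error> {
    // Validation::new(...) checks the `exp` claim by default, so expired tokens are rejected
    let data = decode::<Claims>(
        token,
        &DecodingKey::from_secret(secret),
        &Validation::new(Algorithm::HS256),
    )?;
    Ok(data.claims)
}
}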

Common Pitfalls

  • Not checking permissions on read models
  • Forgetting to validate token expiry
  • Exposing internal IDs that enable enumeration
  • Not rate limiting authentication attempts
  • Storing permissions in events (they change over time)

Data Encryption

Events are immutable and permanent. Encrypt sensitive data before storing it.

Encryption Strategies

Field-Level Encryption

Encrypt individual fields containing sensitive data:

#![allow(unused)]
fn main() {
use aes_gcm::{
    aead::{Aead, AeadCore, KeyInit, OsRng},
    Aes256Gcm, Nonce,
};
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct EncryptedField {
    ciphertext: Vec<u8>,
    nonce: Vec<u8>,
    key_id: String, // For key rotation
}

impl EncryptedField {
    fn encrypt(
        plaintext: &str,
        key: &[u8; 32],
        key_id: String,
    ) -> Result<Self, EncryptionError> {
        let cipher = Aes256Gcm::new(key.into());
        let nonce = Aes256Gcm::generate_nonce(&mut OsRng); // unique random nonce per message
        
        let ciphertext = cipher
            .encrypt(&nonce, plaintext.as_bytes())
            .map_err(|_| EncryptionError::EncryptionFailed)?;
        
        Ok(Self {
            ciphertext,
            nonce: nonce.to_vec(),
            key_id,
        })
    }
    
    fn decrypt(&self, key: &[u8; 32]) -> Result<String, EncryptionError> {
        let cipher = Aes256Gcm::new(key.into());
        let nonce = Nonce::from_slice(&self.nonce);
        
        let plaintext = cipher
            .decrypt(nonce, self.ciphertext.as_ref())
            .map_err(|_| EncryptionError::DecryptionFailed)?;
        
        String::from_utf8(plaintext)
            .map_err(|_| EncryptionError::InvalidUtf8)
    }
}
}

Event Payload Encryption

Encrypt entire event payloads:

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
enum SecureEvent {
    #[serde(rename = "encrypted")]
    Encrypted {
        payload: EncryptedField,
        event_type: String,
    },
    // Non-sensitive events can remain unencrypted
    SystemEvent(SystemEvent),
}

impl SecureEvent {
    fn encrypt_event<E: Serialize>(
        event: E,
        event_type: String,
        key: &[u8; 32],
        key_id: String,
    ) -> Result<Self, EncryptionError> {
        let json = serde_json::to_string(&event)?;
        let encrypted = EncryptedField::encrypt(&json, key, key_id)?;
        
        Ok(Self::Encrypted {
            payload: encrypted,
            event_type,
        })
    }
}
}

Key Management

Key Storage

Never store encryption keys in:

  • Source code
  • Configuration files
  • Environment variables (in production)
  • Event payloads

Use proper key management (a minimal vault abstraction is sketched after this list):

  • AWS KMS
  • Azure Key Vault
  • HashiCorp Vault
  • Hardware Security Modules (HSM)
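
A thin abstraction keeps the rest of your code independent of the provider. A minimal sketch under stated assumptions (none of these types are EventCore APIs); concrete implementations would call the KMS or Vault SDKs:

#![allow(unused)]
fn main() {
#[derive(Debug)]
enum KeyVaultError {
    KeyNotFound,
    Unavailable(String),
}

#[async_trait]
trait KeyVault: Send + Sync {
    // Fetch key material by id (e.g. for decrypting an EncryptedField)
    async fn get_key(&self, key_id: &str) -> Result<[u8; 32], KeyVaultError>;
    // Create a new key and return its id (used when rotating)
    async fn create_key(&self) -> Result<String, KeyVaultError>;
    // Delete a key, e.g. for crypto-shredding
    async fn delete_key(&self, key_id: &str) -> Result<(), KeyVaultError>;
}
}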

Key Rotation

Support key rotation without re-encrypting historical data:

#![allow(unused)]
fn main() {
struct KeyManager {
    current_key_id: String,
    keys: HashMap<String, Key>,
}

impl KeyManager {
    fn encrypt(&self, data: &str) -> Result<EncryptedField, Error> {
        let key = self.keys
            .get(&self.current_key_id)
            .ok_or(Error::KeyNotFound)?;
        
        EncryptedField::encrypt(data, &key.material, self.current_key_id.clone())
    }
    
    fn decrypt(&self, field: &EncryptedField) -> Result<String, Error> {
        // Use the key ID stored with the encrypted data
        let key = self.keys
            .get(&field.key_id)
            .ok_or(Error::KeyNotFound)?;
        
        field.decrypt(&key.material)
    }
}
}

Encryption Patterns

Deterministic Encryption

For fields that need equality lookups, derive a deterministic keyed hash; this supports searching without storing reversible ciphertext:

#![allow(unused)]
fn main() {
use sha2::{Sha256, Digest};

fn deterministic_encrypt(
    plaintext: &str,
    key: &[u8; 32],
) -> String {
    let mut hasher = Sha256::new();
    hasher.update(key);
    hasher.update(plaintext.as_bytes());
    
    base64::encode(hasher.finalize())
}

// Usage in events
#[derive(Serialize, Deserialize)]
struct UserRegistered {
    user_id: UserId,
    email_hash: String, // For lookups
    encrypted_email: EncryptedField, // Actual email
}
}

Tokenization

Replace sensitive data with tokens:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
struct Token(String);

trait TokenVault {
    async fn tokenize(&self, value: &str) -> Result<Token, Error>;
    async fn detokenize(&self, token: &Token) -> Result<String, Error>;
}

// Store tokens in events instead of sensitive data
#[derive(Serialize, Deserialize)]
struct PaymentProcessed {
    payment_id: PaymentId,
    card_token: Token, // Not the actual card number
    amount: Money,
}
}

Compliance Considerations

GDPR - Right to Erasure

Since events are immutable, implement crypto-shredding:

#![allow(unused)]
fn main() {
impl KeyManager {
    async fn shred_user_data(&mut self, user_id: &UserId) -> Result<(), Error> {
        // Delete user-specific encryption keys
        self.user_keys.remove(user_id);
        
        // Events remain but are now unreadable
        Ok(())
    }
}
}

PCI DSS

Never store in events:

  • Full credit card numbers
  • CVV/CVC codes
  • PIN numbers
  • Magnetic stripe data

HIPAA

Encrypt all Protected Health Information (PHI):

  • Patient names
  • Medical record numbers
  • Health conditions
  • Treatment information

Performance Considerations

  1. Batch Operations: Encrypt/decrypt in batches when possible
  2. Caching: Cache decrypted data with appropriate TTLs (see the sketch after this list)
  3. Async Operations: Use async encryption for better throughput
  4. Hardware Acceleration: Use AES-NI when available
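
For the caching point above, a small in-process cache with a TTL is often sufficient. A sketch only; sizing, eviction, and zeroizing plaintext are left out:

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct DecryptionCache {
    ttl: Duration,
    entries: HashMap<String, (String, Instant)>,
}

impl DecryptionCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    fn get(&mut self, key: &str) -> Option<String> {
        let expired = match self.entries.get(key) {
            Some((_, inserted_at)) => inserted_at.elapsed() >= self.ttl,
            None => return None,
        };
        if expired {
            // Expired: remove it so plaintext does not linger in memory
            self.entries.remove(key);
            return None;
        }
        self.entries.get(key).map(|(plaintext, _)| plaintext.clone())
    }

    fn insert(&mut self, key: String, plaintext: String) {
        self.entries.insert(key, (plaintext, Instant::now()));
    }
}
}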

Example: Secure User Events

#![allow(unused)]
fn main() {
use eventcore::Event;

#[derive(Debug, Serialize, Deserialize)]
struct SecureUserEvent {
    #[serde(flatten)]
    base: Event,
    #[serde(flatten)]
    payload: SecureUserPayload,
}

#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
enum SecureUserPayload {
    UserRegistered {
        user_id: UserId,
        username: String, // Public
        email_hash: String, // For lookups
        encrypted_pii: EncryptedField, // Name, email, phone
    },
    ProfileUpdated {
        user_id: UserId,
        changes: Vec<ProfileChange>,
        encrypted_changes: Option<EncryptedField>,
    },
}

// Helper for building secure events
struct SecureEventBuilder<'a> {
    crypto: &'a CryptoService,
}

impl<'a> SecureEventBuilder<'a> {
    async fn user_registered(
        &self,
        user_id: UserId,
        username: String,
        email: String,
        pii: PersonalInfo,
    ) -> Result<SecureUserEvent, Error> {
        let email_hash = self.crypto.hash_email(&email);
        let encrypted_pii = self.crypto.encrypt_pii(&pii).await?;
        
        Ok(SecureUserEvent {
            base: Event::new(),
            payload: SecureUserPayload::UserRegistered {
                user_id,
                username,
                email_hash,
                encrypted_pii,
            },
        })
    }
}
}

Input Validation

Proper input validation prevents injection attacks and data corruption.

Validation Layers

1. API Layer Validation

Validate at the edge before data enters your system:

#![allow(unused)]
fn main() {
use axum::{
    extract::Json,
    http::StatusCode,
    response::IntoResponse,
};
use validator::{Validate, ValidationError};

#[derive(Debug, Deserialize, Validate)]
struct CreateUserRequest {
    #[validate(length(min = 3, max = 50))]
    username: String,
    
    #[validate(email)]
    email: String,
    
    #[validate(length(min = 8), custom = "validate_password_strength")]
    password: String,
    
    #[validate(range(min = 13, max = 120))]
    age: u8,
}

fn validate_password_strength(password: &str) -> Result<(), ValidationError> {
    let has_uppercase = password.chars().any(|c| c.is_uppercase());
    let has_lowercase = password.chars().any(|c| c.is_lowercase());
    let has_digit = password.chars().any(|c| c.is_digit(10));
    let has_special = password.chars().any(|c| !c.is_alphanumeric());
    
    if !(has_uppercase && has_lowercase && has_digit && has_special) {
        return Err(ValidationError::new("weak_password"));
    }
    
    Ok(())
}

async fn create_user(
    Json(request): Json<CreateUserRequest>,
) -> Result<impl IntoResponse, StatusCode> {
    // Run the declared validators on the deserialized request
    request.validate()
        .map_err(|_| StatusCode::BAD_REQUEST)?;
    
    // Continue with validated data...
    Ok(StatusCode::CREATED)
}
}

2. Domain Type Validation

Use nutype for domain-level validation:

#![allow(unused)]
fn main() {
use nutype::nutype;

#[nutype(
    sanitize(trim, lowercase),
    validate(
        len_char_min = 3,
        len_char_max = 50,
        regex = r"^[a-z0-9_]+$"
    ),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct Username(String);

#[nutype(
    sanitize(trim),
    validate(regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct Email(String);

#[nutype(
    validate(greater_or_equal = 0, less_or_equal = 1_000_000),
    derive(Debug, Clone, Copy, Serialize, Deserialize)
)]
pub struct Money(u64); // In cents

// Usage
let username = Username::try_new("JohnDoe123")
    .map_err(|_| "Invalid username")?;
    
let email = Email::try_new("john@example.com")
    .map_err(|_| "Invalid email")?;
}

3. Command Validation

Validate business rules in commands:

#![allow(unused)]
fn main() {
use eventcore::{Command, CommandError, require};

#[derive(Debug, Clone)]
struct TransferMoney {
    from_account: AccountId,
    to_account: AccountId,
    amount: Money,
}

impl TransferMoney {
    fn new(
        from: AccountId,
        to: AccountId,
        amount: Money,
    ) -> Result<Self, ValidationError> {
        // Validate at construction
        if from == to {
            return Err(ValidationError::SameAccount);
        }
        
        if amount.is_zero() {
            return Err(ValidationError::ZeroAmount);
        }
        
        Ok(Self {
            from_account: from,
            to_account: to,
            amount,
        })
    }
}

#[async_trait]
impl CommandLogic for TransferMoney {
    async fn handle(&self, state: State) -> CommandResult<Vec<Event>> {
        // Business rule validation
        require!(
            state.from_balance >= self.amount,
            CommandError::InsufficientFunds
        );
        
        require!(
            state.to_account.is_active(),
            CommandError::AccountInactive
        );
        
        require!(
            self.amount <= state.daily_limit_remaining,
            CommandError::DailyLimitExceeded
        );
        
        // Proceed with valid transfer...
        Ok(vec![/* events */])
    }
}
}

Sanitization Patterns

HTML/Script Injection Prevention

#![allow(unused)]
fn main() {
use ammonia::clean;

#[nutype(
    sanitize(trim, with = sanitize_html),
    validate(len_char_max = 1000),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct SafeHtml(String);

fn sanitize_html(input: &str) -> String {
    // Remove dangerous HTML/JS
    clean(input)
}

// For plain text fields
#[nutype(
    sanitize(trim, with = escape_html),
    validate(len_char_max = 500),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct DisplayName(String);

fn escape_html(input: &str) -> String {
    input
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
        .replace('\'', "&#x27;")
}
}

SQL Injection Prevention

EventCore uses parameterized queries via sqlx, but validate data types:

#![allow(unused)]
fn main() {
#[nutype(
    sanitize(trim),
    validate(regex = r"^[a-zA-Z0-9_]+$"), // Alphanumeric + underscore only
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct TableName(String);

#[nutype(
    sanitize(trim),
    validate(regex = r"^[a-zA-Z_][a-zA-Z0-9_]*$"), // Valid identifier
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct ColumnName(String);
}

Rate Limiting

Protect against abuse:

#![allow(unused)]
fn main() {
use std::sync::Arc;
use tokio::sync::Mutex;
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct RateLimiter {
    limits: Arc<Mutex<HashMap<String, Vec<Instant>>>>,
    max_requests: usize,
    window: Duration,
}

impl RateLimiter {
    async fn check_rate_limit(&self, key: &str) -> Result<(), RateLimitError> {
        let mut limits = self.limits.lock().await;
        let now = Instant::now();
        let requests = limits.entry(key.to_string()).or_default();
        
        // Remove old requests outside window
        requests.retain(|&time| now.duration_since(time) < self.window);
        
        if requests.len() >= self.max_requests {
            return Err(RateLimitError::TooManyRequests);
        }
        
        requests.push(now);
        Ok(())
    }
}

// Apply to commands
async fn execute_command(
    command: Command,
    user_id: UserId,
    rate_limiter: &RateLimiter,
) -> Result<(), Error> {
    // Rate limit by user
    rate_limiter.check_rate_limit(&user_id.to_string()).await?;
    
    // Rate limit by IP for anonymous operations
    // rate_limiter.check_rate_limit(&ip_address).await?;
    
    executor.execute(command).await
}
}

File Upload Validation

#![allow(unused)]
fn main() {
use tokio::io::{AsyncRead, AsyncReadExt};

#[derive(Debug)]
struct FileValidator {
    max_size: usize,
    allowed_types: Vec<String>,
}

impl FileValidator {
    async fn validate_upload(
        &self,
        mut file: impl AsyncRead + Unpin,
        content_type: &str,
    ) -> Result<Vec<u8>, ValidationError> {
        // Check content type
        if !self.allowed_types.contains(&content_type.to_string()) {
            return Err(ValidationError::InvalidFileType);
        }
        
        // Read and check size
        let mut buffer = Vec::new();
        let bytes_read = file
            .take(self.max_size as u64 + 1)
            .read_to_end(&mut buffer)
            .await?;
            
        if bytes_read > self.max_size {
            return Err(ValidationError::FileTooLarge);
        }
        
        // Verify file magic numbers
        if !self.verify_file_signature(&buffer, content_type) {
            return Err(ValidationError::InvalidFileContent);
        }
        
        Ok(buffer)
    }
    
    fn verify_file_signature(&self, data: &[u8], content_type: &str) -> bool {
        match content_type {
            "image/jpeg" => data.starts_with(&[0xFF, 0xD8, 0xFF]),
            "image/png" => data.starts_with(&[0x89, 0x50, 0x4E, 0x47]),
            "application/pdf" => data.starts_with(b"%PDF"),
            _ => true, // Add more as needed
        }
    }
}
}

Validation Best Practices

  1. Validate Early: At system boundaries
  2. Fail Fast: Return errors immediately
  3. Be Specific: Provide clear error messages
  4. Whitelist, Don’t Blacklist: Define what’s allowed
  5. Layer Defense: Validate at multiple levels
  6. Log Violations: Track validation failures

Common Mistakes

  • Trusting client-side validation
  • Not validating after deserialization
  • Weak regex patterns
  • Not checking array/collection sizes
  • Forgetting to validate optional fields
  • Not escaping output data

Compliance

EventCore’s immutable audit trail helps with compliance, but you must implement specific controls.

📋 Comprehensive Compliance Checklist

For a detailed compliance checklist covering OWASP, NIST, SOC2, PCI DSS, GDPR, and HIPAA requirements, see our COMPLIANCE_CHECKLIST.md.

This checklist provides actionable items for achieving compliance with major security frameworks and regulations.

GDPR Compliance

Data Protection Principles

  1. Lawfulness: Store only data with legal basis
  2. Purpose Limitation: Use data only for stated purposes
  3. Data Minimization: Store only necessary data
  4. Accuracy: Provide mechanisms to correct data
  5. Storage Limitation: Implement retention policies
  6. Security: Encrypt and protect personal data

Right to Erasure (Right to be Forgotten)

Since events are immutable, use crypto-shredding:

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use uuid::Uuid;

struct GdprCompliantEventStore {
    event_store: Box<dyn EventStore>,
    key_vault: Box<dyn KeyVault>,
    user_keys: HashMap<UserId, KeyId>,
}

impl GdprCompliantEventStore {
    async fn forget_user(&mut self, user_id: UserId) -> Result<(), Error> {
        // 1. Delete user's encryption key
        if let Some(key_id) = self.user_keys.remove(&user_id) {
            self.key_vault.delete_key(key_id).await?;
        }
        
        // 2. Store erasure event for audit
        let erasure_event = UserDataErased {
            user_id: user_id.clone(),
            erased_at: Timestamp::now(),
            reason: "GDPR Article 17 Request".to_string(),
        };
        
        self.event_store
            .append_events(
                &StreamId::from_user(&user_id),
                vec![Event::from(erasure_event)],
            )
            .await?;
        
        // 3. Events remain but PII is now unreadable
        Ok(())
    }
}
}

Data Portability

Export user data in machine-readable format:

#![allow(unused)]
fn main() {
#[async_trait]
trait GdprExport {
    async fn export_user_data(
        &self,
        user_id: UserId,
    ) -> Result<UserDataExport, Error>;
}

#[derive(Serialize)]
struct UserDataExport {
    user_id: UserId,
    export_date: Timestamp,
    profile: UserProfile,
    events: Vec<UserEvent>,
    projections: HashMap<String, Value>,
}

impl EventStore {
    async fn export_user_events(
        &self,
        user_id: &UserId,
    ) -> Result<Vec<UserEvent>, Error> {
        // Collect all events related to user
        let streams = self.find_user_streams(user_id).await?;
        let mut events = Vec::new();
        
        for stream_id in streams {
            let stream_events = self.read_stream(&stream_id).await?;
            events.extend(
                stream_events
                    .into_iter()
                    .filter(|e| e.involves_user(user_id))
                    .map(|e| e.decrypt_for_export())
            );
        }
        
        Ok(events)
    }
}
}

PCI DSS Compliance

Never Store in Events

#![allow(unused)]
fn main() {
// BAD - Never do this
#[derive(Serialize, Deserialize)]
struct PaymentProcessed {
    card_number: String,      // NEVER!
    cvv: String,             // NEVER!
    pin: String,             // NEVER!
}

// GOOD - Store only tokens
#[derive(Serialize, Deserialize)]
struct PaymentProcessed {
    payment_id: PaymentId,
    card_token: CardToken,    // From PCI-compliant tokenizer
    last_four: String,        // "****1234"
    amount: Money,
    merchant_ref: String,
}
}

Audit Requirements

#![allow(unused)]
fn main() {
struct PciAuditLogger {
    logger: Box<dyn AuditLogger>,
}

impl PciAuditLogger {
    async fn log_payment_access(
        &self,
        user: &User,
        action: PaymentAction,
        resource: &str,
    ) -> Result<(), Error> {
        let entry = AuditEntry {
            timestamp: Timestamp::now(),
            user_id: user.id.clone(),
            action: action.to_string(),
            resource: resource.to_string(),
            ip_address: user.ip_address.clone(),
            success: true,
        };
        
        self.logger.log(entry).await
    }
}
}

HIPAA Compliance

Protected Health Information (PHI)

Always encrypt PHI:

#![allow(unused)]
fn main() {
#[derive(Serialize, Deserialize)]
struct PatientRecord {
    patient_id: PatientId,
    // All PHI must be encrypted
    encrypted_name: EncryptedField,
    encrypted_ssn: EncryptedField,
    encrypted_diagnosis: EncryptedField,
    encrypted_medications: EncryptedField,
    // Non-PHI can be unencrypted
    admission_date: Date,
    room_number: String,
}

struct HipaaCompliantStore {
    event_store: Box<dyn EventStore>,
    encryption: EncryptionService,
    audit: AuditService,
}

impl HipaaCompliantStore {
    async fn store_patient_event(
        &self,
        event: PatientEvent,
        accessed_by: UserId,
    ) -> Result<(), Error> {
        // Audit the access
        self.audit.log_phi_access(
            &accessed_by,
            &event.patient_id(),
            "WRITE",
        ).await?;
        
        // Encrypt and store
        let encrypted = self.encryption.encrypt_event(event)?;
        self.event_store.append(encrypted).await?;
        
        Ok(())
    }
}
}

Access Controls

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
enum HipaaRole {
    Doctor,
    Nurse,
    Admin,
    Billing,
}

impl HipaaRole {
    fn can_access_phi(&self) -> bool {
        matches!(self, HipaaRole::Doctor | HipaaRole::Nurse)
    }
    
    fn can_access_billing(&self) -> bool {
        matches!(self, HipaaRole::Admin | HipaaRole::Billing)
    }
}
}

SOX Compliance

Financial Controls

#![allow(unused)]
fn main() {
struct SoxCompliantExecutor {
    executor: CommandExecutor,
    approvals: ApprovalService,
}

impl SoxCompliantExecutor {
    async fn execute_financial_command(
        &self,
        command: FinancialCommand,
        requester: User,
    ) -> Result<(), Error> {
        // Segregation of duties
        if command.amount() > Money::from_dollars(10_000) {
            let approver = self.approvals
                .get_approver(&requester)
                .await?;
                
            self.approvals
                .request_approval(&command, &approver)
                .await?;
        }
        
        // Execute with full audit trail
        let result = self.executor
            .execute_with_metadata(
                command,
                metadata! {
                    "sox_requester" => requester.id,
                    "sox_timestamp" => Timestamp::now(),
                    "sox_ip" => requester.ip_address,
                },
            )
            .await?;
            
        Ok(result)
    }
}
}

General Compliance Features

Audit Trail

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize)]
struct ComplianceAuditEntry {
    timestamp: Timestamp,
    event_id: EventId,
    stream_id: StreamId,
    user_id: UserId,
    action: String,
    regulation: String, // "GDPR", "PCI", "HIPAA"
    details: HashMap<String, String>,
}

trait ComplianceAuditor {
    async fn log_access(&self, entry: ComplianceAuditEntry) -> Result<(), Error>;
    async fn generate_report(
        &self,
        regulation: &str,
        from: Date,
        to: Date,
    ) -> Result<ComplianceReport, Error>;
}
}

Data Retention

#![allow(unused)]
fn main() {
struct RetentionPolicy {
    regulation: String,
    data_type: String,
    retention_days: u32,
    action: RetentionAction,
}

enum RetentionAction {
    Delete,
    Archive,
    Anonymize,
}

struct RetentionManager {
    policies: Vec<RetentionPolicy>,
}

impl RetentionManager {
    async fn apply_retention(&self, event_store: &EventStore) -> Result<(), Error> {
        for policy in &self.policies {
            let cutoff = Timestamp::now() - Duration::days(policy.retention_days);
            
            match policy.action {
                RetentionAction::Delete => {
                    // For GDPR compliance
                    self.crypto_shred_old_data(cutoff).await?;
                }
                RetentionAction::Archive => {
                    // Move to cold storage
                    self.archive_old_events(cutoff).await?;
                }
                RetentionAction::Anonymize => {
                    // Remove PII but keep analytics data
                    self.anonymize_old_events(cutoff).await?;
                }
            }
        }
        Ok(())
    }
}
}

Compliance Checklist

  • Implement encryption for all PII/PHI
  • Set up audit logging for all access
  • Configure data retention policies
  • Implement right to erasure (GDPR)
  • Set up data export capabilities
  • Configure access controls (RBAC/ABAC)
  • Implement approval workflows (SOX)
  • Set up monitoring and alerting
  • Document all compliance measures
  • Regular compliance audits

Part 6: Operations

This part covers the operational aspects of running EventCore applications in production. From deployment strategies to monitoring, backup, and troubleshooting, you’ll learn how to operate EventCore systems reliably at scale.

Chapters in This Part

  1. Deployment Strategies - Production deployment patterns
  2. Monitoring and Metrics - Observability and performance tracking
  3. Backup and Recovery - Data protection and disaster recovery
  4. Troubleshooting - Debugging and problem resolution
  5. Production Checklist - Go-live validation and best practices

What You’ll Learn

  • Deploy EventCore applications safely
  • Monitor system health and performance
  • Implement backup and recovery procedures
  • Troubleshoot common production issues
  • Validate production readiness

Prerequisites

  • Completed Parts 1-5
  • Basic understanding of production deployments
  • Familiarity with containerization and orchestration
  • Knowledge of monitoring and logging concepts

Target Audience

  • DevOps engineers
  • Site reliability engineers
  • Platform engineers
  • Senior developers responsible for production systems

Time to Complete

  • Reading: ~45 minutes
  • With implementation: ~6 hours

Ready to learn production operations? Let’s start with Deployment Strategies

Chapter 6.1: Deployment Strategies

EventCore applications require careful deployment planning to ensure high availability, data consistency, and smooth rollouts. This chapter covers production-ready deployment patterns and strategies.

Container-Based Deployment

Docker Configuration

EventCore applications containerize well with proper configuration:

# Multi-stage build for optimized production image
FROM rust:1.87-slim as builder

WORKDIR /usr/src/app
COPY Cargo.toml Cargo.lock ./
COPY src ./src

# Build with release optimizations
RUN cargo build --release --locked

# Runtime image
FROM debian:bookworm-slim

# Install runtime dependencies (curl is required by the HEALTHCHECK below)
RUN apt-get update && apt-get install -y \
    ca-certificates \
    curl \
    libssl3 \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN useradd -r -s /bin/false eventcore

# Copy application
COPY --from=builder /usr/src/app/target/release/eventcore-app /usr/local/bin/
RUN chmod +x /usr/local/bin/eventcore-app

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

USER eventcore
EXPOSE 8080

CMD ["eventcore-app"]

Environment Configuration

Use environment variables for configuration:

# Database configuration
DATABASE_URL=postgresql://user:pass@db:5432/eventcore
DATABASE_MAX_CONNECTIONS=20
DATABASE_ACQUIRE_TIMEOUT=30s

# Application configuration
HTTP_PORT=8080
LOG_LEVEL=info
LOG_FORMAT=json

# Performance tuning
COMMAND_TIMEOUT=30s
EVENT_BATCH_SIZE=100
PROJECTION_WORKERS=4

# Security
JWT_SECRET_KEY=/run/secrets/jwt_key
CORS_ALLOWED_ORIGINS=https://myapp.com

# Monitoring
METRICS_PORT=9090
TRACING_ENDPOINT=http://jaeger:14268/api/traces
HEALTH_CHECK_INTERVAL=30s
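
At startup the application can parse these variables into a typed configuration struct. A sketch; the Config struct and the humantime dependency are assumptions, not part of EventCore:

#![allow(unused)]
fn main() {
use std::time::Duration;

#[derive(Debug, Clone)]
struct Config {
    database_url: String,
    database_max_connections: u32,
    http_port: u16,
    command_timeout: Duration,
}

impl Config {
    fn from_env() -> Result<Self, Box<dyn std::error::Error>> {
        Ok(Self {
            database_url: std::env::var("DATABASE_URL")?,
            database_max_connections: std::env::var("DATABASE_MAX_CONNECTIONS")
                .unwrap_or_else(|_| "20".to_string())
                .parse()?,
            http_port: std::env::var("HTTP_PORT")
                .unwrap_or_else(|_| "8080".to_string())
                .parse()?,
            // Values like "30s" need a duration parser; humantime is one option
            command_timeout: humantime::parse_duration(
                &std::env::var("COMMAND_TIMEOUT").unwrap_or_else(|_| "30s".to_string()),
            )?,
        })
    }
}
}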

Docker Compose for Development

version: '3.8'

services:
  eventcore-app:
    build: .
    ports:
      - "8080:8080"
      - "9090:9090"
    environment:
      DATABASE_URL: postgresql://postgres:password@postgres:5432/eventcore
      LOG_LEVEL: debug
      METRICS_PORT: 9090
    depends_on:
      postgres:
        condition: service_healthy
    networks:
      - eventcore
    restart: unless-stopped

  postgres:
    image: postgres:17-alpine
    environment:
      POSTGRES_DB: eventcore
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./migrations:/docker-entrypoint-initdb.d
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - eventcore

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9091:9090"
    volumes:
      - ./config/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    networks:
      - eventcore

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - grafana_data:/var/lib/grafana
      - ./config/grafana:/etc/grafana/provisioning
    networks:
      - eventcore

volumes:
  postgres_data:
  prometheus_data:
  grafana_data:

networks:
  eventcore:
    driver: bridge

Kubernetes Deployment

Application Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: eventcore-app
  namespace: eventcore
  labels:
    app: eventcore
    component: application
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: eventcore
      component: application
  template:
    metadata:
      labels:
        app: eventcore
        component: application
    spec:
      serviceAccountName: eventcore
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
      containers:
      - name: eventcore-app
        image: eventcore:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: http
        - containerPort: 9090
          name: metrics
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: eventcore-secrets
              key: database-url
        - name: JWT_SECRET_KEY
          valueFrom:
            secretKeyRef:
              name: eventcore-secrets
              key: jwt-secret
        envFrom:
        - configMapRef:
            name: eventcore-config
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        volumeMounts:
        - name: config
          mountPath: /etc/eventcore
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: eventcore-config
---
apiVersion: v1
kind: Service
metadata:
  name: eventcore-service
  namespace: eventcore
  labels:
    app: eventcore
    component: application
spec:
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  - port: 9090
    targetPort: 9090
    protocol: TCP
    name: metrics
  selector:
    app: eventcore
    component: application
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: eventcore-config
  namespace: eventcore
data:
  HTTP_PORT: "8080"
  METRICS_PORT: "9090"
  LOG_LEVEL: "info"
  LOG_FORMAT: "json"
  COMMAND_TIMEOUT: "30s"
  EVENT_BATCH_SIZE: "100"
  PROJECTION_WORKERS: "4"
  HEALTH_CHECK_INTERVAL: "30s"
---
apiVersion: v1
kind: Secret
metadata:
  name: eventcore-secrets
  namespace: eventcore
type: Opaque
data:
  database-url: <base64-encoded-database-url>
  jwt-secret: <base64-encoded-jwt-secret>
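
The /health and /ready endpoints referenced by the probes can be small handlers in the application itself. A sketch using axum and sqlx; readiness only reports OK once the event store database is reachable:

#![allow(unused)]
fn main() {
use axum::{extract::State, http::StatusCode, routing::get, Router};
use sqlx::PgPool;

// Liveness: the process is up and able to respond.
async fn health() -> StatusCode {
    StatusCode::OK
}

// Readiness: only report ready once the database is reachable.
async fn ready(State(pool): State<PgPool>) -> StatusCode {
    match sqlx::query("SELECT 1").execute(&pool).await {
        Ok(_) => StatusCode::OK,
        Err(_) => StatusCode::SERVICE_UNAVAILABLE,
    }
}

fn probe_routes(pool: PgPool) -> Router {
    Router::new()
        .route("/health", get(health))
        .route("/ready", get(ready))
        .with_state(pool)
}
}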

Database Configuration

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster
  namespace: eventcore
spec:
  instances: 3
  primaryUpdateStrategy: unsupervised
  
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
      effective_cache_size: "1GB"
      maintenance_work_mem: "64MB"
      checkpoint_completion_target: "0.9"
      wal_buffers: "16MB"
      default_statistics_target: "100"
      random_page_cost: "1.1"
      effective_io_concurrency: "200"
    
  bootstrap:
    initdb:
      database: eventcore
      owner: eventcore
      secret:
        name: postgres-credentials
  
  storage:
    size: 100Gi
    storageClass: fast-ssd
  
  monitoring:
    enabled: true
  
  backup:
    target: prefer-standby
    retentionPolicy: "30d"
    data:
      compression: gzip
      encryption: AES256
      jobs: 2
    wal:
      compression: gzip
      encryption: AES256
---
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
  namespace: eventcore
type: kubernetes.io/basic-auth
data:
  username: <base64-encoded-username>
  password: <base64-encoded-password>

Ingress Configuration

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: eventcore-ingress
  namespace: eventcore
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/rate-limit: "100"
    nginx.ingress.kubernetes.io/rate-limit-window: "1m"
spec:
  tls:
  - hosts:
    - api.eventcore.example.com
    secretName: eventcore-tls
  rules:
  - host: api.eventcore.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: eventcore-service
            port:
              number: 80

Blue-Green Deployment

Deployment Strategy

Blue-green deployment ensures zero-downtime updates:

# Blue environment (current production)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eventcore-blue
  namespace: eventcore
  labels:
    app: eventcore
    environment: blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: eventcore
      environment: blue
  template:
    metadata:
      labels:
        app: eventcore
        environment: blue
    spec:
      containers:
      - name: eventcore-app
        image: eventcore:v1.0.0
        # ... container spec
---
# Green environment (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eventcore-green
  namespace: eventcore
  labels:
    app: eventcore
    environment: green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: eventcore
      environment: green
  template:
    metadata:
      labels:
        app: eventcore
        environment: green
    spec:
      containers:
      - name: eventcore-app
        image: eventcore:v1.1.0
        # ... container spec
---
# Service that can switch between environments
apiVersion: v1
kind: Service
metadata:
  name: eventcore-service
  namespace: eventcore
spec:
  selector:
    app: eventcore
    environment: blue  # Switch to 'green' when deploying
  ports:
  - port: 80
    targetPort: 8080

Deployment Script

#!/bin/bash
set -e

NAMESPACE="eventcore"
NEW_VERSION="$1"
CURRENT_ENV="blue"
TARGET_ENV="green"

if [[ -z "$NEW_VERSION" ]]; then
    echo "Usage: $0 <new-version>"
    exit 1
fi

echo "Starting blue-green deployment to version $NEW_VERSION"

# Get current environment
CURRENT_SELECTOR=$(kubectl get service eventcore-service -n $NAMESPACE -o jsonpath='{.spec.selector.environment}')
if [[ "$CURRENT_SELECTOR" == "blue" ]]; then
    TARGET_ENV="green"
    CURRENT_ENV="blue"
else
    TARGET_ENV="blue"
    CURRENT_ENV="green"
fi

echo "Current environment: $CURRENT_ENV"
echo "Target environment: $TARGET_ENV"

# Update target environment with new version
kubectl set image deployment/eventcore-$TARGET_ENV -n $NAMESPACE \
    eventcore-app=eventcore:$NEW_VERSION

# Wait for rollout to complete
kubectl rollout status deployment/eventcore-$TARGET_ENV -n $NAMESPACE

# Health check on target environment
echo "Performing health checks..."
TARGET_POD=$(kubectl get pods -n $NAMESPACE -l environment=$TARGET_ENV -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n $NAMESPACE $TARGET_POD -- curl -f http://localhost:8080/health

# Run smoke tests
echo "Running smoke tests..."
kubectl port-forward -n $NAMESPACE service/eventcore-$TARGET_ENV 8081:80 &
PORT_FORWARD_PID=$!
sleep 5

# Basic functionality test
curl -f http://localhost:8081/health
curl -f http://localhost:8081/metrics

kill $PORT_FORWARD_PID

# Switch traffic to target environment
echo "Switching traffic to $TARGET_ENV environment"
kubectl patch service eventcore-service -n $NAMESPACE \
    -p '{"spec":{"selector":{"environment":"'$TARGET_ENV'"}}}'

echo "Deployment complete. Traffic switched to $TARGET_ENV"
echo "Old environment ($CURRENT_ENV) is still running for rollback if needed"
echo "To rollback: kubectl patch service eventcore-service -n $NAMESPACE -p '{\"spec\":{\"selector\":{\"environment\":\"$CURRENT_ENV\"}}}'"

Canary Deployment

Traffic Splitting with Istio

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: eventcore-canary
  namespace: eventcore
spec:
  hosts:
  - api.eventcore.example.com
  http:
  - match:
    - headers:
        canary:
          exact: "true"
    route:
    - destination:
        host: eventcore-service
        subset: canary
  - route:
    - destination:
        host: eventcore-service
        subset: stable
      weight: 95
    - destination:
        host: eventcore-service
        subset: canary
      weight: 5
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: eventcore-destination
  namespace: eventcore
spec:
  host: eventcore-service
  subsets:
  - name: stable
    labels:
      version: stable
  - name: canary
    labels:
      version: canary

Automated Canary with Flagger

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: eventcore
  namespace: eventcore
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: eventcore
  progressDeadlineSeconds: 60
  service:
    port: 80
    targetPort: 8080
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 30s
    webhooks:
    - name: smoke-test
      type: pre-rollout
      url: http://flagger-loadtester.test/
      timeout: 15s
      metadata:
        type: bash
        cmd: "curl -sd 'test' http://eventcore-canary/health"
    - name: load-test
      url: http://flagger-loadtester.test/
      timeout: 5s
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://eventcore-canary/"

Database Migrations

Schema Migration Strategy

#![allow(unused)]
fn main() {
use sqlx::{migrate::{Migrate, MigrateDatabase}, PgPool, Postgres};

pub struct MigrationManager {
    pool: PgPool,
    migration_path: String,
}

impl MigrationManager {
    pub async fn new(database_url: &str, migration_path: String) -> Result<Self, sqlx::Error> {
        // Ensure database exists
        if !Postgres::database_exists(database_url).await? {
            Postgres::create_database(database_url).await?;
        }
        
        let pool = PgPool::connect(database_url).await?;
        
        Ok(Self {
            pool,
            migration_path,
        })
    }
    
    pub async fn run_migrations(&self) -> Result<(), sqlx::Error> {
        sqlx::migrate::Migrator::new(std::path::Path::new(&self.migration_path))
            .await?
            .run(&self.pool)
            .await?;
        
        Ok(())
    }
    
    pub async fn check_migration_status(&self) -> Result<MigrationStatus, sqlx::Error> {
        let migrator = sqlx::migrate::Migrator::new(std::path::Path::new(&self.migration_path))
            .await?;
        
        // Applied migrations are recorded in sqlx's `_sqlx_migrations` table and
        // are queried through the `Migrate` trait on a live connection
        let mut conn = self.pool.acquire().await?;
        conn.ensure_migrations_table().await?;
        let applied = conn.list_applied_migrations().await?;
        let available = migrator.iter().count();
        
        Ok(MigrationStatus {
            applied: applied.len(),
            available,
            pending: available.saturating_sub(applied.len()),
        })
    }
}

#[derive(Debug)]
pub struct MigrationStatus {
    pub applied: usize,
    pub available: usize,
    pub pending: usize,
}
}
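
A minimal usage sketch (assuming the MigrationManager above and a DATABASE_URL environment variable; the path and error handling are illustrative) runs pending migrations at startup and reports the resulting status:

use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Assumed: DATABASE_URL points at the EventCore PostgreSQL instance
    let database_url = std::env::var("DATABASE_URL")?;

    let manager = MigrationManager::new(&database_url, "./migrations".to_string()).await?;
    manager.run_migrations().await?;

    let status = manager.check_migration_status().await?;
    println!(
        "migrations: {} applied, {} pending",
        status.applied, status.pending
    );

    Ok(())
}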

Migration Files Structure

migrations/
├── 001_initial_schema.sql
├── 002_add_user_preferences.sql
├── 003_optimize_event_indexes.sql
└── 004_add_projection_checkpoints.sql

Example migration:

-- migrations/001_initial_schema.sql
-- Create events table with optimized indexes
CREATE TABLE events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    stream_id VARCHAR(255) NOT NULL,
    version BIGINT NOT NULL,
    event_type VARCHAR(255) NOT NULL,
    payload JSONB NOT NULL,
    metadata JSONB NOT NULL DEFAULT '{}',
    occurred_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    
    CONSTRAINT events_stream_version_unique UNIQUE (stream_id, version)
);

-- Optimized indexes for common query patterns
CREATE INDEX idx_events_stream_id ON events (stream_id);
CREATE INDEX idx_events_stream_id_version ON events (stream_id, version);
CREATE INDEX idx_events_occurred_at ON events (occurred_at);
CREATE INDEX idx_events_event_type ON events (event_type);
CREATE INDEX idx_events_payload_gin ON events USING GIN (payload);

-- Create projection checkpoints table
CREATE TABLE projection_checkpoints (
    projection_name VARCHAR(255) PRIMARY KEY,
    last_event_id UUID,
    last_event_version BIGINT,
    stream_positions JSONB NOT NULL DEFAULT '{}',
    updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_projection_checkpoints_updated_at ON projection_checkpoints (updated_at);

Zero-Downtime Migration Pattern

#!/bin/bash
# Zero-downtime migration script

set -e

DATABASE_URL="$1"
NEW_VERSION="$2"
MIGRATION_PATH="./migrations"

if [[ -z "$DATABASE_URL" || -z "$NEW_VERSION" ]]; then
    echo "Usage: $0 <database-url> <new-version>"
    exit 1
fi

# sqlx reads the connection string from the DATABASE_URL environment variable
export DATABASE_URL

echo "Starting zero-downtime migration process..."

# Step 1: Run additive migrations (safe)
echo "Running additive migrations..."
sqlx migrate run --source $MIGRATION_PATH/additive

# Step 2: Deploy new application version (backward compatible)
echo "Deploying new application version..."
kubectl set image deployment/eventcore-app eventcore-app=eventcore:$NEW_VERSION
kubectl rollout status deployment/eventcore-app

# Step 3: Verify application health
echo "Verifying application health..."
kubectl get pods -l app=eventcore
curl -f http://api.eventcore.example.com/health

# Step 4: Run data migrations (if needed)
echo "Running data migrations..."
sqlx migrate run --source $MIGRATION_PATH/data

# Step 5: Run cleanup migrations (remove old columns/tables)
echo "Running cleanup migrations..."
sqlx migrate run --source $MIGRATION_PATH/cleanup

echo "Zero-downtime migration completed successfully!"

Configuration Management

Environment-Specific Configuration

#![allow(unused)]
fn main() {
use config::{Config, ConfigError, Environment, File};
use serde::Deserialize;

#[derive(Debug, Deserialize, Clone)]
pub struct AppConfig {
    pub database: DatabaseConfig,
    pub server: ServerConfig,
    pub monitoring: MonitoringConfig,
    pub features: FeatureFlags,
}

#[derive(Debug, Deserialize, Clone)]
pub struct DatabaseConfig {
    pub url: String,
    pub max_connections: u32,
    pub acquire_timeout_seconds: u64,
    pub command_timeout_seconds: u64,
}

#[derive(Debug, Deserialize, Clone)]
pub struct ServerConfig {
    pub host: String,
    pub port: u16,
    pub cors_origins: Vec<String>,
    pub request_timeout_seconds: u64,
}

#[derive(Debug, Deserialize, Clone)]
pub struct MonitoringConfig {
    pub metrics_port: u16,
    pub tracing_endpoint: Option<String>,
    pub log_level: String,
    pub health_check_interval_seconds: u64,
}

#[derive(Debug, Deserialize, Clone)]
pub struct FeatureFlags {
    pub enable_metrics: bool,
    pub enable_tracing: bool,
    pub enable_auth: bool,
    pub enable_rate_limiting: bool,
}

impl AppConfig {
    pub fn from_env() -> Result<Self, ConfigError> {
        let environment = std::env::var("ENVIRONMENT").unwrap_or_else(|_| "development".to_string());
        
        let config = Config::builder()
            // Start with default configuration
            .add_source(File::with_name("config/default"))
            // Add environment-specific configuration
            .add_source(File::with_name(&format!("config/{}", environment)).required(false))
            // Add local configuration (for development)
            .add_source(File::with_name("config/local").required(false))
            // Override with environment variables (e.g. EVENTCORE_SERVER__PORT=9090).
            // A "__" separator keeps keys like max_connections from being split on
            // their own underscores.
            .add_source(Environment::with_prefix("EVENTCORE").separator("__"))
            .build()?;
        
        config.try_deserialize()
    }
}
}

Configuration Files

# config/default.yaml
database:
  max_connections: 10
  acquire_timeout_seconds: 30
  command_timeout_seconds: 60

server:
  host: "0.0.0.0"
  port: 8080
  cors_origins: ["http://localhost:3000"]
  request_timeout_seconds: 30

monitoring:
  metrics_port: 9090
  log_level: "info"
  health_check_interval_seconds: 30

features:
  enable_metrics: true
  enable_tracing: false
  enable_auth: false
  enable_rate_limiting: false

# config/production.yaml
database:
  max_connections: 20
  acquire_timeout_seconds: 10
  command_timeout_seconds: 30

server:
  cors_origins: ["https://myapp.com"]
  request_timeout_seconds: 15

monitoring:
  log_level: "warn"
  health_check_interval_seconds: 10

features:
  enable_tracing: true
  enable_auth: true
  enable_rate_limiting: true
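
As a usage sketch (the file layout and field names mirror the example above; the ENVIRONMENT variable and the EVENTCORE_* overrides are assumptions), configuration is loaded once at startup and individual values can still be overridden from the environment:

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Assumed environment: ENVIRONMENT=production selects config/production.yaml,
    // and EVENTCORE_SERVER__PORT=9000 would override server.port at runtime.
    let config = AppConfig::from_env()?;

    println!(
        "listening on {}:{} (max db connections: {})",
        config.server.host, config.server.port, config.database.max_connections
    );

    if config.features.enable_metrics {
        // start the metrics endpoint on config.monitoring.metrics_port ...
    }

    Ok(())
}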

Health Checks and Readiness

Application Health Endpoints

#![allow(unused)]
fn main() {
use async_trait::async_trait;
use axum::{extract::State, response::Json as JsonResponse};
use serde_json::{json, Value};
use std::sync::Arc;

#[derive(Clone)]
pub struct HealthService {
    event_store: Arc<dyn EventStore>,
    dependencies: Vec<Arc<dyn HealthCheck>>,
}

#[async_trait]
pub trait HealthCheck: Send + Sync {
    async fn name(&self) -> &'static str;
    async fn check(&self) -> HealthStatus;
}

#[derive(Debug, Clone)]
pub enum HealthStatus {
    Healthy,
    Unhealthy(String),
    Unknown,
}

impl HealthService {
    pub async fn health_check(&self) -> JsonResponse<Value> {
        let mut overall_healthy = true;
        let mut checks = Vec::new();
        
        // Check event store
        let event_store_status = self.check_event_store().await;
        let event_store_healthy = matches!(event_store_status, HealthStatus::Healthy);
        overall_healthy &= event_store_healthy;
        
        checks.push(json!({
            "name": "event_store",
            "status": if event_store_healthy { "healthy" } else { "unhealthy" },
            "details": match event_store_status {
                HealthStatus::Unhealthy(msg) => Some(msg),
                _ => None,
            }
        }));
        
        // Check dependencies
        for dependency in &self.dependencies {
            let name = dependency.name().await;
            let status = dependency.check().await;
            let healthy = matches!(status, HealthStatus::Healthy);
            overall_healthy &= healthy;
            
            checks.push(json!({
                "name": name,
                "status": if healthy { "healthy" } else { "unhealthy" },
                "details": match status {
                    HealthStatus::Unhealthy(msg) => Some(msg),
                    _ => None,
                }
            }));
        }
        
        let response = json!({
            "status": if overall_healthy { "healthy" } else { "unhealthy" },
            "checks": checks,
            "timestamp": chrono::Utc::now().to_rfc3339(),
            "version": env!("CARGO_PKG_VERSION")
        });
        
        JsonResponse(response)
    }
    
    pub async fn readiness_check(&self) -> JsonResponse<Value> {
        // Readiness is stricter - all components must be ready
        let event_store_ready = self.check_event_store_ready().await;
        let migrations_ready = self.check_migrations_ready().await;
        
        let ready = event_store_ready && migrations_ready;
        
        let response = json!({
            "status": if ready { "ready" } else { "not_ready" },
            "checks": {
                "event_store": event_store_ready,
                "migrations": migrations_ready,
            },
            "timestamp": chrono::Utc::now().to_rfc3339()
        });
        
        JsonResponse(response)
    }
    
    async fn check_event_store(&self) -> HealthStatus {
        match self.event_store.health_check().await {
            Ok(_) => HealthStatus::Healthy,
            Err(e) => HealthStatus::Unhealthy(format!("Event store error: {}", e)),
        }
    }
    
    async fn check_event_store_ready(&self) -> bool {
        // More stringent check for readiness
        self.event_store.ping().await.is_ok()
    }
    
    async fn check_migrations_ready(&self) -> bool {
        // Check if all migrations are applied
        match self.event_store.migration_status().await {
            Ok(status) => status.pending == 0,
            Err(_) => false,
        }
    }
}

// Route handlers
pub async fn health_handler(State(health_service): State<HealthService>) -> JsonResponse<Value> {
    health_service.health_check().await
}

pub async fn readiness_handler(State(health_service): State<HealthService>) -> JsonResponse<Value> {
    health_service.readiness_check().await
}

pub async fn liveness_handler() -> JsonResponse<Value> {
    // Simple liveness check - just return OK if the process is running
    JsonResponse(json!({
        "status": "alive",
        "timestamp": chrono::Utc::now().to_rfc3339()
    }))
}
}
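
To tie these endpoints together, here is a hedged sketch of a dependency check plus the HTTP routing. Using sqlx's PgPool as the checked dependency and the exact route paths are assumptions chosen to line up with the probe configuration below:

use async_trait::async_trait;
use axum::{routing::get, Router};
use sqlx::PgPool;

// Example dependency check: verify the PostgreSQL connection pool responds
struct PostgresHealthCheck {
    pool: PgPool,
}

#[async_trait]
impl HealthCheck for PostgresHealthCheck {
    async fn name(&self) -> &'static str {
        "postgres"
    }

    async fn check(&self) -> HealthStatus {
        match sqlx::query("SELECT 1").execute(&self.pool).await {
            Ok(_) => HealthStatus::Healthy,
            Err(e) => HealthStatus::Unhealthy(format!("postgres error: {}", e)),
        }
    }
}

// Wire the handlers into the HTTP server
fn health_routes(health_service: HealthService) -> Router {
    Router::new()
        .route("/health", get(health_handler))
        .route("/readiness", get(readiness_handler))
        .route("/liveness", get(liveness_handler))
        .with_state(health_service)
}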

Kubernetes Health Check Configuration

# Detailed health check configuration
spec:
  containers:
  - name: eventcore-app
    # Liveness probe - restart container if this fails
    livenessProbe:
      httpGet:
        path: /liveness
        port: 8080
        httpHeaders:
        - name: Accept
          value: application/json
      initialDelaySeconds: 30
      periodSeconds: 30
      timeoutSeconds: 5
      failureThreshold: 3
      successThreshold: 1
    
    # Readiness probe - remove from service if this fails
    readinessProbe:
      httpGet:
        path: /readiness
        port: 8080
        httpHeaders:
        - name: Accept
          value: application/json
      initialDelaySeconds: 5
      periodSeconds: 10
      timeoutSeconds: 3
      failureThreshold: 3
      successThreshold: 1
    
    # Startup probe - give extra time during startup
    startupProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      timeoutSeconds: 3
      failureThreshold: 30
      successThreshold: 1

Best Practices

  1. Containerize everything - Use containers for consistent deployments
  2. Infrastructure as Code - Version control all configuration
  3. Zero-downtime deployments - Use blue-green or canary strategies
  4. Database migrations - Plan for backward compatibility
  5. Health monitoring - Implement comprehensive health checks
  6. Configuration management - Separate config from code
  7. Security - Use secrets management and RBAC
  8. Rollback plans - Always have a rollback strategy

Summary

EventCore deployment strategies:

  • Containerized - Docker and Kubernetes ready
  • Zero-downtime - Blue-green and canary deployments
  • Database migrations - Safe schema evolution
  • Health monitoring - Comprehensive health checks
  • Configuration management - Environment-specific config

Key patterns:

  1. Use containers for consistent, portable deployments
  2. Implement blue-green or canary deployments for zero downtime
  3. Plan database migrations for backward compatibility
  4. Configure comprehensive health checks for reliability
  5. Manage configuration separately from application code

Next, let’s explore Monitoring and Metrics

Chapter 6.2: Monitoring and Metrics

Effective monitoring is crucial for operating EventCore applications in production. This chapter covers comprehensive observability strategies including metrics, logging, tracing, and alerting.

Metrics Collection

Prometheus Integration

EventCore provides built-in Prometheus metrics:

#![allow(unused)]
fn main() {
use prometheus::{
    CounterVec, Gauge, GaugeVec, HistogramVec, IntGauge,
    register_counter_vec, register_gauge, register_gauge_vec,
    register_histogram_vec, register_int_gauge,
    Encoder, TextEncoder
};
use axum::{extract::State, http::StatusCode, response::Response};
use lazy_static::lazy_static;

lazy_static! {
    // Command execution metrics, labelled by command type
    static ref COMMANDS_TOTAL: CounterVec = register_counter_vec!(
        "eventcore_commands_total",
        "Total number of commands executed",
        &["command_type"]
    ).unwrap();
    
    static ref COMMAND_DURATION: HistogramVec = register_histogram_vec!(
        "eventcore_command_duration_seconds",
        "Command execution duration in seconds",
        &["command_type"]
    ).unwrap();
    
    static ref COMMAND_ERRORS: CounterVec = register_counter_vec!(
        "eventcore_command_errors_total",
        "Total number of command execution errors",
        &["command_type"]
    ).unwrap();
    
    // Event store metrics
    static ref EVENTS_WRITTEN: CounterVec = register_counter_vec!(
        "eventcore_events_written_total",
        "Total number of events written to the store",
        &["stream_id"]
    ).unwrap();
    
    static ref EVENT_STORE_LATENCY: HistogramVec = register_histogram_vec!(
        "eventcore_event_store_latency_seconds",
        "Event store operation latency in seconds",
        &["operation"]
    ).unwrap();
    
    // Stream metrics
    static ref ACTIVE_STREAMS: IntGauge = register_int_gauge!(
        "eventcore_active_streams",
        "Number of active event streams"
    ).unwrap();
    
    static ref STREAM_VERSIONS: GaugeVec = register_gauge_vec!(
        "eventcore_stream_versions",
        "Current version of event streams",
        &["stream_id"]
    ).unwrap();
    
    // Projection metrics
    static ref PROJECTION_EVENTS_PROCESSED: CounterVec = register_counter_vec!(
        "eventcore_projection_events_processed_total",
        "Total events processed by projections",
        &["projection_name"]
    ).unwrap();
    
    static ref PROJECTION_LAG: GaugeVec = register_gauge_vec!(
        "eventcore_projection_lag_seconds",
        "Projection lag behind latest events in seconds",
        &["projection_name"]
    ).unwrap();
    
    // System metrics
    static ref MEMORY_USAGE: Gauge = register_gauge!(
        "eventcore_memory_usage_bytes",
        "Memory usage in bytes"
    ).unwrap();
    
    static ref CONNECTION_POOL_SIZE: IntGauge = register_int_gauge!(
        "eventcore_connection_pool_size",
        "Database connection pool size"
    ).unwrap();
}

#[derive(Clone)]
pub struct MetricsService {
    start_time: std::time::Instant,
}

impl MetricsService {
    pub fn new() -> Self {
        Self {
            start_time: std::time::Instant::now(),
        }
    }
    
    pub fn record_command_executed(&self, command_type: &str, duration: std::time::Duration, success: bool) {
        COMMANDS_TOTAL.with_label_values(&[command_type]).inc();
        COMMAND_DURATION.with_label_values(&[command_type]).observe(duration.as_secs_f64());
        
        if !success {
            COMMAND_ERRORS.with_label_values(&[command_type]).inc();
        }
    }
    
    pub fn record_events_written(&self, stream_id: &str, count: usize) {
        EVENTS_WRITTEN.with_label_values(&[stream_id]).inc_by(count as f64);
    }
    
    pub fn record_event_store_operation(&self, operation: &str, duration: std::time::Duration) {
        EVENT_STORE_LATENCY.with_label_values(&[operation]).observe(duration.as_secs_f64());
    }
    
    pub fn update_active_streams(&self, count: i64) {
        ACTIVE_STREAMS.set(count);
    }
    
    pub fn update_stream_version(&self, stream_id: &str, version: f64) {
        STREAM_VERSIONS.with_label_values(&[stream_id]).set(version);
    }
    
    pub fn record_projection_event(&self, projection_name: &str, lag_seconds: f64) {
        PROJECTION_EVENTS_PROCESSED.with_label_values(&[projection_name]).inc();
        PROJECTION_LAG.with_label_values(&[projection_name]).set(lag_seconds);
    }
    
    pub fn update_memory_usage(&self, bytes: f64) {
        MEMORY_USAGE.set(bytes);
    }
    
    pub fn update_connection_pool_size(&self, size: i64) {
        CONNECTION_POOL_SIZE.set(size);
    }
    
    pub async fn export_metrics(&self) -> Result<Response<String>, StatusCode> {
        let encoder = TextEncoder::new();
        let metric_families = prometheus::gather();
        
        match encoder.encode_to_string(&metric_families) {
            Ok(output) => {
                let response = Response::builder()
                    .status(StatusCode::OK)
                    .header("Content-Type", encoder.format_type())
                    .body(output)
                    .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
                Ok(response)
            }
            Err(_) => Err(StatusCode::INTERNAL_SERVER_ERROR),
        }
    }
}

// Metrics endpoint handler
pub async fn metrics_handler(
    State(metrics_service): State<MetricsService>
) -> Result<Response<String>, StatusCode> {
    metrics_service.export_metrics().await
}
}

Custom Metrics

Define application-specific metrics:

#![allow(unused)]
fn main() {
use lazy_static::lazy_static;
use prometheus::{
    register_counter, register_counter_vec, register_histogram, register_histogram_vec,
    Counter, CounterVec, Histogram, HistogramVec,
};

lazy_static! {
    // Business metrics
    static ref USER_REGISTRATIONS: Counter = register_counter!(
        "eventcore_user_registrations_total",
        "Total number of user registrations"
    ).unwrap();
    
    static ref ORDER_VALUE: Histogram = register_histogram!(
        "eventcore_order_value_dollars",
        "Order value in dollars",
        vec![10.0, 50.0, 100.0, 500.0, 1000.0, 5000.0]
    ).unwrap();
    
    static ref API_REQUESTS: CounterVec = register_counter_vec!(
        "eventcore_api_requests_total",
        "Total API requests",
        &["method", "endpoint", "status"]
    ).unwrap();
    
    static ref REQUEST_DURATION: HistogramVec = register_histogram_vec!(
        "eventcore_request_duration_seconds",
        "Request duration in seconds",
        &["method", "endpoint"]
    ).unwrap();
}

pub struct BusinessMetrics;

impl BusinessMetrics {
    pub fn record_user_registration() {
        USER_REGISTRATIONS.inc();
    }
    
    pub fn record_order_placed(value_dollars: f64) {
        ORDER_VALUE.observe(value_dollars);
    }
    
    pub fn record_api_request(method: &str, endpoint: &str, status: u16, duration: std::time::Duration) {
        API_REQUESTS
            .with_label_values(&[method, endpoint, &status.to_string()])
            .inc();
        
        REQUEST_DURATION
            .with_label_values(&[method, endpoint])
            .observe(duration.as_secs_f64());
    }
}
}
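
For instance (a sketch; the handler shape and the endpoint label are illustrative), an HTTP handler can record both a business event and the request-level metrics once the work completes:

use std::time::Instant;

async fn place_order_handler(order_value: f64) {
    let start = Instant::now();

    // ... execute the PlaceOrder command here ...
    let status: u16 = 200;

    // Record business and API metrics for this request
    BusinessMetrics::record_order_placed(order_value);
    BusinessMetrics::record_api_request("POST", "/orders", status, start.elapsed());
}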

Automatic Instrumentation

Instrument EventCore operations automatically:

#![allow(unused)]
fn main() {
use async_trait::async_trait;
use std::sync::Arc;
use std::time::Instant;

pub struct InstrumentedCommandExecutor {
    inner: CommandExecutor,
    metrics: MetricsService,
}

impl InstrumentedCommandExecutor {
    pub fn new(inner: CommandExecutor, metrics: MetricsService) -> Self {
        Self { inner, metrics }
    }
}

#[async_trait]
impl CommandExecutor for InstrumentedCommandExecutor {
    async fn execute<C: Command>(&self, command: &C) -> CommandResult<ExecutionResult> {
        let start = Instant::now();
        let command_type = std::any::type_name::<C>();
        
        let result = self.inner.execute(command).await;
        let duration = start.elapsed();
        let success = result.is_ok();
        
        self.metrics.record_command_executed(command_type, duration, success);
        
        if let Ok(ref execution_result) = result {
            // Guard against commands that affected no streams instead of indexing [0]
            if let Some(stream) = execution_result.affected_streams.first() {
                self.metrics.record_events_written(
                    &stream.to_string(),
                    execution_result.events_written.len(),
                );
            }
        }
        
        result
    }
}

// Instrumented event store
pub struct InstrumentedEventStore {
    inner: Arc<dyn EventStore>,
    metrics: MetricsService,
}

#[async_trait]
impl EventStore for InstrumentedEventStore {
    async fn write_events(&self, events: Vec<EventToWrite>) -> EventStoreResult<WriteResult> {
        let start = Instant::now();
        let result = self.inner.write_events(events).await;
        let duration = start.elapsed();
        
        self.metrics.record_event_store_operation("write", duration);
        result
    }
    
    async fn read_stream(&self, stream_id: &StreamId, options: ReadOptions) -> EventStoreResult<StreamEvents> {
        let start = Instant::now();
        let result = self.inner.read_stream(stream_id, options).await;
        let duration = start.elapsed();
        
        self.metrics.record_event_store_operation("read", duration);
        result
    }
}
}
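
Wiring the instrumented wrappers in at startup is then a small composition step. A sketch, assuming CommandExecutor::new is EventCore's standard constructor:

fn build_instrumented_executor(event_store: Arc<dyn EventStore>) -> InstrumentedCommandExecutor {
    // Assumed: CommandExecutor::new(..) comes from EventCore's setup code
    // shown earlier in this guide.
    let metrics = MetricsService::new();
    InstrumentedCommandExecutor::new(CommandExecutor::new(event_store), metrics)
}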

Structured Logging

Logging Configuration

#![allow(unused)]
fn main() {
use tracing::{info, warn, error, debug, trace, instrument};
use tracing_subscriber::{
    fmt,
    layer::SubscriberExt,
    util::SubscriberInitExt,
    EnvFilter, Layer,
};

pub fn init_logging(log_level: &str, log_format: &str) -> Result<(), Box<dyn std::error::Error>> {
    let env_filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| EnvFilter::new(log_level));
    
    let fmt_layer = match log_format {
        "json" => {
            fmt::layer()
                .json()
                .with_current_span(true)
                .with_span_list(true)
                .with_target(true)
                .with_file(true)
                .with_line_number(true)
                .boxed()
        }
        _ => {
            fmt::layer()
                .with_target(true)
                .with_file(true)
                .with_line_number(true)
                .boxed()
        }
    };
    
    tracing_subscriber::registry()
        .with(env_filter)
        .with(fmt_layer)
        .init();
    
    Ok(())
}

// Structured logging for command execution
#[instrument(skip(command), fields(command_type = %std::any::type_name::<C>()))]
pub async fn execute_command_with_logging<C: Command>(
    command: &C,
    executor: &CommandExecutor,
) -> CommandResult<ExecutionResult> {
    debug!("Starting command execution");
    
    let result = executor.execute(command).await;
    
    match &result {
        Ok(execution_result) => {
            info!(
                events_written = execution_result.events_written.len(),
                affected_streams = execution_result.affected_streams.len(),
                "Command executed successfully"
            );
        }
        Err(error) => {
            error!(
                error = %error,
                "Command execution failed"
            );
        }
    }
    
    result
}

// Event store logging
#[instrument(skip(events), fields(event_count = events.len()))]
pub async fn write_events_with_logging(
    events: Vec<EventToWrite>,
    event_store: &dyn EventStore,
) -> EventStoreResult<WriteResult> {
    debug!("Writing events to store");
    
    let stream_ids: Vec<_> = events.iter()
        .map(|e| e.stream_id.to_string())
        .collect();
    
    let result = event_store.write_events(events).await;
    
    match &result {
        Ok(write_result) => {
            info!(
                events_written = write_result.events_written,
                streams = ?stream_ids,
                "Events written successfully"
            );
        }
        Err(error) => {
            error!(
                error = %error,
                streams = ?stream_ids,
                "Failed to write events"
            );
        }
    }
    
    result
}
}
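
A common pattern for correlating log lines across a request (a sketch; the correlation_id field name is a convention, not an EventCore API) is to open a span that carries the ID so that every log emitted inside it is tagged automatically:

use tracing::{info, info_span, Instrument};
use uuid::Uuid;

async fn handle_request() {
    let correlation_id = Uuid::new_v4();
    let span = info_span!("request", correlation_id = %correlation_id);

    async {
        info!("request received");
        // ... execute commands, write events ...
        info!("request completed");
    }
    .instrument(span)
    .await;
}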

Log Aggregation

Configure log shipping to centralized systems:

# Fluentd configuration for Kubernetes
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: eventcore
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/eventcore-*.log
      pos_file /var/log/fluentd-eventcore.log.pos
      tag eventcore.*
      format json
      time_key time
      time_format %Y-%m-%dT%H:%M:%S.%NZ
    </source>
    
    <filter eventcore.**>
      @type parser
      key_name log
      format json
      reserve_data true
    </filter>
    
    <match eventcore.**>
      @type elasticsearch
      host elasticsearch.logging.svc.cluster.local
      port 9200
      index_name eventcore-logs
      type_name _doc
      include_timestamp true
      logstash_format true
      logstash_prefix eventcore
      
      <buffer>
        @type file
        path /var/log/fluentd-buffers/eventcore
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>

Distributed Tracing

OpenTelemetry Integration

#![allow(unused)]
fn main() {
use opentelemetry::{global, trace::TraceError, KeyValue};
use opentelemetry_otlp::WithExportConfig;
use opentelemetry_sdk::{
    trace::{self, Sampler},
    Resource,
};
use tracing::Instrument;
use tracing_opentelemetry::OpenTelemetrySpanExt;
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

pub fn init_tracing(service_name: &str, otlp_endpoint: &str) -> Result<(), TraceError> {
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint(otlp_endpoint)
        )
        .with_trace_config(
            trace::config()
                .with_sampler(Sampler::TraceIdRatioBased(1.0))
                .with_resource(Resource::new(vec![
                    KeyValue::new("service.name", service_name.to_string()),
                    KeyValue::new("service.version", env!("CARGO_PKG_VERSION")),
                    KeyValue::new("deployment.environment", 
                        std::env::var("ENVIRONMENT").unwrap_or_else(|_| "unknown".to_string())
                    ),
                ]))
        )
        .install_batch(opentelemetry_sdk::runtime::Tokio)?;
    
    let telemetry_layer = tracing_opentelemetry::layer().with_tracer(tracer);
    
    tracing_subscriber::registry()
        .with(telemetry_layer)
        .init();
    
    Ok(())
}

// Traced command execution
#[tracing::instrument(skip(command, executor), fields(command_id = %uuid::Uuid::new_v4()))]
pub async fn execute_command_traced<C: Command>(
    command: &C,
    executor: &CommandExecutor,
) -> CommandResult<ExecutionResult> {
    let span = tracing::Span::current();
    span.record("command.type", std::any::type_name::<C>());
    
    let result = executor.execute(command).await;
    
    match &result {
        Ok(execution_result) => {
            span.record("command.success", true);
            span.record("events.count", execution_result.events_written.len());
            span.record("streams.count", execution_result.affected_streams.len());
        }
        Err(error) => {
            span.record("command.success", false);
            span.record("error.message", format!("{}", error));
            span.record("error.type", std::any::type_name_of_val(error));
        }
    }
    
    result
}

// Cross-service trace propagation
use axum::{
    extract::Request,
    http::{HeaderMap, HeaderName, HeaderValue},
    middleware::Next,
    response::Response,
};

pub async fn trace_propagation_middleware(
    request: Request,
    next: Next,
) -> Response {
    // Extract trace context from headers
    let headers = request.headers();
    let parent_context = global::get_text_map_propagator(|propagator| {
        propagator.extract(&HeaderMapCarrier::new(headers))
    });
    
    // Create new span with parent context
    let span = tracing::info_span!(
        "http_request",
        method = %request.method(),
        uri = %request.uri(),
        version = ?request.version(),
    );
    
    // Set the parent context so this span joins the incoming trace
    span.set_parent(parent_context);
    
    // Execute the request within the span; `instrument` keeps the span
    // entered across await points
    next.run(request).instrument(span).await
}

struct HeaderMapCarrier<'a> {
    headers: &'a HeaderMap,
}

impl<'a> HeaderMapCarrier<'a> {
    fn new(headers: &'a HeaderMap) -> Self {
        Self { headers }
    }
}

impl<'a> opentelemetry::propagation::Extractor for HeaderMapCarrier<'a> {
    fn get(&self, key: &str) -> Option<&str> {
        self.headers.get(key)?.to_str().ok()
    }
    
    fn keys(&self) -> Vec<&str> {
        self.headers.keys().map(|k| k.as_str()).collect()
    }
}
}
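
Tracing initialization is usually gated on configuration. A sketch reusing the AppConfig from the deployment chapter; the OTLP endpoint value is illustrative:

fn setup_observability(config: &AppConfig) -> Result<(), Box<dyn std::error::Error>> {
    if config.features.enable_tracing {
        // Assumed: tracing_endpoint holds an OTLP/gRPC endpoint such as
        // "http://otel-collector:4317".
        if let Some(endpoint) = &config.monitoring.tracing_endpoint {
            init_tracing("eventcore", endpoint)?;
        }
    } else {
        init_logging(&config.monitoring.log_level, "json")?;
    }
    Ok(())
}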

Alerting

Prometheus Alerting Rules

# prometheus-alerts.yaml
groups:
- name: eventcore.rules
  rules:
  # High error rate
  - alert: HighCommandErrorRate
    expr: |
      (
        rate(eventcore_command_errors_total[5m]) /
        rate(eventcore_commands_total[5m])
      ) > 0.05
    for: 2m
    labels:
      severity: warning
      service: eventcore
    annotations:
      summary: "High command error rate detected"
      description: "Command error rate is {{ $value | humanizePercentage }} over the last 5 minutes"
  
  # High latency
  - alert: HighCommandLatency
    expr: |
      histogram_quantile(0.95, rate(eventcore_command_duration_seconds_bucket[5m])) > 1.0
    for: 3m
    labels:
      severity: warning
      service: eventcore
    annotations:
      summary: "High command latency detected"
      description: "95th percentile command latency is {{ $value }}s"
  
  # Event store issues
  - alert: EventStoreDown
    expr: up{job="eventcore"} == 0
    for: 1m
    labels:
      severity: critical
      service: eventcore
    annotations:
      summary: "EventCore service is down"
      description: "EventCore service has been down for more than 1 minute"
  
  # Projection lag
  - alert: ProjectionLag
    expr: eventcore_projection_lag_seconds > 300
    for: 5m
    labels:
      severity: warning
      service: eventcore
    annotations:
      summary: "Projection lag is high"
      description: "Projection {{ $labels.projection_name }} is {{ $value }}s behind"
  
  # Memory usage
  - alert: HighMemoryUsage
    expr: |
      (eventcore_memory_usage_bytes / (1024 * 1024 * 1024)) > 1.0
    for: 5m
    labels:
      severity: warning
      service: eventcore
    annotations:
      summary: "High memory usage"
      description: "Memory usage is {{ $value | humanize }}GB"
  
  # Database connection pool
  - alert: DatabaseConnectionPoolExhausted
    expr: eventcore_connection_pool_size / eventcore_connection_pool_max_size > 0.9
    for: 2m
    labels:
      severity: critical
      service: eventcore
    annotations:
      summary: "Database connection pool nearly exhausted"
      description: "Connection pool utilization is {{ $value | humanizePercentage }}"

Alert Manager Configuration

# alertmanager.yaml
global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alerts@eventcore.com'

route:
  group_by: ['alertname', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
  routes:
  - match:
      severity: critical
    receiver: 'critical-alerts'
  - match:
      severity: warning
    receiver: 'warning-alerts'

receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://slack-webhook/webhook'

- name: 'critical-alerts'
  email_configs:
  - to: 'oncall@eventcore.com'
    subject: 'CRITICAL: {{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
    body: |
      {{ range .Alerts }}
      Alert: {{ .Annotations.summary }}
      Description: {{ .Annotations.description }}
      Labels: {{ range .Labels.SortedPairs }}{{ .Name }}={{ .Value }} {{ end }}
      {{ end }}
  slack_configs:
  - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
    channel: '#critical-alerts'
    title: 'Critical Alert: {{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'

- name: 'warning-alerts'
  slack_configs:
  - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
    channel: '#warnings'
    title: 'Warning: {{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'

Grafana Dashboards

EventCore Operations Dashboard

{
  "dashboard": {
    "title": "EventCore Operations",
    "panels": [
      {
        "title": "Command Execution Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(eventcore_commands_total[5m])",
            "legendFormat": "Commands/sec"
          }
        ]
      },
      {
        "title": "Command Latency",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.50, rate(eventcore_command_duration_seconds_bucket[5m]))",
            "legendFormat": "p50"
          },
          {
            "expr": "histogram_quantile(0.95, rate(eventcore_command_duration_seconds_bucket[5m]))",
            "legendFormat": "p95"
          },
          {
            "expr": "histogram_quantile(0.99, rate(eventcore_command_duration_seconds_bucket[5m]))",
            "legendFormat": "p99"
          }
        ]
      },
      {
        "title": "Error Rate",
        "type": "singlestat",
        "targets": [
          {
            "expr": "rate(eventcore_command_errors_total[5m]) / rate(eventcore_commands_total[5m])",
            "legendFormat": "Error Rate"
          }
        ],
        "thresholds": [
          {
            "value": 0.01,
            "colorMode": "critical"
          }
        ]
      },
      {
        "title": "Active Streams",
        "type": "singlestat",
        "targets": [
          {
            "expr": "eventcore_active_streams",
            "legendFormat": "Streams"
          }
        ]
      },
      {
        "title": "Projection Lag",
        "type": "graph",
        "targets": [
          {
            "expr": "eventcore_projection_lag_seconds",
            "legendFormat": "{{ projection_name }}"
          }
        ]
      },
      {
        "title": "Memory Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "eventcore_memory_usage_bytes / (1024 * 1024 * 1024)",
            "legendFormat": "Memory (GB)"
          }
        ]
      }
    ]
  }
}

Performance Monitoring

Real-Time Performance Metrics

#![allow(unused)]
fn main() {
use std::sync::Arc;
use tokio::sync::RwLock;
use std::collections::HashMap;

#[derive(Debug, Clone)]
pub struct PerformanceSnapshot {
    pub timestamp: chrono::DateTime<chrono::Utc>,
    pub commands_per_second: f64,
    pub events_per_second: f64,
    pub avg_latency_ms: f64,
    pub p95_latency_ms: f64,
    pub p99_latency_ms: f64,
    pub error_rate: f64,
    pub active_streams: i64,
    pub memory_usage_mb: f64,
}

pub struct PerformanceMonitor {
    snapshots: Arc<RwLock<Vec<PerformanceSnapshot>>>,
    max_snapshots: usize,
}

impl PerformanceMonitor {
    pub fn new(max_snapshots: usize) -> Self {
        Self {
            snapshots: Arc::new(RwLock::new(Vec::new())),
            max_snapshots,
        }
    }
    
    pub async fn capture_snapshot(&self) -> PerformanceSnapshot {
        let snapshot = PerformanceSnapshot {
            timestamp: chrono::Utc::now(),
            commands_per_second: self.calculate_command_rate().await,
            events_per_second: self.calculate_event_rate().await,
            avg_latency_ms: self.calculate_avg_latency().await,
            p95_latency_ms: self.calculate_p95_latency().await,
            p99_latency_ms: self.calculate_p99_latency().await,
            error_rate: self.calculate_error_rate().await,
            active_streams: self.get_active_stream_count().await,
            memory_usage_mb: self.get_memory_usage_mb().await,
        };
        
        let mut snapshots = self.snapshots.write().await;
        snapshots.push(snapshot.clone());
        
        // Keep only the most recent snapshots
        if snapshots.len() > self.max_snapshots {
            snapshots.remove(0);
        }
        
        snapshot
    }
    
    pub async fn get_trend_analysis(&self, minutes: u64) -> TrendAnalysis {
        let snapshots = self.snapshots.read().await;
        let cutoff = chrono::Utc::now() - chrono::Duration::minutes(minutes as i64);
        
        let recent_snapshots: Vec<_> = snapshots
            .iter()
            .filter(|s| s.timestamp > cutoff)
            .collect();
        
        if recent_snapshots.is_empty() {
            return TrendAnalysis::default();
        }
        
        TrendAnalysis {
            throughput_trend: self.calculate_trend(&recent_snapshots, |s| s.commands_per_second),
            latency_trend: self.calculate_trend(&recent_snapshots, |s| s.avg_latency_ms),
            error_rate_trend: self.calculate_trend(&recent_snapshots, |s| s.error_rate),
            memory_trend: self.calculate_trend(&recent_snapshots, |s| s.memory_usage_mb),
        }
    }
    
    async fn calculate_command_rate(&self) -> f64 {
        // Get rate from Prometheus metrics
        // Implementation depends on your metrics backend
        0.0
    }
    
    async fn calculate_event_rate(&self) -> f64 {
        // Get rate from Prometheus metrics
        0.0
    }
    
    async fn calculate_avg_latency(&self) -> f64 {
        // Get average latency from metrics
        0.0
    }
    
    async fn calculate_p95_latency(&self) -> f64 {
        // Get p95 latency from metrics
        0.0
    }
    
    async fn calculate_p99_latency(&self) -> f64 {
        // Get p99 latency from metrics
        0.0
    }
    
    async fn calculate_error_rate(&self) -> f64 {
        // Calculate error rate from metrics
        0.0
    }
    
    async fn get_active_stream_count(&self) -> i64 {
        // Get active stream count from metrics
        0
    }
    
    async fn get_memory_usage_mb(&self) -> f64 {
        // Get memory usage from system metrics
        0.0
    }
    
    fn calculate_trend<F>(&self, snapshots: &[&PerformanceSnapshot], extractor: F) -> Trend
    where
        F: Fn(&PerformanceSnapshot) -> f64,
    {
        if snapshots.len() < 2 {
            return Trend::Stable;
        }
        
        let values: Vec<f64> = snapshots.iter().map(|s| extractor(s)).collect();
        let first_half = &values[0..values.len()/2];
        let second_half = &values[values.len()/2..];
        
        let first_avg = first_half.iter().sum::<f64>() / first_half.len() as f64;
        let second_avg = second_half.iter().sum::<f64>() / second_half.len() as f64;
        
        let change_percent = (second_avg - first_avg) / first_avg * 100.0;
        
        match change_percent {
            x if x > 10.0 => Trend::Increasing,
            x if x < -10.0 => Trend::Decreasing,
            _ => Trend::Stable,
        }
    }
}

#[derive(Debug, Clone)]
pub struct TrendAnalysis {
    pub throughput_trend: Trend,
    pub latency_trend: Trend,
    pub error_rate_trend: Trend,
    pub memory_trend: Trend,
}

#[derive(Debug, Clone)]
pub enum Trend {
    Increasing,
    Decreasing,
    Stable,
}

impl Default for TrendAnalysis {
    fn default() -> Self {
        Self {
            throughput_trend: Trend::Stable,
            latency_trend: Trend::Stable,
            error_rate_trend: Trend::Stable,
            memory_trend: Trend::Stable,
        }
    }
}
}
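
A background task can drive the monitor and surface degrading trends. A sketch; the 30-second capture interval and 15-minute trend window are arbitrary choices:

use std::sync::Arc;
use std::time::Duration;

async fn run_performance_monitor(monitor: Arc<PerformanceMonitor>) {
    let mut ticker = tokio::time::interval(Duration::from_secs(30));

    loop {
        ticker.tick().await;
        let snapshot = monitor.capture_snapshot().await;
        let trends = monitor.get_trend_analysis(15).await;

        if matches!(trends.latency_trend, Trend::Increasing) {
            tracing::warn!(
                avg_latency_ms = snapshot.avg_latency_ms,
                "latency trending upwards over the last 15 minutes"
            );
        }
    }
}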

Best Practices

  1. Comprehensive metrics - Monitor all key system components
  2. Structured logging - Use consistent, searchable log formats
  3. Distributed tracing - Track requests across service boundaries
  4. Proactive alerting - Alert on trends, not just thresholds
  5. Performance baselines - Establish and monitor performance baselines
  6. Dashboard organization - Create role-specific dashboards
  7. Alert fatigue - Tune alerts to reduce noise
  8. Runbook automation - Automate common response procedures

Summary

EventCore monitoring and metrics:

  • Prometheus metrics - Comprehensive system monitoring
  • Structured logging - Searchable, contextual logs
  • Distributed tracing - Request flow visibility
  • Intelligent alerting - Proactive issue detection
  • Performance monitoring - Real-time performance tracking

Key components:

  1. Export detailed Prometheus metrics for all operations
  2. Implement structured logging with correlation IDs
  3. Use distributed tracing for multi-service visibility
  4. Configure intelligent alerting with appropriate thresholds
  5. Build comprehensive dashboards for different audiences

Next, let’s explore Backup and Recovery

Chapter 6.3: Backup and Recovery

Data protection is critical for EventCore applications since event stores contain the complete history of your system. This chapter covers comprehensive backup strategies, disaster recovery procedures, and data integrity verification.

Backup Strategies

PostgreSQL Backup Configuration

EventCore’s PostgreSQL event store requires specific backup considerations:

# PostgreSQL backup configuration using CloudNativePG
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: eventcore-postgres
  namespace: eventcore
spec:
  instances: 3
  
  backup:
    target: prefer-standby
    retentionPolicy: "30d"
    
    # Base backup configuration
    data:
      compression: gzip
      encryption: AES256
      jobs: 2
      immediateCheckpoint: true
    
    # WAL archiving
    wal:
      compression: gzip
      encryption: AES256
      maxParallel: 2
    
    # Backup schedule
    barmanObjectStore:
      destinationPath: "s3://eventcore-backups/postgres"
      s3Credentials:
        accessKeyId:
          name: backup-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-credentials
          key: SECRET_ACCESS_KEY
      wal:
        retention: "7d"
      data:
        retention: "30d"
        jobs: 2
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: eventcore-backup-schedule
  namespace: eventcore
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  backupOwnerReference: self
  cluster:
    name: eventcore-postgres
  target: prefer-standby
  method: barmanObjectStore

Event Store Backup Implementation

#![allow(unused)]
fn main() {
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use std::sync::Arc;
use tokio::fs::File;
use tokio::io::{AsyncWriteExt, BufWriter};
use uuid::Uuid;

#[derive(Clone)]
pub struct BackupManager {
    event_store: Arc<dyn EventStore>,
    storage: Arc<dyn BackupStorage>,
    config: BackupConfig,
}

#[derive(Debug, Clone)]
pub struct BackupConfig {
    pub backup_format: BackupFormat,
    pub compression: CompressionType,
    pub encryption_enabled: bool,
    pub chunk_size: usize,
    pub retention_days: u32,
    pub verify_after_backup: bool,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum BackupFormat {
    JsonLines,
    MessagePack,
    Custom,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum CompressionType {
    None,
    Gzip,
    Zstd,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct BackupMetadata {
    pub backup_id: Uuid,
    pub created_at: DateTime<Utc>,
    pub format: BackupFormat,
    pub compression: CompressionType,
    pub total_events: u64,
    pub total_streams: u64,
    pub size_bytes: u64,
    pub checksum: String,
    pub event_range: EventRange,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct EventRange {
    pub earliest_event: DateTime<Utc>,
    pub latest_event: DateTime<Utc>,
    pub earliest_version: EventVersion,
    pub latest_version: EventVersion,
}

impl BackupManager {
    pub async fn create_full_backup(&self) -> Result<BackupMetadata, BackupError> {
        let backup_id = Uuid::new_v4();
        let start_time = Utc::now();
        
        tracing::info!(backup_id = %backup_id, "Starting full backup");
        
        // Create backup metadata
        let mut metadata = BackupMetadata {
            backup_id,
            created_at: start_time,
            format: self.config.backup_format.clone(),
            compression: self.config.compression.clone(),
            total_events: 0,
            total_streams: 0,
            size_bytes: 0,
            checksum: String::new(),
            event_range: EventRange {
                earliest_event: start_time,
                latest_event: start_time,
                earliest_version: EventVersion::initial(),
                latest_version: EventVersion::initial(),
            },
        };
        
        // Get all streams
        let streams = self.event_store.list_all_streams().await?;
        metadata.total_streams = streams.len() as u64;
        
        // Create backup writer
        let backup_path = format!("full-backup-{}.eventcore", backup_id);
        let mut writer = BackupWriter::new(
            &backup_path,
            self.config.compression.clone(),
            self.config.encryption_enabled,
        ).await?;
        
        // Write backup header
        writer.write_header(&metadata).await?;
        
        // Backup each stream
        for stream_id in streams {
            let events = self.backup_stream(&stream_id, &mut writer).await?;
            metadata.total_events += events;
            
            if metadata.total_events % 10000 == 0 {
                tracing::info!(
                    backup_id = %backup_id,
                    events_backed_up = metadata.total_events,
                    "Backup progress"
                );
            }
        }
        
        // Calculate checksums and finalize
        metadata.size_bytes = writer.finalize().await?;
        metadata.checksum = writer.calculate_checksum().await?;
        
        // Store backup metadata
        self.storage.store_backup(&backup_path, &metadata).await?;
        
        // Verify backup if configured
        if self.config.verify_after_backup {
            self.verify_backup(&backup_id).await?;
        }
        
        let duration = Utc::now().signed_duration_since(start_time);
        tracing::info!(
            backup_id = %backup_id,
            duration_seconds = duration.num_seconds(),
            total_events = metadata.total_events,
            size_mb = metadata.size_bytes / (1024 * 1024),
            "Backup completed successfully"
        );
        
        Ok(metadata)
    }
    
    pub async fn create_incremental_backup(
        &self,
        since: DateTime<Utc>,
    ) -> Result<BackupMetadata, BackupError> {
        let backup_id = Uuid::new_v4();
        let start_time = Utc::now();
        
        tracing::info!(
            backup_id = %backup_id,
            since = %since,
            "Starting incremental backup"
        );
        
        // Query events since timestamp
        let events = self.event_store.read_events_since(since).await?;
        
        let mut metadata = BackupMetadata {
            backup_id,
            created_at: start_time,
            format: self.config.backup_format.clone(),
            compression: self.config.compression.clone(),
            total_events: events.len() as u64,
            total_streams: 0, // Will be calculated
            size_bytes: 0,
            checksum: String::new(),
            event_range: self.calculate_event_range(&events),
        };
        
        // Create backup writer
        let backup_path = format!("incremental-backup-{}.eventcore", backup_id);
        let mut writer = BackupWriter::new(
            &backup_path,
            self.config.compression.clone(),
            self.config.encryption_enabled,
        ).await?;
        
        // Write incremental backup
        writer.write_header(&metadata).await?;
        
        let mut unique_streams = std::collections::HashSet::new();
        for event in events {
            writer.write_event(&event).await?;
            unique_streams.insert(event.stream_id.clone());
        }
        
        metadata.total_streams = unique_streams.len() as u64;
        metadata.size_bytes = writer.finalize().await?;
        metadata.checksum = writer.calculate_checksum().await?;
        
        self.storage.store_backup(&backup_path, &metadata).await?;
        
        tracing::info!(
            backup_id = %backup_id,
            total_events = metadata.total_events,
            total_streams = metadata.total_streams,
            "Incremental backup completed"
        );
        
        Ok(metadata)
    }
    
    async fn backup_stream(
        &self,
        stream_id: &StreamId,
        writer: &mut BackupWriter,
    ) -> Result<u64, BackupError> {
        let mut event_count = 0;
        let mut from_version = EventVersion::initial();
        let batch_size = self.config.chunk_size;
        
        loop {
            let options = ReadOptions::default()
                .from_version(from_version)
                .limit(batch_size);
            
            let stream_events = self.event_store.read_stream(stream_id, options).await?;
            
            if stream_events.events.is_empty() {
                break;
            }
            
            for event in &stream_events.events {
                writer.write_event(event).await?;
                event_count += 1;
            }
            
            from_version = EventVersion::from(
                stream_events.events.last().unwrap().version.as_u64() + 1
            );
        }
        
        Ok(event_count)
    }
    
    fn calculate_event_range(&self, events: &[StoredEvent]) -> EventRange {
        if events.is_empty() {
            let now = Utc::now();
            return EventRange {
                earliest_event: now,
                latest_event: now,
                earliest_version: EventVersion::initial(),
                latest_version: EventVersion::initial(),
            };
        }
        
        let earliest = events.iter().min_by_key(|e| e.occurred_at).unwrap();
        let latest = events.iter().max_by_key(|e| e.occurred_at).unwrap();
        
        EventRange {
            earliest_event: earliest.occurred_at,
            latest_event: latest.occurred_at,
            earliest_version: earliest.version,
            latest_version: latest.version,
        }
    }
}

struct BackupWriter {
    file: BufWriter<File>,
    path: String,
    compression: CompressionType,
    encrypted: bool,
    bytes_written: u64,
}

impl BackupWriter {
    async fn new(
        path: &str,
        compression: CompressionType,
        encrypted: bool,
    ) -> Result<Self, BackupError> {
        let file = File::create(path).await?;
        let file = BufWriter::new(file);
        
        Ok(Self {
            file,
            path: path.to_string(),
            compression,
            encrypted,
            bytes_written: 0,
        })
    }
    
    async fn write_header(&mut self, metadata: &BackupMetadata) -> Result<(), BackupError> {
        let header = serde_json::to_string(metadata)?;
        let header_line = format!("EVENTCORE_BACKUP_HEADER:{}\n", header);
        
        self.file.write_all(header_line.as_bytes()).await?;
        self.bytes_written += header_line.len() as u64;
        
        Ok(())
    }
    
    async fn write_event(&mut self, event: &StoredEvent) -> Result<(), BackupError> {
        let event_line = match self.compression {
            CompressionType::None => {
                let json = serde_json::to_string(event)?;
                format!("{}\n", json)
            }
            CompressionType::Gzip => {
                // Implement gzip compression
                let json = serde_json::to_string(event)?;
                format!("{}\n", json) // Simplified for example
            }
            CompressionType::Zstd => {
                // Implement zstd compression
                let json = serde_json::to_string(event)?;
                format!("{}\n", json) // Simplified for example
            }
        };
        
        self.file.write_all(event_line.as_bytes()).await?;
        self.bytes_written += event_line.len() as u64;
        
        Ok(())
    }
    
    async fn finalize(&mut self) -> Result<u64, BackupError> {
        self.file.flush().await?;
        Ok(self.bytes_written)
    }
    
    async fn calculate_checksum(&self) -> Result<String, BackupError> {
        // Calculate SHA-256 checksum of the backup file
        use sha2::{Sha256, Digest};
        use tokio::fs::File;
        use tokio::io::AsyncReadExt;
        
        let mut file = File::open(&self.path).await?;
        let mut hasher = Sha256::new();
        let mut buffer = [0; 8192];
        
        loop {
            let bytes_read = file.read(&mut buffer).await?;
            if bytes_read == 0 {
                break;
            }
            hasher.update(&buffer[..bytes_read]);
        }
        
        Ok(format!("{:x}", hasher.finalize()))
    }
}
}
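
As a usage sketch (the schedule is illustrative, and BackupError is assumed to implement Debug), a simple scheduler might take one full backup per day and hourly incrementals covering everything since the last full backup:

use chrono::Utc;
use std::time::Duration;

async fn run_backup_schedule(manager: BackupManager) {
    let mut last_full = Utc::now();

    // Take an initial full backup, then hourly incrementals and a daily full
    if let Err(e) = manager.create_full_backup().await {
        tracing::error!(error = ?e, "full backup failed");
    }

    loop {
        tokio::time::sleep(Duration::from_secs(60 * 60)).await;

        if Utc::now().signed_duration_since(last_full).num_hours() >= 24 {
            if let Err(e) = manager.create_full_backup().await {
                tracing::error!(error = ?e, "full backup failed");
            } else {
                last_full = Utc::now();
            }
        } else if let Err(e) = manager.create_incremental_backup(last_full).await {
            tracing::error!(error = ?e, "incremental backup failed");
        }
    }
}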

Point-in-Time Recovery

#![allow(unused)]
fn main() {
use chrono::{DateTime, Utc};
use std::sync::Arc;
use uuid::Uuid;

#[derive(Clone)]
pub struct PointInTimeRecovery {
    backup_manager: BackupManager,
    event_store: Arc<dyn EventStore>,
}

impl PointInTimeRecovery {
    pub async fn restore_to_point_in_time(
        &self,
        target_time: DateTime<Utc>,
    ) -> Result<RecoveryResult, RecoveryError> {
        tracing::info!(target_time = %target_time, "Starting point-in-time recovery");
        
        // Find the best backup to start from
        let base_backup = self.find_best_base_backup(target_time).await?;
        
        // Restore from base backup
        self.restore_from_backup(&base_backup.backup_id).await?;
        
        // Apply incremental backups up to the target time
        let incremental_backups = self.find_incremental_backups_until(
            base_backup.created_at,
            target_time,
        ).await?;
        
        for backup in incremental_backups {
            self.apply_incremental_backup(&backup.backup_id, Some(target_time)).await?;
        }
        
        // Apply WAL entries up to the exact target time
        self.apply_wal_entries_until(target_time).await?;
        
        // Verify recovery
        let recovery_result = self.verify_recovery(target_time).await?;
        
        tracing::info!(
            target_time = %target_time,
            events_restored = recovery_result.events_restored,
            streams_restored = recovery_result.streams_restored,
            "Point-in-time recovery completed"
        );
        
        Ok(recovery_result)
    }
    
    async fn find_best_base_backup(
        &self,
        target_time: DateTime<Utc>,
    ) -> Result<BackupMetadata, RecoveryError> {
        let backups = self.backup_manager.list_backups().await?;
        
        // Find the latest full backup before the target time
        let base_backup = backups
            .iter()
            .filter(|b| b.created_at <= target_time)
            .filter(|b| matches!(b.format, BackupFormat::JsonLines)) // Full backup indicator
            .max_by_key(|b| b.created_at)
            .ok_or(RecoveryError::NoSuitableBackup)?;
        
        Ok(base_backup.clone())
    }
    
    async fn restore_from_backup(&self, backup_id: &Uuid) -> Result<(), RecoveryError> {
        tracing::info!(backup_id = %backup_id, "Restoring from base backup");
        
        // Clear the event store
        self.event_store.clear_all().await?;
        
        // Read backup file
        let backup_reader = BackupReader::new(backup_id).await?;
        let metadata = backup_reader.read_metadata().await?;
        
        tracing::info!(
            backup_id = %backup_id,
            total_events = metadata.total_events,
            "Reading backup events"
        );
        
        // Restore events in batches
        let batch_size = 1000;
        let mut events_restored = 0;
        
        while let Some(batch) = backup_reader.read_events_batch(batch_size).await? {
            let batch_len = batch.len() as u64; // count before the batch is moved into the store
            self.event_store.write_events(batch).await?;
            events_restored += batch_len;
            
            if events_restored % 10000 == 0 {
                tracing::info!(
                    events_restored = events_restored,
                    "Restore progress"
                );
            }
        }
        
        Ok(())
    }
    
    async fn apply_wal_entries_until(
        &self,
        target_time: DateTime<Utc>,
    ) -> Result<(), RecoveryError> {
        // Apply WAL (Write-Ahead Log) entries from PostgreSQL
        // This provides exact point-in-time recovery
        
        let wal_entries = self.read_wal_entries_until(target_time).await?;
        
        for entry in wal_entries {
            if entry.timestamp <= target_time {
                self.apply_wal_entry(entry).await?;
            }
        }
        
        Ok(())
    }
}

#[derive(Debug, Clone)]
pub struct RecoveryResult {
    pub events_restored: u64,
    pub streams_restored: u64,
    pub recovery_time: DateTime<Utc>,
    pub data_integrity_verified: bool,
}
}
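
Invoking a recovery from an operational script is then a single call. A minimal sketch, assuming a PointInTimeRecovery instance constructed with a backup manager and event store as above; the timestamp literal is illustrative.

#![allow(unused)]
fn main() {
use chrono::{DateTime, Utc};

pub async fn recover_before_incident(
    recovery: &PointInTimeRecovery,
) -> Result<(), RecoveryError> {
    // Restore state as of just before a known incident time (illustrative value)
    let target_time: DateTime<Utc> = "2024-06-01T11:59:00Z"
        .parse()
        .expect("valid RFC 3339 timestamp");

    let result = recovery.restore_to_point_in_time(target_time).await?;

    tracing::info!(
        events_restored = result.events_restored,
        streams_restored = result.streams_restored,
        integrity_verified = result.data_integrity_verified,
        "Point-in-time recovery finished"
    );
    Ok(())
}
}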

Disaster Recovery

Multi-Region Backup Strategy

# Multi-region backup configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: backup-config
  namespace: eventcore
data:
  backup-policy.yaml: |
    # Primary backup configuration
    primary:
      region: us-east-1
      storage: s3://eventcore-backups-primary
      schedule: "0 */6 * * *"  # Every 6 hours
      retention: "30d"
      
    # Cross-region replication
    replicas:
      - region: us-west-2
        storage: s3://eventcore-backups-west
        sync_schedule: "0 1 * * *"  # Daily sync
        retention: "90d"
        
      - region: eu-west-1
        storage: s3://eventcore-backups-eu
        sync_schedule: "0 2 * * *"  # Daily sync
        retention: "90d"
    
    # Archive configuration
    archive:
      storage: glacier://eventcore-archive
      after_days: 90
      retention: "7y"
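
On the application side, the policy above can be read into typed configuration. A minimal sketch, assuming serde and serde_yaml are available; the struct and field names simply mirror the YAML keys and are not part of EventCore's public API.

#![allow(unused)]
fn main() {
use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct BackupPolicy {
    pub primary: PrimaryBackup,
    pub replicas: Vec<ReplicaBackup>,
    pub archive: ArchivePolicy,
}

#[derive(Debug, Deserialize)]
pub struct PrimaryBackup {
    pub region: String,
    pub storage: String,
    pub schedule: String,  // cron expression
    pub retention: String, // e.g. "30d"
}

#[derive(Debug, Deserialize)]
pub struct ReplicaBackup {
    pub region: String,
    pub storage: String,
    pub sync_schedule: String,
    pub retention: String,
}

#[derive(Debug, Deserialize)]
pub struct ArchivePolicy {
    pub storage: String,
    pub after_days: u32,
    pub retention: String,
}

pub fn load_backup_policy(yaml: &str) -> Result<BackupPolicy, serde_yaml::Error> {
    serde_yaml::from_str(yaml)
}
}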

Automated Disaster Recovery

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct DisasterRecoveryOrchestrator {
    primary_region: String,
    failover_regions: Vec<String>,
    backup_manager: BackupManager,
    health_checker: HealthChecker,
}

impl DisasterRecoveryOrchestrator {
    pub async fn execute_disaster_recovery(
        &self,
        trigger: DisasterTrigger,
    ) -> Result<RecoveryOutcome, DisasterRecoveryError> {
        tracing::error!(
            trigger = ?trigger,
            "Disaster recovery triggered"
        );
        
        // Assess the situation
        let assessment = self.assess_disaster_scope().await?;
        
        // Choose recovery strategy
        let strategy = self.choose_recovery_strategy(&assessment).await?;
        
        // Execute recovery
        match strategy {
            RecoveryStrategy::LocalRestore => {
                self.execute_local_restore().await
            }
            RecoveryStrategy::RegionalFailover { target_region } => {
                self.execute_regional_failover(&target_region).await
            }
            RecoveryStrategy::FullRebuild => {
                self.execute_full_rebuild().await
            }
        }
    }
    
    async fn assess_disaster_scope(&self) -> Result<DisasterAssessment, DisasterRecoveryError> {
        let mut assessment = DisasterAssessment::default();
        
        // Check primary database
        assessment.primary_db_accessible = self.health_checker
            .check_database_connectivity(&self.primary_region)
            .await
            .is_ok();
        
        // Check backup availability
        assessment.backup_accessible = self.backup_manager
            .verify_backup_accessibility()
            .await
            .is_ok();
        
        // Check replica regions
        for region in &self.failover_regions {
            let accessible = self.health_checker
                .check_database_connectivity(region)
                .await
                .is_ok();
            assessment.replica_regions.insert(region.clone(), accessible);
        }
        
        // Estimate data loss
        assessment.estimated_data_loss = self.calculate_potential_data_loss().await?;
        
        Ok(assessment)
    }
    
    async fn execute_regional_failover(
        &self,
        target_region: &str,
    ) -> Result<RecoveryOutcome, DisasterRecoveryError> {
        tracing::info!(
            target_region = target_region,
            "Executing regional failover"
        );
        
        // 1. Promote replica in target region
        self.promote_replica(target_region).await?;
        
        // 2. Update DNS to point to new region
        self.update_dns_routing(target_region).await?;
        
        // 3. Scale up resources in target region
        self.scale_up_target_region(target_region).await?;
        
        // 4. Verify system health
        let health_check = self.verify_system_health(target_region).await?;
        
        // 5. Notify stakeholders
        self.notify_failover_completion(target_region, &health_check).await?;
        
        Ok(RecoveryOutcome {
            strategy_used: RecoveryStrategy::RegionalFailover {
                target_region: target_region.to_string(),
            },
            recovery_time: Utc::now(),
            data_loss_minutes: 0, // Assuming near-real-time replication
            systems_recovered: health_check.systems_operational,
        })
    }
}

#[derive(Debug)]
pub struct DisasterAssessment {
    pub primary_db_accessible: bool,
    pub backup_accessible: bool,
    pub replica_regions: HashMap<String, bool>,
    pub estimated_data_loss: Duration,
}

#[derive(Debug, Clone)]
pub enum RecoveryStrategy {
    LocalRestore,
    RegionalFailover { target_region: String },
    FullRebuild,
}

#[derive(Debug)]
pub enum DisasterTrigger {
    DatabaseFailure,
    RegionOutage,
    DataCorruption,
    SecurityBreach,
    ManualTrigger,
}
}
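
The orchestrator above delegates strategy selection to choose_recovery_strategy, which is not shown. A minimal sketch of that decision, assuming the simple policy of preferring a local restore, then failover to the first healthy replica, then a full rebuild; real deployments would also weigh estimated data loss and recovery time objectives.

#![allow(unused)]
fn main() {
impl DisasterRecoveryOrchestrator {
    async fn choose_recovery_strategy(
        &self,
        assessment: &DisasterAssessment,
    ) -> Result<RecoveryStrategy, DisasterRecoveryError> {
        // Primary database is reachable: restore in place from local backups
        if assessment.primary_db_accessible && assessment.backup_accessible {
            return Ok(RecoveryStrategy::LocalRestore);
        }

        // Otherwise fail over to the first healthy replica region
        if let Some(region) = assessment
            .replica_regions
            .iter()
            .find(|(_, accessible)| **accessible)
            .map(|(region, _)| region.clone())
        {
            return Ok(RecoveryStrategy::RegionalFailover {
                target_region: region,
            });
        }

        // No healthy region left: rebuild from archived backups
        Ok(RecoveryStrategy::FullRebuild)
    }
}
}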

Data Integrity Verification

Backup Verification

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct BackupVerifier {
    event_store: Arc<dyn EventStore>,
    backup_storage: Arc<dyn BackupStorage>,
}

impl BackupVerifier {
    pub async fn verify_backup_integrity(
        &self,
        backup_id: &Uuid,
    ) -> Result<VerificationResult, VerificationError> {
        tracing::info!(backup_id = %backup_id, "Starting backup verification");
        
        let mut result = VerificationResult::default();
        
        // Verify checksum
        result.checksum_valid = self.verify_checksum(backup_id).await?;
        
        // Verify metadata consistency
        result.metadata_consistent = self.verify_metadata(backup_id).await?;
        
        // Verify event integrity
        result.events_valid = self.verify_events(backup_id).await?;
        
        // Verify completeness (if verifying against live system)
        if let Ok(completeness) = self.verify_completeness(backup_id).await {
            result.completeness_verified = true;
            result.missing_events = completeness.missing_events;
        }
        
        result.verification_time = Utc::now();
        result.overall_valid = result.checksum_valid &&
            result.metadata_consistent &&
            result.events_valid &&
            result.missing_events == 0;
        
        if result.overall_valid {
            tracing::info!(backup_id = %backup_id, "Backup verification passed");
        } else {
            tracing::error!(
                backup_id = %backup_id,
                result = ?result,
                "Backup verification failed"
            );
        }
        
        Ok(result)
    }
    
    async fn verify_checksum(&self, backup_id: &Uuid) -> Result<bool, VerificationError> {
        let backup_metadata = self.backup_storage.get_metadata(backup_id).await?;
        let calculated_checksum = self.calculate_backup_checksum(backup_id).await?;
        
        Ok(backup_metadata.checksum == calculated_checksum)
    }
    
    async fn verify_events(&self, backup_id: &Uuid) -> Result<bool, VerificationError> {
        let backup_reader = BackupReader::new(backup_id).await?;
        let mut events_valid = true;
        let mut event_count = 0;
        
        while let Some(event) = backup_reader.read_next_event().await? {
            // Verify event structure
            if !self.is_event_structurally_valid(&event) {
                tracing::error!(
                    backup_id = %backup_id,
                    event_id = %event.id,
                    "Invalid event structure found"
                );
                events_valid = false;
                break;
            }
            
            // Verify event ordering (within stream)
            if !self.is_event_ordering_valid(&event) {
                tracing::error!(
                    backup_id = %backup_id,
                    event_id = %event.id,
                    "Invalid event ordering found"
                );
                events_valid = false;
                break;
            }
            
            event_count += 1;
            
            if event_count % 10000 == 0 {
                tracing::info!(
                    backup_id = %backup_id,
                    events_verified = event_count,
                    "Verification progress"
                );
            }
        }
        
        Ok(events_valid)
    }
    
    fn is_event_structurally_valid(&self, event: &StoredEvent) -> bool {
        // Verify required fields
        if event.id.is_nil() || event.stream_id.as_ref().is_empty() {
            return false;
        }
        
        // Verify the version is valid (stream versions start at 1)
        if event.version.as_u64() == 0 {
            return false;
        }
        
        // Verify timestamp is reasonable
        let now = Utc::now();
        if event.occurred_at > now || event.occurred_at < (now - chrono::Duration::days(3650)) {
            return false;
        }
        
        true
    }
    
    fn is_event_ordering_valid(&self, event: &StoredEvent) -> bool {
        // This would need to track ordering within streams
        // Simplified implementation for example
        true
    }
}

#[derive(Debug, Default)]
pub struct VerificationResult {
    pub checksum_valid: bool,
    pub metadata_consistent: bool,
    pub events_valid: bool,
    pub completeness_verified: bool,
    pub missing_events: u64,
    pub verification_time: DateTime<Utc>,
    pub overall_valid: bool,
}
}
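
Verification is typically run immediately after each backup completes, so an unverifiable backup is caught before it is needed. A minimal sketch using the verifier above:

#![allow(unused)]
fn main() {
use uuid::Uuid;

pub async fn verify_after_backup(
    verifier: &BackupVerifier,
    backup_id: &Uuid,
) -> Result<(), VerificationError> {
    let result = verifier.verify_backup_integrity(backup_id).await?;

    if !result.overall_valid {
        // Treat an unverifiable backup as if it never happened: alert and re-run the backup
        tracing::error!(backup_id = %backup_id, "Backup failed verification; scheduling retry");
    }
    Ok(())
}
}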

Continuous Integrity Monitoring

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct IntegrityMonitor {
    event_store: Arc<dyn EventStore>,
    monitoring_config: IntegrityMonitoringConfig,
}

#[derive(Debug, Clone)]
pub struct IntegrityMonitoringConfig {
    pub check_interval: Duration,
    pub sample_percentage: f64,
    pub alert_on_corruption: bool,
    pub auto_repair: bool,
}

impl IntegrityMonitor {
    pub async fn start_monitoring(&self) -> Result<(), MonitoringError> {
        tracing::info!("Starting continuous integrity monitoring");
        
        let mut interval = tokio::time::interval(self.monitoring_config.check_interval);
        
        loop {
            interval.tick().await;
            
            match self.perform_integrity_check().await {
                Ok(report) => {
                    if !report.integrity_ok {
                        tracing::error!(
                            corruption_count = report.corrupted_events,
                            "Data integrity issues detected"
                        );
                        
                        if self.monitoring_config.alert_on_corruption {
                            self.send_corruption_alert(&report).await;
                        }
                        
                        if self.monitoring_config.auto_repair {
                            self.attempt_auto_repair(&report).await;
                        }
                    } else {
                        tracing::debug!("Integrity check passed");
                    }
                }
                Err(e) => {
                    tracing::error!(error = %e, "Integrity check failed");
                }
            }
        }
    }
    
    async fn perform_integrity_check(&self) -> Result<IntegrityReport, MonitoringError> {
        let start_time = Utc::now();
        let mut report = IntegrityReport::default();
        
        // Sample events for checking
        let sample_events = self.sample_events().await?;
        report.events_checked = sample_events.len() as u64;
        
        for event in sample_events {
            // Check event integrity
            let integrity_check = self.check_event_integrity(&event).await?;
            
            if !integrity_check.valid {
                report.corrupted_events += 1;
                report.corruption_details.push(integrity_check);
            }
        }
        
        report.check_time = Utc::now();
        report.check_duration = report.check_time.signed_duration_since(start_time);
        report.integrity_ok = report.corrupted_events == 0;
        
        Ok(report)
    }
    
    async fn sample_events(&self) -> Result<Vec<StoredEvent>, MonitoringError> {
        // Sample a percentage of events for integrity checking
        let sample_size = ((self.get_total_event_count().await? as f64) 
            * self.monitoring_config.sample_percentage / 100.0) as usize;
        
        // Use reservoir sampling or similar technique
        self.event_store.sample_events(sample_size).await
            .map_err(MonitoringError::EventStoreError)
    }
    
    async fn check_event_integrity(&self, event: &StoredEvent) -> Result<EventIntegrityCheck, MonitoringError> {
        let mut check = EventIntegrityCheck {
            event_id: event.id,
            stream_id: event.stream_id.clone(),
            valid: true,
            issues: Vec::new(),
        };
        
        // Check the payload is present; in practice, deserialize into the concrete
        // domain event type here (a Value-to-Value round trip never fails)
        if event.payload.is_null() {
            check.valid = false;
            check.issues.push("Empty event payload".to_string());
        }
        
        // Check metadata is valid
        if event.metadata.is_empty() {
            check.issues.push("Missing metadata".to_string());
        }
        
        // Check event ordering within stream
        if let Err(_) = self.verify_event_ordering(event).await {
            check.valid = false;
            check.issues.push("Event ordering violation".to_string());
        }
        
        Ok(check)
    }
}

#[derive(Debug, Default)]
pub struct IntegrityReport {
    pub check_time: DateTime<Utc>,
    pub check_duration: chrono::Duration,
    pub events_checked: u64,
    pub corrupted_events: u64,
    pub integrity_ok: bool,
    pub corruption_details: Vec<EventIntegrityCheck>,
}

#[derive(Debug)]
pub struct EventIntegrityCheck {
    pub event_id: EventId,
    pub stream_id: StreamId,
    pub valid: bool,
    pub issues: Vec<String>,
}
}
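
Because start_monitoring loops forever, it is normally run as a background task. A minimal sketch, assuming a tokio runtime; the configuration values are illustrative.

#![allow(unused)]
fn main() {
use std::time::Duration;

pub fn spawn_integrity_monitor(monitor: IntegrityMonitor) -> tokio::task::JoinHandle<()> {
    tokio::spawn(async move {
        if let Err(e) = monitor.start_monitoring().await {
            tracing::error!(error = %e, "Integrity monitoring stopped unexpectedly");
        }
    })
}

pub fn example_config() -> IntegrityMonitoringConfig {
    IntegrityMonitoringConfig {
        check_interval: Duration::from_secs(15 * 60), // every 15 minutes (illustrative)
        sample_percentage: 1.0,                       // check 1% of events per run
        alert_on_corruption: true,
        auto_repair: false, // prefer manual review before repairing
    }
}
}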

Backup Testing and Validation

Automated Backup Testing

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct BackupTestSuite {
    backup_manager: BackupManager,
    test_event_store: Arc<dyn EventStore>,
    test_config: BackupTestConfig,
}

#[derive(Debug, Clone)]
pub struct BackupTestConfig {
    pub test_frequency: Duration,
    pub full_restore_test_frequency: Duration,
    pub sample_restore_percentage: f64,
    pub cleanup_test_data: bool,
}

impl BackupTestSuite {
    pub async fn run_comprehensive_backup_tests(&self) -> Result<TestResults, TestError> {
        tracing::info!("Starting comprehensive backup tests");
        
        let mut results = TestResults::default();
        
        // Test 1: Backup creation
        results.backup_creation = self.test_backup_creation().await?;
        
        // Test 2: Backup verification
        results.backup_verification = self.test_backup_verification().await?;
        
        // Test 3: Partial restore
        results.partial_restore = self.test_partial_restore().await?;
        
        // Test 4: Full restore (if scheduled)
        if self.should_run_full_restore_test().await? {
            results.full_restore = Some(self.test_full_restore().await?);
        }
        
        // Test 5: Point-in-time recovery
        results.point_in_time_recovery = self.test_point_in_time_recovery().await?;
        
        // Test 6: Cross-region restore
        results.cross_region_restore = self.test_cross_region_restore().await?;
        
        results.overall_success = results.all_tests_passed();
        results.test_time = Utc::now();
        
        if results.overall_success {
            tracing::info!("All backup tests passed");
        } else {
            tracing::error!(results = ?results, "Some backup tests failed");
        }
        
        Ok(results)
    }
    
    async fn test_backup_creation(&self) -> Result<TestResult, TestError> {
        let start_time = Utc::now();
        
        // Create test data
        let test_events = self.create_test_events(1000).await?;
        self.write_test_events(&test_events).await?;
        
        // Create backup
        let backup_result = self.backup_manager.create_full_backup().await;
        
        let duration = Utc::now().signed_duration_since(start_time);
        
        match backup_result {
            Ok(metadata) => {
                Ok(TestResult {
                    test_name: "backup_creation".to_string(),
                    success: true,
                    duration,
                    details: format!("Backup created: {}", metadata.backup_id),
                    error: None,
                })
            }
            Err(e) => {
                Ok(TestResult {
                    test_name: "backup_creation".to_string(),
                    success: false,
                    duration,
                    details: "Backup creation failed".to_string(),
                    error: Some(e.to_string()),
                })
            }
        }
    }
    
    async fn test_full_restore(&self) -> Result<TestResult, TestError> {
        let start_time = Utc::now();
        
        // Get latest backup
        let latest_backup = self.backup_manager.get_latest_backup().await?;
        
        // Create clean test environment
        let test_store = self.create_clean_test_store().await?;
        
        // Perform restore
        let restore_result = self.restore_backup_to_store(
            &latest_backup.backup_id,
            &test_store,
        ).await;
        
        let duration = Utc::now().signed_duration_since(start_time);
        
        match restore_result {
            Ok(_) => {
                // Verify restore completeness
                let verification = self.verify_restore_completeness(&test_store).await?;
                
                Ok(TestResult {
                    test_name: "full_restore".to_string(),
                    success: verification.complete,
                    duration,
                    details: format!(
                        "Events restored: {}, Streams restored: {}",
                        verification.events_count,
                        verification.streams_count
                    ),
                    error: None,
                })
            }
            Err(e) => {
                Ok(TestResult {
                    test_name: "full_restore".to_string(),
                    success: false,
                    duration,
                    details: "Full restore failed".to_string(),
                    error: Some(e.to_string()),
                })
            }
        }
    }
}

#[derive(Debug, Default)]
pub struct TestResults {
    pub backup_creation: TestResult,
    pub backup_verification: TestResult,
    pub partial_restore: TestResult,
    pub full_restore: Option<TestResult>,
    pub point_in_time_recovery: TestResult,
    pub cross_region_restore: TestResult,
    pub overall_success: bool,
    pub test_time: DateTime<Utc>,
}

impl TestResults {
    fn all_tests_passed(&self) -> bool {
        self.backup_creation.success &&
        self.backup_verification.success &&
        self.partial_restore.success &&
        self.full_restore.as_ref().map_or(true, |t| t.success) &&
        self.point_in_time_recovery.success &&
        self.cross_region_restore.success
    }
}

#[derive(Debug, Default)]
pub struct TestResult {
    pub test_name: String,
    pub success: bool,
    pub duration: chrono::Duration,
    pub details: String,
    pub error: Option<String>,
}
}
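
These tests are meant to run on a schedule rather than ad hoc. A minimal scheduling loop driven by test_frequency, assuming a tokio runtime and that test_frequency is a std::time::Duration:

#![allow(unused)]
fn main() {
pub async fn run_scheduled_backup_tests(suite: BackupTestSuite) {
    let mut ticker = tokio::time::interval(suite.test_config.test_frequency);

    loop {
        ticker.tick().await;

        match suite.run_comprehensive_backup_tests().await {
            Ok(results) if results.overall_success => {
                tracing::info!("Scheduled backup tests passed");
            }
            Ok(results) => {
                tracing::error!(results = ?results, "Scheduled backup tests failed");
                // Hand off to your alerting integration here
            }
            Err(e) => tracing::error!(error = %e, "Backup test run aborted"),
        }
    }
}
}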

Best Practices

  1. Regular backups - Automated, frequent backup schedules
  2. Multiple strategies - Full, incremental, and WAL-based backups
  3. Geographic distribution - Multi-region backup storage
  4. Regular testing - Automated backup and restore testing
  5. Integrity verification - Continuous data integrity monitoring
  6. Recovery planning - Documented disaster recovery procedures
  7. Retention policies - Appropriate data retention and archival
  8. Security - Encrypted backups and secure storage

Summary

EventCore backup and recovery:

  • Comprehensive backups - Full, incremental, and point-in-time
  • Disaster recovery - Multi-region failover capabilities
  • Data integrity - Continuous verification and monitoring
  • Automated testing - Regular backup and restore validation
  • Recovery orchestration - Automated disaster recovery procedures

Key components:

  1. Implement automated backup strategies with multiple approaches
  2. Design disaster recovery procedures for various failure scenarios
  3. Continuously monitor data integrity with automated verification
  4. Test backup and recovery procedures regularly
  5. Maintain geographic distribution of backups for resilience

Next, let’s explore Troubleshooting

Chapter 6.4: Troubleshooting

This chapter provides comprehensive troubleshooting guidance for EventCore applications in production. From common issues to advanced debugging techniques, you’ll learn to diagnose and resolve problems quickly.

Common Issues and Solutions

Command Execution Failures

Issue: Commands timing out

Symptoms:

  • Commands taking longer than expected
  • Timeout errors in logs
  • Degraded system performance

Debugging steps:

#![allow(unused)]
fn main() {
// Enable detailed command tracing
#[tracing::instrument(skip(command, executor), level = "debug")]
async fn debug_command_execution<C: Command>(
    command: &C,
    executor: &CommandExecutor,
) -> CommandResult<ExecutionResult> {
    let start = std::time::Instant::now();
    
    tracing::debug!(
        command_type = std::any::type_name::<C>(),
        "Starting command execution"
    );
    
    // Check stream access patterns
    let read_streams = command.read_streams();
    tracing::debug!(
        stream_count = read_streams.len(),
        streams = ?read_streams,
        "Command will read from streams"
    );
    
    // Time the full execution
    let result = executor.execute(command).await;
    let total_duration = start.elapsed();
    
    match &result {
        Ok(execution_result) => {
            tracing::info!(
                total_duration_ms = total_duration.as_millis(),
                events_written = execution_result.events_written.len(),
                "Command completed successfully"
            );
        }
        Err(error) => {
            tracing::error!(
                total_duration_ms = total_duration.as_millis(),
                error = %error,
                "Command failed"
            );
        }
    }
    
    result
}
}

Common causes and solutions:

  1. Database connection pool exhaustion

    #![allow(unused)]
    fn main() {
    // Check connection pool metrics
    async fn diagnose_connection_pool(pool: &sqlx::PgPool) {
        let pool_options = pool.options();
        let pool_size = pool.size();
        let idle_connections = pool.num_idle();
        
        tracing::info!(
            max_connections = pool_options.get_max_connections(),
            current_size = pool_size,
            idle_connections = idle_connections,
            active_connections = pool_size as usize - idle_connections,
            "Connection pool status"
        );
        
        // Alert if pool utilization is high
        let utilization = (pool_size as f64) / (pool_options.get_max_connections() as f64);
        if utilization > 0.8 {
            tracing::warn!(
                utilization_percent = utilization * 100.0,
                "High connection pool utilization"
            );
        }
    }
    }
  2. Long-running database queries

    -- PostgreSQL: Check for long-running queries
    SELECT 
        pid,
        now() - pg_stat_activity.query_start AS duration,
        query,
        state
    FROM pg_stat_activity
    WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes'
    AND state = 'active';
    
  3. Lock contention on streams

    #![allow(unused)]
    fn main() {
    // Implement lock timeout and retry
    async fn execute_with_lock_retry<C: Command>(
        command: &C,
        executor: &CommandExecutor,
        max_retries: u32,
    ) -> CommandResult<ExecutionResult> {
        let mut retry_count = 0;
        
        loop {
            match executor.execute(command).await {
                Ok(result) => return Ok(result),
                Err(CommandError::ConcurrencyConflict(streams)) => {
                    retry_count += 1;
                    if retry_count >= max_retries {
                        return Err(CommandError::ConcurrencyConflict(streams));
                    }
                    
                    // Exponential backoff
                    let delay = Duration::from_millis(100 * 2_u64.pow(retry_count - 1));
                    tokio::time::sleep(delay).await;
                    
                    tracing::warn!(
                        retry_attempt = retry_count,
                        delay_ms = delay.as_millis(),
                        conflicting_streams = ?streams,
                        "Retrying command due to concurrency conflict"
                    );
                }
                Err(other_error) => return Err(other_error),
            }
        }
    }
    }

Issue: Command validation failures

Symptoms:

  • Validation errors in command processing
  • Business rule violations
  • Data consistency issues

Debugging approach:

#![allow(unused)]
fn main() {
// Enhanced validation with detailed error reporting
#[derive(Debug, thiserror::Error)]
pub enum DetailedValidationError {
    #[error("Field validation failed: {field} - {reason}")]
    FieldValidation { field: String, reason: String },
    
    #[error("Business rule violation: {rule} - {context}")]
    BusinessRule { rule: String, context: String },
    
    #[error("State precondition failed: expected {expected}, found {actual}")]
    StatePrecondition { expected: String, actual: String },
    
    #[error("Reference validation failed: {reference_type} {reference_id} not found")]
    ReferenceNotFound { reference_type: String, reference_id: String },
}

// Validation with detailed context
pub fn validate_transfer_command(
    command: &TransferMoney,
    state: &AccountState,
) -> Result<(), DetailedValidationError> {
    // Check amount
    if command.amount <= Money::zero() {
        return Err(DetailedValidationError::FieldValidation {
            field: "amount".to_string(),
            reason: format!("Amount must be positive, got {}", command.amount),
        });
    }
    
    // Check account state
    if !state.is_active {
        return Err(DetailedValidationError::StatePrecondition {
            expected: "active account".to_string(),
            actual: "inactive account".to_string(),
        });
    }
    
    // Check sufficient balance
    if state.balance < command.amount {
        return Err(DetailedValidationError::BusinessRule {
            rule: "sufficient_balance".to_string(),
            context: format!(
                "Balance {} insufficient for transfer {}",
                state.balance, command.amount
            ),
        });
    }
    
    Ok(())
}
}
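
When this validation runs, matching on the error variants lets business rule violations be logged at a lower severity than structural failures, which usually indicate a caller or data bug. A minimal sketch using only the types above:

#![allow(unused)]
fn main() {
pub fn log_validation_outcome(command: &TransferMoney, state: &AccountState) -> bool {
    match validate_transfer_command(command, state) {
        Ok(()) => true,
        Err(DetailedValidationError::BusinessRule { rule, context }) => {
            // Business rule violations are expected in normal operation: log at warn
            tracing::warn!(rule = %rule, context = %context, "Business rule violation");
            false
        }
        Err(other) => {
            // Field, state, and reference failures usually point at a caller or data bug
            tracing::error!(error = %other, "Transfer command rejected");
            false
        }
    }
}
}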

Event Store Issues

Issue: High event store latency

Diagnosis tools:

#![allow(unused)]
fn main() {
// Event store performance monitor
#[derive(Debug, Clone)]
pub struct EventStoreMonitor {
    latency_tracker: Arc<Mutex<LatencyTracker>>,
}

impl EventStoreMonitor {
    pub async fn monitor_operation<F, T>(&self, operation_name: &str, operation: F) -> Result<T, EventStoreError>
    where
        F: Future<Output = Result<T, EventStoreError>>,
    {
        let start = std::time::Instant::now();
        let result = operation.await;
        let duration = start.elapsed();
        
        // Record latency
        {
            let mut tracker = self.latency_tracker.lock().await;
            tracker.record_operation(operation_name, duration, result.is_ok());
        }
        
        // Alert on high latency
        if duration > Duration::from_millis(1000) {
            tracing::warn!(
                operation = operation_name,
                duration_ms = duration.as_millis(),
                success = result.is_ok(),
                "High latency event store operation"
            );
        }
        
        result
    }
    
    pub async fn get_performance_report(&self) -> PerformanceReport {
        let tracker = self.latency_tracker.lock().await;
        tracker.generate_report()
    }
}

#[derive(Debug)]
pub struct LatencyTracker {
    operations: HashMap<String, Vec<OperationMetric>>,
}

#[derive(Debug, Clone)]
struct OperationMetric {
    duration: Duration,
    success: bool,
    timestamp: DateTime<Utc>,
}

impl LatencyTracker {
    pub fn record_operation(&mut self, operation: &str, duration: Duration, success: bool) {
        let metric = OperationMetric {
            duration,
            success,
            timestamp: Utc::now(),
        };
        
        self.operations
            .entry(operation.to_string())
            .or_insert_with(Vec::new)
            .push(metric);
        
        // Keep only recent metrics (last hour)
        let cutoff = Utc::now() - chrono::Duration::hours(1);
        for metrics in self.operations.values_mut() {
            metrics.retain(|m| m.timestamp > cutoff);
        }
    }
    
    pub fn generate_report(&self) -> PerformanceReport {
        let mut report = PerformanceReport::default();
        
        for (operation, metrics) in &self.operations {
            if metrics.is_empty() {
                continue;
            }
            
            let durations: Vec<_> = metrics.iter().map(|m| m.duration).collect();
            let success_rate = metrics.iter().filter(|m| m.success).count() as f64 / metrics.len() as f64;
            
            let operation_stats = OperationStats {
                operation_name: operation.clone(),
                total_operations: metrics.len(),
                success_rate,
                avg_duration: durations.iter().sum::<Duration>() / durations.len() as u32,
                p95_duration: calculate_percentile(&durations, 0.95),
                p99_duration: calculate_percentile(&durations, 0.99),
            };
            
            report.operations.push(operation_stats);
        }
        
        report
    }
}

fn calculate_percentile(durations: &[Duration], percentile: f64) -> Duration {
    let mut sorted = durations.to_vec();
    sorted.sort();
    let index = ((sorted.len() as f64 - 1.0) * percentile) as usize;
    sorted[index]
}
}
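
Wrapping individual event store calls in the monitor is then a one-liner. A minimal sketch, assuming the read_stream signature used elsewhere in this chapter:

#![allow(unused)]
fn main() {
use std::sync::Arc;

pub async fn read_stream_monitored(
    monitor: &EventStoreMonitor,
    event_store: &Arc<dyn EventStore>,
    stream_id: &StreamId,
) -> Result<usize, EventStoreError> {
    // The future is passed unawaited; the monitor times it and records the outcome
    let stream = monitor
        .monitor_operation(
            "read_stream",
            event_store.read_stream(stream_id, ReadOptions::default()),
        )
        .await?;

    Ok(stream.events.len())
}
}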

PostgreSQL-specific debugging:

-- Check for blocking queries
SELECT 
    blocked_locks.pid AS blocked_pid,
    blocked_activity.usename AS blocked_user,
    blocking_locks.pid AS blocking_pid,
    blocking_activity.usename AS blocking_user,
    blocked_activity.query AS blocked_statement,
    blocking_activity.query AS blocking_statement
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity 
    ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks 
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE
    AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
    AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity 
    ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.GRANTED;

-- Check index usage
SELECT 
    schemaname,
    tablename,
    indexname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan < 100
ORDER BY idx_scan;

-- Check table and index sizes
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

Issue: Event store corruption

Detection and recovery:

#![allow(unused)]
fn main() {
// Corruption detection
pub struct CorruptionDetector {
    event_store: Arc<dyn EventStore>,
}

impl CorruptionDetector {
    pub async fn scan_for_corruption(&self) -> Result<CorruptionReport, ScanError> {
        let mut report = CorruptionReport::default();
        
        // Scan all streams
        let all_streams = self.event_store.list_all_streams().await?;
        
        for stream_id in all_streams {
            match self.scan_stream(&stream_id).await {
                Ok(stream_report) => {
                    if stream_report.has_issues() {
                        report.corrupted_streams.push(stream_report);
                    }
                }
                Err(e) => {
                    tracing::error!(
                        stream_id = %stream_id,
                        error = %e,
                        "Failed to scan stream for corruption"
                    );
                    report.scan_errors.push(ScanError::StreamScanFailed {
                        stream_id: stream_id.clone(),
                        error: e.to_string(),
                    });
                }
            }
        }
        
        report.scan_completed_at = Utc::now();
        Ok(report)
    }
    
    async fn scan_stream(&self, stream_id: &StreamId) -> Result<StreamCorruptionReport, ScanError> {
        let mut report = StreamCorruptionReport {
            stream_id: stream_id.clone(),
            issues: Vec::new(),
        };
        
        let events = self.event_store.read_stream(stream_id, ReadOptions::default()).await?;
        
        // Check version sequence
        for (i, event) in events.events.iter().enumerate() {
            let expected_version = EventVersion::from(i as u64 + 1);
            if event.version != expected_version {
                report.issues.push(CorruptionIssue::VersionGap {
                    event_id: event.id,
                    expected_version,
                    actual_version: event.version,
                });
            }
            
            // Check event structure
            if let Err(e) = self.validate_event_structure(event) {
                report.issues.push(CorruptionIssue::StructuralError {
                    event_id: event.id,
                    error: e,
                });
            }
        }
        
        Ok(report)
    }
    
    fn validate_event_structure(&self, event: &StoredEvent) -> Result<(), String> {
        // Check UUID format
        if event.id.is_nil() {
            return Err("Nil event ID".to_string());
        }
        
        // Check the payload is present; in practice, deserialize into the concrete
        // domain event type here (a Value-to-Value round trip never fails)
        if event.payload.is_null() {
            return Err("Empty event payload".to_string());
        }
        
        // Check timestamp is reasonable
        let now = Utc::now();
        if event.occurred_at > now + chrono::Duration::minutes(5) {
            return Err("Event timestamp is in the future".to_string());
        }
        
        if event.occurred_at < (now - chrono::Duration::days(10 * 365)) {
            return Err("Event timestamp is too old".to_string());
        }
        
        Ok(())
    }
}

#[derive(Debug, Default)]
pub struct CorruptionReport {
    pub corrupted_streams: Vec<StreamCorruptionReport>,
    pub scan_errors: Vec<ScanError>,
    pub scan_completed_at: DateTime<Utc>,
}

#[derive(Debug)]
pub struct StreamCorruptionReport {
    pub stream_id: StreamId,
    pub issues: Vec<CorruptionIssue>,
}

impl StreamCorruptionReport {
    pub fn has_issues(&self) -> bool {
        !self.issues.is_empty()
    }
}

#[derive(Debug)]
pub enum CorruptionIssue {
    VersionGap {
        event_id: EventId,
        expected_version: EventVersion,
        actual_version: EventVersion,
    },
    StructuralError {
        event_id: EventId,
        error: String,
    },
    DuplicateEvent {
        event_id: EventId,
        duplicate_id: EventId,
    },
}
}
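
Running the scan periodically and acting on the report might look like the following sketch; alerting is left abstract.

#![allow(unused)]
fn main() {
pub async fn run_corruption_scan(detector: &CorruptionDetector) -> Result<(), ScanError> {
    let report = detector.scan_for_corruption().await?;

    if report.corrupted_streams.is_empty() && report.scan_errors.is_empty() {
        tracing::info!("Corruption scan clean");
        return Ok(());
    }

    for stream_report in &report.corrupted_streams {
        tracing::error!(
            stream_id = %stream_report.stream_id,
            issue_count = stream_report.issues.len(),
            "Stream corruption detected"
        );
    }

    // Surface scan failures separately: they mean coverage gaps, not confirmed corruption
    for error in &report.scan_errors {
        tracing::warn!(error = ?error, "Stream could not be scanned");
    }

    Ok(())
}
}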

Projection Issues

Issue: Projection lag

Monitoring and diagnosis:

#![allow(unused)]
fn main() {
// Projection lag monitor
#[derive(Debug, Clone)]
pub struct ProjectionLagMonitor {
    event_store: Arc<dyn EventStore>,
    projection_manager: Arc<ProjectionManager>,
}

impl ProjectionLagMonitor {
    pub async fn check_all_projections(&self) -> Result<Vec<ProjectionLagReport>, MonitorError> {
        let mut reports = Vec::new();
        
        let projections = self.projection_manager.list_projections().await?;
        let latest_event_time = self.get_latest_event_time().await?;
        
        for projection_name in projections {
            let report = self.check_projection_lag(&projection_name, latest_event_time).await?;
            reports.push(report);
        }
        
        Ok(reports)
    }
    
    async fn check_projection_lag(
        &self,
        projection_name: &str,
        latest_event_time: DateTime<Utc>,
    ) -> Result<ProjectionLagReport, MonitorError> {
        let checkpoint = self.projection_manager
            .get_checkpoint(projection_name)
            .await?;
        
        let lag = match checkpoint.last_processed_at {
            Some(last_processed) => latest_event_time.signed_duration_since(last_processed),
            None => chrono::Duration::max_value(), // Never processed
        };
        
        let status = if lag > chrono::Duration::minutes(30) {
            ProjectionStatus::Critical
        } else if lag > chrono::Duration::minutes(5) {
            ProjectionStatus::Warning
        } else {
            ProjectionStatus::Healthy
        };
        
        Ok(ProjectionLagReport {
            projection_name: projection_name.to_string(),
            lag_duration: lag,
            status,
            last_processed_event: checkpoint.last_event_id,
            last_processed_at: checkpoint.last_processed_at,
            events_processed: checkpoint.events_processed,
        })
    }
    
    async fn get_latest_event_time(&self) -> Result<DateTime<Utc>, MonitorError> {
        // Get the timestamp of the most recent event across all streams
        self.event_store.get_latest_event_time().await
            .map_err(MonitorError::EventStoreError)
    }
}

#[derive(Debug)]
pub struct ProjectionLagReport {
    pub projection_name: String,
    pub lag_duration: chrono::Duration,
    pub status: ProjectionStatus,
    pub last_processed_event: Option<EventId>,
    pub last_processed_at: Option<DateTime<Utc>>,
    pub events_processed: u64,
}

#[derive(Debug, Clone)]
pub enum ProjectionStatus {
    Healthy,
    Warning,
    Critical,
}
}
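
A periodic task can feed these reports into alerting. A minimal sketch, with the alerting hand-off left abstract; the five-minute interval is illustrative.

#![allow(unused)]
fn main() {
pub async fn watch_projection_lag(monitor: ProjectionLagMonitor) {
    let mut ticker = tokio::time::interval(std::time::Duration::from_secs(300));

    loop {
        ticker.tick().await;

        match monitor.check_all_projections().await {
            Ok(reports) => {
                for report in reports {
                    if matches!(report.status, ProjectionStatus::Critical) {
                        tracing::error!(
                            projection = %report.projection_name,
                            lag_seconds = report.lag_duration.num_seconds(),
                            "Projection critically behind"
                        );
                        // Hand off to your alerting integration here
                    }
                }
            }
            Err(e) => tracing::error!(error = %e, "Projection lag check failed"),
        }
    }
}
}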

Projection rebuild when corrupted:

#![allow(unused)]
fn main() {
// Safe projection rebuild
pub struct ProjectionRebuilder {
    event_store: Arc<dyn EventStore>,
    projection_manager: Arc<ProjectionManager>,
}

impl ProjectionRebuilder {
    pub async fn rebuild_projection(
        &self,
        projection_name: &str,
        strategy: RebuildStrategy,
    ) -> Result<RebuildResult, RebuildError> {
        tracing::info!(
            projection_name = projection_name,
            strategy = ?strategy,
            "Starting projection rebuild"
        );
        
        let start_time = Utc::now();
        
        // Create backup of current projection state
        let backup_id = self.backup_projection_state(projection_name).await?;
        
        // Reset projection state
        self.projection_manager.reset_projection(projection_name).await?;
        
        // Rebuild based on strategy
        let rebuild_result = match strategy {
            RebuildStrategy::Full => {
                self.rebuild_from_beginning(projection_name).await
            }
            RebuildStrategy::FromCheckpoint { checkpoint_time } => {
                self.rebuild_from_checkpoint(projection_name, checkpoint_time).await
            }
            RebuildStrategy::FromEvent { event_id } => {
                self.rebuild_from_event(projection_name, event_id).await
            }
        };
        
        match rebuild_result {
            Ok(stats) => {
                // Rebuild successful - clean up backup
                self.cleanup_projection_backup(backup_id).await?;
                
                let duration = Utc::now().signed_duration_since(start_time);
                
                tracing::info!(
                    projection_name = projection_name,
                    events_processed = stats.events_processed,
                    duration_seconds = duration.num_seconds(),
                    "Projection rebuild completed successfully"
                );
                
                Ok(RebuildResult {
                    success: true,
                    events_processed: stats.events_processed,
                    duration,
                    backup_id: Some(backup_id),
                })
            }
            Err(e) => {
                // Rebuild failed - restore from backup
                tracing::error!(
                    projection_name = projection_name,
                    error = %e,
                    "Projection rebuild failed, restoring from backup"
                );
                
                self.restore_projection_from_backup(projection_name, backup_id).await?;
                
                Err(RebuildError::RebuildFailed {
                    original_error: Box::new(e),
                    backup_restored: true,
                })
            }
        }
    }
    
    async fn rebuild_from_beginning(&self, projection_name: &str) -> Result<RebuildStats, RebuildError> {
        let mut stats = RebuildStats::default();
        
        // Get all events in chronological order
        let events = self.event_store.read_all_events_ordered().await?;
        
        // Process events in batches
        let batch_size = 1000;
        for chunk in events.chunks(batch_size) {
            self.projection_manager
                .process_events_batch(projection_name, chunk)
                .await?;
            
            stats.events_processed += chunk.len() as u64;
            
            // Checkpoint every batch
            self.projection_manager
                .save_checkpoint(projection_name)
                .await?;
            
            // Progress reporting
            if stats.events_processed % 10000 == 0 {
                tracing::info!(
                    projection_name = projection_name,
                    events_processed = stats.events_processed,
                    "Rebuild progress"
                );
            }
        }
        
        Ok(stats)
    }
}

#[derive(Debug)]
pub enum RebuildStrategy {
    Full,
    FromCheckpoint { checkpoint_time: DateTime<Utc> },
    FromEvent { event_id: EventId },
}

#[derive(Debug, Default)]
pub struct RebuildStats {
    pub events_processed: u64,
}

#[derive(Debug)]
pub struct RebuildResult {
    pub success: bool,
    pub events_processed: u64,
    pub duration: chrono::Duration,
    pub backup_id: Option<Uuid>,
}
}
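
Triggering a full rebuild from an operational script or CLI handler is then straightforward. A minimal sketch, assuming a constructed ProjectionRebuilder; the projection name matches the runbook example later in this chapter.

#![allow(unused)]
fn main() {
pub async fn rebuild_user_summary(rebuilder: &ProjectionRebuilder) -> Result<(), RebuildError> {
    let result = rebuilder
        .rebuild_projection("user-summary", RebuildStrategy::Full)
        .await?;

    tracing::info!(
        events_processed = result.events_processed,
        duration_seconds = result.duration.num_seconds(),
        "user-summary projection rebuilt"
    );
    Ok(())
}
}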

Debugging Tools

Command Execution Tracer

#![allow(unused)]
fn main() {
// Detailed command execution tracer
#[derive(Debug, Clone)]
pub struct CommandTracer {
    traces: Arc<Mutex<HashMap<Uuid, CommandTrace>>>,
}

#[derive(Debug, Clone)]
pub struct CommandTrace {
    pub trace_id: Uuid,
    pub command_type: String,
    pub start_time: DateTime<Utc>,
    pub phases: Vec<TracePhase>,
    pub completed: bool,
    pub result: Option<Result<String, String>>,
}

#[derive(Debug, Clone)]
pub struct TracePhase {
    pub phase_name: String,
    pub start_time: DateTime<Utc>,
    pub duration: Option<Duration>,
    pub details: HashMap<String, String>,
}

impl CommandTracer {
    pub fn start_trace<C: Command>(&self, command: &C) -> Uuid {
        let trace_id = Uuid::new_v4();
        let trace = CommandTrace {
            trace_id,
            command_type: std::any::type_name::<C>().to_string(),
            start_time: Utc::now(),
            phases: Vec::new(),
            completed: false,
            result: None,
        };
        
        let mut traces = self.traces.lock().unwrap();
        traces.insert(trace_id, trace);
        
        tracing::info!(
            trace_id = %trace_id,
            command_type = std::any::type_name::<C>(),
            "Started command trace"
        );
        
        trace_id
    }
    
    pub fn add_phase(&self, trace_id: Uuid, phase_name: &str, details: HashMap<String, String>) {
        let mut traces = self.traces.lock().unwrap();
        if let Some(trace) = traces.get_mut(&trace_id) {
            trace.phases.push(TracePhase {
                phase_name: phase_name.to_string(),
                start_time: Utc::now(),
                duration: None,
                details,
            });
        }
    }
    
    pub fn complete_phase(&self, trace_id: Uuid) {
        let mut traces = self.traces.lock().unwrap();
        if let Some(trace) = traces.get_mut(&trace_id) {
            if let Some(last_phase) = trace.phases.last_mut() {
                last_phase.duration = Some(
                    Utc::now().signed_duration_since(last_phase.start_time).to_std().unwrap_or_default()
                );
            }
        }
    }
    
    pub fn complete_trace(&self, trace_id: Uuid, result: Result<String, String>) {
        let mut traces = self.traces.lock().unwrap();
        if let Some(trace) = traces.get_mut(&trace_id) {
            trace.completed = true;
            trace.result = Some(result);
            
            let total_duration = Utc::now().signed_duration_since(trace.start_time);
            
            tracing::info!(
                trace_id = %trace_id,
                duration_ms = total_duration.num_milliseconds(),
                phases = trace.phases.len(),
                success = trace.result.as_ref().unwrap().is_ok(),
                "Completed command trace"
            );
        }
    }
    
    pub fn get_trace(&self, trace_id: Uuid) -> Option<CommandTrace> {
        let traces = self.traces.lock().unwrap();
        traces.get(&trace_id).cloned()
    }
    
    pub fn get_recent_traces(&self, limit: usize) -> Vec<CommandTrace> {
        let traces = self.traces.lock().unwrap();
        let mut trace_list: Vec<_> = traces.values().cloned().collect();
        trace_list.sort_by(|a, b| b.start_time.cmp(&a.start_time));
        trace_list.into_iter().take(limit).collect()
    }
}

// Usage in command executor
pub async fn execute_with_tracing<C: Command>(
    command: &C,
    executor: &CommandExecutor,
    tracer: &CommandTracer,
) -> CommandResult<ExecutionResult> {
    let trace_id = tracer.start_trace(command);
    
    // Phase 1: Stream Reading
    tracer.add_phase(trace_id, "stream_reading", hashmap! {
        "streams_to_read".to_string() => command.read_streams(command).len().to_string(),
    });
    
    let result = executor.execute(command).await;
    
    tracer.complete_phase(trace_id);
    
    // Complete trace
    let trace_result = match &result {
        Ok(execution_result) => Ok(format!(
            "Events written: {}, Streams affected: {}",
            execution_result.events_written.len(),
            execution_result.affected_streams.len()
        )),
        Err(e) => Err(e.to_string()),
    };
    
    tracer.complete_trace(trace_id, trace_result);
    
    result
}
}

Performance Profiler

#![allow(unused)]
fn main() {
// Built-in performance profiler
#[derive(Debug, Clone)]
pub struct PerformanceProfiler {
    profiles: Arc<Mutex<HashMap<String, PerformanceProfile>>>,
    enabled: bool,
}

#[derive(Debug, Clone)]
pub struct PerformanceProfile {
    pub operation_name: String,
    pub samples: Vec<PerformanceSample>,
    pub statistics: ProfileStatistics,
}

#[derive(Debug, Clone)]
pub struct PerformanceSample {
    pub timestamp: DateTime<Utc>,
    pub duration: Duration,
    pub memory_before: usize,
    pub memory_after: usize,
    pub success: bool,
    pub metadata: HashMap<String, String>,
}

#[derive(Debug, Clone, Default)]
pub struct ProfileStatistics {
    pub total_samples: usize,
    pub success_rate: f64,
    pub avg_duration: Duration,
    pub min_duration: Duration,
    pub max_duration: Duration,
    pub p95_duration: Duration,
    pub avg_memory_delta: i64,
}

impl PerformanceProfiler {
    pub fn new(enabled: bool) -> Self {
        Self {
            profiles: Arc::new(Mutex::new(HashMap::new())),
            enabled,
        }
    }
    
    pub async fn profile_operation<F, T>(&self, operation_name: &str, operation: F) -> T
    where
        F: Future<Output = T>,
    {
        if !self.enabled {
            return operation.await;
        }
        
        let memory_before = self.get_current_memory_usage();
        let start_time = Utc::now();
        let start_instant = std::time::Instant::now();
        
        let result = operation.await;
        
        let duration = start_instant.elapsed();
        let memory_after = self.get_current_memory_usage();
        
        let sample = PerformanceSample {
            timestamp: start_time,
            duration,
            memory_before,
            memory_after,
            success: true, // Would need to be determined by operation type
            metadata: HashMap::new(),
        };
        
        // Record sample
        let mut profiles = self.profiles.lock().await;
        let profile = profiles.entry(operation_name.to_string()).or_insert_with(|| {
            PerformanceProfile {
                operation_name: operation_name.to_string(),
                samples: Vec::new(),
                statistics: ProfileStatistics::default(),
            }
        });
        
        profile.samples.push(sample);
        
        // Update statistics
        self.update_statistics(profile);
        
        // Keep only recent samples (last hour)
        let cutoff = Utc::now() - chrono::Duration::hours(1);
        profile.samples.retain(|s| s.timestamp > cutoff);
        
        result
    }
    
    fn update_statistics(&self, profile: &mut PerformanceProfile) {
        if profile.samples.is_empty() {
            return;
        }
        
        let mut durations: Vec<_> = profile.samples.iter().map(|s| s.duration).collect();
        durations.sort();
        
        let success_count = profile.samples.iter().filter(|s| s.success).count();
        
        profile.statistics = ProfileStatistics {
            total_samples: profile.samples.len(),
            success_rate: success_count as f64 / profile.samples.len() as f64,
            avg_duration: durations.iter().sum::<Duration>() / durations.len() as u32,
            min_duration: durations[0],
            max_duration: durations[durations.len() - 1],
            p95_duration: durations[(durations.len() as f64 * 0.95) as usize],
            avg_memory_delta: profile.samples.iter()
                .map(|s| s.memory_after as i64 - s.memory_before as i64)
                .sum::<i64>() / profile.samples.len() as i64,
        };
    }
    
    fn get_current_memory_usage(&self) -> usize {
        // Platform-specific memory usage detection
        // This is a simplified implementation
        0
    }
    
    pub async fn get_profile_report(&self) -> HashMap<String, ProfileStatistics> {
        let profiles = self.profiles.lock().await;
        profiles.iter()
            .map(|(name, profile)| (name.clone(), profile.statistics.clone()))
            .collect()
    }
}
}
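
Wrapping a command execution in the profiler looks like this sketch; "eventcore.execute_command" is just an illustrative operation name.

#![allow(unused)]
fn main() {
pub async fn execute_profiled<C: Command>(
    command: &C,
    executor: &CommandExecutor,
    profiler: &PerformanceProfiler,
) -> CommandResult<ExecutionResult> {
    // The execute future is timed and sampled by the profiler before being awaited
    profiler
        .profile_operation("eventcore.execute_command", executor.execute(command))
        .await
}
}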

Log Analysis Tools

#![allow(unused)]
fn main() {
// Automated log analysis for common issues
#[derive(Debug, Clone)]
pub struct LogAnalyzer {
    log_patterns: Vec<LogPattern>,
}

#[derive(Debug, Clone)]
pub struct LogPattern {
    pub name: String,
    pub pattern: String,
    pub severity: LogSeverity,
    pub action: String,
}

#[derive(Debug, Clone)]
pub enum LogSeverity {
    Info,
    Warning,
    Error,
    Critical,
}

impl LogAnalyzer {
    pub fn new() -> Self {
        Self {
            log_patterns: Self::default_patterns(),
        }
    }
    
    fn default_patterns() -> Vec<LogPattern> {
        vec![
            LogPattern {
                name: "connection_pool_exhaustion".to_string(),
                pattern: r"(?i)connection.*pool.*exhausted|too many connections".to_string(),
                severity: LogSeverity::Critical,
                action: "Scale up connection pool or check for connection leaks".to_string(),
            },
            LogPattern {
                name: "command_timeout".to_string(),
                pattern: r"(?i)command.*timeout|execution.*timeout".to_string(),
                severity: LogSeverity::Error,
                action: "Check database performance and query optimization".to_string(),
            },
            LogPattern {
                name: "concurrency_conflict".to_string(),
                pattern: r"(?i)concurrency.*conflict|version.*conflict".to_string(),
                severity: LogSeverity::Warning,
                action: "Consider optimizing command patterns or retry strategies".to_string(),
            },
            LogPattern {
                name: "memory_pressure".to_string(),
                pattern: r"(?i)out of memory|memory.*limit|allocation.*failed".to_string(),
                severity: LogSeverity::Critical,
                action: "Scale up memory or check for memory leaks".to_string(),
            },
            LogPattern {
                name: "projection_lag".to_string(),
                pattern: r"(?i)projection.*lag|projection.*behind".to_string(),
                severity: LogSeverity::Warning,
                action: "Check projection performance and consider scaling".to_string(),
            },
        ]
    }
    
    pub async fn analyze_logs(&self, log_entries: &[LogEntry]) -> LogAnalysisReport {
        let mut report = LogAnalysisReport::default();
        
        for entry in log_entries {
            for pattern in &self.log_patterns {
                if self.matches_pattern(&entry.message, &pattern.pattern) {
                    let issue = LogIssue {
                        pattern_name: pattern.name.clone(),
                        severity: pattern.severity.clone(),
                        message: entry.message.clone(),
                        timestamp: entry.timestamp,
                        action: pattern.action.clone(),
                        occurrences: 1,
                    };
                    
                    // Aggregate similar issues
                    if let Some(existing) = report.issues.iter_mut()
                        .find(|i| i.pattern_name == issue.pattern_name) {
                        existing.occurrences += 1;
                        if entry.timestamp > existing.timestamp {
                            existing.timestamp = entry.timestamp;
                            existing.message = entry.message.clone();
                        }
                    } else {
                        report.issues.push(issue);
                    }
                }
            }
        }
        
        // Sort by severity and occurrence count
        report.issues.sort_by(|a, b| {
            match (&a.severity, &b.severity) {
                (LogSeverity::Critical, LogSeverity::Critical) => b.occurrences.cmp(&a.occurrences),
                (LogSeverity::Critical, _) => std::cmp::Ordering::Less,
                (_, LogSeverity::Critical) => std::cmp::Ordering::Greater,
                (LogSeverity::Error, LogSeverity::Error) => b.occurrences.cmp(&a.occurrences),
                (LogSeverity::Error, _) => std::cmp::Ordering::Less,
                (_, LogSeverity::Error) => std::cmp::Ordering::Greater,
                _ => b.occurrences.cmp(&a.occurrences),
            }
        });
        
        report
    }
    
    fn matches_pattern(&self, message: &str, pattern: &str) -> bool {
        use regex::Regex;
        if let Ok(regex) = Regex::new(pattern) {
            regex.is_match(message)
        } else {
            false
        }
    }
}

#[derive(Debug, Default)]
pub struct LogAnalysisReport {
    pub issues: Vec<LogIssue>,
}

#[derive(Debug)]
pub struct LogIssue {
    pub pattern_name: String,
    pub severity: LogSeverity,
    pub message: String,
    pub timestamp: DateTime<Utc>,
    pub action: String,
    pub occurrences: u32,
}

#[derive(Debug)]
pub struct LogEntry {
    pub timestamp: DateTime<Utc>,
    pub level: String,
    pub message: String,
    pub metadata: HashMap<String, String>,
}
}
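
A minimal sketch of wiring the analyzer into a periodic maintenance task. How the LogEntry values are collected from your log aggregation pipeline is assumed here, not part of EventCore:

#![allow(unused)]
fn main() {
// Run the analyzer over recently collected entries and surface any
// critical findings. Entries are assumed to come from your log pipeline.
async fn review_recent_logs(entries: Vec<LogEntry>) {
    let analyzer = LogAnalyzer::new();
    let report = analyzer.analyze_logs(&entries).await;

    for issue in &report.issues {
        if matches!(issue.severity, LogSeverity::Critical) {
            eprintln!(
                "CRITICAL [{}] seen {} time(s): {}",
                issue.pattern_name, issue.occurrences, issue.action
            );
        }
    }
}
}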

Troubleshooting Runbooks

Common Runbooks

Runbook 1: High Command Latency

  1. Check connection pool status

    curl http://localhost:9090/metrics | grep eventcore_connection_pool
    
  2. Analyze slow queries

    SELECT query, mean_time, calls 
    FROM pg_stat_statements 
    ORDER BY mean_time DESC 
    LIMIT 10;
    
  3. Check for lock contention

    SELECT * FROM pg_locks WHERE NOT granted;
    
  4. Scale resources if needed

    kubectl scale deployment eventcore-app --replicas=6
    

Runbook 2: Projection Lag

  1. Check projection status

    curl http://localhost:8080/health/projections
    
  2. Identify lagging projections

    curl http://localhost:9090/metrics | grep projection_lag
    
  3. Restart projection processing

    kubectl delete pod -l app=eventcore-projections
    
  4. Consider projection rebuild if corruption detected

    kubectl exec -it eventcore-app -- eventcore-cli projection rebuild user-summary
    

Runbook 3: Memory Issues

  1. Check memory usage

    kubectl top pods -l app=eventcore
    
  2. Analyze memory patterns

    curl http://localhost:9090/metrics | grep memory_usage
    
  3. Generate heap dump if needed

    kubectl exec -it eventcore-app -- kill -USR1 1
    
  4. Scale up memory limits

    resources:
      limits:
        memory: "1Gi"
    

Best Practices

  1. Comprehensive monitoring - Monitor all system components
  2. Automated diagnostics - Use tools to detect issues early
  3. Detailed logging - Include context and correlation IDs
  4. Performance profiling - Regular performance analysis
  5. Runbook maintenance - Keep troubleshooting guides updated
  6. Incident response - Defined escalation procedures
  7. Root cause analysis - Learn from every incident
  8. Preventive measures - Address issues before they become problems

Summary

EventCore troubleshooting:

  • Systematic diagnosis - Structured approach to problem identification
  • Comprehensive tools - Built-in debugging and monitoring tools
  • Automated analysis - Log analysis and pattern detection
  • Performance profiling - Detailed performance insights
  • Runbook automation - Standardized troubleshooting procedures

Key components:

  1. Use comprehensive monitoring to detect issues early
  2. Implement systematic debugging approaches for complex problems
  3. Maintain detailed logs with proper correlation and context
  4. Use automated tools for log analysis and pattern detection
  5. Document and automate common troubleshooting procedures

Next, let’s explore Production Checklist

Chapter 6.5: Production Checklist

This chapter provides a comprehensive checklist for deploying EventCore applications to production. Use this as a final validation before going live and as a periodic review for existing production systems.

Pre-Deployment Checklist

Security

Authentication and Authorization

  • JWT secret key configured and secured
  • Token expiration properly configured
  • Role-based access control implemented and tested
  • API rate limiting configured
  • CORS origins restricted to known domains
  • HTTPS enforced for all endpoints
  • Security headers configured (HSTS, CSP, etc.)
#![allow(unused)]
fn main() {
// Security configuration validation
#[derive(Debug)]
pub struct SecurityAudit {
    pub findings: Vec<SecurityFinding>,
}

#[derive(Debug)]
pub struct SecurityFinding {
    pub category: SecurityCategory,
    pub severity: SecuritySeverity,
    pub description: String,
    pub recommendation: String,
}

#[derive(Debug)]
pub enum SecurityCategory {
    Authentication,
    Authorization,
    Encryption,
    NetworkSecurity,
    DataProtection,
}

#[derive(Debug)]
pub enum SecuritySeverity {
    Critical,
    High,
    Medium,
    Low,
}

pub struct SecurityAuditor;

impl SecurityAuditor {
    pub fn audit_configuration(config: &AppConfig) -> SecurityAudit {
        let mut findings = Vec::new();
        
        // Check JWT configuration
        if config.jwt.secret_key.len() < 32 {
            findings.push(SecurityFinding {
                category: SecurityCategory::Authentication,
                severity: SecuritySeverity::Critical,
                description: "JWT secret key is too short".to_string(),
                recommendation: "Use a secret key of at least 256 bits (32 bytes)".to_string(),
            });
        }
        
        // Check CORS configuration
        if config.cors.allowed_origins.contains(&"*".to_string()) {
            findings.push(SecurityFinding {
                category: SecurityCategory::NetworkSecurity,
                severity: SecuritySeverity::High,
                description: "CORS allows all origins".to_string(),
                recommendation: "Restrict CORS to specific trusted domains".to_string(),
            });
        }
        
        // Check HTTPS enforcement
        if !config.server.force_https {
            findings.push(SecurityFinding {
                category: SecurityCategory::NetworkSecurity,
                severity: SecuritySeverity::High,
                description: "HTTPS not enforced".to_string(),
                recommendation: "Enable HTTPS enforcement for all endpoints".to_string(),
            });
        }
        
        // Check rate limiting
        if config.rate_limiting.requests_per_minute == 0 {
            findings.push(SecurityFinding {
                category: SecurityCategory::NetworkSecurity,
                severity: SecuritySeverity::Medium,
                description: "Rate limiting not configured".to_string(),
                recommendation: "Configure appropriate rate limits for API endpoints".to_string(),
            });
        }
        
        SecurityAudit { findings }
    }
}
}

Database Security

  • Database credentials stored in secrets management
  • Connection encryption (SSL/TLS) enabled
  • Database user permissions follow principle of least privilege
  • Database firewall rules restrict access
  • Connection pooling properly configured
  • Query parameterization used (prevent SQL injection)
-- PostgreSQL security checklist queries
-- Check SSL is enforced
SHOW ssl;

-- Check user permissions
\du

-- Check database-level permissions
SELECT datname, datacl FROM pg_database;

-- Check table-level permissions
SELECT schemaname, tablename, tableowner, tablespace, hasindexes, hasrules, hastriggers 
FROM pg_tables 
WHERE schemaname = 'public';

-- Verify no wildcard permissions
SELECT * FROM information_schema.table_privileges 
WHERE grantee = 'PUBLIC';

Performance

Resource Limits

  • CPU limits set appropriately
  • Memory limits configured with buffer
  • Database connection pool sized correctly
  • Request timeouts configured
  • Circuit breakers implemented
  • Resource quotas set at namespace level
# Kubernetes resource configuration checklist
apiVersion: v1
kind: LimitRange
metadata:
  name: eventcore-limits
  namespace: eventcore
spec:
  limits:
  - type: Container
    default:
      memory: "512Mi"
      cpu: "500m"
    defaultRequest:
      memory: "256Mi"
      cpu: "250m"
    max:
      memory: "2Gi"
      cpu: "2000m"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: eventcore-quota
  namespace: eventcore
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    persistentvolumeclaims: "4"

Performance Benchmarks

  • Load testing completed with realistic scenarios
  • Performance baselines established
  • Scalability limits identified
  • Database query performance optimized
  • Index usage analyzed and optimized
#![allow(unused)]
fn main() {
// Performance validation
pub struct PerformanceValidator {
    target_metrics: PerformanceTargets,
}

#[derive(Debug, Clone)]
pub struct PerformanceTargets {
    pub max_p95_latency_ms: u64,
    pub min_throughput_rps: f64,
    pub max_error_rate: f64,
    pub max_memory_usage_mb: f64,
}

impl PerformanceValidator {
    pub async fn validate_performance(&self) -> Result<PerformanceValidationResult, ValidationError> {
        let mut results = PerformanceValidationResult::default();
        
        // Test command latency
        let latency_test = self.test_command_latency().await?;
        results.latency_passed = latency_test.p95_latency_ms <= self.target_metrics.max_p95_latency_ms;
        
        // Test throughput
        let throughput_test = self.test_throughput().await?;
        results.throughput_passed = throughput_test.requests_per_second >= self.target_metrics.min_throughput_rps;
        
        // Test error rate
        let error_test = self.test_error_rate().await?;
        results.error_rate_passed = error_test.error_rate <= self.target_metrics.max_error_rate;
        
        // Test memory usage
        let memory_test = self.test_memory_usage().await?;
        results.memory_passed = memory_test.peak_memory_mb <= self.target_metrics.max_memory_usage_mb;
        
        results.overall_passed = results.latency_passed && 
                                 results.throughput_passed && 
                                 results.error_rate_passed && 
                                 results.memory_passed;
        
        Ok(results)
    }
    
    async fn test_command_latency(&self) -> Result<LatencyTestResult, ValidationError> {
        // Implement latency testing
        // Execute sample commands and measure response times
        Ok(LatencyTestResult {
            p95_latency_ms: 50, // Example result
            avg_latency_ms: 25,
        })
    }
    
    async fn test_throughput(&self) -> Result<ThroughputTestResult, ValidationError> {
        // Implement throughput testing
        // Execute concurrent commands and measure RPS
        Ok(ThroughputTestResult {
            requests_per_second: 150.0, // Example result
            peak_concurrent_requests: 50,
        })
    }
}

#[derive(Debug, Default)]
pub struct PerformanceValidationResult {
    pub latency_passed: bool,
    pub throughput_passed: bool,
    pub error_rate_passed: bool,
    pub memory_passed: bool,
    pub overall_passed: bool,
}
}

Reliability

High Availability

  • Multiple replicas deployed
  • Pod disruption budgets configured
  • Health checks implemented and tested
  • Readiness probes properly configured
  • Liveness probes tuned appropriately
  • Rolling update strategy configured
# High availability configuration
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: eventcore-pdb
  namespace: eventcore
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: eventcore
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eventcore-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    spec:
      containers:
      - name: eventcore-app
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 5
          failureThreshold: 3

Backup and Recovery

  • Automated backups configured and tested
  • Backup verification automated
  • Recovery procedures documented and tested
  • Point-in-time recovery capability verified
  • Cross-region backup replication configured
  • Backup retention policies implemented
#![allow(unused)]
fn main() {
// Backup validation
pub struct BackupValidator;

impl BackupValidator {
    pub async fn validate_backup_system(&self) -> Result<BackupValidationResult, ValidationError> {
        let mut result = BackupValidationResult::default();
        
        // Test backup creation
        result.backup_creation = self.test_backup_creation().await?;
        
        // Test backup verification
        result.backup_verification = self.test_backup_verification().await?;
        
        // Test restore functionality
        result.restore_capability = self.test_restore_capability().await?;
        
        // Test backup schedule
        result.backup_schedule = self.verify_backup_schedule().await?;
        
        // Test retention policy
        result.retention_policy = self.verify_retention_policy().await?;
        
        result.overall_passed = result.backup_creation && 
                                result.backup_verification && 
                                result.restore_capability && 
                                result.backup_schedule && 
                                result.retention_policy;
        
        Ok(result)
    }
}

#[derive(Debug, Default)]
pub struct BackupValidationResult {
    pub backup_creation: bool,
    pub backup_verification: bool,
    pub restore_capability: bool,
    pub backup_schedule: bool,
    pub retention_policy: bool,
    pub overall_passed: bool,
}
}

Monitoring and Observability

Metrics Collection

  • Application metrics exported to Prometheus
  • Business metrics tracked
  • Infrastructure metrics monitored
  • Custom dashboards created for key metrics
  • SLI/SLO defined and monitored
#![allow(unused)]
fn main() {
// Metrics validation
pub struct MetricsValidator {
    prometheus_client: PrometheusClient,
}

impl MetricsValidator {
    pub async fn validate_metrics(&self) -> Result<MetricsValidationResult, ValidationError> {
        let mut result = MetricsValidationResult::default();
        
        // Check core application metrics
        result.core_metrics = self.check_core_metrics().await?;
        
        // Check business metrics
        result.business_metrics = self.check_business_metrics().await?;
        
        // Check infrastructure metrics
        result.infrastructure_metrics = self.check_infrastructure_metrics().await?;
        
        // Verify metric freshness
        result.metrics_current = self.check_metrics_freshness().await?;
        
        result.overall_passed = result.core_metrics && 
                                result.business_metrics && 
                                result.infrastructure_metrics && 
                                result.metrics_current;
        
        Ok(result)
    }
    
    async fn check_core_metrics(&self) -> Result<bool, ValidationError> {
        let required_metrics = vec![
            "eventcore_commands_total",
            "eventcore_command_duration_seconds",
            "eventcore_events_written_total",
            "eventcore_active_streams",
            "eventcore_projection_lag_seconds",
        ];
        
        for metric in required_metrics {
            if !self.prometheus_client.metric_exists(metric).await? {
                return Ok(false);
            }
        }
        
        Ok(true)
    }
}
}

Logging

  • Structured logging implemented
  • Log aggregation configured
  • Log retention policies set
  • Correlation IDs used throughout
  • Log levels appropriately configured
  • Sensitive data excluded from logs
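
One way to cover several of these items in Rust is the tracing ecosystem. A minimal sketch, assuming the tracing and tracing-subscriber crates (with the json feature) are available; field names such as correlation_id are illustrative and should match your log pipeline:

#![allow(unused)]
fn main() {
use tracing::{info, info_span};

// Emit structured JSON logs to stdout for the log aggregator.
fn init_logging() {
    tracing_subscriber::fmt()
        .json()            // structured output
        .with_target(true) // include module paths for filtering
        .init();
}

fn handle_request(correlation_id: &str) {
    // Attach the correlation ID to every log line emitted in this span.
    let span = info_span!("request", correlation_id = %correlation_id);
    let _guard = span.enter();
    info!(user_id = "user-123", "processing command"); // no sensitive data in fields
}
}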

Alerting

  • Critical alerts configured
  • Warning alerts tuned to reduce noise
  • Alert routing configured for different severities
  • Escalation policies defined
  • Alert fatigue minimized through proper thresholds
# Alerting validation checklist
groups:
- name: eventcore-critical
  rules:
  - alert: EventCoreDown
    expr: up{job="eventcore"} == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "EventCore service is down"
      
  - alert: HighErrorRate
    expr: rate(eventcore_command_errors_total[5m]) / rate(eventcore_commands_total[5m]) > 0.05
    for: 3m
    labels:
      severity: critical
    annotations:
      summary: "High error rate detected"
      
  - alert: DatabaseConnectionFailure
    expr: eventcore_connection_pool_errors_total > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Database connection issues"

Deployment Checklist

Environment Configuration

  • Environment variables properly set
  • Secrets configured and mounted
  • Config maps updated
  • Feature flags configured appropriately
  • Resource limits applied
  • Network policies configured
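
A small startup sketch for failing fast when required environment variables or mounted secrets are missing, using the PostgresConfig::from_env() constructor documented in the configuration reference. The error handling shown (and the assumption that ConfigError implements Display) is illustrative:

#![allow(unused)]
fn main() {
// Load configuration before serving traffic; exit immediately if the
// environment is incomplete instead of failing on the first request.
fn load_startup_config() -> PostgresConfig {
    match PostgresConfig::from_env() {
        Ok(config) => config,
        Err(error) => {
            eprintln!("invalid configuration: {error}");
            std::process::exit(1);
        }
    }
}
}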

Database Setup

  • Database migrations applied and verified
  • Database indexes created and optimized
  • Database monitoring configured
  • Connection pooling tuned
  • Backup strategy implemented
  • Read replicas configured if needed
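
If migrations are applied as part of a controlled deploy step, a sketch using the MigrationConfig builder from the configuration reference (the values are examples, not recommendations):

#![allow(unused)]
fn main() {
use std::time::Duration;

// Enable automatic migrations for the deploy job only, so application
// replicas do not race to apply schema changes at startup.
fn migration_enabled_config(database_url: String) -> PostgresConfig {
    PostgresConfig::new(database_url)
        .with_migration_config(MigrationConfig {
            auto_migrate: true,                          // run pending migrations on startup
            migration_timeout: Duration::from_secs(120), // generous timeout for large event tables
            ..MigrationConfig::default()
        })
}
}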

Infrastructure

  • DNS records configured
  • Load balancer configured
  • SSL certificates installed and valid
  • CDN configured if applicable
  • Firewall rules applied
  • Network segmentation implemented

Post-Deployment Verification

Functional Testing

  • Smoke tests pass
  • Critical user journeys work
  • API endpoints respond correctly
  • Authentication works
  • Authorization enforced
  • Error handling works properly
#![allow(unused)]
fn main() {
// Post-deployment validation suite
pub struct PostDeploymentValidator {
    base_url: String,
    auth_token: String,
}

impl PostDeploymentValidator {
    pub async fn run_validation_suite(&self) -> Result<ValidationSuite, ValidationError> {
        let mut suite = ValidationSuite::default();
        
        // Test 1: Health check
        suite.health_check = self.test_health_endpoint().await?;
        
        // Test 2: Authentication
        suite.authentication = self.test_authentication().await?;
        
        // Test 3: Core functionality
        suite.core_functionality = self.test_core_functionality().await?;
        
        // Test 4: Performance
        suite.performance = self.test_basic_performance().await?;
        
        // Test 5: Error handling
        suite.error_handling = self.test_error_handling().await?;
        
        suite.overall_passed = suite.health_check && 
                               suite.authentication && 
                               suite.core_functionality && 
                               suite.performance && 
                               suite.error_handling;
        
        Ok(suite)
    }
    
    async fn test_health_endpoint(&self) -> Result<bool, ValidationError> {
        let response = reqwest::get(&format!("{}/health", self.base_url)).await?;
        Ok(response.status().is_success())
    }
    
    async fn test_authentication(&self) -> Result<bool, ValidationError> {
        // Test with valid token
        let client = reqwest::Client::new();
        let response = client
            .get(&format!("{}/api/v1/test", self.base_url))
            .header("Authorization", format!("Bearer {}", self.auth_token))
            .send()
            .await?;
        
        if !response.status().is_success() {
            return Ok(false);
        }
        
        // Test without token (should fail)
        let response = client
            .get(&format!("{}/api/v1/test", self.base_url))
            .send()
            .await?;
        
        Ok(response.status() == 401)
    }
    
    async fn test_core_functionality(&self) -> Result<bool, ValidationError> {
        // Test a simple command execution
        let client = reqwest::Client::new();
        let create_user_payload = serde_json::json!({
            "email": "test@example.com",
            "first_name": "Test",
            "last_name": "User"
        });
        
        let response = client
            .post(&format!("{}/api/v1/users", self.base_url))
            .header("Authorization", format!("Bearer {}", self.auth_token))
            .json(&create_user_payload)
            .send()
            .await?;
        
        Ok(response.status().is_success())
    }
}

#[derive(Debug, Default)]
pub struct ValidationSuite {
    pub health_check: bool,
    pub authentication: bool,
    pub core_functionality: bool,
    pub performance: bool,
    pub error_handling: bool,
    pub overall_passed: bool,
}
}

Performance Validation

  • Response times within acceptable limits
  • Throughput meets requirements
  • Resource usage within limits
  • Memory leaks not detected
  • CPU usage stable
  • Database performance optimal
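
A sketch of gating the release on these checks with the PerformanceValidator from the pre-deployment section. The target numbers are placeholders; replace them with your established baselines:

#![allow(unused)]
fn main() {
// Fail the deployment pipeline if measured performance misses the targets.
async fn check_performance_gates() -> Result<(), ValidationError> {
    let validator = PerformanceValidator {
        target_metrics: PerformanceTargets {
            max_p95_latency_ms: 100,
            min_throughput_rps: 100.0,
            max_error_rate: 0.01,
            max_memory_usage_mb: 512.0,
        },
    };

    let result = validator.validate_performance().await?;
    assert!(result.overall_passed, "performance validation failed: {result:?}");
    Ok(())
}
}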

Monitoring Validation

  • Metrics flowing to monitoring system
  • Logs being collected and indexed
  • Traces visible in tracing system
  • Alerts triggering appropriately
  • Dashboards showing correct data
  • SLI/SLO monitoring active

Ongoing Operations Checklist

Daily Checks

  • System health green across all services
  • Error rates within acceptable thresholds
  • Performance metrics meeting SLOs
  • Resource utilization not approaching limits
  • Log analysis for new error patterns
  • Security alerts reviewed

Weekly Checks

  • Backup verification completed successfully
  • Performance trends analyzed
  • Capacity planning reviewed
  • Security patches evaluated and applied
  • Dependency updates reviewed
  • Documentation updated as needed

Monthly Checks

  • Disaster recovery procedures tested
  • Security audit completed
  • Performance benchmarks updated
  • Cost optimization opportunities identified
  • Capacity forecasting updated
  • Runbook accuracy verified

Automation Scripts

Deployment Validation Script

#!/bin/bash
# deployment-validation.sh

set -e

NAMESPACE="eventcore"
APP_NAME="eventcore-app"
BASE_URL="https://api.eventcore.example.com"

echo "🚀 Starting deployment validation..."

# Check deployment status
echo "📋 Checking deployment status..."
kubectl rollout status deployment/$APP_NAME -n $NAMESPACE --timeout=300s

# Check pod health
echo "🏥 Checking pod health..."
READY_PODS=$(kubectl get pods -l app=$APP_NAME -n $NAMESPACE -o jsonpath='{.items[?(@.status.phase=="Running")].metadata.name}' | wc -w)
DESIRED_PODS=$(kubectl get deployment $APP_NAME -n $NAMESPACE -o jsonpath='{.spec.replicas}')

if [ "$READY_PODS" -ne "$DESIRED_PODS" ]; then
    echo "❌ Not all pods are ready: $READY_PODS/$DESIRED_PODS"
    exit 1
fi

echo "✅ All pods are ready: $READY_PODS/$DESIRED_PODS"

# Check health endpoint
echo "🔍 Testing health endpoint..."
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" $BASE_URL/health)
if [ "$HTTP_STATUS" -ne 200 ]; then
    echo "❌ Health check failed with status: $HTTP_STATUS"
    exit 1
fi

echo "✅ Health check passed"

# Check metrics endpoint
echo "📊 Testing metrics endpoint..."
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" $BASE_URL/metrics)
if [ "$HTTP_STATUS" -ne 200 ]; then
    echo "❌ Metrics endpoint failed with status: $HTTP_STATUS"
    exit 1
fi

echo "✅ Metrics endpoint responding"

# Check database connectivity
echo "🗄️ Testing database connectivity..."
if ! kubectl exec -n $NAMESPACE deployment/$APP_NAME -- eventcore-cli health-check database; then
    echo "❌ Database connectivity check failed"
    exit 1
fi

echo "✅ Database connectivity verified"

# Run smoke tests
echo "💨 Running smoke tests..."
if ! kubectl exec -n $NAMESPACE deployment/$APP_NAME -- eventcore-cli test smoke; then
    echo "❌ Smoke tests failed"
    exit 1
fi

echo "✅ Smoke tests passed"

echo "🎉 Deployment validation completed successfully!"

Health Check Script

#!/bin/bash
# health-check.sh

set -e

NAMESPACE="eventcore"
PROMETHEUS_URL="http://prometheus.monitoring.svc.cluster.local:9090"

echo "🔍 Running comprehensive health check..."

# Check application health
echo "📱 Checking application health..."
APP_UP=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=up{job=\"eventcore\"}" | jq '.data.result[0].value[1]' -r)
if [ "$APP_UP" != "1" ]; then
    echo "❌ Application is down"
    exit 1
fi

# Check error rate
echo "🚨 Checking error rate..."
ERROR_RATE=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=rate(eventcore_command_errors_total[5m])/rate(eventcore_commands_total[5m])" | jq '.data.result[0].value[1]' -r)
if (( $(echo "$ERROR_RATE > 0.05" | bc -l) )); then
    echo "❌ High error rate detected: $ERROR_RATE"
    exit 1
fi

# Check response time
echo "⏱️ Checking response time..."
P95_LATENCY=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=histogram_quantile(0.95, rate(eventcore_command_duration_seconds_bucket[5m]))" | jq '.data.result[0].value[1]' -r)
if (( $(echo "$P95_LATENCY > 1.0" | bc -l) )); then
    echo "❌ High latency detected: ${P95_LATENCY}s"
    exit 1
fi

# Check database connectivity
echo "🗄️ Checking database health..."
DB_CONNECTIONS=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=eventcore_connection_pool_size" | jq '.data.result[0].value[1]' -r)
MAX_CONNECTIONS=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=eventcore_connection_pool_max_size" | jq '.data.result[0].value[1]' -r)
UTILIZATION=$(echo "scale=2; $DB_CONNECTIONS / $MAX_CONNECTIONS" | bc)

if (( $(echo "$UTILIZATION > 0.8" | bc -l) )); then
    echo "⚠️ High database connection utilization: $UTILIZATION"
fi

echo "✅ All health checks passed!"

Emergency Procedures

Incident Response

  1. Assess severity using incident severity matrix
  2. Activate incident response team if critical
  3. Create incident tracking (ticket/channel)
  4. Implement immediate mitigation if possible
  5. Communicate status to stakeholders
  6. Investigate root cause after mitigation
  7. Document lessons learned and improvements

Rollback Procedures

#!/bin/bash
# emergency-rollback.sh

NAMESPACE="eventcore"
APP_NAME="eventcore-app"

echo "🚨 Emergency rollback initiated..."

# Get previous revision
CURRENT_REVISION=$(kubectl rollout history deployment/$APP_NAME -n $NAMESPACE --output=json | jq '.items[-1].revision')
PREVIOUS_REVISION=$((CURRENT_REVISION - 1))

echo "Rolling back from revision $CURRENT_REVISION to $PREVIOUS_REVISION"

# Perform rollback
kubectl rollout undo deployment/$APP_NAME -n $NAMESPACE --to-revision=$PREVIOUS_REVISION

# Wait for rollback to complete
kubectl rollout status deployment/$APP_NAME -n $NAMESPACE --timeout=300s

# Verify health
sleep 30
./health-check.sh

echo "✅ Emergency rollback completed"

Summary

Production readiness checklist for EventCore:

  • Security - Authentication, authorization, encryption
  • Performance - Resource limits, optimization, benchmarks
  • Reliability - High availability, backup and recovery
  • Monitoring - Metrics, logging, alerting, dashboards
  • Operations - Deployment validation, health checks, incident response

Key principles:

  1. Validate everything - Don’t assume anything works in production
  2. Automate checks - Use scripts and tools for consistent validation
  3. Monitor continuously - Track all critical metrics and logs
  4. Plan for failure - Have rollback and recovery procedures ready
  5. Document procedures - Maintain up-to-date runbooks and checklists

This completes the EventCore Operations guide. You now have comprehensive documentation for deploying, monitoring, and maintaining EventCore applications in production environments.

Next, proceed to Part 7: Reference

Part 7: Reference

This part provides comprehensive reference documentation for EventCore. Use this section to look up specific APIs, configuration options, error codes, and terminology.

Chapters in This Part

  1. API Documentation - Complete API reference
  2. Configuration Reference - All configuration options
  3. Error Reference - Error codes and troubleshooting
  4. Glossary - Definitions and terminology

What You’ll Find

  • Complete API documentation with examples
  • Exhaustive configuration option reference
  • Comprehensive error code catalog
  • Definitions of all EventCore terminology

Usage

This reference documentation is designed for:

  • Quick lookups during development
  • Understanding specific configuration options
  • Troubleshooting error conditions
  • Learning EventCore terminology

Organization

Each reference chapter is organized alphabetically or logically for easy navigation. Use the search functionality in your documentation viewer to quickly find specific information.

Ready to explore the reference? Start with API Documentation

API Documentation

The complete EventCore API documentation is generated from the source code using rustdoc.

The API documentation includes:

  • Complete type and trait references - All public types, traits, and functions
  • Usage examples - Code examples demonstrating common patterns
  • Module documentation - Overview and guidance for each module
  • Cross-references - Links between related types and concepts

Key Modules

Core Library

  • eventcore - Core library with command execution, event stores, and projections
  • eventcore::prelude - Common imports for EventCore applications

Event Store Adapters

Derive Macros

Quick Reference

For quick access to commonly used items, refer to the generated rustdoc for each crate.

Chapter 7.2: Configuration Reference

This chapter provides a complete reference for all EventCore configuration options. Use this as a lookup guide when setting up and tuning your EventCore applications.

Core Configuration

EventStore Configuration

Configuration for event store implementations.

PostgresConfig

Configuration for PostgreSQL event store.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct PostgresConfig {
    pub database_url: String,
    pub pool_config: PoolConfig,
    pub migration_config: MigrationConfig,
    pub performance_config: PerformanceConfig,
    pub security_config: SecurityConfig,
}

impl PostgresConfig {
    pub fn new(database_url: String) -> Self
    pub fn from_env() -> Result<Self, ConfigError>
    pub fn with_pool_config(mut self, config: PoolConfig) -> Self
    pub fn with_migration_config(mut self, config: MigrationConfig) -> Self
}
}

Example:

#![allow(unused)]
fn main() {
let config = PostgresConfig::new("postgresql://localhost/eventcore".to_string())
    .with_pool_config(PoolConfig {
        max_connections: 20,
        min_connections: 5,
        connect_timeout: Duration::from_secs(10),
        idle_timeout: Some(Duration::from_secs(300)),
        max_lifetime: Some(Duration::from_secs(1800)),
        ..PoolConfig::default()
    })
    .with_migration_config(MigrationConfig {
        auto_migrate: true,
        migration_timeout: Duration::from_secs(60),
        ..MigrationConfig::default()
    });
}

PoolConfig

Database connection pool configuration.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct PoolConfig {
    /// Maximum number of connections in the pool
    pub max_connections: u32,
    
    /// Minimum number of connections to maintain
    pub min_connections: u32,
    
    /// Timeout for establishing new connections
    pub connect_timeout: Duration,
    
    /// Maximum time a connection can be idle before being closed
    pub idle_timeout: Option<Duration>,
    
    /// Maximum lifetime of a connection
    pub max_lifetime: Option<Duration>,
    
    /// Test connections before use
    pub test_before_acquire: bool,
}

impl Default for PoolConfig {
    fn default() -> Self {
        Self {
            max_connections: 10,
            min_connections: 2,
            connect_timeout: Duration::from_secs(5),
            idle_timeout: Some(Duration::from_secs(600)),
            max_lifetime: Some(Duration::from_secs(3600)),
            test_before_acquire: true,
        }
    }
}
}

Tuning Guidelines:

  • max_connections: 2-4x CPU cores for CPU-bound workloads, higher for I/O-bound
  • min_connections: 10-20% of max_connections
  • connect_timeout: 5-10 seconds for local databases, 15-30 seconds for remote
  • idle_timeout: 5-10 minutes to balance connection reuse and resource usage
  • max_lifetime: 30-60 minutes to prevent connection staleness
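
For example, a pool tuned for an I/O-bound service following these guidelines might look like this (a starting point, not a recommendation):

#![allow(unused)]
fn main() {
use std::time::Duration;

let pool = PoolConfig {
    max_connections: 32,                           // well above 2-4x cores for I/O-bound work
    min_connections: 4,                            // roughly 10-20% of max_connections
    connect_timeout: Duration::from_secs(5),       // local or same-VPC database
    idle_timeout: Some(Duration::from_secs(300)),  // 5 minutes
    max_lifetime: Some(Duration::from_secs(1800)), // 30 minutes
    test_before_acquire: true,
};
}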

MigrationConfig

Database migration configuration.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct MigrationConfig {
    /// Automatically run migrations on startup
    pub auto_migrate: bool,
    
    /// Timeout for migration operations
    pub migration_timeout: Duration,
    
    /// Lock timeout for migration coordination
    pub lock_timeout: Duration,
    
    /// Migration table name
    pub migration_table: String,
}

impl Default for MigrationConfig {
    fn default() -> Self {
        Self {
            auto_migrate: false,
            migration_timeout: Duration::from_secs(300),
            lock_timeout: Duration::from_secs(60),
            migration_table: "_sqlx_migrations".to_string(),
        }
    }
}
}

Command Execution Configuration

CommandExecutorConfig

Configuration for command execution behavior.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct CommandExecutorConfig {
    pub retry_config: RetryConfig,
    pub timeout_config: TimeoutConfig,
    pub concurrency_config: ConcurrencyConfig,
    pub metrics_config: MetricsConfig,
}

impl Default for CommandExecutorConfig {
    fn default() -> Self {
        Self {
            retry_config: RetryConfig::default(),
            timeout_config: TimeoutConfig::default(),
            concurrency_config: ConcurrencyConfig::default(),
            metrics_config: MetricsConfig::default(),
        }
    }
}
}

RetryConfig

Configuration for command retry behavior.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct RetryConfig {
    /// Maximum number of retry attempts
    pub max_attempts: u32,
    
    /// Initial delay before first retry
    pub initial_delay: Duration,
    
    /// Maximum delay between retries
    pub max_delay: Duration,
    
    /// Multiplier for exponential backoff
    pub backoff_multiplier: f64,
    
    /// Which types of errors to retry
    pub retry_policy: RetryPolicy,
    
    /// Add jitter to prevent thundering herd
    pub jitter: bool,
}

impl RetryConfig {
    pub fn none() -> Self {
        Self {
            max_attempts: 0,
            ..Default::default()
        }
    }
    
    pub fn aggressive() -> Self {
        Self {
            max_attempts: 10,
            initial_delay: Duration::from_millis(10),
            max_delay: Duration::from_secs(5),
            backoff_multiplier: 1.5,
            retry_policy: RetryPolicy::All,
            jitter: true,
        }
    }
    
    pub fn conservative() -> Self {
        Self {
            max_attempts: 3,
            initial_delay: Duration::from_millis(100),
            max_delay: Duration::from_secs(2),
            backoff_multiplier: 2.0,
            retry_policy: RetryPolicy::ConcurrencyConflictsOnly,
            jitter: true,
        }
    }
}

impl Default for RetryConfig {
    fn default() -> Self {
        Self {
            max_attempts: 5,
            initial_delay: Duration::from_millis(50),
            max_delay: Duration::from_secs(1),
            backoff_multiplier: 2.0,
            retry_policy: RetryPolicy::TransientErrorsOnly,
            jitter: true,
        }
    }
}

#[derive(Debug, Clone)]
pub enum RetryPolicy {
    /// Never retry
    None,
    
    /// Only retry concurrency conflicts
    ConcurrencyConflictsOnly,
    
    /// Only retry transient errors (connection issues, timeouts)
    TransientErrorsOnly,
    
    /// Retry all retryable errors
    All,
}
}

Retry Policy Guidelines:

  • ConcurrencyConflictsOnly: Use for high-conflict scenarios where immediate retry is beneficial
  • TransientErrorsOnly: Use for stable systems where business logic errors shouldn’t be retried
  • All: Use for development or systems where any failure might be recoverable
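
For example, you might select one of the presets above per environment; the is_production flag is illustrative:

#![allow(unused)]
fn main() {
// Choose a retry profile based on the deployment environment.
fn retry_config(is_production: bool) -> RetryConfig {
    if is_production {
        // Retry conservatively: concurrency conflicts only, modest backoff.
        RetryConfig::conservative()
    } else {
        // In development, retry aggressively to smooth over local flakiness.
        RetryConfig::aggressive()
    }
}
}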

TimeoutConfig

Configuration for command timeouts.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct TimeoutConfig {
    /// Default timeout for command execution
    pub default_timeout: Duration,
    
    /// Timeout for reading streams
    pub read_timeout: Duration,
    
    /// Timeout for writing events
    pub write_timeout: Duration,
    
    /// Timeout for stream discovery
    pub discovery_timeout: Duration,
}

impl Default for TimeoutConfig {
    fn default() -> Self {
        Self {
            default_timeout: Duration::from_secs(30),
            read_timeout: Duration::from_secs(10),
            write_timeout: Duration::from_secs(15),
            discovery_timeout: Duration::from_secs(5),
        }
    }
}
}

ConcurrencyConfig

Configuration for concurrent command execution.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct ConcurrencyConfig {
    /// Maximum number of concurrent commands
    pub max_concurrent_commands: usize,
    
    /// Maximum iterations for stream discovery
    pub max_discovery_iterations: usize,
    
    /// Enable command batching
    pub enable_batching: bool,
    
    /// Maximum batch size for event writes
    pub max_batch_size: usize,
    
    /// Batch timeout
    pub batch_timeout: Duration,
}

impl Default for ConcurrencyConfig {
    fn default() -> Self {
        Self {
            max_concurrent_commands: 100,
            max_discovery_iterations: 10,
            enable_batching: true,
            max_batch_size: 1000,
            batch_timeout: Duration::from_millis(100),
        }
    }
}
}

Concurrency Tuning:

  • max_concurrent_commands: Balance between throughput and resource usage
  • max_discovery_iterations: Higher values allow more complex stream patterns but increase latency
  • max_batch_size: Larger batches improve throughput but increase memory usage and latency
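
For example, a throughput-oriented configuration might look like this (values are illustrative):

#![allow(unused)]
fn main() {
use std::time::Duration;

let concurrency = ConcurrencyConfig {
    max_concurrent_commands: 250,             // allow more in-flight commands
    max_discovery_iterations: 10,
    enable_batching: true,
    max_batch_size: 2000,                     // larger batches trade latency for throughput
    batch_timeout: Duration::from_millis(50),
};
}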

Projection Configuration

ProjectionConfig

Configuration for projection management.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct ProjectionConfig {
    pub checkpoint_config: CheckpointConfig,
    pub processing_config: ProcessingConfig,
    pub recovery_config: RecoveryConfig,
}
}

CheckpointConfig

Configuration for projection checkpointing.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct CheckpointConfig {
    /// How often to save checkpoints
    pub checkpoint_interval: Duration,
    
    /// Number of events to process before checkpointing
    pub events_per_checkpoint: usize,
    
    /// Store for checkpoint persistence
    pub checkpoint_store: CheckpointStoreConfig,
    
    /// Enable checkpoint compression
    pub compress_checkpoints: bool,
}

impl Default for CheckpointConfig {
    fn default() -> Self {
        Self {
            checkpoint_interval: Duration::from_secs(30),
            events_per_checkpoint: 1000,
            checkpoint_store: CheckpointStoreConfig::Database,
            compress_checkpoints: true,
        }
    }
}

#[derive(Debug, Clone)]
pub enum CheckpointStoreConfig {
    /// Store checkpoints in the main database
    Database,
    
    /// Store checkpoints in Redis
    Redis { connection_string: String },
    
    /// Store checkpoints in memory (testing only)
    InMemory,
    
    /// Custom checkpoint store
    Custom { store_type: String, config: HashMap<String, String> },
}
}

ProcessingConfig

Configuration for event processing.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct ProcessingConfig {
    /// Number of events to process in each batch
    pub batch_size: usize,
    
    /// Timeout for processing a single event
    pub event_timeout: Duration,
    
    /// Timeout for processing a batch
    pub batch_timeout: Duration,
    
    /// Number of parallel processors
    pub parallelism: usize,
    
    /// Buffer size for event queues
    pub buffer_size: usize,
    
    /// Error handling strategy
    pub error_handling: ErrorHandlingStrategy,
}

impl Default for ProcessingConfig {
    fn default() -> Self {
        Self {
            batch_size: 100,
            event_timeout: Duration::from_secs(5),
            batch_timeout: Duration::from_secs(30),
            parallelism: 1,
            buffer_size: 10000,
            error_handling: ErrorHandlingStrategy::SkipAndLog,
        }
    }
}

#[derive(Debug, Clone)]
pub enum ErrorHandlingStrategy {
    /// Skip failed events and log errors
    SkipAndLog,
    
    /// Stop processing on first error
    FailFast,
    
    /// Retry failed events with backoff
    Retry { max_attempts: u32, backoff: Duration },
    
    /// Send failed events to dead letter queue
    DeadLetter { queue_config: DeadLetterConfig },
}
}

Monitoring Configuration

MetricsConfig

Configuration for metrics collection.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct MetricsConfig {
    /// Enable metrics collection
    pub enabled: bool,
    
    /// Metrics export format
    pub export_format: MetricsFormat,
    
    /// Export interval
    pub export_interval: Duration,
    
    /// Histogram buckets for latency metrics
    pub latency_buckets: Vec<f64>,
    
    /// Labels to add to all metrics
    pub default_labels: HashMap<String, String>,
    
    /// Metrics to collect
    pub collectors: Vec<MetricsCollector>,
}

impl Default for MetricsConfig {
    fn default() -> Self {
        Self {
            enabled: true,
            export_format: MetricsFormat::Prometheus,
            export_interval: Duration::from_secs(15),
            latency_buckets: vec![
                0.001, 0.005, 0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0
            ],
            default_labels: HashMap::new(),
            collectors: vec![
                MetricsCollector::Commands,
                MetricsCollector::Events,
                MetricsCollector::Projections,
                MetricsCollector::System,
            ],
        }
    }
}

#[derive(Debug, Clone)]
pub enum MetricsFormat {
    Prometheus,
    OpenTelemetry,
    StatsD,
    Custom { format: String },
}

#[derive(Debug, Clone)]
pub enum MetricsCollector {
    Commands,
    Events,
    Projections,
    System,
    Custom { name: String },
}
}

TracingConfig

Configuration for distributed tracing.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct TracingConfig {
    /// Enable tracing
    pub enabled: bool,
    
    /// Tracing exporter configuration
    pub exporter: TracingExporter,
    
    /// Sampling configuration
    pub sampling: SamplingConfig,
    
    /// Resource attributes
    pub resource_attributes: HashMap<String, String>,
    
    /// Span attributes to add to all spans
    pub default_span_attributes: HashMap<String, String>,
}

impl Default for TracingConfig {
    fn default() -> Self {
        Self {
            enabled: true,
            exporter: TracingExporter::Jaeger {
                endpoint: "http://localhost:14268/api/traces".to_string(),
            },
            sampling: SamplingConfig::default(),
            resource_attributes: HashMap::new(),
            default_span_attributes: HashMap::new(),
        }
    }
}

#[derive(Debug, Clone)]
pub enum TracingExporter {
    Jaeger { endpoint: String },
    Zipkin { endpoint: String },
    OpenTelemetry { endpoint: String },
    Console,
    None,
}

#[derive(Debug, Clone)]
pub struct SamplingConfig {
    /// Sampling rate (0.0 to 1.0)
    pub sample_rate: f64,
    
    /// Always sample errors
    pub always_sample_errors: bool,
    
    /// Sampling strategy
    pub strategy: SamplingStrategy,
}

impl Default for SamplingConfig {
    fn default() -> Self {
        Self {
            sample_rate: 0.1,
            always_sample_errors: true,
            strategy: SamplingStrategy::Probabilistic,
        }
    }
}

#[derive(Debug, Clone)]
pub enum SamplingStrategy {
    /// Always sample
    Always,
    
    /// Never sample
    Never,
    
    /// Probabilistic sampling
    Probabilistic,
    
    /// Rate limiting sampling
    RateLimit { max_per_second: u32 },
}
}

LoggingConfig

Configuration for structured logging.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct LoggingConfig {
    /// Log level
    pub level: LogLevel,
    
    /// Log format
    pub format: LogFormat,
    
    /// Output destination
    pub output: LogOutput,
    
    /// Include timestamps
    pub include_timestamps: bool,
    
    /// Include source code locations
    pub include_locations: bool,
    
    /// Correlation ID header name
    pub correlation_id_header: String,
    
    /// Fields to include in all log entries
    pub default_fields: HashMap<String, String>,
}

impl Default for LoggingConfig {
    fn default() -> Self {
        Self {
            level: LogLevel::Info,
            format: LogFormat::Json,
            output: LogOutput::Stdout,
            include_timestamps: true,
            include_locations: false,
            correlation_id_header: "x-correlation-id".to_string(),
            default_fields: HashMap::new(),
        }
    }
}

#[derive(Debug, Clone)]
pub enum LogLevel {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

#[derive(Debug, Clone)]
pub enum LogFormat {
    Json,
    Logfmt,
    Pretty,
    Compact,
}

#[derive(Debug, Clone)]
pub enum LogOutput {
    Stdout,
    Stderr,
    File { path: String, rotation: RotationConfig },
    Syslog { facility: String },
    Network { endpoint: String },
}

#[derive(Debug, Clone)]
pub struct RotationConfig {
    pub max_size_mb: u64,
    pub max_files: u32,
    pub compress: bool,
}
}

Security Configuration

SecurityConfig

Configuration for security features.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct SecurityConfig {
    pub tls_config: Option<TlsConfig>,
    pub auth_config: AuthConfig,
    pub encryption_config: EncryptionConfig,
}
}

TlsConfig

Configuration for TLS encryption.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct TlsConfig {
    /// Path to certificate file
    pub cert_file: String,
    
    /// Path to private key file
    pub key_file: String,
    
    /// Path to CA certificate file (for client verification)
    pub ca_file: Option<String>,
    
    /// Require client certificates
    pub require_client_cert: bool,
    
    /// Minimum TLS version
    pub min_version: TlsVersion,
    
    /// Allowed cipher suites
    pub cipher_suites: Vec<String>,
}

#[derive(Debug, Clone)]
pub enum TlsVersion {
    V1_2,
    V1_3,
}
}

AuthConfig

Configuration for authentication.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct AuthConfig {
    /// Authentication provider
    pub provider: AuthProvider,
    
    /// Token validation settings
    pub token_validation: TokenValidationConfig,
    
    /// Session configuration
    pub session_config: SessionConfig,
}

#[derive(Debug, Clone)]
pub enum AuthProvider {
    /// JWT-based authentication
    Jwt { 
        secret_key: String,
        algorithm: JwtAlgorithm,
        issuer: Option<String>,
        audience: Option<String>,
    },
    
    /// OAuth2 authentication
    OAuth2 {
        client_id: String,
        client_secret: String,
        auth_url: String,
        token_url: String,
        scopes: Vec<String>,
    },
    
    /// API key authentication
    ApiKey {
        header_name: String,
        query_param: Option<String>,
    },
    
    /// Custom authentication
    Custom { provider_type: String, config: HashMap<String, String> },
}

#[derive(Debug, Clone)]
pub enum JwtAlgorithm {
    HS256,
    HS384,
    HS512,
    RS256,
    RS384,
    RS512,
    ES256,
    ES384,
    ES512,
}
}

EncryptionConfig

Configuration for data encryption.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct EncryptionConfig {
    /// Enable encryption at rest
    pub encrypt_at_rest: bool,
    
    /// Encryption algorithm
    pub algorithm: EncryptionAlgorithm,
    
    /// Key management configuration
    pub key_management: KeyManagementConfig,
    
    /// Fields to encrypt
    pub encrypted_fields: Vec<String>,
}

#[derive(Debug, Clone)]
pub enum EncryptionAlgorithm {
    AES256GCM,
    ChaCha20Poly1305,
    XChaCha20Poly1305,
}

#[derive(Debug, Clone)]
pub enum KeyManagementConfig {
    /// Environment variable
    Environment { key_var: String },
    
    /// AWS KMS
    AwsKms { key_id: String, region: String },
    
    /// HashiCorp Vault
    Vault { endpoint: String, token: String, key_path: String },
    
    /// File-based key storage
    File { key_file: String },
}
}

Environment Variables

EventCore supports configuration via environment variables with the EVENTCORE_ prefix:

Core Settings

# Database configuration
EVENTCORE_DATABASE_URL=postgresql://localhost/eventcore
EVENTCORE_DATABASE_MAX_CONNECTIONS=20
EVENTCORE_DATABASE_MIN_CONNECTIONS=5
EVENTCORE_DATABASE_CONNECT_TIMEOUT=10
EVENTCORE_DATABASE_IDLE_TIMEOUT=300
EVENTCORE_DATABASE_MAX_LIFETIME=1800

# Command execution
EVENTCORE_COMMAND_DEFAULT_TIMEOUT=30
EVENTCORE_COMMAND_MAX_RETRIES=5
EVENTCORE_COMMAND_RETRY_DELAY_MS=50
EVENTCORE_COMMAND_MAX_CONCURRENT=100

# Projections
EVENTCORE_PROJECTION_BATCH_SIZE=100
EVENTCORE_PROJECTION_CHECKPOINT_INTERVAL=30
EVENTCORE_PROJECTION_EVENTS_PER_CHECKPOINT=1000

# Metrics and monitoring
EVENTCORE_METRICS_ENABLED=true
EVENTCORE_METRICS_EXPORT_INTERVAL=15
EVENTCORE_TRACING_ENABLED=true
EVENTCORE_TRACING_SAMPLE_RATE=0.1

# Security
EVENTCORE_JWT_SECRET=your-secret-key
EVENTCORE_TLS_CERT_FILE=/path/to/cert.pem
EVENTCORE_TLS_KEY_FILE=/path/to/key.pem
EVENTCORE_ENCRYPT_AT_REST=true

Logging Configuration

EVENTCORE_LOG_LEVEL=info
EVENTCORE_LOG_FORMAT=json
EVENTCORE_LOG_OUTPUT=stdout
EVENTCORE_LOG_INCLUDE_TIMESTAMPS=true
EVENTCORE_LOG_INCLUDE_LOCATIONS=false

Development Settings

# Development mode settings
EVENTCORE_DEV_MODE=true
EVENTCORE_DEV_AUTO_MIGRATE=true
EVENTCORE_DEV_RESET_DB=false
EVENTCORE_DEV_SEED_DATA=true

# Testing settings  
EVENTCORE_TEST_DATABASE_URL=postgresql://localhost/eventcore_test
EVENTCORE_TEST_PARALLEL=true
EVENTCORE_TEST_RESET_BETWEEN_TESTS=true

Configuration Files

TOML Configuration Example

# eventcore.toml

[database]
url = "postgresql://localhost/eventcore"
max_connections = 20
min_connections = 5
connect_timeout = "10s"
idle_timeout = "5m"
max_lifetime = "30m"

[commands]
default_timeout = "30s"
max_retries = 5
retry_delay = "50ms"
max_concurrent = 100
max_discovery_iterations = 10

[projections]
batch_size = 100
checkpoint_interval = "30s"
events_per_checkpoint = 1000
parallelism = 1

[metrics]
enabled = true
export_format = "prometheus"
export_interval = "15s"
latency_buckets = [0.001, 0.005, 0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]

[tracing]
enabled = true
exporter = "jaeger"
jaeger_endpoint = "http://localhost:14268/api/traces"
sample_rate = 0.1
always_sample_errors = true

[logging]
level = "info"
format = "json"
output = "stdout"
include_timestamps = true
include_locations = false

[security]
encrypt_at_rest = true
jwt_secret = "${JWT_SECRET}"

[security.tls]
cert_file = "/etc/ssl/certs/eventcore.pem"
key_file = "/etc/ssl/private/eventcore.key"
require_client_cert = false
min_version = "1.3"

YAML Configuration Example

# eventcore.yaml

database:
  url: postgresql://localhost/eventcore
  pool:
    max_connections: 20
    min_connections: 5
    connect_timeout: 10s
    idle_timeout: 5m
    max_lifetime: 30m
  migration:
    auto_migrate: false
    migration_timeout: 5m

commands:
  timeout:
    default_timeout: 30s
    read_timeout: 10s
    write_timeout: 15s
  retry:
    max_attempts: 5
    initial_delay: 50ms
    max_delay: 1s
    backoff_multiplier: 2.0
    policy: transient_errors_only
    jitter: true
  concurrency:
    max_concurrent_commands: 100
    max_discovery_iterations: 10
    enable_batching: true
    max_batch_size: 1000

projections:
  checkpoint:
    interval: 30s
    events_per_checkpoint: 1000
    store: database
    compress: true
  processing:
    batch_size: 100
    event_timeout: 5s
    batch_timeout: 30s
    parallelism: 1
    error_handling: skip_and_log

monitoring:
  metrics:
    enabled: true
    export_format: prometheus
    export_interval: 15s
    collectors:
      - commands
      - events
      - projections
      - system
  tracing:
    enabled: true
    exporter:
      type: jaeger
      endpoint: http://localhost:14268/api/traces
    sampling:
      sample_rate: 0.1
      always_sample_errors: true
  logging:
    level: info
    format: json
    output: stdout
    correlation_id_header: x-correlation-id

security:
  auth:
    provider:
      type: jwt
      secret_key: ${JWT_SECRET}
      algorithm: HS256
  encryption:
    encrypt_at_rest: true
    algorithm: AES256GCM
    key_management:
      type: environment
      key_var: ENCRYPTION_KEY

Configuration Loading

EventCore supports multiple configuration sources with the following precedence order:

  1. Command line arguments (highest priority)
  2. Environment variables
  3. Configuration files (TOML, YAML, JSON)
  4. Default values (lowest priority)

Loading Configuration in Code

#![allow(unused)]
fn main() {
use eventcore::config::{EventCoreConfig, ConfigBuilder};

// Load from environment and files
let config = EventCoreConfig::from_env()
    .expect("Failed to load configuration");

// Custom configuration loading
let config = ConfigBuilder::new()
    .load_from_file("config/eventcore.toml")?
    .load_from_env()?
    .override_with_args(std::env::args())?
    .build()?;

// Validate configuration
config.validate()?;
}

This completes the configuration reference. All EventCore configuration options are documented with examples, default values, and tuning guidelines.

Next, explore Error Reference

Chapter 7.3: Error Reference

This chapter provides a comprehensive reference for all EventCore error types, error codes, and troubleshooting guidance. Use this reference to understand and resolve errors in your EventCore applications.

Error Categories

EventCore errors are organized into several categories based on their origin and nature:

  1. Command Errors - Errors during command execution
  2. Event Store Errors - Errors from event store operations
  3. Projection Errors - Errors in projection processing
  4. Validation Errors - Input validation failures
  5. Configuration Errors - Configuration and setup issues
  6. Network Errors - Network and connectivity issues
  7. Serialization Errors - Data serialization/deserialization issues

Command Errors

CommandError

The primary error type for command execution failures.

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum CommandError {
    #[error("Validation failed: {message}")]
    ValidationFailed { message: String },
    
    #[error("Business rule violation: {rule} - {message}")]
    BusinessRuleViolation { rule: String, message: String },
    
    #[error("Concurrency conflict on streams: {streams:?}")]
    ConcurrencyConflict { streams: Vec<StreamId> },
    
    #[error("Stream not found: {stream_id}")]
    StreamNotFound { stream_id: StreamId },
    
    #[error("Unauthorized: {permission}")]
    Unauthorized { permission: String },
    
    #[error("Timeout after {duration:?}")]
    Timeout { duration: Duration },
    
    #[error("Stream access denied: cannot write to {stream_id}")]
    StreamAccessDenied { stream_id: StreamId },
    
    #[error("Maximum discovery iterations exceeded: {iterations}")]
    MaxIterationsExceeded { iterations: usize },
    
    #[error("Event store error: {source}")]
    EventStoreError { 
        #[from]
        source: EventStoreError 
    },
    
    #[error("Serialization error: {message}")]
    SerializationError { message: String },
    
    #[error("Internal error: {message}")]
    InternalError { message: String },
}
}

Error Codes and Solutions

CE001: ValidationFailed

Error: Validation failed: StreamId cannot be empty
Code: CE001

Cause: Input validation failed during command construction or execution.

Solution:

  • Check input parameters for correct format and constraints
  • Ensure all required fields are provided
  • Verify string lengths and format requirements

CE002: BusinessRuleViolation

Error: Business rule violation: insufficient_balance - Account balance $100.00 is less than transfer amount $150.00
Code: CE002

Cause: Business logic constraints were violated.

Solution:

  • Review business rules and ensure command logic respects them
  • Check application state before executing commands
  • Implement proper validation in command handlers

CE003: ConcurrencyConflict

Error: Concurrency conflict on streams: ["account-123", "account-456"]
Code: CE003

Cause: Multiple commands attempted to modify the same streams simultaneously.

Solution:

  • Implement retry logic with exponential backoff
  • Consider command design to reduce conflicts
  • Use optimistic concurrency control patterns

CE004: StreamNotFound

Error: Stream not found: account-nonexistent
Code: CE004

Cause: Command attempted to read from a stream that doesn’t exist.

Solution:

  • Verify stream IDs are correct
  • Check if the resource exists before referencing it
  • Implement proper error handling for missing resources

CE005: Unauthorized

Error: Unauthorized: write_account_events
Code: CE005

Cause: Insufficient permissions to execute the command.

Solution:

  • Verify user authentication and authorization
  • Check role-based access control configuration
  • Ensure proper security context is set

CE006: Timeout

Error: Timeout after 30s
Code: CE006

Cause: Command execution exceeded the configured timeout.

Solution:

  • Check system performance and database connectivity
  • Increase timeout configuration if appropriate
  • Optimize command logic and database queries

CE007: StreamAccessDenied

Error: Stream access denied: cannot write to protected-stream-123
Code: CE007

Cause: Command attempted to write to a stream it didn’t declare.

Solution:

  • Add the stream to the command’s read_streams() method
  • Verify command design follows EventCore stream access patterns
  • Check for typos in stream ID generation
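
For example, with the #[derive(Command)] macro a stream becomes writable simply by declaring it as a #[stream] field, mirroring the Type Safety example later in this chapter. The audit_log field below is illustrative, not part of EventCore:

#[derive(Command)]
struct TransferMoney {
    #[stream]
    source_account: StreamId,

    #[stream]
    target_account: StreamId,

    // Previously undeclared: without this field, writes to the audit stream
    // fail with CE007 StreamAccessDenied.
    #[stream]
    audit_log: StreamId,

    amount: Money,
}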

CE008: MaxIterationsExceeded

Error: Maximum discovery iterations exceeded: 10
Code: CE008

Cause: Stream discovery loop exceeded the configured maximum iterations.

Solution:

  • Review command logic for potential infinite discovery loops
  • Increase max_discovery_iterations if legitimate
  • Optimize stream discovery patterns

Command Execution Flow Errors

These errors occur during specific phases of command execution:

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ExecutionPhaseError {
    #[error("Stream reading failed: {message}")]
    StreamReadError { message: String },
    
    #[error("State reconstruction failed: {message}")]
    StateReconstructionError { message: String },
    
    #[error("Command handling failed: {message}")]
    CommandHandlingError { message: String },
    
    #[error("Event writing failed: {message}")]
    EventWritingError { message: String },
    
    #[error("Stream discovery failed: {message}")]
    StreamDiscoveryError { message: String },
}
}

Event Store Errors

EventStoreError

Errors from event store operations.

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum EventStoreError {
    #[error("Version conflict: expected {expected:?}, got {actual}")]
    VersionConflict {
        expected: ExpectedVersion,
        actual: EventVersion,
    },
    
    #[error("Stream not found: {stream_id}")]
    StreamNotFound { stream_id: StreamId },
    
    #[error("Connection failed: {message}")]
    ConnectionFailed { message: String },
    
    #[error("Database error: {source}")]
    DatabaseError {
        #[from]
        source: sqlx::Error,
    },
    
    #[error("Serialization error: {message}")]
    SerializationError { message: String },
    
    #[error("Transaction failed: {message}")]
    TransactionError { message: String },
    
    #[error("Migration error: {message}")]
    MigrationError { message: String },
    
    #[error("Configuration error: {message}")]
    ConfigurationError { message: String },
    
    #[error("Timeout error: operation timed out after {duration:?}")]
    TimeoutError { duration: Duration },
    
    #[error("Connection pool exhausted")]
    ConnectionPoolExhausted,
    
    #[error("Invalid event data: {message}")]
    InvalidEventData { message: String },
}
}

Error Codes and Solutions

ES001: VersionConflict

Error: Version conflict: expected Exact(5), got 7
Code: ES001

Cause: Optimistic concurrency control detected concurrent modification.

Solution:

  • Implement retry logic in command execution
  • Consider command design to reduce conflicts
  • Use appropriate ExpectedVersion strategy

ES002: ConnectionFailed

Error: Connection failed: Failed to connect to database at postgresql://localhost/eventcore
Code: ES002

Cause: Unable to establish a database connection.

Solution:

  • Verify database is running and accessible
  • Check connection string configuration
  • Verify network connectivity and firewall rules

ES003: ConnectionPoolExhausted

Error: Connection pool exhausted
Code: ES003

Cause: All database connections in the pool are in use.

Solution:

  • Increase max_connections in pool configuration
  • Check for connection leaks in application code
  • Monitor connection usage patterns
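
If you construct the PostgreSQL pool yourself via sqlx, a larger pool and an explicit acquire timeout can be set as sketched below. The sizes and connection string are placeholders, and the EventCore PostgreSQL adapter may expose its own pool settings that should be preferred when available:

use std::time::Duration;
use sqlx::postgres::PgPoolOptions;

async fn build_pool() -> Result<sqlx::PgPool, sqlx::Error> {
    PgPoolOptions::new()
        .max_connections(20)                     // raise from the default if the pool is exhausted
        .acquire_timeout(Duration::from_secs(5)) // fail fast instead of queueing indefinitely
        .connect("postgresql://localhost/eventcore")
        .await
}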

ES004: TransactionError

Error: Transaction failed: serialization failure
Code: ES004

Cause: Database transaction could not be completed due to conflicts.

Solution:

  • Implement transaction retry logic
  • Review transaction isolation levels
  • Consider reducing transaction scope

ES005: MigrationError

Error: Migration error: Migration 20231201_001_create_events failed
Code: ES005

Cause: Database migration failed during startup.

Solution:

  • Check database permissions
  • Verify migration scripts are valid
  • Review database schema state

PostgreSQL-Specific Errors

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum PostgresError {
    #[error("Unique constraint violation: {constraint}")]
    UniqueConstraintViolation { constraint: String },
    
    #[error("Foreign key constraint violation: {constraint}")]
    ForeignKeyViolation { constraint: String },
    
    #[error("Check constraint violation: {constraint}")]
    CheckConstraintViolation { constraint: String },
    
    #[error("Deadlock detected: {message}")]
    DeadlockDetected { message: String },
    
    #[error("Query timeout: query exceeded {timeout:?}")]
    QueryTimeout { timeout: Duration },
    
    #[error("Connection limit exceeded")]
    ConnectionLimitExceeded,
}
}

Projection Errors

ProjectionError

Errors from projection operations.

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ProjectionError {
    #[error("Projection not found: {name}")]
    NotFound { name: String },
    
    #[error("Projection already exists: {name}")]
    AlreadyExists { name: String },
    
    #[error("Event processing failed: {message}")]
    ProcessingFailed { message: String },
    
    #[error("Checkpoint save failed: {message}")]
    CheckpointFailed { message: String },
    
    #[error("Rebuild failed: {message}")]
    RebuildFailed { message: String },
    
    #[error("Subscription error: {message}")]
    SubscriptionError { message: String },
    
    #[error("State corruption detected: {message}")]
    StateCorruption { message: String },
    
    #[error("Projection timeout: {projection} timed out after {duration:?}")]
    Timeout { projection: String, duration: Duration },
    
    #[error("Configuration error: {message}")]
    ConfigurationError { message: String },
}
}

Error Codes and Solutions

PR001: ProcessingFailed

Error: Event processing failed: Failed to apply UserCreated event
Code: PR001

Cause: Projection failed to process an event.

Solution:

  • Check projection logic for errors
  • Verify event format matches expectations
  • Implement proper error handling in projections

PR002: CheckpointFailed

Error: Checkpoint save failed: Database connection lost
Code: PR002

Cause: Unable to save the projection checkpoint.

Solution:

  • Check database connectivity
  • Verify checkpoint storage configuration
  • Implement checkpoint retry logic

PR003: RebuildFailed

Error: Rebuild failed: Out of memory during rebuild
Code: PR003

Cause: Projection rebuild encountered an error.

Solution:

  • Increase memory allocation for rebuild operations
  • Implement incremental rebuild strategies
  • Check for memory leaks in projection code

PR004: StateCorruption

Error: State corruption detected: Checksum mismatch
Code: PR004

Cause: Projection state integrity check failed.

Solution:

  • Rebuild projection from beginning
  • Investigate potential data corruption causes
  • Verify checkpoint storage integrity

Validation Errors

ValidationError

Input validation errors from the nutype validation system.

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ValidationError {
    #[error("Required field is empty: {field}")]
    Empty { field: String },
    
    #[error("Value too long: {field} length {length} exceeds maximum {max}")]
    TooLong { field: String, length: usize, max: usize },
    
    #[error("Value too short: {field} length {length} below minimum {min}")]
    TooShort { field: String, length: usize, min: usize },
    
    #[error("Invalid format: {field} does not match expected format")]
    InvalidFormat { field: String },
    
    #[error("Invalid range: {field} value {value} outside range [{min}, {max}]")]
    OutOfRange { field: String, value: String, min: String, max: String },
    
    #[error("Predicate failed: {field} failed validation rule")]
    PredicateFailed { field: String },
    
    #[error("Parse error: {field} could not be parsed - {message}")]
    ParseError { field: String, message: String },
}
}
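
As a sketch of where these errors come from, a nutype-validated domain type rejects bad input at construction time. The CustomerName type and its constraints below are illustrative rather than EventCore types; the try_new constructor and the error values are generated by nutype:

use nutype::nutype;

// Illustrative domain type: trimmed, non-empty, and at most 255 characters.
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 255),
    derive(Debug, Clone, PartialEq)
)]
pub struct CustomerName(String);

fn demo() {
    // Valid input parses into the domain type.
    assert!(CustomerName::try_new("Alice").is_ok());

    // Whitespace-only input trims to empty and is rejected (a VE001-style error).
    assert!(CustomerName::try_new("   ").is_err());
}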

Error Codes and Solutions

VE001: Empty

Error: Required field is empty: stream_id
Code: VE001

Cause: Required field was empty or contained only whitespace.

Solution:

  • Ensure all required fields have values
  • Check for null or empty string inputs
  • Verify string trimming behavior

VE002: TooLong

Error: Value too long: stream_id length 300 exceeds maximum 255
Code: VE002

Cause: Input value exceeded the maximum length constraint.

Solution:

  • Reduce input length to meet constraints
  • Consider using shorter identifiers
  • Review length requirements

VE003: InvalidFormat

Error: Invalid format: email does not match expected format
Code: VE003

Cause: Input value didn’t match the expected format pattern.

Solution:

  • Verify input format matches requirements
  • Check regular expression patterns
  • Validate input on client side before submission

Configuration Errors

ConfigError

Configuration and setup errors.

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ConfigError {
    #[error("Missing required configuration: {key}")]
    MissingRequired { key: String },
    
    #[error("Invalid configuration value: {key} = {value}")]
    InvalidValue { key: String, value: String },
    
    #[error("Configuration file not found: {path}")]
    FileNotFound { path: String },
    
    #[error("Configuration parse error: {message}")]
    ParseError { message: String },
    
    #[error("Environment variable error: {message}")]
    EnvironmentError { message: String },
    
    #[error("Validation error: {message}")]
    ValidationError { message: String },
}
}

Error Codes and Solutions

CF001: MissingRequired

Error: Missing required configuration: DATABASE_URL
Code: CF001

Cause: Required configuration parameter was not provided.

Solution:

  • Set missing environment variable or configuration value
  • Check configuration file completeness
  • Verify environment setup

CF002: InvalidValue

Error: Invalid configuration value: MAX_CONNECTIONS = -5
Code: CF002

Cause: Configuration value is invalid for the parameter type.

Solution:

  • Check value format and type requirements
  • Verify numeric ranges and constraints
  • Review configuration documentation

Network Errors

NetworkError

Network and connectivity related errors.

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum NetworkError {
    #[error("Connection timeout: {endpoint}")]
    ConnectionTimeout { endpoint: String },
    
    #[error("DNS resolution failed: {hostname}")]
    DnsResolutionFailed { hostname: String },
    
    #[error("TLS error: {message}")]
    TlsError { message: String },
    
    #[error("HTTP error: {status} - {message}")]
    HttpError { status: u16, message: String },
    
    #[error("Network unreachable: {endpoint}")]
    NetworkUnreachable { endpoint: String },
    
    #[error("Connection refused: {endpoint}")]
    ConnectionRefused { endpoint: String },
}
}

Serialization Errors

SerializationError

Data serialization and deserialization errors.

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum SerializationError {
    #[error("JSON serialization failed: {message}")]
    JsonSerializationFailed { message: String },
    
    #[error("JSON deserialization failed: {message}")]
    JsonDeserializationFailed { message: String },
    
    #[error("Invalid JSON format: {message}")]
    InvalidJsonFormat { message: String },
    
    #[error("Missing required field: {field}")]
    MissingField { field: String },
    
    #[error("Unknown field: {field}")]
    UnknownField { field: String },
    
    #[error("Type mismatch: expected {expected}, found {found}")]
    TypeMismatch { expected: String, found: String },
    
    #[error("Schema version mismatch: expected {expected}, found {found}")]
    SchemaVersionMismatch { expected: String, found: String },
}
}

Error Handling Patterns

Retry Strategies

EventCore provides different retry strategies for different error types:

#![allow(unused)]
fn main() {
// Automatic retry for transient errors
let result = match command_executor.execute(&command).await {
    Ok(result) => result,
    Err(CommandError::ConcurrencyConflict { .. }) => {
        // Retry with exponential backoff
        retry_with_backoff(|| command_executor.execute(&command)).await?
    },
    Err(CommandError::Timeout { .. }) => {
        // Retry with a longer timeout
        command_executor.execute_with_timeout(&command, increased_timeout).await?
    },
    Err(other) => return Err(other), // Don't retry business logic errors
};
}
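
The retry_with_backoff helper, execute_with_timeout method, and increased_timeout value above are illustrative placeholders rather than concrete EventCore APIs (the executor also ships configurable retry policies of its own). A minimal, generic sketch of such a backoff helper, assuming a tokio runtime, might look like this:

use std::future::Future;
use std::time::Duration;

// Generic sketch: retries a fallible async operation with exponential backoff.
// `op` builds a fresh future for every attempt.
async fn retry_with_backoff<T, E, F, Fut>(mut op: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    let max_attempts = 5;
    let mut delay = Duration::from_millis(50);

    for attempt in 1..=max_attempts {
        match op().await {
            Ok(value) => return Ok(value),
            Err(err) if attempt == max_attempts => return Err(err),
            Err(_) => {
                tokio::time::sleep(delay).await;
                delay *= 2; // double the delay before the next attempt
            }
        }
    }
    unreachable!("the final attempt either succeeds or returns its error")
}

Only retry errors that are actually transient; business rule violations will fail again no matter how often they are retried.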

Error Conversion

Common error conversion patterns:

#![allow(unused)]
fn main() {
// Convert EventStore errors to Command errors
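// Note: the `#[from] source: EventStoreError` variant on CommandError shown earlier
// already derives `impl From<EventStoreError> for CommandError` via thiserror, so a
// manual impl like the one below conflicts with it. Pick one approach: either drop
// the `#[from]` attribute or rely on the derived conversion instead of this impl.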
impl From<EventStoreError> for CommandError {
    fn from(err: EventStoreError) -> Self {
        match err {
            EventStoreError::VersionConflict { .. } => {
                CommandError::ConcurrencyConflict { streams: vec![] }
            },
            EventStoreError::StreamNotFound { stream_id } => {
                CommandError::StreamNotFound { stream_id }
            },
            other => CommandError::EventStoreError { source: other },
        }
    }
}
}

Error Context

Adding context to errors for better debugging:

#![allow(unused)]
fn main() {
use anyhow::{Context, Result};

async fn execute_command<C: Command>(command: &C) -> Result<ExecutionResult> {
    command_executor
        .execute(command)
        .await
        .with_context(|| format!("Failed to execute command: {}", std::any::type_name::<C>()))
        .with_context(|| "Command execution failed in main handler")
}
}

Troubleshooting Guide

Quick Reference

Performance Issues:

  1. Check CE006: Timeout errors → Review system performance
  2. Check ES003: ConnectionPoolExhausted → Increase pool size or fix leaks
  3. Check PR003: RebuildFailed → Optimize memory usage

Data Issues:

  1. Check CE003: ConcurrencyConflict → Implement retry logic
  2. Check ES001: VersionConflict → Review optimistic concurrency
  3. Check PR004: StateCorruption → Rebuild projections

Configuration Issues:

  1. Check CF001: MissingRequired → Set required configuration
  2. Check ES002: ConnectionFailed → Verify database connectivity
  3. Check CF002: InvalidValue → Review configuration values

Security Issues:

  1. Check CE005: Unauthorized → Verify permissions
  2. Check CE007: StreamAccessDenied → Fix stream access patterns

Diagnostic Commands

# Check EventCore health
eventcore-cli health-check

# Validate configuration
eventcore-cli config validate

# Test database connectivity
eventcore-cli database ping

# Check projection status
eventcore-cli projections status

# Verify stream access
eventcore-cli commands validate <command-type>

Log Analysis

Common log patterns to look for:

# High error rates
grep "ERROR" logs/eventcore.log | grep -c "CommandError"

# Concurrency conflicts
grep "ConcurrencyConflict" logs/eventcore.log | tail -10

# Performance issues
grep "Timeout\|slow query" logs/eventcore.log

# Connection issues
grep "ConnectionFailed\|ConnectionPoolExhausted" logs/eventcore.log

Error Prevention

Best Practices

  1. Input Validation: Use type-safe domain types with validation
  2. Error Handling: Implement comprehensive error handling strategies
  3. Monitoring: Set up alerts for error rate thresholds
  4. Testing: Include error scenarios in integration tests
  5. Documentation: Document expected error conditions

Type Safety

EventCore’s type system prevents many error categories:

#![allow(unused)]
fn main() {
// Good: Type-safe stream access
#[derive(Command)]
struct TransferMoney {
    #[stream]
    source_account: StreamId,  // Guaranteed valid
    
    #[stream] 
    target_account: StreamId,  // Guaranteed valid
    
    amount: Money,  // Guaranteed valid currency/amount
}

// Prevents: CE007 StreamAccessDenied, VE001-VE003 validation errors
}

This completes the error reference documentation. Use this guide to understand, diagnose, and resolve EventCore errors effectively.

Next, explore the Glossary

Chapter 7.4: Glossary

This glossary defines all terms and concepts used throughout EventCore documentation. Use this as a reference to understand EventCore terminology and concepts.

Core Concepts

Aggregate

In traditional event sourcing, an aggregate is a cluster of domain objects that can be treated as a single unit. EventCore eliminates traditional aggregates in favor of dynamic consistency boundaries defined by commands.

Command

A request to change the state of the system by writing events to one or more streams. Commands in EventCore can read from and write to multiple streams atomically, defining their own consistency boundaries.

Command Executor

The component responsible for executing commands against the event store. It handles stream reading, state reconstruction, command logic execution, and event writing.

Consistency Boundary

The scope within which ACID properties are maintained. In EventCore, each command defines its own consistency boundary by specifying which streams it needs to read from and write to.

CQRS (Command Query Responsibility Segregation)

An architectural pattern that separates read and write operations. EventCore naturally supports CQRS through its command system (writes) and projection system (reads).

Dynamic Consistency Boundaries

EventCore’s approach where consistency boundaries are determined at runtime by individual commands, rather than being fixed by aggregate design.

Event

An immutable fact that represents something that happened in the system. Events are stored in streams and contain a payload, metadata, and system-generated identifiers.

Event Sourcing

A data storage pattern where the state of entities is derived from a sequence of events, rather than storing current state directly.

Event Store

The database or storage system that persists events in streams. EventCore provides abstractions for different event store implementations.

Event Stream

See Stream.

Multi-Stream Event Sourcing

EventCore’s approach where a single command can atomically read from and write to multiple event streams, enabling complex business operations across multiple entities.

Projection

A read model built by processing events from one or more streams. Projections transform event data into formats optimized for querying.

Stream

A sequence of events identified by a unique StreamId. Streams represent the event history for a particular entity or concept.

EventCore Specific Terms

CommandStreams Trait

A trait that defines which streams a command needs to read from. Typically implemented automatically by the #[derive(Command)] macro.

CommandLogic Trait

A trait containing the domain logic for command execution. Separates business logic from infrastructure concerns.

EventId

A UUIDv7 identifier for events that provides both uniqueness and chronological ordering across the entire event store.

EventVersion

A monotonically increasing number representing the position of an event within its stream, starting from 0.

ExecutionResult

The result of executing a command, containing information about events written, affected streams, and execution metadata.

ReadStreams

A type-safe container providing access to stream data during command execution. Prevents commands from accessing streams they didn’t declare.

StreamData

The collection of events from a single stream, along with metadata like the current version.

StreamId

A validated identifier for event streams. Must be non-empty and at most 255 characters.

StreamResolver

A component that allows commands to dynamically request additional streams during execution.

StreamWrite

A type-safe wrapper for writing events to streams that enforces stream access control.

TypeState Pattern

A compile-time safety pattern used in EventCore’s execution engine to prevent race conditions and ensure proper execution flow.

Architecture Terms

Functional Core, Imperative Shell

An architectural pattern where pure business logic (functional core) is separated from side effects and I/O operations (imperative shell).

Phantom Types

Types that exist only at compile time to provide additional type safety. EventCore uses phantom types to track stream access permissions.

Smart Constructor

A constructor function that validates input and returns a Result type, ensuring that successfully constructed values are always valid.
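
A minimal illustration (not an EventCore type): validation happens once in the constructor, so every value of the type is known to be valid afterwards.

// Illustrative smart constructor for a non-negative amount in cents.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AmountCents(i64);

impl AmountCents {
    pub fn new(cents: i64) -> Result<Self, String> {
        if cents < 0 {
            Err(format!("amount must be non-negative, got {cents}"))
        } else {
            Ok(Self(cents))
        }
    }

    pub fn get(self) -> i64 {
        self.0
    }
}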

Type-Driven Development

A development approach where types are designed first to make illegal states unrepresentable, followed by implementation guided by the type system.

Event Store Terms

Checkpoint

A saved position in an event stream indicating how far a projection has processed events.

Expected Version

A constraint used for optimistic concurrency control when writing events to a stream.

Optimistic Concurrency Control

A concurrency control method that assumes conflicts are rare and checks for conflicts only when committing changes.

Position

The global ordering position of an event across all streams in the event store.

Snapshot

A saved state of an entity at a particular point in time, used to optimize event replay performance.

Subscription

A mechanism for receiving real-time notifications of new events as they’re written to streams.

WAL (Write-Ahead Log)

A logging mechanism where changes are written to a log before being applied to the main database.

Patterns and Techniques

Circuit Breaker

A pattern that prevents cascading failures by temporarily disabling operations that are likely to fail.

Dead Letter Queue

A queue that stores messages that could not be processed successfully, allowing for later analysis and reprocessing.

Event Envelope

A wrapper around event data that includes metadata like event type, version, and timestamps.

Event Upcasting

The process of transforming old event formats to new formats when event schemas evolve.

Idempotency

The property where performing an operation multiple times has the same effect as performing it once.

Process Manager

A component that coordinates long-running business processes by reacting to events and issuing commands.

Railway-Oriented Programming

A functional programming pattern for chaining operations that might fail, using Result types to handle errors gracefully.

Saga

A pattern for managing complex business transactions that span multiple services or aggregates.

Temporal Coupling

A coupling between components based on timing, which EventCore helps avoid through its event-driven architecture.

Database and Storage Terms

ACID Properties

Atomicity (all or nothing), Consistency (valid state), Isolation (concurrent safety), Durability (persistent storage).

Connection Pool

A cache of database connections that can be reused across multiple requests to improve performance.

Connection Pooling

The practice of maintaining a pool of reusable database connections.

Index

A database structure that improves query performance by creating ordered access paths to data.

Materialized View

A database object that contains the results of a query, physically stored and periodically refreshed.

PostgreSQL

The primary database system supported by EventCore for production event storage.

Transaction

A unit of work that is either completed entirely or not at all, maintaining database consistency.

UUIDv7

A UUID variant that includes a timestamp component, providing both uniqueness and chronological ordering.
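
For illustration, the uuid crate can generate these identifiers when its v7 feature is enabled; because the leading 48 bits hold a millisecond Unix timestamp, IDs created in later milliseconds always sort after earlier ones.

use uuid::Uuid;

fn main() {
    // Requires the `v7` feature of the `uuid` crate.
    let first = Uuid::now_v7();
    let second = Uuid::now_v7();

    // Later-millisecond IDs compare greater, giving chronological ordering.
    println!("{first}\n{second}");
}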

Monitoring and Operations Terms

Alert

A notification triggered when monitored metrics exceed predefined thresholds.

Dashboard

A visual display of key metrics and system status information.

Health Check

An endpoint or service that reports the operational status of a system component.

Metrics

Quantitative measurements of system behavior and performance.

Observability

The ability to understand the internal state of a system based on its external outputs.

SLI (Service Level Indicator)

A metric that measures the performance of a service.

SLO (Service Level Objective)

A target value or range for an SLI.

Telemetry

The automated collection and transmission of data from remote sources.

Tracing

The practice of tracking requests through distributed systems to understand performance and behavior.

Security Terms

Authentication

The process of verifying the identity of a user or system.

Authorization

The process of determining what actions an authenticated user is allowed to perform.

JWT (JSON Web Token)

A standard for securely transmitting information between parties as a JSON object.

RBAC (Role-Based Access Control)

An access control method where permissions are associated with roles, and users are assigned roles.

TLS (Transport Layer Security)

A cryptographic protocol for securing communications over a network.

Development Terms

Cargo

Rust’s package manager and build system.

CI/CD (Continuous Integration/Continuous Deployment)

Practices for automating the integration, testing, and deployment of code changes.

Integration Test

A test that verifies the interaction between multiple components or systems.

Mock

A test double that simulates the behavior of real objects in controlled ways.

Property-Based Testing

A testing approach that verifies system properties hold for a wide range of generated inputs.

Regression Test

A test that ensures previously working functionality continues to work after changes.

Unit Test

A test that verifies the behavior of individual components in isolation.

Error Handling Terms

Backoff

A delay mechanism that increases wait time between retry attempts.

Circuit Breaker

See the Patterns and Techniques section.

Error Boundary

A component that catches and handles errors from child components.

Exponential Backoff

A backoff strategy where delays increase exponentially with each retry attempt.

Failure Mode

A specific way in which a system can fail.

Graceful Degradation

The ability of a system to continue operating with reduced functionality when components fail.

Retry Logic

Code that automatically retries failed operations with appropriate delays and limits.

Timeout

A limit on how long an operation is allowed to run before being considered failed.

Configuration Terms

Environment Variable

A value set in the operating system environment that can be read by applications.

Configuration File

A file containing settings and parameters for application behavior.

Secret

Sensitive configuration data like passwords or API keys that must be protected.

TOML

A configuration file format that is easy to read and write.

YAML

A human-readable data serialization standard often used for configuration files.

Performance Terms

Benchmark

A test that measures system performance under specific conditions.

Bottleneck

The component or operation that limits overall system performance.

Latency

The time it takes for a single operation to complete.

Load Test

A test that simulates expected system load to verify performance characteristics.

Profiling

The process of analyzing system performance to identify optimization opportunities.

Scalability

The ability of a system to handle increased load by adding resources.

Throughput

The number of operations a system can handle per unit of time.

Data Terms

Immutable

Data that cannot be changed after creation.

Normalization

The process of organizing data to reduce redundancy and improve integrity.

Payload

The actual data content of an event, excluding metadata.

Schema

The structure and constraints that define how data is organized.

Serialization

The process of converting data structures into a format that can be stored or transmitted.

Validation

The process of checking that data meets specified requirements and constraints.

Rust-Specific Terms

Async/Await

Rust’s asynchronous programming model for non-blocking operations.

Borrow Checker

Rust’s compile-time mechanism that ensures memory safety without garbage collection.

Cargo.toml

The manifest file for Rust projects that specifies dependencies and metadata.

Crate

A compilation unit in Rust; equivalent to a library or package in other languages.

Derive Macro

A Rust macro that automatically generates implementations of traits for types.

Lifetime

A construct in Rust that tracks how long references are valid.

Option

Rust’s type for representing optional values, similar to nullable types in other languages.

Result

Rust’s type for representing operations that might fail, containing either a success value or an error.

Trait

Rust’s mechanism for defining shared behavior that types can implement.

Ownership

Rust’s system for managing memory through compile-time tracking of resource ownership.

Acronyms and Abbreviations

API - Application Programming Interface
CI - Continuous Integration
CLI - Command Line Interface
CRUD - Create, Read, Update, Delete
CQRS - Command Query Responsibility Segregation
DDD - Domain-Driven Design
DNS - Domain Name System
HTTP - Hypertext Transfer Protocol
HTTPS - HTTP Secure
I/O - Input/Output
JSON - JavaScript Object Notation
JWT - JSON Web Token
ORM - Object-Relational Mapping
REST - Representational State Transfer
SQL - Structured Query Language
SSL - Secure Sockets Layer
TDD - Test-Driven Development
TLS - Transport Layer Security
UUID - Universally Unique Identifier
XML - eXtensible Markup Language

EventCore Command Reference

Common EventCore CLI commands and their purposes:

eventcore-cli

The main command-line interface for EventCore operations.

health-check

Verify system health and connectivity.

migrate

Run database migrations.

config validate

Validate configuration files and settings.

projections status

Check the status of all projections.

projections rebuild

Rebuild projections from event history.

streams list

List available event streams.

events export

Export events for backup or analysis.

Common Patterns

Builder Pattern

A creational pattern for constructing complex objects step by step.

Factory Pattern

A creational pattern for creating objects without specifying their exact classes.

Observer Pattern

A behavioral pattern where objects notify observers of state changes.

Repository Pattern

A design pattern that encapsulates data access logic.

Strategy Pattern

A behavioral pattern that enables selecting algorithms at runtime.

Best Practices

Fail Fast

The practice of detecting and reporting errors as early as possible.

Immutable Infrastructure

Infrastructure that is never modified after deployment, only replaced.

Least Privilege

The security principle of granting minimum necessary permissions.

Separation of Concerns

The principle of dividing software into distinct sections with specific responsibilities.

Single Responsibility Principle

Each class or module should have only one reason to change.

This glossary provides comprehensive coverage of EventCore terminology and related concepts. Use it as a reference when working with EventCore or reading the documentation.

That completes Part 7: Reference documentation for EventCore!

Banking Example

The banking example demonstrates EventCore’s multi-stream atomic operations by implementing a double-entry bookkeeping system.

Key Features

  • Atomic Transfers: Move money between accounts with ACID guarantees
  • Balance Validation: Prevent overdrafts with compile-time safe types
  • Audit Trail: Complete history of all transactions
  • Account Lifecycle: Open, close, and freeze accounts

Running the Example

cargo run --example banking

Code Structure

The example includes:

  • types.rs - Domain types with validation (AccountId, Money, etc.)
  • events.rs - Account events (Opened, Deposited, Withdrawn, etc.)
  • commands.rs - Business operations (OpenAccount, Transfer, etc.)
  • projections.rs - Read models for account balances and history

View Source Code

E-Commerce Example

The e-commerce example shows how to build a complete order processing system with inventory management using EventCore.

Key Features

  • Order Processing: Multi-step order workflow with validation
  • Inventory Management: Real-time stock tracking across warehouses
  • Dynamic Pricing: Apply discounts and calculate totals
  • Multi-Stream Operations: Coordinate between orders, inventory, and customers

Running the Example

cargo run --example ecommerce

Code Walkthrough

The example demonstrates:

  • Complex state machines for order lifecycle
  • Compensation patterns for failed operations
  • Projection-based inventory queries
  • Integration with external payment systems

View Source Code

Sagas Example

The saga example implements distributed transaction patterns using EventCore’s multi-stream capabilities.

What are Sagas?

Sagas are a pattern for managing long-running business processes that span multiple bounded contexts or services. EventCore makes implementing sagas straightforward with its multi-stream atomicity.

Example Scenario

This example implements a travel booking saga that coordinates:

  • Flight reservation
  • Hotel booking
  • Car rental
  • Payment processing

Each step can fail, triggering compensating actions to maintain consistency.

Running the Example

cargo run --example sagas

Implementation Details

  • Orchestration: Central saga coordinator manages the workflow
  • Compensation: Automatic rollback on failures
  • Idempotency: Safe retries with exactly-once semantics
  • Monitoring: Built-in observability for saga progress

View Source Code

Contributing to EventCore

Thank you for your interest in contributing to EventCore! We welcome contributions from the community.

Code of Conduct

Contributor Covenant Code of Conduct

Our Pledge

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.

Our Standards

Examples of behavior that contributes to a positive environment for our community include:

  • Demonstrating empathy and kindness toward other people
  • Being respectful of differing opinions, viewpoints, and experiences
  • Giving and gracefully accepting constructive feedback
  • Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
  • Focusing on what is best not just for us as individuals, but for the overall community

Examples of unacceptable behavior include:

  • The use of sexualized language or imagery, and sexual attention or advances of any kind
  • Trolling, insulting or derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others’ private information, such as a physical or email address, without their explicit permission
  • Other conduct which could reasonably be considered inappropriate in a professional setting

Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.

Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.

Scope

This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.

Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at john@johnwilger.com. All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the reporter of any incident.

Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:

1. Correction

Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.

Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.

2. Warning

Community Impact: A violation through a single incident or series of actions.

Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.

3. Temporary Ban

Community Impact: A serious violation of community standards, including sustained inappropriate behavior.

Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.

4. Permanent Ban

Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.

Consequence: A permanent ban from any sort of public interaction within the community.

Attribution

This Code of Conduct is adapted from the Contributor Covenant, version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.

For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.

Changelog

All notable changes to the EventCore project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

  • Comprehensive interactive documentation tutorials
  • Enhanced error diagnostics with miette integration
  • Fluent CommandExecutorBuilder API for configuration
  • Command definition macros for cleaner code
  • Multi-stream event sourcing with dynamic consistency boundaries
  • Type-safe command system with compile-time stream access control
  • Flexible command-controlled dynamic stream discovery
  • PostgreSQL adapter with full type safety
  • In-memory event store adapter for testing
  • Comprehensive benchmarking suite
  • Complete examples for banking, e-commerce, and saga pattern domains
  • Property-based testing throughout the codebase
  • Extensive monitoring and observability features
  • Projection system with checkpointing and recovery
  • Event serialization with schema evolution support
  • Command retry mechanisms with configurable policies
  • Developer experience improvements with macros
  • Complete CI/CD pipeline with PostgreSQL integration

Changed

  • Replaced aggregate-per-command terminology with multi-stream event sourcing
  • Made PostgreSQL adapter generic over event type for better type safety
  • Updated Command trait to include StreamResolver for flexible stream discovery
  • Enhanced concurrency control to check all read stream versions
  • Improved CI configuration with PostgreSQL services and coverage optimization

Fixed

  • PostgreSQL schema initialization concurrency issues in CI
  • All pre-commit hook failures across the codebase
  • CI workflow syntax errors and configuration issues
  • Test isolation with unique stream IDs and database cleanup
  • Race conditions in concurrent command execution

Security

  • Forbid unsafe code throughout the workspace
  • Comprehensive security audit integration in CI
  • Protection against dependency vulnerabilities

[0.1.3] - 2025-07-07

Fixed

  • PostgreSQL test configuration (missing TEST_DATABASE_URL) in PR #31
  • Documentation sync script symlink issue in PR #31
  • Cargo.toml version specifications for crates.io publishing in PR #33
  • CSS directory creation for documentation builds in PR #33
  • Workspace dependency syntax errors (rand.workspace = true to rand = { workspace = true })
  • Version conflicts preventing re-release after partial crates.io publishing
  • Circular dependency in eventcore-macros preventing crates.io release
  • Publishing order to resolve dev-dependency circular dependencies

Changed

  • Implemented workspace dependencies for internal crates to enable automatic lockstep versioning
  • Updated publishing order to macros → memory → postgres → eventcore
  • Added PR template compliance rules to CLAUDE.md
  • Improved PR validation workflow with debouncing and comment deduplication
  • Replaced PR validation workflow with Definition of Done bot
  • Removed version numbers from internal workspace dependencies for cleaner dependency management

Added

  • PR template compliance enforcement in development workflow
  • Definition of Done bot configuration for automatic PR checklists
  • Critical rule #4 to CLAUDE.md: Always stop and ask for help rather than taking shortcuts

[0.1.2] - 2025-07-05

Fixed

  • Rand crate v0.9.1 deprecation errors:
    • Updated thread_rng() to rng() across codebase
    • Updated gen() to random() and gen_range() to random_range()
    • Fixed ThreadRng Send issue in stress tests
  • OpenTelemetry v0.30.0 API breaking changes:
    • Updated Resource::new() to Resource::builder() pattern
    • Removed unnecessary runtime parameter from PeriodicReader::builder()
    • Added required grpc-tonic feature to opentelemetry-otlp dependency
  • Bincode v2.0.1 API breaking changes:
    • Updated to use bincode::serde::encode_to_vec() and bincode::serde::decode_from_slice()
    • Added “serde” feature to bincode dependency
    • Replaced deprecated bincode::serialize() and bincode::deserialize() functions

Changed

  • Updated actions/configure-pages from v4 to v5 (PR #3)
  • Updated codecov/codecov-action from v3 to v5 (PR #4)

[0.1.1] - 2025-07-05

Added

  • Modern documentation website with mdBook
    • GitHub Pages deployment workflow
    • Custom EventCore branding and responsive design
    • Automated documentation synchronization from markdown sources
    • Deployment on releases with version information
  • Comprehensive security infrastructure:
    • SECURITY.md with vulnerability reporting via GitHub Security Advisories
    • Improved cargo-audit CI job using rustsec/audit-check action
    • Dependabot configuration for automated dependency updates
    • CONTRIBUTING.md with GPG signing documentation
    • Security guide in user manual covering authentication, encryption, validation, and compliance
    • COMPLIANCE_CHECKLIST.md mapping to OWASP/NIST/SOC2/PCI/GDPR/HIPAA
    • Pull request template with security and performance review checklists
  • GitHub Copilot instructions for automated PR reviews
  • Pre-commit hook improvements:
    • Added doctests to pre-commit hooks
    • Auto-format and stage files instead of failing
  • GitHub MCP server integration for all GitHub operations

Fixed

  • Outdated Command trait references (now CommandLogic) in documentation
  • Broken documentation links in README.md
  • License information to reflect MIT-only licensing
  • Doctest compilation error in resource.rs

Changed

  • Reorganized documentation structure (renumbered operations to 07, reference to 08)
  • Consolidated documentation to single source (symlinked docs/manual to website/src/manual)
  • Updated PR template to remove redundant pre-merge checklist and add Review Focus section
  • Enhanced CLAUDE.md with GitHub MCP integration and PR-based workflow documentation
  • Simplified PR template by consolidating multiple checklists into single Submitter Checklist

[0.1.0] - Initial Release

Added

  • Core Event Sourcing Foundation

    • StreamId, EventId, EventVersion types with validation
    • Command trait system with type-safe execution
    • Event store abstraction with pluggable backends
    • Multi-stream atomicity with optimistic concurrency control
    • Event metadata tracking (causation, correlation, user)
  • Type-Driven Development

    • Extensive use of nutype for domain type validation
    • Smart constructors that make illegal states unrepresentable
    • Result types for all fallible operations
    • Property-based testing with proptest
  • PostgreSQL Adapter (eventcore-postgres)

    • Full PostgreSQL event store implementation
    • Database schema migrations
    • Transaction-based multi-stream writes
    • Optimistic concurrency control with version checking
    • Connection pooling and error mapping
  • In-Memory Adapter (eventcore-memory)

    • Fast in-memory event store for testing
    • Thread-safe storage with Arc
    • Complete EventStore trait implementation
    • Version tracking per stream
  • Command System

    • Type-safe command execution
    • Automatic state reconstruction from events
    • Multi-stream read/write operations
    • Retry mechanisms with exponential backoff
    • Command context and metadata support
  • Projection System

    • Projection trait for building read models
    • Checkpoint management for resume capability
    • Projection manager with lifecycle control
    • Event subscription and processing
    • Error recovery and retry logic
  • Monitoring & Observability

    • Metrics collection (counters, gauges, timers)
    • Health checks for event store and projections
    • Structured logging with tracing integration
    • Performance monitoring and alerts
  • Serialization & Persistence

    • JSON event serialization with schema evolution
    • Type registry for dynamic deserialization
    • Unknown event type handling
    • Migration chain support
  • Developer Experience

    • Comprehensive test utilities and fixtures
    • Property test generators for all domain types
    • Event and command builders
    • Assertion helpers for testing
    • Test harness for end-to-end scenarios
  • Macro System (eventcore-macros)

    • #[derive(Command)] procedural macro
    • Automatic stream field detection
    • Type-safe StreamSet generation
    • Declarative command! macro
  • Examples (eventcore-examples)

    • Banking domain with money transfers
    • E-commerce domain with order management
    • Order fulfillment saga with distributed transaction coordination
    • Complete integration tests
    • Usage patterns and best practices
  • Benchmarks (eventcore-benchmarks)

    • Command execution performance tests
    • Event store read/write benchmarks
    • Projection processing benchmarks
    • Memory allocation profiling
  • Documentation

    • Comprehensive rustdoc for all public APIs
    • Interactive tutorials for common scenarios
    • Usage examples in documentation
    • Migration guides and best practices

Performance

  • Target: 5,000-10,000 single-stream commands/sec
  • Target: 2,000-5,000 multi-stream commands/sec
  • Target: 20,000+ events/sec (batched writes)
  • Target: P95 command latency < 10ms

Breaking Changes

  • N/A (initial release)

Migration Guide

  • N/A (initial release)

Dependencies

  • Rust: Minimum supported version 1.70.0
  • PostgreSQL: Version 15+ (for PostgreSQL adapter)
  • Key Dependencies:
    • tokio 1.45+ for async runtime
    • sqlx 0.8+ for PostgreSQL integration
    • uuid 1.17+ with v7 support for event ordering
    • serde 1.0+ for serialization
    • nutype 0.6+ for type safety
    • miette 7.6+ for enhanced error diagnostics
    • proptest 1.7+ for property-based testing

Architecture Highlights

  • Multi-Stream Event Sourcing: Commands define their own consistency boundaries
  • Type-Driven Development: Leverage Rust’s type system for domain modeling
  • Functional Core, Imperative Shell: Pure business logic with side effects at boundaries
  • Parse, Don’t Validate: Transform unstructured data at system boundaries only
  • Railway-Oriented Programming: Chain operations using Result types

Versioning Strategy

EventCore follows Semantic Versioning with the following guidelines:

Major Version (X.0.0)

  • Breaking changes to public APIs
  • Changes to the Command trait signature
  • Database schema changes requiring migration
  • Changes to serialization format requiring migration

Minor Version (0.X.0)

  • New features and capabilities
  • New optional methods on traits
  • New crates in the workspace
  • Performance improvements
  • New configuration options

Patch Version (0.0.X)

  • Bug fixes
  • Documentation improvements
  • Dependency updates (compatible versions)
  • Internal refactoring without API changes

Workspace Versioning

All crates in the EventCore workspace share the same version number to ensure compatibility:

  • eventcore (core library)
  • eventcore-postgres (PostgreSQL adapter)
  • eventcore-memory (in-memory adapter)
  • eventcore-examples (example implementations)
  • eventcore-benchmarks (performance benchmarks)
  • eventcore-macros (procedural macros)

Pre-release Versions

  • Alpha: X.Y.Z-alpha.N - Early development, APIs may change
  • Beta: X.Y.Z-beta.N - Feature complete, testing phase
  • RC: X.Y.Z-rc.N - Release candidate, final testing

Compatibility Promise

  • Patch versions: Fully compatible, safe to upgrade
  • Minor versions: Backward compatible, safe to upgrade
  • Major versions: May contain breaking changes, migration guide provided

Contributing

When contributing to EventCore:

  1. Follow the conventional commits format
  2. Update this CHANGELOG.md with your changes
  3. Ensure all tests pass and coverage remains high
  4. Update documentation for any API changes
  5. Add property-based tests for new functionality

Commit Message Format

type(scope): description

[optional body]

[optional footer]

Types: feat, fix, docs, style, refactor, test, chore Scopes: core, postgres, memory, examples, macros, benchmarks

MIT License

Copyright (c) 2025 John Wilger

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.