Why EventCore?
Traditional event sourcing forces you to choose aggregate boundaries upfront, leading to complex workarounds when business logic spans multiple aggregates. EventCore eliminates this constraint with dynamic consistency boundaries - each command defines exactly which streams it needs, enabling atomic operations across multiple event streams.
🚀 Key Features
🔄 Multi-Stream Atomicity
Read from and write to multiple event streams in a single atomic operation. No more saga patterns for simple cross-aggregate operations.
🎯 Type-Safe Commands
Leverage Rust’s type system to ensure compile-time correctness. Illegal states are unrepresentable.
⚡ High Performance
Optimized for both in-memory and PostgreSQL backends with sophisticated caching and batching strategies.
🔍 Built-in CQRS
First-class support for projections and read models with automatic position tracking and replay capabilities.
🛡️ Production Ready
Battle-tested with comprehensive observability, monitoring, and error recovery mechanisms.
🧪 Testing First
Extensive testing utilities including property-based tests, chaos testing, and deterministic event stores.
Quick Example
```rust
use eventcore::prelude::*;

#[derive(Command)]
#[command(event = "BankingEvent")]
struct TransferMoney {
    from_account: AccountId,
    to_account: AccountId,
    amount: Money,
}

impl TransferMoney {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![
            self.from_account.stream_id(),
            self.to_account.stream_id(),
        ]
    }
}

#[async_trait]
impl CommandLogic for TransferMoney {
    type State = BankingState;
    type Event = BankingEvent;

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Validate business rules
        require!(
            state.balance(&self.from_account) >= self.amount,
            "Insufficient funds"
        );

        // Emit events - atomically written to both streams
        Ok(vec![
            emit!(
                self.from_account.stream_id(),
                BankingEvent::Withdrawn { amount: self.amount }
            ),
            emit!(
                self.to_account.stream_id(),
                BankingEvent::Deposited { amount: self.amount }
            ),
        ])
    }
}
```
Getting Started
Use Cases
EventCore excels in domains where business operations naturally span multiple entities:
- 💰 Financial Systems: Atomic transfers, double-entry bookkeeping, complex trading operations
- 🛒 E-Commerce: Order fulfillment, inventory management, distributed transactions
- 🏢 Enterprise Applications: Workflow engines, approval processes, resource allocation
- 🎮 Gaming: Player interactions, economy systems, real-time state synchronization
- 📊 Analytics Platforms: Event-driven architectures, audit trails, temporal queries
Performance
Community
Join our growing community of developers building event-sourced systems:
Resources
Supported By
EventCore is an open-source project supported by the community.
Become a Sponsor
Quick Start Guide
Get up and running with EventCore in 15 minutes!
Installation
Add EventCore to your Cargo.toml:
```toml
[dependencies]
eventcore = "0.1"
eventcore-postgres = "0.1" # For PostgreSQL backend
# OR
eventcore-memory = "0.1"   # For in-memory backend

# Required dependencies
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
async-trait = "0.1"
```
Your First Event-Sourced Application
Let’s build a simple task management system to demonstrate EventCore’s key concepts.
1. Define Your Domain Types
```rust
use eventcore::prelude::*;
use serde::{Deserialize, Serialize};

// Domain types with compile-time validation.
// Note: nutype expects its derives inside the #[nutype(...)] attribute.
// Eq + Hash let TaskId be used as a HashMap key later.
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 50),
    derive(Debug, Clone, PartialEq, Eq, Hash, Serialize, Deserialize, AsRef)
)]
pub struct TaskId(String);

#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 200),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct TaskTitle(String);

// Events that represent state changes
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum TaskEvent {
    Created { title: TaskTitle },
    Completed,
    Reopened,
}
```
2. Create Your First Command
```rust
#[derive(Clone, Command)]
#[command(event = "TaskEvent")]
pub struct CreateTask {
    pub task_id: TaskId,
    pub title: TaskTitle,
}

impl CreateTask {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![StreamId::from(self.task_id.as_ref())]
    }
}

#[async_trait]
impl CommandLogic for CreateTask {
    type State = Option<TaskState>;
    type Event = TaskEvent;

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Ensure task doesn't already exist
        require!(state.is_none(), "Task already exists");

        // Emit the event
        Ok(vec![emit!(
            StreamId::from(self.task_id.as_ref()),
            TaskEvent::Created { title: self.title.clone() }
        )])
    }
}
```
3. Define State and Event Application
```rust
// No Default derive: TaskTitle is a validated type with no default value,
// and the command state (Option<TaskState>) already defaults to None.
#[derive(Debug, Clone)]
pub struct TaskState {
    pub title: TaskTitle,
    pub completed: bool,
}

// `apply` is a method of CommandLogic, so it goes inside the
// `impl CommandLogic for CreateTask` block from step 2:
#[async_trait]
impl CommandLogic for CreateTask {
    // ... type State / type Event / handle as in step 2 ...

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<TaskEvent>) {
        if let Some(task_state) = state {
            match &event.event {
                TaskEvent::Created { title } => {
                    // This shouldn't happen with proper command validation
                    *task_state = TaskState {
                        title: title.clone(),
                        completed: false,
                    };
                }
                TaskEvent::Completed => {
                    task_state.completed = true;
                }
                TaskEvent::Reopened => {
                    task_state.completed = false;
                }
            }
        } else if let TaskEvent::Created { title } = &event.event {
            *state = Some(TaskState {
                title: title.clone(),
                completed: false,
            });
        }
    }
}
```
4. Set Up the Event Store and Execute Commands
```rust
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize event store (using in-memory for this example)
    let event_store = InMemoryEventStore::new();
    let executor = CommandExecutor::new(event_store);

    // Create a new task
    let task_id = TaskId::try_new("task-001".to_string())?;
    let title = TaskTitle::try_new("Learn EventCore".to_string())?;

    let create_cmd = CreateTask {
        task_id: task_id.clone(),
        title,
    };

    // Execute the command
    let result = executor.execute(create_cmd).await?;
    println!("Task created with {} event(s)", result.events.len());

    // Complete the task
    // (`CompleteTask` is defined analogously to `CreateTask`, omitted here)
    let complete_cmd = CompleteTask {
        task_id: task_id.clone(),
    };
    executor.execute(complete_cmd).await?;
    println!("Task completed!");

    Ok(())
}
```
5. Add a Projection for Queries
```rust
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::RwLock;

#[derive(Debug, Clone)]
pub struct TaskListProjection {
    tasks: Arc<RwLock<HashMap<TaskId, TaskSummary>>>,
}

#[derive(Debug, Clone)]
struct TaskSummary {
    title: TaskTitle,
    completed: bool,
}

#[async_trait]
impl Projection for TaskListProjection {
    type Event = TaskEvent;

    async fn handle_event(
        &mut self,
        event: StoredEvent<Self::Event>,
        stream_id: &StreamId,
    ) -> Result<(), ProjectionError> {
        let task_id = TaskId::try_new(stream_id.as_ref().to_string())
            .map_err(|e| ProjectionError::InvalidData(e.to_string()))?;

        let mut tasks = self.tasks.write().await;
        match event.event {
            TaskEvent::Created { title } => {
                tasks.insert(task_id, TaskSummary {
                    title,
                    completed: false,
                });
            }
            TaskEvent::Completed => {
                if let Some(task) = tasks.get_mut(&task_id) {
                    task.completed = true;
                }
            }
            TaskEvent::Reopened => {
                if let Some(task) = tasks.get_mut(&task_id) {
                    task.completed = false;
                }
            }
        }
        Ok(())
    }
}
```
Next Steps
Congratulations! You’ve built your first event-sourced application with EventCore. Here’s what to explore next:
- Domain Modeling Guide - Learn best practices for modeling your domain with types
- Commands Deep Dive - Understand multi-stream operations and dynamic consistency
- Building Web APIs - Integrate EventCore with Axum or Actix
- Testing Strategies - Property-based testing and chaos testing
Example Projects
Check out these complete examples in the repository:
- Banking System - Multi-account transfers with ACID guarantees
- E-Commerce Platform - Order processing with inventory management
- Saga Orchestration - Long-running business processes
Getting Help
Installation
Requirements
- Rust 1.70.0 or later
- PostgreSQL 13+ (for PostgreSQL backend)
- Tokio async runtime
Adding EventCore to Your Project
Add the following to your Cargo.toml:
```toml
[dependencies]
# Core library
eventcore = "0.1"

# Choose your backend (one of these):
eventcore-postgres = "0.1" # Production-ready PostgreSQL backend
eventcore-memory = "0.1"   # In-memory backend for development/testing

# Required dependencies
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde = { version = "1", features = ["derive"] }
uuid = { version = "1", features = ["v7", "serde"] }
thiserror = "1"

# Optional but recommended
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
```
Backend Configuration
PostgreSQL Backend
- Database Setup
```shell
# Using Docker
docker run -d \
  --name eventcore-postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=eventcore \
  -p 5432:5432 \
  postgres:15-alpine

# Or use the provided docker-compose.yml
docker-compose up -d
```
- Run Migrations
EventCore will automatically create required tables on first use. For manual setup:
-- See eventcore-postgres/migrations/ for schema
- Connection Configuration
```rust
use eventcore_postgres::PostgresEventStore;

let database_url = "postgres://postgres:postgres@localhost/eventcore";
let event_store = PostgresEventStore::new(database_url).await?;
```
In-Memory Backend
Perfect for development and testing:
```rust
use eventcore_memory::InMemoryEventStore;

let event_store = InMemoryEventStore::new();
```
Feature Flags
EventCore supports various feature flags:
```toml
[dependencies]
eventcore = { version = "0.1", features = ["full"] }

# Individual features:
# - "testing"    - Testing utilities and fixtures
# - "chaos"      - Chaos testing support
# - "monitoring" - OpenTelemetry integration
# - "cqrs"       - CQRS pattern support
```
Verification
Create a simple test to verify installation:
```rust
use eventcore::prelude::*;

#[tokio::test]
async fn test_eventcore_setup() {
    let event_store = eventcore_memory::InMemoryEventStore::new();
    let _executor = CommandExecutor::new(event_store);
    // If this compiles, EventCore is properly installed!
}
```
Next Steps
- Follow the Quick Start Guide
- Explore the Examples
- Read about Core Concepts
Your First EventCore Application
Let’s build a complete event-sourced application from scratch: a simple blog engine that demonstrates EventCore’s key concepts.
Project Setup
- Create a new Rust project
cargo new blog-engine
cd blog-engine
- Update Cargo.toml
```toml
[package]
name = "blog-engine"
version = "0.1.0"
edition = "2021"

[dependencies]
eventcore = "0.1"
eventcore-memory = "0.1"
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde = { version = "1", features = ["derive"] }
uuid = { version = "1", features = ["v7", "serde"] }
chrono = { version = "0.4", features = ["serde"] }
thiserror = "1"
nutype = { version = "0.4", features = ["serde"] }
```
Step 1: Define Domain Types
Create src/types.rs:
```rust
use eventcore::prelude::*;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

// Use nutype for domain validation
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 100),
    derive(Debug, Clone, PartialEq, Serialize, Deserialize, AsRef)
)]
pub struct PostId(String);

#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 200),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct PostTitle(String);

#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 10000),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct PostContent(String);

#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 100),
    derive(Debug, Clone, PartialEq, Serialize, Deserialize)
)]
pub struct AuthorId(String);

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Comment {
    pub id: String,
    pub author: AuthorId,
    pub content: String,
    pub created_at: DateTime<Utc>,
}
```
Step 2: Define Events
Create src/events.rs:
```rust
use crate::types::*;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum BlogEvent {
    PostPublished {
        title: PostTitle,
        content: PostContent,
        author: AuthorId,
        published_at: DateTime<Utc>,
    },
    PostUpdated {
        title: PostTitle,
        content: PostContent,
        updated_at: DateTime<Utc>,
    },
    PostDeleted {
        deleted_at: DateTime<Utc>,
    },
    CommentAdded {
        comment: Comment,
    },
    CommentRemoved {
        comment_id: String,
    },
}
```
Step 3: Define State
Create src/state.rs:
```rust
use crate::types::*;
use crate::events::BlogEvent;
use chrono::{DateTime, Utc};
use std::collections::HashMap;

#[derive(Debug, Clone, Default)]
pub struct PostState {
    pub exists: bool,
    pub title: Option<PostTitle>,
    pub content: Option<PostContent>,
    pub author: Option<AuthorId>,
    pub published_at: Option<DateTime<Utc>>,
    pub updated_at: Option<DateTime<Utc>>,
    pub deleted_at: Option<DateTime<Utc>>,
    pub comments: HashMap<String, Comment>,
}

impl PostState {
    pub fn is_deleted(&self) -> bool {
        self.deleted_at.is_some()
    }

    pub fn apply_event(&mut self, event: &BlogEvent) {
        match event {
            BlogEvent::PostPublished {
                title,
                content,
                author,
                published_at,
            } => {
                self.exists = true;
                self.title = Some(title.clone());
                self.content = Some(content.clone());
                self.author = Some(author.clone());
                self.published_at = Some(*published_at);
            }
            BlogEvent::PostUpdated {
                title,
                content,
                updated_at,
            } => {
                self.title = Some(title.clone());
                self.content = Some(content.clone());
                self.updated_at = Some(*updated_at);
            }
            BlogEvent::PostDeleted { deleted_at } => {
                self.deleted_at = Some(*deleted_at);
            }
            BlogEvent::CommentAdded { comment } => {
                self.comments.insert(comment.id.clone(), comment.clone());
            }
            BlogEvent::CommentRemoved { comment_id } => {
                self.comments.remove(comment_id);
            }
        }
    }
}
```
Step 4: Implement Commands
Create src/commands.rs:
```rust
use crate::events::BlogEvent;
use crate::state::PostState;
use crate::types::*;
use chrono::Utc;
use eventcore::prelude::*;

// Publish a new blog post
#[derive(Clone, Command)]
#[command(event = "BlogEvent")]
pub struct PublishPost {
    pub post_id: PostId,
    pub title: PostTitle,
    pub content: PostContent,
    pub author: AuthorId,
}

impl PublishPost {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![StreamId::from(format!("post-{}", self.post_id.as_ref()))]
    }
}

#[async_trait]
impl CommandLogic for PublishPost {
    type State = PostState;
    type Event = BlogEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        state.apply_event(&event.event);
    }

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Validate business rules
        require!(!state.exists, "Post already exists");

        // Emit event
        Ok(vec![emit!(
            StreamId::from(format!("post-{}", self.post_id.as_ref())),
            BlogEvent::PostPublished {
                title: self.title.clone(),
                content: self.content.clone(),
                author: self.author.clone(),
                published_at: Utc::now(),
            }
        )])
    }
}

// Add a comment to a post
#[derive(Clone, Command)]
#[command(event = "BlogEvent")]
pub struct AddComment {
    pub post_id: PostId,
    pub comment_id: String,
    pub author: AuthorId,
    pub content: String,
}

impl AddComment {
    fn read_streams(&self) -> Vec<StreamId> {
        vec![StreamId::from(format!("post-{}", self.post_id.as_ref()))]
    }
}

#[async_trait]
impl CommandLogic for AddComment {
    type State = PostState;
    type Event = BlogEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        state.apply_event(&event.event);
    }

    async fn handle(
        &self,
        _: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Validate
        require!(state.exists, "Post does not exist");
        require!(!state.is_deleted(), "Cannot comment on deleted post");
        require!(
            !state.comments.contains_key(&self.comment_id),
            "Comment ID already exists"
        );

        // Emit event
        Ok(vec![emit!(
            StreamId::from(format!("post-{}", self.post_id.as_ref())),
            BlogEvent::CommentAdded {
                comment: Comment {
                    id: self.comment_id.clone(),
                    author: self.author.clone(),
                    content: self.content.clone(),
                    created_at: Utc::now(),
                }
            }
        )])
    }
}
```
Step 5: Create the Application
Update src/main.rs:
```rust
mod commands;
mod events;
mod state;
mod types;

use commands::*;
use eventcore::prelude::*;
use eventcore_memory::InMemoryEventStore;
use types::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the event store
    let event_store = InMemoryEventStore::new();
    let executor = CommandExecutor::new(event_store);

    // Create author and post IDs
    let author = AuthorId::try_new("alice".to_string())?;
    let post_id = PostId::try_new("hello-eventcore".to_string())?;

    // Publish a blog post
    let publish_cmd = PublishPost {
        post_id: post_id.clone(),
        title: PostTitle::try_new("Hello EventCore!".to_string())?,
        content: PostContent::try_new(
            "This is my first event-sourced blog post!".to_string(),
        )?,
        author: author.clone(),
    };

    let result = executor.execute(publish_cmd).await?;
    println!("Post published with {} event(s)", result.events.len());

    // Add a comment
    let comment_cmd = AddComment {
        post_id: post_id.clone(),
        comment_id: "comment-1".to_string(),
        author: AuthorId::try_new("bob".to_string())?,
        content: "Great post!".to_string(),
    };
    executor.execute(comment_cmd).await?;
    println!("Comment added!");

    // Try to add a duplicate comment (will fail)
    let duplicate_comment = AddComment {
        post_id,
        comment_id: "comment-1".to_string(), // Same ID!
        author: AuthorId::try_new("charlie".to_string())?,
        content: "Another comment".to_string(),
    };
    match executor.execute(duplicate_comment).await {
        Ok(_) => println!("This shouldn't happen!"),
        Err(e) => println!("Expected error: {}", e),
    }

    Ok(())
}
```
Step 6: Run Your Application
cargo run
You should see:
Post published with 1 event(s)
Comment added!
Expected error: Comment ID already exists
What You’ve Learned
In this tutorial, you’ve implemented:
- Type-Safe Domain Modeling - Using nutype for validation
- Event Sourcing Basics - Events as the source of truth
- Command Pattern - Encapsulating business operations
- Business Rule Validation - Enforcing invariants
- State Reconstruction - Building state from events
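The last point deserves emphasis: state reconstruction is just a left fold over a stream's events. A minimal sketch in plain Rust (using unvalidated `String` titles for brevity, not EventCore's types):

```rust
// Plain-Rust illustration: current state is a pure fold over ordered events.
#[derive(Debug, Clone, PartialEq)]
enum TaskEvent {
    Created { title: String },
    Completed,
    Reopened,
}

#[derive(Debug, Clone, PartialEq)]
struct TaskState {
    title: String,
    completed: bool,
}

// Apply one event to the (possibly absent) state, returning the new state.
fn apply(state: Option<TaskState>, event: &TaskEvent) -> Option<TaskState> {
    match (state, event) {
        (None, TaskEvent::Created { title }) => Some(TaskState {
            title: title.clone(),
            completed: false,
        }),
        (Some(mut s), TaskEvent::Completed) => {
            s.completed = true;
            Some(s)
        }
        (Some(mut s), TaskEvent::Reopened) => {
            s.completed = false;
            Some(s)
        }
        (s, _) => s, // ignore events that don't apply in the current state
    }
}

// Rebuild state from scratch by folding over the whole stream.
fn reconstruct(events: &[TaskEvent]) -> Option<TaskState> {
    events.iter().fold(None, apply)
}

fn main() {
    let events = vec![
        TaskEvent::Created { title: "Learn EventCore".into() },
        TaskEvent::Completed,
        TaskEvent::Reopened,
    ];
    let state = reconstruct(&events).expect("task exists after Created");
    assert!(!state.completed); // reopened, so not completed
    println!("{:?}", state);
}
```

Because the fold is pure, replaying the same events always yields the same state - the property that makes projections and debugging-by-replay possible.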
Next Steps
Enhance your blog engine with:
- Projections for querying posts by author or tag
- Multi-stream operations for author profiles
- Web API using Axum or Actix
- PostgreSQL backend for persistence
- Subscriptions for real-time updates
Continue learning:
Part 1: Introduction
Welcome to EventCore! This section introduces the library, its philosophy, and when to use it.
Chapters in This Part
- What is EventCore? - Understanding multi-stream event sourcing
- When to Use EventCore - Decision guide for choosing EventCore
- Event Modeling Fundamentals - Learn to design systems with events
- Architecture Overview - High-level view of EventCore’s design
What You’ll Learn
- The problems EventCore solves
- How multi-stream event sourcing differs from traditional approaches
- When EventCore is the right choice (and when it’s not)
- How to think in events and model your domain
- EventCore’s architecture and design principles
Prerequisites
- Basic Rust knowledge
- Familiarity with async programming helpful but not required
- No prior event sourcing experience needed
Time to Complete
- Reading: ~20 minutes
- With exercises: ~45 minutes
Ready? Let’s start with What is EventCore? →
Chapter 1.1: What is EventCore?
EventCore is a Rust library that implements multi-stream event sourcing - a powerful pattern that eliminates the traditional constraints of aggregate boundaries while maintaining strong consistency guarantees.
The Problem with Traditional Event Sourcing
Traditional event sourcing forces you to define rigid aggregate boundaries upfront:
```rust
// Traditional approach - forced aggregate boundaries
struct BankAccount {
    id: AccountId,
    balance: Money,
    // Can only modify THIS account
}

// Problem: How do you transfer money atomically?
// Option 1: Two separate commands (not atomic!)
// Option 2: Process managers/sagas (complex!)
// Option 3: Eventual consistency (risky!)
```
These boundaries often don’t match real business requirements:
- Money transfers need to modify two accounts atomically
- Order fulfillment needs to update inventory, orders, and shipping together
- User registration might need to create accounts, profiles, and notifications
The EventCore Solution
EventCore introduces dynamic consistency boundaries - each command defines which streams it needs:
```rust
#[derive(Command, Clone)]
struct TransferMoney {
    #[stream]
    from_account: StreamId, // Read and write this stream
    #[stream]
    to_account: StreamId,   // Read and write this stream too
    amount: Money,
}

// This command atomically:
// 1. Reads both account streams
// 2. Validates the business rules
// 3. Writes events to both streams
// 4. All in ONE atomic transaction!
```
Key Concepts
1. Event Streams
Instead of aggregates, EventCore uses streams - ordered sequences of events identified by a StreamId:
```rust
// Streams are just identifiers
let alice_account = StreamId::from_static("account-alice");
let bob_account = StreamId::from_static("account-bob");
let order_123 = StreamId::from_static("order-123");
```
2. Multi-Stream Commands
Commands can read from and write to multiple streams atomically:
```rust
// A command that involves multiple business entities
#[derive(Command, Clone)]
struct FulfillOrder {
    #[stream]
    order_id: StreamId,     // The order to fulfill
    #[stream]
    inventory_id: StreamId, // The inventory to deduct from
    #[stream]
    shipping_id: StreamId,  // Create shipping record
}
```
3. Type-Safe Stream Access
The macro system ensures you can only write to streams you declared:
```rust
// In your handle method:
let events = vec![
    StreamWrite::new(
        &read_streams,
        self.order_id.clone(), // ✅ OK - declared with #[stream]
        OrderEvent::Fulfilled,
    )?,
    StreamWrite::new(
        &read_streams,
        some_other_stream, // ❌ Compile error! Not declared
        SomeEvent::Happened,
    )?,
];
```
4. Optimistic Concurrency Control
EventCore tracks stream versions to detect conflicts:
- Command reads streams at specific versions
- Command produces new events
- Write only succeeds if streams haven’t changed
- Automatic retry on conflicts
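The version check at the heart of this can be sketched in plain Rust. This is a hypothetical in-memory store for illustration, not EventCore's actual API:

```rust
use std::collections::HashMap;

// An append fails if the stream has moved past the version the caller read.
#[derive(Debug, PartialEq)]
enum AppendError {
    VersionConflict { expected: u64, actual: u64 },
}

#[derive(Default)]
struct Stream {
    events: Vec<String>,
}

#[derive(Default)]
struct Store {
    streams: HashMap<String, Stream>,
}

impl Store {
    // A stream's version is simply how many events it holds.
    fn version(&self, stream: &str) -> u64 {
        self.streams.get(stream).map_or(0, |s| s.events.len() as u64)
    }

    // Append only succeeds if the stream is still at the expected version.
    fn append(
        &mut self,
        stream: &str,
        expected: u64,
        event: String,
    ) -> Result<u64, AppendError> {
        let actual = self.version(stream);
        if actual != expected {
            return Err(AppendError::VersionConflict { expected, actual });
        }
        self.streams
            .entry(stream.to_string())
            .or_default()
            .events
            .push(event);
        Ok(actual + 1)
    }
}

fn main() {
    let mut store = Store::default();
    let v = store.version("account-alice"); // 0: nothing written yet
    store
        .append("account-alice", v, "Deposited(100)".into())
        .unwrap();
    // A writer still holding the stale version 0 now conflicts:
    let stale = store.append("account-alice", 0, "Withdrawn(50)".into());
    assert!(stale.is_err());
    println!("conflict detected - caller would re-read and retry");
}
```

A real implementation extends this check across all streams a command declared, so the whole multi-stream write is accepted or rejected as a unit.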
Benefits
- Simplified Architecture
  - No aggregate boundaries to design upfront
  - No process managers for cross-aggregate operations
  - No eventual consistency complexity
- Strong Consistency
  - All changes are atomic
  - No partial failures between streams
  - Transactions that match business requirements
- Type Safety
  - Commands declare their streams at compile time
  - Illegal operations won’t compile
  - Self-documenting code
- Performance
  - ~100 operations/second with PostgreSQL
  - Optimized for correctness over raw throughput
  - Batched operations for better performance
How It Works
1. Command Declaration: Use #[derive(Command)] to declare which streams you need
2. State Reconstruction: EventCore reads all requested streams and builds current state
3. Business Logic: Your command validates rules and produces events
4. Atomic Write: All events are written in a single transaction
5. Optimistic Retry: On conflicts, EventCore retries automatically
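Step 5 can be pictured as a small loop. This is a hypothetical sketch of the shape of optimistic retry; EventCore's executor does this internally with its own error types and backoff policy:

```rust
// Retry an attempt while it fails with a (here, string-modeled) version
// conflict; any other error, or success, ends the loop immediately.
fn execute_with_retry<F>(max_attempts: u32, mut attempt: F) -> Result<(), String>
where
    F: FnMut() -> Result<(), String>,
{
    let mut last = Err("no attempts made".to_string());
    for _ in 0..max_attempts {
        last = attempt();
        match &last {
            Ok(()) => return last,
            // On a conflict the executor would re-read the streams,
            // rebuild state, and run the command's handle again.
            Err(e) if e == "conflict" => continue,
            Err(_) => return last, // business-rule failures are NOT retried
        }
    }
    last // still conflicting after max_attempts
}

fn main() {
    // Simulate two version conflicts followed by a successful write.
    let mut calls = 0;
    let result = execute_with_retry(5, || {
        calls += 1;
        if calls < 3 {
            Err("conflict".to_string())
        } else {
            Ok(())
        }
    });
    assert!(result.is_ok());
    assert_eq!(calls, 3);
    println!("succeeded after {} attempts", calls);
}
```

The key design point the sketch shows: only concurrency conflicts are retried; a failed business rule (like insufficient funds) is returned to the caller unchanged.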
Example: Complete Money Transfer
```rust
use eventcore::prelude::*;
use eventcore_macros::Command;

#[derive(Command, Clone)]
struct TransferMoney {
    #[stream]
    from_account: StreamId,
    #[stream]
    to_account: StreamId,
    amount: Money,
}

#[async_trait]
impl CommandLogic for TransferMoney {
    type State = AccountBalances;
    type Event = BankingEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        // Update state based on events
        match &event.payload {
            BankingEvent::MoneyWithdrawn { amount, .. } => {
                state.debit(&event.stream_id, *amount);
            }
            BankingEvent::MoneyDeposited { amount, .. } => {
                state.credit(&event.stream_id, *amount);
            }
            _ => {}
        }
    }

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Check balance
        let from_balance = state.balance(&self.from_account);
        require!(
            from_balance >= self.amount.value(),
            "Insufficient funds: balance={}, requested={}",
            from_balance,
            self.amount
        );

        // Create atomic events for both accounts
        Ok(vec![
            StreamWrite::new(
                &read_streams,
                self.from_account.clone(),
                BankingEvent::MoneyWithdrawn {
                    amount: self.amount.value(),
                    to: self.to_account.to_string(),
                },
            )?,
            StreamWrite::new(
                &read_streams,
                self.to_account.clone(),
                BankingEvent::MoneyDeposited {
                    amount: self.amount.value(),
                    from: self.from_account.to_string(),
                },
            )?,
        ])
    }
}
```
Next Steps
Now that you understand what EventCore is, let’s explore when to use it →
Chapter 1.2: When to Use EventCore
In the modern age of fast computers and cheap storage, event sourcing should be the default approach for any line-of-business application. This chapter explores why EventCore is the right choice for your next project and addresses common concerns.
Why Event Sourcing Should Be Your Default
Traditional CRUD databases were designed in an era of expensive storage and slow computers. They optimize for storage efficiency by throwing away history - a terrible trade-off in today’s world. Here’s why event sourcing, and specifically EventCore, should be your default choice:
1. History is Free
Storage costs have plummeted. The complete history of your business operations costs pennies to store but provides immense value:
- Debug production issues by replaying events
- Satisfy any future audit requirement
- Build new features on historical data
- Prove compliance retroactively
2. CRUD Lies About Your Business
CRUD operations (Create, Read, Update, Delete) are technical concepts that don’t match business reality:
- “Update” erases the reason for change
- “Delete” pretends things never existed
- State-based models lose critical business context
Event sourcing captures what actually happened: “CustomerChangedAddress”, “OrderCancelled”, “PriceAdjusted”
3. Future-Proof by Default
With EventCore, you never have to say “we didn’t track that”:
- New reporting requirements? Replay events into new projections
- Need to add analytics? The data is already there
- Compliance rules changed? Full history available
EventCore Makes Event Sourcing Practical
While event sourcing should be the default, EventCore specifically excels by solving traditional event sourcing pain points:
1. Complex Business Transactions
Problem: Your business operations span multiple entities that must change together.
Example: E-commerce order fulfillment
```rust
#[derive(Command, Clone)]
struct FulfillOrder {
    #[stream]
    order: StreamId,     // Update order status
    #[stream]
    inventory: StreamId, // Deduct items
    #[stream]
    shipping: StreamId,  // Create shipping record
    #[stream]
    customer: StreamId,  // Update loyalty points
}
```
Why EventCore: Traditional systems require distributed transactions or eventual consistency. EventCore makes this atomic and simple.
2. Financial Systems
Problem: Need complete audit trail and strong consistency for money movements.
Example: Payment processing
```rust
#[derive(Command, Clone)]
struct ProcessPayment {
    #[stream]
    customer_account: StreamId,
    #[stream]
    merchant_account: StreamId,
    #[stream]
    payment_gateway: StreamId,
    #[stream]
    tax_authority: StreamId,
}
```
Why EventCore:
- Every state change is recorded
- Natural audit log for compliance
- Atomic operations prevent partial payments
- Easy to replay for reconciliation
3. Collaborative Systems
Problem: Multiple users modifying shared resources with conflict resolution needs.
Example: Project management tool
```rust
#[derive(Command, Clone)]
struct MoveTaskToColumn {
    #[stream]
    task: StreamId,
    #[stream]
    from_column: StreamId,
    #[stream]
    to_column: StreamId,
    #[stream]
    project: StreamId,
}
```
Why EventCore:
- Event streams enable real-time updates
- Natural conflict resolution through events
- Complete history of who did what when
4. Regulatory Compliance
Problem: Regulations require you to show complete history of data changes.
Example: Healthcare records
```rust
#[derive(Command, Clone)]
struct UpdatePatientRecord {
    #[stream]
    patient: StreamId,
    #[stream]
    physician: StreamId,
    #[stream]
    audit_log: StreamId,
}
```
Why EventCore:
- Immutable event log satisfies auditors
- Can prove system state at any point in time
- Natural GDPR compliance (event-level data retention)
5. Domain-Driven Design
Problem: Your domain has complex rules that span multiple aggregates.
Example: Insurance claim processing
```rust
#[derive(Command, Clone)]
struct ProcessClaim {
    #[stream]
    claim: StreamId,
    #[stream]
    policy: StreamId,
    #[stream]
    customer: StreamId,
    #[stream]
    adjuster: StreamId,
}
```
Why EventCore:
- Commands match business operations exactly
- No artificial aggregate boundaries
- Domain events become first-class citizens
Addressing Common Concerns
“But Event Sourcing is Complex!”
Myth: Event sourcing adds unnecessary complexity.
Reality: EventCore makes it simpler than CRUD:
- No O/R mapping impedance mismatch
- Commands map directly to business operations
- No “load-modify-save” race conditions
- Debugging is easier with full history
“What About Performance?”
Myth: Event sourcing is slow because it stores everything.
Reality:
- EventCore achieves ~83 ops/sec with PostgreSQL - plenty for most business applications
- Read models can be optimized for any query pattern
- No complex joins needed - data is pre-projected
- Scales horizontally by splitting streams
“Storage Costs Will Explode!”
Myth: Storing all events is expensive.
Reality: Let’s do the math:
- Average event size: ~1KB
- 1000 events/day = 365K events/year = 365MB/year
- S3 storage cost: ~$0.023/GB/month = $0.10/year
- Your complete business history costs less than a coffee
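The arithmetic is easy to verify with the assumed figures above (1 KB/event, 1000 events/day, ~$0.023/GB/month for S3 storage):

```rust
// Back-of-envelope check of the storage-cost math; the inputs are the
// text's assumptions, not measurements.
fn yearly_gb(event_kb: f64, events_per_day: f64) -> f64 {
    // KB/event * events/day * days/year, converted to GB
    event_kb * events_per_day * 365.0 / 1_000_000.0
}

fn yearly_cost_usd(gb: f64, per_gb_month: f64) -> f64 {
    gb * per_gb_month * 12.0
}

fn main() {
    let gb = yearly_gb(1.0, 1000.0);       // ~0.365 GB/year
    let cost = yearly_cost_usd(gb, 0.023); // ~$0.10/year
    println!("~{:.3} GB/year, ~${:.2}/year", gb, cost);
}
```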
“What About GDPR/Privacy?”
Myth: You can’t delete data with event sourcing.
Reality: EventCore provides better privacy controls:
- Crypto-shredding: Delete encryption keys to make data unreadable
- Event-level retention policies
- Selective projection rebuilding
- Actually know what data you have about someone
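Crypto-shredding can be sketched in a few lines. The XOR "cipher" below is a toy for illustration only - a real system would use a proper AEAD (e.g. AES-GCM) and managed per-subject keys - but it shows the mechanism: the stored events stay immutable, and deleting one key renders them unreadable.

```rust
use std::collections::HashMap;

// Toy keystream "cipher" - illustration only, NOT real cryptography.
fn xor(data: &[u8], key: &[u8]) -> Vec<u8> {
    data.iter().zip(key.iter().cycle()).map(|(d, k)| d ^ k).collect()
}

// One encryption key per data subject, stored outside the event log.
#[derive(Default)]
struct KeyVault {
    keys: HashMap<String, Vec<u8>>,
}

impl KeyVault {
    fn encrypt(&mut self, subject: &str, plaintext: &[u8]) -> Vec<u8> {
        let key = self
            .keys
            .entry(subject.to_string())
            .or_insert_with(|| b"per-subject-secret".to_vec());
        xor(plaintext, key)
    }

    fn decrypt(&self, subject: &str, ciphertext: &[u8]) -> Option<Vec<u8>> {
        self.keys.get(subject).map(|k| xor(ciphertext, k))
    }

    // "Shred": delete the key; the immutable events become unreadable.
    fn shred(&mut self, subject: &str) {
        self.keys.remove(subject);
    }
}

fn main() {
    let mut vault = KeyVault::default();
    // The encrypted payload is what gets written into the event log.
    let stored = vault.encrypt("alice", b"CustomerChangedAddress: 1 Main St");
    assert!(vault.decrypt("alice", &stored).is_some());

    // GDPR erasure request: shred alice's key. The log is untouched,
    // but her personal data can no longer be recovered from it.
    vault.shred("alice");
    assert!(vault.decrypt("alice", &stored).is_none());
    println!("events remain immutable; subject data is unreadable");
}
```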
Special Considerations
Large Binary Data
For systems with large binary data (images, videos), use a hybrid approach:
- Store metadata and operations as events
- Store binaries in object storage (S3)
- Best of both worlds
Graph-Heavy Queries
For social networks or recommendation engines:
- Use EventCore for the write side
- Project into graph databases for queries
- Maintain consistency through event streams
Cache-Like Workloads
For session storage or caching:
- These aren’t business operations
- Use appropriate tools (Redis)
- EventCore for business logic, Redis for caching
Migration Considerations
From Traditional Database
Good fit if:
- You need better audit trails
- Business rules span multiple tables
- You’re already using event-driven architecture
Poor fit if:
- Current solution works well
- No complex business rules
- Just need basic CRUD
From Microservices
Good fit if:
- Struggling with distributed transactions
- Need better consistency guarantees
- Want to simplify architecture
Poor fit if:
- True service isolation is required
- Different teams own different services
- Services use different tech stacks
Performance Considerations
EventCore is optimized for:
- ✅ Correctness and consistency
- ✅ Complex business operations
- ✅ Audit and compliance needs
EventCore is NOT optimized for:
- ❌ Maximum throughput (~83 ops/sec with PostgreSQL)
- ❌ Minimum latency (ms-level operations)
- ❌ Large binary data
The Right Question
Instead of asking “Do I need event sourcing?”, ask:
“Can I afford to throw away my business history?”
In an era of:
- Regulatory scrutiny
- Data-driven decisions
- Machine learning opportunities
- Debugging production issues
- Changing business requirements
The answer is almost always NO.
Decision Framework
Start with EventCore for:
- ✅ Any line-of-business application - Your default choice
- ✅ Multi-entity operations - EventCore’s sweet spot
- ✅ Financial systems - Audit trail included
- ✅ Collaborative tools - Natural conflict resolution
- ✅ Regulated industries - Compliance built-in
- ✅ Domain-driven design - Commands match your domain
Consider Alternatives Only For:
- 🤔 Pure caching layers - Use Redis alongside EventCore
- 🤔 Binary blob storage - Hybrid approach with S3
- 🤔 >1000 ops/sec - Add caching or consider specialized solutions
Summary
In 2024 and beyond, the question isn’t “Why event sourcing?” but “Why would you throw away your business history?”
EventCore makes event sourcing practical by:
- Eliminating aggregate boundary problems
- Providing multi-stream atomicity
- Making it type-safe and simple
- Scaling to real business needs
Storage is cheap. History is valuable. Make event sourcing your default.
Ready to dive deeper? Let’s explore Event Modeling Fundamentals →
Chapter 1.3: Event Modeling Fundamentals
Event modeling is a visual technique for designing event-driven systems. It helps you discover your domain events, commands, and read models before writing any code. This chapter teaches you how to model systems that naturally translate to EventCore implementations.
What is Event Modeling?
Event modeling is a method of describing systems using three core elements:
- Events (Orange) - Things that happened
- Commands (Blue) - Things users want to do
- Read Models (Green) - Views of current state
The genius is in its simplicity: model your system on a timeline showing what happens when.
The Event Modeling Process
Step 1: Brainstorming Events
Start by identifying what happens in your system. Use past-tense language:
Example: Task Management System
Events (what happened):
- Task Created
- Task Assigned
- Task Completed
- Comment Added
- Due Date Changed
- Task Archived
Key principles:
- Past tense (“Created” not “Create”)
- Record facts (“Task Completed” not “Complete Task”)
- Include relevant data in event names
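The brainstormed events translate directly into a past-tense enum. A minimal sketch, with field types simplified to plain strings for illustration (the real chapters use validated domain types):

```rust
// Past-tense facts from the brainstorming session, as a plain Rust enum.
#[derive(Debug, Clone, PartialEq)]
enum TaskEvent {
    TaskCreated { title: String },
    TaskAssigned { assignee: String },
    TaskCompleted,
    CommentAdded { text: String },
    DueDateChanged { due: String },
    TaskArchived,
}

fn main() {
    // Each value records something that already happened - a fact, not a request.
    let event = TaskEvent::TaskAssigned { assignee: "bob".to_string() };
    assert_eq!(event, TaskEvent::TaskAssigned { assignee: "bob".to_string() });
}
```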
Step 2: Building the Timeline
Arrange events on a timeline to tell the story of your system:
Time →
|
├─ Task Created ──┬─ Task Assigned ──┬─ Comment Added ──┬─ Task Completed
| (by: Alice) | (to: Bob) | (by: Bob) | (by: Bob)
| title: "Fix" | | "Working on it" |
| | | |
└─────────────────┴──────────────────┴───────────────────┴─────────────────
This visual representation helps you:
- See the flow of your system
- Identify missing events
- Understand event relationships
Step 3: Identifying Commands
Commands trigger events. Look at each event and ask “What user action caused this?”
Command (Blue) → Event (Orange)
─────────────────────────────────────────
Create Task → Task Created
Assign Task → Task Assigned
Complete Task → Task Completed
Add Comment → Comment Added
In EventCore, these become your command types:
#[derive(Command, Clone)]
struct CreateTask {
    #[stream]
    task_id: StreamId,
    title: TaskTitle,
    description: TaskDescription,
}

#[derive(Command, Clone)]
struct AssignTask {
    #[stream]
    task_id: StreamId,
    #[stream]
    user_id: StreamId,
}
Step 4: Designing Read Models
Read models answer questions. Look at your UI/API needs:
Question → Read Model (Green)
────────────────────────────────────────────────
"What tasks do I have?" → My Tasks List
"What's the project status?" → Project Dashboard
"Who worked on what?" → Activity Timeline
In EventCore, these become projections:
// Read model for "My Tasks"
struct MyTasksProjection {
    tasks_by_user: HashMap<UserId, Vec<TaskSummary>>,
}

impl CqrsProjection for MyTasksProjection {
    fn apply(&mut self, event: &StoredEvent<TaskEvent>) {
        match &event.payload {
            TaskEvent::TaskAssigned { user_id, .. } => {
                // Update tasks_by_user
            }
            // ... handle other events
            _ => {}
        }
    }
}
Event Modeling Patterns
Pattern 1: State Transitions
Many business processes are state machines:
Draft → Published → Archived
↓ ↓
Deleted Unpublished
Events:
- ArticleDrafted
- ArticlePublished
- ArticleUnpublished
- ArticleArchived
- ArticleDeleted
In EventCore:
#[derive(Command, Clone)]
struct PublishArticle {
    #[stream]
    article_id: StreamId,
    #[stream]
    author_id: StreamId, // Also track author actions
    scheduled_time: Option<Timestamp>,
}
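The state machine above can also be enforced with a plain transition table that a command's validation step consults before emitting an event. A small sketch — the `ArticleStatus` enum and `can_transition` helper are illustrative, not EventCore types:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ArticleStatus {
    Draft,
    Published,
    Unpublished,
    Archived,
    Deleted,
}

// Legal transitions from the diagram:
// Draft -> Published -> Archived, Draft -> Deleted, Published -> Unpublished.
fn can_transition(from: ArticleStatus, to: ArticleStatus) -> bool {
    use ArticleStatus::*;
    matches!(
        (from, to),
        (Draft, Published) | (Draft, Deleted) | (Published, Archived) | (Published, Unpublished)
    )
}

fn main() {
    assert!(can_transition(ArticleStatus::Draft, ArticleStatus::Published));
    assert!(can_transition(ArticleStatus::Published, ArticleStatus::Unpublished));
    // An archived article cannot be re-published in this model.
    assert!(!can_transition(ArticleStatus::Archived, ArticleStatus::Published));
}
```

Because illegal transitions are rejected before any event is emitted, the stream can never record an impossible state change.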
Pattern 2: Collaborative Operations
When multiple entities participate:
Money Transfer Timeline:
Source Account ──────┬──────────────┬─────────
↓ ↑
Money Withdrawn │
│
Target Account ──────────────┬──────┴─────────
↓
Money Deposited
In EventCore, this is ONE atomic command:
#[derive(Command, Clone)]
struct TransferMoney {
    #[stream]
    from_account: StreamId,
    #[stream]
    to_account: StreamId,
    amount: Money,
}
Pattern 3: Process Flows
Complex business processes with multiple steps:
Order Flow:
Order Created → Payment Processed → Inventory Reserved → Order Shipped
| | | |
Order Stream Payment Stream Inventory Stream Shipping Stream
Each step might be a separate command or one complex command:
#[derive(Command, Clone)]
struct FulfillOrder {
    #[stream]
    order_id: StreamId,
    #[stream]
    payment_id: StreamId,
    #[stream]
    inventory_id: StreamId,
    #[stream]
    shipping_id: StreamId,
}
From Model to Implementation
1. Events Become Rust Enums
Your discovered events:
#[derive(Debug, Clone, Serialize, Deserialize)]
enum TaskEvent {
    Created { title: String, description: String },
    Assigned { user_id: UserId },
    Completed { completed_at: Timestamp },
    CommentAdded { author: UserId, text: String },
}
2. Commands Become EventCore Commands
Your identified commands:
#[derive(Command, Clone)]
struct CreateTask {
    #[stream]
    task_id: StreamId,
    title: TaskTitle,
}

#[async_trait]
impl CommandLogic for CreateTask {
    type Event = TaskEvent;
    type State = TaskState;

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        require!(!state.exists, "Task already exists");

        Ok(vec![StreamWrite::new(
            &read_streams,
            self.task_id.clone(),
            TaskEvent::Created {
                title: self.title.as_ref().to_string(),
                description: String::new(),
            },
        )?])
    }
}
3. Read Models Become Projections
Your view requirements:
#[derive(Default)]
struct TasksByUserProjection {
    index: HashMap<UserId, HashSet<TaskId>>,
}

impl CqrsProjection for TasksByUserProjection {
    fn apply(&mut self, event: &StoredEvent<TaskEvent>) {
        match &event.payload {
            TaskEvent::Assigned { user_id } => {
                self.index
                    .entry(user_id.clone())
                    .or_default()
                    .insert(TaskId::from(&event.stream_id));
            }
            _ => {}
        }
    }
}
Workshop: Model a Coffee Shop
Let’s practice with a simple domain:
Step 1: Brainstorm Events
What happens in a coffee shop?
- Customer Entered
- Order Placed
- Payment Received
- Coffee Prepared
- Order Completed
- Customer Left
Step 2: Build Timeline
Customer Entered → Order Placed → Payment Received → Coffee Prepared → Order Completed
| | | | |
Customer ID Order Stream Payment Stream Barista Stream Order Stream
Step 3: Identify Commands
- Enter Shop → Customer Entered
- Place Order → Order Placed
- Process Payment → Payment Received
- Prepare Coffee → Coffee Prepared
- Complete Order → Order Completed
Step 4: Design Read Models
- Queue Display: Shows pending orders for baristas
- Customer Receipt: Shows order details and status
- Daily Sales Report: Aggregates all payments
Step 5: Implement in EventCore
// One command handling the full order flow
#[derive(Command, Clone)]
struct PlaceAndPayOrder {
    #[stream]
    order_id: StreamId,
    #[stream]
    customer_id: StreamId,
    #[stream]
    register_id: StreamId,
    items: Vec<MenuItem>,
    payment: PaymentMethod,
}
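The "Queue Display" read model from Step 4 can be sketched as a plain fold over the events from Step 1: orders enter the queue when placed and leave it when completed. The event and projection types below are illustrative simplifications, not EventCore's actual traits:

```rust
#[derive(Debug, Clone)]
enum ShopEvent {
    OrderPlaced { order_id: u32, items: Vec<String> },
    OrderCompleted { order_id: u32 },
}

// Read model for baristas: which orders are still pending?
#[derive(Default)]
struct QueueDisplay {
    pending: Vec<u32>,
}

impl QueueDisplay {
    fn apply(&mut self, event: &ShopEvent) {
        match event {
            ShopEvent::OrderPlaced { order_id, .. } => self.pending.push(*order_id),
            ShopEvent::OrderCompleted { order_id } => self.pending.retain(|id| id != order_id),
        }
    }
}

fn main() {
    let events = [
        ShopEvent::OrderPlaced { order_id: 1, items: vec!["latte".to_string()] },
        ShopEvent::OrderPlaced { order_id: 2, items: vec!["espresso".to_string()] },
        ShopEvent::OrderCompleted { order_id: 1 },
    ];
    let mut display = QueueDisplay::default();
    for e in &events {
        display.apply(e);
    }
    assert_eq!(display.pending, vec![2]); // order 1 served, order 2 still queued
}
```

The same event stream could feed the Customer Receipt and Daily Sales Report projections with different `apply` logic — that is the core CQRS idea.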
Best Practices
1. Start with Events, Not Structure
   - Don’t design database schemas
   - Focus on what happens in the business
2. Use Domain Language
   - “InvoiceSent” not “UpdateInvoiceStatus”
   - Match the language your users use
3. Model Time Explicitly
   - Show the flow of events
   - Understand concurrent vs sequential operations
4. Keep Events Focused
   - One event = one business fact
   - Don’t combine unrelated changes
5. Commands Match User Intent
   - “TransferMoney” not “UpdateAccountBalance”
   - Commands are what users want to do
Common Pitfalls
❌ Modeling State Instead of Events
// Bad: Thinking in state
AccountUpdated { balance: 100 }

// Good: Thinking in events
MoneyDeposited { amount: 50 }
❌ Technical Events
// Bad: Technical focus
DatabaseRecordInserted

// Good: Business focus
CustomerRegistered
❌ Missing the Why
// Bad: Just the what
PriceChanged { new_price: 100 }

// Good: Including why
PriceReducedForSale { original: 150, sale_price: 100, reason: "Black Friday" }
Summary
Event modeling helps you:
- Understand your domain before coding
- Discover events, commands, and read models
- Design systems that map naturally to EventCore
- Communicate with stakeholders visually
The key insight: Model what happens, not what is.
Next, let’s look at EventCore’s Architecture to understand how your models become working systems →
Chapter 1.4: Architecture Overview
This chapter provides a high-level view of EventCore’s architecture, showing how commands, events, and projections work together to create robust event-sourced systems.
Core Architecture
EventCore follows a clean, layered architecture:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Application │ │ Application │ │ Application │
│ (Axum) │ │ (CLI) │ │ (gRPC) │
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘
│ │ │
└───────────────────────┴───────────────────────┘
│
┌────────────▼────────────┐
│ Command Executor │
│ (Validation & Retry) │
└────────────┬────────────┘
│
┌───────────────────────┼───────────────────────┐
│ │ │
┌────────▼────────┐ ┌──────────▼──────────┐ ┌────────▼────────┐
│ Commands │ │ Event Store │ │ Projections │
│ (Domain Logic) │ │ (PostgreSQL) │ │ (Read Models) │
└─────────────────┘ └─────────────────────┘ └─────────────────┘
Key Components
1. Commands
Commands encapsulate business operations. They declare what streams they need and contain the business logic:
#[derive(Command, Clone)]
struct ApproveOrder {
    #[stream]
    order: StreamId,
    #[stream]
    approver: StreamId,
    #[stream]
    inventory: StreamId,
}
Responsibilities:
- Declare stream dependencies via #[stream] attributes
- Implement business validation rules
- Generate events representing what happened
- Ensure consistency within their boundaries
2. Command Executor
The executor orchestrates command execution with automatic retry logic:
let executor = CommandExecutor::builder()
    .with_store(event_store)
    .with_retry_policy(RetryPolicy::exponential_backoff())
    .build();

let result = executor.execute(&command).await?;
Execution Flow:
- Read Phase: Fetch all declared streams
- Reconstruct State: Apply events to build current state
- Execute Command: Run business logic
- Write Phase: Atomically write new events
- Retry on Conflict: Handle optimistic concurrency
3. Event Store
The event store provides durable, ordered storage of events:
#[async_trait]
pub trait EventStore: Send + Sync {
    async fn read_stream(&self, stream_id: &StreamId) -> Result<Vec<StoredEvent>>;
    async fn write_events(&self, events: Vec<EventToWrite>) -> Result<()>;
}
Guarantees:
- Atomic multi-stream writes
- Optimistic concurrency control
- Global ordering via UUIDv7 event IDs
- Exactly-once semantics
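The optimistic concurrency guarantee can be illustrated with a toy in-memory stream: a write carries the version the writer last read, and is rejected if the stream has moved on. This is a conceptual sketch of expected-version appends, not EventCore's actual store implementation:

```rust
// Toy single-stream store demonstrating expected-version (optimistic) writes.
struct Stream {
    events: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum WriteError {
    VersionConflict { actual: usize },
}

impl Stream {
    fn version(&self) -> usize {
        self.events.len()
    }

    fn append(&mut self, expected_version: usize, event: String) -> Result<usize, WriteError> {
        if self.version() != expected_version {
            // Someone wrote since we read; the caller must re-read and retry.
            return Err(WriteError::VersionConflict { actual: self.version() });
        }
        self.events.push(event);
        Ok(self.version())
    }
}

fn main() {
    let mut stream = Stream { events: vec![] };
    assert_eq!(stream.append(0, "Opened".to_string()), Ok(1));

    // A stale writer that still believes the version is 0 is rejected.
    assert_eq!(
        stream.append(0, "Withdrawn".to_string()),
        Err(WriteError::VersionConflict { actual: 1 })
    );
}
```

In EventCore this check-and-retry cycle is handled for you by the command executor, across all streams a command declares.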
4. Projections
Projections build read models from events:
impl CqrsProjection for OrderSummaryProjection {
    type Event = OrderEvent;
    type Error = ProjectionError;

    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match &event.payload {
            OrderEvent::Approved { .. } => {
                self.approved_count += 1;
            }
            // Handle other events
            _ => {}
        }
        Ok(())
    }
}
Capabilities:
- Real-time updates from event streams
- Rebuild from any point in time
- Multiple projections from same events
- Optimized for specific queries
Data Flow
Write Path (Commands)
User Action
↓
HTTP Request
↓
Command Creation ──────→ #[derive(Command)] macro generates boilerplate
↓
Executor.execute()
↓
Read Streams ──────────→ PostgreSQL: SELECT events WHERE stream_id IN (...)
↓
Reconstruct State ─────→ Fold events into current state
↓
Command.handle() ──────→ Business logic validates and generates events
↓
Write Events ──────────→ PostgreSQL: INSERT events (atomic transaction)
↓
Return Result
Read Path (Projections)
Events Written
↓
Event Notification
↓
Projection Runner ─────→ Subscribes to event streams
↓
Load Event
↓
Projection.apply() ────→ Update read model state
↓
Save Checkpoint ───────→ Track position for resume
↓
Query Read Model ──────→ Optimized for specific access patterns
Multi-Stream Atomicity
EventCore’s key innovation is atomic operations across multiple streams:
Traditional Event Sourcing
Account A Account B
│ │
├─ Withdraw? │ ❌ Two separate operations
│ ├─ Deposit? (not atomic!)
↓ ↓
EventCore Approach
TransferMoney Command
│
┌──────────┴──────────┐
↓ ↓
Account A Account B
│ │
├─ Withdrawn ←────────┤ Deposited ✅ One atomic operation!
↓ ↓
Concurrency Model
EventCore uses optimistic concurrency control:
- Version Tracking: Each stream has a version number
- Read Version: Commands note the version when reading
- Conflict Detection: Writes fail if version changed
- Automatic Retry: Executor retries with fresh data
// Internally tracked by EventCore
struct StreamVersion {
    stream_id: StreamId,
    version: EventVersion,
}

// Automatic retry on conflicts
let result = executor
    .execute(&command)
    .await?; // Retries handled internally
Type Safety
EventCore leverages Rust’s type system for correctness:
Stream Access Control
// Compile-time enforcement
impl TransferMoney {
    fn handle(&self, read_streams: ReadStreams<Self::StreamSet>) {
        // ✅ Can only write to declared streams
        StreamWrite::new(&read_streams, self.from_account, event)?;

        // ❌ Compile error - stream not declared!
        StreamWrite::new(&read_streams, other_stream, event)?;
    }
}
Validated Types
// Parse, don't validate
#[nutype(validate(greater = 0))]
struct Money(u64);

// Once created, always valid
let amount = Money::try_new(100)?; // Validated at boundary
transfer_money(amount);            // No validation needed
Deployment Architecture
Simple Deployment
┌─────────────┐ ┌──────────────┐
│ Your App │────▶│ PostgreSQL │
└─────────────┘ └──────────────┘
Production Deployment
Load Balancer
│
┌────────────────┼────────────────┐
↓ ↓ ↓
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ App Pod 1 │ │ App Pod 2 │ │ App Pod 3 │
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
│ │ │
└─────────────────┼─────────────────┘
↓
┌─────────────────┐
│ PostgreSQL │
│ (Primary) │
└────────┬────────┘
│
┌────────────────┼────────────────┐
↓ ↓
┌───────────────┐ ┌───────────────┐
│ PG Replica 1 │ │ PG Replica 2 │
└───────────────┘ └───────────────┘
Performance Characteristics
EventCore is optimized for correctness and developer productivity:
Throughput
- Single-stream commands: ~83 ops/sec (PostgreSQL), 187,711 ops/sec (in-memory)
- Multi-stream commands: ~25-50 ops/sec (PostgreSQL)
- Batch operations: 750,000-820,000 events/sec (in-memory)
Latency
- Command execution: 10-20ms (typical)
- Conflict retry: +5-10ms per retry
- Projection lag: <100ms (typical)
Scaling Strategies
- Vertical: Larger PostgreSQL instance
- Read Scaling: PostgreSQL read replicas
- Stream Sharding: Partition by stream ID
- Caching: Read model caching layer
Error Handling
EventCore provides structured error handling:
pub enum CommandError {
    ValidationFailed(String), // Business rule violations
    ConcurrencyConflict,      // Version conflicts (retried)
    StreamNotFound(StreamId), // Missing streams
    EventStoreFailed(String), // Infrastructure errors
}
Errors are categorized for appropriate handling:
- Retriable: Concurrency conflicts, transient failures
- Non-retriable: Validation failures, business rule violations
- Fatal: Infrastructure failures, panic recovery
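A retry loop only needs to know which category an error falls into. A minimal sketch of that classification — the enum mirrors the one above, but `is_retriable` is an illustrative helper, not a documented EventCore method:

```rust
#[derive(Debug)]
enum CommandError {
    ValidationFailed(String),
    ConcurrencyConflict,
    StreamNotFound(String),
    EventStoreFailed(String),
}

// Only transient conditions are worth retrying: a business rule
// violation will fail identically on every attempt, and infrastructure
// failures are escalated rather than silently retried in this sketch.
fn is_retriable(err: &CommandError) -> bool {
    matches!(err, CommandError::ConcurrencyConflict)
}

fn main() {
    assert!(is_retriable(&CommandError::ConcurrencyConflict));
    assert!(!is_retriable(&CommandError::ValidationFailed("insufficient funds".to_string())));
    assert!(!is_retriable(&CommandError::StreamNotFound("account-123".to_string())));
}
```

EventCore's executor applies this kind of classification internally when deciding whether to re-read streams and retry a command.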
Monitoring and Observability
Built-in instrumentation for production visibility:
// Automatic metrics
eventcore.commands.executed{command="TransferMoney", status="success"}
eventcore.events.written{stream="account-123"}
eventcore.retries{reason="concurrency_conflict"}

// Structured logging
{"level":"info", "command":"TransferMoney", "duration_ms":15, "events_written":2}

// OpenTelemetry traces
TransferMoney
├─ read_streams (5ms)
├─ reconstruct_state (2ms)
├─ handle_command (3ms)
└─ write_events (5ms)
Summary
EventCore’s architecture provides:
- Clean Separation: Commands, events, and projections have clear responsibilities
- Multi-Stream Atomicity: Complex operations remain consistent
- Type Safety: Rust’s type system prevents errors
- Production Ready: Built-in retry, monitoring, and error handling
- Flexible Deployment: From simple to highly-scaled architectures
The architecture is designed to make the right thing easy and the wrong thing impossible.
Ready to build something? Continue to Part 2: Getting Started →
Part 2: Getting Started
This comprehensive tutorial walks you through building a complete task management system with EventCore. You’ll learn event modeling, domain design, command implementation, projections, and testing.
What We’ll Build
A task management system with:
- Creating and managing tasks
- Assigning tasks to users
- Comments and activity tracking
- Real-time task lists and dashboards
- Complete audit trail
Chapters in This Part
- Setting Up Your Project - Create a new Rust project with EventCore
- Modeling the Domain - Design events and commands using event modeling
- Implementing Commands - Build commands with the macro system
- Working with Projections - Create read models for queries
- Testing Your Application - Write comprehensive tests
Prerequisites
- Rust 1.70+ installed
- Basic Rust knowledge (ownership, traits, async)
- PostgreSQL 12+ (or use in-memory store for learning)
- 30-60 minutes to complete
Learning Outcomes
By the end of this tutorial, you’ll understand:
- How to model domains with events
- Using EventCore’s macro system
- Building multi-stream commands
- Creating and updating projections
- Testing event-sourced systems
Code Repository
The complete code for this tutorial is available at:
git clone https://github.com/your-org/eventcore-task-tutorial
cd eventcore-task-tutorial
Ready? Let’s set up your project →
Chapter 2.1: Setting Up Your Project
Let’s create a new Rust project and add EventCore dependencies. We’ll build a task management system that demonstrates EventCore’s key features.
Create a New Project
cargo new taskmaster --bin
cd taskmaster
Add Dependencies
Edit Cargo.toml to include EventCore and related dependencies:
[package]
name = "taskmaster"
version = "0.1.0"
edition = "2021"
[dependencies]
# EventCore core functionality
eventcore = "0.1"
eventcore-macros = "0.1"
# For development/testing - switch to eventcore-postgres for production
eventcore-memory = "0.1"
# Async runtime
tokio = { version = "1.40", features = ["full"] }
async-trait = "0.1"
# Serialization
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# Type validation
nutype = { version = "0.6", features = ["serde"] }
# Utilities
uuid = { version = "1.11", features = ["v7", "serde"] }
chrono = { version = "0.4", features = ["serde"] }
thiserror = "2.0"
# For our CLI interface
clap = { version = "4.5", features = ["derive"] }
[dev-dependencies]
# Testing utilities
proptest = "1.6"
Project Structure
Create the following directory structure:
taskmaster/
├── Cargo.toml
├── src/
│ ├── main.rs # Application entry point
│ ├── domain/
│ │ ├── mod.rs # Domain module
│ │ ├── types.rs # Domain types with validation
│ │ ├── events.rs # Event definitions
│ │ └── commands/ # Command implementations
│ │ ├── mod.rs
│ │ ├── create_task.rs
│ │ ├── assign_task.rs
│ │ └── complete_task.rs
│ ├── projections/
│ │ ├── mod.rs # Projections module
│ │ ├── task_list.rs # User task lists
│ │ └── statistics.rs # Task statistics
│ └── api/
│ ├── mod.rs # API module (we'll add this in Part 4)
│ └── handlers.rs # HTTP handlers
Create the directories:
mkdir -p src/domain/commands
mkdir -p src/projections
mkdir -p src/api
Initial Setup Code
Let’s create the basic module structure:
src/main.rs
mod domain;
mod projections;

use clap::{Parser, Subcommand};
use eventcore::prelude::*;
use eventcore_memory::InMemoryEventStore;

#[derive(Parser)]
#[command(name = "taskmaster")]
#[command(about = "A task management system built with EventCore")]
struct Cli {
    #[command(subcommand)]
    command: Commands,
}

#[derive(Subcommand)]
enum Commands {
    /// Create a new task
    Create {
        /// Task title
        title: String,
        /// Task description
        description: String,
    },
    /// List all tasks
    List,
    /// Run interactive demo
    Demo,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize event store (in-memory for now)
    let event_store = InMemoryEventStore::new();
    let executor = CommandExecutor::new(event_store);

    let cli = Cli::parse();
    match cli.command {
        Commands::Create { title, description } => {
            println!("Creating task: {} - {}", title, description);
            // We'll implement this in Chapter 2.3
        }
        Commands::List => {
            println!("Listing tasks...");
            // We'll implement this in Chapter 2.4
        }
        Commands::Demo => {
            println!("Running demo...");
            run_demo(executor).await?;
        }
    }
    Ok(())
}

async fn run_demo<ES: EventStore>(
    executor: CommandExecutor<ES>,
) -> Result<(), Box<dyn std::error::Error>>
where
    ES::Event: From<domain::events::TaskEvent> + TryInto<domain::events::TaskEvent>,
{
    println!("🚀 EventCore Task Management Demo");
    println!("================================\n");

    // We'll add demo code as we build features
    Ok(())
}
src/domain/mod.rs
pub mod types;
pub mod events;
pub mod commands;

// Re-export commonly used items
pub use types::*;
pub use events::*;
src/domain/types.rs
use nutype::nutype;
use serde::{Deserialize, Serialize};
use uuid::Uuid;

/// Validated task title - must be non-empty and reasonable length
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 200),
    derive(Debug, Clone, PartialEq, Eq, AsRef, Serialize, Deserialize, Display)
)]
pub struct TaskTitle(String);

/// Validated task description
#[nutype(
    sanitize(trim),
    validate(len_char_max = 2000),
    derive(Debug, Clone, PartialEq, Eq, AsRef, Serialize, Deserialize)
)]
pub struct TaskDescription(String);

/// Validated comment text
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 1000),
    derive(Debug, Clone, PartialEq, Eq, AsRef, Serialize, Deserialize)
)]
pub struct CommentText(String);

/// Validated user name
#[nutype(
    sanitize(trim),
    validate(not_empty, len_char_max = 100),
    derive(Debug, Clone, PartialEq, Eq, Hash, AsRef, Serialize, Deserialize, Display)
)]
pub struct UserName(String);

/// Strongly-typed task ID
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct TaskId(Uuid);

impl TaskId {
    pub fn new() -> Self {
        Self(Uuid::now_v7())
    }
}

impl Default for TaskId {
    fn default() -> Self {
        Self::new()
    }
}

impl std::fmt::Display for TaskId {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Task priority levels
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum Priority {
    Low,
    Medium,
    High,
    Critical,
}

impl Default for Priority {
    fn default() -> Self {
        Self::Medium
    }
}

/// Task status - note we model this as events, not mutable state
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum TaskStatus {
    Open,
    InProgress,
    Completed,
    Cancelled,
}

impl Default for TaskStatus {
    fn default() -> Self {
        Self::Open
    }
}
src/domain/events.rs
use super::types::*;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

/// Events that can occur in our task management system
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum TaskEvent {
    /// A new task was created
    Created {
        task_id: TaskId,
        title: TaskTitle,
        description: TaskDescription,
        creator: UserName,
        created_at: DateTime<Utc>,
    },
    /// Task was assigned to a user
    Assigned {
        task_id: TaskId,
        assignee: UserName,
        assigned_by: UserName,
        assigned_at: DateTime<Utc>,
    },
    /// Task was unassigned
    Unassigned {
        task_id: TaskId,
        unassigned_by: UserName,
        unassigned_at: DateTime<Utc>,
    },
    /// Task priority was changed
    PriorityChanged {
        task_id: TaskId,
        old_priority: Priority,
        new_priority: Priority,
        changed_by: UserName,
        changed_at: DateTime<Utc>,
    },
    /// Comment was added to task
    CommentAdded {
        task_id: TaskId,
        comment: CommentText,
        author: UserName,
        commented_at: DateTime<Utc>,
    },
    /// Task was completed
    Completed {
        task_id: TaskId,
        completed_by: UserName,
        completed_at: DateTime<Utc>,
    },
    /// Task was reopened after completion
    Reopened {
        task_id: TaskId,
        reopened_by: UserName,
        reopened_at: DateTime<Utc>,
        reason: Option<String>,
    },
    /// Task was cancelled
    Cancelled {
        task_id: TaskId,
        cancelled_by: UserName,
        cancelled_at: DateTime<Utc>,
        reason: Option<String>,
    },
}

// Required for EventCore's type conversion
impl TryFrom<&TaskEvent> for TaskEvent {
    type Error = std::convert::Infallible;

    fn try_from(value: &TaskEvent) -> Result<Self, Self::Error> {
        Ok(value.clone())
    }
}
src/domain/commands/mod.rs
mod create_task;
mod assign_task;
mod complete_task;

pub use create_task::*;
pub use assign_task::*;
pub use complete_task::*;
src/projections/mod.rs
mod task_list;
mod statistics;

pub use task_list::*;
pub use statistics::*;
Verify Setup
Let’s make sure everything compiles:
cargo build
You should see output like:
Compiling taskmaster v0.1.0
Finished dev [unoptimized + debuginfo] target(s) in X.XXs
Create a Simple Test
Add to src/main.rs:
#[cfg(test)]
mod tests {
    use super::*;
    use crate::domain::types::*;

    #[test]
    fn test_validated_types() {
        // Valid title
        let title = TaskTitle::try_new("Fix the bug").unwrap();
        assert_eq!(title.as_ref(), "Fix the bug");

        // Empty title should fail
        assert!(TaskTitle::try_new("").is_err());

        // Whitespace is trimmed
        let title = TaskTitle::try_new("  Trimmed  ").unwrap();
        assert_eq!(title.as_ref(), "Trimmed");
    }

    #[test]
    fn test_task_id_generation() {
        let id1 = TaskId::new();
        let id2 = TaskId::new();

        // IDs should be unique
        assert_ne!(id1, id2);

        // IDs should be sortable by creation time (UUIDv7 property)
        assert!(id1.0 < id2.0);
    }
}
Run the tests:
cargo test
Environment Setup for PostgreSQL (Optional)
If you want to use PostgreSQL instead of the in-memory store:
- Start PostgreSQL with Docker:
docker run -d \
--name eventcore-postgres \
-e POSTGRES_PASSWORD=password \
-e POSTGRES_DB=taskmaster \
-p 5432:5432 \
postgres:17
- Update Cargo.toml:
[dependencies]
eventcore-postgres = "0.1"
sqlx = { version = "0.8", features = ["runtime-tokio-rustls", "postgres"] }
- Set environment variable:
export DATABASE_URL="postgres://postgres:password@localhost/taskmaster"
Summary
We’ve set up:
- ✅ A new Rust project with EventCore dependencies
- ✅ Domain types with validation using nutype
- ✅ Event definitions for our task system
- ✅ Basic project structure
- ✅ Test infrastructure
Next, we’ll model our domain using event modeling techniques →
Chapter 2.2: Modeling the Domain
Now that our project is set up, let’s use event modeling to design our task management system. We’ll discover the events, commands, and read models that make up our domain.
Step 1: Brainstorm the Events
What happens in a task management system? Let’s think through a typical workflow:
Events (Orange - things that happened):
- Task Created
- Task Assigned
- Task Started
- Comment Added
- Task Completed
- Task Reopened
- Priority Changed
- Due Date Set
- Task Cancelled
Step 2: Build the Timeline
Let’s visualize how these events flow through time:
Timeline →
Task Created ──┬── Task Assigned ──┬── Comment Added ──┬── Task Completed
│ │ │
User: Alice │ User: Bob │ User: Bob │ User: Bob
Title: "Fix login bug" │ Assignee: Bob │ "Found issue" │
│ │ │
Stream: task-123 │ Streams: │ Stream: │ Streams:
│ - task-123 │ - task-123 │ - task-123
│ - user-bob │ │ - user-bob
Notice how some operations involve multiple streams - this is where EventCore shines!
Step 3: Identify Commands
For each event, what user action triggered it?
| Command (Blue) | → | Events (Orange) | Streams Involved |
|---|---|---|---|
| Create Task | → | Task Created | task |
| Assign Task | → | Task Assigned | task, assignee |
| Start Task | → | Task Started | task, user |
| Add Comment | → | Comment Added | task |
| Complete Task | → | Task Completed | task, user |
| Reopen Task | → | Task Reopened | task, user |
| Change Priority | → | Priority Changed | task |
| Cancel Task | → | Task Cancelled | task, user |
Step 4: Design Read Models
What questions do users need answered?
| Question | Read Model (Green) | Updated By Events |
|---|---|---|
| “What are my tasks?” | User Task List | Assigned, Completed, Cancelled |
| “What’s the task status?” | Task Details | All task events |
| “What’s the team workload?” | Team Dashboard | Created, Assigned, Completed |
| “What happened to this task?” | Task History | All events (audit log) |
Step 5: Discover Business Rules
As we model, we discover rules that our commands must enforce:
1. Task Creation
   - Title is required and non-empty
   - Description has a reasonable length limit
   - Creator must be identified
2. Task Assignment
   - Can’t assign to a non-existent user
   - Should track assignment history
   - Unassigning is an explicit action
3. Task Completion
   - Only the assigned user (or an admin) can complete
   - Can’t complete cancelled tasks
   - Completion can be undone (reopen)
4. Comments
   - Must have content
   - Track author and timestamp
   - Comments are immutable
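Rules like these become guard clauses in a command's handle method. A sketch of the completion rules as a plain function — the state struct and `can_complete` helper are illustrative; in EventCore you would express the same checks with require! inside handle:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TaskStatus {
    Open,
    Completed,
    Cancelled,
}

struct TaskState {
    status: TaskStatus,
    assignee: Option<String>,
}

// Completion rules discovered in Step 5: only the assignee (or an admin)
// may complete, and cancelled tasks cannot be completed.
fn can_complete(state: &TaskState, user: &str, is_admin: bool) -> Result<(), String> {
    if state.status == TaskStatus::Cancelled {
        return Err("Cannot complete a cancelled task".to_string());
    }
    match &state.assignee {
        Some(a) if a == user || is_admin => Ok(()),
        Some(_) => Err("Only the assigned user can complete this task".to_string()),
        None => Err("Task is not assigned".to_string()),
    }
}

fn main() {
    let state = TaskState { status: TaskStatus::Open, assignee: Some("bob".to_string()) };
    assert!(can_complete(&state, "bob", false).is_ok());
    assert!(can_complete(&state, "alice", false).is_err());
    assert!(can_complete(&state, "alice", true).is_ok()); // admin override
}
```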
Translating to EventCore
Events Stay Close to Our Model
Our discovered events map directly to code:
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum TaskEvent {
    Created {
        task_id: TaskId,
        title: TaskTitle,
        description: TaskDescription,
        creator: UserName,
        created_at: DateTime<Utc>,
    },
    Assigned {
        task_id: TaskId,
        assignee: UserName,
        assigned_by: UserName,
        assigned_at: DateTime<Utc>,
    },
    // ... other events
}
Commands Declare Their Streams
Multi-stream operations are explicit:
#[derive(Command, Clone)]
struct AssignTask {
    #[stream]
    task_id: StreamId, // The task stream
    #[stream]
    user_id: StreamId, // The assignee's stream
    assigned_by: UserName,
}
This command will:
- Read both streams atomically
- Validate the assignment
- Write events to both streams
- All in one transaction!
State Models for Each Command
Each command needs different state views:
// State for task operations
#[derive(Default)]
struct TaskState {
    exists: bool,
    title: String,
    status: TaskStatus,
    assignee: Option<UserName>,
    creator: UserName,
}

// State for user operations
#[derive(Default)]
struct UserTasksState {
    user_name: UserName,
    assigned_tasks: Vec<TaskId>,
    completed_count: u32,
}
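Command state is rebuilt by folding the stream's events into the state struct before handle runs. A sketch of that fold for a task, with simplified event and state types — EventCore's actual trait method signatures may differ:

```rust
#[derive(Default)]
struct TaskState {
    exists: bool,
    title: String,
    assignee: Option<String>,
}

enum TaskEvent {
    Created { title: String },
    Assigned { assignee: String },
    Unassigned,
}

impl TaskState {
    // Each event moves the state forward; replaying every event in the
    // stream, in order, reconstructs the current state from scratch.
    fn apply(&mut self, event: &TaskEvent) {
        match event {
            TaskEvent::Created { title } => {
                self.exists = true;
                self.title = title.clone();
            }
            TaskEvent::Assigned { assignee } => self.assignee = Some(assignee.clone()),
            TaskEvent::Unassigned => self.assignee = None,
        }
    }
}

fn main() {
    let events = [
        TaskEvent::Created { title: "Fix login bug".to_string() },
        TaskEvent::Assigned { assignee: "alice".to_string() },
        TaskEvent::Assigned { assignee: "bob".to_string() },
    ];
    let mut state = TaskState::default();
    for e in &events {
        state.apply(e);
    }
    assert!(state.exists);
    assert_eq!(state.assignee.as_deref(), Some("bob")); // last assignment wins
}
```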
Modeling Complex Scenarios
Scenario: Task Handoff
When reassigning a task from Alice to Bob:
Timeline →
Task Assigned Task Unassigned Task Assigned
(to: Alice) (from: Alice) (to: Bob)
│ │ │
├────────────────────┴───────────────────┤
│ │
Streams affected: Streams affected:
- task-123 - task-123
- user-alice - user-alice
- user-bob
In EventCore, we can model this as one atomic operation:
```rust
#[derive(Command, Clone)]
struct ReassignTask {
    #[stream]
    task_id: StreamId,
    #[stream]
    from_user: StreamId,
    #[stream]
    to_user: StreamId,
    reassigned_by: UserName,
}
```
Scenario: Bulk Operations
Assigning multiple tasks to a user:
```rust
#[derive(Command, Clone)]
struct BulkAssignTasks {
    #[stream]
    user_id: StreamId,
    #[stream("tasks")] // multiple task streams
    task_ids: Vec<StreamId>,
    assigned_by: UserName,
}
```
The beauty of EventCore: this remains atomic across ALL streams!
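Stripped of EventCore's types, the bulk pattern is "build one write per declared stream, then return them all as a single batch". A minimal sketch with plain strings standing in for `StreamId` and the event type:

```rust
/// Build the (stream, event) pairs a bulk assignment would emit.
/// In EventCore, all of them would be committed in one atomic write.
fn bulk_assign_writes(user: &str, task_streams: &[&str]) -> Vec<(String, String)> {
    let mut writes = Vec::new();
    for task in task_streams {
        // One assignment event on each task stream...
        writes.push((task.to_string(), format!("Assigned to {user}")));
    }
    // ...plus one summary event on the user's own stream.
    writes.push((
        format!("user-{user}"),
        format!("{} tasks assigned", task_streams.len()),
    ));
    writes
}

fn main() {
    let writes = bulk_assign_writes("bob", &["task-1", "task-2", "task-3"]);
    assert_eq!(writes.len(), 4);
    assert_eq!(writes[3].0, "user-bob");
}
```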
Visual Domain Model
Here’s our complete domain model:
```text
┌─────────────────────────────────────────────────────────────┐
│                          COMMANDS                           │
├─────────────────────────────────────────────────────────────┤
│  CreateTask │ AssignTask │ CompleteTask │ AddComment │ ...  │
└─────────────┬───────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────┐
│                           EVENTS                            │
├─────────────────────────────────────────────────────────────┤
│ TaskCreated │ TaskAssigned │ TaskCompleted │ CommentAdded   │
└─────────────┬───────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────┐
│                         READ MODELS                         │
├─────────────────────────────────────────────────────────────┤
│ UserTaskList │ TaskDetails │ TeamDashboard │ ActivityFeed   │
└─────────────────────────────────────────────────────────────┘
```
Key Insights from Modeling
1. **Multi-stream operations are common**
   - Task assignment affects task AND user streams
   - Completion updates task AND user statistics
   - EventCore handles this naturally
2. **Events are business facts**
   - "TaskAssigned", not "UpdateTask"
   - Events capture intent and context
   - Rich events enable better projections
3. **Commands match user intent**
   - "AssignTask", not "UpdateTaskAssignee"
   - Commands are what users want to do
   - A natural API emerges from modeling
4. **Read models serve specific needs**
   - UserTaskList for the "my tasks" view
   - TeamDashboard for a manager overview
   - Different projections from the same events
Refining Our Event Model
Based on our modeling, let's update `src/domain/events.rs`:
```rust
use super::types::*;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use uuid::Uuid;

/// Events that can occur in our task management system
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum TaskEvent {
    // Task lifecycle events
    Created {
        task_id: TaskId,
        title: TaskTitle,
        description: TaskDescription,
        creator: UserName,
        created_at: DateTime<Utc>,
    },
    // Assignment events - note these affect multiple streams
    Assigned {
        task_id: TaskId,
        assignee: UserName,
        assigned_by: UserName,
        assigned_at: DateTime<Utc>,
    },
    Unassigned {
        task_id: TaskId,
        previous_assignee: UserName,
        unassigned_by: UserName,
        unassigned_at: DateTime<Utc>,
    },
    // Work events
    Started {
        task_id: TaskId,
        started_by: UserName,
        started_at: DateTime<Utc>,
    },
    Completed {
        task_id: TaskId,
        completed_by: UserName,
        completed_at: DateTime<Utc>,
    },
    // Collaboration events
    CommentAdded {
        task_id: TaskId,
        comment_id: Uuid,
        comment: CommentText,
        author: UserName,
        commented_at: DateTime<Utc>,
    },
    // Management events
    PriorityChanged {
        task_id: TaskId,
        old_priority: Priority,
        new_priority: Priority,
        changed_by: UserName,
        changed_at: DateTime<Utc>,
    },
    DueDateSet {
        task_id: TaskId,
        due_date: DateTime<Utc>,
        set_by: UserName,
        set_at: DateTime<Utc>,
    },
}

/// Events specific to user streams
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum UserEvent {
    /// Track when a user is assigned a task
    TaskAssigned {
        user_name: UserName,
        task_id: TaskId,
        assigned_at: DateTime<Utc>,
    },
    /// Track when a user completes a task
    TaskCompleted {
        user_name: UserName,
        task_id: TaskId,
        completed_at: DateTime<Utc>,
    },
    /// Track workload changes
    WorkloadUpdated {
        user_name: UserName,
        active_tasks: u32,
        completed_today: u32,
    },
}

/// Combined event type for our system
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(tag = "event_type", rename_all = "snake_case")]
pub enum SystemEvent {
    Task(TaskEvent),
    User(UserEvent),
}

// Required conversion for EventCore
impl TryFrom<&SystemEvent> for SystemEvent {
    type Error = std::convert::Infallible;

    fn try_from(value: &SystemEvent) -> Result<Self, Self::Error> {
        Ok(value.clone())
    }
}
```
Summary
Through event modeling, we’ve discovered:
- Our Events: Business facts that capture what happened
- Our Commands: User intentions that trigger events
- Our Read Models: Views that answer user questions
- Our Streams: How data is organized (tasks, users)
The key insight: by modeling events first, the rest of the system design follows naturally. EventCore’s multi-stream capabilities mean we can implement our model exactly as designed, without compromise.
Next, let’s implement our commands using EventCore’s macro system →
Chapter 2.3: Implementing Commands
Now we’ll implement the commands we discovered during domain modeling. EventCore’s macro system makes this straightforward while maintaining type safety.
Command Structure
Every EventCore command follows this pattern:
1. **Derive the `Command` macro** - generates the boilerplate
2. **Declare streams with `#[stream]`** - define which streams you need
3. **Implement `CommandLogic`** - write your business logic
4. **Generate events** - describe what happened as a result
Our First Command: Create Task
Let’s implement task creation:
`src/domain/commands/create_task.rs`:
```rust
use crate::domain::{events::*, types::*};
use async_trait::async_trait;
use chrono::Utc;
use eventcore::{prelude::*, CommandLogic, ReadStreams, StreamResolver, StreamWrite};
use eventcore_macros::Command;

/// Command to create a new task
#[derive(Command, Clone)]
pub struct CreateTask {
    /// The task stream - will contain all task events
    #[stream]
    pub task_id: StreamId,
    /// Task details
    pub title: TaskTitle,
    pub description: TaskDescription,
    pub creator: UserName,
    pub priority: Priority,
}

impl CreateTask {
    /// Smart constructor ensures a valid StreamId
    pub fn new(
        task_id: TaskId,
        title: TaskTitle,
        description: TaskDescription,
        creator: UserName,
    ) -> Result<Self, CommandError> {
        // `from_static` only accepts string literals; dynamic ids go
        // through `try_new`. The generated id is always well-formed.
        let task_id = StreamId::try_new(format!("task-{task_id}"))
            .expect("generated task stream id is valid");
        Ok(Self {
            task_id,
            title,
            description,
            creator,
            priority: Priority::default(),
        })
    }
}

/// State for the create-task command - tracks whether the task exists
#[derive(Default)]
pub struct CreateTaskState {
    exists: bool,
}

#[async_trait]
impl CommandLogic for CreateTask {
    type State = CreateTaskState;
    type Event = TaskEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        match &event.payload {
            TaskEvent::Created { .. } => state.exists = true,
            _ => {} // other events don't affect creation
        }
    }

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Business rule: can't create a task that already exists
        require!(!state.exists, "Task {} already exists", self.task_id);

        // Generate the TaskCreated event
        let event = TaskEvent::Created {
            task_id: TaskId::from(&self.task_id),
            title: self.title.clone(),
            description: self.description.clone(),
            creator: self.creator.clone(),
            created_at: Utc::now(),
        };

        // Write to the task stream
        Ok(vec![StreamWrite::new(&read_streams, self.task_id.clone(), event)?])
    }
}
```
Key Points
1. **`#[derive(Command)]`** generates:
   - the `StreamSet` phantom type
   - an implementation of the `CommandStreams` trait
   - the `read_streams()` method
2. **`#[stream]`** attributes declare which streams this command needs
3. **`apply()`** reconstructs state from events
4. **`handle()`** contains your business logic
5. **`require!`** provides clean validation with good error messages
6. **`StreamWrite::new()`** ensures type-safe writes to declared streams
Multi-Stream Command: Assign Task
Task assignment affects both the task and the user:
`src/domain/commands/assign_task.rs`:
```rust
use crate::domain::{events::*, types::*};
use async_trait::async_trait;
use chrono::Utc;
use eventcore::{prelude::*, CommandLogic, ReadStreams, StreamResolver, StreamWrite};
use eventcore_macros::Command;

/// Command to assign a task to a user.
/// This is a multi-stream command affecting both task and user streams.
#[derive(Command, Clone)]
pub struct AssignTask {
    #[stream]
    pub task_id: StreamId,
    #[stream]
    pub assignee_id: StreamId,
    pub assigned_by: UserName,
}

impl AssignTask {
    pub fn new(
        task_id: TaskId,
        assignee: UserName,
        assigned_by: UserName,
    ) -> Result<Self, CommandError> {
        // Dynamic ids need `try_new`; `from_static` is only for literals.
        Ok(Self {
            task_id: StreamId::try_new(format!("task-{task_id}"))
                .expect("generated task stream id is valid"),
            assignee_id: StreamId::try_new(format!("user-{assignee}"))
                .expect("generated user stream id is valid"),
            assigned_by,
        })
    }
}

/// State that combines task and user information
#[derive(Default)]
pub struct AssignTaskState {
    // Task state
    task_exists: bool,
    task_title: String,
    current_assignee: Option<UserName>,
    task_status: TaskStatus,
    // User state
    user_exists: bool,
    active_task_count: u32,
}

#[async_trait]
impl CommandLogic for AssignTask {
    type State = AssignTaskState;
    type Event = SystemEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        // Apply events from both streams
        match &event.payload {
            SystemEvent::Task(task_event) => match task_event {
                TaskEvent::Created { title, .. } => {
                    state.task_exists = true;
                    state.task_title = title.to_string();
                }
                TaskEvent::Assigned { assignee, .. } => {
                    state.current_assignee = Some(assignee.clone());
                }
                TaskEvent::Unassigned { .. } => state.current_assignee = None,
                TaskEvent::Completed { .. } => state.task_status = TaskStatus::Completed,
                _ => {}
            },
            SystemEvent::User(user_event) => match user_event {
                UserEvent::TaskAssigned { .. } => {
                    state.user_exists = true;
                    state.active_task_count += 1;
                }
                UserEvent::TaskCompleted { .. } => {
                    state.active_task_count = state.active_task_count.saturating_sub(1);
                }
                _ => {}
            },
        }
    }

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Validate business rules
        require!(state.task_exists, "Cannot assign non-existent task");
        require!(state.task_status != TaskStatus::Completed, "Cannot assign completed task");
        require!(state.task_status != TaskStatus::Cancelled, "Cannot assign cancelled task");

        let now = Utc::now();
        let task_id = TaskId::from(&self.task_id);
        let assignee = UserName::from(&self.assignee_id);

        // Reject a no-op reassignment to the current assignee
        if let Some(current) = &state.current_assignee {
            require!(current != &assignee, "Task is already assigned to this user");
        }

        let mut events = Vec::new();

        // If the task is currently assigned, unassign it first
        if let Some(previous_assignee) = state.current_assignee {
            events.push(StreamWrite::new(
                &read_streams,
                self.task_id.clone(),
                SystemEvent::Task(TaskEvent::Unassigned {
                    task_id,
                    previous_assignee,
                    unassigned_by: self.assigned_by.clone(),
                    unassigned_at: now,
                }),
            )?);
        }

        // Write the assignment event to the task stream
        events.push(StreamWrite::new(
            &read_streams,
            self.task_id.clone(),
            SystemEvent::Task(TaskEvent::Assigned {
                task_id,
                assignee: assignee.clone(),
                assigned_by: self.assigned_by.clone(),
                assigned_at: now,
            }),
        )?);

        // Write the assignment event to the user stream
        events.push(StreamWrite::new(
            &read_streams,
            self.assignee_id.clone(),
            SystemEvent::User(UserEvent::TaskAssigned {
                user_name: assignee.clone(),
                task_id,
                assigned_at: now,
            }),
        )?);

        // Update the user's workload
        events.push(StreamWrite::new(
            &read_streams,
            self.assignee_id.clone(),
            SystemEvent::User(UserEvent::WorkloadUpdated {
                user_name: assignee,
                active_tasks: state.active_task_count + 1,
                completed_today: 0, // would be calculated from state
            }),
        )?);

        Ok(events)
    }
}
```
Multi-Stream Benefits
- Atomic Updates: Both task and user streams update together
- Consistent State: No partial updates possible
- Rich Events: Each stream gets relevant events
- Type Safety: Can only write to declared streams
Command with Business Logic: Complete Task
`src/domain/commands/complete_task.rs`:
```rust
use crate::domain::{events::*, types::*};
use async_trait::async_trait;
use chrono::Utc;
use eventcore::{prelude::*, CommandLogic, ReadStreams, StreamResolver, StreamWrite};
use eventcore_macros::Command;

/// Command to complete a task
#[derive(Command, Clone)]
pub struct CompleteTask {
    #[stream]
    pub task_id: StreamId,
    #[stream]
    pub user_id: StreamId,
    pub completed_by: UserName,
}

#[derive(Default)]
pub struct CompleteTaskState {
    task_exists: bool,
    task_status: TaskStatus,
    assignee: Option<UserName>,
    user_name: Option<UserName>,
    completed_count: u32,
}

#[async_trait]
impl CommandLogic for CompleteTask {
    type State = CompleteTaskState;
    type Event = SystemEvent;

    fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) {
        match &event.payload {
            SystemEvent::Task(task_event) => match task_event {
                TaskEvent::Created { .. } => {
                    state.task_exists = true;
                    state.task_status = TaskStatus::Open;
                }
                TaskEvent::Assigned { assignee, .. } => {
                    state.assignee = Some(assignee.clone());
                }
                TaskEvent::Started { .. } => state.task_status = TaskStatus::InProgress,
                TaskEvent::Completed { .. } => state.task_status = TaskStatus::Completed,
                _ => {}
            },
            SystemEvent::User(user_event) => match user_event {
                UserEvent::TaskCompleted { user_name, .. } => {
                    state.user_name = Some(user_name.clone());
                    state.completed_count += 1;
                }
                _ => {}
            },
        }
    }

    async fn handle(
        &self,
        read_streams: ReadStreams<Self::StreamSet>,
        state: Self::State,
        _stream_resolver: &mut StreamResolver,
    ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Business rules
        require!(state.task_exists, "Cannot complete non-existent task");
        require!(state.task_status != TaskStatus::Completed, "Task is already completed");
        require!(state.task_status != TaskStatus::Cancelled, "Cannot complete cancelled task");

        // Only the assigned user (or an admin) can complete the task
        if let Some(assignee) = &state.assignee {
            require!(
                assignee == &self.completed_by || self.completed_by.as_ref() == "admin",
                "Only assigned user or admin can complete task"
            );
        }

        let now = Utc::now();
        let task_id = TaskId::from(&self.task_id);

        Ok(vec![
            // Mark the task as completed
            StreamWrite::new(
                &read_streams,
                self.task_id.clone(),
                SystemEvent::Task(TaskEvent::Completed {
                    task_id,
                    completed_by: self.completed_by.clone(),
                    completed_at: now,
                }),
            )?,
            // Update the user's completion stats
            StreamWrite::new(
                &read_streams,
                self.user_id.clone(),
                SystemEvent::User(UserEvent::TaskCompleted {
                    user_name: self.completed_by.clone(),
                    task_id,
                    completed_at: now,
                }),
            )?,
        ])
    }
}
```
Helper Functions
Add these to `src/domain/types.rs`:
```rust
use eventcore::StreamId;
use uuid::Uuid;

impl From<&StreamId> for TaskId {
    fn from(stream_id: &StreamId) -> Self {
        // Extract the TaskId from a stream id like "task-{uuid}"
        let id_str = stream_id
            .as_ref()
            .strip_prefix("task-")
            .unwrap_or(stream_id.as_ref());
        TaskId(Uuid::parse_str(id_str).unwrap_or_else(|_| Uuid::nil()))
    }
}

impl From<&StreamId> for UserName {
    fn from(stream_id: &StreamId) -> Self {
        // Extract the UserName from a stream id like "user-{name}"
        let name = stream_id
            .as_ref()
            .strip_prefix("user-")
            .unwrap_or(stream_id.as_ref());
        UserName::try_new(name)
            .unwrap_or_else(|_| UserName::try_new("unknown").unwrap())
    }
}
```
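The prefix-stripping convention is easy to get wrong at the edges (ids that lack the expected prefix). A standalone sketch of the same logic on plain strings, showing the fallback behavior the helpers above rely on:

```rust
/// Strip a `"{kind}-"` prefix from a stream id, falling back to the
/// whole id when the prefix is absent (mirrors the helpers above).
fn entity_id<'a>(stream_id: &'a str, kind: &str) -> &'a str {
    let prefix = format!("{kind}-");
    stream_id.strip_prefix(&prefix).unwrap_or(stream_id)
}

fn main() {
    assert_eq!(entity_id("task-123", "task"), "123");
    assert_eq!(entity_id("user-alice", "user"), "alice");
    // No matching prefix: the full id is returned unchanged.
    assert_eq!(entity_id("orphan", "task"), "orphan");
}
```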
Testing Our Commands
Add to `src/main.rs`:
```rust
#[cfg(test)]
mod command_tests {
    use super::*;
    use crate::domain::commands::*;
    use crate::domain::types::*;
    use eventcore_memory::InMemoryEventStore;

    #[tokio::test]
    async fn test_create_task() {
        // Setup
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store);

        // Create the command
        let task_id = TaskId::new();
        let command = CreateTask::new(
            task_id,
            TaskTitle::try_new("Write tests").unwrap(),
            TaskDescription::try_new("Add unit tests").unwrap(),
            UserName::try_new("alice").unwrap(),
        )
        .unwrap();

        // Execute
        let result = executor.execute(&command).await.unwrap();

        // Verify
        assert_eq!(result.events_written.len(), 1);
        assert_eq!(result.streams_affected.len(), 1);

        // Creating the same task again should fail
        let result = executor.execute(&command).await;
        assert!(result.is_err());
    }

    #[tokio::test]
    async fn test_assign_task() {
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store);

        // First create a task
        let task_id = TaskId::new();
        let create = CreateTask::new(
            task_id,
            TaskTitle::try_new("Test task").unwrap(),
            TaskDescription::try_new("Description").unwrap(),
            UserName::try_new("alice").unwrap(),
        )
        .unwrap();
        executor.execute(&create).await.unwrap();

        // Now assign it
        let assign = AssignTask::new(
            task_id,
            UserName::try_new("bob").unwrap(),
            UserName::try_new("alice").unwrap(),
        )
        .unwrap();
        let result = executor.execute(&assign).await.unwrap();

        // Should write to both the task and user streams
        assert_eq!(result.events_written.len(), 3); // Assigned + TaskAssigned + WorkloadUpdated
        assert_eq!(result.streams_affected.len(), 2); // task and user streams
    }
}
```
Running the Demo
Update the demo in `src/main.rs`:
```rust
async fn run_demo<ES: EventStore>(
    executor: CommandExecutor<ES>,
) -> Result<(), Box<dyn std::error::Error>>
where
    ES::Event: From<SystemEvent> + TryInto<SystemEvent>,
{
    println!("🚀 EventCore Task Management Demo");
    println!("================================\n");

    // Create a task
    let task_id = TaskId::new();
    println!("1. Creating task {}...", task_id);
    let create = CreateTask::new(
        task_id,
        TaskTitle::try_new("Build awesome features").unwrap(),
        TaskDescription::try_new("Use EventCore to build great things").unwrap(),
        UserName::try_new("alice").unwrap(),
    )?;
    let result = executor.execute(&create).await?;
    println!("   ✅ Task created with {} event(s)\n", result.events_written.len());

    // Assign the task
    println!("2. Assigning task to Bob...");
    let assign = AssignTask::new(
        task_id,
        UserName::try_new("bob").unwrap(),
        UserName::try_new("alice").unwrap(),
    )?;
    let result = executor.execute(&assign).await?;
    println!("   ✅ Task assigned, {} stream(s) updated\n", result.streams_affected.len());

    // Complete the task
    println!("3. Bob completes the task...");
    let complete = CompleteTask {
        // Dynamic ids go through `try_new`; only literals may use `from_static`
        task_id: StreamId::try_new(format!("task-{task_id}"))
            .expect("generated task stream id is valid"),
        user_id: StreamId::from_static("user-bob"),
        completed_by: UserName::try_new("bob").unwrap(),
    };
    executor.execute(&complete).await?;
    println!("   ✅ Task completed!\n");

    println!("Demo complete! 🎉");
    Ok(())
}
```
Key Takeaways
- **Macro magic**: `#[derive(Command)]` eliminates boilerplate
- **Stream declaration**: `#[stream]` attributes declare what you need
- **Type safety**: you can only write to declared streams
- **Multi-stream**: natural support for operations across entities
- **Business logic**: clear separation in the `handle()` method
- **State building**: `apply()` reconstructs state from events
Common Patterns
Conditional Stream Access
Sometimes you need streams based on runtime data:
```rust
async fn handle(
    &self,
    read_streams: ReadStreams<Self::StreamSet>,
    state: Self::State,
    stream_resolver: &mut StreamResolver, // note: actually used here
) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
    // Discover that we need another stream
    if state.requires_manager_approval {
        let manager_stream = StreamId::from_static("user-manager");
        stream_resolver.add_streams(vec![manager_stream]);
        // EventCore will re-execute the command with the additional stream
    }

    // Continue with logic...
}
```
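Conceptually, the executor loops: read the declared streams, run `handle`, and if the command asked for more streams, re-read and run again with the larger set. A self-contained simulation of that retry loop (hypothetical names, not EventCore's internals):

```rust
use std::collections::BTreeSet;

/// Result of one `handle` attempt: either done, or "I also need these streams".
enum Outcome {
    Done(Vec<String>),        // events to write
    NeedStreams(Vec<String>), // additional streams discovered at runtime
}

/// Simulate the executor's re-execution loop for dynamic stream discovery.
fn execute<F>(mut streams: BTreeSet<String>, handle: F) -> Vec<String>
where
    F: Fn(&BTreeSet<String>) -> Outcome,
{
    loop {
        match handle(&streams) {
            Outcome::Done(events) => return events,
            Outcome::NeedStreams(extra) => {
                // Add the newly requested streams and run again with them in scope.
                streams.extend(extra);
            }
        }
    }
}

fn main() {
    let events = execute(BTreeSet::from(["task-1".to_string()]), |streams| {
        if !streams.contains("user-manager") {
            // First pass: discover that manager approval is required.
            Outcome::NeedStreams(vec!["user-manager".to_string()])
        } else {
            // Second pass: the manager stream is in scope, emit the event.
            Outcome::Done(vec!["ApprovalRequested".to_string()])
        }
    });
    assert_eq!(events, vec!["ApprovalRequested".to_string()]);
}
```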
Batch Operations
For operations on multiple items:
```rust
let mut events = Vec::new();
for task_id in &self.task_ids {
    events.push(StreamWrite::new(
        &read_streams,
        task_id.clone(),
        TaskEvent::BatchUpdated { /* ... */ },
    )?);
}
Ok(events)
```
Summary
We’ve implemented our core commands using EventCore’s macro system:
- ✅ Single-stream commands (CreateTask)
- ✅ Multi-stream commands (AssignTask)
- ✅ Complex business logic (CompleteTask)
- ✅ Type-safe stream access
- ✅ Comprehensive testing
Next, let’s build projections to query our data →
Chapter 2.4: Working with Projections
Projections transform your event streams into read models optimized for queries. This chapter shows how to build projections that answer specific questions about your data.
What Are Projections?
Projections are read-side views built from events. They:
- Listen to event streams
- Apply events to build state
- Optimize for specific queries
- Can be rebuilt from scratch
Think of projections as materialized views that are kept up-to-date by processing events.
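The core mechanic is once again a fold, this time across events from every stream. A self-contained sketch of a tiny "active tasks per user" read model (plain Rust stand-ins, not EventCore's `CqrsProjection` trait):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
enum Event {
    Assigned { task: String, user: String },
    Completed { task: String, user: String },
}

/// A toy read model: how many active tasks each user has.
#[derive(Default)]
struct ActiveTasks {
    counts: HashMap<String, i64>,
}

impl ActiveTasks {
    /// Apply one event; replaying the full log rebuilds the model from scratch.
    fn apply(&mut self, event: &Event) {
        match event {
            Event::Assigned { user, .. } => *self.counts.entry(user.clone()).or_insert(0) += 1,
            Event::Completed { user, .. } => *self.counts.entry(user.clone()).or_insert(0) -= 1,
        }
    }
}

fn main() {
    let log = vec![
        Event::Assigned { task: "t1".into(), user: "alice".into() },
        Event::Assigned { task: "t2".into(), user: "alice".into() },
        Event::Completed { task: "t1".into(), user: "alice".into() },
    ];
    let mut model = ActiveTasks::default();
    for event in &log {
        model.apply(event);
    }
    assert_eq!(model.counts["alice"], 1);
}
```

Rebuilding is the same operation started from an empty model, which is why projections can always be thrown away and regenerated.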
Our First Projection: User Task List
Let’s build a projection that answers: “What tasks does each user have?”
`src/projections/task_list.rs`:
```rust
use crate::domain::{events::*, types::*};
use async_trait::async_trait;
use chrono::{DateTime, Utc};
use eventcore::cqrs::{CqrsProjection, ProjectionError};
use eventcore::prelude::*;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;

/// A summary of a task for display
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TaskSummary {
    pub id: TaskId,
    pub title: String,
    pub status: TaskStatus,
    pub priority: Priority,
    pub assigned_at: DateTime<Utc>,
    pub completed_at: Option<DateTime<Utc>>,
}

/// Projection that maintains a task list for each user
#[derive(Default, Clone, Serialize, Deserialize)]
pub struct UserTaskListProjection {
    /// Tasks indexed by user
    tasks_by_user: HashMap<UserName, HashMap<TaskId, TaskSummary>>,
    /// Reverse index: task to user
    task_assignments: HashMap<TaskId, UserName>,
    /// Task details cache
    task_details: HashMap<TaskId, TaskDetails>,
}

#[derive(Clone, Serialize, Deserialize)]
struct TaskDetails {
    title: String,
    created_at: DateTime<Utc>,
    priority: Priority,
}

impl UserTaskListProjection {
    /// Get all tasks for a user, highest priority first
    pub fn get_user_tasks(&self, user: &UserName) -> Vec<TaskSummary> {
        self.tasks_by_user
            .get(user)
            .map(|tasks| {
                let mut list: Vec<_> = tasks.values().cloned().collect();
                // Sort by priority (high to low), then by assignment date
                list.sort_by(|a, b| {
                    b.priority
                        .cmp(&a.priority)
                        .then_with(|| a.assigned_at.cmp(&b.assigned_at))
                });
                list
            })
            .unwrap_or_default()
    }

    /// Get the active task count for a user
    pub fn get_active_task_count(&self, user: &UserName) -> usize {
        self.tasks_by_user
            .get(user)
            .map(|tasks| {
                tasks
                    .values()
                    .filter(|t| matches!(t.status, TaskStatus::Open | TaskStatus::InProgress))
                    .count()
            })
            .unwrap_or(0)
    }

    /// Get a task by id
    pub fn get_task(&self, task_id: &TaskId) -> Option<&TaskSummary> {
        self.task_assignments
            .get(task_id)
            .and_then(|user| self.tasks_by_user.get(user)?.get(task_id))
    }
}

#[async_trait]
impl CqrsProjection for UserTaskListProjection {
    type Event = SystemEvent;
    type Error = ProjectionError;

    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match &event.payload {
            SystemEvent::Task(task_event) => {
                self.apply_task_event(task_event, &event.occurred_at)?;
            }
            SystemEvent::User(_) => {
                // User events handled separately if needed
            }
        }
        Ok(())
    }

    fn name(&self) -> &str {
        "user_task_list"
    }
}

impl UserTaskListProjection {
    fn apply_task_event(
        &mut self,
        event: &TaskEvent,
        occurred_at: &DateTime<Utc>,
    ) -> Result<(), ProjectionError> {
        match event {
            TaskEvent::Created { task_id, title, .. } => {
                // Cache task details for later use
                self.task_details.insert(
                    *task_id,
                    TaskDetails {
                        title: title.to_string(),
                        created_at: *occurred_at,
                        priority: Priority::default(),
                    },
                );
            }
            TaskEvent::Assigned { task_id, assignee, assigned_at, .. } => {
                // Remove from the previous assignee, if any
                if let Some(previous_user) = self.task_assignments.get(task_id) {
                    if let Some(user_tasks) = self.tasks_by_user.get_mut(previous_user) {
                        user_tasks.remove(task_id);
                    }
                }

                // Add to the new assignee
                let task_details = self.task_details.get(task_id).ok_or_else(|| {
                    ProjectionError::InvalidState(format!("Task {} not found in cache", task_id))
                })?;
                let summary = TaskSummary {
                    id: *task_id,
                    title: task_details.title.clone(),
                    status: TaskStatus::Open,
                    priority: task_details.priority,
                    assigned_at: *assigned_at,
                    completed_at: None,
                };
                self.tasks_by_user
                    .entry(assignee.clone())
                    .or_default()
                    .insert(*task_id, summary);
                self.task_assignments.insert(*task_id, assignee.clone());
            }
            TaskEvent::Unassigned { task_id, previous_assignee, .. } => {
                // Remove from the assignee
                if let Some(user_tasks) = self.tasks_by_user.get_mut(previous_assignee) {
                    user_tasks.remove(task_id);
                }
                self.task_assignments.remove(task_id);
            }
            TaskEvent::Started { task_id, .. } => {
                // Update status
                if let Some(user) = self.task_assignments.get(task_id) {
                    if let Some(task) = self
                        .tasks_by_user
                        .get_mut(user)
                        .and_then(|tasks| tasks.get_mut(task_id))
                    {
                        task.status = TaskStatus::InProgress;
                    }
                }
            }
            TaskEvent::Completed { task_id, completed_at, .. } => {
                // Update status and completion time
                if let Some(user) = self.task_assignments.get(task_id) {
                    if let Some(task) = self
                        .tasks_by_user
                        .get_mut(user)
                        .and_then(|tasks| tasks.get_mut(task_id))
                    {
                        task.status = TaskStatus::Completed;
                        task.completed_at = Some(*completed_at);
                    }
                }
            }
            TaskEvent::PriorityChanged { task_id, new_priority, .. } => {
                // Update priority in both the cache and the summary
                if let Some(details) = self.task_details.get_mut(task_id) {
                    details.priority = *new_priority;
                }
                if let Some(user) = self.task_assignments.get(task_id) {
                    if let Some(task) = self
                        .tasks_by_user
                        .get_mut(user)
                        .and_then(|tasks| tasks.get_mut(task_id))
                    {
                        task.priority = *new_priority;
                    }
                }
            }
            _ => {} // handle other events as needed
        }
        Ok(())
    }
}
```
Statistics Projection
Let’s build another projection for team statistics:
`src/projections/statistics.rs`:
```rust
use crate::domain::{events::*, types::*};
use async_trait::async_trait;
use eventcore::cqrs::{CqrsProjection, ProjectionError};
use eventcore::prelude::*;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;

/// Team statistics projection
#[derive(Default, Clone, Serialize, Deserialize)]
pub struct TeamStatisticsProjection {
    /// Total tasks created
    pub total_tasks_created: u64,
    /// Tasks by status
    pub tasks_by_status: HashMap<TaskStatus, u64>,
    /// Tasks by priority
    pub tasks_by_priority: HashMap<Priority, u64>,
    /// Per-user statistics
    pub user_stats: HashMap<UserName, UserStatistics>,
    /// Daily completion counts, keyed by "YYYY-MM-DD"
    pub daily_completions: HashMap<String, u64>,
    /// Average completion time in hours
    pub avg_completion_hours: f64,
    /// Completion times used for the average
    completion_times: Vec<f64>,
}

#[derive(Default, Clone, Serialize, Deserialize)]
pub struct UserStatistics {
    pub tasks_assigned: u64,
    pub tasks_completed: u64,
    pub tasks_in_progress: u64,
    pub total_comments: u64,
    pub avg_completion_hours: f64,
    completion_times: Vec<f64>,
}

impl TeamStatisticsProjection {
    /// Completion rate as a percentage
    pub fn completion_rate(&self) -> f64 {
        if self.total_tasks_created == 0 {
            return 0.0;
        }
        let completed = self
            .tasks_by_status
            .get(&TaskStatus::Completed)
            .copied()
            .unwrap_or(0);
        (completed as f64 / self.total_tasks_created as f64) * 100.0
    }

    /// Most productive user by completed tasks
    pub fn most_productive_user(&self) -> Option<(&UserName, u64)> {
        self.user_stats
            .iter()
            .max_by_key(|(_, stats)| stats.tasks_completed)
            .map(|(user, stats)| (user, stats.tasks_completed))
    }

    /// Workload distribution as (user, percentage of all active tasks)
    pub fn workload_distribution(&self) -> Vec<(UserName, f64)> {
        let total_active: u64 = self.user_stats.values().map(|s| s.tasks_in_progress).sum();
        if total_active == 0 {
            return vec![];
        }
        self.user_stats
            .iter()
            .filter(|(_, stats)| stats.tasks_in_progress > 0)
            .map(|(user, stats)| {
                let percentage =
                    (stats.tasks_in_progress as f64 / total_active as f64) * 100.0;
                (user.clone(), percentage)
            })
            .collect()
    }
}

#[async_trait]
impl CqrsProjection for TeamStatisticsProjection {
    type Event = SystemEvent;
    type Error = ProjectionError;

    async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> {
        match &event.payload {
            SystemEvent::Task(task_event) => self.apply_task_event(task_event)?,
            SystemEvent::User(user_event) => self.apply_user_event(user_event)?,
        }
        Ok(())
    }

    fn name(&self) -> &str {
        "team_statistics"
    }
}

impl TeamStatisticsProjection {
    fn apply_task_event(&mut self, event: &TaskEvent) -> Result<(), ProjectionError> {
        match event {
            TaskEvent::Created { .. } => {
                self.total_tasks_created += 1;
                *self.tasks_by_status.entry(TaskStatus::Open).or_insert(0) += 1;
                *self.tasks_by_priority.entry(Priority::default()).or_insert(0) += 1;
            }
            TaskEvent::Assigned { assignee, .. } => {
                let stats = self.user_stats.entry(assignee.clone()).or_default();
                stats.tasks_assigned += 1;
                stats.tasks_in_progress += 1;
            }
            TaskEvent::Completed { completed_by, completed_at, .. } => {
                // Update status counts (decrement via `get_mut`; reading and
                // calling `entry` on the same map in one expression won't borrow-check)
                if let Some(open) = self.tasks_by_status.get_mut(&TaskStatus::Open) {
                    *open = open.saturating_sub(1);
                }
                *self.tasks_by_status.entry(TaskStatus::Completed).or_insert(0) += 1;

                // Update user stats
                let stats = self.user_stats.entry(completed_by.clone()).or_default();
                stats.tasks_completed += 1;
                stats.tasks_in_progress = stats.tasks_in_progress.saturating_sub(1);

                // Track daily completions
                let date_key = completed_at.format("%Y-%m-%d").to_string();
                *self.daily_completions.entry(date_key).or_insert(0) += 1;

                // Completion time would be derived from the creation time;
                // a placeholder is used for the demo
                let completion_hours = 24.0;
                self.completion_times.push(completion_hours);
                stats.completion_times.push(completion_hours);

                // Update averages
                self.avg_completion_hours = self.completion_times.iter().sum::<f64>()
                    / self.completion_times.len() as f64;
                stats.avg_completion_hours = stats.completion_times.iter().sum::<f64>()
                    / stats.completion_times.len() as f64;
            }
            TaskEvent::CommentAdded { author, .. } => {
                let stats = self.user_stats.entry(author.clone()).or_default();
                stats.total_comments += 1;
            }
            TaskEvent::PriorityChanged { old_priority, new_priority, .. } => {
                if let Some(count) = self.tasks_by_priority.get_mut(old_priority) {
                    *count = count.saturating_sub(1);
                }
                *self.tasks_by_priority.entry(*new_priority).or_insert(0) += 1;
            }
            _ => {}
        }
        Ok(())
    }

    fn apply_user_event(&mut self, _event: &UserEvent) -> Result<(), ProjectionError> {
        // Handle user-specific events if needed
        Ok(())
    }
}
```
Running Projections
EventCore provides infrastructure for running projections:
Setting Up Projection Runner
```rust
use eventcore::cqrs::{
    CqrsProjectionRunner, InMemoryCheckpointStore, InMemoryReadModelStore, ProjectionRunnerConfig,
};
use eventcore::prelude::*;
use eventcore_memory::InMemoryEventStore;

async fn setup_projections() -> Result<(), Box<dyn std::error::Error>> {
    // Event store
    let event_store = InMemoryEventStore::<SystemEvent>::new();

    // Projection infrastructure
    let checkpoint_store = InMemoryCheckpointStore::new();
    let read_model_store = InMemoryReadModelStore::new();

    // Create the projection
    let mut task_list_projection = UserTaskListProjection::default();

    // Configure the runner
    let config = ProjectionRunnerConfig::default()
        .with_batch_size(100)
        .with_checkpoint_frequency(50);

    // Create and start the runner
    let runner = CqrsProjectionRunner::new(
        event_store.clone(),
        checkpoint_store,
        read_model_store.clone(),
        config,
    );

    // Run the projection
    runner.run_projection(&mut task_list_projection).await?;

    // Query the projection
    let alice_tasks =
        task_list_projection.get_user_tasks(&UserName::try_new("alice").unwrap());
    println!("Alice has {} tasks", alice_tasks.len());

    Ok(())
}
```
Querying Projections
EventCore ships a query builder for complex queries, but projections also expose plain Rust methods, so simple questions are answered with ordinary iterator combinators:

```rust
async fn query_tasks(
    projection: &UserTaskListProjection,
) -> Result<(), Box<dyn std::error::Error>> {
    let alice = UserName::try_new("alice").unwrap();

    // Get all of Alice's tasks
    let all_tasks = projection.get_user_tasks(&alice);

    // Filter high-priority tasks
    let high_priority: Vec<_> = all_tasks
        .iter()
        .filter(|t| t.priority == Priority::High)
        .collect();

    // Active tasks only
    let active_tasks: Vec<_> = all_tasks
        .iter()
        .filter(|t| matches!(t.status, TaskStatus::Open | TaskStatus::InProgress))
        .collect();

    println!("Alice's tasks:");
    println!("- Total: {}", all_tasks.len());
    println!("- High priority: {}", high_priority.len());
    println!("- Active: {}", active_tasks.len());

    Ok(())
}
```
Real-time Updates
Projections can be updated in real-time as events are written:
#![allow(unused)] fn main() { use tokio::sync::RwLock; use std::sync::Arc; struct ProjectionService { projection: Arc<RwLock<UserTaskListProjection>>, event_store: Arc<dyn EventStore>, } impl ProjectionService { async fn start_real_time_updates(self) { let mut last_position = EventId::default(); loop { // Poll for new events let events = self.event_store .read_all_events(ReadOptions::default().after(last_position)) .await .unwrap_or_default(); if !events.is_empty() { let mut projection = self.projection.write().await; for event in &events { if let Err(e) = projection.apply(event).await { eprintln!("Projection error: {}", e); } last_position = event.id.clone(); } } // Sleep before next poll tokio::time::sleep(tokio::time::Duration::from_millis(100)).await; } } } }
Rebuilding Projections
One of the powerful features of event sourcing is the ability to rebuild projections:
#![allow(unused)] fn main() { use eventcore::cqrs::{RebuildCoordinator, RebuildStrategy}; async fn rebuild_projection( event_store: Arc<dyn EventStore>, projection: &mut UserTaskListProjection, ) -> Result<(), Box<dyn std::error::Error>> { let coordinator = RebuildCoordinator::new(event_store); // Clear existing state *projection = UserTaskListProjection::default(); // Rebuild from beginning let strategy = RebuildStrategy::FromBeginning; coordinator.rebuild(projection, strategy).await?; println!("Projection rebuilt successfully"); Ok(()) } }
Testing Projections
Testing projections is straightforward:
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; use eventcore::testing::prelude::*; #[tokio::test] async fn test_user_task_list_projection() { let mut projection = UserTaskListProjection::default(); // Create test events let task_id = TaskId::new(); let alice = UserName::try_new("alice").unwrap(); // Apply created event let created_event = create_test_event( StreamId::from_static("task-123"), SystemEvent::Task(TaskEvent::Created { task_id, title: TaskTitle::try_new("Test").unwrap(), description: TaskDescription::try_new("").unwrap(), creator: alice.clone(), created_at: Utc::now(), }) ); projection.apply(&created_event).await.unwrap(); // Apply assigned event let assigned_event = create_test_event( StreamId::from_static("task-123"), SystemEvent::Task(TaskEvent::Assigned { task_id, assignee: alice.clone(), assigned_by: alice.clone(), assigned_at: Utc::now(), }) ); projection.apply(&assigned_event).await.unwrap(); // Verify let tasks = projection.get_user_tasks(&alice); assert_eq!(tasks.len(), 1); assert_eq!(tasks[0].id, task_id); assert_eq!(tasks[0].status, TaskStatus::Open); } #[tokio::test] async fn test_statistics_projection() { let mut projection = TeamStatisticsProjection::default(); // Apply multiple events for i in 0..10 { let event = create_test_event( StreamId::from_static(&format!("task-{}", i)), SystemEvent::Task(TaskEvent::Created { task_id: TaskId::new(), title: TaskTitle::try_new("Task").unwrap(), description: TaskDescription::try_new("").unwrap(), creator: UserName::try_new("alice").unwrap(), created_at: Utc::now(), }) ); projection.apply(&event).await.unwrap(); } assert_eq!(projection.total_tasks_created, 10); assert_eq!(projection.completion_rate(), 0.0); } } }
Performance Considerations
1. Batch Processing
Process events in batches for better performance:
#![allow(unused)] fn main() { let config = ProjectionRunnerConfig::default() .with_batch_size(1000) // Process 1000 events at a time .with_checkpoint_frequency(100); // Checkpoint every 100 events }
2. Selective Projections
Only process relevant streams:
#![allow(unused)] fn main() { impl CqrsProjection for UserTaskListProjection { fn relevant_streams(&self) -> Vec<&str> { vec!["task-*", "user-*"] // Only process task and user streams } } }
3. Caching
Use in-memory caching for frequently accessed data:
#![allow(unused)] fn main() { struct CachedProjection { inner: UserTaskListProjection, cache: HashMap<UserName, Vec<TaskSummary>>, cache_ttl: Duration, } }
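One possible read path for such a cache, sketched with standard-library types only. The type and method names here (`TaskCache`, `get_or_load`) are illustrative assumptions, not EventCore API: entries carry their insertion time, and a stale entry is rebuilt once the TTL elapses.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Illustrative cache keyed by user name; each entry stores when it was
// cached so stale entries can be recomputed after the TTL elapses.
struct TaskCache {
    entries: HashMap<String, (Instant, Vec<String>)>,
    ttl: Duration,
}

impl TaskCache {
    fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl }
    }

    /// Return cached tasks if still fresh; otherwise rebuild via `load`.
    fn get_or_load<F>(&mut self, user: &str, load: F) -> Vec<String>
    where
        F: FnOnce() -> Vec<String>,
    {
        if let Some((cached_at, tasks)) = self.entries.get(user) {
            if cached_at.elapsed() < self.ttl {
                return tasks.clone();
            }
        }
        let tasks = load();
        self.entries
            .insert(user.to_string(), (Instant::now(), tasks.clone()));
        tasks
    }
}

fn main() {
    let mut cache = TaskCache::new(Duration::from_secs(30));
    // First call misses and loads from the projection (stubbed here).
    let tasks = cache.get_or_load("alice", || vec!["write docs".to_string()]);
    assert_eq!(tasks.len(), 1);
    // Second call within the TTL is served from the cache.
    let again = cache.get_or_load("alice", || unreachable!("should hit cache"));
    assert_eq!(again, tasks);
}
```

In a real projection service, `load` would query the inner projection, and invalidation would happen when new events for that user are applied.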
Common Patterns
1. Denormalized Views
Projections often denormalize data for query performance:
#![allow(unused)] fn main() { // Instead of joins, store everything needed struct TaskView { task_id: TaskId, title: String, assignee_name: String, // Denormalized assignee_email: String, // Denormalized creator_name: String, // Denormalized // ... all data needed for display } }
2. Multiple Projections
Create different projections for different query needs:
- UserTaskListProjection - For user-specific views
- TeamDashboardProjection - For manager overviews
- SearchIndexProjection - For full-text search
- ReportingProjection - For analytics
3. Event Enrichment
Projections can enrich events with additional context:
#![allow(unused)] fn main() { async fn enrich_event(&self, event: &TaskEvent) -> EnrichedTaskEvent { // Add user details, timestamps, etc. } }
Summary
Projections in EventCore:
- ✅ Transform events into query-optimized read models
- ✅ Can be rebuilt from events at any time
- ✅ Support real-time updates
- ✅ Enable complex queries without affecting write performance
- ✅ Allow multiple views of the same data
Key benefits:
- Flexibility: Change read models without touching events
- Performance: Optimized for specific queries
- Evolution: Add new projections as needs change
- Testing: Easy to test with synthetic events
Next, let’s look at testing your application →
Chapter 2.5: Testing Your Application
Testing event-sourced systems is often easier than testing traditional CRUD applications. With EventCore, you can test commands, projections, and entire workflows using deterministic event streams.
Testing Philosophy
EventCore testing follows these principles:
- Test Behavior, Not Implementation - Focus on what events are produced
- Use Real Events - Test with actual domain events, not mocks
- Deterministic Tests - Events provide repeatable test scenarios
- Fast Feedback - In-memory event store for rapid testing
Testing Commands
Basic Command Testing
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; use eventcore::prelude::*; use eventcore::testing::prelude::*; use eventcore_memory::InMemoryEventStore; #[tokio::test] async fn test_create_task_success() { // Arrange let store = InMemoryEventStore::<SystemEvent>::new(); let executor = CommandExecutor::new(store); let task_id = TaskId::new(); let command = CreateTask::new( task_id, TaskTitle::try_new("Write tests").unwrap(), TaskDescription::try_new("Add comprehensive test coverage").unwrap(), UserName::try_new("alice").unwrap(), ).unwrap(); // Act let result = executor.execute(&command).await; // Assert assert!(result.is_ok()); let execution_result = result.unwrap(); assert_eq!(execution_result.events_written.len(), 1); // Verify the event match &execution_result.events_written[0] { SystemEvent::Task(TaskEvent::Created { title, creator, .. }) => { assert_eq!(title.as_ref(), "Write tests"); assert_eq!(creator.as_ref(), "alice"); } _ => panic!("Expected TaskCreated event"), } } #[tokio::test] async fn test_create_duplicate_task_fails() { // Arrange let store = InMemoryEventStore::<SystemEvent>::new(); let executor = CommandExecutor::new(store); let task_id = TaskId::new(); let command = CreateTask::new( task_id, TaskTitle::try_new("Task").unwrap(), TaskDescription::try_new("").unwrap(), UserName::try_new("alice").unwrap(), ).unwrap(); // Act - Create first time executor.execute(&command).await.unwrap(); // Act - Try to create again let result = executor.execute(&command).await; // Assert assert!(result.is_err()); match result.unwrap_err() { CommandError::ValidationFailed(msg) => { assert!(msg.contains("already exists")); } _ => panic!("Expected ValidationFailed error"), } } } }
Testing Multi-Stream Commands
#![allow(unused)] fn main() { #[tokio::test] async fn test_assign_task_multi_stream() { // Arrange let store = InMemoryEventStore::<SystemEvent>::new(); let executor = CommandExecutor::new(store); // Create a task first let task_id = TaskId::new(); let create = CreateTask::new( task_id, TaskTitle::try_new("Multi-stream test").unwrap(), TaskDescription::try_new("").unwrap(), UserName::try_new("alice").unwrap(), ).unwrap(); executor.execute(&create).await.unwrap(); // Assign the task let assign = AssignTask::new( task_id, UserName::try_new("bob").unwrap(), UserName::try_new("alice").unwrap(), ).unwrap(); // Act let result = executor.execute(&assign).await.unwrap(); // Assert - Should affect both task and user streams assert_eq!(result.streams_affected.len(), 2); assert!(result.streams_affected.contains(&StreamId::from_static(&format!("task-{}", task_id)))); assert!(result.streams_affected.contains(&StreamId::from_static("user-bob"))); // Verify events in both streams let task_events = store.read_stream( &StreamId::from_static(&format!("task-{}", task_id)), ReadOptions::default() ).await.unwrap(); let user_events = store.read_stream( &StreamId::from_static("user-bob"), ReadOptions::default() ).await.unwrap(); assert_eq!(task_events.events.len(), 2); // Created + Assigned assert_eq!(user_events.events.len(), 2); // TaskAssigned + WorkloadUpdated } }
Testing Projections
Unit Testing Projections
#![allow(unused)] fn main() { #[tokio::test] async fn test_user_task_list_projection() { use eventcore::testing::builders::*; // Arrange let mut projection = UserTaskListProjection::default(); let task_id = TaskId::new(); let alice = UserName::try_new("alice").unwrap(); // Build test events let events = vec![ StoredEventBuilder::new() .with_stream_id(StreamId::from_static("task-123")) .with_payload(SystemEvent::Task(TaskEvent::Created { task_id, title: TaskTitle::try_new("Test task").unwrap(), description: TaskDescription::try_new("").unwrap(), creator: alice.clone(), created_at: Utc::now(), })) .build(), StoredEventBuilder::new() .with_stream_id(StreamId::from_static("task-123")) .with_payload(SystemEvent::Task(TaskEvent::Assigned { task_id, assignee: alice.clone(), assigned_by: alice.clone(), assigned_at: Utc::now(), })) .build(), ]; // Act for event in events { projection.apply(&event).await.unwrap(); } // Assert let tasks = projection.get_user_tasks(&alice); assert_eq!(tasks.len(), 1); assert_eq!(tasks[0].id, task_id); assert_eq!(tasks[0].status, TaskStatus::Open); } }
Testing Projection Accuracy
#![allow(unused)] fn main() { #[tokio::test] async fn test_statistics_projection_accuracy() { let mut projection = TeamStatisticsProjection::default(); // Create a series of events let events = create_test_scenario(TestScenario { tasks_created: 10, tasks_assigned: 8, tasks_completed: 5, users: vec!["alice", "bob", "charlie"], }); // Apply all events for event in events { projection.apply(&event).await.unwrap(); } // Verify statistics assert_eq!(projection.total_tasks_created, 10); assert_eq!(projection.tasks_by_status[&TaskStatus::Completed], 5); assert_eq!(projection.tasks_by_status[&TaskStatus::Open], 2); // 10 - 8 assigned assert_eq!(projection.tasks_by_status[&TaskStatus::InProgress], 3); // 8 - 5 completed // Verify completion rate assert_eq!(projection.completion_rate(), 50.0); // 5/10 * 100 } }
Property-Based Testing
EventCore works well with property-based testing:
#![allow(unused)] fn main() { use proptest::prelude::*; proptest! { #[test] fn task_assignment_maintains_consistency( task_count in 1..50usize, user_count in 1..10usize, assignment_ratio in 0.0..1.0f64, ) { // Property: Total assigned tasks equals sum of user assignments let runtime = tokio::runtime::Runtime::new().unwrap(); runtime.block_on(async { let mut projection = UserTaskListProjection::default(); let users = generate_users(user_count); let tasks = generate_tasks(task_count); // Assign tasks based on ratio let assignments = assign_tasks_to_users(&tasks, &users, assignment_ratio); // Apply events for event in assignments { projection.apply(&event).await.unwrap(); } // Verify consistency let total_assigned: usize = users.iter() .map(|u| projection.get_user_tasks(u).len()) .sum(); let expected_assigned = (task_count as f64 * assignment_ratio) as usize; assert_eq!(total_assigned, expected_assigned); }); } } }
Integration Testing
Testing Complete Workflows
#![allow(unused)] fn main() { #[tokio::test] async fn test_complete_task_workflow() { // Setup let store = InMemoryEventStore::<SystemEvent>::new(); let executor = CommandExecutor::new(store.clone()); let mut projection = UserTaskListProjection::default(); // Execute workflow let task_id = TaskId::new(); let alice = UserName::try_new("alice").unwrap(); let bob = UserName::try_new("bob").unwrap(); // 1. Create task let create = CreateTask::new( task_id, TaskTitle::try_new("Complete workflow").unwrap(), TaskDescription::try_new("Test the entire flow").unwrap(), alice.clone(), ).unwrap(); executor.execute(&create).await.unwrap(); // 2. Assign to Bob let assign = AssignTask::new(task_id, bob.clone(), alice.clone()).unwrap(); executor.execute(&assign).await.unwrap(); // 3. Bob completes the task let complete = CompleteTask { task_id: StreamId::from_static(&format!("task-{}", task_id)), user_id: StreamId::from_static(&format!("user-{}", bob)), completed_by: bob.clone(), }; executor.execute(&complete).await.unwrap(); // Update projection with all events let all_events = store.read_all_events(ReadOptions::default()).await.unwrap(); for event in all_events { projection.apply(&event).await.unwrap(); } // Verify end state let bob_tasks = projection.get_user_tasks(&bob); assert_eq!(bob_tasks.len(), 1); assert_eq!(bob_tasks[0].status, TaskStatus::Completed); assert!(bob_tasks[0].completed_at.is_some()); } }
Testing Helpers
EventCore provides testing utilities:
Event Builders
#![allow(unused)] fn main() { use eventcore::testing::builders::*; fn create_test_event(payload: SystemEvent) -> StoredEvent<SystemEvent> { StoredEventBuilder::new() .with_id(EventId::new()) .with_stream_id(StreamId::from_static("test-stream")) .with_version(EventVersion::new(1)) .with_payload(payload) .with_metadata( EventMetadataBuilder::new() .with_user_id(UserId::from("test-user")) .build() ) .build() } }
Test Scenarios
#![allow(unused)] fn main() { use eventcore::testing::fixtures::*; struct TaskScenario; impl TestScenario for TaskScenario { type Event = SystemEvent; fn events(&self) -> Vec<EventToWrite<Self::Event>> { vec![ // Series of events that create a test scenario create_task_event("task-1", "Test Task 1"), assign_task_event("task-1", "alice"), complete_task_event("task-1", "alice"), ] } } }
Assertion Helpers
#![allow(unused)] fn main() { use eventcore::testing::assertions::*; #[tokio::test] async fn test_event_ordering() { let events = vec![/* ... */]; // Assert events are properly ordered assert_events_ordered(&events); // Assert no duplicate event IDs assert_unique_event_ids(&events); // Assert version progression assert_stream_version_progression(&events, &StreamId::from_static("test")); } }
Testing Error Cases
Command Validation Errors
#![allow(unused)] fn main() { #[tokio::test] async fn test_invalid_command_inputs() { let executor = CommandExecutor::new(InMemoryEventStore::<SystemEvent>::new()); // Test empty title let result = TaskTitle::try_new(""); assert!(result.is_err()); // Test whitespace-only title let result = TaskTitle::try_new(" "); assert!(result.is_err()); // Test overly long description let long_desc = "x".repeat(3000); let result = TaskDescription::try_new(&long_desc); assert!(result.is_err()); } }
Concurrency Conflicts
#![allow(unused)] fn main() { #[tokio::test] async fn test_concurrent_modifications() { let store = InMemoryEventStore::<SystemEvent>::new(); let executor = CommandExecutor::new(store); // Create a task let task_id = TaskId::new(); let create = CreateTask::new( task_id, TaskTitle::try_new("Concurrent test").unwrap(), TaskDescription::try_new("").unwrap(), UserName::try_new("alice").unwrap(), ).unwrap(); executor.execute(&create).await.unwrap(); // Simulate concurrent updates let assign1 = AssignTask::new(task_id, UserName::try_new("bob").unwrap(), UserName::try_new("alice").unwrap()).unwrap(); let assign2 = AssignTask::new(task_id, UserName::try_new("charlie").unwrap(), UserName::try_new("alice").unwrap()).unwrap(); // Execute both concurrently let (result1, result2) = tokio::join!( executor.execute(&assign1), executor.execute(&assign2) ); // One should succeed, one should retry and then succeed assert!(result1.is_ok() || result2.is_ok()); } }
Performance Testing
#![allow(unused)] fn main() { #[tokio::test] #[ignore] // Run with --ignored flag async fn test_high_volume_event_processing() { use std::time::Instant; let mut projection = UserTaskListProjection::default(); let event_count = 10_000; // Generate events let events: Vec<_> = (0..event_count) .map(|i| create_task_assigned_event(i)) .collect(); // Measure processing time let start = Instant::now(); for event in events { projection.apply(&event).await.unwrap(); } let duration = start.elapsed(); let events_per_second = event_count as f64 / duration.as_secs_f64(); println!("Processed {} events in {:?}", event_count, duration); println!("Rate: {:.2} events/second", events_per_second); // Assert reasonable performance assert!(events_per_second > 1000.0, "Projection too slow"); } }
Test Organization
Structure your tests for clarity:
tests/
├── unit/
│ ├── commands/
│ │ ├── create_task_test.rs
│ │ ├── assign_task_test.rs
│ │ └── complete_task_test.rs
│ └── projections/
│ ├── task_list_test.rs
│ └── statistics_test.rs
├── integration/
│ ├── workflows/
│ │ └── task_lifecycle_test.rs
│ └── projections/
│ └── real_time_updates_test.rs
└── performance/
└── high_volume_test.rs
Debugging Tests
EventCore provides excellent debugging support:
#![allow(unused)] fn main() { #[tokio::test] async fn test_with_debugging() { // Enable debug logging let _ = env_logger::builder() .filter_level(log::LevelFilter::Debug) .try_init(); let store = InMemoryEventStore::<SystemEvent>::new(); // Print all events after execution let events = store.read_all_events(ReadOptions::default()).await.unwrap(); for event in &events { println!("Event: {:?}", event); println!(" Stream: {}", event.stream_id); println!(" Version: {}", event.version); println!(" Payload: {:?}", event.payload); println!(" Metadata: {:?}", event.metadata); println!(); } } }
Summary
Testing EventCore applications is straightforward because:
- ✅ Events are deterministic - Same events always produce same state
- ✅ No mocking needed - Use real events and in-memory stores
- ✅ Fast feedback - In-memory testing is instantaneous
- ✅ Complete scenarios - Test entire workflows easily
- ✅ Time travel - Test any historical state
Best practices:
- Test commands by verifying produced events
- Test projections by applying known events
- Use property-based testing for invariants
- Test complete workflows for integration
- Keep tests fast with in-memory stores
You’ve now completed the Getting Started tutorial! You can:
- Model domains with events
- Implement type-safe commands
- Build projections for queries
- Test everything thoroughly
Continue to Part 3: Core Concepts for deeper understanding →
Part 3: Core Concepts
This part provides a deep dive into EventCore’s core concepts and design principles. Understanding these concepts will help you build robust, scalable event-sourced systems.
Chapters in This Part
- Commands and the Macro System - Deep dive into command implementation
- Events and Event Stores - Understanding events and storage
- State Reconstruction - How EventCore rebuilds state from events
- Multi-Stream Atomicity - The key innovation of EventCore
- Error Handling - Comprehensive error handling strategies
What You’ll Learn
- How the #[derive(Command)] macro works internally
- Event design principles and best practices
- The state reconstruction algorithm
- How multi-stream atomicity is guaranteed
- Error handling patterns for production systems
Prerequisites
- Completed Part 2: Getting Started
- Basic understanding of Rust macros helpful
- Familiarity with database transactions
Time to Complete
- Reading: ~30 minutes
- With examples: ~1 hour
Ready to dive deep? Let’s start with Commands and the Macro System →
Chapter 3.1: Commands and the Macro System
This chapter explores how EventCore’s command system works, focusing on the #[derive(Command)] macro that eliminates boilerplate while maintaining type safety.
The Command Pattern
Commands in EventCore represent user intentions - things that should happen in your system. They:
- Declare required streams - What data they need access to
- Validate business rules - Ensure operations are allowed
- Generate events - Record what actually happened
- Maintain consistency - All changes are atomic
Anatomy of a Command
Let’s dissect a command to understand each part:
#![allow(unused)] fn main() { #[derive(Command, Clone)] // 1. Derive macro generates boilerplate struct TransferMoney { #[stream] // 2. Declares this field is a stream from_account: StreamId, #[stream] to_account: StreamId, amount: Money, // 3. Regular fields for command data reference: String, } }
What the Macro Generates
The #[derive(Command)] macro generates several things:
#![allow(unused)] fn main() { // 1. A phantom type for compile-time stream tracking #[derive(Debug, Clone, Copy, Default)] pub struct TransferMoneyStreamSet; // 2. Implementation of CommandStreams trait impl CommandStreams for TransferMoney { type StreamSet = TransferMoneyStreamSet; fn read_streams(&self) -> Vec<StreamId> { vec![ self.from_account.clone(), self.to_account.clone(), ] } } // 3. Blanket implementation gives you Command trait // (because TransferMoney also implements CommandLogic) }
The Two-Trait Design
EventCore splits the Command pattern into two traits:
CommandStreams (Generated)
Handles infrastructure concerns:
#![allow(unused)] fn main() { pub trait CommandStreams: Send + Sync + Clone { /// Phantom type for compile-time stream access control type StreamSet: Send + Sync; /// Returns the streams this command needs to read fn read_streams(&self) -> Vec<StreamId>; } }
CommandLogic (You Implement)
Contains your domain logic:
#![allow(unused)] fn main() { #[async_trait] pub trait CommandLogic: CommandStreams { /// State type that will be reconstructed from events type State: Default + Send + Sync; /// Event type this command produces type Event: Send + Sync; /// Apply an event to update state (event sourcing fold) fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>); /// Business logic that validates and produces events async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>>; } }
Stream Declaration Patterns
Basic Stream Declaration
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct UpdateProfile { #[stream] user_id: StreamId, // Single stream } }
Multiple Streams
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct ProcessOrder { #[stream] order_id: StreamId, #[stream] customer_id: StreamId, #[stream] inventory_id: StreamId, #[stream] payment_id: StreamId, } }
Stream Arrays (Planned Feature)
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct BulkUpdate { #[stream("items")] item_ids: Vec<StreamId>, // Multiple streams of same type } }
Conditional Streams
For streams discovered at runtime:
#![allow(unused)] fn main() { async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Discover we need another stream based on state if state.requires_approval { let approver_stream = StreamId::from_static("approver-stream"); stream_resolver.add_streams(vec![approver_stream]); // EventCore will re-execute with the additional stream } // Continue with logic... } }
Type-Safe Stream Access
The ReadStreams type ensures you can only write to declared streams:
#![allow(unused)] fn main() { // In your handle method: async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // ✅ This works - from_account was declared with #[stream] let withdraw_event = StreamWrite::new( &read_streams, self.from_account.clone(), BankEvent::MoneyWithdrawn { amount: self.amount } )?; // ❌ This won't compile - random_stream wasn't declared let invalid = StreamWrite::new( &read_streams, StreamId::from_static("random-stream"), SomeEvent {} )?; // Compile error! Ok(vec![withdraw_event]) } }
State Reconstruction
The apply method builds state by folding events:
#![allow(unused)] fn main() { fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { BankEvent::AccountOpened { balance, .. } => { state.exists = true; state.balance = *balance; } BankEvent::MoneyDeposited { amount, .. } => { state.balance += amount; } BankEvent::MoneyWithdrawn { amount, .. } => { state.balance = state.balance.saturating_sub(*amount); } } } }
This is called for each event in sequence to rebuild current state.
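The fold itself is straightforward: start from the `Default` state and apply each stored event in order. A self-contained sketch using simplified stand-in types (no EventCore dependencies, so the event and state shapes here are illustrative):

```rust
// Stand-in event and state types for illustration only.
#[derive(Debug)]
enum BankEvent {
    AccountOpened { balance: u64 },
    MoneyDeposited { amount: u64 },
    MoneyWithdrawn { amount: u64 },
}

#[derive(Debug, Default, PartialEq)]
struct AccountState {
    exists: bool,
    balance: u64,
}

// Mirrors the `apply` method: one event updates state incrementally.
fn apply(state: &mut AccountState, event: &BankEvent) {
    match event {
        BankEvent::AccountOpened { balance } => {
            state.exists = true;
            state.balance = *balance;
        }
        BankEvent::MoneyDeposited { amount } => state.balance += amount,
        BankEvent::MoneyWithdrawn { amount } => {
            state.balance = state.balance.saturating_sub(*amount)
        }
    }
}

fn main() {
    let events = [
        BankEvent::AccountOpened { balance: 1000 },
        BankEvent::MoneyDeposited { amount: 250 },
        BankEvent::MoneyWithdrawn { amount: 100 },
    ];
    // Rebuild current state by folding events in stream order.
    let state = events.iter().fold(AccountState::default(), |mut s, e| {
        apply(&mut s, e);
        s
    });
    assert_eq!(state.balance, 1150);
    assert!(state.exists);
}
```

Because the fold is deterministic, replaying the same events always reproduces the same state, which is what makes projections rebuildable and tests repeatable.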
Command Validation Patterns
Using the require! Macro
#![allow(unused)] fn main() { async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Business rule validation with good error messages require!( state.balance >= self.amount, "Insufficient funds: balance={}, requested={}", state.balance, self.amount ); require!( self.amount > 0, "Transfer amount must be positive" ); require!( self.from_account != self.to_account, "Cannot transfer to same account" ); // Generate events after validation passes Ok(vec![/* events */]) } }
Custom Validation Functions
#![allow(unused)] fn main() { impl TransferMoney { fn validate_transfer_limits(&self, state: &AccountState) -> CommandResult<()> { const DAILY_LIMIT: u64 = 10_000; let daily_total = state.transfers_today + self.amount; require!( daily_total <= DAILY_LIMIT, "Daily transfer limit exceeded: {} > {}", daily_total, DAILY_LIMIT ); Ok(()) } } }
Advanced Macro Features
Custom Stream Names
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct ComplexCommand { #[stream(name = "primary")] main_stream: StreamId, #[stream(name = "secondary", optional = true)] optional_stream: Option<StreamId>, } }
Computed Streams
#![allow(unused)] fn main() { impl ComplexCommand { fn compute_streams(&self) -> Vec<StreamId> { let mut streams = vec![self.main_stream.clone()]; if let Some(ref optional) = self.optional_stream { streams.push(optional.clone()); } streams } } }
Command Composition
Commands can be composed for complex operations:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct CompleteOrderWorkflow { #[stream] order_id: StreamId, // Sub-commands to execute payment: ProcessPayment, fulfillment: FulfillOrder, notification: SendNotification, } impl CommandLogic for CompleteOrderWorkflow { // ... implementation delegates to sub-commands } }
Performance Optimizations
Pre-computed State
For expensive computations:
#![allow(unused)] fn main() { #[derive(Default)] struct PrecomputedState { balance: u64, transaction_count: u64, daily_totals: HashMap<Date, u64>, // Pre-aggregated } fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { // Update pre-computed values incrementally match &event.payload { BankEvent::MoneyTransferred { amount, date, .. } => { state.balance -= amount; *state.daily_totals.entry(*date).or_insert(0) += amount; } // ... } } }
Lazy State Loading
For large states:
#![allow(unused)] fn main() { struct LazyState { core: AccountCore, // Always loaded history: Option<Box<TransactionHistory>>, // Load on demand } async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, mut state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Load history only if needed if self.requires_history_check() { state.load_history().await?; } // Continue... } }
Testing Commands
Unit Testing
#![allow(unused)] fn main() { #[test] fn test_command_stream_declaration() { let cmd = TransferMoney { from_account: StreamId::from_static("account-1"), to_account: StreamId::from_static("account-2"), amount: 100, reference: "test".to_string(), }; let streams = cmd.read_streams(); assert_eq!(streams.len(), 2); assert!(streams.contains(&StreamId::from_static("account-1"))); assert!(streams.contains(&StreamId::from_static("account-2"))); } }
Testing State Reconstruction
#![allow(unused)] fn main() { #[test] fn test_apply_events() { let cmd = TransferMoney { /* ... */ }; let mut state = AccountState::default(); let event = create_test_event(BankEvent::AccountOpened { balance: 1000, owner: "alice".to_string(), }); cmd.apply(&mut state, &event); assert_eq!(state.balance, 1000); assert!(state.exists); } }
Common Patterns
Idempotent Commands
Make commands idempotent by checking for duplicate operations:
#![allow(unused)] fn main() { async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Check if operation was already performed if state.transfers.contains(&self.reference) { // Already processed - return success with no new events return Ok(vec![]); } // Process normally... } }
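The pattern can be shown standalone with stand-in types (not EventCore's API): processed references are tracked in state, and a repeated command succeeds while emitting no new events.

```rust
use std::collections::HashSet;

// Illustrative state: remembers which transfer references were processed.
#[derive(Default)]
struct TransferState {
    processed: HashSet<String>,
}

// Returns the events to emit; an already-seen reference emits nothing,
// so retries are safe and produce no duplicate events.
fn handle_transfer(state: &mut TransferState, reference: &str, amount: u64) -> Vec<String> {
    if !state.processed.insert(reference.to_string()) {
        return vec![]; // duplicate: already handled, succeed with no events
    }
    vec![format!("Transferred {amount} ({reference})")]
}

fn main() {
    let mut state = TransferState::default();
    assert_eq!(handle_transfer(&mut state, "ref-1", 100).len(), 1);
    // Retrying the same command is safe: no duplicate events are produced.
    assert_eq!(handle_transfer(&mut state, "ref-1", 100).len(), 0);
    // A new reference is processed normally.
    assert_eq!(handle_transfer(&mut state, "ref-2", 50).len(), 1);
}
```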
Command Versioning
Handle command evolution:
#![allow(unused)] fn main() { #[derive(Command, Clone)] #[command(version = 2)] struct TransferMoneyV2 { #[stream] from_account: StreamId, #[stream] to_account: StreamId, amount: Money, reference: String, // New in V2 category: TransferCategory, } }
Summary
The EventCore command system provides:
- ✅ Zero boilerplate through #[derive(Command)]
- ✅ Type-safe stream access preventing invalid writes
- ✅ Clear separation between infrastructure and domain logic
- ✅ Flexible validation with the require! macro
- ✅ Extensibility through the two-trait design
Key takeaways:
- Use #[derive(Command)] to eliminate boilerplate
- Declare streams with #[stream] attributes
- Implement business logic in CommandLogic
- Leverage type safety for compile-time guarantees
- Commands are just data - easy to test and reason about
Next, let’s explore Events and Event Stores →
Chapter 3.2: Events and Event Stores
Events are the heart of EventCore - immutable records of things that happened in your system. This chapter explores event design, storage, and the guarantees EventCore provides.
What Makes a Good Event?
Events should be:
- Past Tense - They record what happened, not what should happen
- Immutable - Once written, events never change
- Self-Contained - Include all necessary data
- Business-Focused - Represent domain concepts, not technical details
Event Design Principles
#![allow(unused)] fn main() { // ❌ Bad: Technical focus, present tense, missing context #[derive(Serialize, Deserialize)] struct UpdateUser { id: String, data: HashMap<String, Value>, } // ✅ Good: Business focus, past tense, complete information #[derive(Serialize, Deserialize)] struct CustomerEmailChanged { customer_id: CustomerId, old_email: Email, new_email: Email, changed_by: UserId, changed_at: DateTime<Utc>, reason: EmailChangeReason, } }
Event Structure in EventCore
Core Event Types
#![allow(unused)] fn main() { /// Your domain event #[derive(Debug, Clone, Serialize, Deserialize)] pub struct OrderShipped { pub order_id: OrderId, pub tracking_number: TrackingNumber, pub carrier: Carrier, pub shipped_at: DateTime<Utc>, } /// Event ready to be written pub struct EventToWrite<E> { pub stream_id: StreamId, pub payload: E, pub metadata: Option<EventMetadata>, pub expected_version: ExpectedVersion, } /// Event as stored in the event store pub struct StoredEvent<E> { pub id: EventId, // UUIDv7 for global ordering pub stream_id: StreamId, // Which stream this belongs to pub version: EventVersion, // Position in the stream pub payload: E, // Your domain event pub metadata: EventMetadata, // Who, when, why pub occurred_at: DateTime<Utc>, // When it happened } }
Event IDs and Ordering
EventCore uses UUIDv7 for event IDs, providing:
#![allow(unused)] fn main() { // UUIDv7 properties: // - Globally unique // - Time-ordered (sortable) // - Millisecond precision timestamp // - No coordination required let event1 = EventId::new(); let event2 = EventId::new(); // Later events have higher IDs assert!(event2 > event1); // Extract timestamp let timestamp = event1.timestamp(); }
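The time-ordering guarantee is easy to see in miniature. A simplified sketch (not the real UUIDv7 bit layout, which also carries version and variant fields) shows why putting a millisecond timestamp in the most significant bits makes IDs sortable:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Simplified: timestamp in the top bits, everything else as entropy.
// Real UUIDv7 also embeds version/variant bits, but the ordering argument
// is the same: the time prefix dominates the comparison.
fn uuid_v7_like(millis: u64, entropy: u64) -> u128 {
    (u128::from(millis) << 80) | u128::from(entropy)
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_millis() as u64;

    // Even with maximal entropy, an earlier millisecond sorts first.
    let earlier = uuid_v7_like(now, u64::MAX);
    let later = uuid_v7_like(now + 1, 0);
    assert!(later > earlier);
}
```

Because the entropy occupies only the low bits, IDs minted in different milliseconds always compare in creation order; ties within a millisecond fall back to the random/counter bits.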
Event Metadata
Every event carries metadata for auditing and debugging:
#![allow(unused)] fn main() { pub struct EventMetadata { /// Who triggered this event pub user_id: Option<UserId>, /// Correlation ID for tracking across services pub correlation_id: CorrelationId, /// What caused this event (previous event ID) pub causation_id: Option<CausationId>, /// Custom metadata pub custom: HashMap<String, Value>, } // Building metadata let metadata = EventMetadata::new() .with_user_id(UserId::from("alice@example.com")) .with_correlation_id(CorrelationId::new()) .caused_by(&previous_event) .with_custom("ip_address", "192.168.1.1") .with_custom("user_agent", "MyApp/1.0"); }
Event Store Abstraction
EventCore defines a trait that storage adapters implement:
#![allow(unused)] fn main() { #[async_trait] pub trait EventStore: Send + Sync { type Event: Send + Sync; type Error: Error + Send + Sync; /// Read events from a specific stream async fn read_stream( &self, stream_id: &StreamId, options: ReadOptions, ) -> Result<StreamEvents<Self::Event>, Self::Error>; /// Read events from multiple streams async fn read_streams( &self, stream_ids: &[StreamId], options: ReadOptions, ) -> Result<Vec<StreamEvents<Self::Event>>, Self::Error>; /// Write events atomically to multiple streams async fn write_events( &self, events: Vec<EventToWrite<Self::Event>>, ) -> Result<WriteResult, Self::Error>; /// Subscribe to real-time events async fn subscribe( &self, options: SubscriptionOptions, ) -> Result<Box<dyn EventSubscription<Self::Event>>, Self::Error>; } }
Stream Versioning
Streams maintain version numbers for optimistic concurrency:
#![allow(unused)] fn main() { pub struct StreamEvents<E> { pub stream_id: StreamId, pub version: EventVersion, // Current version after these events pub events: Vec<StoredEvent<E>>, } // Version control options pub enum ExpectedVersion { /// Stream must not exist NoStream, /// Stream must be at this exact version Exact(EventVersion), /// Stream must exist but any version is OK Any, /// No version check (dangerous!) NoCheck, } }
Using Version Control
#![allow(unused)] fn main() { // First write - stream shouldn't exist let first_event = EventToWrite { stream_id: stream_id.clone(), payload: AccountOpened { /* ... */ }, metadata: None, expected_version: ExpectedVersion::NoStream, }; // Subsequent writes - check version let next_event = EventToWrite { stream_id: stream_id.clone(), payload: MoneyDeposited { /* ... */ }, metadata: None, expected_version: ExpectedVersion::Exact(EventVersion::new(1)), }; }
Storage Adapters
PostgreSQL Adapter
The production-ready adapter with ACID guarantees:
#![allow(unused)] fn main() { use eventcore_postgres::{PostgresEventStore, PostgresConfig}; let config = PostgresConfig::new("postgresql://localhost/eventcore") .with_pool_size(20) .with_schema("eventcore"); let event_store = PostgresEventStore::new(config).await?; // Initialize schema (one time) event_store.initialize().await?; }
PostgreSQL schema:
-- Events table with optimal indexing
CREATE TABLE events (
id UUID PRIMARY KEY DEFAULT gen_uuidv7(),
stream_id VARCHAR(255) NOT NULL,
version BIGINT NOT NULL,
event_type VARCHAR(255) NOT NULL,
payload JSONB NOT NULL,
metadata JSONB NOT NULL,
occurred_at TIMESTAMPTZ NOT NULL,
-- Ensure stream version uniqueness
UNIQUE (stream_id, version)
);
-- Indexes for common queries
-- (PostgreSQL has no inline INDEX clause in CREATE TABLE)
CREATE INDEX idx_stream_id ON events (stream_id);
CREATE INDEX idx_occurred_at ON events (occurred_at);
CREATE INDEX idx_event_type ON events (event_type);
In-Memory Adapter
Perfect for testing and development:
#![allow(unused)] fn main() { use eventcore_memory::InMemoryEventStore; let event_store = InMemoryEventStore::<MyEvent>::new(); // Optionally add chaos for testing let chaotic_store = event_store .with_chaos(ChaosConfig { failure_probability: 0.1, // 10% chance of failure latency_ms: Some(50..200), // Random latency }); }
Event Design Patterns
Event Granularity
Choose the right level of detail:
#![allow(unused)] fn main() { // ❌ Too coarse - loses important details struct OrderUpdated { order_id: OrderId, new_state: OrderState, // What actually changed? } // ❌ Too fine - creates event spam struct OrderFieldUpdated { order_id: OrderId, field_name: String, old_value: Value, new_value: Value, } // ✅ Just right - meaningful business events enum OrderEvent { OrderPlaced { customer: CustomerId, items: Vec<Item> }, PaymentReceived { amount: Money, method: PaymentMethod }, OrderShipped { tracking: TrackingNumber }, OrderDelivered { signed_by: String }, } }
Event Evolution
Design events to evolve gracefully:
#![allow(unused)] fn main() { // Version 1 #[derive(Serialize, Deserialize)] struct UserRegistered { user_id: UserId, email: Email, } // Version 2 - Added field with default #[derive(Serialize, Deserialize)] struct UserRegistered { user_id: UserId, email: Email, #[serde(default)] referral_code: Option<String>, // New field } // Version 3 - Structural change #[derive(Serialize, Deserialize)] #[serde(tag = "version")] enum UserRegisteredVersioned { #[serde(rename = "1")] V1 { user_id: UserId, email: Email }, #[serde(rename = "2")] V2 { user_id: UserId, email: Email, referral_code: Option<String>, }, #[serde(rename = "3")] V3 { user_id: UserId, email: Email, referral: Option<ReferralInfo>, // Richer type }, } }
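A common companion to versioned payloads is an upcaster that normalizes older stored versions to the latest shape at read time, so the rest of the code handles a single type. A hedged sketch, with illustrative types rather than EventCore API (the richer `referral` field is simplified to a string):

```rust
// Latest in-memory shape (simplified: referral as a plain string).
#[derive(Debug, PartialEq)]
struct UserRegistered {
    user_id: String,
    email: String,
    referral: Option<String>,
}

// Older shapes still present in the event store.
enum StoredUserRegistered {
    V1 { user_id: String, email: String },
    V2 { user_id: String, email: String, referral_code: Option<String> },
}

// Upcast any stored version to the latest shape at read time.
fn upcast(event: StoredUserRegistered) -> UserRegistered {
    match event {
        StoredUserRegistered::V1 { user_id, email } => UserRegistered {
            user_id,
            email,
            referral: None, // field did not exist in V1
        },
        StoredUserRegistered::V2 { user_id, email, referral_code } => UserRegistered {
            user_id,
            email,
            referral: referral_code,
        },
    }
}

fn main() {
    let old = StoredUserRegistered::V1 {
        user_id: "u-1".to_string(),
        email: "a@example.com".to_string(),
    };
    // V1 events gain the new field with its default.
    assert_eq!(upcast(old).referral, None);
}
```

Old events are never rewritten; only their in-memory representation is migrated, which keeps the store immutable while the code moves forward.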
Event Enrichment
Add context to events:
#![allow(unused)] fn main() { trait EventEnricher { fn enrich<E>(&self, event: E) -> EnrichedEvent<E>; } struct EnrichedEvent<E> { pub event: E, pub context: EventContext, } struct EventContext { pub session_id: SessionId, pub request_id: RequestId, pub feature_flags: HashMap<String, bool>, pub environment: Environment, } }
Querying Events
Read Options
Control how events are read:
#![allow(unused)] fn main() { let options = ReadOptions::default() .from_version(EventVersion::new(10)) // Start from version 10 .to_version(EventVersion::new(20)) // Up to version 20 .max_events(100) // Limit results .backwards(); // Read in reverse let events = event_store .read_stream(&stream_id, options) .await?; }
Reading Multiple Streams
For multi-stream operations:
#![allow(unused)] fn main() { let stream_ids = vec![ StreamId::from_static("order-123"), StreamId::from_static("inventory-abc"), StreamId::from_static("payment-xyz"), ]; let all_events = event_store .read_streams(&stream_ids, ReadOptions::default()) .await?; // Events from all streams, ordered by EventId (time) }
Global Event Feed
Read all events across all streams:
#![allow(unused)] fn main() { let all_events = event_store .read_all_events( ReadOptions::default() .after(last_known_event_id) // For pagination .max_events(1000) ) .await?; }
Event Store Guarantees
1. Atomicity
All events in a write operation succeed or fail together:
#![allow(unused)] fn main() { let events = vec![ EventToWrite { /* withdraw from account A */ }, EventToWrite { /* deposit to account B */ }, ]; // Both events written atomically event_store.write_events(events).await?; }
2. Consistency
Version checks prevent conflicting writes:
#![allow(unused)] fn main() { // Two concurrent commands read version 5 let command1_events = vec![/* ... */]; let command2_events = vec![/* ... */]; // First write succeeds event_store.write_events(command1_events).await?; // OK // Second write fails - version conflict event_store.write_events(command2_events).await?; // Error: Version conflict }
3. Durability
Events are persisted before returning success:
#![allow(unused)] fn main() { // After this returns, events are durable let result = event_store.write_events(events).await?; // Even if the process crashes, events are safe }
4. Ordering
Events maintain both stream order and global order:
#![allow(unused)] fn main() { // Stream order: version within a stream assert!(stream_events.events[0].version < stream_events.events[1].version); // Global order: EventId across all streams assert!(all_events[0].id < all_events[1].id); }
Performance Optimization
Batch Writing
Write multiple events efficiently:
#![allow(unused)] fn main() { // Batch events for better performance let mut batch = Vec::with_capacity(1000); for item in large_dataset { batch.push(EventToWrite { stream_id: compute_stream_id(&item), payload: process_item(item), metadata: None, expected_version: ExpectedVersion::Any, }); // Write in batches if batch.len() >= 100 { event_store.write_events(batch.drain(..).collect()).await?; } } // Write remaining if !batch.is_empty() { event_store.write_events(batch).await?; } }
Stream Partitioning
Distribute load across streams:
#![allow(unused)] fn main() { // Instead of one hot stream let stream_id = StreamId::from_static("orders"); // Partition by hash (from_static requires a 'static str, so dynamic IDs use from) let stream_id = StreamId::from(format!( "orders-{}", order_id.hash() % 16 // 16 partitions )); }
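One caveat when partitioning: the hash must be stable across processes and Rust releases, which the standard library's `DefaultHasher` does not guarantee. A sketch using a fixed algorithm instead (FNV-1a here; any stable hash works, and the function names are illustrative):

```rust
// Partition keys must hash identically on every machine and every release,
// so use a fixed algorithm rather than std's DefaultHasher.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

// Map a key onto one of `partitions` stream IDs, deterministically.
fn partition_stream(prefix: &str, key: &str, partitions: u64) -> String {
    format!("{}-{}", prefix, fnv1a(key.as_bytes()) % partitions)
}

fn main() {
    // The same key always lands on the same partition, on every machine.
    let a = partition_stream("orders", "order-42", 16);
    let b = partition_stream("orders", "order-42", 16);
    assert_eq!(a, b);
    assert!(a.starts_with("orders-"));
}
```

An unstable hash would silently scatter one logical entity's events across partitions after a toolchain upgrade, so determinism here is a correctness requirement, not an optimization.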
Caching Strategies
Cache recent events for read performance:
#![allow(unused)] fn main() { struct CachedEventStore<ES: EventStore> { inner: ES, cache: Arc<RwLock<LruCache<StreamId, StreamEvents<ES::Event>>>>, } impl<ES: EventStore> CachedEventStore<ES> { async fn read_stream_cached( &self, stream_id: &StreamId, options: ReadOptions, ) -> Result<StreamEvents<ES::Event>, ES::Error> { // Check cache first if options.is_from_start() { if let Some(cached) = self.cache.read().await.get(stream_id) { return Ok(cached.clone()); } } // Read from store let events = self.inner.read_stream(stream_id, options).await?; // Update cache self.cache.write().await.insert(stream_id.clone(), events.clone()); Ok(events) } } }
Testing with Events
Event Fixtures
Create test events easily:
#![allow(unused)] fn main() { use eventcore::testing::builders::*; fn create_account_opened_event() -> StoredEvent<BankEvent> { StoredEventBuilder::new() .with_stream_id(StreamId::from_static("account-123")) .with_version(EventVersion::new(1)) .with_payload(BankEvent::AccountOpened { owner: "Alice".to_string(), initial_balance: 1000, }) .with_metadata( EventMetadataBuilder::new() .with_user_id(UserId::from("alice@example.com")) .build() ) .build() } }
Event Assertions
Test event properties:
#![allow(unused)] fn main() { use eventcore::testing::assertions::*; #[test] fn test_events_are_ordered() { let events = vec![/* ... */]; assert_events_ordered(&events); assert_unique_event_ids(&events); assert_stream_version_progression(&events, &stream_id); } }
Summary
Events in EventCore are:
- ✅ Immutable records of business facts
- ✅ Time-ordered with UUIDv7 IDs
- ✅ Version-controlled for consistency
- ✅ Atomically written across streams
- ✅ Rich with metadata for auditing
Best practices:
- Design events around business concepts
- Include all necessary data in events
- Plan for event evolution
- Use version control for consistency
- Optimize storage with partitioning
Next, let’s explore State Reconstruction →
Chapter 3.3: State Reconstruction
State reconstruction is the heart of event sourcing - rebuilding current state by replaying historical events. EventCore makes this process efficient, type-safe, and predictable.
The Concept
Instead of storing current state in a database, event sourcing:
- Stores events - The facts about what happened
- Rebuilds state - By replaying events in order
- Guarantees consistency - Same events always produce same state
Think of it like a bank account:
- Traditional: Store balance = $1000
- Event Sourcing: Store deposits and withdrawals, calculate balance
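The analogy can be made concrete in a few lines of plain Rust, independent of EventCore's API: the balance is never stored, only derived by folding over the event history.

```rust
// Events are the facts; state is a fold over them.
#[derive(Debug)]
enum AccountEvent {
    Deposited(u64),
    Withdrawn(u64),
}

// Replaying the same events always yields the same balance.
fn balance(events: &[AccountEvent]) -> u64 {
    events.iter().fold(0u64, |bal, e| match e {
        AccountEvent::Deposited(amount) => bal + amount,
        AccountEvent::Withdrawn(amount) => bal.saturating_sub(*amount),
    })
}

fn main() {
    let history = [
        AccountEvent::Deposited(1000),
        AccountEvent::Withdrawn(300),
        AccountEvent::Deposited(500),
    ];
    assert_eq!(balance(&history), 1200);
}
```

Determinism falls out of the structure: the fold has no inputs other than the events themselves.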
How EventCore Reconstructs State
The Apply Function
Every command defines how events modify state:
#![allow(unused)] fn main() { impl CommandLogic for TransferMoney { type State = AccountState; type Event = BankEvent; fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { BankEvent::AccountOpened { initial_balance, owner } => { state.exists = true; state.balance = *initial_balance; state.owner = owner.clone(); state.opened_at = event.occurred_at; } BankEvent::MoneyDeposited { amount, .. } => { state.balance += amount; state.transaction_count += 1; state.last_activity = event.occurred_at; } BankEvent::MoneyWithdrawn { amount, .. } => { state.balance = state.balance.saturating_sub(*amount); state.transaction_count += 1; state.last_activity = event.occurred_at; } } } } }
The Reconstruction Process
When a command executes, EventCore:
- Reads declared streams - Gets all events from specified streams
- Creates default state - Starts with State::default()
- Applies events in order - Calls apply() for each event
- Passes state to handle - Your business logic receives reconstructed state
#![allow(unused)] fn main() { // EventCore does this automatically: let mut state = AccountState::default(); for event in events_from_streams { command.apply(&mut state, &event); } // Your handle() method receives the final state }
State Design Patterns
Accumulator Pattern
Build up state incrementally:
#![allow(unused)] fn main() { #[derive(Default)] struct OrderState { exists: bool, items: Vec<OrderItem>, total: Money, status: OrderStatus, customer: Option<CustomerId>, } fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { OrderEvent::Created { customer_id } => { state.exists = true; state.customer = Some(*customer_id); state.status = OrderStatus::Draft; } OrderEvent::ItemAdded { item, price } => { state.items.push(item.clone()); state.total += price; } OrderEvent::Placed { .. } => { state.status = OrderStatus::Placed; } } } }
Snapshot Pattern
For expensive computations, pre-calculate during apply:
#![allow(unused)] fn main() { #[derive(Default)] struct AnalyticsState { total_revenue: Money, transactions_by_day: HashMap<Date, Vec<TransactionSummary>>, customer_lifetime_values: HashMap<CustomerId, Money>, // Pre-computed aggregates daily_averages: HashMap<Date, Money>, top_customers: BTreeSet<(Money, CustomerId)>, } fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { AnalyticsEvent::Purchase { customer, amount, date } => { // Update raw data state.total_revenue += amount; state.transactions_by_day .entry(*date) .or_default() .push(TransactionSummary { customer: *customer, amount: *amount }); // Update pre-computed values *state.customer_lifetime_values.entry(*customer).or_default() += amount; // Maintain sorted top customers state.top_customers.insert((*amount, *customer)); if state.top_customers.len() > 100 { state.top_customers.pop_first(); } // Recalculate daily average for this date let daily_total: Money = state.transactions_by_day[date] .iter() .map(|t| t.amount) .sum(); let tx_count = state.transactions_by_day[date].len(); state.daily_averages.insert(*date, daily_total / tx_count as u64); } } } }
State Machine Pattern
Track valid transitions:
#![allow(unused)] fn main() { #[derive(Default)] struct WorkflowState { current_phase: WorkflowPhase, completed_phases: HashSet<WorkflowPhase>, phase_durations: HashMap<WorkflowPhase, Duration>, last_transition: DateTime<Utc>, } fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { WorkflowEvent::PhaseCompleted { phase, started_at } => { // Record phase duration let duration = event.occurred_at - started_at; state.phase_durations.insert(*phase, duration); // Mark as completed state.completed_phases.insert(*phase); // Transition to next phase state.current_phase = phase.next_phase(); state.last_transition = event.occurred_at; } } } }
Multi-Stream State Reconstruction
When commands read multiple streams, state combines data from all:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct ProcessPayment { #[stream] order_id: StreamId, #[stream] customer_id: StreamId, #[stream] payment_method_id: StreamId, amount: Money, } #[derive(Default)] struct PaymentState { // From order stream order: OrderInfo, // From customer stream customer: CustomerInfo, customer_payment_history: Vec<PaymentRecord>, // From payment method stream payment_method: PaymentMethodInfo, recent_charges: Vec<ChargeAttempt>, } fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { // Events from different streams update different parts of state match (&event.stream_id, &event.payload) { (stream_id, PaymentEvent::Order(order_event)) if stream_id == &self.order_id => { // Update order portion of state apply_order_event(&mut state.order, order_event); } (stream_id, PaymentEvent::Customer(customer_event)) if stream_id == &self.customer_id => { // Update customer portion of state apply_customer_event(&mut state.customer, customer_event); } (stream_id, PaymentEvent::PaymentMethod(pm_event)) if stream_id == &self.payment_method_id => { // Update payment method portion of state apply_payment_method_event(&mut state.payment_method, pm_event); } _ => {} // Ignore events from other streams } } }
Performance Optimization
Selective State Loading
Only reconstruct what you need:
#![allow(unused)] fn main() { #[derive(Default)] struct AccountState { // Core fields - always loaded exists: bool, balance: Money, status: AccountStatus, // Optional expensive data transaction_history: Option<Vec<Transaction>>, statistics: Option<AccountStatistics>, } fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { // Always update core fields match &event.payload { BankEvent::MoneyDeposited { amount, .. } => { state.balance += amount; } // ... } // Only build history if requested if state.transaction_history.is_some() { if let Some(tx) = event_to_transaction(&event) { state.transaction_history .as_mut() .unwrap() .push(tx); } } } // In handle(), decide what to load: async fn handle(&self, /* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Enable history loading for this command let mut state = Self::State::default(); if self.requires_history() { state.transaction_history = Some(Vec::new()); } // State reconstruction will populate history // ... } }
Event Filtering
Skip irrelevant events during reconstruction:
#![allow(unused)] fn main() { fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { // Skip old events for performance let cutoff_date = Utc::now() - Duration::days(90); if event.occurred_at < cutoff_date { return; // Skip events older than 90 days } match &event.payload { // Process only recent events } } }
Memoization
Cache expensive calculations:
#![allow(unused)] fn main() { #[derive(Default)] struct MemoizedState { balance: Money, // Cache expensive calculations #[serde(skip)] cached_risk_score: Option<(DateTime<Utc>, RiskScore)>, } impl MemoizedState { fn risk_score(&mut self) -> RiskScore { let now = Utc::now(); // Check cache validity (1 hour) if let Some((cached_at, score)) = self.cached_risk_score { if now - cached_at < Duration::hours(1) { return score; } } // Calculate expensive risk score let score = calculate_risk_score(self); self.cached_risk_score = Some((now, score)); score } } }
Testing State Reconstruction
Unit Testing Apply Functions
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; use eventcore::testing::builders::*; #[test] fn test_balance_calculation() { let command = TransferMoney { /* ... */ }; let mut state = AccountState::default(); // Create test events let events = vec![ create_event(BankEvent::AccountOpened { initial_balance: 1000, owner: "Alice".to_string(), }), create_event(BankEvent::MoneyDeposited { amount: 500, reference: "Salary".to_string(), }), create_event(BankEvent::MoneyWithdrawn { amount: 200, reference: "Rent".to_string(), }), ]; // Apply events for event in events { command.apply(&mut state, &event); } // Verify final state assert_eq!(state.balance, 1300); // 1000 + 500 - 200 assert_eq!(state.transaction_count, 2); assert!(state.exists); } } }
Property-Based Testing
#![allow(unused)] fn main() { use proptest::prelude::*; proptest! { #[test] fn balance_stays_within_total_deposits( deposits in prop::collection::vec(1..1000u64, 0..10), withdrawals in prop::collection::vec(1..2000u64, 0..20), ) { let command = TransferMoney { /* ... */ }; let mut state = AccountState::default(); // Open account let open_event = create_event(BankEvent::AccountOpened { initial_balance: 0, owner: "Test".to_string(), }); command.apply(&mut state, &open_event); // Apply deposits for &amount in &deposits { let event = create_event(BankEvent::MoneyDeposited { amount, reference: "Deposit".to_string(), }); command.apply(&mut state, &event); } // Apply withdrawals for amount in withdrawals { let event = create_event(BankEvent::MoneyWithdrawn { amount, reference: "Withdrawal".to_string(), }); command.apply(&mut state, &event); } // An unsigned balance can never be negative, so the meaningful invariant // is that saturating_sub keeps withdrawals from exceeding total deposits prop_assert!(state.balance <= deposits.iter().sum::<u64>()); } } }
Testing Event Order Independence
Some state calculations should be order-independent:
#![allow(unused)] fn main() { #[test] fn test_commutative_operations() { let events = vec![ create_tag_added_event("rust"), create_tag_added_event("async"), create_tag_added_event("eventstore"), ]; // Apply in different orders let mut state1 = TagState::default(); for event in &events { apply_tag_event(&mut state1, event); } let mut state2 = TagState::default(); for event in events.iter().rev() { apply_tag_event(&mut state2, event); } // Final state should be the same assert_eq!(state1.tags, state2.tags); } }
Common Pitfalls and Solutions
1. Mutable External State
❌ Wrong: Depending on external state
#![allow(unused)] fn main() { fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { OrderEvent::Created { .. } => { // DON'T DO THIS - external dependency! state.tax_rate = fetch_current_tax_rate(); } } } }
✅ Right: Store everything in events
#![allow(unused)] fn main() { fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { OrderEvent::Created { tax_rate, .. } => { // Tax rate was captured when event was created state.tax_rate = *tax_rate; } } } }
2. Non-Deterministic Operations
❌ Wrong: Using current time
#![allow(unused)] fn main() { fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { OrderEvent::Created { .. } => { // DON'T DO THIS - non-deterministic! state.age_in_days = (Utc::now() - event.occurred_at).num_days(); } } } }
✅ Right: Calculate in handle() if needed
#![allow(unused)] fn main() { async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Calculate age here, not in apply() let age_in_days = (Utc::now() - state.created_at).num_days(); // Use for business logic... } }
3. Unbounded State Growth
❌ Wrong: Keeping everything forever
#![allow(unused)] fn main() { fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { LogEvent::Entry { message } => { // DON'T DO THIS - unbounded growth! state.all_log_entries.push(message.clone()); } } } }
✅ Right: Keep bounded state
#![allow(unused)] fn main() { fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { LogEvent::Entry { message, level } => { // Keep only recent errors if *level == LogLevel::Error { state.recent_errors.push(message.clone()); if state.recent_errors.len() > 100 { state.recent_errors.remove(0); } } // Track counts instead of full data *state.entries_by_level.entry(*level).or_default() += 1; } } } }
Advanced Patterns
Temporal State
Track state changes over time:
#![allow(unused)] fn main() { #[derive(Default)] struct TemporalState { current_value: i32, history: BTreeMap<DateTime<Utc>, i32>, transitions: Vec<StateTransition>, } fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { let old_value = state.current_value; match &event.payload { ValueEvent::Changed { new_value } => { state.current_value = *new_value; state.history.insert(event.occurred_at, *new_value); state.transitions.push(StateTransition { at: event.occurred_at, from: old_value, to: *new_value, event_id: event.id, }); } } } impl TemporalState { /// Get value at a specific point in time fn value_at(&self, timestamp: DateTime<Utc>) -> Option<i32> { self.history .range(..=timestamp) .next_back() .map(|(_, &value)| value) } } }
Derived State
Calculate derived values efficiently:
#![allow(unused)] fn main() { #[derive(Default)] struct DerivedState { // Raw data orders: Vec<Order>, // Derived data (calculated in apply) total_revenue: Money, average_order_value: Option<Money>, orders_by_status: HashMap<OrderStatus, usize>, } fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { OrderEvent::Placed { order } => { // Update raw data state.orders.push(order.clone()); // Update derived data incrementally state.total_revenue += order.total; state.average_order_value = Some( state.total_revenue / state.orders.len() as u64 ); *state.orders_by_status .entry(OrderStatus::Placed) .or_default() += 1; } } } }
Summary
State reconstruction in EventCore:
- ✅ Deterministic - Same events always produce same state
- ✅ Type-safe - State structure defined by types
- ✅ Efficient - Only reconstruct what you need
- ✅ Testable - Easy to verify with known events
- ✅ Flexible - Support any state structure
Best practices:
- Keep apply() functions pure and deterministic
- Pre-calculate expensive derived data
- Design state for your command’s needs
- Test state reconstruction thoroughly
- Optimize for your access patterns
Next, let’s explore Multi-Stream Atomicity →
Chapter 3.4: Multi-Stream Atomicity
Multi-stream atomicity is EventCore’s key innovation. Traditional event sourcing forces you to choose aggregate boundaries upfront. EventCore lets each command define its own consistency boundary dynamically.
The Problem with Traditional Aggregates
In traditional event sourcing:
#![allow(unused)] fn main() { // Traditional approach - rigid boundaries struct BankAccount { id: AccountId, balance: Money, // Can only modify THIS account atomically } // ❌ Cannot atomically transfer between accounts! // Must use sagas, process managers, or eventual consistency }
This leads to:
- Complex workflows for operations spanning aggregates
- Eventual consistency where immediate consistency is needed
- Race conditions between related operations
- Difficult refactoring when boundaries need to change
EventCore’s Solution
EventCore allows atomic operations across multiple streams:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct TransferMoney { #[stream] from_account: StreamId, // Read and write this stream #[stream] to_account: StreamId, // Read and write this stream too amount: Money, } // ✅ Both accounts updated atomically or not at all! }
How It Works
1. Stream Declaration
Commands declare all streams they need:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct ProcessOrder { #[stream] order: StreamId, #[stream] inventory: StreamId, #[stream] customer: StreamId, #[stream] payment: StreamId, } }
2. Atomic Read Phase
EventCore reads all declared streams with version tracking:
#![allow(unused)] fn main() { // EventCore does this internally: let stream_data = HashMap::new(); for stream_id in command.read_streams() { let events = event_store.read_stream(&stream_id).await?; stream_data.insert(stream_id, StreamData { version: events.version, events: events.events, }); } }
3. State Reconstruction
State is built from all streams:
#![allow(unused)] fn main() { let mut state = OrderProcessingState::default(); for (stream_id, data) in &stream_data { for event in &data.events { command.apply(&mut state, event); } } }
4. Command Execution
Your business logic runs with full state:
#![allow(unused)] fn main() { async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Validate across all streams require!(state.order.is_valid(), "Invalid order"); require!(state.inventory.has_stock(&self.items), "Insufficient stock"); require!(state.customer.can_purchase(), "Customer not authorized"); require!(state.payment.has_funds(self.total), "Insufficient funds"); // Generate events for multiple streams Ok(vec![ StreamWrite::new(&read_streams, self.order.clone(), OrderEvent::Confirmed { /* ... */ })?, StreamWrite::new(&read_streams, self.inventory.clone(), InventoryEvent::Reserved { /* ... */ })?, StreamWrite::new(&read_streams, self.customer.clone(), CustomerEvent::OrderPlaced { /* ... */ })?, StreamWrite::new(&read_streams, self.payment.clone(), PaymentEvent::Charged { /* ... */ })?, ]) } }
5. Atomic Write Phase
All events written atomically with version checks:
#![allow(unused)] fn main() { // EventCore ensures all-or-nothing write event_store.write_events(vec![ EventToWrite { stream_id: order_stream, payload: order_event, expected_version: ExpectedVersion::Exact(order_version), }, EventToWrite { stream_id: inventory_stream, payload: inventory_event, expected_version: ExpectedVersion::Exact(inventory_version), }, // ... more events ]).await?; }
Consistency Guarantees
Version Checking
EventCore prevents concurrent modifications:
#![allow(unused)] fn main() { // Command A reads order v5, inventory v10 // Command B reads order v5, inventory v10 // Command A writes first - succeeds // Order → v6, Inventory → v11 // Command B tries to write - FAILS // Version conflict detected! }
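The check itself is simple to sketch with hypothetical types (the real store performs it inside a database transaction): a write names the version it read, and is rejected if the stream has moved on in the meantime.

```rust
// A toy stream with optimistic-concurrency append.
struct Stream {
    version: u64,
    events: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum WriteError {
    VersionConflict { expected: u64, actual: u64 },
}

// Append succeeds only if the caller's expected version still matches.
fn append(stream: &mut Stream, expected: u64, event: &str) -> Result<u64, WriteError> {
    if stream.version != expected {
        return Err(WriteError::VersionConflict { expected, actual: stream.version });
    }
    stream.events.push(event.to_string());
    stream.version += 1;
    Ok(stream.version)
}

fn main() {
    let mut stream = Stream { version: 5, events: Vec::new() };
    // Command A read version 5 and writes first: succeeds, stream -> v6.
    assert_eq!(append(&mut stream, 5, "a"), Ok(6));
    // Command B also read version 5: its write is rejected.
    assert_eq!(
        append(&mut stream, 5, "b"),
        Err(WriteError::VersionConflict { expected: 5, actual: 6 })
    );
}
```

The loser of the race is never silently overwritten; it sees the conflict and can re-read, rebuild state, and decide again.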
Automatic Retry
On version conflicts, EventCore:
- Re-reads all streams
- Rebuilds state with new events
- Re-executes command logic
- Attempts write again
#![allow(unused)] fn main() { // This happens automatically: loop { let (state, versions) = read_and_build_state().await?; let events = command.handle(state).await?; match write_with_version_check(events, versions).await { Ok(_) => return Ok(()), Err(VersionConflict) => continue, // Retry Err(e) => return Err(e), } } }
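In practice the retry loop is bounded rather than infinite. A hedged sketch with a cap (the closure and string errors are illustrative stand-ins for the executor's real types):

```rust
// Retry an attempt up to max_attempts times, but only on version conflicts;
// any other error aborts immediately. Returns the attempt number on success.
fn execute_with_retry<F>(mut attempt: F, max_attempts: u32) -> Result<u32, String>
where
    F: FnMut() -> Result<(), String>,
{
    for n in 1..=max_attempts {
        match attempt() {
            Ok(()) => return Ok(n), // succeeded on attempt n
            Err(ref e) if e.as_str() == "VersionConflict" => continue, // conflict: retry
            Err(e) => return Err(e), // other errors are not retried
        }
    }
    Err("retries exhausted".to_string())
}

fn main() {
    // Conflicts twice, then the re-read state lets the write through.
    let mut calls = 0;
    let result = execute_with_retry(
        || {
            calls += 1;
            if calls < 3 {
                Err("VersionConflict".to_string())
            } else {
                Ok(())
            }
        },
        5,
    );
    assert_eq!(result, Ok(3));
}
```

A real executor would also add backoff with jitter between attempts so two persistently colliding commands do not retry in lockstep.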
Dynamic Stream Discovery
Commands can discover additional streams during execution:
#![allow(unused)] fn main() { async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Discover we need product streams based on order items let product_streams: Vec<StreamId> = state.order.items .iter() .map(|item| StreamId::from(format!("product-{}", item.product_id))) .collect(); // Request these additional streams stream_resolver.add_streams(product_streams); // EventCore will re-execute with all streams Ok(vec![]) } }
Real-World Examples
E-Commerce Checkout
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct CheckoutCart { #[stream] cart: StreamId, #[stream] customer: StreamId, #[stream] payment_method: StreamId, // Product streams discovered dynamically } async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Add product streams for inventory check let product_streams: Vec<StreamId> = state.cart.items .keys() .map(|id| StreamId::from(format!("product-{}", id))) .collect(); stream_resolver.add_streams(product_streams); // Validate everything atomically for (product_id, quantity) in &state.cart.items { let product_state = &state.products[product_id]; require!( product_state.available_stock >= *quantity, "Insufficient stock for product {}", product_id ); } // Generate events for all affected streams let mut events = vec![ // Convert cart to order StreamWrite::new(&read_streams, self.cart.clone(), CartEvent::CheckedOut { order_id })?, // Create order StreamWrite::new(&read_streams, order_stream, OrderEvent::Created { /* ... */ })?, // Charge payment StreamWrite::new(&read_streams, self.payment_method.clone(), PaymentEvent::Charged { amount: state.cart.total })?, ]; // Reserve inventory from each product for (product_id, quantity) in &state.cart.items { let product_stream = StreamId::from(format!("product-{}", product_id)); events.push(StreamWrite::new(&read_streams, product_stream, ProductEvent::StockReserved { quantity: *quantity })?); } Ok(events) } }
Distributed Ledger
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct RecordTransaction { #[stream] ledger: StreamId, #[stream] account_a: StreamId, #[stream] account_b: StreamId, entry: LedgerEntry, } async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Ensure double-entry bookkeeping consistency require!( self.entry.debits == self.entry.credits, "Debits must equal credits" ); // Validate account states require!( state.account_a.is_active && state.account_b.is_active, "Both accounts must be active" ); // Record atomically in all streams Ok(vec![ StreamWrite::new(&read_streams, self.ledger.clone(), LedgerEvent::EntryRecorded { entry: self.entry.clone() })?, StreamWrite::new(&read_streams, self.account_a.clone(), AccountEvent::Debited { amount: self.entry.debit_amount, reference: self.entry.id, })?, StreamWrite::new(&read_streams, self.account_b.clone(), AccountEvent::Credited { amount: self.entry.credit_amount, reference: self.entry.id, })?, ]) } }
Workflow Orchestration
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct CompleteWorkflowStep { #[stream] workflow: StreamId, #[stream] current_step: StreamId, // Next step stream discovered dynamically step_result: StepResult, } async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Determine next step based on current state and result let next_step_id = match (&state.current_step.step_type, &self.step_result) { (StepType::Approval, StepResult::Approved) => state.workflow.next_step, (StepType::Approval, StepResult::Rejected) => state.workflow.rejection_step, (StepType::Processing, StepResult::Success) => state.workflow.next_step, (StepType::Processing, StepResult::Error) => state.workflow.error_step, _ => None, }; // Add next step stream if needed if let Some(next_id) = next_step_id { let next_stream = StreamId::from(format!("step-{}", next_id)); stream_resolver.add_streams(vec![next_stream.clone()]); } // Atomic update across workflow and steps let mut events = vec![ StreamWrite::new(&read_streams, self.workflow.clone(), WorkflowEvent::StepCompleted { step_id: state.current_step.id, result: self.step_result.clone(), })?, StreamWrite::new(&read_streams, self.current_step.clone(), StepEvent::Completed { result: self.step_result.clone(), })?, ]; // Activate next step if let Some(next_id) = next_step_id { let next_stream = StreamId::from(format!("step-{}", next_id)); events.push(StreamWrite::new(&read_streams, next_stream, StepEvent::Activated { workflow_id: state.workflow.id, activation_time: Utc::now(), })?); } Ok(events) } }
Performance Considerations
Stream Count Impact
Reading more streams has costs:
#![allow(unused)]
fn main() {
    // Benchmark results (example):
    //  1 stream:   5ms average latency
    //  5 streams:  12ms average latency
    // 10 streams:  25ms average latency
    // 50 streams: 150ms average latency

    // Design commands to read only necessary streams
}
Optimization Strategies
- Stream Partitioning
#![allow(unused)]
fn main() {
    // Instead of one hot stream:
    let stream = StreamId::from_static("orders");

    // Partition by customer, e.g. by hashing the id into 16 buckets:
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    let mut hasher = DefaultHasher::new();
    customer_id.hash(&mut hasher);
    let stream = StreamId::from(format!("orders-{}", hasher.finish() % 16));
}
- Lazy Stream Loading
#![allow(unused)]
fn main() {
    async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> {
        // Only load detail streams if needed
        if state.requires_detailed_check() {
            let detail_streams = compute_detail_streams(&state);
            stream_resolver.add_streams(detail_streams);
        }

        // Continue with basic validation...
    }
}
- Read Filtering
#![allow(unused)]
fn main() {
    // EventCore may support filtered reads (future feature):
    let options = ReadOptions::default()
        .from_version(EventVersion::new(1000))          // Skip old events
        .event_types(&["OrderPlaced", "OrderShipped"]); // Only specific types
}
Testing Multi-Stream Commands
Integration Tests
#![allow(unused)] fn main() { #[tokio::test] async fn test_multi_stream_atomicity() { let store = InMemoryEventStore::<BankEvent>::new(); let executor = CommandExecutor::new(store.clone()); // Setup initial state create_account(&executor, "account-1", 1000).await; create_account(&executor, "account-2", 500).await; // Execute transfer let transfer = TransferMoney { from_account: StreamId::from_static("account-1"), to_account: StreamId::from_static("account-2"), amount: 300, }; executor.execute(&transfer).await.unwrap(); // Verify both accounts updated atomically let account1 = get_balance(&store, "account-1").await; let account2 = get_balance(&store, "account-2").await; assert_eq!(account1, 700); // 1000 - 300 assert_eq!(account2, 800); // 500 + 300 assert_eq!(account1 + account2, 1500); // Total preserved } }
Concurrent Modification Tests
#![allow(unused)]
fn main() {
    #[tokio::test]
    async fn test_concurrent_transfers() {
        let store = InMemoryEventStore::<BankEvent>::new();
        let executor = CommandExecutor::new(store.clone());

        // Setup accounts
        create_account(&executor, "A", 1000).await;
        create_account(&executor, "B", 1000).await;
        create_account(&executor, "C", 1000).await;

        // Concurrent transfers forming a cycle
        let transfer_ab = TransferMoney {
            from_account: StreamId::from_static("A"),
            to_account: StreamId::from_static("B"),
            amount: 100,
        };
        let transfer_bc = TransferMoney {
            from_account: StreamId::from_static("B"),
            to_account: StreamId::from_static("C"),
            amount: 100,
        };
        let transfer_ca = TransferMoney {
            from_account: StreamId::from_static("C"),
            to_account: StreamId::from_static("A"),
            amount: 100,
        };

        // Execute concurrently
        let (r1, r2, r3) = tokio::join!(
            executor.execute(&transfer_ab),
            executor.execute(&transfer_bc),
            executor.execute(&transfer_ca),
        );

        // All should succeed (with retries)
        assert!(r1.is_ok());
        assert!(r2.is_ok());
        assert!(r3.is_ok());

        // Total balance preserved
        let total = get_balance(&store, "A").await
            + get_balance(&store, "B").await
            + get_balance(&store, "C").await;
        assert_eq!(total, 3000);
    }
}
Common Patterns
Read-Only Streams
Some streams are read but not written:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct ValidateTransaction { #[stream] transaction: StreamId, #[stream] rules_engine: StreamId, // Read-only for validation rules #[stream] fraud_history: StreamId, // Read-only for risk assessment } async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Use read-only streams for validation let risk_score = calculate_risk(&state.fraud_history); let applicable_rules = state.rules_engine.rules_for(&self.transaction); // Only write to transaction stream Ok(vec![ StreamWrite::new(&read_streams, self.transaction.clone(), TransactionEvent::Validated { risk_score })? ]) } }
Conditional Stream Writes
Write to streams based on business logic:
#![allow(unused)] fn main() { async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { let mut events = vec![]; // Always update the main stream events.push(StreamWrite::new(&read_streams, self.order.clone(), OrderEvent::Processed { /* ... */ })?); // Conditionally update other streams if state.customer.is_vip { events.push(StreamWrite::new(&read_streams, self.customer.clone(), CustomerEvent::VipPointsEarned { points: calculate_points() })?); } if state.requires_fraud_check() { events.push(StreamWrite::new(&read_streams, fraud_stream, FraudEvent::CheckRequested { /* ... */ })?); } Ok(events) } }
Summary
Multi-stream atomicity in EventCore provides:
- ✅ Dynamic boundaries - Each command defines its consistency needs
- ✅ True atomicity - All streams updated together or not at all
- ✅ Automatic retries - Handle concurrent modifications gracefully
- ✅ Stream discovery - Add streams dynamically during execution
- ✅ Type safety - Compile-time guarantees about stream access
Best practices:
- Declare minimal required streams upfront
- Use dynamic discovery for conditional streams
- Design for retry-ability (idempotent operations)
- Test concurrent scenarios thoroughly
- Monitor retry rates in production
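To make the retry-ability point concrete, here is a minimal, self-contained sketch of an idempotent command handler. The types below (AccountState, Event, handle_withdraw) are simplified stand-ins, not EventCore's real API: the state records which command ids it has already applied, so a retried command emits no duplicate events.

```rust
use std::collections::HashSet;

// Hypothetical, simplified types to illustrate retry-safe command design.
#[derive(Default)]
struct AccountState {
    balance: i64,
    applied: HashSet<u64>, // command ids already folded into state
}

#[derive(Debug, PartialEq)]
enum Event {
    Withdrawn { command_id: u64, amount: i64 },
}

fn handle_withdraw(state: &AccountState, command_id: u64, amount: i64) -> Vec<Event> {
    if state.applied.contains(&command_id) {
        return vec![]; // already processed: a retry is a no-op
    }
    if state.balance < amount {
        return vec![]; // business rule failure (error reporting elided in this sketch)
    }
    vec![Event::Withdrawn { command_id, amount }]
}

fn apply(state: &mut AccountState, event: &Event) {
    match event {
        Event::Withdrawn { command_id, amount } => {
            state.balance -= amount;
            state.applied.insert(*command_id);
        }
    }
}

fn main() {
    let mut state = AccountState { balance: 100, ..Default::default() };
    let events = handle_withdraw(&state, 1, 30);
    for e in &events {
        apply(&mut state, e);
    }
    assert_eq!(state.balance, 70);
    // Retrying the same command produces no new events.
    assert!(handle_withdraw(&state, 1, 30).is_empty());
}
```

Because the decision is a pure function of the rebuilt state, EventCore-style optimistic retries can safely re-run the handler after a version conflict.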
Next, let’s explore Error Handling →
Chapter 3.5: Error Handling
Error handling in EventCore is designed to be explicit, recoverable, and informative. This chapter covers error types, handling strategies, and best practices for building resilient event-sourced systems.
Error Philosophy
EventCore follows these principles:
- Errors are values - Use Result<T, E> everywhere
- Be specific - Different error types for different failures
- Fail fast - Validate early in the command pipeline
- Recover gracefully - Automatic retries for transient errors
- Provide context - Rich error messages for debugging
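These principles can be sketched with plain std types. The TransferError enum and validate_transfer function below are hypothetical illustrations, not EventCore APIs: validation fails fast, returns a specific error value, and carries context in the error itself.

```rust
// Errors are values: the caller receives a Result and decides what to do.
#[derive(Debug, PartialEq)]
enum TransferError {
    NonPositiveAmount,
    InsufficientBalance { available: i64, requested: i64 },
}

fn validate_transfer(balance: i64, amount: i64) -> Result<(), TransferError> {
    // Fail fast: cheapest check first.
    if amount <= 0 {
        return Err(TransferError::NonPositiveAmount);
    }
    // Provide context: the error carries both sides of the comparison.
    if balance < amount {
        return Err(TransferError::InsufficientBalance {
            available: balance,
            requested: amount,
        });
    }
    Ok(())
}

fn main() {
    assert!(validate_transfer(100, 30).is_ok());
    assert_eq!(
        validate_transfer(10, 30),
        Err(TransferError::InsufficientBalance { available: 10, requested: 30 })
    );
}
```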
Error Types
Command Errors
The main error type for command execution:
#![allow(unused)] fn main() { #[derive(Debug, thiserror::Error)] pub enum CommandError { #[error("Validation failed: {0}")] ValidationFailed(String), #[error("Business rule violation: {0}")] BusinessRuleViolation(String), #[error("Stream not found: {0}")] StreamNotFound(StreamId), #[error("Concurrency conflict on streams: {0:?}")] ConcurrencyConflict(Vec<StreamId>), #[error("Event store error: {0}")] EventStore(#[from] EventStoreError), #[error("Serialization error: {0}")] Serialization(#[from] serde_json::Error), #[error("Maximum retries exceeded: {0}")] MaxRetriesExceeded(String), } }
Event Store Errors
Storage-specific errors:
#![allow(unused)] fn main() { #[derive(Debug, thiserror::Error)] pub enum EventStoreError { #[error("Version conflict in stream {stream_id}: expected {expected:?}, actual {actual}")] VersionConflict { stream_id: StreamId, expected: ExpectedVersion, actual: EventVersion, }, #[error("Stream {0} not found")] StreamNotFound(StreamId), #[error("Database error: {0}")] Database(String), #[error("Connection error: {0}")] Connection(String), #[error("Timeout after {0:?}")] Timeout(Duration), #[error("Transaction rolled back: {0}")] TransactionRollback(String), } }
Validation Patterns
Using the require! Macro
The require! macro makes validation concise:
#![allow(unused)] fn main() { use eventcore::require; async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Simple validation require!(self.amount > 0, "Amount must be positive"); // Validation with formatting require!( state.balance >= self.amount, "Insufficient balance: have {}, need {}", state.balance, self.amount ); // Complex validation require!( state.account.is_active && !state.account.is_frozen, "Account must be active and not frozen" ); // Continue with business logic... Ok(vec![/* events */]) } }
Custom Validation Functions
For complex validations:
#![allow(unused)] fn main() { impl TransferMoney { fn validate_business_rules(&self, state: &AccountState) -> CommandResult<()> { // Daily limit check self.validate_daily_limit(state)?; // Fraud check self.validate_fraud_rules(state)?; // Compliance check self.validate_compliance(state)?; Ok(()) } fn validate_daily_limit(&self, state: &AccountState) -> CommandResult<()> { const DAILY_LIMIT: Money = Money::from_cents(50_000_00); let today_total = state.transfers_today() + self.amount; require!( today_total <= DAILY_LIMIT, "Daily transfer limit exceeded: {} > {}", today_total, DAILY_LIMIT ); Ok(()) } } // In handle() async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Run all validations self.validate_business_rules(&state)?; // Generate events... } }
Type-Safe Validation
Use types to make invalid states unrepresentable:
#![allow(unused)] fn main() { use nutype::nutype; // Email validation at type level #[nutype( sanitize(lowercase, trim), validate(regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"), derive(Debug, Clone, Serialize, Deserialize) )] pub struct Email(String); // Money that can't be negative #[nutype( validate(greater_or_equal = 0), derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord) )] pub struct Money(u64); // Now these validations happen at construction let email = Email::try_new("invalid-email")?; // Fails at parse time let amount = Money::try_new(-100)?; // Compile error - u64 can't be negative }
Handling Transient Errors
Automatic Retries
EventCore automatically retries on version conflicts:
#![allow(unused)] fn main() { // This happens inside EventCore: pub async fn execute_with_retry<C: Command>( command: &C, max_retries: usize, ) -> CommandResult<ExecutionResult> { let mut attempts = 0; loop { attempts += 1; match execute_once(command).await { Ok(result) => return Ok(result), Err(CommandError::ConcurrencyConflict(_)) if attempts < max_retries => { // Exponential backoff let delay = Duration::from_millis(100 * 2_u64.pow(attempts as u32)); tokio::time::sleep(delay).await; continue; } Err(e) => return Err(e), } } } }
Circuit Breaker Pattern
Protect against cascading failures:
#![allow(unused)]
fn main() {
    use std::future::Future;
    use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};
    use std::time::{Duration, SystemTime, UNIX_EPOCH};

    pub enum CircuitBreakerError<E> {
        Open,
        Failed(E),
    }

    pub struct CircuitBreaker {
        failure_count: AtomicU32,
        last_failure_time: AtomicU64, // Unix millis of the last failure
        threshold: u32,
        timeout: Duration,
    }

    impl CircuitBreaker {
        // The protected operation is async, so `call` accepts a
        // future-returning closure and awaits it.
        pub async fn call<F, Fut, T, E>(&self, f: F) -> Result<T, CircuitBreakerError<E>>
        where
            F: FnOnce() -> Fut,
            Fut: Future<Output = Result<T, E>>,
        {
            // Check if circuit is open
            if self.is_open() {
                return Err(CircuitBreakerError::Open);
            }

            // Try the operation
            match f().await {
                Ok(result) => {
                    self.on_success();
                    Ok(result)
                }
                Err(e) => {
                    self.on_failure();
                    Err(CircuitBreakerError::Failed(e))
                }
            }
        }

        fn is_open(&self) -> bool {
            let failures = self.failure_count.load(Ordering::Relaxed);
            if failures < self.threshold {
                return false;
            }
            let last_failure = self.last_failure_time.load(Ordering::Relaxed);
            let now = SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .unwrap()
                .as_millis() as u64;
            // Stay open until the timeout has elapsed since the last failure
            Duration::from_millis(now.saturating_sub(last_failure)) < self.timeout
        }

        fn on_success(&self) {
            self.failure_count.store(0, Ordering::Relaxed);
        }

        fn on_failure(&self) {
            self.failure_count.fetch_add(1, Ordering::Relaxed);
            let now = SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .unwrap()
                .as_millis() as u64;
            self.last_failure_time.store(now, Ordering::Relaxed);
        }
    }

    // Usage in event store
    impl PostgresEventStore {
        pub async fn read_stream_with_circuit_breaker(
            &self,
            stream_id: &StreamId,
        ) -> Result<StreamEvents, EventStoreError> {
            self.circuit_breaker
                .call(|| self.read_stream_internal(stream_id))
                .await
                .map_err(|e| match e {
                    CircuitBreakerError::Open => {
                        EventStoreError::Connection("circuit breaker open".to_string())
                    }
                    CircuitBreakerError::Failed(inner) => inner,
                })
        }
    }
}
Error Recovery Strategies
Compensating Commands
When things go wrong, emit compensating events:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct RefundPayment { #[stream] payment: StreamId, #[stream] account: StreamId, reason: RefundReason, } async fn handle(/* ... */) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Validate refund is possible require!( state.payment.status == PaymentStatus::Completed, "Can only refund completed payments" ); require!( !state.payment.is_refunded, "Payment already refunded" ); // Compensating events Ok(vec![ StreamWrite::new(&read_streams, self.payment.clone(), PaymentEvent::Refunded { amount: state.payment.amount, reason: self.reason.clone(), })?, StreamWrite::new(&read_streams, self.account.clone(), AccountEvent::Credited { amount: state.payment.amount, reference: format!("Refund for payment {}", state.payment.id), })?, ]) } }
Dead Letter Queues
Handle permanently failed commands:
#![allow(unused)] fn main() { pub struct DeadLetterQueue<C: Command> { failed_commands: Vec<FailedCommand<C>>, } #[derive(Debug)] pub struct FailedCommand<C> { pub command: C, pub error: CommandError, pub attempts: usize, pub first_attempted: DateTime<Utc>, pub last_attempted: DateTime<Utc>, } impl<C: Command> CommandExecutor<C> { pub async fn execute_with_dlq( &self, command: C, dlq: &mut DeadLetterQueue<C>, ) -> CommandResult<ExecutionResult> { match self.execute_with_retry(&command, 5).await { Ok(result) => Ok(result), Err(e) if e.is_permanent() => { // Add to DLQ for manual intervention dlq.add(FailedCommand { command, error: e.clone(), attempts: 5, first_attempted: Utc::now(), last_attempted: Utc::now(), }); Err(e) } Err(e) => Err(e), } } } }
Error Context and Debugging
Rich Error Context
Add context to errors:
#![allow(unused)] fn main() { use std::fmt; #[derive(Debug)] pub struct ErrorContext { pub command_type: &'static str, pub stream_ids: Vec<StreamId>, pub correlation_id: CorrelationId, pub user_id: Option<UserId>, pub additional_context: HashMap<String, String>, } impl fmt::Display for ErrorContext { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { write!(f, "Command: {}, Streams: {:?}, Correlation: {}", self.command_type, self.stream_ids, self.correlation_id )?; if let Some(user) = &self.user_id { write!(f, ", User: {}", user)?; } for (key, value) in &self.additional_context { write!(f, ", {}: {}", key, value)?; } Ok(()) } } // Wrap errors with context pub type ContextualResult<T> = Result<T, ContextualError>; #[derive(Debug, thiserror::Error)] #[error("{context}\nError: {source}")] pub struct ContextualError { #[source] source: CommandError, context: ErrorContext, } }
Structured Logging
Log errors with full context:
#![allow(unused)] fn main() { use tracing::{error, warn, info, instrument}; #[instrument(skip(self, read_streams, state, stream_resolver))] async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { info!( amount = %self.amount, from = %self.from_account, to = %self.to_account, "Processing transfer" ); if let Err(e) = self.validate_business_rules(&state) { error!( error = %e, balance = %state.balance, daily_total = %state.transfers_today(), "Transfer validation failed" ); return Err(e); } // Continue... } }
Testing Error Scenarios
Unit Tests for Validation
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; #[tokio::test] async fn test_insufficient_balance_error() { let command = TransferMoney { from_account: StreamId::from_static("account-1"), to_account: StreamId::from_static("account-2"), amount: Money::from_cents(1000), }; let state = AccountState { balance: Money::from_cents(500), ..Default::default() }; let result = command.validate_business_rules(&state); assert!(matches!( result, Err(CommandError::ValidationFailed(msg)) if msg.contains("Insufficient balance") )); } #[tokio::test] async fn test_daily_limit_exceeded() { let command = TransferMoney { from_account: StreamId::from_static("account-1"), to_account: StreamId::from_static("account-2"), amount: Money::from_cents(10_000), }; let mut state = AccountState::default(); state.add_todays_transfer(Money::from_cents(45_000)); let result = command.validate_business_rules(&state); assert!(matches!( result, Err(CommandError::BusinessRuleViolation(msg)) if msg.contains("Daily transfer limit") )); } } }
Integration Tests for Concurrency
#![allow(unused)]
fn main() {
    #[tokio::test]
    async fn test_concurrent_modification_handling() {
        let store = InMemoryEventStore::new();
        let executor = CommandExecutor::new(store.clone());

        // Setup
        create_account(&executor, "account-1", 1000).await;

        // Create two conflicting commands
        let withdraw1 = WithdrawMoney {
            account: StreamId::from_static("account-1"),
            amount: Money::from_cents(600),
        };
        let withdraw2 = WithdrawMoney {
            account: StreamId::from_static("account-1"),
            amount: Money::from_cents(700),
        };

        // Execute concurrently
        let (result1, result2) = tokio::join!(
            executor.execute(&withdraw1),
            executor.execute(&withdraw2)
        );

        // One should succeed; the other fails after retry because the
        // remaining balance is insufficient
        let successes = [&result1, &result2]
            .iter()
            .filter(|r| r.is_ok())
            .count();
        assert_eq!(successes, 1, "Exactly one withdrawal should succeed");

        // Check final balance
        let balance = get_account_balance(&store, "account-1").await;
        assert!(balance == 400 || balance == 300); // 1000 - 600 or 1000 - 700
    }
}
Chaos Testing
#![allow(unused)] fn main() { use eventcore::testing::chaos::ChaosConfig; #[tokio::test] async fn test_resilience_under_chaos() { let base_store = InMemoryEventStore::new(); let chaos_store = base_store.with_chaos(ChaosConfig { failure_probability: 0.1, // 10% chance of failure latency_ms: Some(50..200), // Random latency version_conflict_probability: 0.2, // 20% chance of conflicts }); let executor = CommandExecutor::new(chaos_store) .with_max_retries(10); // Run many operations let mut handles = vec![]; for i in 0..100 { let executor = executor.clone(); let handle = tokio::spawn(async move { let command = CreateTask { title: format!("Task {}", i), // ... }; executor.execute(&command).await }); handles.push(handle); } // Collect results let results: Vec<_> = futures::future::join_all(handles).await; // Despite chaos, most should succeed due to retries let success_rate = results.iter() .filter(|r| r.as_ref().unwrap().is_ok()) .count() as f64 / results.len() as f64; assert!(success_rate > 0.95, "Success rate too low: {}", success_rate); } }
Production Error Handling
Monitoring and Alerting
#![allow(unused)] fn main() { use prometheus::{Counter, Histogram, register_counter, register_histogram}; lazy_static! { static ref COMMAND_ERRORS: Counter = register_counter!( "eventcore_command_errors_total", "Total number of command errors" ).unwrap(); static ref RETRY_COUNT: Histogram = register_histogram!( "eventcore_command_retries", "Number of retries per command" ).unwrap(); } impl CommandExecutor { async fn execute_with_metrics(&self, command: &impl Command) -> CommandResult<ExecutionResult> { let start = Instant::now(); let mut retries = 0; loop { match self.execute_once(command).await { Ok(result) => { RETRY_COUNT.observe(retries as f64); return Ok(result); } Err(e) => { COMMAND_ERRORS.inc(); if e.is_retriable() && retries < self.max_retries { retries += 1; continue; } return Err(e); } } } } } }
Error Recovery Procedures
Document recovery procedures:
#![allow(unused)]
fn main() {
    /// Recovery procedure for payment processing failures
    ///
    /// 1. Check payment provider status
    /// 2. Verify account balances match event history
    /// 3. Look for orphaned payments in provider but not in events
    /// 4. Run reconciliation command if discrepancies found
    /// 5. Contact support if automated recovery fails
    #[derive(Command, Clone)]
    struct ReconcilePayments {
        #[stream]
        payment_provider: StreamId,
        #[stream]
        reconciliation_log: StreamId,
        provider_transactions: Vec<ProviderTransaction>,
    }
}
Best Practices
1. Fail Fast
Validate as early as possible:
#![allow(unused)]
fn main() {
    // ✅ Good - validate at construction
    impl TransferMoney {
        pub fn new(
            from: StreamId,
            to: StreamId,
            amount: Money,
        ) -> Result<Self, ValidationError> {
            if from == to {
                return Err(ValidationError::SameAccount);
            }
            Ok(Self {
                from_account: from,
                to_account: to,
                amount,
            })
        }
    }

    // ❌ Bad - validate late in handle()
}
2. Be Specific
Use specific error types:
#![allow(unused)]
fn main() {
    // ✅ Good - specific errors
    #[derive(Debug, thiserror::Error)]
    pub enum TransferError {
        #[error("Insufficient balance: available {available}, requested {requested}")]
        InsufficientBalance { available: Money, requested: Money },

        #[error("Daily limit exceeded: limit {limit}, attempted {attempted}")]
        DailyLimitExceeded { limit: Money, attempted: Money },

        #[error("Account {0} is frozen")]
        AccountFrozen(AccountId),
    }

    // ❌ Bad - generic errors
    Err("Transfer failed".into())
}
3. Make Errors Actionable
Provide enough context to fix issues:
#![allow(unused)]
fn main() {
    // ✅ Good - actionable error
    require!(
        state.account.kyc_verified,
        "Account KYC verification required. Please complete verification at: https://example.com/kyc/{}",
        state.account.id
    );

    // ❌ Bad - vague error
    require!(state.account.kyc_verified, "KYC required");
}
Summary
Error handling in EventCore:
- ✅ Type-safe - Errors encoded in function signatures
- ✅ Recoverable - Automatic retries for transient failures
- ✅ Informative - Rich context for debugging
- ✅ Testable - Easy to test error scenarios
- ✅ Production-ready - Monitoring and recovery built-in
Best practices:
- Use the require! macro for concise validation
- Create specific error types for your domain
- Add context to errors for debugging
- Test error scenarios thoroughly
- Monitor errors in production
You’ve completed Part 3! Continue to Part 4: Building Web APIs →
Part 4: Building Web APIs
This part shows how to expose your EventCore application through HTTP APIs. We’ll cover command handlers, query endpoints, authentication, and API design best practices.
Chapters in This Part
- Setting Up HTTP Endpoints - Web framework integration
- Command Handlers - Exposing commands via HTTP
- Query Endpoints - Building read APIs with projections
- Authentication and Authorization - Securing your API
- API Versioning - Evolving APIs without breaking clients
What You’ll Learn
- Integrate EventCore with popular Rust web frameworks
- Design RESTful and GraphQL APIs for event-sourced systems
- Handle authentication and authorization
- Build efficient query endpoints using projections
- Version your API as your system evolves
Prerequisites
- Completed Part 3: Core Concepts
- Basic understanding of HTTP and REST APIs
- Familiarity with at least one Rust web framework helpful
Framework Examples
This part includes examples for:
- Axum - Modern, ergonomic web framework
- Actix Web - High-performance actor-based framework
- Rocket - Type-safe, developer-friendly framework
Time to Complete
- Reading: ~45 minutes
- With implementation: ~2 hours
Ready to build APIs? Let’s start with Setting Up HTTP Endpoints →
Chapter 4.1: Setting Up HTTP Endpoints
EventCore is framework-agnostic - you can use it with any Rust web framework. This chapter shows how to integrate EventCore with popular frameworks and structure your API.
Architecture Overview
HTTP Request → Web Framework → Command/Query → EventCore → Response
Your web layer should be thin, focusing on:
- Request parsing - Convert HTTP to domain types
- Authentication - Verify caller identity
- Authorization - Check permissions
- Command/Query execution - Delegate to EventCore
- Response formatting - Convert results to HTTP
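The five responsibilities above can be sketched as a plain function pipeline, independent of any web framework. All types below (RawRequest, HttpResponse) are hypothetical stand-ins, and the EventCore execution step is stubbed out; the point is the shape: parse, authenticate, authorize, execute, format.

```rust
// Hypothetical stand-ins for a framework's request/response types.
struct RawRequest {
    auth_token: Option<String>,
    body_amount: i64,
}

struct HttpResponse {
    status: u16,
    body: String,
}

fn handle_transfer(req: RawRequest) -> HttpResponse {
    // 1. Request parsing: turn raw input into a validated domain value.
    let amount = req.body_amount;
    if amount <= 0 {
        return HttpResponse { status: 400, body: "amount must be positive".into() };
    }
    // 2. Authentication: verify caller identity.
    let Some(token) = req.auth_token else {
        return HttpResponse { status: 401, body: "missing credentials".into() };
    };
    // 3. Authorization: check permissions (stubbed: only "admin" may transfer).
    if token != "admin" {
        return HttpResponse { status: 403, body: "forbidden".into() };
    }
    // 4. Command execution: this is where executor.execute(&command) would go.
    // 5. Response formatting: convert the domain result back to HTTP.
    HttpResponse { status: 201, body: format!("transferred {}", amount) }
}

fn main() {
    let ok = handle_transfer(RawRequest { auth_token: Some("admin".into()), body_amount: 5 });
    assert_eq!(ok.status, 201);
    let no_auth = handle_transfer(RawRequest { auth_token: None, body_amount: 5 });
    assert_eq!(no_auth.status, 401);
}
```

Keeping the layer this thin means every framework integration below is mostly plumbing around the same five steps.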
Axum Integration
Axum is a modern web framework that pairs well with EventCore:
Setup
[dependencies]
eventcore = "1.0"
axum = "0.7"
tokio = { version = "1", features = ["full"] }
tower = "0.4"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
Basic Application Structure
use axum::{ extract::{State, Json}, http::StatusCode, response::IntoResponse, routing::{get, post}, Router, }; use eventcore::prelude::*; use std::sync::Arc; use tokio::sync::RwLock; // Application state shared across handlers #[derive(Clone)] struct AppState { executor: CommandExecutor<PostgresEventStore>, projections: Arc<RwLock<ProjectionManager>>, } #[tokio::main] async fn main() { // Initialize EventCore let event_store = PostgresEventStore::new( "postgresql://localhost/eventcore" ).await.unwrap(); let executor = CommandExecutor::new(event_store); let projections = Arc::new(RwLock::new(ProjectionManager::new())); let state = AppState { executor, projections, }; // Build routes let app = Router::new() .route("/api/v1/tasks", post(create_task)) .route("/api/v1/tasks/:id", get(get_task)) .route("/api/v1/tasks/:id/assign", post(assign_task)) .route("/api/v1/tasks/:id/complete", post(complete_task)) .route("/api/v1/users/:id/tasks", get(get_user_tasks)) .route("/health", get(health_check)) .with_state(state); // Start server let listener = tokio::net::TcpListener::bind("0.0.0.0:3000") .await .unwrap(); axum::serve(listener, app).await.unwrap(); }
Command Handler Example
#![allow(unused)] fn main() { #[derive(Debug, Deserialize)] struct CreateTaskRequest { title: String, description: String, } #[derive(Debug, Serialize)] struct CreateTaskResponse { task_id: String, message: String, } async fn create_task( State(state): State<AppState>, Json(request): Json<CreateTaskRequest>, ) -> Result<Json<CreateTaskResponse>, ApiError> { // Validate input let title = TaskTitle::try_new(request.title) .map_err(|e| ApiError::validation(e))?; let description = TaskDescription::try_new(request.description) .map_err(|e| ApiError::validation(e))?; // Create command let task_id = TaskId::new(); let command = CreateTask { task_id: StreamId::from(format!("task-{}", task_id)), title, description, }; // Execute command state.executor .execute(&command) .await .map_err(|e| ApiError::from_command_error(e))?; // Return response Ok(Json(CreateTaskResponse { task_id: task_id.to_string(), message: "Task created successfully".to_string(), })) } }
Error Handling
#![allow(unused)] fn main() { #[derive(Debug)] struct ApiError { status: StatusCode, message: String, details: Option<serde_json::Value>, } impl ApiError { fn validation<E: std::error::Error>(error: E) -> Self { Self { status: StatusCode::BAD_REQUEST, message: error.to_string(), details: None, } } fn from_command_error(error: CommandError) -> Self { match error { CommandError::ValidationFailed(msg) => Self { status: StatusCode::BAD_REQUEST, message: msg, details: None, }, CommandError::BusinessRuleViolation(msg) => Self { status: StatusCode::UNPROCESSABLE_ENTITY, message: msg, details: None, }, CommandError::StreamNotFound(_) => Self { status: StatusCode::NOT_FOUND, message: "Resource not found".to_string(), details: None, }, CommandError::ConcurrencyConflict(_) => Self { status: StatusCode::CONFLICT, message: "Resource was modified by another request".to_string(), details: None, }, _ => Self { status: StatusCode::INTERNAL_SERVER_ERROR, message: "An internal error occurred".to_string(), details: None, }, } } } impl IntoResponse for ApiError { fn into_response(self) -> axum::response::Response { let body = serde_json::json!({ "error": { "message": self.message, "details": self.details, } }); (self.status, Json(body)).into_response() } } }
Actix Web Integration
Actix Web offers high performance and an actor-based architecture:
Setup
[dependencies]
eventcore = "1.0"
actix-web = "4"
actix-rt = "2"
Application Structure
use actix_web::{web, App, HttpServer, HttpResponse, Result}; use eventcore::prelude::*; struct AppData { executor: CommandExecutor<PostgresEventStore>, } #[actix_web::main] async fn main() -> std::io::Result<()> { let event_store = PostgresEventStore::new( "postgresql://localhost/eventcore" ).await.unwrap(); let app_data = web::Data::new(AppData { executor: CommandExecutor::new(event_store), }); HttpServer::new(move || { App::new() .app_data(app_data.clone()) .service( web::scope("/api/v1") .route("/tasks", web::post().to(create_task)) .route("/tasks/{id}", web::get().to(get_task)) .route("/tasks/{id}/assign", web::post().to(assign_task)) ) }) .bind("127.0.0.1:8080")? .run() .await } async fn create_task( data: web::Data<AppData>, request: web::Json<CreateTaskRequest>, ) -> Result<HttpResponse> { // Similar to Axum example Ok(HttpResponse::Created().json(CreateTaskResponse { task_id: "...", message: "...", })) }
Rocket Integration
Rocket provides a declarative, type-safe approach:
Setup
[dependencies]
eventcore = "1.0"
rocket = { version = "0.5", features = ["json"] }
Application Structure
#![allow(unused)] fn main() { use rocket::{State, serde::json::Json}; use eventcore::prelude::*; struct AppState { executor: CommandExecutor<PostgresEventStore>, } #[rocket::post("/tasks", data = "<request>")] async fn create_task( state: &State<AppState>, request: Json<CreateTaskRequest>, ) -> Result<Json<CreateTaskResponse>, ApiError> { // Implementation similar to Axum } #[rocket::launch] fn rocket() -> _ { let event_store = /* initialize */; rocket::build() .manage(AppState { executor: CommandExecutor::new(event_store), }) .mount("/api/v1", rocket::routes![ create_task, get_task, assign_task, ]) } }
Request/Response Design
Command Requests
Design your API requests to map cleanly to commands:
#![allow(unused)] fn main() { // HTTP Request #[derive(Deserialize)] struct TransferMoneyRequest { from_account: String, to_account: String, amount: Decimal, reference: Option<String>, } // Convert to command impl TryFrom<TransferMoneyRequest> for TransferMoney { type Error = ValidationError; fn try_from(req: TransferMoneyRequest) -> Result<Self, Self::Error> { Ok(TransferMoney { from_account: StreamId::try_new(req.from_account)?, to_account: StreamId::try_new(req.to_account)?, amount: Money::try_from_decimal(req.amount)?, reference: req.reference.unwrap_or_default(), }) } } }
Response Design
Return minimal, useful information:
#![allow(unused)] fn main() { #[derive(Serialize)] #[serde(tag = "status")] enum CommandResponse { #[serde(rename = "success")] Success { message: String, #[serde(skip_serializing_if = "Option::is_none")] resource_id: Option<String>, #[serde(skip_serializing_if = "Option::is_none")] resource_url: Option<String>, }, #[serde(rename = "accepted")] Accepted { message: String, tracking_id: String, }, } }
Middleware and Interceptors
Request ID Middleware
Track requests through your system:
#![allow(unused)] fn main() { use axum::middleware::{self, Next}; use axum::extract::Request; use uuid::Uuid; async fn request_id_middleware( mut request: Request, next: Next, ) -> impl IntoResponse { let request_id = Uuid::new_v4().to_string(); // Add to request extensions request.extensions_mut().insert(RequestId(request_id.clone())); // Add to response headers let mut response = next.run(request).await; response.headers_mut().insert( "X-Request-ID", request_id.parse().unwrap(), ); response } // Use in router let app = Router::new() .route("/api/v1/tasks", post(create_task)) .layer(middleware::from_fn(request_id_middleware)); }
Timing Middleware
Monitor performance:
#![allow(unused)] fn main() { use std::time::Instant; async fn timing_middleware( request: Request, next: Next, ) -> impl IntoResponse { let start = Instant::now(); let path = request.uri().path().to_owned(); let method = request.method().clone(); let response = next.run(request).await; let duration = start.elapsed(); tracing::info!( method = %method, path = %path, duration_ms = %duration.as_millis(), status = %response.status(), "Request completed" ); response } }
Configuration
Use environment variables for configuration:
#![allow(unused)] fn main() { use serde::Deserialize; #[derive(Debug, Deserialize)] struct Config { #[serde(default = "default_port")] port: u16, #[serde(default = "default_host")] host: String, database_url: String, #[serde(default = "default_max_connections")] max_connections: u32, } fn default_port() -> u16 { 3000 } fn default_host() -> String { "0.0.0.0".to_string() } fn default_max_connections() -> u32 { 20 } impl Config { fn from_env() -> Result<Self, config::ConfigError> { let mut cfg = config::Config::default(); // Load from file first, if it exists if std::path::Path::new("config.toml").exists() { cfg.merge(config::File::with_name("config"))?; } // Environment variables override file values cfg.merge(config::Environment::default())?; cfg.try_into() } } }
Health Checks
Expose system health:
#![allow(unused)] fn main() { #[derive(Serialize)] struct HealthResponse { status: HealthStatus, version: &'static str, checks: HashMap<String, CheckResult>, } #[derive(Serialize)] #[serde(rename_all = "lowercase")] enum HealthStatus { Healthy, Degraded, Unhealthy, } async fn health_check(State(state): State<AppState>) -> Json<HealthResponse> { let mut checks = HashMap::new(); // Check event store match state.executor.event_store().health_check().await { Ok(_) => checks.insert("event_store".to_string(), CheckResult::healthy()), Err(e) => checks.insert("event_store".to_string(), CheckResult::unhealthy(e)), }; // Check projections let projections = state.projections.read().await; for (name, health) in projections.health_status() { checks.insert(format!("projection_{}", name), health); } // Overall status let status = if checks.values().all(|c| c.is_healthy()) { HealthStatus::Healthy } else if checks.values().any(|c| c.is_unhealthy()) { HealthStatus::Unhealthy } else { HealthStatus::Degraded }; Json(HealthResponse { status, version: env!("CARGO_PKG_VERSION"), checks, }) } }
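The aggregation rule at the end of that handler (all checks healthy → healthy, any unhealthy → unhealthy, otherwise degraded) is worth isolating so it can be tested on its own. A std-only sketch with stand-in types, not the handler's actual `CheckResult`:

```rust
#[derive(Debug, PartialEq)]
enum HealthStatus {
    Healthy,
    Degraded,
    Unhealthy,
}

// Simplified stand-in for the per-check result in the handler above.
#[derive(Clone, Copy)]
enum CheckResult {
    Healthy,
    Degraded,
    Unhealthy,
}

// All checks healthy => Healthy; any unhealthy => Unhealthy; else Degraded.
fn overall(checks: &[CheckResult]) -> HealthStatus {
    if checks.iter().all(|c| matches!(c, CheckResult::Healthy)) {
        HealthStatus::Healthy
    } else if checks.iter().any(|c| matches!(c, CheckResult::Unhealthy)) {
        HealthStatus::Unhealthy
    } else {
        HealthStatus::Degraded
    }
}

fn main() {
    assert_eq!(
        overall(&[CheckResult::Healthy, CheckResult::Healthy]),
        HealthStatus::Healthy
    );
    assert_eq!(
        overall(&[CheckResult::Healthy, CheckResult::Degraded]),
        HealthStatus::Degraded
    );
    assert_eq!(
        overall(&[CheckResult::Degraded, CheckResult::Unhealthy]),
        HealthStatus::Unhealthy
    );
    println!("health aggregation ok");
}
```

Note that an empty check list reports `Healthy` (the `all` is vacuously true), which matches the handler's behavior before any checks are registered.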
Graceful Shutdown
Handle shutdown gracefully:
#![allow(unused)] fn main() { use tokio::signal; async fn shutdown_signal() { let ctrl_c = async { signal::ctrl_c() .await .expect("failed to install Ctrl+C handler"); }; #[cfg(unix)] let terminate = async { signal::unix::signal(signal::unix::SignalKind::terminate()) .expect("failed to install signal handler") .recv() .await; }; #[cfg(not(unix))] let terminate = std::future::pending::<()>(); tokio::select! { _ = ctrl_c => {}, _ = terminate => {}, } } // In main let app = /* build app */; axum::serve(listener, app) .with_graceful_shutdown(shutdown_signal()) .await .unwrap(); }
Testing HTTP Endpoints
Test your API endpoints:
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; use axum::http::StatusCode; use tower::ServiceExt; #[tokio::test] async fn test_create_task_success() { let app = create_test_app().await; let response = app .oneshot( Request::builder() .method("POST") .uri("/api/v1/tasks") .header("content-type", "application/json") .body(Body::from(r#"{ "title": "Test Task", "description": "Test Description" }"#)) .unwrap(), ) .await .unwrap(); assert_eq!(response.status(), StatusCode::CREATED); let body: CreateTaskResponse = serde_json::from_slice( &hyper::body::to_bytes(response.into_body()).await.unwrap() ).unwrap(); assert!(!body.task_id.is_empty()); } async fn create_test_app() -> Router { let event_store = InMemoryEventStore::new(); let state = AppState { executor: CommandExecutor::new(event_store), projections: Arc::new(RwLock::new(ProjectionManager::new())), }; Router::new() .route("/api/v1/tasks", post(create_task)) .with_state(state) } } }
Best Practices
- Keep handlers thin - Delegate business logic to commands
- Use proper status codes - 201 for creation, 202 for accepted, etc.
- Version your API - Use URL versioning (/api/v1/)
- Document with OpenAPI - Generate from code when possible
- Use correlation IDs - Track requests across services
- Log appropriately - Info for requests, error for failures
- Handle errors gracefully - Never expose internal details
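As a concrete sketch of the status-code guidance, here is a plain mapping from command outcomes to HTTP codes. The `CommandOutcome` enum is hypothetical, invented for illustration; EventCore's real error types carry more detail:

```rust
// Hypothetical outcome type for illustration only.
#[derive(Debug, PartialEq)]
enum CommandOutcome {
    Created,              // new resource written
    Accepted,             // queued for asynchronous processing
    ValidationFailed,     // malformed or invalid input
    BusinessRuleViolated, // input was valid but the domain said no
    ConcurrencyConflict,  // stream version changed under us
    NotFound,             // referenced resource does not exist
}

// Map each outcome to the status code suggested in the best practices.
fn status_code(outcome: &CommandOutcome) -> u16 {
    match outcome {
        CommandOutcome::Created => 201,
        CommandOutcome::Accepted => 202,
        CommandOutcome::ValidationFailed => 400,
        CommandOutcome::NotFound => 404,
        CommandOutcome::ConcurrencyConflict => 409,
        CommandOutcome::BusinessRuleViolated => 422,
    }
}

fn main() {
    assert_eq!(status_code(&CommandOutcome::Created), 201);
    assert_eq!(status_code(&CommandOutcome::ConcurrencyConflict), 409);
    println!("status mapping ok");
}
```

The 409/422 split is the judgment call worth making explicit: 409 for optimistic-concurrency conflicts the client can resolve by retrying with fresh state, 422 for requests that were well-formed but rejected by business rules.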
Summary
Setting up HTTP endpoints for EventCore:
- ✅ Framework agnostic - Works with any Rust web framework
- ✅ Thin HTTP layer - Focus on translation, not business logic
- ✅ Type-safe - Leverage Rust’s type system
- ✅ Error handling - Map domain errors to HTTP responses
- ✅ Testable - Easy to test endpoints in isolation
Key patterns:
- Parse and validate requests early
- Convert to domain commands
- Execute with EventCore
- Map results to HTTP responses
- Handle errors appropriately
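The first two steps, parse and validate early, then convert to a domain command, can be sketched with stand-in types. `TransferRequest`, `TransferMoney`, and `ApiError` here are illustrative simplifications, not the crate's actual definitions:

```rust
// Illustrative stand-ins for the real request/command/error types.
#[derive(Debug, PartialEq)]
enum ApiError {
    Validation(String),
}

struct TransferRequest {
    from_account: String,
    to_account: String,
    amount_cents: i64,
}

struct TransferMoney {
    from_account: String,
    to_account: String,
    amount_cents: u64,
}

// Validate at the boundary, then hand a strongly-typed command to the
// executor; invalid input never becomes a command at all.
fn to_command(req: TransferRequest) -> Result<TransferMoney, ApiError> {
    if req.amount_cents <= 0 {
        return Err(ApiError::Validation("amount must be positive".into()));
    }
    if req.from_account.is_empty() || req.to_account.is_empty() {
        return Err(ApiError::Validation("account ids required".into()));
    }
    Ok(TransferMoney {
        from_account: req.from_account,
        to_account: req.to_account,
        amount_cents: req.amount_cents as u64,
    })
}

fn main() {
    let ok = to_command(TransferRequest {
        from_account: "account-1".into(),
        to_account: "account-2".into(),
        amount_cents: 500,
    });
    assert!(ok.is_ok());

    let bad = to_command(TransferRequest {
        from_account: "account-1".into(),
        to_account: "account-2".into(),
        amount_cents: -1,
    });
    assert!(matches!(bad, Err(ApiError::Validation(_))));
    println!("pipeline ok");
}
```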
Next, let’s explore Command Handlers →
Chapter 4.2: Command Handlers
Command handlers are the bridge between HTTP requests and your EventCore commands. This chapter covers patterns for building robust, maintainable command handlers.
Command Handler Architecture
HTTP Request
↓
Parse & Validate
↓
Authenticate & Authorize
↓
Create Command
↓
Execute Command
↓
Format Response
Basic Command Handler Pattern
The Handler Function
#![allow(unused)] fn main() { use axum::{ extract::{State, Path, Json}, http::StatusCode, response::IntoResponse, }; use serde::{Deserialize, Serialize}; use eventcore::prelude::*; #[derive(Debug, Deserialize)] struct AssignTaskRequest { assignee_id: String, } #[derive(Debug, Serialize)] struct AssignTaskResponse { message: String, task_id: String, assignee_id: String, assigned_at: DateTime<Utc>, } async fn assign_task( State(state): State<AppState>, Path(task_id): Path<String>, user: AuthenticatedUser, // From middleware Json(request): Json<AssignTaskRequest>, ) -> Result<Json<AssignTaskResponse>, ApiError> { // 1. Parse and validate input let task_stream = StreamId::try_new(format!("task-{}", task_id)) .map_err(|_| ApiError::validation("Invalid task ID"))?; let assignee_stream = StreamId::try_new(format!("user-{}", request.assignee_id)) .map_err(|_| ApiError::validation("Invalid assignee ID"))?; // 2. Create command let command = AssignTask { task_id: task_stream, assignee_id: assignee_stream, assigned_by: user.id.clone(), }; // 3. Execute with context state.executor .execute_with_context( &command, ExecutionContext::new() .with_user_id(user.id) .with_correlation_id(extract_correlation_id(&request)) ) .await .map_err(ApiError::from_command_error)?; // 4. Format response Ok(Json(AssignTaskResponse { message: "Task assigned successfully".to_string(), task_id, assignee_id: request.assignee_id, assigned_at: Utc::now(), })) } }
Authentication and Authorization
Authentication Middleware
#![allow(unused)] fn main() { use axum::{ extract::{Request, FromRequestParts}, http::{header, StatusCode}, response::Response, middleware::Next, }; use jsonwebtoken::{decode, DecodingKey, Validation}; #[derive(Debug, Clone, Serialize, Deserialize)] struct Claims { sub: String, // User ID exp: usize, // Expiration time roles: Vec<String>, } #[derive(Debug, Clone)] struct AuthenticatedUser { id: UserId, roles: Vec<String>, } #[async_trait] impl<S> FromRequestParts<S> for AuthenticatedUser where S: Send + Sync, { type Rejection = ApiError; async fn from_request_parts( parts: &mut http::request::Parts, _state: &S, ) -> Result<Self, Self::Rejection> { // Extract token from Authorization header let token = parts .headers .get(header::AUTHORIZATION) .and_then(|auth| auth.to_str().ok()) .and_then(|auth| auth.strip_prefix("Bearer ")) .ok_or_else(|| ApiError::unauthorized("Missing authentication token"))?; // Decode and validate token let token_data = decode::<Claims>( token, &DecodingKey::from_secret(JWT_SECRET.as_ref()), &Validation::default(), ) .map_err(|_| ApiError::unauthorized("Invalid authentication token"))?; Ok(AuthenticatedUser { id: UserId::try_new(token_data.claims.sub)?, roles: token_data.claims.roles, }) } } }
Authorization in Handlers
#![allow(unused)] fn main() { impl AuthenticatedUser { fn has_role(&self, role: &str) -> bool { self.roles.contains(&role.to_string()) } fn can_manage_tasks(&self) -> bool { self.has_role("admin") || self.has_role("manager") } fn can_assign_tasks(&self) -> bool { self.has_role("admin") || self.has_role("manager") || self.has_role("lead") } } async fn delete_task( State(state): State<AppState>, Path(task_id): Path<String>, user: AuthenticatedUser, ) -> Result<StatusCode, ApiError> { // Check authorization if !user.can_manage_tasks() { return Err(ApiError::forbidden("Insufficient permissions to delete tasks")); } let command = DeleteTask { task_id: StreamId::try_new(format!("task-{}", task_id))?, deleted_by: user.id, }; state.executor.execute(&command).await?; Ok(StatusCode::NO_CONTENT) } }
Input Validation
Request Validation
#![allow(unused)] fn main() { use validator::{Validate, ValidationError}; #[derive(Debug, Deserialize, Validate)] struct CreateProjectRequest { #[validate(length(min = 3, max = 100))] name: String, #[validate(length(max = 1000))] description: Option<String>, #[validate(email)] owner_email: String, #[validate(range(min = 1, max = 365))] duration_days: u32, #[validate(custom = "validate_start_date")] start_date: Option<DateTime<Utc>>, } fn validate_start_date(date: &DateTime<Utc>) -> Result<(), ValidationError> { if *date < Utc::now() { return Err(ValidationError::new("Start date cannot be in the past")); } Ok(()) } async fn create_project( State(state): State<AppState>, user: AuthenticatedUser, Json(request): Json<CreateProjectRequest>, ) -> Result<Json<CreateProjectResponse>, ApiError> { // Validate request request.validate() .map_err(|e| ApiError::validation_errors(e))?; // Create command with validated data let command = CreateProject { project_id: StreamId::from(format!("project-{}", ProjectId::new())), name: ProjectName::try_new(request.name)?, description: request.description .map(|d| ProjectDescription::try_new(d)) .transpose()?, owner: UserId::try_new(request.owner_email)?, duration: Duration::days(request.duration_days as i64), start_date: request.start_date.unwrap_or_else(Utc::now), created_by: user.id, }; // Execute and return response // ... } }
Custom Validation Rules
#![allow(unused)] fn main() { mod validators { use super::*; pub fn validate_business_hours(time: &NaiveTime) -> Result<(), ValidationError> { const BUSINESS_START: NaiveTime = NaiveTime::from_hms_opt(9, 0, 0).unwrap(); const BUSINESS_END: NaiveTime = NaiveTime::from_hms_opt(17, 0, 0).unwrap(); if *time < BUSINESS_START || *time > BUSINESS_END { return Err(ValidationError::new("Outside business hours")); } Ok(()) } pub fn validate_future_date(date: &NaiveDate) -> Result<(), ValidationError> { if *date <= Local::now().naive_local().date() { return Err(ValidationError::new("Date must be in the future")); } Ok(()) } pub fn validate_currency_code(code: &str) -> Result<(), ValidationError> { const VALID_CURRENCIES: &[&str] = &["USD", "EUR", "GBP", "JPY"]; if !VALID_CURRENCIES.contains(&code) { return Err(ValidationError::new("Invalid currency code")); } Ok(()) } } }
Idempotency
Ensure commands can be safely retried:
Idempotency Keys
#![allow(unused)] fn main() { use axum::extract::FromRequest; #[derive(Debug, Clone)] struct IdempotencyKey(String); #[async_trait] impl<S> FromRequestParts<S> for IdempotencyKey where S: Send + Sync, { type Rejection = ApiError; async fn from_request_parts( parts: &mut http::request::Parts, _state: &S, ) -> Result<Self, Self::Rejection> { parts .headers .get("Idempotency-Key") .and_then(|v| v.to_str().ok()) .map(|s| IdempotencyKey(s.to_string())) .ok_or_else(|| ApiError::bad_request("Idempotency-Key header required")) } } // Store for idempotency #[derive(Clone)] struct IdempotencyStore { cache: Arc<RwLock<HashMap<String, CachedResponse>>>, } #[derive(Clone)] struct CachedResponse { status: StatusCode, body: Vec<u8>, created_at: DateTime<Utc>, } async fn idempotent_handler<F, Fut>( key: IdempotencyKey, store: State<IdempotencyStore>, handler: F, ) -> Response where F: FnOnce() -> Fut, Fut: Future<Output = Response>, { // Check cache let cache = store.cache.read().await; if let Some(cached) = cache.get(&key.0) { // Return cached response return Response::builder() .status(cached.status) .body(Body::from(cached.body.clone())) .unwrap(); } drop(cache); // Execute handler let response = handler().await; // Cache successful responses if response.status().is_success() { let (parts, body) = response.into_parts(); let body_bytes = hyper::body::to_bytes(body).await.unwrap().to_vec(); let mut cache = store.cache.write().await; cache.insert(key.0, CachedResponse { status: parts.status, body: body_bytes.clone(), created_at: Utc::now(), }); Response::from_parts(parts, Body::from(body_bytes)) } else { response } } }
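The cache above stores `created_at` but never checks it, so a cached response would be served forever. The freshness logic can be sketched std-only (a simplified, single-threaded stand-in for the axum-integrated store, using `Instant` instead of `DateTime<Utc>`):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct CachedResponse {
    status: u16,
    body: Vec<u8>,
    created_at: Instant,
}

struct IdempotencyStore {
    ttl: Duration,
    cache: HashMap<String, CachedResponse>,
}

impl IdempotencyStore {
    fn new(ttl: Duration) -> Self {
        Self { ttl, cache: HashMap::new() }
    }

    // Return the cached response only while it is still fresh; evict
    // expired entries so a retry after the TTL re-executes the command.
    fn get(&mut self, key: &str) -> Option<(u16, Vec<u8>)> {
        let fresh = self.cache.get(key).map(|c| c.created_at.elapsed() < self.ttl);
        match fresh {
            Some(true) => self.cache.get(key).map(|c| (c.status, c.body.clone())),
            Some(false) => {
                self.cache.remove(key);
                None
            }
            None => None,
        }
    }

    fn put(&mut self, key: String, status: u16, body: Vec<u8>) {
        self.cache.insert(key, CachedResponse { status, body, created_at: Instant::now() });
    }
}

fn main() {
    let mut store = IdempotencyStore::new(Duration::from_secs(60));
    store.put("key-1".into(), 201, b"created".to_vec());
    assert_eq!(store.get("key-1"), Some((201, b"created".to_vec())));
    assert_eq!(store.get("missing"), None);
    println!("idempotency cache ok");
}
```

In production the TTL should comfortably exceed the client's retry window, and the store should be shared (e.g. Redis) when the API runs on more than one node.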
Command-Level Idempotency
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct TransferMoney { #[stream] from_account: StreamId, #[stream] to_account: StreamId, amount: Money, // Idempotency key embedded in command transfer_id: TransferId, } impl CommandLogic for TransferMoney { // ... other implementations async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Check if transfer already processed if state.processed_transfers.contains(&self.transfer_id) { // Already processed - return success with no new events return Ok(vec![]); } // Process transfer... Ok(vec![ StreamWrite::new( &read_streams, self.from_account.clone(), BankEvent::TransferProcessed { transfer_id: self.transfer_id, amount: self.amount, } )?, // ... other events ]) } } }
Error Response Formatting
Provide consistent, helpful error responses:
#![allow(unused)] fn main() { #[derive(Debug, Serialize)] struct ErrorResponse { error: ErrorDetails, #[serde(skip_serializing_if = "Option::is_none")] request_id: Option<String>, } #[derive(Debug, Serialize)] struct ErrorDetails { code: String, message: String, #[serde(skip_serializing_if = "Option::is_none")] field_errors: Option<HashMap<String, Vec<String>>>, #[serde(skip_serializing_if = "Option::is_none")] help: Option<String>, } impl ApiError { fn to_response(&self, request_id: Option<String>) -> Response { let (status, error_details) = match self { ApiError::Validation { errors } => ( StatusCode::BAD_REQUEST, ErrorDetails { code: "VALIDATION_ERROR".to_string(), message: "Invalid request data".to_string(), field_errors: Some(errors.clone()), help: Some("Check the field_errors for specific validation issues".to_string()), } ), ApiError::BusinessRule { message } => ( StatusCode::UNPROCESSABLE_ENTITY, ErrorDetails { code: "BUSINESS_RULE_VIOLATION".to_string(), message: message.clone(), field_errors: None, help: None, } ), ApiError::NotFound { resource } => ( StatusCode::NOT_FOUND, ErrorDetails { code: "RESOURCE_NOT_FOUND".to_string(), message: format!("{} not found", resource), field_errors: None, help: None, } ), ApiError::Conflict { message } => ( StatusCode::CONFLICT, ErrorDetails { code: "CONFLICT".to_string(), message: message.clone(), field_errors: None, help: Some("The resource was modified. Please refresh and try again.".to_string()), } ), // ... other error types }; let response = ErrorResponse { error: error_details, request_id, }; (status, Json(response)).into_response() } } }
Batch Command Handlers
Handle multiple commands efficiently:
#![allow(unused)] fn main() { #[derive(Debug, Deserialize)] struct BatchRequest<T> { operations: Vec<T>, #[serde(default)] stop_on_error: bool, } #[derive(Debug, Serialize)] struct BatchResponse<T> { results: Vec<BatchResult<T>>, successful: usize, failed: usize, } #[derive(Debug, Serialize)] #[serde(tag = "status")] enum BatchResult<T> { #[serde(rename = "success")] Success { result: T }, #[serde(rename = "error")] Error { error: ErrorDetails }, } async fn batch_create_tasks( State(state): State<AppState>, user: AuthenticatedUser, Json(batch): Json<BatchRequest<CreateTaskRequest>>, ) -> Result<Json<BatchResponse<CreateTaskResponse>>, ApiError> { let mut results = Vec::new(); let mut successful = 0; let mut failed = 0; for request in batch.operations { match create_single_task(&state, &user, request).await { Ok(response) => { successful += 1; results.push(BatchResult::Success { result: response }); } Err(error) => { failed += 1; results.push(BatchResult::Error { error: error.to_error_details() }); if batch.stop_on_error { break; } } } } Ok(Json(BatchResponse { results, successful, failed, })) } }
Async Command Processing
For long-running commands:
#![allow(unused)] fn main() { #[derive(Debug, Serialize)] struct AsyncCommandResponse { tracking_id: String, status_url: String, message: String, } async fn import_large_dataset( State(state): State<AppState>, user: AuthenticatedUser, Json(request): Json<ImportDatasetRequest>, ) -> Result<Json<AsyncCommandResponse>, ApiError> { // Validate request request.validate()?; // Create tracking ID let tracking_id = TrackingId::new(); // Queue command for async processing let command = ImportDataset { dataset_id: StreamId::from(format!("dataset-{}", DatasetId::new())), source_url: request.source_url, import_options: request.options, initiated_by: user.id, tracking_id: tracking_id.clone(), }; // Submit to background queue state.command_queue .submit(command) .await .map_err(|_| ApiError::service_unavailable("Import service temporarily unavailable"))?; // Return tracking information Ok(Json(AsyncCommandResponse { tracking_id: tracking_id.to_string(), status_url: format!("/api/v1/imports/{}/status", tracking_id), message: "Import queued for processing".to_string(), })) } // Status endpoint async fn get_import_status( State(state): State<AppState>, Path(tracking_id): Path<String>, ) -> Result<Json<ImportStatus>, ApiError> { let status = state.import_tracker .get_status(&TrackingId::try_new(tracking_id)?) .await? .ok_or_else(|| ApiError::not_found("Import"))?; Ok(Json(status)) } }
Command Handler Testing
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; use eventcore::testing::prelude::*; #[tokio::test] async fn test_assign_task_authorization() { let state = create_test_state().await; // User without permission let user = AuthenticatedUser { id: UserId::try_new("user@example.com").unwrap(), roles: vec!["member".to_string()], }; let request = AssignTaskRequest { assignee_id: "assignee@example.com".to_string(), }; let result = assign_task( State(state), Path("task-123".to_string()), user, Json(request), ).await; assert!(matches!( result, Err(ApiError::Forbidden { .. }) )); } #[tokio::test] async fn test_idempotent_transfer() { let state = create_test_state().await; let transfer_id = TransferId::new(); let request = TransferMoneyRequest { from_account: "account-1".to_string(), to_account: "account-2".to_string(), amount: 100.0, transfer_id: transfer_id.to_string(), }; // First call let response1 = transfer_money( State(state.clone()), Json(request.clone()), ).await.unwrap(); // Second call with same transfer_id let response2 = transfer_money( State(state), Json(request), ).await.unwrap(); // Should return same response assert_eq!(response1.0.transfer_id, response2.0.transfer_id); assert_eq!(response1.0.status, response2.0.status); } } }
Monitoring and Metrics
Track command handler performance:
#![allow(unused)] fn main() { use prometheus::{IntCounter, Histogram, register_int_counter, register_histogram}; lazy_static! { static ref COMMAND_COUNTER: IntCounter = register_int_counter!( "api_commands_total", "Total number of commands processed" ).unwrap(); static ref COMMAND_DURATION: Histogram = register_histogram!( "api_command_duration_seconds", "Command processing duration" ).unwrap(); static ref COMMAND_ERRORS: IntCounter = register_int_counter!( "api_command_errors_total", "Total number of command errors" ).unwrap(); } async fn measured_handler<F, Fut, T>( command_type: &str, handler: F, ) -> Result<T, ApiError> where F: FnOnce() -> Fut, Fut: Future<Output = Result<T, ApiError>>, { COMMAND_COUNTER.inc(); let timer = COMMAND_DURATION.start_timer(); let result = handler().await; // Record the duration exactly once; stop_and_record consumes the timer let duration_secs = timer.stop_and_record(); if result.is_err() { COMMAND_ERRORS.inc(); } // Log with structured data match &result { Ok(_) => { tracing::info!( command_type = %command_type, duration_ms = %(duration_secs * 1000.0), "Command completed successfully" ); } Err(e) => { tracing::error!( command_type = %command_type, error = %e, "Command failed" ); } } result } }
Best Practices
- Validate early - Check inputs before creating commands
- Use strong types - Convert strings to domain types ASAP
- Handle all errors - Map domain errors to appropriate HTTP responses
- Be idempotent - Design for safe retries
- Authenticate first - Verify identity before any processing
- Authorize actions - Check permissions for each operation
- Log appropriately - Include context for debugging
- Monitor everything - Track success rates and latencies
Summary
Command handlers in EventCore APIs:
- ✅ Type-safe - Leverage Rust’s type system
- ✅ Validated - Check inputs thoroughly
- ✅ Authenticated - Know who’s making requests
- ✅ Authorized - Enforce permissions
- ✅ Idempotent - Safe to retry
- ✅ Monitored - Track performance and errors
Key patterns:
- Parse and validate input
- Check authentication and authorization
- Create strongly-typed commands
- Execute with proper context
- Handle errors gracefully
- Return appropriate responses
Next, let’s explore Query Endpoints →
Chapter 4.3: Query Endpoints
Query endpoints serve read requests from your projections. Unlike commands, which modify state, queries are side-effect free and can be cached, making them well suited to high-performance read paths.
Query Architecture
HTTP Request → Authenticate → Authorize → Query Projection → Format Response
                                                  ↑
                                          Read Model Store
Basic Query Pattern
Simple Query Endpoint
#![allow(unused)] fn main() { use axum::{ extract::{State, Path, Query as QueryParams}, Json, }; use serde::{Deserialize, Serialize}; #[derive(Debug, Deserialize)] struct ListTasksQuery { #[serde(default)] status: Option<TaskStatus>, #[serde(default)] assigned_to: Option<String>, #[serde(default = "default_page")] page: u32, #[serde(default = "default_page_size")] page_size: u32, } fn default_page() -> u32 { 1 } fn default_page_size() -> u32 { 20 } #[derive(Debug, Serialize)] struct ListTasksResponse { tasks: Vec<TaskSummary>, pagination: PaginationInfo, } #[derive(Debug, Serialize)] struct TaskSummary { id: String, title: String, status: TaskStatus, assigned_to: Option<String>, created_at: DateTime<Utc>, updated_at: DateTime<Utc>, } #[derive(Debug, Serialize)] struct PaginationInfo { page: u32, page_size: u32, total_items: u64, total_pages: u32, } async fn list_tasks( State(state): State<AppState>, QueryParams(query): QueryParams<ListTasksQuery>, ) -> Result<Json<ListTasksResponse>, ApiError> { // Get projection let projection = state.projections .read() .await .get::<TaskListProjection>() .ok_or_else(|| ApiError::internal("Task projection not available"))?; // Apply filters let mut tasks = projection.get_all_tasks(); if let Some(status) = query.status { tasks.retain(|t| t.status == status); } if let Some(assigned_to) = query.assigned_to { tasks.retain(|t| t.assigned_to.as_ref() == Some(&assigned_to)); } // Calculate pagination (clamp page and page_size so hostile input cannot underflow or divide by zero) let page = query.page.max(1); let page_size = query.page_size.max(1); let total_items = tasks.len() as u64; let total_pages = ((total_items + page_size as u64 - 1) / page_size as u64) as u32; // Apply pagination, clamping the window so out-of-range pages return an empty list instead of panicking let start = ((page as usize - 1) * page_size as usize).min(tasks.len()); let end = (start + page_size as usize).min(tasks.len()); let page_tasks = tasks[start..end].to_vec(); Ok(Json(ListTasksResponse { tasks: page_tasks.into_iter().map(Into::into).collect(), pagination: PaginationInfo { page, page_size, total_items, total_pages, }, })) } }
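The pagination arithmetic is worth isolating; a std-only sketch of the same math, with clamping so out-of-range or zero-valued inputs yield an empty page rather than a panic:

```rust
// Compute the slice bounds for a 1-indexed page, clamping so that
// out-of-range requests produce an empty slice instead of panicking.
fn page_bounds(total: usize, page: u32, page_size: u32) -> (usize, usize) {
    let page = page.max(1) as usize;       // treat page 0 as page 1
    let size = page_size.max(1) as usize;  // avoid a zero-sized window
    let start = ((page - 1) * size).min(total);
    let end = (start + size).min(total);
    (start, end)
}

// Total number of pages, rounding up.
fn total_pages(total_items: u64, page_size: u32) -> u32 {
    let size = page_size.max(1) as u64;
    ((total_items + size - 1) / size) as u32
}

fn main() {
    let items: Vec<u32> = (0..45).collect();

    // Page 3 of 45 items at 20 per page holds the last 5 items.
    let (s, e) = page_bounds(items.len(), 3, 20);
    assert_eq!(items[s..e].to_vec(), vec![40, 41, 42, 43, 44]);

    // A page past the end is empty, not a panic.
    let (s, e) = page_bounds(items.len(), 99, 20);
    assert_eq!(s, e);

    assert_eq!(total_pages(45, 20), 3);
    println!("pagination ok");
}
```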
Advanced Query Patterns
Filtering and Sorting
#![allow(unused)] fn main() { #[derive(Debug, Deserialize)] #[serde(rename_all = "snake_case")] enum SortField { CreatedAt, UpdatedAt, Title, Priority, DueDate, } #[derive(Debug, Deserialize)] #[serde(rename_all = "snake_case")] enum SortOrder { Asc, Desc, } #[derive(Debug, Deserialize)] struct AdvancedTaskQuery { // Filters #[serde(default)] status: Option<Vec<TaskStatus>>, #[serde(default)] assigned_to: Option<Vec<String>>, #[serde(default)] created_after: Option<DateTime<Utc>>, #[serde(default)] created_before: Option<DateTime<Utc>>, #[serde(default)] search: Option<String>, // Sorting #[serde(default = "default_sort_field")] sort_by: SortField, #[serde(default = "default_sort_order")] sort_order: SortOrder, // Pagination #[serde(default)] cursor: Option<String>, #[serde(default = "default_limit")] limit: u32, } fn default_sort_field() -> SortField { SortField::CreatedAt } fn default_sort_order() -> SortOrder { SortOrder::Desc } fn default_limit() -> u32 { 50 } async fn search_tasks( State(state): State<AppState>, QueryParams(query): QueryParams<AdvancedTaskQuery>, ) -> Result<Json<CursorPaginatedResponse<TaskSummary>>, ApiError> { let projection = state.projections .read() .await .get::<TaskSearchProjection>() .ok_or_else(|| ApiError::internal("Search projection not available"))?; // Build query let mut search_query = SearchQuery::new(); if let Some(statuses) = query.status { search_query = search_query.with_status_in(statuses); } if let Some(assignees) = query.assigned_to { search_query = search_query.with_assignee_in(assignees); } if let Some(after) = query.created_after { search_query = search_query.created_after(after); } if let Some(before) = query.created_before { search_query = search_query.created_before(before); } if let Some(search_text) = query.search { search_query = search_query.with_text_search(search_text); } // Apply sorting search_query = match query.sort_by { SortField::CreatedAt => search_query.sort_by_created_at(query.sort_order), SortField::UpdatedAt => search_query.sort_by_updated_at(query.sort_order), SortField::Title => search_query.sort_by_title(query.sort_order), SortField::Priority => search_query.sort_by_priority(query.sort_order), SortField::DueDate => search_query.sort_by_due_date(query.sort_order), }; // Apply cursor pagination if let Some(cursor) = query.cursor { search_query = search_query.after_cursor(Cursor::decode(&cursor)?); } search_query = search_query.limit(query.limit); // Execute query let results = projection.search(search_query).await?; Ok(Json(results)) } }
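`Cursor::decode` is left abstract above. One simple way to implement an opaque cursor (a sketch with a hypothetical `Cursor` type, not EventCore's actual one) is to serialize the sort key plus a tie-breaking row id into a string and parse it back:

```rust
// A cursor pins the position of the last row the client saw: the sort
// key (here a millisecond timestamp) plus a tie-breaking row id.
#[derive(Debug, PartialEq)]
struct Cursor {
    sort_key_millis: i64,
    row_id: String,
}

impl Cursor {
    // Encode as "millis:row_id". Real systems usually base64 the result
    // so clients treat it as an opaque token.
    fn encode(&self) -> String {
        format!("{}:{}", self.sort_key_millis, self.row_id)
    }

    // Reject anything that does not round-trip cleanly.
    fn decode(s: &str) -> Option<Cursor> {
        let (millis, id) = s.split_once(':')?;
        Some(Cursor {
            sort_key_millis: millis.parse().ok()?,
            row_id: id.to_string(),
        })
    }
}

fn main() {
    let c = Cursor { sort_key_millis: 1_700_000_000_000, row_id: "task-42".into() };
    let roundtrip = Cursor::decode(&c.encode()).unwrap();
    assert_eq!(roundtrip, c);
    assert!(Cursor::decode("not-a-cursor").is_none());
    println!("cursor roundtrip ok");
}
```

Cursor pagination stays stable under concurrent inserts (unlike page/offset pagination, where rows can shift between requests), which is why the advanced query uses it.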
Aggregation Queries
#![allow(unused)] fn main() { #[derive(Debug, Serialize)] struct TaskStatistics { total_tasks: u64, tasks_by_status: HashMap<TaskStatus, u64>, tasks_by_assignee: Vec<AssigneeStats>, completion_rate: f64, average_completion_time: Option<Duration>, overdue_tasks: u64, } #[derive(Debug, Serialize)] struct AssigneeStats { assignee_id: String, assignee_name: String, total_tasks: u64, completed_tasks: u64, in_progress_tasks: u64, } async fn get_task_statistics( State(state): State<AppState>, QueryParams(query): QueryParams<DateRangeQuery>, ) -> Result<Json<TaskStatistics>, ApiError> { let projection = state.projections .read() .await .get::<TaskAnalyticsProjection>() .ok_or_else(|| ApiError::internal("Analytics projection not available"))?; let stats = projection.calculate_statistics( query.start_date, query.end_date, ).await?; Ok(Json(stats)) } // Time-series data #[derive(Debug, Serialize)] struct TimeSeriesData { period: String, data_points: Vec<DataPoint>, } #[derive(Debug, Serialize)] struct DataPoint { timestamp: DateTime<Utc>, value: f64, metadata: Option<serde_json::Value>, } async fn get_task_completion_trend( State(state): State<AppState>, QueryParams(query): QueryParams<TimeSeriesQuery>, ) -> Result<Json<TimeSeriesData>, ApiError> { let projection = state.projections .read() .await .get::<TaskMetricsProjection>() .ok_or_else(|| ApiError::internal("Metrics projection not available"))?; let data = projection.get_completion_trend( query.start_date, query.end_date, query.granularity, ).await?; Ok(Json(data)) } }
GraphQL Integration
For complex queries, GraphQL can be more efficient:
#![allow(unused)] fn main() { use async_graphql::{ Context, Object, Schema, EmptyMutation, EmptySubscription, ID, Result as GraphQLResult, }; struct QueryRoot; #[Object] impl QueryRoot { async fn task(&self, ctx: &Context<'_>, id: ID) -> GraphQLResult<Option<Task>> { let projection = ctx.data::<Arc<TaskProjection>>()?; Ok(projection.get_task(&id.to_string()).await?) } async fn tasks( &self, ctx: &Context<'_>, filter: Option<TaskFilter>, sort: Option<TaskSort>, pagination: Option<PaginationInput>, ) -> GraphQLResult<TaskConnection> { let projection = ctx.data::<Arc<TaskProjection>>()?; let query = build_query(filter, sort, pagination); let results = projection.query(query).await?; Ok(TaskConnection::from(results)) } async fn user(&self, ctx: &Context<'_>, id: ID) -> GraphQLResult<Option<User>> { let projection = ctx.data::<Arc<UserProjection>>()?; Ok(projection.get_user(&id.to_string()).await?) } } // GraphQL types #[derive(async_graphql::SimpleObject)] struct Task { id: ID, title: String, description: String, status: TaskStatus, assigned_to: Option<User>, created_at: DateTime<Utc>, updated_at: DateTime<Utc>, } #[derive(async_graphql::InputObject)] struct TaskFilter { status: Option<Vec<TaskStatus>>, assigned_to: Option<Vec<ID>>, created_after: Option<DateTime<Utc>>, search: Option<String>, } // Axum handler async fn graphql_handler( State(state): State<AppState>, user: Option<AuthenticatedUser>, req: GraphQLRequest, ) -> GraphQLResponse { let schema = Schema::build(QueryRoot, EmptyMutation, EmptySubscription) .data(state.projections.clone()) .data(user) .finish(); schema.execute(req.into_inner()).await.into() } }
Caching Strategies
Response Caching
#![allow(unused)] fn main() { use axum::http::header::{CACHE_CONTROL, CONTENT_TYPE, ETAG, IF_NONE_MATCH}; use sha2::{Sha256, Digest}; #[derive(Clone)] struct CacheConfig { public_max_age: Duration, private_max_age: Duration, stale_while_revalidate: Duration, } async fn cached_query_handler<F, Fut, T>( headers: HeaderMap, cache_config: CacheConfig, query_fn: F, ) -> Response where F: FnOnce() -> Fut, Fut: Future<Output = Result<T, ApiError>>, T: Serialize, { // Execute query let result = match query_fn().await { Ok(data) => data, Err(e) => return e.into_response(), }; // Serialize response let body = match serde_json::to_vec(&result) { Ok(bytes) => bytes, Err(_) => return ApiError::internal("Serialization failed").into_response(), }; // Calculate ETag let mut hasher = Sha256::new(); hasher.update(&body); let etag = format!("\"{}\"", hex::encode(hasher.finalize())); // Check If-None-Match if let Some(if_none_match) = headers.get(IF_NONE_MATCH) { if if_none_match.to_str().ok() == Some(etag.as_str()) { return StatusCode::NOT_MODIFIED.into_response(); } } // Build response with caching headers Response::builder() .status(StatusCode::OK) .header(CONTENT_TYPE, "application/json") .header(ETAG, &etag) .header( CACHE_CONTROL, format!( "public, max-age={}, stale-while-revalidate={}", cache_config.public_max_age.as_secs(), cache_config.stale_while_revalidate.as_secs() ) ) .body(Body::from(body)) .unwrap() } // Usage async fn get_public_statistics( State(state): State<AppState>, headers: HeaderMap, ) -> Response { cached_query_handler( headers, CacheConfig { public_max_age: Duration::from_secs(300), // 5 minutes private_max_age: Duration::from_secs(0), stale_while_revalidate: Duration::from_secs(60), }, || async { let projection = state.projections .read() .await .get::<PublicStatsProjection>() .ok_or_else(|| ApiError::internal("Stats not available"))?; projection.get_current_stats().await }, ).await } }
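The ETag handshake reduces to: hash the body, compare against `If-None-Match`, and return 304 on a match so the body is never re-sent. A std-only sketch of that core, using `DefaultHasher` purely for illustration (its output is not stable across builds, so a real handler should keep SHA-256 as above):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive a weak ETag-like token from the response body.
fn etag_for(body: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    body.hash(&mut h);
    format!("\"{:x}\"", h.finish())
}

// Return 304 when the client's cached copy is current, 200 otherwise.
fn respond(body: &[u8], if_none_match: Option<&str>) -> u16 {
    let etag = etag_for(body);
    if if_none_match == Some(etag.as_str()) { 304 } else { 200 }
}

fn main() {
    let body = br#"{"total_tasks":42}"#;
    let etag = etag_for(body);
    assert_eq!(respond(body, Some(etag.as_str())), 304);
    assert_eq!(respond(body, None), 200);
    assert_eq!(respond(body, Some("\"stale\"")), 200);
    println!("etag ok");
}
```

A 304 still pays the cost of running the query and serializing the body; it saves bandwidth, not server work, which is why the next section adds server-side result caching.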
Query Result Caching
#![allow(unused)] fn main() { use moka::future::Cache; use serde::de::DeserializeOwned; #[derive(Clone)] struct QueryCache { cache: Cache<String, CachedResult>, } #[derive(Clone)] struct CachedResult { data: Vec<u8>, cached_at: DateTime<Utc>, ttl: Duration, } impl QueryCache { fn new() -> Self { Self { cache: Cache::builder() .max_capacity(10_000) .time_to_live(Duration::from_secs(300)) .build(), } } async fn get_or_compute<F, Fut, T>( &self, key: &str, ttl: Duration, compute_fn: F, ) -> Result<T, ApiError> where F: FnOnce() -> Fut, Fut: Future<Output = Result<T, ApiError>>, T: Serialize + DeserializeOwned, { // Check cache (convert the chrono duration to std for comparison) if let Some(cached) = self.cache.get(key).await { let age = (Utc::now() - cached.cached_at).to_std().unwrap_or_default(); if age < cached.ttl { return serde_json::from_slice(&cached.data) .map_err(|_| ApiError::internal("Cache deserialization failed")); } } // Compute result let result = compute_fn().await?; // Cache result let data = serde_json::to_vec(&result) .map_err(|_| ApiError::internal("Cache serialization failed"))?; self.cache.insert( key.to_string(), CachedResult { data, cached_at: Utc::now(), ttl, } ).await; Ok(result) } } }
Real-time Queries with SSE
Server-Sent Events for live updates:
#![allow(unused)] fn main() { use axum::response::sse::{Event, Sse}; use futures::stream::Stream; use tokio_stream::StreamExt; async fn task_updates_stream( State(state): State<AppState>, user: AuthenticatedUser, ) -> Sse<impl Stream<Item = Result<Event, ApiError>>> { let stream = async_stream::stream! { let mut subscription = state.projections .read() .await .get::<TaskProjection>() .unwrap() .subscribe_to_updates(user.id) .await; while let Some(update) = subscription.next().await { let event = match update { TaskUpdate::Created(task) => { Event::default() .event("task-created") .json_data(task) .unwrap() } TaskUpdate::Updated(task) => { Event::default() .event("task-updated") .json_data(task) .unwrap() } TaskUpdate::Deleted(task_id) => { Event::default() .event("task-deleted") .data(task_id) } }; yield Ok(event); } }; Sse::new(stream).keep_alive( axum::response::sse::KeepAlive::new() .interval(Duration::from_secs(30)) .text("keep-alive") ) } }
Query Performance Optimization
N+1 Query Prevention
#![allow(unused)] fn main() { // Bad: N+1 queries async fn get_tasks_with_assignees_bad( projection: &TaskProjection, ) -> Result<Vec<TaskWithAssignee>, ApiError> { let tasks = projection.get_all_tasks().await?; let mut results = Vec::new(); for task in tasks { // This makes a separate query for each task! let assignee = if let Some(assignee_id) = &task.assigned_to { projection.get_user(assignee_id).await? } else { None }; results.push(TaskWithAssignee { task, assignee, }); } Ok(results) } // Good: Batch loading async fn get_tasks_with_assignees_good( projection: &TaskProjection, ) -> Result<Vec<TaskWithAssignee>, ApiError> { let tasks = projection.get_all_tasks().await?; // Collect all assignee IDs let assignee_ids: HashSet<_> = tasks .iter() .filter_map(|t| t.assigned_to.as_ref()) .cloned() .collect(); // Load all assignees in one query let assignees = projection .get_users_by_ids(assignee_ids.into_iter().collect()) .await?; // Build results let assignee_map: HashMap<_, _> = assignees .into_iter() .map(|u| (u.id.clone(), u)) .collect(); Ok(tasks.into_iter().map(|task| { let assignee = task.assigned_to .as_ref() .and_then(|id| assignee_map.get(id)) .cloned(); TaskWithAssignee { task, assignee } }).collect()) } }
Query Complexity Limits
#![allow(unused)] fn main() { struct QueryComplexity; impl QueryComplexity { fn calculate_complexity(query: &GraphQLQuery) -> u32 { // Simple heuristic: count fields and multiply by depth let field_count = count_fields(query); let max_depth = calculate_max_depth(query); field_count * max_depth } } // In GraphQL schema: async-graphql provides limits on the schema builder let schema = Schema::build(QueryRoot, EmptyMutation, EmptySubscription) .limit_complexity(1000) // Max complexity .limit_depth(10) // Max nesting depth .finish(); // For REST endpoints #[derive(Debug)] struct QueryComplexityGuard { max_items: u32, max_depth: u32, } impl QueryComplexityGuard { fn validate(&self, query: &AdvancedTaskQuery) -> Result<(), ApiError> { // Check pagination limits if query.limit > self.max_items { return Err(ApiError::bad_request( format!("Limit cannot exceed {}", self.max_items) )); } // Check filter complexity let filter_count = query.status.as_ref().map(|s| s.len()).unwrap_or(0) + query.assigned_to.as_ref().map(|a| a.len()).unwrap_or(0); if filter_count > 100 { return Err(ApiError::bad_request( "Too many filter values" )); } Ok(()) } } }
Security Considerations
Query Authorization
#![allow(unused)] fn main() { #[async_trait] trait QueryAuthorizer { async fn can_view_task(&self, user: &AuthenticatedUser, task_id: &str) -> bool; async fn can_view_user_tasks(&self, user: &AuthenticatedUser, target_user_id: &str) -> bool; async fn can_view_statistics(&self, user: &AuthenticatedUser) -> bool; } struct RoleBasedAuthorizer; #[async_trait] impl QueryAuthorizer for RoleBasedAuthorizer { async fn can_view_task(&self, user: &AuthenticatedUser, task_id: &str) -> bool { // Admin can see all if user.has_role("admin") { return true; } // Others can only see their own tasks or tasks they created; // checking that requires a task lookup, so default-deny until it exists false } async fn can_view_user_tasks(&self, user: &AuthenticatedUser, target_user_id: &str) -> bool { // Users can see their own tasks if user.id.to_string() == target_user_id { return true; } // Managers can see their team's tasks user.has_role("manager") || user.has_role("admin") } async fn can_view_statistics(&self, user: &AuthenticatedUser) -> bool { user.has_role("manager") || user.has_role("admin") } } // Use in handlers async fn get_user_tasks( State(state): State<AppState>, Path(user_id): Path<String>, user: AuthenticatedUser, ) -> Result<Json<Vec<TaskSummary>>, ApiError> { // Check authorization if !state.authorizer.can_view_user_tasks(&user, &user_id).await { return Err(ApiError::forbidden("Cannot view tasks for this user")); } // Continue with query... } }
Rate Limiting
#![allow(unused)] fn main() { use governor::{Quota, DefaultKeyedRateLimiter}; #[derive(Clone)] struct RateLimitConfig { anonymous_quota: Quota, authenticated_quota: Quota, admin_quota: Quota, } async fn rate_limit_middleware( State(limiter): State<Arc<DefaultKeyedRateLimiter<String>>>, user: Option<AuthenticatedUser>, request: Request, next: Next, ) -> Result<Response, ApiError> { let key = match &user { Some(u) => u.id.to_string(), None => request .headers() .get("x-forwarded-for") .and_then(|h| h.to_str().ok()) .unwrap_or("anonymous") .to_string(), }; limiter .check_key(&key) .map_err(|_| ApiError::too_many_requests("Rate limit exceeded"))?; Ok(next.run(request).await) } }
Testing Query Endpoints
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; #[tokio::test] async fn test_pagination() { let state = create_test_state_with_tasks(100).await; // First page let response = list_tasks( State(state.clone()), QueryParams(ListTasksQuery { page: 1, page_size: 20, ..Default::default() }), ).await.unwrap(); assert_eq!(response.0.tasks.len(), 20); assert_eq!(response.0.pagination.total_items, 100); assert_eq!(response.0.pagination.total_pages, 5); // Last page let response = list_tasks( State(state), QueryParams(ListTasksQuery { page: 5, page_size: 20, ..Default::default() }), ).await.unwrap(); assert_eq!(response.0.tasks.len(), 20); } #[tokio::test] async fn test_caching_headers() { let state = create_test_state().await; let response = get_public_statistics( State(state), HeaderMap::new(), ).await; assert_eq!(response.status(), StatusCode::OK); assert!(response.headers().contains_key(ETAG)); assert!(response.headers().contains_key(CACHE_CONTROL)); let cache_control = response.headers() .get(CACHE_CONTROL) .unwrap() .to_str() .unwrap(); assert!(cache_control.contains("max-age=300")); } } }
Best Practices
- Use projections - Don’t query event streams directly
- Paginate results - Never return unbounded lists
- Cache aggressively - Read queries are perfect for caching
- Validate query parameters - Prevent resource exhaustion
- Monitor performance - Track slow queries
- Use appropriate protocols - REST for simple, GraphQL for complex
- Implement authorization - Check permissions for all queries
- Version your API - Queries can evolve independently
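Several of these practices can be enforced mechanically before a query ever reaches a projection. The sketch below (names hypothetical, not part of EventCore) clamps client-supplied pagination so result sets are never unbounded, and computes `total_pages` with the same ceiling division the pagination tests above rely on:

```rust
// Hypothetical helper: sanitize client pagination input before querying.
#[derive(Debug, PartialEq)]
struct Pagination {
    page: u32,
    page_size: u32,
    offset: u64,
}

fn clamp_pagination(page: u32, page_size: u32, max_page_size: u32) -> Pagination {
    let page = page.max(1); // pages are 1-indexed
    let page_size = page_size.clamp(1, max_page_size); // never unbounded
    Pagination {
        page,
        page_size,
        offset: u64::from(page - 1) * u64::from(page_size),
    }
}

// Ceiling division: 100 items at 20 per page -> 5 pages.
fn total_pages(total_items: u64, page_size: u32) -> u64 {
    let size = u64::from(page_size.max(1));
    (total_items + size - 1) / size
}

fn main() {
    // A client asking for page 0 with an oversized page gets safe values.
    let p = clamp_pagination(0, 5_000, 100);
    assert_eq!(p, Pagination { page: 1, page_size: 100, offset: 0 });
    assert_eq!(total_pages(100, 20), 5);
    assert_eq!(total_pages(101, 20), 6);
}
```

Running the clamp in the extractor (rather than in each handler) keeps the limit consistent across every query endpoint.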
Summary
Query endpoints in EventCore applications:
- ✅ Projection-based - Read from optimized projections
- ✅ Performant - Caching and optimization built-in
- ✅ Flexible - Support REST, GraphQL, and real-time
- ✅ Secure - Authorization and rate limiting
- ✅ Testable - Easy to test in isolation
Key patterns:
- Read from projections, not event streams
- Implement proper pagination
- Cache responses appropriately
- Validate and limit query complexity
- Authorize access to data
- Monitor query performance
Next, let’s explore Authentication and Authorization →
Chapter 4.4: Authentication and Authorization
Security is critical for event-sourced systems. This chapter covers authentication (who you are) and authorization (what you can do) patterns for EventCore APIs.
Authentication Strategies
JWT Authentication
JSON Web Tokens are stateless and work well with EventCore:
#![allow(unused)] fn main() { use jsonwebtoken::{encode, decode, Header, Algorithm, Validation, EncodingKey, DecodingKey}; use serde::{Deserialize, Serialize}; #[derive(Debug, Serialize, Deserialize)] struct Claims { sub: String, // Subject (user ID) exp: usize, // Expiration time iat: usize, // Issued at iss: String, // Issuer — must match the validation below aud: String, // Audience — must match the validation below roles: Vec<String>, // User roles permissions: Vec<String>, // Specific permissions } #[derive(Clone)] struct JwtConfig { secret: String, issuer: String, audience: String, access_token_duration: Duration, refresh_token_duration: Duration, } impl JwtConfig { fn create_access_token(&self, user: &User) -> Result<String, ApiError> { let now = Utc::now(); let exp = now + self.access_token_duration; let claims = Claims { sub: user.id.to_string(), exp: exp.timestamp() as usize, iat: now.timestamp() as usize, iss: self.issuer.clone(), aud: self.audience.clone(), roles: user.roles.clone(), permissions: user.permissions.clone(), }; encode( &Header::default(), &claims, &EncodingKey::from_secret(self.secret.as_ref()), ) .map_err(|_| ApiError::internal("Failed to create token")) } fn validate_token(&self, token: &str) -> Result<Claims, ApiError> { let mut validation = Validation::new(Algorithm::HS256); validation.set_issuer(&[&self.issuer]); validation.set_audience(&[&self.audience]); decode::<Claims>( token, &DecodingKey::from_secret(self.secret.as_ref()), &validation, ) .map(|data| data.claims) .map_err(|e| match e.kind() { jsonwebtoken::errors::ErrorKind::ExpiredSignature => { ApiError::unauthorized("Token expired") } _ => ApiError::unauthorized("Invalid token"), }) } } }
Login Endpoint
#![allow(unused)] fn main() { #[derive(Debug, Deserialize)] struct LoginRequest { email: String, password: String, } #[derive(Debug, Serialize)] struct LoginResponse { access_token: String, refresh_token: String, token_type: String, expires_in: u64, } async fn login( State(state): State<AppState>, Json(request): Json<LoginRequest>, ) -> Result<Json<LoginResponse>, ApiError> { // Validate credentials let email = Email::try_new(request.email) .map_err(|_| ApiError::bad_request("Invalid email"))?; // Execute authentication command let command = AuthenticateUser { email: email.clone(), password: Password::from(request.password), }; let result = state.executor .execute(&command) .await .map_err(|_| ApiError::unauthorized("Invalid credentials"))?; // Get user from projection let user = state.projections .read() .await .get::<UserProjection>() .unwrap() .get_user_by_email(&email) .await? .ok_or_else(|| ApiError::unauthorized("Invalid credentials"))?; // Create tokens let access_token = state.jwt_config.create_access_token(&user)?; let refresh_token = state.jwt_config.create_refresh_token(&user)?; // Store refresh token (for revocation) let store_command = StoreRefreshToken { user_id: user.id.clone(), token_hash: hash_token(&refresh_token), expires_at: Utc::now() + state.jwt_config.refresh_token_duration, }; state.executor.execute(&store_command).await?; Ok(Json(LoginResponse { access_token, refresh_token, token_type: "Bearer".to_string(), expires_in: state.jwt_config.access_token_duration.as_secs(), })) } }
Authentication Middleware
#![allow(unused)] fn main() { use axum::{ extract::{FromRequestParts, Request}, middleware::{self, Next}, response::Response, }; #[derive(Debug, Clone)] pub struct AuthenticatedUser { pub id: UserId, pub roles: Vec<String>, pub permissions: Vec<String>, } #[async_trait] impl<S> FromRequestParts<S> for AuthenticatedUser where S: Send + Sync, { type Rejection = ApiError; async fn from_request_parts( parts: &mut http::request::Parts, state: &S, ) -> Result<Self, Self::Rejection> { // Get JWT config from extensions (set by middleware) let jwt_config = parts .extensions .get::<JwtConfig>() .ok_or_else(|| ApiError::internal("JWT config not found"))?; // Extract token from Authorization header let token = extract_bearer_token(&parts.headers)?; // Validate token let claims = jwt_config.validate_token(token)?; Ok(AuthenticatedUser { id: UserId::try_new(claims.sub)?, roles: claims.roles, permissions: claims.permissions, }) } } fn extract_bearer_token(headers: &HeaderMap) -> Result<&str, ApiError> { headers .get(AUTHORIZATION) .and_then(|v| v.to_str().ok()) .and_then(|v| v.strip_prefix("Bearer ")) .ok_or_else(|| ApiError::unauthorized("Missing or invalid Authorization header")) } // Optional authentication extractor pub struct OptionalAuth(pub Option<AuthenticatedUser>); #[async_trait] impl<S> FromRequestParts<S> for OptionalAuth where S: Send + Sync, { type Rejection = Infallible; async fn from_request_parts( parts: &mut http::request::Parts, state: &S, ) -> Result<Self, Self::Rejection> { Ok(OptionalAuth( AuthenticatedUser::from_request_parts(parts, state) .await .ok() )) } } }
Authorization Patterns
Role-Based Access Control (RBAC)
#![allow(unused)] fn main() { #[derive(Debug, Clone, PartialEq, Eq, Hash)] enum Role { Admin, Manager, Employee, Guest, } impl AuthenticatedUser { pub fn has_role(&self, role: &str) -> bool { self.roles.contains(&role.to_string()) } pub fn has_any_role(&self, roles: &[&str]) -> bool { roles.iter().any(|role| self.has_role(role)) } pub fn has_all_roles(&self, roles: &[&str]) -> bool { roles.iter().all(|role| self.has_role(role)) } } // Authorization guard async fn require_role( user: &AuthenticatedUser, role: &str, ) -> Result<(), ApiError> { if !user.has_role(role) { return Err(ApiError::forbidden( format!("Requires {} role", role) )); } Ok(()) } // In handlers async fn admin_endpoint( user: AuthenticatedUser, // other params... ) -> Result<Json<AdminData>, ApiError> { require_role(&user, "admin").await?; // Admin-only logic... } }
Permission-Based Access Control
#![allow(unused)] fn main() { #[derive(Debug, Clone, PartialEq, Eq, Hash)] enum Permission { // Task permissions CreateTask, ReadTask, UpdateTask, DeleteTask, AssignTask, // User permissions CreateUser, ReadUser, UpdateUser, DeleteUser, // Admin permissions ViewAnalytics, ManageSystem, } impl AuthenticatedUser { pub fn has_permission(&self, permission: &str) -> bool { self.permissions.contains(&permission.to_string()) } pub fn can(&self, action: Permission) -> bool { // Debug formatting yields the variant name, e.g. "CreateTask" self.has_permission(&format!("{:?}", action)) } } // Permission checking in handlers async fn create_task_handler( user: AuthenticatedUser, Json(request): Json<CreateTaskRequest>, ) -> Result<Json<CreateTaskResponse>, ApiError> { if !user.can(Permission::CreateTask) { return Err(ApiError::forbidden("Cannot create tasks")); } // Create task... } }
Resource-Based Access Control
#![allow(unused)] fn main() { #[async_trait] trait ResourceAuthorizer { async fn can_read(&self, user: &AuthenticatedUser, resource_id: &str) -> bool; async fn can_write(&self, user: &AuthenticatedUser, resource_id: &str) -> bool; async fn can_delete(&self, user: &AuthenticatedUser, resource_id: &str) -> bool; } struct TaskAuthorizer { projection: Arc<TaskProjection>, } #[async_trait] impl ResourceAuthorizer for TaskAuthorizer { async fn can_read(&self, user: &AuthenticatedUser, task_id: &str) -> bool { // Admins can read all if user.has_role("admin") { return true; } // Check if user owns or is assigned to task if let Ok(Some(task)) = self.projection.get_task(task_id).await { return task.created_by == user.id || task.assigned_to == Some(user.id.clone()); } false } async fn can_write(&self, user: &AuthenticatedUser, task_id: &str) -> bool { // Similar logic for write permissions if user.has_role("admin") || user.has_role("manager") { return true; } // Check ownership or assignment if let Ok(Some(task)) = self.projection.get_task(task_id).await { return task.assigned_to == Some(user.id.clone()); } false } async fn can_delete(&self, user: &AuthenticatedUser, task_id: &str) -> bool { // Only admins and creators can delete if user.has_role("admin") { return true; } if let Ok(Some(task)) = self.projection.get_task(task_id).await { return task.created_by == user.id; } false } } }
Command Authorization
Embed authorization in commands:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct UpdateTask { #[stream] task_id: StreamId, title: Option<TaskTitle>, description: Option<TaskDescription>, // Who is making the change updated_by: UserId, } impl CommandLogic for UpdateTask { // ... other implementations async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Check authorization within command require!( state.can_user_update_task(&self.updated_by), "User {} cannot update task {}", self.updated_by, self.task_id ); // Proceed with update... } } // State includes authorization data impl TaskState { fn can_user_update_task(&self, user_id: &UserId) -> bool { // Task creator can always update if self.created_by == *user_id { return true; } // Assigned user can update if self.assigned_to == Some(user_id.clone()) { return true; } // Check roles (would need to be passed in state) false } } }
API Key Authentication
For service-to-service communication:
#![allow(unused)] fn main() { #[derive(Debug, Clone)] struct ApiKey { key: String, service_name: String, permissions: Vec<String>, rate_limit: Option<u32>, } #[async_trait] impl<S> FromRequestParts<S> for ApiKey where S: Send + Sync, { type Rejection = ApiError; async fn from_request_parts( parts: &mut http::request::Parts, state: &S, ) -> Result<Self, Self::Rejection> { let key = parts .headers .get("X-API-Key") .and_then(|v| v.to_str().ok()) .ok_or_else(|| ApiError::unauthorized("Missing API key"))?; // Look up API key (from cache/database) let api_key = validate_api_key(key).await?; Ok(api_key) } } async fn validate_api_key(key: &str) -> Result<ApiKey, ApiError> { // Hash the key for lookup let key_hash = hash_api_key(key); // Look up in projection/cache let api_key = get_api_key_by_hash(&key_hash) .await? .ok_or_else(|| ApiError::unauthorized("Invalid API key"))?; // Check if expired if api_key.expires_at < Utc::now() { return Err(ApiError::unauthorized("API key expired")); } Ok(api_key) } }
OAuth2 Integration
For third-party authentication:
#![allow(unused)] fn main() { use oauth2::{ basic::BasicClient, AuthorizationCode, AuthUrl, ClientId, ClientSecret, CsrfToken, PkceCodeChallenge, RedirectUrl, Scope, TokenResponse, TokenUrl, }; #[derive(Clone)] struct OAuth2Config { client_id: ClientId, client_secret: ClientSecret, auth_url: AuthUrl, token_url: TokenUrl, redirect_url: RedirectUrl, } async fn oauth_login( State(oauth): State<OAuth2Config>, Query(params): Query<HashMap<String, String>>, ) -> Result<Redirect, ApiError> { let client = BasicClient::new( oauth.client_id, Some(oauth.client_secret), oauth.auth_url, Some(oauth.token_url), ) .set_redirect_uri(oauth.redirect_url); // Generate PKCE challenge let (pkce_challenge, pkce_verifier) = PkceCodeChallenge::new_random_sha256(); // Generate authorization URL let (auth_url, csrf_token) = client .authorize_url(CsrfToken::new_random) .add_scope(Scope::new("read:user".to_string())) .set_pkce_challenge(pkce_challenge) .url(); // Store CSRF token and PKCE verifier (in session/cache) store_oauth_state(&csrf_token, &pkce_verifier).await?; Ok(Redirect::to(auth_url.as_str())) } async fn oauth_callback( State(state): State<AppState>, Query(params): Query<OAuthCallbackParams>, ) -> Result<Json<LoginResponse>, ApiError> { // Verify CSRF token let (stored_csrf, pkce_verifier) = get_oauth_state(&params.state).await?; if stored_csrf != params.state { return Err(ApiError::bad_request("Invalid state parameter")); } // Exchange code for token let token_result = exchange_code_for_token( &state.oauth_config, &params.code, &pkce_verifier, ).await?; // Get user info from provider let user_info = fetch_user_info(&token_result.access_token()).await?; // Create or update user in EventCore let command = CreateOrUpdateOAuthUser { provider: "github".to_string(), provider_user_id: user_info.id, email: user_info.email, name: user_info.name, }; state.executor.execute(&command).await?; // Create JWT tokens let user = get_user_by_email(&user_info.email).await?; let access_token = state.jwt_config.create_access_token(&user)?; Ok(Json(LoginResponse { access_token, // ... other fields })) } }
Session Management
Track active sessions:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct CreateSession { #[stream] user_id: StreamId, #[stream] session_id: StreamId, ip_address: IpAddr, user_agent: String, expires_at: DateTime<Utc>, } #[derive(Command, Clone)] struct RevokeSession { #[stream] session_id: StreamId, #[stream] user_id: StreamId, reason: RevocationReason, } // Session validation middleware async fn validate_session( State(state): State<AppState>, user: AuthenticatedUser, request: Request, next: Next, ) -> Result<Response, ApiError> { let session_id = extract_session_id(&request)?; // Check if session is valid let session = state.projections .read() .await .get::<SessionProjection>() .unwrap() .get_session(&session_id) .await? .ok_or_else(|| ApiError::unauthorized("Invalid session"))?; // Verify session belongs to user if session.user_id != user.id { return Err(ApiError::unauthorized("Session mismatch")); } // Check expiration if session.expires_at < Utc::now() { return Err(ApiError::unauthorized("Session expired")); } // Check if revoked if session.revoked { return Err(ApiError::unauthorized("Session revoked")); } Ok(next.run(request).await) } }
Security Headers
Add security headers to all responses:
#![allow(unused)] fn main() { async fn security_headers_middleware( request: Request, next: Next, ) -> Response { let mut response = next.run(request).await; let headers = response.headers_mut(); // Prevent clickjacking headers.insert( "X-Frame-Options", HeaderValue::from_static("DENY"), ); // XSS protection headers.insert( "X-Content-Type-Options", HeaderValue::from_static("nosniff"), ); // CSP headers.insert( "Content-Security-Policy", HeaderValue::from_static("default-src 'self'"), ); // HSTS headers.insert( "Strict-Transport-Security", HeaderValue::from_static("max-age=31536000; includeSubDomains"), ); response } }
Testing Authentication
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; fn create_test_token(user_id: &str, roles: Vec<&str>) -> String { let claims = Claims { sub: user_id.to_string(), exp: (Utc::now() + Duration::hours(1)).timestamp() as usize, iat: Utc::now().timestamp() as usize, roles: roles.into_iter().map(|s| s.to_string()).collect(), permissions: vec![], }; encode( &Header::default(), &claims, &EncodingKey::from_secret(TEST_SECRET.as_ref()), ).unwrap() } #[tokio::test] async fn test_authentication_required() { let app = create_test_app(); // No token let response = app .oneshot( Request::builder() .uri("/api/v1/protected") .body(Body::empty()) .unwrap(), ) .await .unwrap(); assert_eq!(response.status(), StatusCode::UNAUTHORIZED); } #[tokio::test] async fn test_role_authorization() { let app = create_test_app(); // User token without admin role let token = create_test_token("user123", vec!["user"]); let response = app .oneshot( Request::builder() .uri("/api/v1/admin/users") .header("Authorization", format!("Bearer {}", token)) .body(Body::empty()) .unwrap(), ) .await .unwrap(); assert_eq!(response.status(), StatusCode::FORBIDDEN); } } }
Best Practices
- Use HTTPS always - Never send tokens over unencrypted connections
- Short token lifetimes - Access tokens should expire quickly
- Refresh tokens - Use refresh tokens for long-lived sessions
- Store hashes - Never store plaintext tokens or passwords
- Audit everything - Log all authentication/authorization events
- Principle of least privilege - Grant minimal necessary permissions
- Defense in depth - Layer multiple security mechanisms
- Regular reviews - Audit permissions and access regularly
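The "store hashes" rule means a refresh token is reduced to a fixed-size digest before it is persisted, so the plaintext never touches storage; an incoming token is hashed the same way and compared. The sketch below uses only `std`'s `DefaultHasher` so it runs without dependencies — it is NOT cryptographically secure, and a real system would use SHA-256 for opaque tokens (and argon2/bcrypt for passwords), as the `hash_token` helper in the login endpoint above is assumed to do:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustration only: NOT cryptographically secure. Substitute SHA-256
// (e.g. the `sha2` crate) before storing real refresh tokens.
fn token_lookup_key(token: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    token.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // At login time, only the digest is persisted.
    let stored = token_lookup_key("refresh-token-abc");
    // At refresh time, the presented token is hashed and compared;
    // the plaintext token is never written to storage.
    assert_eq!(token_lookup_key("refresh-token-abc"), stored);
    assert_ne!(token_lookup_key("refresh-token-xyz"), stored);
}
```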
Summary
Authentication and authorization in EventCore:
- ✅ Flexible strategies - JWT, API keys, OAuth2
- ✅ Strong typing - Type-safe user and permission models
- ✅ Event sourced - Authentication events provide audit trail
- ✅ Performance - Caching for fast authorization checks
- ✅ Testable - Easy to test security rules
Key patterns:
- Authenticate early in the request pipeline
- Embed authorization in commands
- Use projections for fast permission lookups
- Audit all security events
- Test security thoroughly
Next, let’s explore API Versioning →
Chapter 4.5: API Versioning
APIs evolve over time. This chapter covers strategies for versioning your EventCore APIs while maintaining backward compatibility and providing a smooth migration path for clients.
Versioning Strategies
URL Path Versioning
The most explicit and commonly used approach:
#![allow(unused)] fn main() { use axum::{Router, routing::{get, post}}; fn create_versioned_routes() -> Router { Router::new() // Version 1 endpoints .nest("/api/v1", v1_routes()) // Version 2 endpoints .nest("/api/v2", v2_routes()) // Latest version alias (optional) .nest("/api/latest", v2_routes()) } fn v1_routes() -> Router { Router::new() .route("/tasks", post(v1::create_task)) .route("/tasks/:id", get(v1::get_task)) .route("/tasks/:id/assign", post(v1::assign_task)) } fn v2_routes() -> Router { Router::new() .route("/tasks", post(v2::create_task)) .route("/tasks/:id", get(v2::get_task)) .route("/tasks/:id/assign", post(v2::assign_task)) // New in v2 .route("/tasks/:id/subtasks", get(v2::get_subtasks)) .route("/tasks/bulk", post(v2::bulk_create_tasks)) } }
Header-Based Versioning
More RESTful but less discoverable:
#![allow(unused)] fn main() { use axum::{ extract::{FromRequestParts, Request}, http::HeaderValue, }; #[derive(Debug, Clone, Copy)] enum ApiVersion { V1, V2, } impl Default for ApiVersion { fn default() -> Self { ApiVersion::V2 // Latest version } } #[async_trait] impl<S> FromRequestParts<S> for ApiVersion where S: Send + Sync, { type Rejection = ApiError; async fn from_request_parts( parts: &mut http::request::Parts, _state: &S, ) -> Result<Self, Self::Rejection> { let version = parts .headers .get("API-Version") .and_then(|v| v.to_str().ok()) .map(|v| match v { "1" | "v1" => ApiVersion::V1, "2" | "v2" => ApiVersion::V2, _ => ApiVersion::default(), }) .unwrap_or_default(); Ok(version) } } // Use in handlers async fn create_task( version: ApiVersion, Json(request): Json<serde_json::Value>, ) -> Result<Response, ApiError> { match version { ApiVersion::V1 => v1::create_task_handler(request).await, ApiVersion::V2 => v2::create_task_handler(request).await, } } }
Content Type Versioning
Using vendor-specific media types:
#![allow(unused)] fn main() { #[derive(Debug, Clone)] enum ContentVersion { V1, V2, } impl ContentVersion { fn from_content_type(content_type: &str) -> Self { if content_type.contains("vnd.eventcore.v1+json") { ContentVersion::V1 } else if content_type.contains("vnd.eventcore.v2+json") { ContentVersion::V2 } else { ContentVersion::V2 // Default to latest } } fn to_content_type(&self) -> &'static str { match self { ContentVersion::V1 => "application/vnd.eventcore.v1+json", ContentVersion::V2 => "application/vnd.eventcore.v2+json", } } } }
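The matching logic above can be exercised in isolation. A minimal standalone re-implementation (self-contained so it runs without the surrounding types):

```rust
// Standalone sketch of the media-type negotiation above: map an incoming
// Content-Type/Accept value to the vendor media type the server will use.
fn negotiate(content_type: &str) -> &'static str {
    if content_type.contains("vnd.eventcore.v1+json") {
        "application/vnd.eventcore.v1+json"
    } else {
        // Unknown or generic types fall through to the latest version.
        "application/vnd.eventcore.v2+json"
    }
}

fn main() {
    assert_eq!(
        negotiate("application/vnd.eventcore.v1+json"),
        "application/vnd.eventcore.v1+json"
    );
    // Plain JSON defaults to v2.
    assert_eq!(negotiate("application/json"), "application/vnd.eventcore.v2+json");
}
```

Defaulting unknown types to the latest version keeps old generic clients working, at the cost of silently upgrading them; defaulting to an error is the stricter alternative.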
Request/Response Evolution
Backward Compatible Changes
These changes don’t require a new version:
#![allow(unused)] fn main() { // Original V1 request #[derive(Debug, Deserialize)] struct CreateTaskRequestV1 { title: String, description: String, } // Backward compatible V1 with optional field #[derive(Debug, Deserialize)] struct CreateTaskRequestV1Enhanced { title: String, description: String, #[serde(default)] priority: Option<Priority>, // New optional field } // Response expansion is also backward compatible #[derive(Debug, Serialize)] struct TaskResponseV1 { id: String, title: String, description: String, created_at: DateTime<Utc>, #[serde(skip_serializing_if = "Option::is_none")] priority: Option<Priority>, // New optional field } }
Breaking Changes
These require a new API version:
#![allow(unused)] fn main() { mod v1 { #[derive(Debug, Deserialize)] struct CreateTaskRequest { title: String, description: String, assigned_to: String, // Single assignee } } mod v2 { #[derive(Debug, Deserialize)] struct CreateTaskRequest { title: String, description: String, assigned_to: Vec<String>, // Breaking: Now multiple assignees #[serde(default)] tags: Vec<String>, // New field } } // Adapter to support both versions async fn create_task_adapter( version: ApiVersion, Json(value): Json<serde_json::Value>, ) -> Result<Json<TaskResponse>, ApiError> { match version { ApiVersion::V1 => { let request: v1::CreateTaskRequest = serde_json::from_value(value)?; // Convert V1 to internal command let command = CreateTask { title: request.title, description: request.description, assigned_to: vec![request.assigned_to], // Adapt single to vec tags: vec![], // Default for V1 }; execute_create_task(command).await } ApiVersion::V2 => { let request: v2::CreateTaskRequest = serde_json::from_value(value)?; let command = CreateTask { title: request.title, description: request.description, assigned_to: request.assigned_to, tags: request.tags, }; execute_create_task(command).await } } } }
Command Versioning
Version commands to handle different API versions:
#![allow(unused)] fn main() { // Internal command representation (latest version) #[derive(Command, Clone)] struct CreateTask { #[stream] task_id: StreamId, title: TaskTitle, description: TaskDescription, assigned_to: Vec<UserId>, tags: Vec<Tag>, priority: Priority, } // Version-specific command builders mod command_builders { use super::*; pub fn from_v1_request(req: v1::CreateTaskRequest) -> Result<CreateTask, ApiError> { Ok(CreateTask { task_id: StreamId::from(format!("task-{}", TaskId::new())), title: TaskTitle::try_new(req.title)?, description: TaskDescription::try_new(req.description)?, assigned_to: vec![UserId::try_new(req.assigned_to)?], tags: vec![], // V1 doesn't support tags priority: Priority::Normal, // Default for V1 }) } pub fn from_v2_request(req: v2::CreateTaskRequest) -> Result<CreateTask, ApiError> { Ok(CreateTask { task_id: StreamId::from(format!("task-{}", TaskId::new())), title: TaskTitle::try_new(req.title)?, description: TaskDescription::try_new(req.description)?, assigned_to: req.assigned_to .into_iter() .map(|a| UserId::try_new(a)) .collect::<Result<Vec<_>, _>>()?, tags: req.tags .into_iter() .map(|t| Tag::try_new(t)) .collect::<Result<Vec<_>, _>>()?, priority: req.priority.unwrap_or(Priority::Normal), }) } } }
Response Transformation
Transform internal data to version-specific responses:
#![allow(unused)] fn main() { // Internal projection data #[derive(Debug, Clone)] struct TaskData { id: TaskId, title: String, description: String, assigned_to: Vec<UserId>, tags: Vec<Tag>, priority: Priority, created_at: DateTime<Utc>, updated_at: DateTime<Utc>, subtasks: Vec<SubtaskData>, // Added in V2 } // Response transformers mod response_transformers { use super::*; pub fn to_v1_response(task: TaskData) -> v1::TaskResponse { v1::TaskResponse { id: task.id.to_string(), title: task.title, description: task.description, assigned_to: task.assigned_to.first() .map(|u| u.to_string()) .unwrap_or_default(), // V1 only supports single assignee created_at: task.created_at, updated_at: task.updated_at, } } pub fn to_v2_response(task: TaskData) -> v2::TaskResponse { v2::TaskResponse { id: task.id.to_string(), title: task.title, description: task.description, assigned_to: task.assigned_to .into_iter() .map(|u| u.to_string()) .collect(), tags: task.tags .into_iter() .map(|t| t.to_string()) .collect(), priority: task.priority, created_at: task.created_at, updated_at: task.updated_at, subtask_count: task.subtasks.len(), _links: v2::Links { self_: format!("/api/v2/tasks/{}", task.id), subtasks: format!("/api/v2/tasks/{}/subtasks", task.id), }, } } } }
Deprecation Strategy
Communicate deprecation clearly:
#![allow(unused)] fn main() { async fn deprecated_middleware( request: Request, next: Next, ) -> Response { let mut response = next.run(request).await; // Add deprecation headers response.headers_mut().insert( "Sunset", HeaderValue::from_static("Tue, 31 Dec 2024 23:59:59 GMT"), ); response.headers_mut().insert( "Deprecation", HeaderValue::from_static("true"), ); response.headers_mut().insert( "Link", HeaderValue::from_static( "</api/v2/docs>; rel=\"successor-version\"" ), ); response } // Apply to V1 routes let v1_routes = Router::new() .route("/tasks", post(v1::create_task)) .layer(middleware::from_fn(deprecated_middleware)); }
Deprecation Notices in Responses
#![allow(unused)] fn main() { #[derive(Debug, Serialize)] struct DeprecatedResponse<T> { #[serde(flatten)] data: T, _deprecation: DeprecationNotice, } #[derive(Debug, Serialize)] struct DeprecationNotice { message: &'static str, sunset_date: &'static str, migration_guide: &'static str, } impl<T> DeprecatedResponse<T> { fn new(data: T) -> Self { Self { data, _deprecation: DeprecationNotice { message: "This API version is deprecated", sunset_date: "2024-12-31", migration_guide: "https://docs.eventcore.io/migration/v1-to-v2", }, } } } }
Version Discovery
Help clients discover available versions:
#![allow(unused)] fn main() { #[derive(Debug, Serialize)] struct ApiVersionInfo { version: String, status: VersionStatus, deprecated: bool, sunset_date: Option<String>, endpoints: Vec<EndpointInfo>, } #[derive(Debug, Serialize)] #[serde(rename_all = "lowercase")] enum VersionStatus { Stable, Beta, Deprecated, Sunset, } async fn get_api_versions() -> Json<Vec<ApiVersionInfo>> { Json(vec![ ApiVersionInfo { version: "v1".to_string(), status: VersionStatus::Deprecated, deprecated: true, sunset_date: Some("2024-12-31".to_string()), endpoints: vec![ EndpointInfo { path: "/api/v1/tasks", methods: vec!["GET", "POST"], }, // ... other endpoints ], }, ApiVersionInfo { version: "v2".to_string(), status: VersionStatus::Stable, deprecated: false, sunset_date: None, endpoints: vec![ EndpointInfo { path: "/api/v2/tasks", methods: vec!["GET", "POST"], }, EndpointInfo { path: "/api/v2/tasks/bulk", methods: vec!["POST"], }, // ... other endpoints ], }, ]) } }
Migration Support
Help clients migrate between versions:
#![allow(unused)] fn main() { // Migration endpoint that accepts V1 format and returns V2 async fn migrate_task_format( Json(v1_task): Json<v1::TaskResponse>, ) -> Result<Json<v2::TaskResponse>, ApiError> { // Transform V1 to V2 format let v2_task = v2::TaskResponse { id: v1_task.id, title: v1_task.title, description: v1_task.description, assigned_to: vec![v1_task.assigned_to], // Convert single to array tags: vec![], // Default empty priority: Priority::Normal, // Default created_at: v1_task.created_at, updated_at: v1_task.updated_at, subtask_count: 0, // Default _links: v2::Links { self_: format!("/api/v2/tasks/{}", v1_task.id), subtasks: format!("/api/v2/tasks/{}/subtasks", v1_task.id), }, }; Ok(Json(v2_task)) } // Bulk migration endpoint async fn migrate_tasks_bulk( Json(request): Json<BulkMigrationRequest>, ) -> Result<Json<BulkMigrationResponse>, ApiError> { let mut migrated = Vec::new(); let mut errors = Vec::new(); for task_id in request.task_ids { match migrate_single_task(&task_id).await { Ok(task) => migrated.push(task), Err(e) => errors.push(MigrationError { task_id, error: e.to_string(), }), } } Ok(Json(BulkMigrationResponse { migrated_count: migrated.len(), error_count: errors.len(), errors: if errors.is_empty() { None } else { Some(errors) }, })) } }
Testing Multiple Versions
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; #[tokio::test] async fn test_v1_compatibility() { let app = create_app(); // V1 request format let v1_request = serde_json::json!({ "title": "Test Task", "description": "Test Description", "assigned_to": "user123" }); let response = app .clone() .oneshot( Request::builder() .uri("/api/v1/tasks") .method("POST") .header("Content-Type", "application/json") .body(Body::from(v1_request.to_string())) .unwrap(), ) .await .unwrap(); assert_eq!(response.status(), StatusCode::CREATED); // Verify deprecation headers assert_eq!( response.headers().get("Deprecation").unwrap(), "true" ); } #[tokio::test] async fn test_v2_enhancements() { let app = create_app(); // V2 request with new features let v2_request = serde_json::json!({ "title": "Test Task", "description": "Test Description", "assigned_to": ["user123", "user456"], "tags": ["urgent", "backend"], "priority": "high" }); let response = app .oneshot( Request::builder() .uri("/api/v2/tasks") .method("POST") .header("Content-Type", "application/json") .body(Body::from(v2_request.to_string())) .unwrap(), ) .await .unwrap(); assert_eq!(response.status(), StatusCode::CREATED); let body: v2::TaskResponse = serde_json::from_slice( &hyper::body::to_bytes(response.into_body()).await.unwrap() ).unwrap(); assert_eq!(body.assigned_to.len(), 2); assert_eq!(body.tags.len(), 2); } #[tokio::test] async fn test_version_negotiation() { let app = create_app(); // Test header-based versioning let response = app .clone() .oneshot( Request::builder() .uri("/api/tasks/123") .header("API-Version", "v1") .body(Body::empty()) .unwrap(), ) .await .unwrap(); // Should return V1 format let body: v1::TaskResponse = serde_json::from_slice( &hyper::body::to_bytes(response.into_body()).await.unwrap() ).unwrap(); assert!(!body.assigned_to.is_empty()); // V1 uses a single string, not an array } } }
Documentation
Generate version-specific documentation:
#![allow(unused)] fn main() { use utoipa::{OpenApi, ToSchema}; #[derive(OpenApi)] #[openapi( paths( v1::create_task, v1::get_task, ), components( schemas(v1::CreateTaskRequest, v1::TaskResponse) ), tags( (name = "tasks", description = "Task management API v1") ), info( title = "EventCore API v1", version = "1.0.0", description = "Legacy API version - deprecated" ) )] struct ApiDocV1; #[derive(OpenApi)] #[openapi( paths( v2::create_task, v2::get_task, v2::bulk_create_tasks, ), components( schemas(v2::CreateTaskRequest, v2::TaskResponse) ), tags( (name = "tasks", description = "Task management API v2") ), info( title = "EventCore API v2", version = "2.0.0", description = "Current stable API version" ) )] struct ApiDocV2; // Serve version-specific docs async fn serve_api_docs(version: ApiVersion) -> impl IntoResponse { match version { ApiVersion::V1 => Json(ApiDocV1::openapi()), ApiVersion::V2 => Json(ApiDocV2::openapi()), } } }
Best Practices
- Plan for versioning from day one - Expose an explicit v1 even in your first release
- Use semantic versioning - Major.Minor.Patch
- Maintain backward compatibility - When possible
- Communicate changes clearly - Use headers and documentation
- Set deprecation timelines - Give clients time to migrate
- Version at the right level - Not every change needs a new version
- Test all versions - Maintain test suites for each supported version
- Monitor version usage - Track which versions clients use
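The last point - monitoring version usage - boils down to a counter keyed by the negotiated version, so you know when a deprecated version has drained and can be sunset. A minimal std-only sketch (the `VersionUsage` type and its methods are hypothetical illustrations, not part of EventCore; in production you would feed these counts into your metrics pipeline):

```rust
use std::collections::HashMap;

/// Hypothetical in-process counter for API version usage.
#[derive(Default)]
struct VersionUsage {
    counts: HashMap<String, u64>,
}

impl VersionUsage {
    /// Record one request against the version the client negotiated.
    fn record(&mut self, version: &str) {
        *self.counts.entry(version.to_string()).or_insert(0) += 1;
    }

    /// How many requests have hit a given version so far.
    fn count(&self, version: &str) -> u64 {
        self.counts.get(version).copied().unwrap_or(0)
    }

    /// Versions still receiving traffic - these cannot be sunset yet.
    fn active_versions(&self) -> Vec<&str> {
        self.counts
            .iter()
            .filter(|(_, &n)| n > 0)
            .map(|(v, _)| v.as_str())
            .collect()
    }
}

fn main() {
    let mut usage = VersionUsage::default();
    usage.record("v1");
    usage.record("v2");
    usage.record("v2");
    assert_eq!(usage.count("v1"), 1);
    assert_eq!(usage.count("v2"), 2);
    println!("active versions: {:?}", usage.active_versions());
}
```

In a real service, `record` would be called from the same middleware that resolves the API version, so every request is counted exactly once.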
Summary
API versioning in EventCore applications:
- ✅ Multiple strategies - URL, header, content-type versioning
- ✅ Smooth migration - Tools to help clients upgrade
- ✅ Clear deprecation - Sunset dates and migration guides
- ✅ Version discovery - Clients can explore available versions
- ✅ Backward compatibility - Maintain old versions gracefully
Key patterns:
- Choose a versioning strategy and stick to it
- Transform between versions at API boundaries
- Keep internal representations version-agnostic
- Communicate deprecation clearly
- Provide migration tools and guides
- Test all supported versions
Congratulations! You’ve completed Part 4. Continue to Part 5: Advanced Topics →
Part 5: Advanced Topics
This part covers advanced EventCore patterns and techniques for building sophisticated event-sourced systems. These topics build on the foundations from previous parts.
Chapters in This Part
- Schema Evolution - Evolving events and commands over time
- Event Versioning - Managing event format changes
- Long-Running Processes - Sagas and process managers
- Distributed Systems - Multi-service event sourcing
- Performance Optimization - Scaling EventCore applications
What You’ll Learn
- Handle schema changes gracefully
- Version events and commands safely
- Implement complex business processes
- Scale across multiple services
- Optimize for high performance
Prerequisites
- Completed Parts 1-4
- Production experience with EventCore recommended
- Understanding of distributed systems concepts helpful
Complexity Level
These topics are advanced and assume solid understanding of event sourcing principles and EventCore fundamentals.
Time to Complete
- Reading: ~60 minutes
- With implementation: ~4 hours
Ready for advanced topics? Let’s start with Schema Evolution →
Chapter 5.1: Schema Evolution
Schema evolution is the process of changing event and command structures over time while maintaining backward compatibility. EventCore provides powerful tools for handling schema changes gracefully.
The Challenge
Your system evolves. Business requirements change. Data structures need to adapt. But in event sourcing, you can never change historical events - they’re immutable facts about what happened.
#![allow(unused)] fn main() { // Day 1: Simple user registration #[derive(Serialize, Deserialize)] struct UserRegistered { user_id: UserId, email: String, } // 6 months later: Need more fields #[derive(Serialize, Deserialize)] struct UserRegistered { user_id: UserId, email: String, // New fields - but old events don't have them! first_name: String, last_name: String, preferences: UserPreferences, } }
EventCore’s Schema Evolution Approach
EventCore uses a combination of:
- Serde defaults - Handle missing fields gracefully
- Event versioning - Explicit version tracking
- Migration functions - Transform old formats to new
- Schema registry - Central type management
Backward Compatible Changes
These changes don’t break existing events:
Adding Optional Fields
#![allow(unused)] fn main() { #[derive(Debug, Serialize, Deserialize)] struct UserRegistered { user_id: UserId, email: String, // New optional fields with defaults #[serde(default)] first_name: Option<String>, #[serde(default)] last_name: Option<String>, #[serde(default)] preferences: UserPreferences, } impl Default for UserPreferences { fn default() -> Self { Self { newsletter: false, notifications: true, theme: Theme::Light, } } } }
Adding Fields with Sensible Defaults
#![allow(unused)] fn main() { #[derive(Debug, Serialize, Deserialize)] struct OrderPlaced { order_id: OrderId, customer_id: CustomerId, items: Vec<OrderItem>, // New field with computed default #[serde(default = "default_currency")] currency: Currency, // New field with timestamp default #[serde(default = "Utc::now")] placed_at: DateTime<Utc>, } fn default_currency() -> Currency { Currency::USD } }
Adding Enum Variants
#![allow(unused)] fn main() { #[derive(Debug, Serialize, Deserialize)] #[serde(tag = "type")] enum PaymentMethod { CreditCard { last_four: String }, BankTransfer { account: String }, PayPal { email: String }, // New variants - old events still deserialize ApplePay { device_id: String }, GooglePay { account_id: String }, // Unknown variant fallback #[serde(other)] Unknown, } }
Breaking Changes
These require explicit versioning:
Removing Fields
#![allow(unused)] fn main() { // V1: Has deprecated field #[derive(Debug, Serialize, Deserialize)] struct UserRegisteredV1 { user_id: UserId, email: String, username: String, // Being removed } // V2: Field removed #[derive(Debug, Serialize, Deserialize)] struct UserRegisteredV2 { user_id: UserId, email: String, // username removed - breaking change! } }
Changing Field Types
#![allow(unused)] fn main() { // V1: String user ID #[derive(Debug, Serialize, Deserialize)] struct UserRegisteredV1 { user_id: String, // String email: String, } // V2: Structured user ID #[derive(Debug, Serialize, Deserialize)] struct UserRegisteredV2 { user_id: UserId, // Custom type - breaking change! email: String, } }
Restructuring Data
#![allow(unused)] fn main() { // V1: Flat structure #[derive(Debug, Serialize, Deserialize)] struct OrderPlacedV1 { order_id: OrderId, billing_street: String, billing_city: String, billing_state: String, shipping_street: String, shipping_city: String, shipping_state: String, } // V2: Nested structure #[derive(Debug, Serialize, Deserialize)] struct OrderPlacedV2 { order_id: OrderId, billing_address: Address, // Restructured - breaking change! shipping_address: Address, } }
Versioned Events
EventCore supports explicit event versioning:
#![allow(unused)] fn main() { use eventcore::serialization::VersionedEvent; #[derive(Debug, Serialize, Deserialize)] #[serde(tag = "version")] enum UserRegisteredVersioned { #[serde(rename = "1")] V1 { user_id: String, email: String, username: String, }, #[serde(rename = "2")] V2 { user_id: UserId, email: String, first_name: String, last_name: String, }, #[serde(rename = "3")] V3 { user_id: UserId, email: String, profile: UserProfile, // Further evolution }, } impl VersionedEvent for UserRegisteredVersioned { const EVENT_TYPE: &'static str = "UserRegistered"; fn current_version() -> u32 { 3 } fn migrate_to_current(self) -> Self { match self { UserRegisteredVersioned::V1 { user_id, email, username } => { // V1 → V2: Convert string ID, extract names from username let (first_name, last_name) = split_username(&username); let user_id = UserId::try_new(user_id).unwrap_or_else(|_| UserId::new()); UserRegisteredVersioned::V2 { user_id, email, first_name, last_name, } } UserRegisteredVersioned::V2 { user_id, email, first_name, last_name } => { // V2 → V3: Create profile from names UserRegisteredVersioned::V3 { user_id, email, profile: UserProfile { first_name, last_name, bio: None, avatar_url: None, }, } } v3 => v3, // Already current version } } } }
Migration Functions
For complex transformations, use migration functions:
#![allow(unused)] fn main() { use eventcore::serialization::{Migration, MigrationError}; struct UserRegisteredV1ToV2; impl Migration<UserRegisteredV1, UserRegisteredV2> for UserRegisteredV1ToV2 { fn migrate(&self, v1: UserRegisteredV1) -> Result<UserRegisteredV2, MigrationError> { // Complex migration logic let user_id = parse_legacy_user_id(&v1.user_id)?; let (first_name, last_name) = extract_names_from_username(&v1.username)?; // Validate converted data if first_name.is_empty() { return Err(MigrationError::InvalidData("Empty first name".to_string())); } Ok(UserRegisteredV2 { user_id, email: v1.email, first_name, last_name, }) } } fn parse_legacy_user_id(legacy_id: &str) -> Result<UserId, MigrationError> { // Handle legacy ID formats if legacy_id.starts_with("user_") { let numeric_part = legacy_id.strip_prefix("user_") .ok_or_else(|| MigrationError::InvalidData("Invalid legacy ID format".to_string()))?; let uuid = Uuid::new_v5(&Uuid::NAMESPACE_OID, numeric_part.as_bytes()); Ok(UserId::from(uuid)) } else if let Ok(uuid) = Uuid::parse_str(legacy_id) { Ok(UserId::from(uuid)) } else { Err(MigrationError::InvalidData(format!("Cannot parse user ID: {}", legacy_id))) } } }
Schema Registry
EventCore provides a schema registry for managing types:
#![allow(unused)] fn main() { use eventcore::serialization::{SchemaRegistry, TypeInfo}; #[derive(Default)] struct MySchemaRegistry { registry: SchemaRegistry, } impl MySchemaRegistry { fn new() -> Self { let mut registry = SchemaRegistry::new(); // Register event types with versions registry.register::<UserRegisteredV1>("UserRegistered", 1); registry.register::<UserRegisteredV2>("UserRegistered", 2); registry.register::<UserRegisteredV3>("UserRegistered", 3); // Register migrations registry.add_migration::<UserRegisteredV1, UserRegisteredV2>( UserRegisteredV1ToV2 ); registry.add_migration::<UserRegisteredV2, UserRegisteredV3>( UserRegisteredV2ToV3 ); Self { registry } } fn deserialize_event(&self, event_type: &str, version: u32, data: &[u8]) -> Result<Box<dyn Any>, SerializationError> { self.registry.deserialize_and_migrate(event_type, version, data) } } }
Command Evolution
Commands evolve more freely than events: unlike events, commands are not persisted, so there is no historical data that must remain readable:
#![allow(unused)] fn main() { // Commands can change more freely #[derive(Command, Clone)] struct CreateUser { // V1 fields email: Email, // V2 additions - no historical constraint first_name: FirstName, last_name: LastName, // V3 additions initial_preferences: UserPreferences, referral_code: Option<ReferralCode>, } // Use builder pattern for backward compatibility impl CreateUser { pub fn builder() -> CreateUserBuilder { CreateUserBuilder::default() } // V1-style constructor pub fn from_email(email: Email) -> Self { Self { email, first_name: FirstName::default(), last_name: LastName::default(), initial_preferences: UserPreferences::default(), referral_code: None, } } // V2-style constructor pub fn with_name(email: Email, first_name: FirstName, last_name: LastName) -> Self { Self { email, first_name, last_name, initial_preferences: UserPreferences::default(), referral_code: None, } } } #[derive(Default)] pub struct CreateUserBuilder { email: Option<Email>, first_name: Option<FirstName>, last_name: Option<LastName>, initial_preferences: Option<UserPreferences>, referral_code: Option<ReferralCode>, } impl CreateUserBuilder { pub fn email(mut self, email: Email) -> Self { self.email = Some(email); self } pub fn name(mut self, first: FirstName, last: LastName) -> Self { self.first_name = Some(first); self.last_name = Some(last); self } pub fn preferences(mut self, prefs: UserPreferences) -> Self { self.initial_preferences = Some(prefs); self } pub fn referral_code(mut self, code: ReferralCode) -> Self { self.referral_code = Some(code); self } pub fn build(self) -> Result<CreateUser, ValidationError> { Ok(CreateUser { email: self.email.ok_or(ValidationError::MissingField("email"))?, first_name: self.first_name.unwrap_or_default(), last_name: self.last_name.unwrap_or_default(), initial_preferences: self.initial_preferences.unwrap_or_default(), referral_code: self.referral_code, }) } } }
State Evolution
State structures also need to evolve with events:
#![allow(unused)] fn main() { #[derive(Default)] struct UserState { exists: bool, email: String, // V2 fields with defaults first_name: Option<String>, last_name: Option<String>, // V3 fields profile: Option<UserProfile>, preferences: UserPreferences, } impl CommandLogic for CreateUser { type State = UserState; type Event = UserEvent; fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { UserEvent::RegisteredV1 { user_id, email, username } => { state.exists = true; state.email = email.clone(); // Legacy events don't have separate names state.first_name = None; state.last_name = None; } UserEvent::RegisteredV2 { user_id, email, first_name, last_name } => { state.exists = true; state.email = email.clone(); state.first_name = Some(first_name.clone()); state.last_name = Some(last_name.clone()); } UserEvent::RegisteredV3 { user_id, email, profile } => { state.exists = true; state.email = email.clone(); state.first_name = Some(profile.first_name.clone()); state.last_name = Some(profile.last_name.clone()); state.profile = Some(profile.clone()); } // Handle other events... } } } }
Projection Evolution
Projections need to handle schema changes too:
#![allow(unused)] fn main() { #[async_trait] impl Projection for UserListProjection { type Event = UserEvent; type Error = ProjectionError; async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> { match &event.payload { // Handle all versions of user registration UserEvent::RegisteredV1 { user_id, email, username } => { let user = UserSummary { id: user_id.clone(), email: email.clone(), display_name: username.clone(), // Use username as display name first_name: None, last_name: None, created_at: event.occurred_at, }; self.users.insert(user_id.clone(), user); } UserEvent::RegisteredV2 { user_id, email, first_name, last_name } => { let user = UserSummary { id: user_id.clone(), email: email.clone(), display_name: format!("{} {}", first_name, last_name), first_name: Some(first_name.clone()), last_name: Some(last_name.clone()), created_at: event.occurred_at, }; self.users.insert(user_id.clone(), user); } UserEvent::RegisteredV3 { user_id, email, profile } => { let user = UserSummary { id: user_id.clone(), email: email.clone(), display_name: profile.display_name(), first_name: Some(profile.first_name.clone()), last_name: Some(profile.last_name.clone()), created_at: event.occurred_at, }; self.users.insert(user_id.clone(), user); } } Ok(()) } } }
Migration Strategies
Forward-Only Evolution
The simplest approach - only add fields, never remove:
#![allow(unused)] fn main() { #[derive(Debug, Serialize, Deserialize)] struct ProductCreated { product_id: ProductId, name: String, price: Money, // V2 additions #[serde(default)] category: Option<Category>, #[serde(default)] tags: Vec<Tag>, // V3 additions #[serde(default)] metadata: ProductMetadata, #[serde(default)] variants: Vec<ProductVariant>, // V4 additions #[serde(default)] seo_info: Option<SeoInfo>, #[serde(default = "default_status")] status: ProductStatus, } fn default_status() -> ProductStatus { ProductStatus::Active } }
Event Splitting
Split large events into focused ones:
#![allow(unused)] fn main() { // V1: Monolithic event struct OrderProcessedV1 { order_id: OrderId, payment_method: PaymentMethod, payment_amount: Money, shipping_address: Address, items: Vec<OrderItem>, discount: Option<Discount>, tax_amount: Money, } // V2: Split into focused events enum OrderEventV2 { PaymentProcessed { order_id: OrderId, payment_method: PaymentMethod, amount: Money, }, ShippingAddressSet { order_id: OrderId, address: Address, }, ItemsAdded { order_id: OrderId, items: Vec<OrderItem>, }, DiscountApplied { order_id: OrderId, discount: Discount, }, TaxCalculated { order_id: OrderId, amount: Money, }, } }
Lazy Migration
Migrate events only when needed:
#![allow(unused)] fn main() { use eventcore::serialization::LazyMigration; #[derive(Clone)] struct LazyUserEvent { raw_data: Vec<u8>, version: u32, migrated: Option<UserEvent>, } impl LazyUserEvent { fn get(&mut self) -> Result<&UserEvent, MigrationError> { if self.migrated.is_none() { let migrated = match self.version { 1 => { let v1: UserRegisteredV1 = serde_json::from_slice(&self.raw_data)?; UserEvent::from_v1(v1) } 2 => { let v2: UserRegisteredV2 = serde_json::from_slice(&self.raw_data)?; UserEvent::from_v2(v2) } 3 => { serde_json::from_slice(&self.raw_data)? } _ => return Err(MigrationError::UnsupportedVersion(self.version)), }; self.migrated = Some(migrated); } Ok(self.migrated.as_ref().unwrap()) } } }
Testing Schema Evolution
Migration Tests
#![allow(unused)] fn main() { #[cfg(test)] mod migration_tests { use super::*; #[test] fn test_v1_to_v2_migration() { let v1_event = UserRegisteredV1 { user_id: "user_123".to_string(), email: "john.doe@example.com".to_string(), username: "john_doe".to_string(), }; let migration = UserRegisteredV1ToV2; let v2_event = migration.migrate(v1_event).unwrap(); assert!(v2_event.user_id.to_string().contains("123")); assert_eq!(v2_event.email, "john.doe@example.com"); assert_eq!(v2_event.first_name, "john"); assert_eq!(v2_event.last_name, "doe"); } #[test] fn test_serialization_roundtrip() { let v2_event = UserRegisteredV2 { user_id: UserId::new(), email: "test@example.com".to_string(), first_name: "Test".to_string(), last_name: "User".to_string(), }; // Serialize let json = serde_json::to_string(&v2_event).unwrap(); // Deserialize let deserialized: UserRegisteredV2 = serde_json::from_str(&json).unwrap(); assert_eq!(v2_event.user_id, deserialized.user_id); assert_eq!(v2_event.email, deserialized.email); } #[test] fn test_backward_compatibility() { // V1 JSON without new fields let v1_json = r#"{ "user_id": "550e8400-e29b-41d4-a716-446655440000", "email": "legacy@example.com" }"#; // Should deserialize into V2 with defaults let v2_event: UserRegisteredV2 = serde_json::from_str(v1_json).unwrap(); assert_eq!(v2_event.email, "legacy@example.com"); assert!(v2_event.first_name.is_empty()); // Default assert!(v2_event.last_name.is_empty()); // Default } } }
Property-Based Migration Tests
#![allow(unused)] fn main() { use proptest::prelude::*; proptest! { #[test] fn migration_preserves_core_data( user_id in any::<String>(), email in "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}", username in "[a-zA-Z0-9_]{3,20}", ) { let v1 = UserRegisteredV1 { user_id: user_id.clone(), email: email.clone(), username, }; let migration = UserRegisteredV1ToV2; let v2 = migration.migrate(v1).unwrap(); // Core data should be preserved prop_assert_eq!(v2.email, email); // User ID should be convertible prop_assert!(v2.user_id.to_string().len() > 0); } } }
Best Practices
- Plan for evolution - Design events with future changes in mind
- Use optional fields - Default to optional for new fields
- Never remove fields - Mark as deprecated instead
- Version breaking changes - Use explicit versioning for major changes
- Test migrations thoroughly - Especially edge cases
- Document schema changes - Keep a changelog
- Migrate lazily - Only when events are read
- Monitor migration performance - Large migrations can be slow
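"Test migrations thoroughly - especially edge cases" deserves emphasis: small helpers like the `split_username` function used in the V1 → V2 user migration are exactly where edge cases hide. A std-only sketch of that helper with its edge cases pinned down (the logic mirrors the `split_username` shown in the next chapter; treat it as illustrative):

```rust
/// Split a legacy username of the form "first_last" into name parts.
/// Mirrors the helper used by the V1 -> V2 user-event migration.
fn split_username(username: &str) -> (String, String) {
    let parts: Vec<&str> = username.split('_').collect();
    match parts.len() {
        1 => (parts[0].to_string(), String::new()),
        2 => (parts[0].to_string(), parts[1].to_string()),
        _ => (parts[0].to_string(), parts[1..].join("_")),
    }
}

fn main() {
    // Happy path: one underscore separates first and last name.
    assert_eq!(
        split_username("john_doe"),
        ("john".to_string(), "doe".to_string())
    );
    // No underscore: everything becomes the first name.
    assert_eq!(
        split_username("admin"),
        ("admin".to_string(), String::new())
    );
    // Extra underscores are kept in the last name.
    assert_eq!(
        split_username("mary_jane_watson"),
        ("mary".to_string(), "jane_watson".to_string())
    );
    // Empty input yields two empty names - decide explicitly whether
    // the migration should reject this case upstream.
    assert_eq!(split_username(""), (String::new(), String::new()));
}
```

Pinning these cases down in tests before running a migration over production history is far cheaper than discovering them afterwards.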
Summary
Schema evolution in EventCore:
- ✅ Backward compatible - Old events still work
- ✅ Versioned explicitly - Track breaking changes
- ✅ Migration support - Transform old formats
- ✅ Type-safe - Compile-time guarantees
- ✅ Testable - Comprehensive test support
Key patterns:
- Use serde defaults for backward compatibility
- Version events explicitly for breaking changes
- Write migration functions for complex transformations
- Test all migration paths thoroughly
- Plan for evolution from day one
Next, let’s explore Event Versioning →
Chapter 5.2: Event Versioning
Event versioning is a systematic approach to managing changes in event schemas while preserving the ability to read historical data. This chapter covers EventCore’s versioning strategies and implementation patterns.
Versioning Strategies
Semantic Versioning for Events
Apply semantic versioning principles to events:
#![allow(unused)] fn main() { use eventcore::serialization::EventVersion; #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)] struct EventSchemaVersion { major: u32, minor: u32, patch: u32, } impl EventSchemaVersion { const fn new(major: u32, minor: u32, patch: u32) -> Self { Self { major, minor, patch } } // Breaking changes const V1_0_0: Self = Self::new(1, 0, 0); const V2_0_0: Self = Self::new(2, 0, 0); // Backward compatible additions const V1_1_0: Self = Self::new(1, 1, 0); const V1_2_0: Self = Self::new(1, 2, 0); // Bug fixes/clarifications const V1_0_1: Self = Self::new(1, 0, 1); } trait VersionedEvent { const EVENT_TYPE: &'static str; const VERSION: EventSchemaVersion; fn is_compatible_with(version: &EventSchemaVersion) -> bool; } }
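The `is_compatible_with` hook above is left abstract. A common rule is "same major version, and the reader's minor version is at least the event's" - a reader may not understand fields added in a newer minor, so those require migration, while a major bump is always a break. A minimal std-only sketch of that rule (the struct mirrors `EventSchemaVersion` above; the rule itself is a conventional choice, not mandated by EventCore):

```rust
/// Mirrors the EventSchemaVersion struct defined above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct EventSchemaVersion {
    major: u32,
    minor: u32,
    patch: u32,
}

impl EventSchemaVersion {
    const fn new(major: u32, minor: u32, patch: u32) -> Self {
        Self { major, minor, patch }
    }

    /// A reader at `self` can consume an event written at `event` when the
    /// major versions match and the reader's minor is at least the event's.
    /// Patch differences never affect the wire format, so they are ignored.
    fn can_read(&self, event: &EventSchemaVersion) -> bool {
        self.major == event.major && self.minor >= event.minor
    }
}

fn main() {
    let reader = EventSchemaVersion::new(1, 2, 0);
    assert!(reader.can_read(&EventSchemaVersion::new(1, 0, 0))); // older minor: ok
    assert!(reader.can_read(&EventSchemaVersion::new(1, 2, 3))); // patch ignored
    assert!(!reader.can_read(&EventSchemaVersion::new(1, 3, 0))); // newer minor: migrate
    assert!(!reader.can_read(&EventSchemaVersion::new(2, 0, 0))); // major break
}
```

This is the same asymmetry the compatibility table later in this chapter encodes: a newer reader is usually fine, an older reader needs a migration path.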
Linear Versioning
Simpler approach with incremental versions:
#![allow(unused)] fn main() { #[derive(Debug, Serialize, Deserialize)] #[serde(tag = "version")] enum UserEvent { #[serde(rename = "1")] V1(UserEventV1), #[serde(rename = "2")] V2(UserEventV2), #[serde(rename = "3")] V3(UserEventV3), } #[derive(Debug, Serialize, Deserialize)] struct UserEventV1 { pub user_id: String, pub email: String, pub username: String, } #[derive(Debug, Serialize, Deserialize)] struct UserEventV2 { pub user_id: UserId, pub email: Email, pub first_name: String, pub last_name: String, } #[derive(Debug, Serialize, Deserialize)] struct UserEventV3 { pub user_id: UserId, pub email: Email, pub profile: UserProfile, pub preferences: UserPreferences, } }
Version-Aware Serialization
EventCore provides automatic version handling:
#![allow(unused)] fn main() { use eventcore::serialization::{VersionedSerializer, SerializationFormat}; #[derive(Clone)] struct EventSerializer { format: SerializationFormat, registry: TypeRegistry, } impl EventSerializer { fn new() -> Self { let mut registry = TypeRegistry::new(); // Register all versions registry.register_versioned::<UserEventV1>("UserEvent", 1); registry.register_versioned::<UserEventV2>("UserEvent", 2); registry.register_versioned::<UserEventV3>("UserEvent", 3); Self { format: SerializationFormat::Json, registry, } } fn serialize_event<T>(&self, event: &T) -> Result<VersionedPayload, SerializationError> where T: Serialize + VersionedEvent, { let data = self.format.serialize(event)?; Ok(VersionedPayload { event_type: T::EVENT_TYPE.to_string(), version: T::VERSION.to_string(), format: self.format, data, }) } fn deserialize_event<T>(&self, payload: &VersionedPayload) -> Result<T, SerializationError> where T: DeserializeOwned + VersionedEvent, { // Check version compatibility let payload_version = EventSchemaVersion::parse(&payload.version)?; if !T::is_compatible_with(&payload_version) { return Err(SerializationError::IncompatibleVersion { expected: T::VERSION, found: payload_version, }); } self.format.deserialize(&payload.data) } } #[derive(Debug, Clone)] struct VersionedPayload { event_type: String, version: String, format: SerializationFormat, data: Vec<u8>, } }
Migration Chains
Handle complex version transitions:
#![allow(unused)] fn main() { use eventcore::serialization::{MigrationChain, Migration}; struct UserEventMigrationChain { migrations: Vec<Box<dyn Migration<UserEvent, UserEvent>>>, } impl UserEventMigrationChain { fn new() -> Self { let migrations: Vec<Box<dyn Migration<UserEvent, UserEvent>>> = vec![ Box::new(V1ToV2Migration), Box::new(V2ToV3Migration), ]; Self { migrations } } fn migrate_to_latest(&self, event: UserEvent, from_version: u32) -> Result<UserEvent, MigrationError> { let mut current_event = event; let mut current_version = from_version; // Apply migrations in sequence while current_version < UserEvent::LATEST_VERSION { let migration = self.migrations .get((current_version - 1) as usize) .ok_or(MigrationError::NoMigrationPath { from: current_version, to: UserEvent::LATEST_VERSION })?; current_event = migration.migrate(current_event)?; current_version += 1; } Ok(current_event) } } struct V1ToV2Migration; impl Migration<UserEvent, UserEvent> for V1ToV2Migration { fn migrate(&self, event: UserEvent) -> Result<UserEvent, MigrationError> { match event { UserEvent::V1(v1) => { // Convert V1 to V2 let user_id = UserId::try_from(v1.user_id) .map_err(|e| MigrationError::ConversionFailed(e.to_string()))?; let email = Email::try_from(v1.email) .map_err(|e| MigrationError::ConversionFailed(e.to_string()))?; // Extract names from username let (first_name, last_name) = split_username(&v1.username); Ok(UserEvent::V2(UserEventV2 { user_id, email, first_name, last_name, })) } other => Ok(other), // Already V2 or later } } } fn split_username(username: &str) -> (String, String) { let parts: Vec<&str> = username.split('_').collect(); match parts.len() { 1 => (parts[0].to_string(), String::new()), 2 => (parts[0].to_string(), parts[1].to_string()), _ => (parts[0].to_string(), parts[1..].join("_")), } } }
Event Store Integration
Integrate versioning with the event store:
#![allow(unused)] fn main() { #[async_trait] impl EventStore for VersionedEventStore { type Event = VersionedEvent; type Error = EventStoreError; async fn write_events( &self, events: Vec<EventToWrite<Self::Event>>, ) -> Result<WriteResult, Self::Error> { let versioned_events: Result<Vec<_>, _> = events .into_iter() .map(|event| { let payload = self.serializer.serialize_event(&event.payload)?; Ok(EventToWrite { stream_id: event.stream_id, payload, metadata: event.metadata, expected_version: event.expected_version, }) }) .collect(); self.inner.write_events(versioned_events?).await } async fn read_stream( &self, stream_id: &StreamId, options: ReadOptions, ) -> Result<StreamEvents<Self::Event>, Self::Error> { let raw_events = self.inner.read_stream(stream_id, options).await?; let events: Result<Vec<_>, _> = raw_events .events .into_iter() .map(|event| { let payload = self.serializer.deserialize_event(&event.payload)?; Ok(StoredEvent { id: event.id, stream_id: event.stream_id, version: event.version, payload, metadata: event.metadata, occurred_at: event.occurred_at, }) }) .collect(); Ok(StreamEvents { stream_id: raw_events.stream_id, version: raw_events.version, events: events?, }) } } }
Version-Aware Projections
Projections that handle multiple event versions:
#![allow(unused)] fn main() { #[async_trait] impl Projection for UserProjection { type Event = VersionedEvent; type Error = ProjectionError; async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> { match &event.payload { VersionedEvent::User(user_event) => { self.apply_user_event(user_event, event.occurred_at).await?; } _ => {} // Ignore other event types } Ok(()) } } impl UserProjection { async fn apply_user_event( &mut self, event: &UserEvent, occurred_at: DateTime<Utc> ) -> Result<(), ProjectionError> { match event { UserEvent::V1(v1) => { // Handle V1 events let user = User { id: UserId::try_from(v1.user_id.clone())?, email: v1.email.clone(), display_name: v1.username.clone(), first_name: None, last_name: None, profile: None, preferences: UserPreferences::default(), created_at: occurred_at, updated_at: occurred_at, }; self.users.insert(user.id.clone(), user); } UserEvent::V2(v2) => { // Handle V2 events let user = User { id: v2.user_id.clone(), email: v2.email.to_string(), display_name: format!("{} {}", v2.first_name, v2.last_name), first_name: Some(v2.first_name.clone()), last_name: Some(v2.last_name.clone()), profile: None, preferences: UserPreferences::default(), created_at: occurred_at, updated_at: occurred_at, }; self.users.insert(user.id.clone(), user); } UserEvent::V3(v3) => { // Handle V3 events let user = User { id: v3.user_id.clone(), email: v3.email.to_string(), display_name: v3.profile.display_name(), first_name: Some(v3.profile.first_name.clone()), last_name: Some(v3.profile.last_name.clone()), profile: Some(v3.profile.clone()), preferences: v3.preferences.clone(), created_at: occurred_at, updated_at: occurred_at, }; self.users.insert(user.id.clone(), user); } } Ok(()) } } }
Version Compatibility Rules
Define clear compatibility rules:
#![allow(unused)] fn main() { #[derive(Debug, Clone, PartialEq)] enum CompatibilityLevel { FullyCompatible, // Can read/write without issues ReadOnly, // Can read but not write RequiresMigration, // Need migration to use Incompatible, // Cannot use } trait VersionCompatibility { fn check_compatibility(reader_version: &str, event_version: &str) -> CompatibilityLevel; } struct UserEventCompatibility; impl VersionCompatibility for UserEventCompatibility { fn check_compatibility(reader_version: &str, event_version: &str) -> CompatibilityLevel { use CompatibilityLevel::*; match (reader_version, event_version) { // Same version - fully compatible (r, e) if r == e => FullyCompatible, // Reader newer than event - usually compatible ("2", "1") | ("3", "1") | ("3", "2") => FullyCompatible, // Reader older than event - may need migration ("1", "2") | ("1", "3") | ("2", "3") => RequiresMigration, // Special compatibility rules ("1.1", "1.0") => FullyCompatible, // Minor versions compatible _ => Incompatible, } } } // Usage in deserialization fn deserialize_with_compatibility_check<T>( payload: &VersionedPayload, reader_version: &str, ) -> Result<T, SerializationError> where T: DeserializeOwned + VersionCompatibility, { let compatibility = T::check_compatibility(reader_version, &payload.version); match compatibility { CompatibilityLevel::FullyCompatible => { // Direct deserialization serde_json::from_slice(&payload.data) .map_err(SerializationError::Deserialization) } CompatibilityLevel::ReadOnly => { // Deserialize but mark as read-only let mut event: T = serde_json::from_slice(&payload.data)?; // Mark event as read-only somehow Ok(event) } CompatibilityLevel::RequiresMigration => { // Apply migration let migrated = migrate_to_version(&payload.data, &payload.version, reader_version)?; serde_json::from_slice(&migrated) .map_err(SerializationError::Deserialization) } CompatibilityLevel::Incompatible => { Err(SerializationError::IncompatibleVersion { reader: 
reader_version.to_string(), event: payload.version.clone(), }) } } } }
Event Archival and Compression
Handle old event versions efficiently:
#![allow(unused)] fn main() { use eventcore::archival::{EventArchiver, CompressionLevel}; struct VersionedEventArchiver { archiver: EventArchiver, retention_policy: RetentionPolicy, } #[derive(Debug, Clone)] struct RetentionPolicy { pub keep_latest_versions: u32, pub archive_after_days: u32, pub compress_after_days: u32, pub delete_after_years: u32, } impl VersionedEventArchiver { async fn archive_old_versions(&self, stream_id: &StreamId) -> Result<ArchiveResult, ArchiveError> { let events = self.read_all_events(stream_id).await?; let mut archive_stats = ArchiveResult::default(); for event in events { let age_days = (Utc::now() - event.occurred_at).num_days() as u32; match event.payload.version() { v if v < (CURRENT_VERSION - self.retention_policy.keep_latest_versions) => { if age_days > self.retention_policy.delete_after_years * 365 { // Delete very old events self.archiver.delete_event(&event.id).await?; archive_stats.deleted += 1; } else if age_days > self.retention_policy.compress_after_days { // Compress old events self.archiver.compress_event(&event.id, CompressionLevel::High).await?; archive_stats.compressed += 1; } else if age_days > self.retention_policy.archive_after_days { // Move to cold storage self.archiver.archive_event(&event.id).await?; archive_stats.archived += 1; } } _ => { // Keep recent versions in hot storage archive_stats.retained += 1; } } } Ok(archive_stats) } } #[derive(Debug, Default)] struct ArchiveResult { pub retained: u32, pub archived: u32, pub compressed: u32, pub deleted: u32, } }
Version Monitoring
Monitor version usage in production:
#![allow(unused)] fn main() { use prometheus::{CounterVec, Histogram, IntGauge}; lazy_static! { // A plain Counter has no labels; with_label_values below requires a CounterVec. static ref EVENT_VERSION_COUNTER: CounterVec = register_counter_vec!( "eventcore_event_versions_total", "Total events by version", &["event_type", "version"] ).unwrap(); static ref MIGRATION_DURATION: Histogram = register_histogram!( "eventcore_migration_duration_seconds", "Time spent migrating events" ).unwrap(); static ref ACTIVE_VERSIONS: IntGauge = register_int_gauge!( "eventcore_active_event_versions", "Number of active event versions" ).unwrap(); } struct VersionMetrics { version_counts: HashMap<String, u64>, migration_stats: HashMap<(String, String), MigrationStats>, } #[derive(Debug, Default)] struct MigrationStats { pub total_migrations: u64, pub successful_migrations: u64, pub failed_migrations: u64, pub average_duration: Duration, } impl VersionMetrics { fn record_event_version(&mut self, event_type: &str, version: &str) { *self.version_counts .entry(format!("{}:{}", event_type, version)) .or_insert(0) += 1; EVENT_VERSION_COUNTER .with_label_values(&[event_type, version]) .inc(); } fn record_migration(&mut self, from: &str, to: &str, duration: Duration, success: bool) { let key = (from.to_string(), to.to_string()); let stats = self.migration_stats.entry(key).or_default(); stats.total_migrations += 1; if success { stats.successful_migrations += 1; } else { stats.failed_migrations += 1; } // Update average duration incrementally stats.average_duration = (stats.average_duration * (stats.total_migrations - 1) as u32 + duration) / stats.total_migrations as u32; MIGRATION_DURATION.observe(duration.as_secs_f64()); } fn update_active_versions(&self) { let active_count = self.version_counts .keys() .map(|key| key.split(':').nth(1).unwrap_or("unknown")) .collect::<HashSet<_>>() .len(); ACTIVE_VERSIONS.set(active_count as i64); } } }
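The incremental average used by `record_migration` avoids storing every observation: each new sample is folded in with `avg_n = (avg_{n-1}·(n-1) + d_n) / n`. A minimal std-only sketch of that rule (the `update_average` name is introduced here for illustration):

```rust
use std::time::Duration;

/// Incremental mean: fold a new sample into a running average without
/// keeping the full history. `count_after` is the sample count including
/// the new sample.
fn update_average(avg: Duration, count_after: u32, sample: Duration) -> Duration {
    (avg * (count_after - 1) + sample) / count_after
}

fn main() {
    let mut avg = Duration::ZERO;
    for (i, secs) in [2u64, 4, 6].into_iter().enumerate() {
        avg = update_average(avg, (i + 1) as u32, Duration::from_secs(secs));
    }
    // Mean of 2s, 4s, 6s is 4s.
    println!("{:?}", avg);
}
```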
Testing Event Versions
Comprehensive testing for versioned events:
#![allow(unused)] fn main() { #[cfg(test)] mod version_tests { use super::*; use proptest::prelude::*; #[test] fn test_version_serialization_roundtrip() { let v3_event = UserEventV3 { user_id: UserId::new(), email: Email::try_new("test@example.com").unwrap(), profile: UserProfile { first_name: "Test".to_string(), last_name: "User".to_string(), bio: Some("Test bio".to_string()), avatar_url: None, }, preferences: UserPreferences::default(), }; let serializer = EventSerializer::new(); // Serialize let payload = serializer.serialize_event(&v3_event).unwrap(); assert_eq!(payload.version, "3"); // Deserialize let deserialized: UserEventV3 = serializer.deserialize_event(&payload).unwrap(); assert_eq!(v3_event.user_id, deserialized.user_id); assert_eq!(v3_event.email, deserialized.email); } #[test] fn test_migration_chain() { let v1_event = UserEvent::V1(UserEventV1 { user_id: "user_123".to_string(), email: "test@example.com".to_string(), username: "test_user".to_string(), }); let migration_chain = UserEventMigrationChain::new(); let v3_event = migration_chain.migrate_to_latest(v1_event, 1).unwrap(); match v3_event { UserEvent::V3(v3) => { assert_eq!(v3.email.to_string(), "test@example.com"); assert_eq!(v3.profile.first_name, "test"); assert_eq!(v3.profile.last_name, "user"); } _ => panic!("Expected V3 event after migration"), } } proptest! 
{ #[test] fn version_compatibility_is_transitive( v1 in 1u32..10, v2 in 1u32..10, v3 in 1u32..10, ) { let mut versions = [v1, v2, v3]; versions.sort(); let [min_v, mid_v, max_v] = versions; // If min compatible with mid, and mid compatible with max, // then migration chain should work if UserEventCompatibility::check_compatibility( &mid_v.to_string(), &min_v.to_string() ) != CompatibilityLevel::Incompatible && UserEventCompatibility::check_compatibility( &max_v.to_string(), &mid_v.to_string() ) != CompatibilityLevel::Incompatible { // Migration from min to max should be possible prop_assert!(can_migrate_between_versions(min_v, max_v)); } } } fn can_migrate_between_versions(from: u32, to: u32) -> bool { // Implementation depends on your migration chain to >= from && (to - from) <= MAX_MIGRATION_DISTANCE } } }
Best Practices
- Version everything explicitly - Don’t rely on implicit versioning
- Plan migration paths - Design how old versions become new ones
- Test all paths - Test reading old events with new code
- Monitor version usage - Track which versions are in production
- Clean up old versions - Archive or delete very old events
- Document changes - Keep detailed changelogs
- Gradual rollouts - Deploy new versions incrementally
- Backward compatibility - Maintain as long as practical
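As a concrete illustration of "plan migration paths", a linear chain can be exercised with plain Rust: each step upgrades an event exactly one version, and migrating to latest applies the steps in order. This is a std-only sketch with hypothetical event shapes, not EventCore's API:

```rust
/// Hypothetical two-version event for illustration; a real chain would
/// carry one migration step per version gap.
#[derive(Debug, PartialEq)]
enum UserEvent {
    V1 { username: String },
    V2 { first_name: String, last_name: String },
}

fn migrate_v1_to_v2(event: UserEvent) -> UserEvent {
    match event {
        UserEvent::V1 { username } => {
            // Split "first_last" style usernames into name parts.
            let mut parts = username.splitn(2, '_');
            let first = parts.next().unwrap_or("").to_string();
            let last = parts.next().unwrap_or("").to_string();
            UserEvent::V2 { first_name: first, last_name: last }
        }
        other => other, // already at the target version
    }
}

fn migrate_to_latest(event: UserEvent) -> UserEvent {
    // With more versions this would iterate over an ordered list of steps.
    migrate_v1_to_v2(event)
}

fn main() {
    let migrated = migrate_to_latest(UserEvent::V1 { username: "test_user".into() });
    println!("{:?}", migrated);
}
```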
Summary
Event versioning in EventCore:
- ✅ Explicit versioning - Clear version tracking
- ✅ Migration support - Transform between versions
- ✅ Compatibility checking - Know what works together
- ✅ Performance monitoring - Track version usage
- ✅ Testing support - Comprehensive test patterns
Key patterns:
- Use semantic or linear versioning consistently
- Define clear compatibility rules
- Implement migration chains for complex changes
- Monitor version usage in production
- Test all migration paths thoroughly
Next, let’s explore Long-Running Processes →
Chapter 5.3: Long-Running Processes
Long-running processes, also known as sagas or process managers, coordinate complex business workflows that span multiple commands and may take significant time to complete. EventCore provides patterns for implementing these reliably.
What Are Long-Running Processes?
Long-running processes are stateful workflows that:
- React to events
- Execute commands
- Maintain state across time
- Handle failures and compensations
- May run for days, weeks, or months
Examples include:
- Order fulfillment workflows
- User onboarding sequences
- Financial transaction processing
- Document approval chains
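At its core, a process manager is a state machine: given the current step and an observed trigger, it computes the next step and ignores anything out of order. A minimal std-only sketch of that transition logic, using hypothetical step and trigger names:

```rust
#[derive(Debug, Clone, PartialEq)]
enum Step {
    PaymentPending,
    PaymentConfirmed,
    InventoryReserved,
    Shipped,
    Completed,
    Failed(String),
}

#[derive(Debug)]
enum Trigger {
    PaymentConfirmed,
    InventoryReserved,
    Shipped,
    Delivered,
    TimedOut,
}

/// Pure transition function: valid triggers advance the workflow,
/// timeouts fail it, and anything else leaves the state unchanged.
fn advance(current: Step, trigger: Trigger) -> Step {
    match (current, trigger) {
        (Step::PaymentPending, Trigger::PaymentConfirmed) => Step::PaymentConfirmed,
        (Step::PaymentConfirmed, Trigger::InventoryReserved) => Step::InventoryReserved,
        (Step::InventoryReserved, Trigger::Shipped) => Step::Shipped,
        (Step::Shipped, Trigger::Delivered) => Step::Completed,
        (_, Trigger::TimedOut) => Step::Failed("timed out".to_string()),
        (state, _) => state, // ignore out-of-order triggers
    }
}

fn main() {
    let mut step = Step::PaymentPending;
    for t in [Trigger::PaymentConfirmed, Trigger::InventoryReserved, Trigger::Shipped, Trigger::Delivered] {
        step = advance(step, t);
    }
    println!("{:?}", step);
}
```

Keeping transitions in a pure function like this makes the workflow trivially unit-testable, independent of the event store.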
Process Manager Pattern
EventCore implements the process manager pattern:
#![allow(unused)] fn main() { use eventcore::process::{ProcessManager, ProcessState, ProcessResult}; #[derive(Command, Clone)] struct OrderFulfillmentProcess { #[stream] process_id: StreamId, #[stream] order_id: StreamId, current_step: FulfillmentStep, timeout_at: Option<DateTime<Utc>>, } #[derive(Debug, Clone, PartialEq)] enum FulfillmentStep { PaymentPending, PaymentConfirmed, InventoryReserved, Shipped, Delivered, Completed, Failed(String), } #[derive(Default)] struct OrderFulfillmentState { order_id: Option<OrderId>, current_step: FulfillmentStep, payment_confirmed: bool, inventory_reserved: bool, shipping_info: Option<ShippingInfo>, timeout_at: Option<DateTime<Utc>>, retry_count: u32, created_at: DateTime<Utc>, } impl CommandLogic for OrderFulfillmentProcess { type State = OrderFulfillmentState; type Event = ProcessEvent; fn apply(&self, state: &mut Self::State, event: &StoredEvent<Self::Event>) { match &event.payload { ProcessEvent::Started { order_id, timeout_at } => { state.order_id = Some(*order_id); state.current_step = FulfillmentStep::PaymentPending; state.timeout_at = *timeout_at; state.created_at = event.occurred_at; } ProcessEvent::StepCompleted { step } => { state.current_step = step.clone(); } ProcessEvent::PaymentConfirmed => { state.payment_confirmed = true; } ProcessEvent::InventoryReserved => { state.inventory_reserved = true; } ProcessEvent::ShippingInfoUpdated { info } => { state.shipping_info = Some(info.clone()); } ProcessEvent::Failed { reason } => { state.current_step = FulfillmentStep::Failed(reason.clone()); } ProcessEvent::RetryAttempted => { state.retry_count += 1; } } } async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Check for timeout if let Some(timeout) = state.timeout_at { if Utc::now() > timeout { return Ok(vec![ StreamWrite::new( &read_streams, self.process_id.clone(), 
ProcessEvent::Failed { reason: "Process timed out".to_string(), } )? ]); } } // Execute current step match state.current_step { FulfillmentStep::PaymentPending => { self.handle_payment_step(&read_streams, &state).await } FulfillmentStep::PaymentConfirmed => { self.handle_inventory_step(&read_streams, &state).await } FulfillmentStep::InventoryReserved => { self.handle_shipping_step(&read_streams, &state).await } FulfillmentStep::Shipped => { self.handle_delivery_step(&read_streams, &state).await } FulfillmentStep::Delivered => { self.handle_completion_step(&read_streams, &state).await } FulfillmentStep::Completed | FulfillmentStep::Failed(_) => { // Process finished - no more events Ok(vec![]) } } } } impl OrderFulfillmentProcess { async fn handle_payment_step( &self, read_streams: &ReadStreams<OrderFulfillmentProcessStreamSet>, state: &OrderFulfillmentState, ) -> CommandResult<Vec<StreamWrite<OrderFulfillmentProcessStreamSet, ProcessEvent>>> { if !state.payment_confirmed { // Check if payment was confirmed by external event // This would typically listen to payment events Ok(vec![]) } else { // Move to next step Ok(vec![ StreamWrite::new( read_streams, self.process_id.clone(), ProcessEvent::StepCompleted { step: FulfillmentStep::PaymentConfirmed, } )? ]) } } async fn handle_inventory_step( &self, read_streams: &ReadStreams<OrderFulfillmentProcessStreamSet>, state: &OrderFulfillmentState, ) -> CommandResult<Vec<StreamWrite<OrderFulfillmentProcessStreamSet, ProcessEvent>>> { if !state.inventory_reserved { // Reserve inventory Ok(vec![ StreamWrite::new( read_streams, self.process_id.clone(), ProcessEvent::InventoryReserved, )? ]) } else { // Move to shipping Ok(vec![ StreamWrite::new( read_streams, self.process_id.clone(), ProcessEvent::StepCompleted { step: FulfillmentStep::InventoryReserved, } )? ]) } } // Similar implementations for other steps... } }
Event-Driven Process Coordination
Processes react to events from other parts of the system:
#![allow(unused)] fn main() { #[async_trait] impl EventHandler<SystemEvent> for OrderFulfillmentProcess { async fn handle_event( &self, event: &StoredEvent<SystemEvent>, executor: &CommandExecutor, ) -> Result<(), ProcessError> { match &event.payload { SystemEvent::Payment(PaymentEvent::Confirmed { order_id, .. }) => { // Payment confirmed - advance process let process_command = AdvanceOrderProcess { process_id: derive_process_id(order_id), trigger: ProcessTrigger::PaymentConfirmed, }; executor.execute(&process_command).await?; } SystemEvent::Inventory(InventoryEvent::Reserved { order_id, .. }) => { let process_command = AdvanceOrderProcess { process_id: derive_process_id(order_id), trigger: ProcessTrigger::InventoryReserved, }; executor.execute(&process_command).await?; } SystemEvent::Shipping(ShippingEvent::Dispatched { order_id, tracking, .. }) => { let process_command = AdvanceOrderProcess { process_id: derive_process_id(order_id), trigger: ProcessTrigger::Shipped { tracking_number: tracking.clone() }, }; executor.execute(&process_command).await?; } _ => {} // Ignore other events } Ok(()) } } #[derive(Command, Clone)] struct AdvanceOrderProcess { #[stream] process_id: StreamId, trigger: ProcessTrigger, } #[derive(Debug, Clone)] enum ProcessTrigger { PaymentConfirmed, InventoryReserved, Shipped { tracking_number: String }, Delivered, Failed { reason: String }, } }
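The handler above relies on a `derive_process_id` helper that is not shown. The essential property is determinism: the same order id must always map to the same process stream, so redelivered or reordered events route to the same process instance. One workable (hypothetical) scheme, with plain strings standing in for EventCore's `StreamId` newtype:

```rust
/// Deterministic mapping from an order id to its process stream name.
/// Any stable scheme works; a prefixed format string is the simplest.
fn derive_process_id(order_id: &str) -> String {
    format!("order-fulfillment-{}", order_id)
}

fn main() {
    // Determinism: the same order id always yields the same process stream.
    assert_eq!(derive_process_id("order-42"), derive_process_id("order-42"));
    println!("{}", derive_process_id("order-42"));
}
```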
Saga Pattern Implementation
For distributed transactions, implement the saga pattern:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct BookingSaga { #[stream] saga_id: StreamId, #[stream] reservation_id: StreamId, steps: Vec<SagaStep>, current_step: usize, compensation_mode: bool, } #[derive(Debug, Clone)] struct SagaStep { name: String, command: Box<dyn SerializableCommand>, compensation: Box<dyn SerializableCommand>, status: StepStatus, } #[derive(Debug, Clone, PartialEq)] enum StepStatus { Pending, Completed, Failed, Compensated, } impl CommandLogic for BookingSaga { type State = SagaState; type Event = SagaEvent; async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { if state.compensation_mode { self.handle_compensation(&read_streams, &state).await } else { self.handle_forward_execution(&read_streams, &state).await } } } impl BookingSaga { async fn handle_forward_execution( &self, read_streams: &ReadStreams<BookingSagaStreamSet>, state: &SagaState, ) -> CommandResult<Vec<StreamWrite<BookingSagaStreamSet, SagaEvent>>> { if state.current_step >= state.steps.len() { // All steps completed return Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::Completed, )? ]); } let current_step = &state.steps[state.current_step]; match current_step.status { StepStatus::Pending => { // Execute current step Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::StepStarted { step_index: state.current_step, step_name: current_step.name.clone(), } )? ]) } StepStatus::Completed => { // Move to next step Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::StepAdvanced { next_step: state.current_step + 1, } )? ]) } StepStatus::Failed => { // Start compensation Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::CompensationStarted { failed_step: state.current_step, } )?
]) } StepStatus::Compensated => unreachable!("Cannot be compensated in forward mode"), } } async fn handle_compensation( &self, read_streams: &ReadStreams<BookingSagaStreamSet>, state: &SagaState, ) -> CommandResult<Vec<StreamWrite<BookingSagaStreamSet, SagaEvent>>> { // Compensate completed steps in reverse order let compensation_step = state.steps .iter() .rposition(|step| step.status == StepStatus::Completed); match compensation_step { Some(index) => { Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::CompensationStepStarted { step_index: index, step_name: state.steps[index].name.clone(), } )? ]) } None => { // All compensations completed Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::CompensationCompleted, )? ]) } } } } // Example saga for hotel + flight + car booking fn create_travel_booking_saga( hotel_booking: BookHotelCommand, flight_booking: BookFlightCommand, car_booking: BookCarCommand, ) -> BookingSaga { let steps = vec![ SagaStep { name: "book_hotel".to_string(), command: Box::new(hotel_booking.clone()), compensation: Box::new(CancelHotelCommand { booking_id: hotel_booking.booking_id, }), status: StepStatus::Pending, }, SagaStep { name: "book_flight".to_string(), command: Box::new(flight_booking.clone()), compensation: Box::new(CancelFlightCommand { booking_id: flight_booking.booking_id, }), status: StepStatus::Pending, }, SagaStep { name: "book_car".to_string(), command: Box::new(car_booking.clone()), compensation: Box::new(CancelCarCommand { booking_id: car_booking.booking_id, }), status: StepStatus::Pending, }, ]; BookingSaga { saga_id: StreamId::from(format!("booking-saga-{}", SagaId::new())), reservation_id: StreamId::from(format!("reservation-{}", ReservationId::new())), steps, current_step: 0, compensation_mode: false, } } }
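The compensation logic above hinges on one selection rule: undo completed steps in reverse completion order, which `rposition` expresses directly. That rule can be exercised in isolation with a std-only sketch:

```rust
#[derive(Debug, Clone, PartialEq)]
enum StepStatus {
    Completed,
    Failed,
    Compensated,
}

/// Index of the most recently completed step that still needs a
/// compensating action, or None when compensation is finished.
fn next_compensation(steps: &[StepStatus]) -> Option<usize> {
    steps.iter().rposition(|s| *s == StepStatus::Completed)
}

fn main() {
    // Steps 0 and 1 completed, step 2 failed: compensate 1 first, then 0.
    let mut steps = vec![StepStatus::Completed, StepStatus::Completed, StepStatus::Failed];
    while let Some(i) = next_compensation(&steps) {
        steps[i] = StepStatus::Compensated;
        println!("compensated step {}", i);
    }
}
```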
Timeout and Retry Handling
Long-running processes need robust timeout and retry logic:
#![allow(unused)] fn main() { #[derive(Debug, Clone)] struct ProcessTimeout { timeout_at: DateTime<Utc>, retry_policy: RetryPolicy, max_retries: u32, current_retries: u32, } #[derive(Debug, Clone)] enum RetryPolicy { FixedDelay { delay: Duration }, ExponentialBackoff { base_delay: Duration, max_delay: Duration }, LinearBackoff { initial_delay: Duration, increment: Duration }, } impl ProcessTimeout { fn should_retry(&self) -> bool { self.current_retries < self.max_retries } fn next_retry_delay(&self) -> Duration { match &self.retry_policy { RetryPolicy::FixedDelay { delay } => *delay, RetryPolicy::ExponentialBackoff { base_delay, max_delay } => { let delay = *base_delay * 2_u32.pow(self.current_retries); std::cmp::min(delay, *max_delay) } RetryPolicy::LinearBackoff { initial_delay, increment } => { *initial_delay + (*increment * self.current_retries) } } } fn next_timeout(&self) -> DateTime<Utc> { Utc::now() + self.next_retry_delay() } } // Timeout scheduler for processes #[async_trait] trait ProcessTimeoutScheduler { async fn schedule_timeout( &self, process_id: StreamId, timeout_at: DateTime<Utc>, ) -> Result<(), TimeoutError>; async fn cancel_timeout( &self, process_id: StreamId, ) -> Result<(), TimeoutError>; } struct InMemoryTimeoutScheduler { timeouts: Arc<RwLock<BTreeMap<DateTime<Utc>, Vec<StreamId>>>>, executor: CommandExecutor, } impl InMemoryTimeoutScheduler { async fn run_timeout_checker(&self) { let mut interval = tokio::time::interval(Duration::from_secs(10)); loop { interval.tick().await; self.check_timeouts().await; } } async fn check_timeouts(&self) { let now = Utc::now(); let mut timeouts = self.timeouts.write().await; // Find expired timeouts let expired: Vec<_> = timeouts .range(..=now) .flat_map(|(_, process_ids)| process_ids.clone()) .collect(); // Remove expired timeouts timeouts.retain(|&timeout_time, _| timeout_time > now); // Trigger timeout commands for process_id in expired { let timeout_command = ProcessTimeoutCommand { process_id, 
timed_out_at: now, }; if let Err(e) = self.executor.execute(&timeout_command).await { tracing::error!("Failed to execute timeout command: {}", e); } } } } #[derive(Command, Clone)] struct ProcessTimeoutCommand { #[stream] process_id: StreamId, timed_out_at: DateTime<Utc>, } impl CommandLogic for ProcessTimeoutCommand { type State = ProcessState; type Event = ProcessEvent; async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { // Check if process should retry or fail let should_retry = state.timeout.as_ref() .map(|t| t.should_retry()) .unwrap_or(false); if should_retry { let next_timeout = state.timeout.as_ref().unwrap().next_timeout(); Ok(vec![ StreamWrite::new( &read_streams, self.process_id.clone(), ProcessEvent::RetryScheduled { retry_at: next_timeout, attempt: state.timeout.as_ref().unwrap().current_retries + 1, } )? ]) } else { Ok(vec![ StreamWrite::new( &read_streams, self.process_id.clone(), ProcessEvent::Failed { reason: "Process timed out after maximum retries".to_string(), } )? ]) } } } }
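The `RetryPolicy` variants above reduce to simple delay formulas, and computing a few values makes the growth curves concrete. A std-only sketch of the same arithmetic (`attempt` counts completed retries, starting at 0):

```rust
use std::time::Duration;

/// Exponential backoff: base * 2^attempt, capped at a maximum delay.
fn exponential(base: Duration, max: Duration, attempt: u32) -> Duration {
    std::cmp::min(base * 2u32.pow(attempt), max)
}

/// Linear backoff: initial delay plus a fixed increment per retry.
fn linear(initial: Duration, increment: Duration, attempt: u32) -> Duration {
    initial + increment * attempt
}

fn main() {
    let base = Duration::from_secs(1);
    let cap = Duration::from_secs(30);
    // 1s, 2s, 4s, 8s, 16s, then capped at 30s.
    for attempt in 0..6 {
        println!("exp attempt {}: {:?}", attempt, exponential(base, cap, attempt));
    }
    // 5s, 7s, 9s with a 5s start and 2s increment.
    for attempt in 0..3 {
        println!("lin attempt {}: {:?}", attempt, linear(Duration::from_secs(5), Duration::from_secs(2), attempt));
    }
}
```

The cap matters: without it, ten retries at a 1s base would already wait over 17 minutes on the final attempt.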
Process Monitoring and Observability
Monitor long-running processes in production:
#![allow(unused)] fn main() { use prometheus::{Counter, Histogram, Gauge}; lazy_static! { static ref PROCESS_STARTED: Counter = register_counter!( "eventcore_processes_started_total", "Total number of processes started" ).unwrap(); static ref PROCESS_COMPLETED: Counter = register_counter!( "eventcore_processes_completed_total", "Total number of processes completed" ).unwrap(); static ref PROCESS_FAILED: Counter = register_counter!( "eventcore_processes_failed_total", "Total number of processes failed" ).unwrap(); static ref PROCESS_DURATION: Histogram = register_histogram!( "eventcore_process_duration_seconds", "Process execution duration" ).unwrap(); static ref ACTIVE_PROCESSES: Gauge = register_gauge!( "eventcore_active_processes", "Number of currently active processes" ).unwrap(); } #[derive(Clone)] struct ProcessMetrics { process_counts: HashMap<String, ProcessCounts>, active_processes: HashSet<StreamId>, } #[derive(Debug, Default)] struct ProcessCounts { started: u64, completed: u64, failed: u64, average_duration: Duration, } impl ProcessMetrics { fn record_process_started(&mut self, process_type: &str, process_id: StreamId) { PROCESS_STARTED.with_label_values(&[process_type]).inc(); self.process_counts .entry(process_type.to_string()) .or_default() .started += 1; self.active_processes.insert(process_id); ACTIVE_PROCESSES.set(self.active_processes.len() as f64); } fn record_process_completed( &mut self, process_type: &str, process_id: StreamId, duration: Duration ) { PROCESS_COMPLETED.with_label_values(&[process_type]).inc(); PROCESS_DURATION.observe(duration.as_secs_f64()); let counts = self.process_counts .entry(process_type.to_string()) .or_default(); counts.completed += 1; // Update average duration let total_completed = counts.completed; counts.average_duration = (counts.average_duration * (total_completed - 1) as u32 + duration) / total_completed as u32; self.active_processes.remove(&process_id); ACTIVE_PROCESSES.set(self.active_processes.len() as f64); 
} fn record_process_failed(&mut self, process_type: &str, process_id: StreamId) { PROCESS_FAILED.with_label_values(&[process_type]).inc(); self.process_counts .entry(process_type.to_string()) .or_default() .failed += 1; self.active_processes.remove(&process_id); ACTIVE_PROCESSES.set(self.active_processes.len() as f64); } } // Process health monitoring #[derive(Debug)] struct ProcessHealthCheck { max_process_age: Duration, max_retry_count: u32, warning_thresholds: HealthThresholds, } #[derive(Debug)] struct HealthThresholds { failure_rate: f64, // 0.0-1.0 average_duration: Duration, stuck_process_age: Duration, } impl ProcessHealthCheck { async fn check_process_health(&self, metrics: &ProcessMetrics) -> HealthStatus { let mut issues = Vec::new(); for (process_type, counts) in &metrics.process_counts { // Check failure rate let total = counts.started; if total > 0 { let failure_rate = counts.failed as f64 / total as f64; if failure_rate > self.warning_thresholds.failure_rate { issues.push(format!( "High failure rate for {}: {:.1}%", process_type, failure_rate * 100.0 )); } } // Check average duration if counts.average_duration > self.warning_thresholds.average_duration { issues.push(format!( "Slow processes for {}: {:?}", process_type, counts.average_duration )); } } // Check for stuck processes let stuck_count = self.count_stuck_processes(&metrics.active_processes).await; if stuck_count > 0 { issues.push(format!("{} processes appear stuck", stuck_count)); } if issues.is_empty() { HealthStatus::Healthy } else { HealthStatus::Warning { issues } } } async fn count_stuck_processes(&self, active_processes: &HashSet<StreamId>) -> usize { // This would query the event store to check process ages // Implementation depends on your monitoring setup 0 } } #[derive(Debug)] enum HealthStatus { Healthy, Warning { issues: Vec<String> }, Critical { issues: Vec<String> }, } }
Testing Long-Running Processes
Test processes thoroughly:
#![allow(unused)] fn main() { #[cfg(test)] mod process_tests { use super::*; use eventcore::testing::prelude::*; #[tokio::test] async fn test_order_fulfillment_happy_path() { let store = InMemoryEventStore::new(); let executor = CommandExecutor::new(store); let order_id = OrderId::new(); let process = OrderFulfillmentProcess::start(order_id).unwrap(); // Start process executor.execute(&process).await.unwrap(); // Simulate payment confirmation let payment_event = PaymentConfirmed { order_id, amount: Money::from_cents(1000), }; // Process should advance let advance_command = AdvanceOrderProcess { process_id: process.process_id, trigger: ProcessTrigger::PaymentConfirmed, }; executor.execute(&advance_command).await.unwrap(); // Continue with inventory, shipping, etc. // Verify process reaches completion } #[tokio::test] async fn test_process_timeout_and_retry() { let store = InMemoryEventStore::new(); let executor = CommandExecutor::new(store); let scheduler = InMemoryTimeoutScheduler::new(executor.clone()); let order_id = OrderId::new(); let mut process = OrderFulfillmentProcess::start(order_id).unwrap(); process.timeout_at = Some(Utc::now() + Duration::from_secs(1)); // Start process executor.execute(&process).await.unwrap(); // Wait for timeout tokio::time::sleep(Duration::from_secs(2)).await; // Verify timeout was triggered // Check retry logic works } #[tokio::test] async fn test_saga_compensation() { let store = InMemoryEventStore::new(); let executor = CommandExecutor::new(store); // Create booking saga let saga = create_travel_booking_saga( create_hotel_booking(), create_flight_booking(), create_car_booking(), ); // Start saga executor.execute(&saga).await.unwrap(); // Simulate hotel booking success simulate_step_success(&executor, &saga.saga_id, 0).await; // Simulate flight booking failure simulate_step_failure(&executor, &saga.saga_id, 1, "No availability").await; // Verify compensation started // Check hotel booking was cancelled } } }
Best Practices
- Design for failure - Always plan compensation strategies
- Use timeouts - Prevent processes from hanging forever
- Implement retries - Handle transient failures gracefully
- Monitor actively - Track process health in production
- Keep state minimal - Only store what’s needed for decisions
- Test thoroughly - Include failure scenarios and edge cases
- Document workflows - Make process logic clear
- Version processes - Handle schema evolution like events
Summary
Long-running processes in EventCore:
- ✅ Stateful workflows - Coordinate complex business processes
- ✅ Event-driven - React to events from other parts of the system
- ✅ Fault tolerant - Handle failures and compensations
- ✅ Monitorable - Track health and performance
- ✅ Testable - Comprehensive testing support
Key patterns:
- Use process managers for complex workflows
- Implement saga pattern for distributed transactions
- Handle timeouts and retries robustly
- Monitor process health actively
- Test all failure scenarios
Next, let’s explore Distributed Systems →
Chapter 5.4: Distributed Systems
EventCore excels in distributed systems where multiple services need to coordinate while maintaining consistency. This chapter covers patterns for building resilient, scalable distributed event-sourced architectures.
Distributed EventCore Architecture
Service Boundaries
Each service owns its event streams and commands:
#![allow(unused)] fn main() { // User Service #[derive(Command, Clone)] struct CreateUser { #[stream] user_id: StreamId, email: Email, profile: UserProfile, } // Order Service #[derive(Command, Clone)] struct CreateOrder { #[stream] order_id: StreamId, #[stream] customer_id: StreamId, // References user from User Service items: Vec<OrderItem>, } // Payment Service #[derive(Command, Clone)] struct ProcessPayment { #[stream] payment_id: StreamId, #[stream] order_id: StreamId, // References order from Order Service amount: Money, method: PaymentMethod, } }
Event Publishing
Services publish events for other services to consume:
#![allow(unused)] fn main() { use eventcore::distributed::{EventPublisher, EventSubscriber}; #[async_trait] trait EventPublisher { async fn publish(&self, event: &StoredEvent) -> Result<(), PublishError>; } struct MessageBusPublisher { bus: MessageBus, topic_mapping: HashMap<String, String>, } impl MessageBusPublisher { async fn publish_event<E>(&self, event: &StoredEvent<E>) -> Result<(), PublishError> where E: Serialize, { let topic = self.topic_mapping .get(&E::event_type()) .ok_or(PublishError::UnknownEventType)?; let message = DistributedEvent { event_id: event.id, event_type: E::event_type(), stream_id: event.stream_id.clone(), version: event.version, payload: serde_json::to_value(&event.payload)?, metadata: event.metadata.clone(), occurred_at: event.occurred_at, published_at: Utc::now(), service_id: self.service_id(), }; self.bus.publish(topic, &message).await?; Ok(()) } fn service_id(&self) -> String { std::env::var("SERVICE_ID").unwrap_or_else(|_| "unknown".to_string()) } } #[derive(Debug, Serialize, Deserialize)] struct DistributedEvent { event_id: EventId, event_type: String, stream_id: StreamId, version: EventVersion, payload: serde_json::Value, metadata: EventMetadata, occurred_at: DateTime<Utc>, published_at: DateTime<Utc>, service_id: String, } }
Event Subscription
Services subscribe to events from other services:
#![allow(unused)] fn main() { #[async_trait] trait EventSubscriber { async fn subscribe<F>(&self, topic: &str, handler: F) -> Result<(), SubscribeError> where F: Fn(DistributedEvent) -> BoxFuture<'static, Result<(), HandleError>> + Send + Sync + 'static; } #[derive(Clone)] struct OrderEventHandler { executor: CommandExecutor, } impl OrderEventHandler { async fn handle_user_events(&self, event: DistributedEvent) -> Result<(), HandleError> { match event.event_type.as_str() { "UserRegistered" => { let user_registered: UserRegisteredEvent = serde_json::from_value(event.payload)?; // Create customer profile in order service let command = CreateCustomerProfile { customer_id: StreamId::from(format!("customer-{}", user_registered.user_id)), user_id: user_registered.user_id, email: user_registered.email, preferences: CustomerPreferences::default(), }; self.executor.execute(&command).await?; } "UserUpdated" => { // Handle user updates let user_updated: UserUpdatedEvent = serde_json::from_value(event.payload)?; let command = UpdateCustomerProfile { customer_id: StreamId::from(format!("customer-{}", user_updated.user_id)), email: user_updated.email, profile_updates: user_updated.profile_changes, }; self.executor.execute(&command).await?; } _ => { // Unknown event type - log and ignore tracing::debug!("Ignoring unknown event type: {}", event.event_type); } } Ok(()) } } // Setup subscriptions - each subscription captures its own clone of the handler async fn setup_event_subscriptions( subscriber: &impl EventSubscriber, handler: OrderEventHandler, ) -> Result<(), SubscribeError> { // Subscribe to user events let user_handler = handler.clone(); subscriber.subscribe("user-events", move |event| { let handler = user_handler.clone(); Box::pin(async move { handler.handle_user_events(event).await }) }).await?; // Subscribe to payment events let payment_handler = handler; subscriber.subscribe("payment-events", move |event| { let handler = payment_handler.clone(); Box::pin(async move { handler.handle_payment_events(event).await }) }).await?; Ok(()) } }
Distributed Transactions
Handle distributed transactions with the saga pattern:
#![allow(unused)] fn main() { #[derive(Command, Clone)] struct DistributedOrderSaga { #[stream] saga_id: StreamId, order_details: OrderDetails, customer_id: UserId, } #[derive(Default)] struct DistributedSagaState { order_created: bool, payment_reserved: bool, inventory_reserved: bool, shipping_scheduled: bool, completed: bool, compensation_needed: bool, failed_step: Option<String>, } #[async_trait] impl CommandLogic for DistributedOrderSaga { type State = DistributedSagaState; type Event = SagaEvent; async fn handle( &self, read_streams: ReadStreams<Self::StreamSet>, state: Self::State, _stream_resolver: &mut StreamResolver, ) -> CommandResult<Vec<StreamWrite<Self::StreamSet, Self::Event>>> { if state.compensation_needed { self.handle_compensation(&read_streams, &state).await } else { self.handle_forward_flow(&read_streams, &state).await } } } impl DistributedOrderSaga { async fn handle_forward_flow( &self, read_streams: &ReadStreams<DistributedOrderSagaStreamSet>, state: &DistributedSagaState, ) -> CommandResult<Vec<StreamWrite<DistributedOrderSagaStreamSet, SagaEvent>>> { match (state.order_created, state.payment_reserved, state.inventory_reserved, state.shipping_scheduled) { (false, _, _, _) => { // Step 1: Create order Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::OrderCreationRequested { order_details: self.order_details.clone(), } )? ]) } (true, false, _, _) => { // Step 2: Reserve payment Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::PaymentReservationRequested { customer_id: self.customer_id, amount: self.order_details.total_amount(), } )? ]) } (true, true, false, _) => { // Step 3: Reserve inventory Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::InventoryReservationRequested { items: self.order_details.items.clone(), } )? 
]) } (true, true, true, false) => { // Step 4: Schedule shipping Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::ShippingScheduleRequested { order_id: self.order_details.order_id, shipping_address: self.order_details.shipping_address.clone(), } )? ]) } (true, true, true, true) => { // All steps completed Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::SagaCompleted, )? ]) } } } async fn handle_compensation( &self, read_streams: &ReadStreams<DistributedOrderSagaStreamSet>, state: &DistributedSagaState, ) -> CommandResult<Vec<StreamWrite<DistributedOrderSagaStreamSet, SagaEvent>>> { // Compensate in reverse order if state.shipping_scheduled { Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::ShippingCancellationRequested, )? ]) } else if state.inventory_reserved { Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::InventoryReleaseRequested, )? ]) } else if state.payment_reserved { Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::PaymentReleaseRequested, )? ]) } else if state.order_created { Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::OrderCancellationRequested, )? ]) } else { Ok(vec![ StreamWrite::new( read_streams, self.saga_id.clone(), SagaEvent::CompensationCompleted, )? 
]) } } } // External service integration struct ExternalServiceClient { http_client: reqwest::Client, service_url: String, timeout: Duration, } impl ExternalServiceClient { async fn create_order(&self, order: &OrderDetails) -> Result<OrderId, ServiceError> { let response = self.http_client .post(&format!("{}/orders", self.service_url)) .json(order) .timeout(self.timeout) .send() .await?; if response.status().is_success() { let result: CreateOrderResponse = response.json().await?; Ok(result.order_id) } else { Err(ServiceError::RequestFailed { status: response.status(), body: response.text().await.unwrap_or_default(), }) } } async fn cancel_order(&self, order_id: OrderId) -> Result<(), ServiceError> { let response = self.http_client .delete(&format!("{}/orders/{}", self.service_url, order_id)) .timeout(self.timeout) .send() .await?; if !response.status().is_success() { return Err(ServiceError::RequestFailed { status: response.status(), body: response.text().await.unwrap_or_default(), }); } Ok(()) } } }
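The forward flow is driven entirely by the four progress flags, so the step-selection logic can be tested as a pure function, independent of streams and event stores. A std-only sketch mirroring the exhaustive match above (names are hypothetical, not EventCore API):

```rust
/// Progress flags for the distributed order saga (mirrors `DistributedSagaState`).
#[derive(Debug, Default, Clone, Copy)]
pub struct SagaProgress {
    pub order_created: bool,
    pub payment_reserved: bool,
    pub inventory_reserved: bool,
    pub shipping_scheduled: bool,
}

/// The next forward step, derived purely from the progress flags.
#[derive(Debug, PartialEq)]
pub enum NextStep {
    CreateOrder,
    ReservePayment,
    ReserveInventory,
    ScheduleShipping,
    Complete,
}

/// Decide the next saga step with the same exhaustive match the command uses.
/// The compiler guarantees every combination of flags is handled.
pub fn next_step(p: SagaProgress) -> NextStep {
    match (p.order_created, p.payment_reserved, p.inventory_reserved, p.shipping_scheduled) {
        (false, _, _, _) => NextStep::CreateOrder,
        (true, false, _, _) => NextStep::ReservePayment,
        (true, true, false, _) => NextStep::ReserveInventory,
        (true, true, true, false) => NextStep::ScheduleShipping,
        (true, true, true, true) => NextStep::Complete,
    }
}
```

Keeping the decision pure makes it trivial to property-test: any flag combination maps to exactly one step.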
Event Sourcing Across Services
Cross-Service Projections
Build projections that consume events from multiple services:
#![allow(unused)] fn main() { struct CrossServiceOrderProjection { orders: HashMap<OrderId, OrderView>, event_store: Arc<dyn EventStore>, user_service_client: UserServiceClient, payment_service_client: PaymentServiceClient, } #[derive(Debug, Clone)] struct OrderView { order_id: OrderId, customer_info: CustomerInfo, items: Vec<OrderItem>, payment_status: PaymentStatus, shipping_status: ShippingStatus, total_amount: Money, created_at: DateTime<Utc>, updated_at: DateTime<Utc>, } #[async_trait] impl Projection for CrossServiceOrderProjection { type Event = DistributedEvent; type Error = ProjectionError; async fn apply(&mut self, event: &StoredEvent<Self::Event>) -> Result<(), Self::Error> { match event.payload.event_type.as_str() { "OrderCreated" => { let order_created: OrderCreatedEvent = serde_json::from_value(event.payload.payload.clone())?; // Get customer info from user service let customer_info = self.user_service_client .get_customer_info(order_created.customer_id) .await?; let order_view = OrderView { order_id: order_created.order_id, customer_info, items: order_created.items, payment_status: PaymentStatus::Pending, shipping_status: ShippingStatus::NotStarted, total_amount: order_created.total_amount, created_at: event.occurred_at, updated_at: event.occurred_at, }; self.orders.insert(order_created.order_id, order_view); } "PaymentProcessed" => { let payment_processed: PaymentProcessedEvent = serde_json::from_value(event.payload.payload.clone())?; if let Some(order) = self.orders.get_mut(&payment_processed.order_id) { order.payment_status = PaymentStatus::Completed; order.updated_at = event.occurred_at; } } "ShipmentDispatched" => { let shipment_dispatched: ShipmentDispatchedEvent = serde_json::from_value(event.payload.payload.clone())?; if let Some(order) = self.orders.get_mut(&shipment_dispatched.order_id) { order.shipping_status = ShippingStatus::Dispatched; order.updated_at = event.occurred_at; } } _ => {} // Ignore other events } Ok(()) } } }
Event Federation
Federate events across service boundaries:
#![allow(unused)] fn main() { struct EventFederationHub { publishers: HashMap<String, Box<dyn EventPublisher>>, subscribers: HashMap<String, Vec<Box<dyn EventSubscriber>>>, routing_rules: RoutingRules, } #[derive(Debug, Clone)] struct RoutingRules { routes: Vec<RoutingRule>, } #[derive(Debug, Clone)] struct RoutingRule { source_service: String, event_pattern: String, target_services: Vec<String>, transformation: Option<String>, } impl EventFederationHub { async fn route_event(&self, event: &DistributedEvent) -> Result<(), FederationError> { let applicable_rules = self.routing_rules .routes .iter() .filter(|rule| { rule.source_service == event.service_id && self.matches_pattern(&event.event_type, &rule.event_pattern) }); for rule in applicable_rules { let transformed_event = if let Some(ref transformation) = rule.transformation { self.transform_event(event, transformation)? } else { event.clone() }; for target_service in &rule.target_services { if let Some(publisher) = self.publishers.get(target_service) { publisher.publish_federated_event(&transformed_event).await?; } } } Ok(()) } fn matches_pattern(&self, event_type: &str, pattern: &str) -> bool { // Simple pattern matching - could be more sophisticated pattern == "*" || pattern == event_type || (pattern.ends_with("*") && event_type.starts_with(&pattern[..pattern.len()-1])) } fn transform_event(&self, event: &DistributedEvent, transformation: &str) -> Result<DistributedEvent, FederationError> { // Apply transformation rules match transformation { "user_to_customer" => { let mut transformed = event.clone(); transformed.event_type = transformed.event_type.replace("User", "Customer"); Ok(transformed) } "anonymize_pii" => { let mut transformed = event.clone(); // Remove PII from payload if let Some(email) = transformed.payload.get_mut("email") { *email = serde_json::Value::String("***@***.***".to_string()); } Ok(transformed) } _ => Err(FederationError::UnknownTransformation(transformation.to_string())), } } } }
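The wildcard matching used by `matches_pattern` is simple enough to lift out and verify in isolation. A self-contained version of the same logic: `*` matches everything, a trailing `*` matches a prefix, otherwise the match is exact:

```rust
/// Wildcard event-type matching, as used by `EventFederationHub::matches_pattern`.
pub fn matches_pattern(event_type: &str, pattern: &str) -> bool {
    pattern == "*"
        || pattern == event_type
        // trailing `*` means prefix match: "User*" matches "UserRegistered"
        || (pattern.ends_with('*') && event_type.starts_with(&pattern[..pattern.len() - 1]))
}
```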
Service Discovery and Health
Service Registry
#![allow(unused)] fn main() { #[async_trait] trait ServiceRegistry { async fn register_service(&self, service: ServiceInfo) -> Result<(), RegistryError>; async fn discover_services(&self, service_type: &str) -> Result<Vec<ServiceInfo>, RegistryError>; async fn health_check(&self, service_id: &str) -> Result<HealthStatus, RegistryError>; } #[derive(Debug, Clone)] struct ServiceInfo { id: String, name: String, service_type: String, version: String, endpoints: HashMap<String, String>, health_check_url: String, capabilities: Vec<String>, metadata: HashMap<String, String>, registered_at: DateTime<Utc>, } struct ConsulServiceRegistry { consul_client: ConsulClient, } impl ConsulServiceRegistry { async fn register_eventcore_service(&self) -> Result<(), RegistryError> { let service = ServiceInfo { id: format!("eventcore-{}", uuid::Uuid::new_v4()), name: "order-service".to_string(), service_type: "eventcore".to_string(), version: env!("CARGO_PKG_VERSION").to_string(), endpoints: hashmap! { "http".to_string() => "http://localhost:8080".to_string(), "grpc".to_string() => "grpc://localhost:8081".to_string(), "events".to_string() => "kafka://localhost:9092/order-events".to_string(), }, health_check_url: "http://localhost:8080/health".to_string(), capabilities: vec![ "event-sourcing".to_string(), "order-management".to_string(), "payment-processing".to_string(), ], metadata: hashmap! { "environment".to_string() => "production".to_string(), "region".to_string() => "us-east-1".to_string(), }, registered_at: Utc::now(), }; self.register_service(service).await } } }
Circuit Breaker for Service Calls
#![allow(unused)] fn main() { struct ServiceCircuitBreaker { state: Arc<RwLock<CircuitBreakerState>>, config: CircuitBreakerConfig, } #[derive(Debug)] struct CircuitBreakerConfig { failure_threshold: u32, timeout: Duration, retry_timeout: Duration, } #[derive(Debug)] enum CircuitBreakerState { Closed { failure_count: u32 }, Open { failed_at: DateTime<Utc> }, HalfOpen, } impl ServiceCircuitBreaker { async fn call<F, T, E>(&self, operation: F) -> Result<T, CircuitBreakerError<E>> where F: Future<Output = Result<T, E>>, { // Check circuit state { let state = self.state.read().await; match *state { CircuitBreakerState::Open { failed_at } => { if Utc::now() - failed_at < self.config.retry_timeout { return Err(CircuitBreakerError::CircuitOpen); } // Transition to half-open } _ => {} } } // Update to half-open if we were open { let mut state = self.state.write().await; if matches!(*state, CircuitBreakerState::Open { .. }) { *state = CircuitBreakerState::HalfOpen; } } // Execute operation with timeout match tokio::time::timeout(self.config.timeout, operation).await { Ok(Ok(result)) => { // Success - reset circuit let mut state = self.state.write().await; *state = CircuitBreakerState::Closed { failure_count: 0 }; Ok(result) } Ok(Err(e)) => { // Operation failed self.record_failure().await; Err(CircuitBreakerError::OperationFailed(e)) } Err(_) => { // Timeout self.record_failure().await; Err(CircuitBreakerError::Timeout) } } } async fn record_failure(&self) { let mut state = self.state.write().await; match *state { CircuitBreakerState::Closed { failure_count } => { let new_count = failure_count + 1; if new_count >= self.config.failure_threshold { *state = CircuitBreakerState::Open { failed_at: Utc::now() }; } else { *state = CircuitBreakerState::Closed { failure_count: new_count }; } } CircuitBreakerState::HalfOpen => { *state = CircuitBreakerState::Open { failed_at: Utc::now() }; } _ => {} } } } #[derive(Debug, thiserror::Error)] enum CircuitBreakerError<E> { 
#[error("Circuit breaker is open")] CircuitOpen, #[error("Operation timed out")] Timeout, #[error("Operation failed: {0}")] OperationFailed(E), } }
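The async breaker above can be hard to follow through the lock handling; the underlying state machine is small. Here is a minimal synchronous sketch using only `std::time`, with the same Closed → Open → HalfOpen transitions (illustrative names, not a production implementation):

```rust
use std::time::{Duration, Instant};

enum BreakerState {
    Closed { failure_count: u32 },
    Open { failed_at: Instant },
    HalfOpen,
}

pub struct Breaker {
    state: BreakerState,
    failure_threshold: u32,
    retry_timeout: Duration,
}

impl Breaker {
    pub fn new(failure_threshold: u32, retry_timeout: Duration) -> Self {
        Self { state: BreakerState::Closed { failure_count: 0 }, failure_threshold, retry_timeout }
    }

    /// An open circuit rejects requests until the retry timeout elapses,
    /// then transitions to half-open and lets one probe request through.
    pub fn allow_request(&mut self) -> bool {
        if let BreakerState::Open { failed_at } = self.state {
            if failed_at.elapsed() < self.retry_timeout {
                return false;
            }
            self.state = BreakerState::HalfOpen;
        }
        true
    }

    /// Any success fully resets the circuit.
    pub fn record_success(&mut self) {
        self.state = BreakerState::Closed { failure_count: 0 };
    }

    /// Failures accumulate while closed; a half-open failure
    /// or reaching the threshold trips the circuit open.
    pub fn record_failure(&mut self) {
        let next = match &self.state {
            BreakerState::Closed { failure_count } if failure_count + 1 < self.failure_threshold => {
                BreakerState::Closed { failure_count: failure_count + 1 }
            }
            _ => BreakerState::Open { failed_at: Instant::now() },
        };
        self.state = next;
    }
}
```

The async version layers timeouts and an `RwLock` over exactly this machine; testing the transitions synchronously first keeps the concurrency concerns separate.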
Distributed Monitoring
Distributed Tracing
#![allow(unused)] fn main() { use opentelemetry::{global, trace::{TraceContextExt, Tracer}}; use tracing_opentelemetry::OpenTelemetrySpanExt; #[derive(Clone)] struct DistributedCommandExecutor { inner: CommandExecutor, tracer: Box<dyn Tracer + Send + Sync>, } impl DistributedCommandExecutor { async fn execute_with_tracing<C: Command>( &self, command: &C, parent_context: Option<SpanContext>, ) -> CommandResult<ExecutionResult> { let span = self.tracer .span_builder(format!("execute_command_{}", std::any::type_name::<C>())) .with_kind(SpanKind::Internal) .start(&self.tracer); if let Some(parent) = parent_context { span.set_parent(parent); } let _guard = span.enter(); span.set_attribute("command.type", std::any::type_name::<C>()); span.set_attribute("service.name", self.service_name()); match self.inner.execute(command).await { Ok(result) => { span.set_attribute("command.success", true); span.set_attribute("events.written", result.events_written.len() as i64); Ok(result) } Err(e) => { span.set_attribute("command.success", false); span.set_attribute("error.message", e.to_string()); Err(e) } } } } // Distributed event with trace context #[derive(Debug, Serialize, Deserialize)] struct TracedDistributedEvent { #[serde(flatten)] event: DistributedEvent, trace_id: String, span_id: String, } impl From<(&StoredEvent, &SpanContext)> for TracedDistributedEvent { fn from((event, context): (&StoredEvent, &SpanContext)) -> Self { Self { event: event.into(), trace_id: context.trace_id().to_string(), span_id: context.span_id().to_string(), } } } }
Metrics Collection
#![allow(unused)] fn main() { use prometheus::{CounterVec, Gauge, GaugeVec, HistogramOpts, HistogramVec, Opts, Registry}; #[derive(Clone)] struct DistributedMetrics { registry: Registry, // Command metrics (labelled by command type) commands_total: CounterVec, command_duration: HistogramVec, command_errors: CounterVec, // Event metrics (labelled by event type) events_published: CounterVec, events_consumed: CounterVec, event_lag: GaugeVec, // Service metrics service_health: Gauge, active_connections: Gauge, } impl DistributedMetrics { fn new(service_name: &str) -> Self { let registry = Registry::new(); // Labelled metrics need the *Vec types so `with_label_values` works let commands_total = CounterVec::new( Opts::new("eventcore_commands_total", "Total commands executed").const_label("service", service_name), &["command_type"], ).unwrap(); let command_duration = HistogramVec::new( HistogramOpts::new("eventcore_command_duration_seconds", "Command execution duration"), &["command_type"], ).unwrap(); let command_errors = CounterVec::new( Opts::new("eventcore_command_errors_total", "Total command errors"), &["command_type"], ).unwrap(); let events_published = CounterVec::new( Opts::new("eventcore_events_published_total", "Total events published"), &["event_type"], ).unwrap(); let events_consumed = CounterVec::new( Opts::new("eventcore_events_consumed_total", "Total events consumed"), &["event_type"], ).unwrap(); let event_lag = GaugeVec::new( Opts::new("eventcore_event_lag_seconds", "Event processing lag"), &["event_type"], ).unwrap(); let service_health = Gauge::new( "eventcore_service_health", "Service health status (0=down, 1=up)" ).unwrap(); let active_connections = Gauge::new( "eventcore_active_connections", "Number of active connections" ).unwrap(); // Register all metrics registry.register(Box::new(commands_total.clone())).unwrap(); registry.register(Box::new(command_duration.clone())).unwrap(); registry.register(Box::new(command_errors.clone())).unwrap(); registry.register(Box::new(events_published.clone())).unwrap(); registry.register(Box::new(events_consumed.clone())).unwrap(); registry.register(Box::new(event_lag.clone())).unwrap(); registry.register(Box::new(service_health.clone())).unwrap(); registry.register(Box::new(active_connections.clone())).unwrap(); Self { registry, commands_total, command_duration, command_errors, events_published, events_consumed, event_lag, service_health, active_connections, } } fn record_command_executed(&self, command_type: &str, duration: Duration, success: bool) { self.commands_total .with_label_values(&[command_type]) .inc(); self.command_duration .with_label_values(&[command_type]) .observe(duration.as_secs_f64()); if !success { self.command_errors .with_label_values(&[command_type]) .inc(); } } fn record_event_published(&self, event_type: &str) { self.events_published .with_label_values(&[event_type]) .inc(); } fn record_event_consumed(&self, event_type: &str, lag: Duration) { self.events_consumed .with_label_values(&[event_type]) .inc(); self.event_lag .with_label_values(&[event_type]) .set(lag.as_secs_f64()); } async fn export_metrics(&self) -> String { use prometheus::Encoder; let encoder = prometheus::TextEncoder::new(); let metric_families = self.registry.gather(); encoder.encode_to_string(&metric_families).unwrap() } } }
Testing Distributed Systems
#![allow(unused)] fn main() { #[cfg(test)] mod distributed_tests { use super::*; use testcontainers::*; #[tokio::test] async fn test_distributed_saga() { // Setup test environment with multiple services let docker = clients::Cli::default(); let kafka_container = docker.run(images::kafka::Kafka::default()); let postgres_container = docker.run(images::postgres::Postgres::default()); // Start services let user_service = start_user_service(&postgres_container).await; let order_service = start_order_service(&postgres_container).await; let payment_service = start_payment_service(&postgres_container).await; // Setup event routing let event_hub = EventFederationHub::new(&kafka_container); // Execute distributed saga let saga = DistributedOrderSaga { saga_id: StreamId::new(), order_details: create_test_order(), customer_id: create_test_customer(&user_service).await, }; let result = order_service.execute_saga(&saga).await; // Verify all services were coordinated correctly assert!(result.is_ok()); // Verify final state across services let order = order_service.get_order(saga.order_details.order_id).await.unwrap(); assert_eq!(order.status, OrderStatus::Completed); let payment = payment_service.get_payment(saga.order_details.order_id).await.unwrap(); assert_eq!(payment.status, PaymentStatus::Completed); } #[tokio::test] async fn test_service_failure_compensation() { // Similar setup but simulate payment service failure // Verify compensation is triggered // Verify order is cancelled // Verify inventory is released } } }
Best Practices
- Design for independence - Services should be loosely coupled
- Use event-driven communication - Prefer async events over sync calls
- Implement circuit breakers - Protect against cascading failures
- Monitor everything - Comprehensive observability is critical
- Plan for failure - Design compensation strategies upfront
- Version everything - Events, services, and APIs
- Test across services - Include distributed testing
- Document service contracts - Clear event schemas and APIs
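"Version everything" applies to events most of all: old payloads live in the store forever. One common approach is upcasting on read, so consumers only ever handle the latest schema version. A std-only sketch with hypothetical names (not EventCore API):

```rust
/// Current schema for the event; v2 added `marketing_opt_in`.
#[derive(Debug, PartialEq)]
pub struct UserRegisteredV2 {
    pub user_id: String,
    pub email: String,
    pub marketing_opt_in: bool, // added in v2
}

/// Every schema version ever written to the store.
pub enum VersionedUserRegistered {
    V1 { user_id: String, email: String },
    V2(UserRegisteredV2),
}

/// Upcast any stored version to the current schema, defaulting new fields.
/// Run at deserialization time so handlers never see old shapes.
pub fn upcast(event: VersionedUserRegistered) -> UserRegisteredV2 {
    match event {
        VersionedUserRegistered::V1 { user_id, email } => UserRegisteredV2 {
            user_id,
            email,
            marketing_opt_in: false, // safe default for pre-v2 events
        },
        VersionedUserRegistered::V2(e) => e,
    }
}
```

The key constraint is that upcasting must be total: every version stored by any past deployment must map to the current shape.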
Summary
Distributed EventCore systems:
- ✅ Service boundaries - Clear ownership of streams and commands
- ✅ Event-driven - Async communication between services
- ✅ Fault tolerant - Circuit breakers and compensation
- ✅ Observable - Distributed tracing and metrics
- ✅ Scalable - Independent scaling of services
Key patterns:
- Own your streams - each service owns its event streams
- Publish events - share state changes via events
- Use sagas - coordinate distributed transactions
- Monitor health - track service health and performance
- Plan for failure - implement circuit breakers and compensation
Next, let’s explore Performance Optimization →
Chapter 5.5: Performance Optimization
EventCore is designed for performance, but complex event-sourced systems need careful optimization. This chapter covers patterns and techniques for maximizing performance in production.
Performance Fundamentals
Key Metrics
Monitor these critical metrics:
#![allow(unused)] fn main() { use lazy_static::lazy_static; use prometheus::{Counter, Histogram, Gauge, register_counter, register_histogram, register_gauge}; lazy_static! { // Throughput metrics static ref COMMANDS_PER_SECOND: Counter = register_counter!( "eventcore_commands_per_second", "Commands executed per second" ).unwrap(); static ref EVENTS_PER_SECOND: Counter = register_counter!( "eventcore_events_per_second", "Events written per second" ).unwrap(); // Latency metrics static ref COMMAND_LATENCY: Histogram = register_histogram!( "eventcore_command_latency_seconds", "Command execution latency" ).unwrap(); static ref EVENT_STORE_LATENCY: Histogram = register_histogram!( "eventcore_event_store_latency_seconds", "Event store operation latency" ).unwrap(); // Resource usage static ref ACTIVE_STREAMS: Gauge = register_gauge!( "eventcore_active_streams", "Number of active event streams" ).unwrap(); static ref MEMORY_USAGE: Gauge = register_gauge!( "eventcore_memory_usage_bytes", "Memory usage in bytes" ).unwrap(); } #[derive(Debug, Clone)] struct PerformanceMetrics { pub commands_per_second: f64, pub events_per_second: f64, pub avg_command_latency: Duration, pub p95_command_latency: Duration, pub p99_command_latency: Duration, pub memory_usage_mb: f64, pub active_streams: u64, } impl PerformanceMetrics { fn record_command_executed(&self, duration: Duration) { COMMANDS_PER_SECOND.inc(); COMMAND_LATENCY.observe(duration.as_secs_f64()); } fn record_events_written(&self, count: usize) { EVENTS_PER_SECOND.inc_by(count as f64); } } }
Performance Targets
Typical performance targets for EventCore applications:
#![allow(unused)] fn main() { #[derive(Debug, Clone)] struct PerformanceTargets { // Throughput targets pub min_commands_per_second: f64, // 100+ commands/sec pub min_events_per_second: f64, // 1000+ events/sec // Latency targets pub max_p50_latency: Duration, // <10ms pub max_p95_latency: Duration, // <50ms pub max_p99_latency: Duration, // <100ms // Resource targets pub max_memory_usage_mb: f64, // <1GB per service pub max_cpu_usage_percent: f64, // <70% } impl PerformanceTargets { fn production() -> Self { Self { min_commands_per_second: 100.0, min_events_per_second: 1000.0, max_p50_latency: Duration::from_millis(10), max_p95_latency: Duration::from_millis(50), max_p99_latency: Duration::from_millis(100), max_memory_usage_mb: 1024.0, max_cpu_usage_percent: 70.0, } } fn development() -> Self { Self { min_commands_per_second: 10.0, min_events_per_second: 100.0, max_p50_latency: Duration::from_millis(50), max_p95_latency: Duration::from_millis(200), max_p99_latency: Duration::from_millis(500), max_memory_usage_mb: 512.0, max_cpu_usage_percent: 50.0, } } } }
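Targets are only useful if something checks them, e.g. as a CI gate after a load test. A small helper that compares measured values against a subset of the targets above and reports violations (illustrative names, not EventCore API):

```rust
use std::time::Duration;

pub struct Targets {
    pub min_commands_per_second: f64,
    pub max_p99_latency: Duration,
    pub max_memory_usage_mb: f64,
}

pub struct Measured {
    pub commands_per_second: f64,
    pub p99_latency: Duration,
    pub memory_usage_mb: f64,
}

/// Return a human-readable description of every target the measurement violates.
/// An empty vector means all checked targets are met.
pub fn check_targets(t: &Targets, m: &Measured) -> Vec<String> {
    let mut violations = Vec::new();
    if m.commands_per_second < t.min_commands_per_second {
        violations.push(format!(
            "throughput {:.1}/s below target {:.1}/s",
            m.commands_per_second, t.min_commands_per_second
        ));
    }
    if m.p99_latency > t.max_p99_latency {
        violations.push(format!(
            "p99 latency {:?} above target {:?}",
            m.p99_latency, t.max_p99_latency
        ));
    }
    if m.memory_usage_mb > t.max_memory_usage_mb {
        violations.push(format!(
            "memory {:.0}MB above target {:.0}MB",
            m.memory_usage_mb, t.max_memory_usage_mb
        ));
    }
    violations
}
```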
Event Store Optimization
Connection Pooling
Optimize database connections for high throughput:
#![allow(unused)] fn main() { use sqlx::{Pool, Postgres, ConnectOptions}; use std::time::Duration; #[derive(Debug, Clone)] struct OptimizedPostgresConfig { pub database_url: String, pub max_connections: u32, pub min_connections: u32, pub acquire_timeout: Duration, pub idle_timeout: Duration, pub max_lifetime: Duration, pub connect_timeout: Duration, pub command_timeout: Duration, } impl OptimizedPostgresConfig { fn production() -> Self { Self { database_url: "postgresql://user:pass@host/db".to_string(), max_connections: 20, // Higher for production min_connections: 5, // Always keep minimum ready acquire_timeout: Duration::from_secs(30), idle_timeout: Duration::from_secs(600), // 10 minutes max_lifetime: Duration::from_secs(1800), // 30 minutes connect_timeout: Duration::from_secs(5), command_timeout: Duration::from_secs(30), } } async fn create_pool(&self) -> Result<Pool<Postgres>, sqlx::Error> { let options = sqlx::postgres::PgConnectOptions::from_url(&url::Url::parse(&self.database_url)?)? .application_name("eventcore-optimized"); sqlx::postgres::PgPoolOptions::new() .max_connections(self.max_connections) .min_connections(self.min_connections) .acquire_timeout(self.acquire_timeout) .idle_timeout(self.idle_timeout) .max_lifetime(self.max_lifetime) .connect_with(options) .await } } struct OptimizedPostgresEventStore { pool: Pool<Postgres>, config: OptimizedPostgresConfig, batch_size: usize, } impl OptimizedPostgresEventStore { async fn new(config: OptimizedPostgresConfig) -> Result<Self, sqlx::Error> { let pool = config.create_pool().await?; Ok(Self { pool, config, batch_size: 1000, // Optimal batch size for PostgreSQL }) } } }
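The `max_connections: 20` above is a reasonable default, not a magic number. A widely cited starting heuristic for PostgreSQL pool sizing (popularized by the PostgreSQL wiki and the HikariCP documentation) is `cores * 2 + effective_spindles`; treat it as a baseline to tune under load, not a rule:

```rust
/// Common starting heuristic for PostgreSQL pool sizing:
/// connections ≈ cpu_cores * 2 + effective spindle count.
/// SSD-backed databases are often modeled with a small spindle count.
pub fn suggested_max_connections(cpu_cores: u32, effective_spindles: u32) -> u32 {
    (cpu_cores * 2 + effective_spindles).max(1)
}
```

For example, an 8-core database host with 4 effective spindles suggests a pool of 20, matching the production config above. Oversized pools waste PostgreSQL backend memory and can reduce throughput through contention.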
Batch Operations
Batch database operations for better throughput:
#![allow(unused)] fn main() { #[async_trait] impl EventStore for OptimizedPostgresEventStore { type Event = serde_json::Value; type Error = EventStoreError; async fn write_events_batch( &self, events: Vec<EventToWrite<Self::Event>>, ) -> Result<WriteResult, Self::Error> { if events.is_empty() { return Ok(WriteResult { events_written: 0 }); } // Batch events by stream for version checking let mut stream_batches: HashMap<StreamId, Vec<_>> = HashMap::new(); for event in events { stream_batches.entry(event.stream_id.clone()).or_default().push(event); } let mut transaction = self.pool.begin().await?; let mut total_written = 0; for (stream_id, batch) in stream_batches { let written = self.write_stream_batch(&mut transaction, stream_id, batch).await?; total_written += written; } transaction.commit().await?; Ok(WriteResult { events_written: total_written }) } async fn write_stream_batch( &self, transaction: &mut sqlx::Transaction<'_, Postgres>, stream_id: StreamId, events: Vec<EventToWrite<Self::Event>>, ) -> Result<usize, EventStoreError> { if events.is_empty() { return Ok(0); } // Check current version let current_version = self.get_stream_version(&mut *transaction, &stream_id).await?; // Validate expected versions let expected_version = events[0].expected_version; if expected_version != current_version { return Err(EventStoreError::VersionConflict { stream: stream_id, expected: expected_version, actual: current_version, }); } // Prepare batch insert let mut values = Vec::new(); let mut parameters = Vec::new(); let mut param_index = 1; for (i, event) in events.iter().enumerate() { let version = current_version.0 + i as u64 + 1; let event_id = EventId::new_v7(); values.push(format!( "(${}, ${}, ${}, ${}, ${}, ${}, ${})", param_index, param_index + 1, param_index + 2, param_index + 3, param_index + 4, param_index + 5, param_index + 6 )); parameters.extend([ event_id.as_ref(), stream_id.as_ref(), &version.to_string(), &event.event_type, 
&serde_json::to_string(&event.payload)?, &serde_json::to_string(&event.metadata)?, &Utc::now().to_rfc3339(), ]); param_index += 7; } let query = format!( "INSERT INTO events (id, stream_id, version, event_type, payload, metadata, occurred_at) VALUES {}", values.join(", ") ); let mut query_builder = sqlx::query(&query); for param in parameters { query_builder = query_builder.bind(param); } let rows_affected = query_builder.execute(&mut **transaction).await?.rows_affected(); Ok(rows_affected as usize) } } }
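The placeholder-string construction in the batch insert is the easiest place to introduce an off-by-one. Extracted as a pure function (not EventCore API) so it can be tested directly, matching the 7-column layout above (`id, stream_id, version, event_type, payload, metadata, occurred_at`):

```rust
/// Build the `VALUES` placeholder list for a multi-row insert with
/// `columns` bind parameters per row: "($1, $2, $3), ($4, $5, $6)", etc.
pub fn values_placeholders(rows: usize, columns: usize) -> String {
    (0..rows)
        .map(|row| {
            let params: Vec<String> = (0..columns)
                // PostgreSQL parameters are 1-indexed and global across rows
                .map(|col| format!("${}", row * columns + col + 1))
                .collect();
            format!("({})", params.join(", "))
        })
        .collect::<Vec<_>>()
        .join(", ")
}
```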
Read Optimization
Optimize reading patterns:
#![allow(unused)] fn main() { impl OptimizedPostgresEventStore { // Optimized stream reading with pagination async fn read_stream_paginated( &self, stream_id: &StreamId, from_version: EventVersion, page_size: usize, ) -> Result<StreamEvents<Self::Event>, Self::Error> { let query = " SELECT id, stream_id, version, event_type, payload, metadata, occurred_at FROM events WHERE stream_id = $1 AND version >= $2 ORDER BY version ASC LIMIT $3 "; let rows = sqlx::query(query) .bind(stream_id.as_ref()) .bind(from_version.as_ref()) .bind(page_size as i64) .fetch_all(&self.pool) .await?; let events = rows.into_iter() .map(|row| self.row_to_event(row)) .collect::<Result<Vec<_>, _>>()?; let version = events.last() .map(|e| e.version) .unwrap_or(from_version); Ok(StreamEvents { stream_id: stream_id.clone(), version, events, }) } // Multi-stream reading with parallel queries async fn read_multiple_streams( &self, stream_ids: Vec<StreamId>, options: ReadOptions, ) -> Result<Vec<StreamEvents<Self::Event>>, Self::Error> { let futures = stream_ids.into_iter().map(|stream_id| { self.read_stream(&stream_id, options.clone()) }); let results = futures::future::try_join_all(futures).await?; Ok(results) } // Optimized subscription reading async fn read_all_events_from( &self, position: EventPosition, batch_size: usize, ) -> Result<Vec<StoredEvent<Self::Event>>, Self::Error> { let query = " SELECT id, stream_id, version, event_type, payload, metadata, occurred_at FROM events WHERE occurred_at > $1 ORDER BY occurred_at ASC LIMIT $2 "; let rows = sqlx::query(query) .bind(position.timestamp) .bind(batch_size as i64) .fetch_all(&self.pool) .await?; rows.into_iter() .map(|row| self.row_to_event(row)) .collect() } } }
Memory Optimization
State Management
Optimize memory usage in command state:
#![allow(unused)] fn main() { use lru::LruCache; use std::num::NonZeroUsize; #[derive(Clone)] struct OptimizedCommandExecutor { event_store: Arc<dyn EventStore<Event = serde_json::Value>>, state_cache: Arc<RwLock<LruCache<StreamId, Arc<dyn Any + Send + Sync>>>>, cache_size: usize, } impl OptimizedCommandExecutor { fn new(event_store: Arc<dyn EventStore<Event = serde_json::Value>>) -> Self { Self { event_store, state_cache: Arc::new(RwLock::new(LruCache::new(NonZeroUsize::new(1000).unwrap()))), cache_size: 1000, } } async fn execute_with_caching<C: Command>( &self, command: &C, ) -> CommandResult<ExecutionResult> { let read_streams = self.read_streams_for_command(command).await?; // Try to get cached state let cached_state = self.get_cached_state::<C>(&read_streams).await; let state = match cached_state { Some(state) => state, None => { // Reconstruct state and cache it let state = self.reconstruct_state::<C>(&read_streams).await?; self.cache_state(&read_streams, &state).await; state } }; // Execute command let mut stream_resolver = StreamResolver::new(); let events = command.handle(read_streams, state, &mut stream_resolver).await?; // Write events and invalidate cache let result = self.write_events(events).await?; self.invalidate_cache_for_streams(&result.affected_streams).await; Ok(result) } async fn get_cached_state<C: Command>(&self, read_streams: &ReadStreams<C::StreamSet>) -> Option<C::State> { let cache = self.state_cache.read().await; // Check if all streams are cached and up-to-date for stream_data in read_streams.iter() { if let Some(cached) = cache.get(&stream_data.stream_id) { // Verify cache is current if !self.is_cache_current(&stream_data, cached).await { return None; } } else { return None; } } // All streams cached - reconstruct state from cache self.reconstruct_from_cache(read_streams).await } async fn cache_state<C: Command>(&self, read_streams: &ReadStreams<C::StreamSet>, state: &C::State) { let mut cache = self.state_cache.write().await; for stream_data in 
read_streams.iter() { let cached_data = CachedStreamData { stream_id: stream_data.stream_id.clone(), version: stream_data.version, events: stream_data.events.clone(), cached_at: Utc::now(), }; cache.put(stream_data.stream_id.clone(), Arc::new(cached_data)); } } } #[derive(Debug, Clone)] struct CachedStreamData { stream_id: StreamId, version: EventVersion, events: Vec<StoredEvent<serde_json::Value>>, cached_at: DateTime<Utc>, } }
Event Streaming
Stream events instead of loading everything into memory:
#![allow(unused)] fn main() { use tokio_stream::{Stream, StreamExt}; use futures::stream::TryStreamExt; trait StreamingEventStore { fn stream_events( &self, stream_id: &StreamId, from_version: EventVersion, ) -> impl Stream<Item = Result<StoredEvent<serde_json::Value>, EventStoreError>>; fn stream_all_events( &self, from_position: EventPosition, ) -> impl Stream<Item = Result<StoredEvent<serde_json::Value>, EventStoreError>>; } impl StreamingEventStore for OptimizedPostgresEventStore { fn stream_events( &self, stream_id: &StreamId, from_version: EventVersion, ) -> impl Stream<Item = Result<StoredEvent<serde_json::Value>, EventStoreError>> { let pool = self.pool.clone(); let stream_id = stream_id.clone(); let page_size = 100; async_stream::try_stream! { let mut current_version = from_version; loop { let query = " SELECT id, stream_id, version, event_type, payload, metadata, occurred_at FROM events WHERE stream_id = $1 AND version >= $2 ORDER BY version ASC LIMIT $3 "; let rows = sqlx::query(query) .bind(stream_id.as_ref()) .bind(current_version.as_ref()) .bind(page_size as i64) .fetch_all(&pool) .await?; if rows.is_empty() { break; } // Capture the count before the for loop consumes `rows` let fetched = rows.len(); for row in rows { let event = self.row_to_event(row)?; current_version = EventVersion::from(event.version.as_u64() + 1); yield event; } if fetched < page_size { break; } } } } } // Usage in projections #[async_trait] impl Projection for StreamingProjection { type Event = serde_json::Value; type Error = ProjectionError; async fn rebuild_from_stream( &mut self, event_stream: impl Stream<Item = Result<StoredEvent<Self::Event>, EventStoreError>>, ) -> Result<(), Self::Error> { let mut stream = std::pin::pin!(event_stream); while let Some(event_result) = stream.next().await { let event = event_result?; self.apply(&event).await?; // Checkpoint every 1000 events if event.version.as_u64() % 1000 == 0 { self.save_checkpoint(event.version).await?; } } Ok(()) } } }
Concurrency Optimization
Parallel Command Execution
Execute independent commands in parallel:
#![allow(unused)] fn main() { use tokio::sync::Semaphore; use std::collections::{BTreeSet, HashMap}; use std::sync::Arc; #[derive(Clone)] struct ParallelCommandExecutor { inner: CommandExecutor, concurrency_limit: Arc<Semaphore>, stream_locks: Arc<RwLock<HashMap<StreamId, Arc<Mutex<()>>>>>, } impl ParallelCommandExecutor { fn new(inner: CommandExecutor, max_concurrency: usize) -> Self { Self { inner, concurrency_limit: Arc::new(Semaphore::new(max_concurrency)), stream_locks: Arc::new(RwLock::new(HashMap::new())), } } async fn execute_batch<C: Command + Clone>( &self, commands: Vec<C>, ) -> Vec<CommandResult<ExecutionResult>> { // Group commands by affected streams let stream_groups = self.group_by_streams(&commands).await; let futures = stream_groups.into_iter().map(|(streams, commands)| { self.execute_stream_group(streams, commands) }); let results = futures::future::join_all(futures).await; // Flatten results results.into_iter().flatten().collect() } async fn execute_stream_group<C: Command>( &self, affected_streams: BTreeSet<StreamId>, commands: Vec<C>, ) -> Vec<CommandResult<ExecutionResult>> { // Acquire locks for all streams in this group let _locks = self.acquire_stream_locks(&affected_streams).await; // Execute commands sequentially within the group let mut results = Vec::new(); for command in commands { let _permit = self.concurrency_limit.acquire().await.unwrap(); let result = self.inner.execute(&command).await; results.push(result); } results } async fn group_by_streams<C: Command + Clone>( &self, commands: &[C], ) -> HashMap<BTreeSet<StreamId>, Vec<C>> { // BTreeSet is Ord and Hash, so a set of streams can key the map let mut groups = HashMap::new(); for command in commands { let streams = command.read_streams().into_iter().collect(); groups.entry(streams).or_insert_with(Vec::new).push(command.clone()); } groups } async fn acquire_stream_locks( &self, stream_ids: &BTreeSet<StreamId>, ) -> Vec<tokio::sync::OwnedMutexGuard<()>> { let mut locks = Vec::new(); // BTreeSet iteration is already sorted, preventing lock-order deadlocks for stream_id in stream_ids { let lock = { let stream_locks = self.stream_locks.read().await; stream_locks.get(stream_id).cloned() }; let lock = match lock { Some(lock) => lock, None => { let mut stream_locks = self.stream_locks.write().await; stream_locks.entry(stream_id.clone()) .or_insert_with(|| Arc::new(Mutex::new(()))) .clone() } }; // lock_owned keeps the guard valid after the local Arc is dropped locks.push(lock.lock_owned().await); } locks } } }
Async Batching
Batch operations automatically:
#![allow(unused)] fn main() { use tokio::sync::mpsc; use tokio::time::{interval, Duration}; struct BatchProcessor<T, R> { sender: mpsc::UnboundedSender<BatchItem<T, R>>, batch_size: usize, batch_timeout: Duration, } struct BatchItem<T, R> { item: T, response_sender: oneshot::Sender<R>, } impl<T, R> BatchProcessor<T, R> where T: Send + 'static, R: Send + 'static, { fn new<F, Fut>( batch_size: usize, batch_timeout: Duration, processor: F, ) -> Self where F: Fn(Vec<T>) -> Fut + Send + 'static, Fut: Future<Output = Vec<R>> + Send, { let (sender, receiver) = mpsc::unbounded_channel(); tokio::spawn(Self::batch_worker(receiver, batch_size, batch_timeout, processor)); Self { sender, batch_size, batch_timeout, } } async fn process(&self, item: T) -> Result<R, BatchError> { let (response_sender, response_receiver) = oneshot::channel(); self.sender.send(BatchItem { item, response_sender, })?; response_receiver.await.map_err(BatchError::Cancelled) } async fn batch_worker<F, Fut>( mut receiver: mpsc::UnboundedReceiver<BatchItem<T, R>>, batch_size: usize, batch_timeout: Duration, processor: F, ) where F: Fn(Vec<T>) -> Fut + Send + 'static, Fut: Future<Output = Vec<R>> + Send, { let mut batch = Vec::new(); let mut senders = Vec::new(); let mut timer = interval(batch_timeout); loop { select! 
{ item = receiver.recv() => { match item { Some(BatchItem { item, response_sender }) => { batch.push(item); senders.push(response_sender); if batch.len() >= batch_size { Self::process_batch(&processor, &mut batch, &mut senders).await; } } None => break, // Channel closed } } _ = timer.tick() => { if !batch.is_empty() { Self::process_batch(&processor, &mut batch, &mut senders).await; } } } } } async fn process_batch<F, Fut>( processor: &F, batch: &mut Vec<T>, senders: &mut Vec<oneshot::Sender<R>>, ) where F: Fn(Vec<T>) -> Fut, Fut: Future<Output = Vec<R>>, { if batch.is_empty() { return; } let items = std::mem::take(batch); let response_senders = std::mem::take(senders); let results = processor(items).await; for (sender, result) in response_senders.into_iter().zip(results) { let _ = sender.send(result); // Ignore send errors } } } // Usage for batched event writing type EventBatch = BatchProcessor<EventToWrite<serde_json::Value>, Result<(), EventStoreError>>; impl OptimizedPostgresEventStore { fn new_with_batching(pool: Pool<Postgres>) -> (Self, EventBatch) { let store = Self::new(pool); let store_clone = store.clone(); let batch_processor = BatchProcessor::new( 100, // Batch size Duration::from_millis(10), // Batch timeout move |events| { let store = store_clone.clone(); async move { // Capture the length before `events` is moved into the write call let n = events.len(); match store.write_events_batch(events).await { Ok(_) => vec![Ok(()); n], Err(e) => vec![Err(e); n], } } } ); (store, batch_processor) } } }
Projection Optimization
Incremental Updates
Update projections incrementally:
#![allow(unused)] fn main() { #[async_trait] trait IncrementalProjection { type Event; type State; type Error; async fn apply_incremental( &mut self, event: &StoredEvent<Self::Event>, previous_state: Option<&Self::State>, ) -> Result<Self::State, Self::Error>; async fn get_checkpoint(&self) -> Result<EventVersion, Self::Error>; async fn save_checkpoint(&self, version: EventVersion) -> Result<(), Self::Error>; } struct OptimizedUserProjection { users: HashMap<UserId, UserSummary>, last_processed_version: EventVersion, checkpoint_interval: u64, } #[async_trait] impl IncrementalProjection for OptimizedUserProjection { type Event = UserEvent; type State = HashMap<UserId, UserSummary>; type Error = ProjectionError; async fn apply_incremental( &mut self, event: &StoredEvent<Self::Event>, previous_state: Option<&Self::State>, ) -> Result<Self::State, Self::Error> { // Clone state if provided, otherwise start fresh let mut state = previous_state.cloned().unwrap_or_default(); // Apply only this event match &event.payload { UserEvent::Registered { user_id, email, profile } => { state.insert(*user_id, UserSummary { id: *user_id, email: email.clone(), display_name: profile.display_name(), created_at: event.occurred_at, updated_at: event.occurred_at, }); } UserEvent::ProfileUpdated { user_id, profile } => { if let Some(user) = state.get_mut(user_id) { user.display_name = profile.display_name(); user.updated_at = event.occurred_at; } } } // Update checkpoint self.last_processed_version = event.version; // Save checkpoint periodically if event.version.as_u64() % self.checkpoint_interval == 0 { self.save_checkpoint(event.version).await?; } Ok(state) } async fn get_checkpoint(&self) -> Result<EventVersion, Self::Error> { Ok(self.last_processed_version) } async fn save_checkpoint(&self, version: EventVersion) -> Result<(), Self::Error> { // Save to persistent storage // Implementation depends on your checkpoint store Ok(()) } } }
Materialized Views
Use database materialized views for query optimization:
-- Create materialized view for user summaries
CREATE MATERIALIZED VIEW user_summaries AS
SELECT
(payload->>'user_id')::uuid as user_id,
payload->>'email' as email,
payload->'profile'->>'display_name' as display_name,
occurred_at as created_at,
occurred_at as updated_at
FROM events
WHERE event_type = 'UserRegistered'
UNION ALL
SELECT
(payload->>'user_id')::uuid as user_id,
NULL as email,
payload->'profile'->>'display_name' as display_name,
NULL as created_at,
occurred_at as updated_at
FROM events
WHERE event_type = 'UserProfileUpdated';
-- Create indexes for fast queries
CREATE INDEX idx_user_summaries_user_id ON user_summaries(user_id);
CREATE INDEX idx_user_summaries_email ON user_summaries(email);
-- Refresh materialized view (can be automated)
REFRESH MATERIALIZED VIEW user_summaries;
Benchmarking and Profiling
Performance Testing
Create comprehensive benchmarks:
#![allow(unused)] fn main() { use criterion::{black_box, criterion_group, criterion_main, Criterion, BenchmarkId}; use tokio::runtime::Runtime; fn benchmark_command_execution(c: &mut Criterion) { let rt = Runtime::new().unwrap(); let store = rt.block_on(async { InMemoryEventStore::new() }); let executor = CommandExecutor::new(store); let mut group = c.benchmark_group("command_execution"); for concurrency in [1, 10, 50, 100].iter() { group.bench_with_input( BenchmarkId::new("create_user", concurrency), concurrency, |b, &concurrency| { b.to_async(&rt).iter(|| async { let commands: Vec<_> = (0..concurrency) .map(|i| CreateUser { email: Email::try_new(format!("user{}@example.com", i)).unwrap(), first_name: FirstName::try_new(format!("User{}", i)).unwrap(), last_name: LastName::try_new("Test".to_string()).unwrap(), }) .collect(); let futures = commands.into_iter().map(|cmd| executor.execute(&cmd)); let results = futures::future::join_all(futures).await; black_box(results); }); } ); } group.finish(); } fn benchmark_event_store_operations(c: &mut Criterion) { let rt = Runtime::new().unwrap(); let store = rt.block_on(async { PostgresEventStore::new("postgresql://localhost/eventcore_bench").await.unwrap() }); let mut group = c.benchmark_group("event_store"); for batch_size in [1, 10, 100, 1000].iter() { group.bench_with_input( BenchmarkId::new("write_events", batch_size), batch_size, |b, &batch_size| { b.to_async(&rt).iter(|| async { let events: Vec<_> = (0..batch_size) .map(|i| EventToWrite { stream_id: StreamId::try_new(format!("test-{}", i)).unwrap(), payload: json!({ "test": i }), metadata: EventMetadata::default(), expected_version: EventVersion::from(0), }) .collect(); let result = store.write_events(events).await; black_box(result); }); } ); } group.finish(); } criterion_group!(benches, benchmark_command_execution, benchmark_event_store_operations); criterion_main!(benches); }
Memory Profiling
Profile memory usage patterns:
#![allow(unused)] fn main() { use memory_profiler::{Allocator, ProfiledAllocator}; #[global_allocator] static PROFILED_ALLOCATOR: ProfiledAllocator<std::alloc::System> = ProfiledAllocator::new(std::alloc::System); #[derive(Debug)] struct MemoryUsageReport { pub allocated_bytes: usize, pub deallocated_bytes: usize, pub peak_memory: usize, pub current_memory: usize, } impl MemoryUsageReport { fn capture() -> Self { let stats = PROFILED_ALLOCATOR.stats(); Self { allocated_bytes: stats.allocated, deallocated_bytes: stats.deallocated, peak_memory: stats.peak, current_memory: stats.current, } } } #[cfg(test)] mod memory_tests { use super::*; #[tokio::test] async fn test_memory_usage_during_batch_execution() { let initial_memory = MemoryUsageReport::capture(); // Execute large batch of commands let store = InMemoryEventStore::new(); let executor = CommandExecutor::new(store); let commands: Vec<_> = (0..10000) .map(|i| CreateUser { email: Email::try_new(format!("user{}@example.com", i)).unwrap(), first_name: FirstName::try_new(format!("User{}", i)).unwrap(), last_name: LastName::try_new("Test".to_string()).unwrap(), }) .collect(); let peak_memory = MemoryUsageReport::capture(); for command in commands { executor.execute(&command).await.unwrap(); } let final_memory = MemoryUsageReport::capture(); println!("Initial memory: {:?}", initial_memory); println!("Peak memory: {:?}", peak_memory); println!("Final memory: {:?}", final_memory); // Assert memory doesn't grow unbounded let memory_growth = final_memory.current_memory.saturating_sub(initial_memory.current_memory); assert!(memory_growth < 100 * 1024 * 1024, "Memory growth too large: {} bytes", memory_growth); } } }
Production Monitoring
Performance Dashboards
Create monitoring dashboards:
#![allow(unused)] fn main() { use prometheus::{Opts, Registry, TextEncoder, Encoder}; use axum::{response::Html, routing::get, Router}; #[derive(Clone)] struct PerformanceMonitor { registry: Registry, metrics: PerformanceMetrics, } impl PerformanceMonitor { fn new() -> Self { let registry = Registry::new(); let metrics = PerformanceMetrics::new(®istry); Self { registry, metrics } } async fn metrics_handler(&self) -> String { let encoder = TextEncoder::new(); let metric_families = self.registry.gather(); encoder.encode_to_string(&metric_families).unwrap() } fn dashboard_routes(&self) -> Router { let monitor = self.clone(); Router::new() .route("/metrics", get(move || monitor.metrics_handler())) .route("/health", get(|| async { "OK" })) .route("/dashboard", get(|| async { Html(include_str!("performance_dashboard.html")) })) } } // HTML dashboard template const DASHBOARD_HTML: &str = r#" <!DOCTYPE html> <html> <head> <title>EventCore Performance Dashboard</title> <script src="https://cdn.jsdelivr.net/npm/chart.js"></script> </head> <body> <h1>EventCore Performance Metrics</h1> <div style="display: flex; flex-wrap: wrap;"> <div style="width: 50%; padding: 10px;"> <canvas id="throughputChart"></canvas> </div> <div style="width: 50%; padding: 10px;"> <canvas id="latencyChart"></canvas> </div> <div style="width: 50%; padding: 10px;"> <canvas id="memoryChart"></canvas> </div> <div style="width: 50%; padding: 10px;"> <canvas id="streamsChart"></canvas> </div> </div> <script> // Real-time dashboard implementation async function updateMetrics() { const response = await fetch('/metrics'); const text = await response.text(); // Parse Prometheus metrics and update charts parseAndUpdateCharts(text); } setInterval(updateMetrics, 5000); // Update every 5 seconds updateMetrics(); // Initial load </script> </body> </html> "#; }
Alerting
Set up performance alerts:
#![allow(unused)] fn main() { use std::sync::atomic::{AtomicBool, Ordering}; #[derive(Clone)] struct PerformanceAlerting { thresholds: PerformanceTargets, alert_cooldown: Duration, last_alert: Arc<Mutex<HashMap<String, DateTime<Utc>>>>, alert_enabled: Arc<AtomicBool>, } impl PerformanceAlerting { fn new(thresholds: PerformanceTargets) -> Self { Self { thresholds, alert_cooldown: Duration::from_minutes(5), last_alert: Arc::new(Mutex::new(HashMap::new())), alert_enabled: Arc::new(AtomicBool::new(true)), } } async fn check_metrics(&self, metrics: &PerformanceMetrics) { if !self.alert_enabled.load(Ordering::Relaxed) { return; } // Check command latency if metrics.p95_command_latency > self.thresholds.max_p95_latency { self.send_alert( "high_latency", &format!( "P95 latency is {}ms, threshold is {}ms", metrics.p95_command_latency.as_millis(), self.thresholds.max_p95_latency.as_millis() ) ).await; } // Check throughput if metrics.commands_per_second < self.thresholds.min_commands_per_second { self.send_alert( "low_throughput", &format!( "Throughput is {:.1} commands/sec, threshold is {:.1}", metrics.commands_per_second, self.thresholds.min_commands_per_second ) ).await; } // Check memory usage if metrics.memory_usage_mb > self.thresholds.max_memory_usage_mb { self.send_alert( "high_memory", &format!( "Memory usage is {:.1}MB, threshold is {:.1}MB", metrics.memory_usage_mb, self.thresholds.max_memory_usage_mb ) ).await; } } async fn send_alert(&self, alert_type: &str, message: &str) { let mut last_alerts = self.last_alert.lock().await; let now = Utc::now(); // Check cooldown if let Some(last_time) = last_alerts.get(alert_type) { if now.signed_duration_since(*last_time) < self.alert_cooldown { return; // Still in cooldown } } // Send alert (implement your alerting system) self.dispatch_alert(alert_type, message).await; // Update last alert time last_alerts.insert(alert_type.to_string(), now); } async fn dispatch_alert(&self, alert_type: &str, message: &str) { // 
Implementation depends on your alerting system // Examples: Slack, PagerDuty, email, etc. tracing::error!("PERFORMANCE ALERT [{}]: {}", alert_type, message); // Example: Send to Slack if let Ok(webhook_url) = std::env::var("SLACK_WEBHOOK_URL") { let payload = json!({ "text": format!("🚨 EventCore Performance Alert: {}", message), "channel": "#alerts", "username": "EventCore Monitor" }); let client = reqwest::Client::new(); let _ = client.post(&webhook_url) .json(&payload) .send() .await; } } } }
Best Practices
- Measure first - Always profile before optimizing
- Optimize bottlenecks - Focus on the slowest operations
- Batch operations - Reduce round trips to storage
- Cache wisely - Cache expensive computations, not everything
- Stream large datasets - Don’t load everything into memory
- Monitor continuously - Track performance in production
- Set alerts - Get notified when performance degrades
- Test under load - Use realistic workloads in testing
Summary
Performance optimization in EventCore:
- ✅ Comprehensive monitoring - Track all key metrics
- ✅ Database optimization - Connection pooling and batching
- ✅ Memory efficiency - Streaming and caching strategies
- ✅ Concurrency optimization - Parallel execution patterns
- ✅ Production monitoring - Dashboards and alerting
Key strategies:
- Optimize the event store with connection pooling and batching
- Use streaming for large datasets to minimize memory usage
- Implement parallel execution for independent commands
- Monitor performance continuously with metrics and alerts
- Profile and benchmark to identify bottlenecks
Performance is a journey, not a destination. Measure, optimize, and monitor continuously to ensure your EventCore applications scale effectively in production.
Next, let’s explore the Operations Guide →
Security
This section covers security best practices for building secure applications with EventCore.
Topics
- Overview - Security responsibilities and architecture
- Authentication & Authorization - Implementing access control
- Data Encryption - Protecting sensitive data
- Input Validation - Preventing injection attacks
- Compliance - Meeting regulatory requirements
Key Principles
- Defense in Depth - Multiple layers of security
- Least Privilege - Grant minimal necessary access
- Fail Secure - Default to denying access
- Audit Everything - Log security events
- Encrypt Sensitive Data - Protect data at rest and in transit
Security Guide
This guide covers security best practices when building applications with EventCore.
Overview
EventCore provides a solid foundation for secure applications through:
- Strong type safety that prevents many common vulnerabilities
- Immutable event storage providing natural audit trails
- Built-in concurrency control preventing data races
- Configurable resource limits preventing DoS attacks
However, EventCore is a library, not a complete application framework. Security responsibilities are shared between EventCore and your application code.
What EventCore Provides
Type Safety
- Validated domain types using nutype prevent injection attacks
- Exhaustive pattern matching eliminates undefined behavior
- Memory safety guaranteed by Rust
Concurrency Control
- Optimistic locking prevents lost updates
- Version checking ensures consistency
- Atomic multi-stream operations maintain integrity
Resource Protection
- Configurable timeouts prevent runaway operations
- Batch size limits prevent memory exhaustion
- Retry limits prevent infinite loops
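The validation-at-construction idea can be sketched without any dependencies. The hand-rolled newtype below illustrates what a nutype-style validated domain type buys you (the name StreamName and its specific rules are invented for this example): construction is only possible through try_new, so any value of the type satisfies its invariants and malformed input never reaches the event store.

```rust
use std::fmt;

/// Illustrative hand-rolled equivalent of a nutype-validated domain type.
#[derive(Debug, Clone, PartialEq)]
pub struct StreamName(String);

#[derive(Debug, PartialEq)]
pub enum StreamNameError {
    Empty,
    TooLong,
    InvalidChar(char),
}

impl StreamName {
    /// The only way to obtain a StreamName; rejects anything that
    /// violates the invariants.
    pub fn try_new(raw: &str) -> Result<Self, StreamNameError> {
        if raw.is_empty() {
            return Err(StreamNameError::Empty);
        }
        if raw.len() > 255 {
            return Err(StreamNameError::TooLong);
        }
        // Reject characters that could be meaningful to a downstream
        // query language or log format.
        if let Some(c) = raw
            .chars()
            .find(|c| !c.is_ascii_alphanumeric() && *c != '-' && *c != '_')
        {
            return Err(StreamNameError::InvalidChar(c));
        }
        Ok(Self(raw.to_string()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl fmt::Display for StreamName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}
```

Because the inner String is private, code elsewhere in the application cannot smuggle an unvalidated value into a StreamName.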
What You Must Implement
Authentication & Authorization
EventCore does not provide:
- User authentication
- Stream-level access control
- Command authorization
- Read model security
You must implement these at the application layer.
Data Protection
EventCore stores events as-is. You must:
- Encrypt sensitive data before storing
- Implement key management
- Handle data retention/deletion
- Ensure compliance with regulations
Input Validation
While EventCore validates its own types, you must:
- Validate all user input
- Sanitize data before processing
- Implement rate limiting
- Prevent abuse patterns
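Rate limiting belongs in this application layer. A minimal token-bucket sketch using only the standard library (the type and its API are illustrative, not part of EventCore):

```rust
use std::time::Instant;

/// Token bucket: each request spends one token; tokens refill over time
/// up to a fixed capacity, allowing short bursts but bounding sustained rate.
pub struct RateLimiter {
    capacity: u32,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl RateLimiter {
    pub fn new(capacity: u32, refill_per_sec: f64) -> Self {
        Self {
            capacity,
            tokens: capacity as f64,
            refill_per_sec,
            last_refill: Instant::now(),
        }
    }

    /// Returns true if the request is allowed, false if it should be rejected.
    pub fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.last_refill = now;
        // Refill proportionally to elapsed time, capped at capacity.
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity as f64);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

In practice you would keep one bucket per client identity (API key, IP, user) and reject or delay requests when try_acquire returns false.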
Security Layers
┌─────────────────────────────────────┐
│ Application Layer │
│ • Authentication │
│ • Authorization │
│ • Input Validation │
│ • Rate Limiting │
├─────────────────────────────────────┤
│ EventCore Layer │
│ • Type Safety │
│ • Concurrency Control │
│ • Resource Limits │
│ • Audit Trail │
├─────────────────────────────────────┤
│ Storage Layer │
│ • Encryption at Rest │
│ • Access Control │
│ • Backup Security │
│ • Network Security │
└─────────────────────────────────────┘
Next Steps
Authentication & Authorization
EventCore is authentication-agnostic but provides hooks for integrating your auth system.
Authentication Integration
Capturing User Identity
EventCore’s metadata system captures user identity for audit trails:
#![allow(unused)] fn main() { use eventcore::{CommandExecutor, UserId}; // Execute command with authenticated user let user_id = UserId::try_new("user@example.com")?; let result = executor .execute_as_user(command, user_id) .await?; }
Middleware Pattern
Implement authentication as middleware:
#![allow(unused)] fn main() { use axum::{ extract::State, http::StatusCode, middleware::Next, response::Response, }; async fn auth_middleware( State(auth): State<AuthService>, headers: HeaderMap, mut req: Request, next: Next, ) -> Result<Response, StatusCode> { // Extract and verify token let token = headers .get("Authorization") .and_then(|h| h.to_str().ok()) .ok_or(StatusCode::UNAUTHORIZED)?; let user = auth .verify_token(token) .await .map_err(|_| StatusCode::UNAUTHORIZED)?; // Add user to request extensions req.extensions_mut().insert(user); Ok(next.run(req).await) } }
Authorization Patterns
Stream-Level Authorization
Implement fine-grained access control:
#![allow(unused)] fn main() { #[async_trait] trait StreamAuthorization { async fn can_read(&self, user: &User, stream_id: &StreamId) -> bool; async fn can_write(&self, user: &User, stream_id: &StreamId) -> bool; } struct CommandAuthorizationLayer<A: StreamAuthorization> { auth: A, } impl<A: StreamAuthorization> CommandAuthorizationLayer<A> { async fn authorize_command( &self, command: &impl Command, user: &User, ) -> Result<(), AuthError> { // Check read permissions for stream_id in command.read_streams() { if !self.auth.can_read(user, &stream_id).await { return Err(AuthError::Forbidden(stream_id)); } } // Check write permissions for stream_id in command.write_streams() { if !self.auth.can_write(user, &stream_id).await { return Err(AuthError::Forbidden(stream_id)); } } Ok(()) } } }
Role-Based Access Control (RBAC)
#![allow(unused)] fn main() { #[derive(Debug, Clone)] enum Role { Admin, User, ReadOnly, } #[derive(Debug, Clone)] struct User { id: UserId, roles: Vec<Role>, } impl User { fn has_role(&self, role: &Role) -> bool { self.roles.contains(role) } fn can_execute_command(&self, command_type: &str) -> bool { match command_type { "CreateAccount" => self.has_role(&Role::Admin), "UpdateAccount" => { self.has_role(&Role::Admin) || self.has_role(&Role::User) } "ViewAccount" => true, // All authenticated users _ => false, } } } }
Attribute-Based Access Control (ABAC)
#![allow(unused)] fn main() { #[derive(Debug)] struct AccessContext { user: User, resource: Resource, action: Action, environment: Environment, } #[async_trait] trait AccessPolicy { async fn evaluate(&self, context: &AccessContext) -> Decision; } struct AbacAuthorizer { policies: Vec<Box<dyn AccessPolicy>>, } impl AbacAuthorizer { async fn authorize(&self, context: AccessContext) -> Result<(), AuthError> { for policy in &self.policies { match policy.evaluate(&context).await { Decision::Deny(reason) => { return Err(AuthError::PolicyDenied(reason)); } Decision::Allow => continue, } } Ok(()) } } }
Projection Security
Row-Level Security
Filter projections based on user permissions:
#![allow(unused)] fn main() { #[async_trait] impl ReadModelStore for SecureAccountStore { async fn get_account( &self, account_id: &AccountId, user: &User, ) -> Result<Option<AccountReadModel>> { let account = self.inner.get_account(account_id).await?; // Apply row-level security match account { Some(acc) if self.user_can_view(&acc, user) => Ok(Some(acc)), _ => Ok(None), } } async fn list_accounts( &self, user: &User, filter: AccountFilter, ) -> Result<Vec<AccountReadModel>> { let accounts = self.inner.list_accounts(filter).await?; // Filter based on permissions Ok(accounts .into_iter() .filter(|acc| self.user_can_view(acc, user)) .collect()) } } }
Field-Level Security
Redact sensitive fields:
#![allow(unused)] fn main() { impl AccountReadModel { fn redact_for_user(&self, user: &User) -> Self { let mut redacted = self.clone(); if !user.has_role(&Role::Admin) { redacted.ssn = None; redacted.tax_id = None; } if !user.has_role(&Role::Financial) { redacted.balance = None; redacted.credit_limit = None; } redacted } } }
Best Practices
- Fail Secure: Default to denying access
- Audit Everything: Log all authorization decisions
- Minimize Privileges: Grant only necessary permissions
- Separate Concerns: Keep auth logic separate from business logic
- Token Expiry: Implement short-lived tokens with refresh
- Rate Limiting: Prevent brute force attacks
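Two of these practices, failing secure and short-lived tokens, fit in a few lines of std-only Rust. All names here are illustrative, not an EventCore API: every code path that is not an explicit, in-date token ends in a denial.

```rust
use std::time::{Duration, SystemTime};

/// Illustrative short-lived access token: expiry is checked on every use.
pub struct AccessToken {
    pub subject: String,
    pub expires_at: SystemTime,
}

impl AccessToken {
    pub fn issue(subject: &str, ttl: Duration) -> Self {
        Self {
            subject: subject.to_string(),
            expires_at: SystemTime::now() + ttl,
        }
    }

    pub fn is_valid(&self, now: SystemTime) -> bool {
        now < self.expires_at
    }
}

/// Fail secure: the only Ok path is a present, unexpired token;
/// everything else (missing, expired) is denied.
pub fn authorize(token: Option<&AccessToken>, now: SystemTime) -> Result<&str, &'static str> {
    match token {
        Some(t) if t.is_valid(now) => Ok(&t.subject),
        Some(_) => Err("token expired"),
        None => Err("no token presented"),
    }
}
```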
Common Pitfalls
- Not checking permissions on read models
- Forgetting to validate token expiry
- Exposing internal IDs that enable enumeration
- Not rate limiting authentication attempts
- Storing permissions in events (they change over time)
Data Encryption
Events are immutable and permanent. Encrypt sensitive data before storing it.
Encryption Strategies
Field-Level Encryption
Encrypt individual fields containing sensitive data:
#![allow(unused)] fn main() { use aes_gcm::{ aead::{Aead, AeadCore, KeyInit, OsRng}, Aes256Gcm, Nonce, }; use serde::{Deserialize, Serialize}; #[derive(Debug, Serialize, Deserialize)] struct EncryptedField { ciphertext: Vec<u8>, nonce: Vec<u8>, key_id: String, // For key rotation } impl EncryptedField { fn encrypt( plaintext: &str, key: &[u8; 32], key_id: String, ) -> Result<Self, EncryptionError> { let cipher = Aes256Gcm::new(key.into()); // AES-GCM is only secure with a unique nonce per encryption, // so generate one randomly rather than hardcoding it let nonce = Aes256Gcm::generate_nonce(&mut OsRng); let ciphertext = cipher .encrypt(&nonce, plaintext.as_bytes()) .map_err(|_| EncryptionError::EncryptionFailed)?; Ok(Self { ciphertext, nonce: nonce.to_vec(), key_id, }) } fn decrypt(&self, key: &[u8; 32]) -> Result<String, EncryptionError> { let cipher = Aes256Gcm::new(key.into()); let nonce = Nonce::from_slice(&self.nonce); let plaintext = cipher .decrypt(nonce, self.ciphertext.as_ref()) .map_err(|_| EncryptionError::DecryptionFailed)?; String::from_utf8(plaintext) .map_err(|_| EncryptionError::InvalidUtf8) } } }
Event Payload Encryption
Encrypt entire event payloads:
#![allow(unused)] fn main() { #[derive(Debug, Serialize, Deserialize)] #[serde(tag = "type")] enum SecureEvent { #[serde(rename = "encrypted")] Encrypted { payload: EncryptedField, event_type: String, }, // Non-sensitive events can remain unencrypted SystemEvent(SystemEvent), } impl SecureEvent { fn encrypt_event<E: Serialize>( event: E, event_type: String, key: &[u8; 32], key_id: String, ) -> Result<Self, EncryptionError> { let json = serde_json::to_string(&event)?; let encrypted = EncryptedField::encrypt(&json, key, key_id)?; Ok(Self::Encrypted { payload: encrypted, event_type, }) } } }
Key Management
Key Storage
Never store encryption keys in:
- Source code
- Configuration files
- Environment variables (in production)
- Event payloads
Use proper key management:
- AWS KMS
- Azure Key Vault
- HashiCorp Vault
- Hardware Security Modules (HSM)
Key Rotation
Support key rotation without re-encrypting historical data:
#![allow(unused)] fn main() { struct KeyManager { current_key_id: String, keys: HashMap<String, Key>, } impl KeyManager { fn encrypt(&self, data: &str) -> Result<EncryptedField, Error> { let key = self.keys .get(&self.current_key_id) .ok_or(Error::KeyNotFound)?; EncryptedField::encrypt(data, &key.material, self.current_key_id.clone()) } fn decrypt(&self, field: &EncryptedField) -> Result<String, Error> { // Use the key ID stored with the encrypted data let key = self.keys .get(&field.key_id) .ok_or(Error::KeyNotFound)?; field.decrypt(&key.material) } } }
Encryption Patterns
Deterministic Encryption
For fields that need to be searchable, store a deterministic keyed hash alongside the encrypted value; the hash supports equality lookups but is not reversible:
#![allow(unused)] fn main() { use sha2::{Sha256, Digest}; fn deterministic_encrypt( plaintext: &str, key: &[u8; 32], ) -> String { let mut hasher = Sha256::new(); hasher.update(key); hasher.update(plaintext.as_bytes()); base64::encode(hasher.finalize()) } // Usage in events #[derive(Serialize, Deserialize)] struct UserRegistered { user_id: UserId, email_hash: String, // For lookups encrypted_email: EncryptedField, // Actual email } }
Tokenization
Replace sensitive data with tokens:
#![allow(unused)] fn main() { #[derive(Debug, Clone)] struct Token(String); trait TokenVault { async fn tokenize(&self, value: &str) -> Result<Token, Error>; async fn detokenize(&self, token: &Token) -> Result<String, Error>; } // Store tokens in events instead of sensitive data #[derive(Serialize, Deserialize)] struct PaymentProcessed { payment_id: PaymentId, card_token: Token, // Not the actual card number amount: Money, } }
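A minimal in-memory vault makes the pattern concrete. A real tokenization service keeps the vault in separate hardened storage; this std-only sketch (all names invented) just shows that events can carry opaque tokens while the sensitive value never leaves the vault:

```rust
use std::collections::HashMap;

/// Illustrative in-memory token vault: sensitive values live only here,
/// and events store opaque, non-derivable tokens.
#[derive(Default)]
pub struct InMemoryTokenVault {
    counter: u64,
    by_token: HashMap<String, String>,
}

impl InMemoryTokenVault {
    pub fn tokenize(&mut self, value: &str) -> String {
        self.counter += 1;
        // The token carries no information about the value it replaces.
        let token = format!("tok_{:016x}", self.counter);
        self.by_token.insert(token.clone(), value.to_string());
        token
    }

    pub fn detokenize(&self, token: &str) -> Option<&str> {
        self.by_token.get(token).map(String::as_str)
    }
}
```

Only components with access to the vault can resolve a token back to the original value; everything else, including the event store, sees meaningless identifiers.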
Compliance Considerations
GDPR - Right to Erasure
Since events are immutable, implement crypto-shredding:
#![allow(unused)] fn main() { impl KeyManager { async fn shred_user_data(&mut self, user_id: &UserId) -> Result<(), Error> { // Delete user-specific encryption keys self.user_keys.remove(user_id); // Events remain but are now unreadable Ok(()) } } }
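The mechanic is easy to demonstrate end to end. The sketch below uses a toy XOR cipher purely as a stand-in for real AES-GCM (never use XOR for actual encryption); the point is that deleting the per-user key renders the untouched, immutable ciphertext permanently unreadable:

```rust
use std::collections::HashMap;

/// Toy stand-in for a real cipher; XOR is symmetric, so the same
/// function encrypts and decrypts. For illustration only.
fn xor_cipher(data: &[u8], key: &[u8]) -> Vec<u8> {
    data.iter().zip(key.iter().cycle()).map(|(d, k)| d ^ k).collect()
}

#[derive(Default)]
struct UserKeyStore {
    keys: HashMap<String, Vec<u8>>,
}

impl UserKeyStore {
    fn encrypt_for(&mut self, user_id: &str, plaintext: &[u8]) -> Vec<u8> {
        let key = self
            .keys
            .entry(user_id.to_string())
            .or_insert_with(|| b"per-user-secret".to_vec());
        xor_cipher(plaintext, key)
    }

    fn decrypt_for(&self, user_id: &str, ciphertext: &[u8]) -> Option<Vec<u8>> {
        self.keys.get(user_id).map(|key| xor_cipher(ciphertext, key))
    }

    /// GDPR erasure: drop the key. The immutable events remain,
    /// but can never be decrypted again.
    fn shred(&mut self, user_id: &str) {
        self.keys.remove(user_id);
    }
}
```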
PCI DSS
Never store in events:
- Full credit card numbers
- CVV/CVC codes
- PIN numbers
- Magnetic stripe data
HIPAA
Encrypt all Protected Health Information (PHI):
- Patient names
- Medical record numbers
- Health conditions
- Treatment information
Performance Considerations
- Batch Operations: Encrypt/decrypt in batches when possible
- Caching: Cache decrypted data with appropriate TTLs
- Async Operations: Use async encryption for better throughput
- Hardware Acceleration: Use AES-NI when available
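The caching point above can be sketched as a small TTL cache for decrypted values. This is an illustrative `DecryptCache` (names and structure are assumptions, not EventCore API); a production version would also bound memory and zeroize evicted plaintext:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Caches plaintext for already-decrypted fields, keyed by field ID.
/// Entries expire after `ttl` so plaintext does not linger indefinitely.
struct DecryptCache {
    ttl: Duration,
    entries: HashMap<String, (String, Instant)>,
}

impl DecryptCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the cached plaintext, or runs `decrypt` and caches the result.
    fn get_or_decrypt<F>(&mut self, field_id: &str, decrypt: F) -> String
    where
        F: FnOnce() -> String,
    {
        let now = Instant::now();
        // Serve from cache only while the entry is still fresh.
        if let Some((value, inserted)) = self.entries.get(field_id) {
            if now.duration_since(*inserted) < self.ttl {
                return value.clone();
            }
        }
        let plaintext = decrypt();
        self.entries.insert(field_id.to_string(), (plaintext.clone(), now));
        plaintext
    }
}
```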
Example: Secure User Events
use eventcore::Event;

#[derive(Debug, Serialize, Deserialize)]
struct SecureUserEvent {
    #[serde(flatten)]
    base: Event,
    #[serde(flatten)]
    payload: SecureUserPayload,
}

#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
enum SecureUserPayload {
    UserRegistered {
        user_id: UserId,
        username: String,              // Public
        email_hash: String,            // For lookups
        encrypted_pii: EncryptedField, // Name, email, phone
    },
    ProfileUpdated {
        user_id: UserId,
        changes: Vec<ProfileChange>,
        encrypted_changes: Option<EncryptedField>,
    },
}

// Helper for building secure events
struct SecureEventBuilder<'a> {
    crypto: &'a CryptoService,
}

impl<'a> SecureEventBuilder<'a> {
    async fn user_registered(
        &self,
        user_id: UserId,
        username: String,
        email: String,
        pii: PersonalInfo,
    ) -> Result<SecureUserEvent, Error> {
        let email_hash = self.crypto.hash_email(&email);
        let encrypted_pii = self.crypto.encrypt_pii(&pii).await?;

        Ok(SecureUserEvent {
            base: Event::new(),
            payload: SecureUserPayload::UserRegistered {
                user_id,
                username,
                email_hash,
                encrypted_pii,
            },
        })
    }
}
Input Validation
Proper input validation prevents injection attacks and data corruption.
Validation Layers
1. API Layer Validation
Validate at the edge before data enters your system:
use axum::{extract::Json, http::StatusCode, response::IntoResponse};
use validator::{Validate, ValidationError};

#[derive(Debug, Deserialize, Validate)]
struct CreateUserRequest {
    #[validate(length(min = 3, max = 50))]
    username: String,
    #[validate(email)]
    email: String,
    #[validate(length(min = 8), custom = "validate_password_strength")]
    password: String,
    #[validate(range(min = 13, max = 120))]
    age: u8,
}

fn validate_password_strength(password: &str) -> Result<(), ValidationError> {
    let has_uppercase = password.chars().any(|c| c.is_uppercase());
    let has_lowercase = password.chars().any(|c| c.is_lowercase());
    let has_digit = password.chars().any(|c| c.is_ascii_digit());
    let has_special = password.chars().any(|c| !c.is_alphanumeric());

    if !(has_uppercase && has_lowercase && has_digit && has_special) {
        return Err(ValidationError::new("weak_password"));
    }
    Ok(())
}

async fn create_user(
    Json(request): Json<CreateUserRequest>,
) -> Result<impl IntoResponse, StatusCode> {
    // Run the declarative validations; `validator` does not validate
    // during deserialization, so call `validate()` explicitly.
    request.validate().map_err(|_| StatusCode::BAD_REQUEST)?;

    // Continue with validated data...
    Ok(StatusCode::CREATED)
}
2. Domain Type Validation
Use nutype for domain-level validation:
use nutype::nutype;

#[nutype(
    sanitize(trim, lowercase),
    validate(len_char_min = 3, len_char_max = 50, regex = r"^[a-z0-9_]+$"),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct Username(String);

#[nutype(
    sanitize(trim),
    validate(regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct Email(String);

#[nutype(
    validate(greater_or_equal = 0, less_or_equal = 1_000_000),
    derive(Debug, Clone, Copy, Serialize, Deserialize)
)]
pub struct Money(u64); // In cents

// Usage
let username = Username::try_new("JohnDoe123").map_err(|_| "Invalid username")?;
let email = Email::try_new("john@example.com").map_err(|_| "Invalid email")?;
3. Command Validation
Validate business rules in commands:
use eventcore::{Command, CommandError, require};

#[derive(Debug, Clone)]
struct TransferMoney {
    from_account: AccountId,
    to_account: AccountId,
    amount: Money,
}

impl TransferMoney {
    fn new(from: AccountId, to: AccountId, amount: Money) -> Result<Self, ValidationError> {
        // Validate at construction
        if from == to {
            return Err(ValidationError::SameAccount);
        }
        if amount.is_zero() {
            return Err(ValidationError::ZeroAmount);
        }
        Ok(Self {
            from_account: from,
            to_account: to,
            amount,
        })
    }
}

#[async_trait]
impl CommandLogic for TransferMoney {
    async fn handle(&self, state: State) -> CommandResult<Vec<Event>> {
        // Business rule validation
        require!(
            state.from_balance >= self.amount,
            CommandError::InsufficientFunds
        );
        require!(
            state.to_account.is_active(),
            CommandError::AccountInactive
        );
        require!(
            self.amount <= state.daily_limit_remaining,
            CommandError::DailyLimitExceeded
        );

        // Proceed with valid transfer...
        Ok(vec![/* events */])
    }
}
Sanitization Patterns
HTML/Script Injection Prevention
use ammonia::clean;

#[nutype(
    sanitize(trim, with = sanitize_html),
    validate(len_char_max = 1000),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct SafeHtml(String);

fn sanitize_html(input: &str) -> String {
    // Remove dangerous HTML/JS
    clean(input)
}

// For plain text fields
#[nutype(
    sanitize(trim, with = escape_html),
    validate(len_char_max = 500),
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct DisplayName(String);

fn escape_html(input: &str) -> String {
    input
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
        .replace('\'', "&#x27;")
}
SQL Injection Prevention
EventCore uses parameterized queries via sqlx, but validate data types:
#[nutype(
    sanitize(trim),
    validate(regex = r"^[a-zA-Z0-9_]+$"), // Alphanumeric + underscore only
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct TableName(String);

#[nutype(
    sanitize(trim),
    validate(regex = r"^[a-zA-Z_][a-zA-Z0-9_]*$"), // Valid identifier
    derive(Debug, Clone, Serialize, Deserialize)
)]
pub struct ColumnName(String);
Rate Limiting
Protect against abuse:
use std::collections::HashMap;
use std::sync::Arc;
use std::time::{Duration, Instant};
use tokio::sync::Mutex;

struct RateLimiter {
    limits: Arc<Mutex<HashMap<String, Vec<Instant>>>>,
    max_requests: usize,
    window: Duration,
}

impl RateLimiter {
    async fn check_rate_limit(&self, key: &str) -> Result<(), RateLimitError> {
        let mut limits = self.limits.lock().await;
        let now = Instant::now();
        let requests = limits.entry(key.to_string()).or_default();

        // Remove old requests outside window
        requests.retain(|&time| now.duration_since(time) < self.window);

        if requests.len() >= self.max_requests {
            return Err(RateLimitError::TooManyRequests);
        }

        requests.push(now);
        Ok(())
    }
}

// Apply to commands
async fn execute_command(
    command: Command,
    user_id: UserId,
    rate_limiter: &RateLimiter,
) -> Result<(), Error> {
    // Rate limit by user
    rate_limiter.check_rate_limit(&user_id.to_string()).await?;

    // Rate limit by IP for anonymous operations
    // rate_limiter.check_rate_limit(&ip_address).await?;

    executor.execute(command).await
}
File Upload Validation
use tokio::io::{AsyncRead, AsyncReadExt};

#[derive(Debug)]
struct FileValidator {
    max_size: usize,
    allowed_types: Vec<String>,
}

impl FileValidator {
    async fn validate_upload(
        &self,
        mut file: impl AsyncRead + Unpin,
        content_type: &str,
    ) -> Result<Vec<u8>, ValidationError> {
        // Check content type
        if !self.allowed_types.contains(&content_type.to_string()) {
            return Err(ValidationError::InvalidFileType);
        }

        // Read and check size
        let mut buffer = Vec::new();
        let bytes_read = file
            .take(self.max_size as u64 + 1)
            .read_to_end(&mut buffer)
            .await?;

        if bytes_read > self.max_size {
            return Err(ValidationError::FileTooLarge);
        }

        // Verify file magic numbers
        if !self.verify_file_signature(&buffer, content_type) {
            return Err(ValidationError::InvalidFileContent);
        }

        Ok(buffer)
    }

    fn verify_file_signature(&self, data: &[u8], content_type: &str) -> bool {
        match content_type {
            "image/jpeg" => data.starts_with(&[0xFF, 0xD8, 0xFF]),
            "image/png" => data.starts_with(&[0x89, 0x50, 0x4E, 0x47]),
            "application/pdf" => data.starts_with(b"%PDF"),
            _ => true, // Add more as needed
        }
    }
}
Validation Best Practices
- Validate Early: At system boundaries
- Fail Fast: Return errors immediately
- Be Specific: Provide clear error messages
- Whitelist, Don’t Blacklist: Define what’s allowed
- Layer Defense: Validate at multiple levels
- Log Violations: Track validation failures
Common Mistakes
- Trusting client-side validation
- Not validating after deserialization
- Weak regex patterns
- Not checking array/collection sizes
- Forgetting to validate optional fields
- Not escaping output data
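Two of the mistakes above, unchecked collection sizes and unvalidated optional fields, are easy to guard against explicitly. The helper below is illustrative (its name and limits are assumptions, not EventCore API):

```rust
/// Validates a batch request: bounds the collection size up front and
/// checks the optional field when present instead of skipping it.
fn validate_batch(ids: &[u64], note: Option<&str>) -> Result<(), String> {
    // Reject empty or oversized collections before any per-item work.
    if ids.is_empty() || ids.len() > 100 {
        return Err(format!("batch size must be 1..=100, got {}", ids.len()));
    }
    // Optional fields still need validation when supplied.
    if let Some(n) = note {
        if n.len() > 500 {
            return Err("note exceeds 500 characters".to_string());
        }
    }
    Ok(())
}
```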
Compliance
EventCore’s immutable audit trail helps with compliance, but you must implement specific controls.
📋 Comprehensive Compliance Checklist
For a detailed compliance checklist covering OWASP, NIST, SOC2, PCI DSS, GDPR, and HIPAA requirements, see our COMPLIANCE_CHECKLIST.md.
This checklist provides actionable items for achieving compliance with major security frameworks and regulations.
GDPR Compliance
Data Protection Principles
- Lawfulness: Store only data with legal basis
- Purpose Limitation: Use data only for stated purposes
- Data Minimization: Store only necessary data
- Accuracy: Provide mechanisms to correct data
- Storage Limitation: Implement retention policies
- Security: Encrypt and protect personal data
Right to Erasure (Right to be Forgotten)
Since events are immutable, use crypto-shredding:
use std::collections::HashMap;
use uuid::Uuid;

struct GdprCompliantEventStore {
    event_store: Box<dyn EventStore>,
    key_vault: Box<dyn KeyVault>,
    user_keys: HashMap<UserId, KeyId>,
}

impl GdprCompliantEventStore {
    async fn forget_user(&mut self, user_id: UserId) -> Result<(), Error> {
        // 1. Delete user's encryption key
        if let Some(key_id) = self.user_keys.remove(&user_id) {
            self.key_vault.delete_key(key_id).await?;
        }

        // 2. Store erasure event for audit
        let erasure_event = UserDataErased {
            user_id: user_id.clone(),
            erased_at: Timestamp::now(),
            reason: "GDPR Article 17 Request".to_string(),
        };
        self.event_store
            .append_events(
                &StreamId::from_user(&user_id),
                vec![Event::from(erasure_event)],
            )
            .await?;

        // 3. Events remain but PII is now unreadable
        Ok(())
    }
}
Data Portability
Export user data in machine-readable format:
#[async_trait]
trait GdprExport {
    async fn export_user_data(&self, user_id: UserId) -> Result<UserDataExport, Error>;
}

#[derive(Serialize)]
struct UserDataExport {
    user_id: UserId,
    export_date: Timestamp,
    profile: UserProfile,
    events: Vec<UserEvent>,
    projections: HashMap<String, Value>,
}

impl EventStore {
    async fn export_user_events(&self, user_id: &UserId) -> Result<Vec<UserEvent>, Error> {
        // Collect all events related to user
        let streams = self.find_user_streams(user_id).await?;
        let mut events = Vec::new();

        for stream_id in streams {
            let stream_events = self.read_stream(&stream_id).await?;
            events.extend(
                stream_events
                    .into_iter()
                    .filter(|e| e.involves_user(user_id))
                    .map(|e| e.decrypt_for_export()),
            );
        }

        Ok(events)
    }
}
PCI DSS Compliance
Never Store in Events
// BAD - Never do this
#[derive(Serialize, Deserialize)]
struct PaymentProcessed {
    card_number: String, // NEVER!
    cvv: String,         // NEVER!
    pin: String,         // NEVER!
}

// GOOD - Store only tokens
#[derive(Serialize, Deserialize)]
struct PaymentProcessed {
    payment_id: PaymentId,
    card_token: CardToken, // From PCI-compliant tokenizer
    last_four: String,     // "****1234"
    amount: Money,
    merchant_ref: String,
}
Audit Requirements
struct PciAuditLogger {
    logger: Box<dyn AuditLogger>,
}

impl PciAuditLogger {
    async fn log_payment_access(
        &self,
        user: &User,
        action: PaymentAction,
        resource: &str,
    ) -> Result<(), Error> {
        let entry = AuditEntry {
            timestamp: Timestamp::now(),
            user_id: user.id.clone(),
            action: action.to_string(),
            resource: resource.to_string(),
            ip_address: user.ip_address.clone(),
            success: true,
        };
        self.logger.log(entry).await
    }
}
HIPAA Compliance
Protected Health Information (PHI)
Always encrypt PHI:
#[derive(Serialize, Deserialize)]
struct PatientRecord {
    patient_id: PatientId,
    // All PHI must be encrypted
    encrypted_name: EncryptedField,
    encrypted_ssn: EncryptedField,
    encrypted_diagnosis: EncryptedField,
    encrypted_medications: EncryptedField,
    // Non-PHI can be unencrypted
    admission_date: Date,
    room_number: String,
}

struct HipaaCompliantStore {
    encryption: EncryptionService,
    audit: AuditService,
}

impl HipaaCompliantStore {
    async fn store_patient_event(
        &self,
        event: PatientEvent,
        accessed_by: UserId,
    ) -> Result<(), Error> {
        // Audit the access
        self.audit
            .log_phi_access(&accessed_by, &event.patient_id(), "WRITE")
            .await?;

        // Encrypt and store
        let encrypted = self.encryption.encrypt_event(event)?;
        self.event_store.append(encrypted).await?;
        Ok(())
    }
}
Access Controls
#[derive(Debug, Clone)]
enum HipaaRole {
    Doctor,
    Nurse,
    Admin,
    Billing,
}

impl HipaaRole {
    fn can_access_phi(&self) -> bool {
        matches!(self, HipaaRole::Doctor | HipaaRole::Nurse)
    }

    fn can_access_billing(&self) -> bool {
        matches!(self, HipaaRole::Admin | HipaaRole::Billing)
    }
}
SOX Compliance
Financial Controls
struct SoxCompliantExecutor {
    executor: CommandExecutor,
    approvals: ApprovalService,
}

impl SoxCompliantExecutor {
    async fn execute_financial_command(
        &self,
        command: FinancialCommand,
        requester: User,
    ) -> Result<(), Error> {
        // Segregation of duties
        if command.amount() > Money::from_dollars(10_000) {
            let approver = self.approvals.get_approver(&requester).await?;
            self.approvals.request_approval(&command, &approver).await?;
        }

        // Execute with full audit trail
        let result = self.executor
            .execute_with_metadata(
                command,
                metadata! {
                    "sox_requester" => requester.id,
                    "sox_timestamp" => Timestamp::now(),
                    "sox_ip" => requester.ip_address,
                },
            )
            .await?;

        Ok(result)
    }
}
General Compliance Features
Audit Trail
#[derive(Debug, Serialize)]
struct ComplianceAuditEntry {
    timestamp: Timestamp,
    event_id: EventId,
    stream_id: StreamId,
    user_id: UserId,
    action: String,
    regulation: String, // "GDPR", "PCI", "HIPAA"
    details: HashMap<String, String>,
}

trait ComplianceAuditor {
    async fn log_access(&self, entry: ComplianceAuditEntry) -> Result<(), Error>;
    async fn generate_report(
        &self,
        regulation: &str,
        from: Date,
        to: Date,
    ) -> Result<ComplianceReport, Error>;
}
Data Retention
struct RetentionPolicy {
    regulation: String,
    data_type: String,
    retention_days: u32,
    action: RetentionAction,
}

enum RetentionAction {
    Delete,
    Archive,
    Anonymize,
}

struct RetentionManager {
    policies: Vec<RetentionPolicy>,
}

impl RetentionManager {
    async fn apply_retention(&self, event_store: &EventStore) -> Result<(), Error> {
        for policy in &self.policies {
            let cutoff = Timestamp::now() - Duration::days(policy.retention_days);
            match policy.action {
                RetentionAction::Delete => {
                    // For GDPR compliance
                    self.crypto_shred_old_data(cutoff).await?;
                }
                RetentionAction::Archive => {
                    // Move to cold storage
                    self.archive_old_events(cutoff).await?;
                }
                RetentionAction::Anonymize => {
                    // Remove PII but keep analytics data
                    self.anonymize_old_events(cutoff).await?;
                }
            }
        }
        Ok(())
    }
}
Compliance Checklist
- Implement encryption for all PII/PHI
- Set up audit logging for all access
- Configure data retention policies
- Implement right to erasure (GDPR)
- Set up data export capabilities
- Configure access controls (RBAC/ABAC)
- Implement approval workflows (SOX)
- Set up monitoring and alerting
- Document all compliance measures
- Regular compliance audits
Part 6: Operations
This part covers the operational aspects of running EventCore applications in production. From deployment strategies to monitoring, backup, and troubleshooting, you’ll learn how to operate EventCore systems reliably at scale.
Chapters in This Part
- Deployment Strategies - Production deployment patterns
- Monitoring and Metrics - Observability and performance tracking
- Backup and Recovery - Data protection and disaster recovery
- Troubleshooting - Debugging and problem resolution
- Production Checklist - Go-live validation and best practices
What You’ll Learn
- Deploy EventCore applications safely
- Monitor system health and performance
- Implement backup and recovery procedures
- Troubleshoot common production issues
- Validate production readiness
Prerequisites
- Completed Parts 1-5
- Basic understanding of production deployments
- Familiarity with containerization and orchestration
- Knowledge of monitoring and logging concepts
Target Audience
- DevOps engineers
- Site reliability engineers
- Platform engineers
- Senior developers responsible for production systems
Time to Complete
- Reading: ~45 minutes
- With implementation: ~6 hours
Ready to learn production operations? Let’s start with Deployment Strategies →
Chapter 6.1: Deployment Strategies
EventCore applications require careful deployment planning to ensure high availability, data consistency, and smooth rollouts. This chapter covers production-ready deployment patterns and strategies.
Container-Based Deployment
Docker Configuration
EventCore applications containerize well with proper configuration:
# Multi-stage build for optimized production image
FROM rust:1.87-slim AS builder
WORKDIR /usr/src/app
COPY Cargo.toml Cargo.lock ./
COPY src ./src
# Build with release optimizations
RUN cargo build --release --locked
# Runtime image
FROM debian:bookworm-slim
# Install runtime dependencies
# Install runtime dependencies (curl is required by the HEALTHCHECK below)
RUN apt-get update && apt-get install -y \
    ca-certificates \
    curl \
    libssl3 \
    && rm -rf /var/lib/apt/lists/*
# Create non-root user
RUN useradd -r -s /bin/false eventcore
# Copy application
COPY --from=builder /usr/src/app/target/release/eventcore-app /usr/local/bin/
RUN chmod +x /usr/local/bin/eventcore-app
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1
USER eventcore
EXPOSE 8080
CMD ["eventcore-app"]
Environment Configuration
Use environment variables for configuration:
# Database configuration
DATABASE_URL=postgresql://user:pass@db:5432/eventcore
DATABASE_MAX_CONNECTIONS=20
DATABASE_ACQUIRE_TIMEOUT=30s
# Application configuration
HTTP_PORT=8080
LOG_LEVEL=info
LOG_FORMAT=json
# Performance tuning
COMMAND_TIMEOUT=30s
EVENT_BATCH_SIZE=100
PROJECTION_WORKERS=4
# Security
JWT_SECRET_KEY=/run/secrets/jwt_key
CORS_ALLOWED_ORIGINS=https://myapp.com
# Monitoring
METRICS_PORT=9090
TRACING_ENDPOINT=http://jaeger:14268/api/traces
HEALTH_CHECK_INTERVAL=30s
Docker Compose for Development
version: '3.8'
services:
eventcore-app:
build: .
ports:
- "8080:8080"
- "9090:9090"
environment:
DATABASE_URL: postgresql://postgres:password@postgres:5432/eventcore
LOG_LEVEL: debug
METRICS_PORT: 9090
depends_on:
postgres:
condition: service_healthy
networks:
- eventcore
restart: unless-stopped
postgres:
image: postgres:17-alpine
environment:
POSTGRES_DB: eventcore
POSTGRES_USER: postgres
POSTGRES_PASSWORD: password
ports:
- "5432:5432"
volumes:
- postgres_data:/var/lib/postgresql/data
- ./migrations:/docker-entrypoint-initdb.d
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 10s
timeout: 5s
retries: 5
networks:
- eventcore
prometheus:
image: prom/prometheus:latest
ports:
- "9091:9090"
volumes:
- ./config/prometheus.yml:/etc/prometheus/prometheus.yml
- prometheus_data:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
- '--web.console.libraries=/etc/prometheus/console_libraries'
- '--web.console.templates=/etc/prometheus/consoles'
- '--web.enable-lifecycle'
networks:
- eventcore
grafana:
image: grafana/grafana:latest
ports:
- "3000:3000"
environment:
GF_SECURITY_ADMIN_PASSWORD: admin
volumes:
- grafana_data:/var/lib/grafana
- ./config/grafana:/etc/grafana/provisioning
networks:
- eventcore
volumes:
postgres_data:
prometheus_data:
grafana_data:
networks:
eventcore:
driver: bridge
Kubernetes Deployment
Application Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: eventcore-app
namespace: eventcore
labels:
app: eventcore
component: application
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
selector:
matchLabels:
app: eventcore
component: application
template:
metadata:
labels:
app: eventcore
component: application
spec:
serviceAccountName: eventcore
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
containers:
- name: eventcore-app
image: eventcore:latest
imagePullPolicy: Always
ports:
- containerPort: 8080
name: http
- containerPort: 9090
name: metrics
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: eventcore-secrets
key: database-url
- name: JWT_SECRET_KEY
valueFrom:
secretKeyRef:
name: eventcore-secrets
key: jwt-secret
envFrom:
- configMapRef:
name: eventcore-config
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
volumeMounts:
- name: config
mountPath: /etc/eventcore
readOnly: true
volumes:
- name: config
configMap:
name: eventcore-config
---
apiVersion: v1
kind: Service
metadata:
name: eventcore-service
namespace: eventcore
labels:
app: eventcore
component: application
spec:
type: ClusterIP
ports:
- port: 80
targetPort: 8080
protocol: TCP
name: http
- port: 9090
targetPort: 9090
protocol: TCP
name: metrics
selector:
app: eventcore
component: application
---
apiVersion: v1
kind: ConfigMap
metadata:
name: eventcore-config
namespace: eventcore
data:
HTTP_PORT: "8080"
METRICS_PORT: "9090"
LOG_LEVEL: "info"
LOG_FORMAT: "json"
COMMAND_TIMEOUT: "30s"
EVENT_BATCH_SIZE: "100"
PROJECTION_WORKERS: "4"
HEALTH_CHECK_INTERVAL: "30s"
---
apiVersion: v1
kind: Secret
metadata:
name: eventcore-secrets
namespace: eventcore
type: Opaque
data:
database-url: <base64-encoded-database-url>
jwt-secret: <base64-encoded-jwt-secret>
Database Configuration
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: postgres-cluster
namespace: eventcore
spec:
instances: 3
primaryUpdateStrategy: unsupervised
postgresql:
parameters:
max_connections: "200"
shared_buffers: "256MB"
effective_cache_size: "1GB"
maintenance_work_mem: "64MB"
checkpoint_completion_target: "0.9"
wal_buffers: "16MB"
default_statistics_target: "100"
random_page_cost: "1.1"
effective_io_concurrency: "200"
bootstrap:
initdb:
database: eventcore
owner: eventcore
secret:
name: postgres-credentials
storage:
size: 100Gi
storageClass: fast-ssd
monitoring:
enabled: true
backup:
target: prefer-standby
retentionPolicy: "30d"
data:
compression: gzip
encryption: AES256
jobs: 2
wal:
compression: gzip
encryption: AES256
---
apiVersion: v1
kind: Secret
metadata:
name: postgres-credentials
namespace: eventcore
type: kubernetes.io/basic-auth
data:
username: <base64-encoded-username>
password: <base64-encoded-password>
Ingress Configuration
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: eventcore-ingress
namespace: eventcore
annotations:
kubernetes.io/ingress.class: nginx
cert-manager.io/cluster-issuer: letsencrypt-prod
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
nginx.ingress.kubernetes.io/rate-limit: "100"
nginx.ingress.kubernetes.io/rate-limit-window: "1m"
spec:
tls:
- hosts:
- api.eventcore.example.com
secretName: eventcore-tls
rules:
- host: api.eventcore.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: eventcore-service
port:
number: 80
Blue-Green Deployment
Deployment Strategy
Blue-green deployment ensures zero-downtime updates:
# Blue environment (current production)
apiVersion: apps/v1
kind: Deployment
metadata:
name: eventcore-blue
namespace: eventcore
labels:
app: eventcore
environment: blue
spec:
replicas: 3
selector:
matchLabels:
app: eventcore
environment: blue
template:
metadata:
labels:
app: eventcore
environment: blue
spec:
containers:
- name: eventcore-app
image: eventcore:v1.0.0
# ... container spec
---
# Green environment (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
name: eventcore-green
namespace: eventcore
labels:
app: eventcore
environment: green
spec:
replicas: 3
selector:
matchLabels:
app: eventcore
environment: green
template:
metadata:
labels:
app: eventcore
environment: green
spec:
containers:
- name: eventcore-app
image: eventcore:v1.1.0
# ... container spec
---
# Service that can switch between environments
apiVersion: v1
kind: Service
metadata:
name: eventcore-service
namespace: eventcore
spec:
selector:
app: eventcore
environment: blue # Switch to 'green' when deploying
ports:
- port: 80
targetPort: 8080
Deployment Script
#!/bin/bash
set -e
NAMESPACE="eventcore"
NEW_VERSION="$1"
CURRENT_ENV="blue"
TARGET_ENV="green"
if [[ -z "$NEW_VERSION" ]]; then
echo "Usage: $0 <new-version>"
exit 1
fi
echo "Starting blue-green deployment to version $NEW_VERSION"
# Get current environment
CURRENT_SELECTOR=$(kubectl get service eventcore-service -n $NAMESPACE -o jsonpath='{.spec.selector.environment}')
if [[ "$CURRENT_SELECTOR" == "blue" ]]; then
TARGET_ENV="green"
CURRENT_ENV="blue"
else
TARGET_ENV="blue"
CURRENT_ENV="green"
fi
echo "Current environment: $CURRENT_ENV"
echo "Target environment: $TARGET_ENV"
# Update target environment with new version
kubectl set image deployment/eventcore-$TARGET_ENV -n $NAMESPACE \
eventcore-app=eventcore:$NEW_VERSION
# Wait for rollout to complete
kubectl rollout status deployment/eventcore-$TARGET_ENV -n $NAMESPACE
# Health check on target environment
echo "Performing health checks..."
TARGET_POD=$(kubectl get pods -n $NAMESPACE -l environment=$TARGET_ENV -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n $NAMESPACE $TARGET_POD -- curl -f http://localhost:8080/health
# Run smoke tests
echo "Running smoke tests..."
kubectl port-forward -n $NAMESPACE service/eventcore-$TARGET_ENV 8081:80 &
PORT_FORWARD_PID=$!
sleep 5
# Basic functionality test
curl -f http://localhost:8081/health
curl -f http://localhost:8081/metrics
kill $PORT_FORWARD_PID
# Switch traffic to target environment
echo "Switching traffic to $TARGET_ENV environment"
kubectl patch service eventcore-service -n $NAMESPACE \
-p '{"spec":{"selector":{"environment":"'$TARGET_ENV'"}}}'
echo "Deployment complete. Traffic switched to $TARGET_ENV"
echo "Old environment ($CURRENT_ENV) is still running for rollback if needed"
echo "To rollback: kubectl patch service eventcore-service -n $NAMESPACE -p '{\"spec\":{\"selector\":{\"environment\":\"$CURRENT_ENV\"}}}'"
Canary Deployment
Traffic Splitting with Istio
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: eventcore-canary
namespace: eventcore
spec:
hosts:
- api.eventcore.example.com
http:
- match:
- headers:
canary:
exact: "true"
route:
- destination:
host: eventcore-service
subset: canary
- route:
- destination:
host: eventcore-service
subset: stable
weight: 95
- destination:
host: eventcore-service
subset: canary
weight: 5
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: eventcore-destination
namespace: eventcore
spec:
host: eventcore-service
subsets:
- name: stable
labels:
version: stable
- name: canary
labels:
version: canary
Automated Canary with Flagger
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
name: eventcore
namespace: eventcore
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: eventcore
progressDeadlineSeconds: 60
service:
port: 80
targetPort: 8080
analysis:
interval: 1m
threshold: 5
maxWeight: 50
stepWeight: 10
metrics:
- name: request-success-rate
thresholdRange:
min: 99
interval: 1m
- name: request-duration
thresholdRange:
max: 500
interval: 30s
webhooks:
- name: smoke-test
type: pre-rollout
url: http://flagger-loadtester.test/
timeout: 15s
metadata:
type: bash
cmd: "curl -sd 'test' http://eventcore-canary/health"
- name: load-test
url: http://flagger-loadtester.test/
timeout: 5s
metadata:
cmd: "hey -z 1m -q 10 -c 2 http://eventcore-canary/"
Database Migrations
Schema Migration Strategy
use sqlx::{migrate::MigrateDatabase, PgPool, Postgres};

pub struct MigrationManager {
    pool: PgPool,
    migration_path: String,
}

impl MigrationManager {
    pub async fn new(database_url: &str, migration_path: String) -> Result<Self, sqlx::Error> {
        // Ensure database exists
        if !Postgres::database_exists(database_url).await? {
            Postgres::create_database(database_url).await?;
        }

        let pool = PgPool::connect(database_url).await?;
        Ok(Self { pool, migration_path })
    }

    pub async fn run_migrations(&self) -> Result<(), sqlx::Error> {
        sqlx::migrate::Migrator::new(std::path::Path::new(&self.migration_path))
            .await?
            .run(&self.pool)
            .await?;
        Ok(())
    }

    pub async fn check_migration_status(&self) -> Result<MigrationStatus, sqlx::Error> {
        let migrator =
            sqlx::migrate::Migrator::new(std::path::Path::new(&self.migration_path)).await?;
        let applied = migrator.get_applied_migrations(&self.pool).await?;
        let available = migrator.iter().count();

        Ok(MigrationStatus {
            applied: applied.len(),
            available,
            pending: available - applied.len(),
        })
    }
}

#[derive(Debug)]
pub struct MigrationStatus {
    pub applied: usize,
    pub available: usize,
    pub pending: usize,
}
Migration Files Structure
migrations/
├── 001_initial_schema.sql
├── 002_add_user_preferences.sql
├── 003_optimize_event_indexes.sql
└── 004_add_projection_checkpoints.sql
Example migration:
-- migrations/001_initial_schema.sql
-- Create events table with optimized indexes
CREATE TABLE events (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
stream_id VARCHAR(255) NOT NULL,
version BIGINT NOT NULL,
event_type VARCHAR(255) NOT NULL,
payload JSONB NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}',
occurred_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
CONSTRAINT events_stream_version_unique UNIQUE (stream_id, version)
);
-- Optimized indexes for common query patterns
CREATE INDEX idx_events_stream_id ON events (stream_id);
CREATE INDEX idx_events_stream_id_version ON events (stream_id, version);
CREATE INDEX idx_events_occurred_at ON events (occurred_at);
CREATE INDEX idx_events_event_type ON events (event_type);
CREATE INDEX idx_events_payload_gin ON events USING GIN (payload);
-- Create projection checkpoints table
CREATE TABLE projection_checkpoints (
projection_name VARCHAR(255) PRIMARY KEY,
last_event_id UUID,
last_event_version BIGINT,
stream_positions JSONB NOT NULL DEFAULT '{}',
updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_projection_checkpoints_updated_at ON projection_checkpoints (updated_at);
Zero-Downtime Migration Pattern
#!/bin/bash
# Zero-downtime migration script
set -e
DATABASE_URL="$1"
NEW_VERSION="$2"
MIGRATION_PATH="./migrations"
echo "Starting zero-downtime migration process..."
# Step 1: Run additive migrations (safe)
echo "Running additive migrations..."
sqlx migrate run --source $MIGRATION_PATH/additive
# Step 2: Deploy new application version (backward compatible)
echo "Deploying new application version..."
kubectl set image deployment/eventcore-app eventcore-app=eventcore:$NEW_VERSION
kubectl rollout status deployment/eventcore-app
# Step 3: Verify application health
echo "Verifying application health..."
kubectl get pods -l app=eventcore
curl -f http://api.eventcore.example.com/health
# Step 4: Run data migrations (if needed)
echo "Running data migrations..."
sqlx migrate run --source $MIGRATION_PATH/data
# Step 5: Run cleanup migrations (remove old columns/tables)
echo "Running cleanup migrations..."
sqlx migrate run --source $MIGRATION_PATH/cleanup
echo "Zero-downtime migration completed successfully!"
Configuration Management
Environment-Specific Configuration
use config::{Config, ConfigError, Environment, File};
use serde::Deserialize;

#[derive(Debug, Deserialize, Clone)]
pub struct AppConfig {
    pub database: DatabaseConfig,
    pub server: ServerConfig,
    pub monitoring: MonitoringConfig,
    pub features: FeatureFlags,
}

#[derive(Debug, Deserialize, Clone)]
pub struct DatabaseConfig {
    pub url: String,
    pub max_connections: u32,
    pub acquire_timeout_seconds: u64,
    pub command_timeout_seconds: u64,
}

#[derive(Debug, Deserialize, Clone)]
pub struct ServerConfig {
    pub host: String,
    pub port: u16,
    pub cors_origins: Vec<String>,
    pub request_timeout_seconds: u64,
}

#[derive(Debug, Deserialize, Clone)]
pub struct MonitoringConfig {
    pub metrics_port: u16,
    pub tracing_endpoint: Option<String>,
    pub log_level: String,
    pub health_check_interval_seconds: u64,
}

#[derive(Debug, Deserialize, Clone)]
pub struct FeatureFlags {
    pub enable_metrics: bool,
    pub enable_tracing: bool,
    pub enable_auth: bool,
    pub enable_rate_limiting: bool,
}

impl AppConfig {
    pub fn from_env() -> Result<Self, ConfigError> {
        let environment =
            std::env::var("ENVIRONMENT").unwrap_or_else(|_| "development".to_string());

        let config = Config::builder()
            // Start with default configuration
            .add_source(File::with_name("config/default"))
            // Add environment-specific configuration
            .add_source(File::with_name(&format!("config/{}", environment)).required(false))
            // Add local configuration (for development)
            .add_source(File::with_name("config/local").required(false))
            // Override with environment variables
            .add_source(Environment::with_prefix("EVENTCORE").separator("_"))
            .build()?;

        config.try_deserialize()
    }
}
Configuration Files
# config/default.yaml
database:
  max_connections: 10
  acquire_timeout_seconds: 30
  command_timeout_seconds: 60

server:
  host: "0.0.0.0"
  port: 8080
  cors_origins: ["http://localhost:3000"]
  request_timeout_seconds: 30

monitoring:
  metrics_port: 9090
  log_level: "info"
  health_check_interval_seconds: 30

features:
  enable_metrics: true
  enable_tracing: false
  enable_auth: false
  enable_rate_limiting: false
# config/production.yaml
database:
  max_connections: 20
  acquire_timeout_seconds: 10
  command_timeout_seconds: 30

server:
  cors_origins: ["https://myapp.com"]
  request_timeout_seconds: 15

monitoring:
  log_level: "warn"
  health_check_interval_seconds: 10

features:
  enable_tracing: true
  enable_auth: true
  enable_rate_limiting: true
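The layering that `AppConfig::from_env` relies on is "last source wins": defaults, then the environment file, then `EVENTCORE_*` environment variables. A minimal stdlib-only sketch of that precedence rule (the `merge_layers` helper is illustrative, not part of the `config` crate):

```rust
use std::collections::HashMap;

// Illustrative sketch: later configuration layers override earlier ones
// (default file < environment-specific file < environment variables).
fn merge_layers(
    layers: &[HashMap<&'static str, &'static str>],
) -> HashMap<&'static str, &'static str> {
    let mut merged = HashMap::new();
    for layer in layers {
        for (k, v) in layer {
            merged.insert(*k, *v); // last writer wins
        }
    }
    merged
}

fn main() {
    let default: HashMap<_, _> = [("log_level", "info"), ("max_connections", "10")].into();
    let production: HashMap<_, _> = [("log_level", "warn"), ("max_connections", "20")].into();
    let env_vars: HashMap<_, _> = [("log_level", "debug")].into();

    let cfg = merge_layers(&[default, production, env_vars]);
    assert_eq!(cfg["log_level"], "debug");    // env var wins over both files
    assert_eq!(cfg["max_connections"], "20"); // production file wins over default
}
```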
Health Checks and Readiness
Application Health Endpoints
#![allow(unused)]
fn main() {
use axum::{Json, response::Json as JsonResponse, extract::State};
use serde_json::{json, Value};
use std::sync::Arc;

#[derive(Clone)]
pub struct HealthService {
    event_store: Arc<dyn EventStore>,
    dependencies: Vec<Arc<dyn HealthCheck>>,
}

#[async_trait]
pub trait HealthCheck: Send + Sync {
    async fn name(&self) -> &'static str;
    async fn check(&self) -> HealthStatus;
}

#[derive(Debug, Clone)]
pub enum HealthStatus {
    Healthy,
    Unhealthy(String),
    Unknown,
}

impl HealthService {
    pub async fn health_check(&self) -> JsonResponse<Value> {
        let mut overall_healthy = true;
        let mut checks = Vec::new();

        // Check event store
        let event_store_status = self.check_event_store().await;
        let event_store_healthy = matches!(event_store_status, HealthStatus::Healthy);
        overall_healthy &= event_store_healthy;
        checks.push(json!({
            "name": "event_store",
            "status": if event_store_healthy { "healthy" } else { "unhealthy" },
            "details": match event_store_status {
                HealthStatus::Unhealthy(msg) => Some(msg),
                _ => None,
            }
        }));

        // Check dependencies
        for dependency in &self.dependencies {
            let name = dependency.name().await;
            let status = dependency.check().await;
            let healthy = matches!(status, HealthStatus::Healthy);
            overall_healthy &= healthy;
            checks.push(json!({
                "name": name,
                "status": if healthy { "healthy" } else { "unhealthy" },
                "details": match status {
                    HealthStatus::Unhealthy(msg) => Some(msg),
                    _ => None,
                }
            }));
        }

        let response = json!({
            "status": if overall_healthy { "healthy" } else { "unhealthy" },
            "checks": checks,
            "timestamp": chrono::Utc::now().to_rfc3339(),
            "version": env!("CARGO_PKG_VERSION")
        });

        JsonResponse(response)
    }

    pub async fn readiness_check(&self) -> JsonResponse<Value> {
        // Readiness is stricter - all components must be ready
        let event_store_ready = self.check_event_store_ready().await;
        let migrations_ready = self.check_migrations_ready().await;
        let ready = event_store_ready && migrations_ready;

        let response = json!({
            "status": if ready { "ready" } else { "not_ready" },
            "checks": {
                "event_store": event_store_ready,
                "migrations": migrations_ready,
            },
            "timestamp": chrono::Utc::now().to_rfc3339()
        });

        JsonResponse(response)
    }

    async fn check_event_store(&self) -> HealthStatus {
        match self.event_store.health_check().await {
            Ok(_) => HealthStatus::Healthy,
            Err(e) => HealthStatus::Unhealthy(format!("Event store error: {}", e)),
        }
    }

    async fn check_event_store_ready(&self) -> bool {
        // More stringent check for readiness
        self.event_store.ping().await.is_ok()
    }

    async fn check_migrations_ready(&self) -> bool {
        // Check if all migrations are applied
        match self.event_store.migration_status().await {
            Ok(status) => status.pending == 0,
            Err(_) => false,
        }
    }
}

// Route handlers
pub async fn health_handler(State(health_service): State<HealthService>) -> JsonResponse<Value> {
    health_service.health_check().await
}

pub async fn readiness_handler(State(health_service): State<HealthService>) -> JsonResponse<Value> {
    health_service.readiness_check().await
}

pub async fn liveness_handler() -> JsonResponse<Value> {
    // Simple liveness check - just return OK if the process is running
    JsonResponse(json!({
        "status": "alive",
        "timestamp": chrono::Utc::now().to_rfc3339()
    }))
}
}
Kubernetes Health Check Configuration
# Detailed health check configuration
spec:
  containers:
    - name: eventcore-app
      # Liveness probe - restart container if this fails
      livenessProbe:
        httpGet:
          path: /liveness
          port: 8080
          httpHeaders:
            - name: Accept
              value: application/json
        initialDelaySeconds: 30
        periodSeconds: 30
        timeoutSeconds: 5
        failureThreshold: 3
        successThreshold: 1
      # Readiness probe - remove from service if this fails
      readinessProbe:
        httpGet:
          path: /readiness
          port: 8080
          httpHeaders:
            - name: Accept
              value: application/json
        initialDelaySeconds: 5
        periodSeconds: 10
        timeoutSeconds: 3
        failureThreshold: 3
        successThreshold: 1
      # Startup probe - give extra time during startup
      startupProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 5
        timeoutSeconds: 3
        failureThreshold: 30
        successThreshold: 1
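The numbers above define concrete time budgets: a probe can fail for up to `initialDelaySeconds + periodSeconds * failureThreshold` before Kubernetes acts (restarting the container for liveness, removing the pod from endpoints for readiness). A small sketch of that arithmetic for the configuration shown:

```rust
// Worst-case window before a probe's failure action triggers:
// initial delay, then failureThreshold consecutive failed periods.
fn max_probe_window_secs(initial_delay: u32, period: u32, failure_threshold: u32) -> u32 {
    initial_delay + period * failure_threshold
}

fn main() {
    // startupProbe: initialDelaySeconds 10, periodSeconds 5, failureThreshold 30
    // -> the app gets up to 160s to start before liveness checks take over
    assert_eq!(max_probe_window_secs(10, 5, 30), 160);

    // livenessProbe: 30 + 30 * 3 -> up to 120s of failures before a restart
    assert_eq!(max_probe_window_secs(30, 30, 3), 120);
}
```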
Best Practices
- Containerize everything - Use containers for consistent deployments
- Infrastructure as Code - Version control all configuration
- Zero-downtime deployments - Use blue-green or canary strategies
- Database migrations - Plan for backward compatibility
- Health monitoring - Implement comprehensive health checks
- Configuration management - Separate config from code
- Security - Use secrets management and RBAC
- Rollback plans - Always have a rollback strategy
Summary
EventCore deployment strategies:
- ✅ Containerized - Docker and Kubernetes ready
- ✅ Zero-downtime - Blue-green and canary deployments
- ✅ Database migrations - Safe schema evolution
- ✅ Health monitoring - Comprehensive health checks
- ✅ Configuration management - Environment-specific config
Key patterns:
- Use containers for consistent, portable deployments
- Implement blue-green or canary deployments for zero downtime
- Plan database migrations for backward compatibility
- Configure comprehensive health checks for reliability
- Manage configuration separately from application code
Next, let’s explore Monitoring and Metrics →
Chapter 6.2: Monitoring and Metrics
Effective monitoring is crucial for operating EventCore applications in production. This chapter covers comprehensive observability strategies including metrics, logging, tracing, and alerting.
Metrics Collection
Prometheus Integration
EventCore provides built-in Prometheus metrics:
#![allow(unused)]
fn main() {
use lazy_static::lazy_static;
use prometheus::{
    register_counter_vec, register_gauge, register_gauge_vec, register_histogram_vec,
    register_int_gauge, CounterVec, Encoder, Gauge, GaugeVec, HistogramVec, IntGauge,
    TextEncoder,
};
use axum::{extract::State, http::StatusCode, response::Response};

lazy_static! {
    // Command execution metrics. These are registered as labelled "vec"
    // metrics so that `with_label_values` below is valid.
    static ref COMMANDS_TOTAL: CounterVec = register_counter_vec!(
        "eventcore_commands_total",
        "Total number of commands executed",
        &["command_type"]
    ).unwrap();
    static ref COMMAND_DURATION: HistogramVec = register_histogram_vec!(
        "eventcore_command_duration_seconds",
        "Command execution duration in seconds",
        &["command_type"]
    ).unwrap();
    static ref COMMAND_ERRORS: CounterVec = register_counter_vec!(
        "eventcore_command_errors_total",
        "Total number of command execution errors",
        &["command_type"]
    ).unwrap();

    // Event store metrics
    static ref EVENTS_WRITTEN: CounterVec = register_counter_vec!(
        "eventcore_events_written_total",
        "Total number of events written to the store",
        &["stream_id"]
    ).unwrap();
    static ref EVENT_STORE_LATENCY: HistogramVec = register_histogram_vec!(
        "eventcore_event_store_latency_seconds",
        "Event store operation latency in seconds",
        &["operation"]
    ).unwrap();

    // Stream metrics
    static ref ACTIVE_STREAMS: IntGauge = register_int_gauge!(
        "eventcore_active_streams",
        "Number of active event streams"
    ).unwrap();
    static ref STREAM_VERSIONS: GaugeVec = register_gauge_vec!(
        "eventcore_stream_versions",
        "Current version of event streams",
        &["stream_id"]
    ).unwrap();

    // Projection metrics
    static ref PROJECTION_EVENTS_PROCESSED: CounterVec = register_counter_vec!(
        "eventcore_projection_events_processed_total",
        "Total events processed by projections",
        &["projection_name"]
    ).unwrap();
    static ref PROJECTION_LAG: GaugeVec = register_gauge_vec!(
        "eventcore_projection_lag_seconds",
        "Projection lag behind latest events in seconds",
        &["projection_name"]
    ).unwrap();

    // System metrics
    static ref MEMORY_USAGE: Gauge = register_gauge!(
        "eventcore_memory_usage_bytes",
        "Memory usage in bytes"
    ).unwrap();
    static ref CONNECTION_POOL_SIZE: IntGauge = register_int_gauge!(
        "eventcore_connection_pool_size",
        "Database connection pool size"
    ).unwrap();
}

#[derive(Clone)]
pub struct MetricsService {
    start_time: std::time::Instant,
}

impl MetricsService {
    pub fn new() -> Self {
        Self {
            start_time: std::time::Instant::now(),
        }
    }

    pub fn record_command_executed(
        &self,
        command_type: &str,
        duration: std::time::Duration,
        success: bool,
    ) {
        COMMANDS_TOTAL.with_label_values(&[command_type]).inc();
        COMMAND_DURATION
            .with_label_values(&[command_type])
            .observe(duration.as_secs_f64());
        if !success {
            COMMAND_ERRORS.with_label_values(&[command_type]).inc();
        }
    }

    pub fn record_events_written(&self, stream_id: &str, count: usize) {
        EVENTS_WRITTEN.with_label_values(&[stream_id]).inc_by(count as f64);
    }

    pub fn record_event_store_operation(&self, operation: &str, duration: std::time::Duration) {
        EVENT_STORE_LATENCY
            .with_label_values(&[operation])
            .observe(duration.as_secs_f64());
    }

    pub fn update_active_streams(&self, count: i64) {
        ACTIVE_STREAMS.set(count);
    }

    pub fn update_stream_version(&self, stream_id: &str, version: f64) {
        STREAM_VERSIONS.with_label_values(&[stream_id]).set(version);
    }

    pub fn record_projection_event(&self, projection_name: &str, lag_seconds: f64) {
        PROJECTION_EVENTS_PROCESSED.with_label_values(&[projection_name]).inc();
        PROJECTION_LAG.with_label_values(&[projection_name]).set(lag_seconds);
    }

    pub fn update_memory_usage(&self, bytes: f64) {
        MEMORY_USAGE.set(bytes);
    }

    pub fn update_connection_pool_size(&self, size: i64) {
        CONNECTION_POOL_SIZE.set(size);
    }

    pub async fn export_metrics(&self) -> Result<Response<String>, StatusCode> {
        let encoder = TextEncoder::new();
        let metric_families = prometheus::gather();
        match encoder.encode_to_string(&metric_families) {
            Ok(output) => {
                let response = Response::builder()
                    .status(StatusCode::OK)
                    .header("Content-Type", encoder.format_type())
                    .body(output)
                    .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
                Ok(response)
            }
            Err(_) => Err(StatusCode::INTERNAL_SERVER_ERROR),
        }
    }
}

// Metrics endpoint handler
pub async fn metrics_handler(
    State(metrics_service): State<MetricsService>,
) -> Result<Response<String>, StatusCode> {
    metrics_service.export_metrics().await
}
}
Custom Metrics
Define application-specific metrics:
#![allow(unused)]
fn main() {
use lazy_static::lazy_static;
use prometheus::{
    register_counter, register_counter_vec, register_histogram, register_histogram_vec,
    Counter, CounterVec, Histogram, HistogramVec,
};

lazy_static! {
    // Business metrics
    static ref USER_REGISTRATIONS: Counter = register_counter!(
        "eventcore_user_registrations_total",
        "Total number of user registrations"
    ).unwrap();
    static ref ORDER_VALUE: Histogram = register_histogram!(
        "eventcore_order_value_dollars",
        "Order value in dollars",
        vec![10.0, 50.0, 100.0, 500.0, 1000.0, 5000.0]
    ).unwrap();
    static ref API_REQUESTS: CounterVec = register_counter_vec!(
        "eventcore_api_requests_total",
        "Total API requests",
        &["method", "endpoint", "status"]
    ).unwrap();
    static ref REQUEST_DURATION: HistogramVec = register_histogram_vec!(
        "eventcore_request_duration_seconds",
        "Request duration in seconds",
        &["method", "endpoint"]
    ).unwrap();
}

pub struct BusinessMetrics;

impl BusinessMetrics {
    pub fn record_user_registration() {
        USER_REGISTRATIONS.inc();
    }

    pub fn record_order_placed(value_dollars: f64) {
        ORDER_VALUE.observe(value_dollars);
    }

    pub fn record_api_request(
        method: &str,
        endpoint: &str,
        status: u16,
        duration: std::time::Duration,
    ) {
        API_REQUESTS
            .with_label_values(&[method, endpoint, &status.to_string()])
            .inc();
        REQUEST_DURATION
            .with_label_values(&[method, endpoint])
            .observe(duration.as_secs_f64());
    }
}
}
Automatic Instrumentation
Instrument EventCore operations automatically:
#![allow(unused)]
fn main() {
use std::time::Instant;
use async_trait::async_trait;

pub struct InstrumentedCommandExecutor {
    inner: CommandExecutor,
    metrics: MetricsService,
}

impl InstrumentedCommandExecutor {
    pub fn new(inner: CommandExecutor, metrics: MetricsService) -> Self {
        Self { inner, metrics }
    }
}

#[async_trait]
impl CommandExecutor for InstrumentedCommandExecutor {
    async fn execute<C: Command>(&self, command: &C) -> CommandResult<ExecutionResult> {
        let start = Instant::now();
        let command_type = std::any::type_name::<C>();

        let result = self.inner.execute(command).await;

        let duration = start.elapsed();
        let success = result.is_ok();
        self.metrics.record_command_executed(command_type, duration, success);

        if let Ok(ref execution_result) = result {
            self.metrics.record_events_written(
                &execution_result.affected_streams[0].to_string(),
                execution_result.events_written.len(),
            );
        }

        result
    }
}

// Instrumented event store
pub struct InstrumentedEventStore {
    inner: Arc<dyn EventStore>,
    metrics: MetricsService,
}

#[async_trait]
impl EventStore for InstrumentedEventStore {
    async fn write_events(&self, events: Vec<EventToWrite>) -> EventStoreResult<WriteResult> {
        let start = Instant::now();
        let result = self.inner.write_events(events).await;
        let duration = start.elapsed();
        self.metrics.record_event_store_operation("write", duration);
        result
    }

    async fn read_stream(
        &self,
        stream_id: &StreamId,
        options: ReadOptions,
    ) -> EventStoreResult<StreamEvents> {
        let start = Instant::now();
        let result = self.inner.read_stream(stream_id, options).await;
        let duration = start.elapsed();
        self.metrics.record_event_store_operation("read", duration);
        result
    }
}
}
Structured Logging
Logging Configuration
#![allow(unused)]
fn main() {
use tracing::{info, warn, error, debug, trace, instrument};
use tracing_subscriber::{
    layer::SubscriberExt, util::SubscriberInitExt, fmt, EnvFilter,
    Layer, // needed for `.boxed()`
};
use serde_json::json;

pub fn init_logging(log_level: &str, log_format: &str) -> Result<(), Box<dyn std::error::Error>> {
    let env_filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| EnvFilter::new(log_level));

    let fmt_layer = match log_format {
        "json" => fmt::layer()
            .json()
            .with_current_span(true)
            .with_span_list(true)
            .with_target(true)
            .with_file(true)
            .with_line_number(true)
            .boxed(),
        _ => fmt::layer()
            .with_target(true)
            .with_file(true)
            .with_line_number(true)
            .boxed(),
    };

    tracing_subscriber::registry()
        .with(env_filter)
        .with(fmt_layer)
        .init();

    Ok(())
}

// Structured logging for command execution
#[instrument(skip(command), fields(command_type = %std::any::type_name::<C>()))]
pub async fn execute_command_with_logging<C: Command>(
    command: &C,
    executor: &CommandExecutor,
) -> CommandResult<ExecutionResult> {
    debug!("Starting command execution");

    let result = executor.execute(command).await;

    match &result {
        Ok(execution_result) => {
            info!(
                events_written = execution_result.events_written.len(),
                affected_streams = execution_result.affected_streams.len(),
                "Command executed successfully"
            );
        }
        Err(error) => {
            error!(error = %error, "Command execution failed");
        }
    }

    result
}

// Event store logging
#[instrument(skip(events), fields(event_count = events.len()))]
pub async fn write_events_with_logging(
    events: Vec<EventToWrite>,
    event_store: &dyn EventStore,
) -> EventStoreResult<WriteResult> {
    debug!("Writing events to store");

    let stream_ids: Vec<_> = events.iter()
        .map(|e| e.stream_id.to_string())
        .collect();

    let result = event_store.write_events(events).await;

    match &result {
        Ok(write_result) => {
            info!(
                events_written = write_result.events_written,
                streams = ?stream_ids,
                "Events written successfully"
            );
        }
        Err(error) => {
            error!(
                error = %error,
                streams = ?stream_ids,
                "Failed to write events"
            );
        }
    }

    result
}
}
Log Aggregation
Configure log shipping to centralized systems:
# Fluentd configuration for Kubernetes
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: eventcore
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/eventcore-*.log
      pos_file /var/log/fluentd-eventcore.log.pos
      tag eventcore.*
      format json
      time_key time
      time_format %Y-%m-%dT%H:%M:%S.%NZ
    </source>

    <filter eventcore.**>
      @type parser
      key_name log
      format json
      reserve_data true
    </filter>

    <match eventcore.**>
      @type elasticsearch
      host elasticsearch.logging.svc.cluster.local
      port 9200
      index_name eventcore-logs
      type_name _doc
      include_timestamp true
      logstash_format true
      logstash_prefix eventcore
      <buffer>
        @type file
        path /var/log/fluentd-buffers/eventcore
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>
Distributed Tracing
OpenTelemetry Integration
#![allow(unused)]
fn main() {
use opentelemetry::{
    global,
    trace::{TraceError, Tracer, TracerProvider},
    KeyValue,
};
use opentelemetry_otlp::WithExportConfig;
use opentelemetry_sdk::{
    trace::{self, Sampler},
    Resource,
};
use tracing_opentelemetry::{OpenTelemetryLayer, OpenTelemetrySpanExt};
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;

pub fn init_tracing(service_name: &str, otlp_endpoint: &str) -> Result<(), TraceError> {
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint(otlp_endpoint),
        )
        .with_trace_config(
            trace::config()
                .with_sampler(Sampler::TraceIdRatioBased(1.0))
                .with_resource(Resource::new(vec![
                    KeyValue::new("service.name", service_name.to_string()),
                    KeyValue::new("service.version", env!("CARGO_PKG_VERSION")),
                    KeyValue::new(
                        "deployment.environment",
                        std::env::var("ENVIRONMENT").unwrap_or_else(|_| "unknown".to_string()),
                    ),
                ])),
        )
        .install_batch(opentelemetry_sdk::runtime::Tokio)?;

    let telemetry_layer = tracing_opentelemetry::layer().with_tracer(tracer);

    tracing_subscriber::registry()
        .with(telemetry_layer)
        .init();

    Ok(())
}

// Traced command execution
#[tracing::instrument(skip(command, executor), fields(command_id = %uuid::Uuid::new_v4()))]
pub async fn execute_command_traced<C: Command>(
    command: &C,
    executor: &CommandExecutor,
) -> CommandResult<ExecutionResult> {
    let span = tracing::Span::current();
    span.record("command.type", std::any::type_name::<C>());

    let result = executor.execute(command).await;

    match &result {
        Ok(execution_result) => {
            span.record("command.success", true);
            span.record("events.count", execution_result.events_written.len());
            span.record("streams.count", execution_result.affected_streams.len());
        }
        Err(error) => {
            span.record("command.success", false);
            span.record("error.message", format!("{}", error));
            span.record("error.type", std::any::type_name_of_val(error));
        }
    }

    result
}

// Cross-service trace propagation
use axum::{
    extract::Request,
    http::{HeaderMap, HeaderName, HeaderValue},
    middleware::Next,
    response::Response,
};
use tracing::Instrument;

pub async fn trace_propagation_middleware(request: Request, next: Next) -> Response {
    // Extract trace context from headers
    let headers = request.headers();
    let parent_context = global::get_text_map_propagator(|propagator| {
        propagator.extract(&HeaderMapCarrier::new(headers))
    });

    // Create new span with parent context
    let span = tracing::info_span!(
        "http_request",
        method = %request.method(),
        uri = %request.uri(),
        version = ?request.version(),
    );

    // Set parent context (requires the OpenTelemetrySpanExt extension trait)
    span.set_parent(parent_context);

    // Execute the request within the span. `instrument` keeps the span
    // entered across await points, which `in_scope` alone would not.
    next.run(request).instrument(span).await
}

struct HeaderMapCarrier<'a> {
    headers: &'a HeaderMap,
}

impl<'a> HeaderMapCarrier<'a> {
    fn new(headers: &'a HeaderMap) -> Self {
        Self { headers }
    }
}

impl<'a> opentelemetry::propagation::Extractor for HeaderMapCarrier<'a> {
    fn get(&self, key: &str) -> Option<&str> {
        self.headers.get(key)?.to_str().ok()
    }

    fn keys(&self) -> Vec<&str> {
        self.headers.keys().map(|k| k.as_str()).collect()
    }
}
}
Alerting
Prometheus Alerting Rules
# prometheus-alerts.yaml
groups:
  - name: eventcore.rules
    rules:
      # High error rate
      - alert: HighCommandErrorRate
        expr: |
          (
            rate(eventcore_command_errors_total[5m]) /
            rate(eventcore_commands_total[5m])
          ) > 0.05
        for: 2m
        labels:
          severity: warning
          service: eventcore
        annotations:
          summary: "High command error rate detected"
          description: "Command error rate is {{ $value | humanizePercentage }} over the last 5 minutes"

      # High latency
      - alert: HighCommandLatency
        expr: |
          histogram_quantile(0.95, rate(eventcore_command_duration_seconds_bucket[5m])) > 1.0
        for: 3m
        labels:
          severity: warning
          service: eventcore
        annotations:
          summary: "High command latency detected"
          description: "95th percentile command latency is {{ $value }}s"

      # Event store issues
      - alert: EventStoreDown
        expr: up{job="eventcore"} == 0
        for: 1m
        labels:
          severity: critical
          service: eventcore
        annotations:
          summary: "EventCore service is down"
          description: "EventCore service has been down for more than 1 minute"

      # Projection lag
      - alert: ProjectionLag
        expr: eventcore_projection_lag_seconds > 300
        for: 5m
        labels:
          severity: warning
          service: eventcore
        annotations:
          summary: "Projection lag is high"
          description: "Projection {{ $labels.projection_name }} is {{ $value }}s behind"

      # Memory usage
      - alert: HighMemoryUsage
        expr: |
          (eventcore_memory_usage_bytes / (1024 * 1024 * 1024)) > 1.0
        for: 5m
        labels:
          severity: warning
          service: eventcore
        annotations:
          summary: "High memory usage"
          description: "Memory usage is {{ $value | humanize }}GB"

      # Database connection pool
      - alert: DatabaseConnectionPoolExhausted
        expr: eventcore_connection_pool_size / eventcore_connection_pool_max_size > 0.9
        for: 2m
        labels:
          severity: critical
          service: eventcore
        annotations:
          summary: "Database connection pool nearly exhausted"
          description: "Connection pool utilization is {{ $value | humanizePercentage }}"
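The `HighCommandErrorRate` rule divides two rates and fires when the ratio stays above 5% for the `for:` duration. A minimal sketch of that threshold check (the function names here are illustrative, not part of Prometheus or EventCore):

```rust
// Illustrative sketch of the HighCommandErrorRate condition: the alert
// fires when errors exceed 5% of total commands over the rate window.
fn error_rate(errors_per_sec: f64, commands_per_sec: f64) -> f64 {
    if commands_per_sec == 0.0 {
        0.0 // no traffic, no rate (Prometheus would yield no sample here)
    } else {
        errors_per_sec / commands_per_sec
    }
}

fn fires_high_error_rate(errors_per_sec: f64, commands_per_sec: f64) -> bool {
    error_rate(errors_per_sec, commands_per_sec) > 0.05
}

fn main() {
    assert!(!fires_high_error_rate(4.0, 100.0)); // 4% - below threshold
    assert!(fires_high_error_rate(6.0, 100.0));  // 6% - alert fires
    assert!(!fires_high_error_rate(0.0, 0.0));   // no traffic - no alert
}
```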
Alert Manager Configuration
# alertmanager.yaml
global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alerts@eventcore.com'

route:
  group_by: ['alertname', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
  routes:
    - match:
        severity: critical
      receiver: 'critical-alerts'
    - match:
        severity: warning
      receiver: 'warning-alerts'

receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://slack-webhook/webhook'
  - name: 'critical-alerts'
    email_configs:
      - to: 'oncall@eventcore.com'
        subject: 'CRITICAL: {{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
        body: |
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          Labels: {{ range .Labels.SortedPairs }}{{ .Name }}={{ .Value }} {{ end }}
          {{ end }}
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#critical-alerts'
        title: 'Critical Alert: {{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
  - name: 'warning-alerts'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#warnings'
        title: 'Warning: {{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
Grafana Dashboards
EventCore Operations Dashboard
{
"dashboard": {
"title": "EventCore Operations",
"panels": [
{
"title": "Command Execution Rate",
"type": "graph",
"targets": [
{
"expr": "rate(eventcore_commands_total[5m])",
"legendFormat": "Commands/sec"
}
]
},
{
"title": "Command Latency",
"type": "graph",
"targets": [
{
"expr": "histogram_quantile(0.50, rate(eventcore_command_duration_seconds_bucket[5m]))",
"legendFormat": "p50"
},
{
"expr": "histogram_quantile(0.95, rate(eventcore_command_duration_seconds_bucket[5m]))",
"legendFormat": "p95"
},
{
"expr": "histogram_quantile(0.99, rate(eventcore_command_duration_seconds_bucket[5m]))",
"legendFormat": "p99"
}
]
},
{
"title": "Error Rate",
"type": "singlestat",
"targets": [
{
"expr": "rate(eventcore_command_errors_total[5m]) / rate(eventcore_commands_total[5m])",
"legendFormat": "Error Rate"
}
],
"thresholds": [
{
"value": 0.01,
"colorMode": "critical"
}
]
},
{
"title": "Active Streams",
"type": "singlestat",
"targets": [
{
"expr": "eventcore_active_streams",
"legendFormat": "Streams"
}
]
},
{
"title": "Projection Lag",
"type": "graph",
"targets": [
{
"expr": "eventcore_projection_lag_seconds",
"legendFormat": "{{ projection_name }}"
}
]
},
{
"title": "Memory Usage",
"type": "graph",
"targets": [
{
"expr": "eventcore_memory_usage_bytes / (1024 * 1024 * 1024)",
"legendFormat": "Memory (GB)"
}
]
}
]
}
}
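The p50/p95/p99 panels above rely on Prometheus's `histogram_quantile`, which works on cumulative bucket counts: find the bucket containing the target rank, then interpolate linearly within it. A simplified sketch of that calculation (a rough approximation of Prometheus's behavior, including the clamp at the last finite bucket bound):

```rust
// Simplified histogram_quantile: buckets are (upper_bound, cumulative_count)
// pairs sorted by bound, with a final +Inf bucket holding the total count.
fn histogram_quantile(q: f64, buckets: &[(f64, f64)]) -> f64 {
    let total = buckets.last().unwrap().1;
    let rank = q * total;
    let (mut prev_bound, mut prev_count) = (0.0, 0.0);
    for &(bound, count) in buckets {
        if count >= rank {
            if bound.is_infinite() {
                // Clamp to the highest finite bound, as Prometheus does
                return prev_bound;
            }
            // Linear interpolation inside the bucket containing the rank
            let fraction = (rank - prev_count) / (count - prev_count);
            return prev_bound + (bound - prev_bound) * fraction;
        }
        prev_bound = bound;
        prev_count = count;
    }
    prev_bound
}

fn main() {
    // 100 observations: 50 under 0.1s, 90 under 0.5s, all under 1s.
    let buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 100.0), (f64::INFINITY, 100.0)];
    let p95 = histogram_quantile(0.95, &buckets);
    // rank 95 falls in the (0.5, 1.0] bucket, halfway through its 10 counts
    assert!((p95 - 0.75).abs() < 1e-9);
}
```

This is also why bucket boundaries matter: the reported quantile can never be more precise than the bucket it lands in.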
Performance Monitoring
Real-Time Performance Metrics
#![allow(unused)]
fn main() {
use std::sync::Arc;
use tokio::sync::RwLock;
use std::collections::HashMap;

#[derive(Debug, Clone)]
pub struct PerformanceSnapshot {
    pub timestamp: chrono::DateTime<chrono::Utc>,
    pub commands_per_second: f64,
    pub events_per_second: f64,
    pub avg_latency_ms: f64,
    pub p95_latency_ms: f64,
    pub p99_latency_ms: f64,
    pub error_rate: f64,
    pub active_streams: i64,
    pub memory_usage_mb: f64,
}

pub struct PerformanceMonitor {
    snapshots: Arc<RwLock<Vec<PerformanceSnapshot>>>,
    max_snapshots: usize,
}

impl PerformanceMonitor {
    pub fn new(max_snapshots: usize) -> Self {
        Self {
            snapshots: Arc::new(RwLock::new(Vec::new())),
            max_snapshots,
        }
    }

    pub async fn capture_snapshot(&self) -> PerformanceSnapshot {
        let snapshot = PerformanceSnapshot {
            timestamp: chrono::Utc::now(),
            commands_per_second: self.calculate_command_rate().await,
            events_per_second: self.calculate_event_rate().await,
            avg_latency_ms: self.calculate_avg_latency().await,
            p95_latency_ms: self.calculate_p95_latency().await,
            p99_latency_ms: self.calculate_p99_latency().await,
            error_rate: self.calculate_error_rate().await,
            active_streams: self.get_active_stream_count().await,
            memory_usage_mb: self.get_memory_usage_mb().await,
        };

        let mut snapshots = self.snapshots.write().await;
        snapshots.push(snapshot.clone());

        // Keep only the most recent snapshots
        if snapshots.len() > self.max_snapshots {
            snapshots.remove(0);
        }

        snapshot
    }

    pub async fn get_trend_analysis(&self, minutes: u64) -> TrendAnalysis {
        let snapshots = self.snapshots.read().await;
        let cutoff = chrono::Utc::now() - chrono::Duration::minutes(minutes as i64);

        let recent_snapshots: Vec<_> = snapshots
            .iter()
            .filter(|s| s.timestamp > cutoff)
            .collect();

        if recent_snapshots.is_empty() {
            return TrendAnalysis::default();
        }

        TrendAnalysis {
            throughput_trend: self.calculate_trend(&recent_snapshots, |s| s.commands_per_second),
            latency_trend: self.calculate_trend(&recent_snapshots, |s| s.avg_latency_ms),
            error_rate_trend: self.calculate_trend(&recent_snapshots, |s| s.error_rate),
            memory_trend: self.calculate_trend(&recent_snapshots, |s| s.memory_usage_mb),
        }
    }

    async fn calculate_command_rate(&self) -> f64 {
        // Get rate from Prometheus metrics
        // Implementation depends on your metrics backend
        0.0
    }

    async fn calculate_event_rate(&self) -> f64 {
        // Get rate from Prometheus metrics
        0.0
    }

    async fn calculate_avg_latency(&self) -> f64 {
        // Get average latency from metrics
        0.0
    }

    async fn calculate_p95_latency(&self) -> f64 {
        // Get p95 latency from metrics
        0.0
    }

    async fn calculate_p99_latency(&self) -> f64 {
        // Get p99 latency from metrics
        0.0
    }

    async fn calculate_error_rate(&self) -> f64 {
        // Calculate error rate from metrics
        0.0
    }

    async fn get_active_stream_count(&self) -> i64 {
        // Get active stream count from metrics
        0
    }

    async fn get_memory_usage_mb(&self) -> f64 {
        // Get memory usage from system metrics
        0.0
    }

    fn calculate_trend<F>(&self, snapshots: &[&PerformanceSnapshot], extractor: F) -> Trend
    where
        F: Fn(&PerformanceSnapshot) -> f64,
    {
        if snapshots.len() < 2 {
            return Trend::Stable;
        }

        let values: Vec<f64> = snapshots.iter().map(|s| extractor(s)).collect();
        let first_half = &values[0..values.len() / 2];
        let second_half = &values[values.len() / 2..];

        let first_avg = first_half.iter().sum::<f64>() / first_half.len() as f64;
        let second_avg = second_half.iter().sum::<f64>() / second_half.len() as f64;

        let change_percent = (second_avg - first_avg) / first_avg * 100.0;

        match change_percent {
            x if x > 10.0 => Trend::Increasing,
            x if x < -10.0 => Trend::Decreasing,
            _ => Trend::Stable,
        }
    }
}

#[derive(Debug, Clone)]
pub struct TrendAnalysis {
    pub throughput_trend: Trend,
    pub latency_trend: Trend,
    pub error_rate_trend: Trend,
    pub memory_trend: Trend,
}

#[derive(Debug, Clone)]
pub enum Trend {
    Increasing,
    Decreasing,
    Stable,
}

impl Default for TrendAnalysis {
    fn default() -> Self {
        Self {
            throughput_trend: Trend::Stable,
            latency_trend: Trend::Stable,
            error_rate_trend: Trend::Stable,
            memory_trend: Trend::Stable,
        }
    }
}
}
Best Practices
- Comprehensive metrics - Monitor all key system components
- Structured logging - Use consistent, searchable log formats
- Distributed tracing - Track requests across service boundaries
- Proactive alerting - Alert on trends, not just thresholds
- Performance baselines - Establish and monitor performance baselines
- Dashboard organization - Create role-specific dashboards
- Alert fatigue - Tune alerts to reduce noise
- Runbook automation - Automate common response procedures
Summary
EventCore monitoring and metrics:
- ✅ Prometheus metrics - Comprehensive system monitoring
- ✅ Structured logging - Searchable, contextual logs
- ✅ Distributed tracing - Request flow visibility
- ✅ Intelligent alerting - Proactive issue detection
- ✅ Performance monitoring - Real-time performance tracking
Key components:
- Export detailed Prometheus metrics for all operations
- Implement structured logging with correlation IDs
- Use distributed tracing for multi-service visibility
- Configure intelligent alerting with appropriate thresholds
- Build comprehensive dashboards for different audiences
Next, let’s explore Backup and Recovery →
Chapter 6.3: Backup and Recovery
Data protection is critical for EventCore applications since event stores contain the complete history of your system. This chapter covers comprehensive backup strategies, disaster recovery procedures, and data integrity verification.
Backup Strategies
PostgreSQL Backup Configuration
EventCore’s PostgreSQL event store requires specific backup considerations:
# PostgreSQL backup configuration using CloudNativePG
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: eventcore-postgres
  namespace: eventcore
spec:
  instances: 3
  backup:
    target: prefer-standby
    retentionPolicy: "30d"
    # Base backup configuration
    data:
      compression: gzip
      encryption: AES256
      jobs: 2
      immediateCheckpoint: true
    # WAL archiving
    wal:
      compression: gzip
      encryption: AES256
      maxParallel: 2
    # Backup schedule
    barmanObjectStore:
      destinationPath: "s3://eventcore-backups/postgres"
      s3Credentials:
        accessKeyId:
          name: backup-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-credentials
          key: SECRET_ACCESS_KEY
      wal:
        retention: "7d"
      data:
        retention: "30d"
      jobs: 2
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: eventcore-backup-schedule
  namespace: eventcore
spec:
  schedule: "0 2 * * *" # Daily at 2 AM
  backupOwnerReference: self
  cluster:
    name: eventcore-postgres
  target: prefer-standby
  method: barmanObjectStore
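Note the interaction between the two retention settings: point-in-time recovery needs both a base backup and an unbroken WAL chain from that backup forward, so the effective PITR window is bounded by the shorter WAL retention. A small sketch of that constraint (assuming the retention semantics described above):

```rust
// Sketch of the recovery window implied by the retention settings: with
// base backups kept 30 days but WAL kept only 7 days, point-in-time
// recovery is limited to the last 7 days.
fn pitr_window_days(base_retention_days: u32, wal_retention_days: u32) -> u32 {
    base_retention_days.min(wal_retention_days)
}

fn main() {
    // data retention "30d", wal retention "7d"
    assert_eq!(pitr_window_days(30, 7), 7);
}
```

Older base backups remain restorable on their own, but only to the moment the backup was taken, not to an arbitrary point in time.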
Event Store Backup Implementation
use std::sync::Arc;

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use tokio::fs::File;
use tokio::io::{AsyncWriteExt, BufWriter};
use uuid::Uuid;

#[derive(Debug, Clone)]
pub struct BackupManager {
    event_store: Arc<dyn EventStore>,
    storage: Arc<dyn BackupStorage>,
    config: BackupConfig,
}

#[derive(Debug, Clone)]
pub struct BackupConfig {
    pub backup_format: BackupFormat,
    pub compression: CompressionType,
    pub encryption_enabled: bool,
    pub chunk_size: usize,
    pub retention_days: u32,
    pub verify_after_backup: bool,
}

// Serialize/Deserialize are required because these enums are embedded in
// `BackupMetadata`, which is serialized into the backup header.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum BackupFormat {
    JsonLines,
    MessagePack,
    Custom,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum CompressionType {
    None,
    Gzip,
    Zstd,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct BackupMetadata {
    pub backup_id: Uuid,
    pub created_at: DateTime<Utc>,
    pub format: BackupFormat,
    pub compression: CompressionType,
    pub total_events: u64,
    pub total_streams: u64,
    pub size_bytes: u64,
    pub checksum: String,
    pub event_range: EventRange,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct EventRange {
    pub earliest_event: DateTime<Utc>,
    pub latest_event: DateTime<Utc>,
    pub earliest_version: EventVersion,
    pub latest_version: EventVersion,
}

impl BackupManager {
    pub async fn create_full_backup(&self) -> Result<BackupMetadata, BackupError> {
        let backup_id = Uuid::new_v4();
        let start_time = Utc::now();

        tracing::info!(backup_id = %backup_id, "Starting full backup");

        // Create backup metadata
        let mut metadata = BackupMetadata {
            backup_id,
            created_at: start_time,
            format: self.config.backup_format.clone(),
            compression: self.config.compression.clone(),
            total_events: 0,
            total_streams: 0,
            size_bytes: 0,
            checksum: String::new(),
            event_range: EventRange {
                earliest_event: start_time,
                latest_event: start_time,
                earliest_version: EventVersion::initial(),
                latest_version: EventVersion::initial(),
            },
        };

        // Get all streams
        let streams = self.event_store.list_all_streams().await?;
        metadata.total_streams = streams.len() as u64;

        // Create backup writer
        let backup_path = format!("full-backup-{}.eventcore", backup_id);
        let mut writer = BackupWriter::new(
            &backup_path,
            self.config.compression.clone(),
            self.config.encryption_enabled,
        ).await?;

        // Write backup header
        writer.write_header(&metadata).await?;

        // Backup each stream
        for stream_id in streams {
            let events = self.backup_stream(&stream_id, &mut writer).await?;
            metadata.total_events += events;

            if metadata.total_events % 10_000 == 0 {
                tracing::info!(
                    backup_id = %backup_id,
                    events_backed_up = metadata.total_events,
                    "Backup progress"
                );
            }
        }

        // Calculate checksums and finalize
        metadata.size_bytes = writer.finalize().await?;
        metadata.checksum = writer.calculate_checksum().await?;

        // Store backup metadata
        self.storage.store_backup(&backup_path, &metadata).await?;

        // Verify backup if configured
        if self.config.verify_after_backup {
            self.verify_backup(&backup_id).await?;
        }

        let duration = Utc::now().signed_duration_since(start_time);
        tracing::info!(
            backup_id = %backup_id,
            duration_seconds = duration.num_seconds(),
            total_events = metadata.total_events,
            size_mb = metadata.size_bytes / (1024 * 1024),
            "Backup completed successfully"
        );

        Ok(metadata)
    }

    pub async fn create_incremental_backup(
        &self,
        since: DateTime<Utc>,
    ) -> Result<BackupMetadata, BackupError> {
        let backup_id = Uuid::new_v4();
        let start_time = Utc::now();

        tracing::info!(
            backup_id = %backup_id,
            since = %since,
            "Starting incremental backup"
        );

        // Query events since timestamp
        let events = self.event_store.read_events_since(since).await?;

        let mut metadata = BackupMetadata {
            backup_id,
            created_at: start_time,
            format: self.config.backup_format.clone(),
            compression: self.config.compression.clone(),
            total_events: events.len() as u64,
            total_streams: 0, // Will be calculated below
            size_bytes: 0,
            checksum: String::new(),
            event_range: self.calculate_event_range(&events),
        };

        // Create backup writer
        let backup_path = format!("incremental-backup-{}.eventcore", backup_id);
        let mut writer = BackupWriter::new(
            &backup_path,
            self.config.compression.clone(),
            self.config.encryption_enabled,
        ).await?;

        // Write incremental backup
        writer.write_header(&metadata).await?;

        let mut unique_streams = std::collections::HashSet::new();
        for event in events {
            writer.write_event(&event).await?;
            unique_streams.insert(event.stream_id.clone());
        }

        metadata.total_streams = unique_streams.len() as u64;
        metadata.size_bytes = writer.finalize().await?;
        metadata.checksum = writer.calculate_checksum().await?;

        self.storage.store_backup(&backup_path, &metadata).await?;

        tracing::info!(
            backup_id = %backup_id,
            total_events = metadata.total_events,
            total_streams = metadata.total_streams,
            "Incremental backup completed"
        );

        Ok(metadata)
    }

    async fn backup_stream(
        &self,
        stream_id: &StreamId,
        writer: &mut BackupWriter,
    ) -> Result<u64, BackupError> {
        let mut event_count = 0;
        let mut from_version = EventVersion::initial();
        let batch_size = self.config.chunk_size;

        loop {
            let options = ReadOptions::default()
                .from_version(from_version)
                .limit(batch_size);

            let stream_events = self.event_store.read_stream(stream_id, options).await?;

            if stream_events.events.is_empty() {
                break;
            }

            for event in &stream_events.events {
                writer.write_event(event).await?;
                event_count += 1;
            }

            from_version = EventVersion::from(
                stream_events.events.last().unwrap().version.as_u64() + 1
            );
        }

        Ok(event_count)
    }

    fn calculate_event_range(&self, events: &[StoredEvent]) -> EventRange {
        if events.is_empty() {
            let now = Utc::now();
            return EventRange {
                earliest_event: now,
                latest_event: now,
                earliest_version: EventVersion::initial(),
                latest_version: EventVersion::initial(),
            };
        }

        let earliest = events.iter().min_by_key(|e| e.occurred_at).unwrap();
        let latest = events.iter().max_by_key(|e| e.occurred_at).unwrap();

        EventRange {
            earliest_event: earliest.occurred_at,
            latest_event: latest.occurred_at,
            earliest_version: earliest.version,
            latest_version: latest.version,
        }
    }
}

struct BackupWriter {
    file: BufWriter<File>,
    path: String,
    compression: CompressionType,
    encrypted: bool,
    bytes_written: u64,
}

impl BackupWriter {
    async fn new(
        path: &str,
        compression: CompressionType,
        encrypted: bool,
    ) -> Result<Self, BackupError> {
        let file = File::create(path).await?;
        let file = BufWriter::new(file);

        Ok(Self {
            file,
            path: path.to_string(),
            compression,
            encrypted,
            bytes_written: 0,
        })
    }

    async fn write_header(&mut self, metadata: &BackupMetadata) -> Result<(), BackupError> {
        let header = serde_json::to_string(metadata)?;
        let header_line = format!("EVENTCORE_BACKUP_HEADER:{}\n", header);
        self.file.write_all(header_line.as_bytes()).await?;
        self.bytes_written += header_line.len() as u64;
        Ok(())
    }

    async fn write_event(&mut self, event: &StoredEvent) -> Result<(), BackupError> {
        let event_line = match self.compression {
            CompressionType::None => {
                let json = serde_json::to_string(event)?;
                format!("{}\n", json)
            }
            CompressionType::Gzip => {
                // Implement gzip compression here
                let json = serde_json::to_string(event)?;
                format!("{}\n", json) // Simplified for example
            }
            CompressionType::Zstd => {
                // Implement zstd compression here
                let json = serde_json::to_string(event)?;
                format!("{}\n", json) // Simplified for example
            }
        };

        self.file.write_all(event_line.as_bytes()).await?;
        self.bytes_written += event_line.len() as u64;
        Ok(())
    }

    async fn finalize(&mut self) -> Result<u64, BackupError> {
        self.file.flush().await?;
        Ok(self.bytes_written)
    }

    async fn calculate_checksum(&self) -> Result<String, BackupError> {
        // Calculate SHA-256 checksum of the backup file
        use sha2::{Digest, Sha256};
        use tokio::fs::File;
        use tokio::io::AsyncReadExt;

        let mut file = File::open(&self.path).await?;
        let mut hasher = Sha256::new();
        let mut buffer = [0; 8192];

        loop {
            let bytes_read = file.read(&mut buffer).await?;
            if bytes_read == 0 {
                break;
            }
            hasher.update(&buffer[..bytes_read]);
        }

        Ok(format!("{:x}", hasher.finalize()))
    }
}
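The checksum step above hashes the finished file chunk by chunk rather than loading it into memory. The same streaming idea can be shown without the `sha2` crate; this sketch uses FNV-1a purely so the example stays dependency-free (a real backup must use a cryptographic hash such as SHA-256, as in the code above).

```rust
// Streaming checksum sketch: feed data piece by piece, carrying the hash
// state forward, exactly like the chunked SHA-256 loop in `calculate_checksum`.
// FNV-1a is NOT collision-resistant; it stands in for SHA-256 here only to
// keep the example std-only.
fn fnv1a(data: &[u8], state: u64) -> u64 {
    let mut hash = state;
    for &byte in data {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x0000_0100_0000_01B3); // FNV prime
    }
    hash
}

fn checksum_lines(lines: &[&str]) -> u64 {
    let mut state = 0xCBF2_9CE4_8422_2325; // FNV offset basis
    for line in lines {
        // Each "chunk" is one backup line plus its newline.
        state = fnv1a(line.as_bytes(), state);
        state = fnv1a(b"\n", state);
    }
    state
}

fn main() {
    let a = checksum_lines(&["event-1", "event-2"]);
    let b = checksum_lines(&["event-1", "event-2"]);
    let c = checksum_lines(&["event-1", "event-X"]);
    assert_eq!(a, b); // deterministic across runs
    assert_ne!(a, c); // any corruption changes the digest
    println!("checksum: {:016x}", a);
}
```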
Point-in-Time Recovery
use std::sync::Arc;

use chrono::{DateTime, Utc};
use uuid::Uuid;

#[derive(Debug, Clone)]
pub struct PointInTimeRecovery {
    backup_manager: BackupManager,
    event_store: Arc<dyn EventStore>,
}

impl PointInTimeRecovery {
    pub async fn restore_to_point_in_time(
        &self,
        target_time: DateTime<Utc>,
    ) -> Result<RecoveryResult, RecoveryError> {
        tracing::info!(target_time = %target_time, "Starting point-in-time recovery");

        // Find the best backup to start from
        let base_backup = self.find_best_base_backup(target_time).await?;

        // Restore from base backup
        self.restore_from_backup(&base_backup.backup_id).await?;

        // Apply incremental backups up to the target time
        let incremental_backups = self.find_incremental_backups_until(
            base_backup.created_at,
            target_time,
        ).await?;

        for backup in incremental_backups {
            self.apply_incremental_backup(&backup.backup_id, Some(target_time)).await?;
        }

        // Apply WAL entries up to the exact target time
        self.apply_wal_entries_until(target_time).await?;

        // Verify recovery
        let recovery_result = self.verify_recovery(target_time).await?;

        tracing::info!(
            target_time = %target_time,
            events_restored = recovery_result.events_restored,
            streams_restored = recovery_result.streams_restored,
            "Point-in-time recovery completed"
        );

        Ok(recovery_result)
    }

    async fn find_best_base_backup(
        &self,
        target_time: DateTime<Utc>,
    ) -> Result<BackupMetadata, RecoveryError> {
        let backups = self.backup_manager.list_backups().await?;

        // Find the latest full backup before the target time
        let base_backup = backups
            .iter()
            .filter(|b| b.created_at <= target_time)
            .filter(|b| matches!(b.format, BackupFormat::JsonLines)) // Full backup indicator
            .max_by_key(|b| b.created_at)
            .ok_or(RecoveryError::NoSuitableBackup)?;

        Ok(base_backup.clone())
    }

    async fn restore_from_backup(&self, backup_id: &Uuid) -> Result<(), RecoveryError> {
        tracing::info!(backup_id = %backup_id, "Restoring from base backup");

        // Clear the event store
        self.event_store.clear_all().await?;

        // Read backup file
        let mut backup_reader = BackupReader::new(backup_id).await?;
        let metadata = backup_reader.read_metadata().await?;

        tracing::info!(
            backup_id = %backup_id,
            total_events = metadata.total_events,
            "Reading backup events"
        );

        // Restore events in batches
        let batch_size = 1000;
        let mut events_restored = 0;

        while let Some(batch) = backup_reader.read_events_batch(batch_size).await? {
            // Count the actual batch length before the batch is moved into the
            // write call; the final batch is usually shorter than `batch_size`.
            let batch_len = batch.len();
            self.event_store.write_events(batch).await?;
            events_restored += batch_len;

            if events_restored % 10_000 == 0 {
                tracing::info!(
                    events_restored = events_restored,
                    "Restore progress"
                );
            }
        }

        Ok(())
    }

    async fn apply_wal_entries_until(
        &self,
        target_time: DateTime<Utc>,
    ) -> Result<(), RecoveryError> {
        // Apply WAL (Write-Ahead Log) entries from PostgreSQL.
        // This provides exact point-in-time recovery.
        let wal_entries = self.read_wal_entries_until(target_time).await?;

        for entry in wal_entries {
            if entry.timestamp <= target_time {
                self.apply_wal_entry(entry).await?;
            }
        }

        Ok(())
    }
}

#[derive(Debug, Clone)]
pub struct RecoveryResult {
    pub events_restored: u64,
    pub streams_restored: u64,
    pub recovery_time: DateTime<Utc>,
    pub data_integrity_verified: bool,
}
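The core of `find_best_base_backup` is a pure selection rule: the newest full backup created at or before the recovery target. Stripped of the store and async plumbing, the rule looks like this (timestamps are plain `u64` seconds and the `Backup` struct is illustrative, not EventCore's type):

```rust
// Selection rule behind base-backup lookup: among full backups not newer
// than the target, pick the most recent one. Incrementals and backups
// after the target are excluded.
#[derive(Debug, Clone)]
struct Backup {
    created_at: u64, // seconds since epoch, for illustration
    full: bool,      // true = full backup, false = incremental
}

fn best_base_backup(backups: &[Backup], target: u64) -> Option<&Backup> {
    backups
        .iter()
        .filter(|b| b.full && b.created_at <= target)
        .max_by_key(|b| b.created_at)
}

fn main() {
    let backups = vec![
        Backup { created_at: 100, full: true },
        Backup { created_at: 200, full: false }, // incremental: skipped
        Backup { created_at: 300, full: true },  // best candidate
        Backup { created_at: 400, full: true },  // after target: skipped
    ];
    let base = best_base_backup(&backups, 350).unwrap();
    assert_eq!(base.created_at, 300);
    assert!(best_base_backup(&backups, 50).is_none()); // nothing early enough
}
```

Incrementals taken after the chosen base are then replayed in order, which is exactly what `restore_to_point_in_time` does above.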
Disaster Recovery
Multi-Region Backup Strategy
# Multi-region backup configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: backup-config
  namespace: eventcore
data:
  backup-policy.yaml: |
    # Primary backup configuration
    primary:
      region: us-east-1
      storage: s3://eventcore-backups-primary
      schedule: "0 */6 * * *"  # Every 6 hours
      retention: "30d"

    # Cross-region replication
    replicas:
      - region: us-west-2
        storage: s3://eventcore-backups-west
        sync_schedule: "0 1 * * *"  # Daily sync
        retention: "90d"
      - region: eu-west-1
        storage: s3://eventcore-backups-eu
        sync_schedule: "0 2 * * *"  # Daily sync
        retention: "90d"

    # Archive configuration
    archive:
      storage: glacier://eventcore-archive
      after_days: 90
      retention: "7y"
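The policy above implies a three-tier lifecycle: backups live in regional storage for 90 days, move to the archive after that, and are deleted once the 7-year retention expires. A sketch of that decision, with ages in days (the tier names are illustrative, not part of EventCore):

```rust
// Lifecycle decision implied by the backup policy: primary storage for
// the first 90 days, archive until 7 years, expired afterwards.
#[derive(Debug, PartialEq)]
enum Tier {
    Primary,
    Archive,
    Expired,
}

fn tier_for_age(age_days: u32) -> Tier {
    const ARCHIVE_AFTER_DAYS: u32 = 90;      // matches `after_days: 90`
    const EXPIRE_AFTER_DAYS: u32 = 7 * 365;  // matches `retention: "7y"`
    if age_days >= EXPIRE_AFTER_DAYS {
        Tier::Expired
    } else if age_days >= ARCHIVE_AFTER_DAYS {
        Tier::Archive
    } else {
        Tier::Primary
    }
}

fn main() {
    assert_eq!(tier_for_age(10), Tier::Primary);
    assert_eq!(tier_for_age(90), Tier::Archive);   // boundary: archived
    assert_eq!(tier_for_age(3000), Tier::Expired); // past 7 years
}
```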
Automated Disaster Recovery
use std::collections::HashMap;
use std::time::Duration;

use chrono::Utc;

#[derive(Debug, Clone)]
pub struct DisasterRecoveryOrchestrator {
    primary_region: String,
    failover_regions: Vec<String>,
    backup_manager: BackupManager,
    health_checker: HealthChecker,
}

impl DisasterRecoveryOrchestrator {
    pub async fn execute_disaster_recovery(
        &self,
        trigger: DisasterTrigger,
    ) -> Result<RecoveryOutcome, DisasterRecoveryError> {
        tracing::error!(trigger = ?trigger, "Disaster recovery triggered");

        // Assess the situation
        let assessment = self.assess_disaster_scope().await?;

        // Choose recovery strategy
        let strategy = self.choose_recovery_strategy(&assessment).await?;

        // Execute recovery
        match strategy {
            RecoveryStrategy::LocalRestore => {
                self.execute_local_restore().await
            }
            RecoveryStrategy::RegionalFailover { target_region } => {
                self.execute_regional_failover(&target_region).await
            }
            RecoveryStrategy::FullRebuild => {
                self.execute_full_rebuild().await
            }
        }
    }

    async fn assess_disaster_scope(&self) -> Result<DisasterAssessment, DisasterRecoveryError> {
        let mut assessment = DisasterAssessment::default();

        // Check primary database
        assessment.primary_db_accessible = self.health_checker
            .check_database_connectivity(&self.primary_region)
            .await
            .is_ok();

        // Check backup availability
        assessment.backup_accessible = self.backup_manager
            .verify_backup_accessibility()
            .await
            .is_ok();

        // Check replica regions
        for region in &self.failover_regions {
            let accessible = self.health_checker
                .check_database_connectivity(region)
                .await
                .is_ok();
            assessment.replica_regions.insert(region.clone(), accessible);
        }

        // Estimate data loss
        assessment.estimated_data_loss = self.calculate_potential_data_loss().await?;

        Ok(assessment)
    }

    async fn execute_regional_failover(
        &self,
        target_region: &str,
    ) -> Result<RecoveryOutcome, DisasterRecoveryError> {
        tracing::info!(target_region = target_region, "Executing regional failover");

        // 1. Promote replica in target region
        self.promote_replica(target_region).await?;

        // 2. Update DNS to point to new region
        self.update_dns_routing(target_region).await?;

        // 3. Scale up resources in target region
        self.scale_up_target_region(target_region).await?;

        // 4. Verify system health
        let health_check = self.verify_system_health(target_region).await?;

        // 5. Notify stakeholders
        self.notify_failover_completion(target_region, &health_check).await?;

        Ok(RecoveryOutcome {
            strategy_used: RecoveryStrategy::RegionalFailover {
                target_region: target_region.to_string(),
            },
            recovery_time: Utc::now(),
            data_loss_minutes: 0, // Assuming near-real-time replication
            systems_recovered: health_check.systems_operational,
        })
    }
}

// `Default` is derived so `assess_disaster_scope` can start from an empty
// assessment and fill in each field.
#[derive(Debug, Default)]
pub struct DisasterAssessment {
    pub primary_db_accessible: bool,
    pub backup_accessible: bool,
    pub replica_regions: HashMap<String, bool>,
    pub estimated_data_loss: Duration,
}

#[derive(Debug, Clone)]
pub enum RecoveryStrategy {
    LocalRestore,
    RegionalFailover { target_region: String },
    FullRebuild,
}

#[derive(Debug)]
pub enum DisasterTrigger {
    DatabaseFailure,
    RegionOutage,
    DataCorruption,
    SecurityBreach,
    ManualTrigger,
}
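`choose_recovery_strategy` is left abstract above. One plausible decision table, shown here as an assumption rather than EventCore's actual policy: restore locally while the primary is reachable, fail over to the first healthy replica otherwise, and rebuild from backups only as a last resort.

```rust
// Hypothetical strategy selection: the inputs mirror the fields of
// `DisasterAssessment` (primary reachability, per-region replica health).
#[derive(Debug, PartialEq)]
enum Strategy {
    LocalRestore,
    Failover(String),
    FullRebuild,
}

fn choose_strategy(primary_ok: bool, replicas: &[(&str, bool)]) -> Strategy {
    if primary_ok {
        Strategy::LocalRestore
    } else if let Some((region, _)) = replicas.iter().find(|(_, healthy)| *healthy) {
        // First healthy replica wins; a real policy might rank by lag or locality.
        Strategy::Failover(region.to_string())
    } else {
        Strategy::FullRebuild
    }
}

fn main() {
    let replicas = [("us-west-2", false), ("eu-west-1", true)];
    assert_eq!(choose_strategy(true, &replicas), Strategy::LocalRestore);
    assert_eq!(
        choose_strategy(false, &replicas),
        Strategy::Failover("eu-west-1".to_string())
    );
    assert_eq!(
        choose_strategy(false, &[("us-west-2", false)]),
        Strategy::FullRebuild
    );
}
```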
Data Integrity Verification
Backup Verification
use std::sync::Arc;

use chrono::{DateTime, Utc};
use uuid::Uuid;

#[derive(Debug, Clone)]
pub struct BackupVerifier {
    event_store: Arc<dyn EventStore>,
    backup_storage: Arc<dyn BackupStorage>,
}

impl BackupVerifier {
    pub async fn verify_backup_integrity(
        &self,
        backup_id: &Uuid,
    ) -> Result<VerificationResult, VerificationError> {
        tracing::info!(backup_id = %backup_id, "Starting backup verification");

        // `DateTime<Utc>` has no `Default`, so the result is built explicitly.
        let mut result = VerificationResult {
            checksum_valid: false,
            metadata_consistent: false,
            events_valid: false,
            completeness_verified: false,
            missing_events: 0,
            verification_time: Utc::now(),
            overall_valid: false,
        };

        // Verify checksum
        result.checksum_valid = self.verify_checksum(backup_id).await?;

        // Verify metadata consistency
        result.metadata_consistent = self.verify_metadata(backup_id).await?;

        // Verify event integrity
        result.events_valid = self.verify_events(backup_id).await?;

        // Verify completeness (if verifying against live system)
        if let Ok(completeness) = self.verify_completeness(backup_id).await {
            result.completeness_verified = true;
            result.missing_events = completeness.missing_events;
        }

        result.verification_time = Utc::now();
        result.overall_valid = result.checksum_valid
            && result.metadata_consistent
            && result.events_valid
            && result.missing_events == 0;

        if result.overall_valid {
            tracing::info!(backup_id = %backup_id, "Backup verification passed");
        } else {
            tracing::error!(
                backup_id = %backup_id,
                result = ?result,
                "Backup verification failed"
            );
        }

        Ok(result)
    }

    async fn verify_checksum(&self, backup_id: &Uuid) -> Result<bool, VerificationError> {
        let backup_metadata = self.backup_storage.get_metadata(backup_id).await?;
        let calculated_checksum = self.calculate_backup_checksum(backup_id).await?;
        Ok(backup_metadata.checksum == calculated_checksum)
    }

    async fn verify_events(&self, backup_id: &Uuid) -> Result<bool, VerificationError> {
        let mut backup_reader = BackupReader::new(backup_id).await?;
        let mut events_valid = true;
        let mut event_count = 0;

        while let Some(event) = backup_reader.read_next_event().await? {
            // Verify event structure
            if !self.is_event_structurally_valid(&event) {
                tracing::error!(
                    backup_id = %backup_id,
                    event_id = %event.id,
                    "Invalid event structure found"
                );
                events_valid = false;
                break;
            }

            // Verify event ordering (within stream)
            if !self.is_event_ordering_valid(&event) {
                tracing::error!(
                    backup_id = %backup_id,
                    event_id = %event.id,
                    "Invalid event ordering found"
                );
                events_valid = false;
                break;
            }

            event_count += 1;
            if event_count % 10_000 == 0 {
                tracing::info!(
                    backup_id = %backup_id,
                    events_verified = event_count,
                    "Verification progress"
                );
            }
        }

        Ok(events_valid)
    }

    fn is_event_structurally_valid(&self, event: &StoredEvent) -> bool {
        // Verify required fields
        if event.id.is_nil() || event.stream_id.as_ref().is_empty() {
            return false;
        }

        // Versions within a stream start at 1, so version 0 is invalid
        if event.version.as_u64() == 0 {
            return false;
        }

        // Verify timestamp is reasonable (not in the future, not older than ~10 years)
        let now = Utc::now();
        if event.occurred_at > now
            || event.occurred_at < (now - chrono::Duration::days(3650))
        {
            return false;
        }

        true
    }

    fn is_event_ordering_valid(&self, _event: &StoredEvent) -> bool {
        // This would need to track ordering within streams.
        // Simplified implementation for this example.
        true
    }
}

#[derive(Debug)]
pub struct VerificationResult {
    pub checksum_valid: bool,
    pub metadata_consistent: bool,
    pub events_valid: bool,
    pub completeness_verified: bool,
    pub missing_events: u64,
    pub verification_time: DateTime<Utc>,
    pub overall_valid: bool,
}
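The per-stream ordering check that `is_event_ordering_valid` leaves as a stub boils down to: versions within a stream must form the contiguous run 1, 2, 3, and so on. A std-only sketch of that check, with versions as plain `u64` instead of EventCore's `EventVersion` newtype:

```rust
// Report every (expected, actual) pair where a stream's version sequence
// deviates from the contiguous run starting at 1. Gaps and duplicates
// both show up as mismatches.
fn find_version_gaps(versions: &[u64]) -> Vec<(u64, u64)> {
    versions
        .iter()
        .enumerate()
        .filter_map(|(i, &actual)| {
            let expected = i as u64 + 1;
            (actual != expected).then(|| (expected, actual))
        })
        .collect()
}

fn main() {
    // A healthy stream produces no findings.
    assert!(find_version_gaps(&[1, 2, 3]).is_empty());
    // Version 3 is missing: the event at index 2 carries version 4.
    assert_eq!(find_version_gaps(&[1, 2, 4]), vec![(3, 4)]);
}
```

A full implementation would carry this expected-version counter per stream while scanning the backup, resetting it at each stream boundary.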
Continuous Integrity Monitoring
use std::sync::Arc;
use std::time::Duration;

use chrono::{DateTime, Utc};

#[derive(Debug, Clone)]
pub struct IntegrityMonitor {
    event_store: Arc<dyn EventStore>,
    monitoring_config: IntegrityMonitoringConfig,
}

#[derive(Debug, Clone)]
pub struct IntegrityMonitoringConfig {
    pub check_interval: Duration,
    pub sample_percentage: f64,
    pub alert_on_corruption: bool,
    pub auto_repair: bool,
}

impl IntegrityMonitor {
    pub async fn start_monitoring(&self) -> Result<(), MonitoringError> {
        tracing::info!("Starting continuous integrity monitoring");

        let mut interval = tokio::time::interval(self.monitoring_config.check_interval);

        loop {
            interval.tick().await;

            match self.perform_integrity_check().await {
                Ok(report) => {
                    if !report.integrity_ok {
                        tracing::error!(
                            corruption_count = report.corrupted_events,
                            "Data integrity issues detected"
                        );

                        if self.monitoring_config.alert_on_corruption {
                            self.send_corruption_alert(&report).await;
                        }

                        if self.monitoring_config.auto_repair {
                            self.attempt_auto_repair(&report).await;
                        }
                    } else {
                        tracing::debug!("Integrity check passed");
                    }
                }
                Err(e) => {
                    tracing::error!(error = %e, "Integrity check failed");
                }
            }
        }
    }

    async fn perform_integrity_check(&self) -> Result<IntegrityReport, MonitoringError> {
        let start_time = Utc::now();
        // `DateTime<Utc>` and `chrono::Duration` have no `Default`, so the
        // report is built explicitly.
        let mut report = IntegrityReport {
            check_time: start_time,
            check_duration: chrono::Duration::zero(),
            events_checked: 0,
            corrupted_events: 0,
            integrity_ok: false,
            corruption_details: Vec::new(),
        };

        // Sample events for checking
        let sample_events = self.sample_events().await?;
        report.events_checked = sample_events.len() as u64;

        for event in sample_events {
            // Check event integrity
            let integrity_check = self.check_event_integrity(&event).await?;
            if !integrity_check.valid {
                report.corrupted_events += 1;
                report.corruption_details.push(integrity_check);
            }
        }

        report.check_time = Utc::now();
        report.check_duration = report.check_time.signed_duration_since(start_time);
        report.integrity_ok = report.corrupted_events == 0;

        Ok(report)
    }

    async fn sample_events(&self) -> Result<Vec<StoredEvent>, MonitoringError> {
        // Sample a percentage of events for integrity checking
        let sample_size = ((self.get_total_event_count().await? as f64)
            * self.monitoring_config.sample_percentage
            / 100.0) as usize;

        // Use reservoir sampling or a similar technique
        self.event_store
            .sample_events(sample_size)
            .await
            .map_err(MonitoringError::EventStoreError)
    }

    async fn check_event_integrity(
        &self,
        event: &StoredEvent,
    ) -> Result<EventIntegrityCheck, MonitoringError> {
        let mut check = EventIntegrityCheck {
            event_id: event.id,
            stream_id: event.stream_id.clone(),
            valid: true,
            issues: Vec::new(),
        };

        // Check payload can be deserialized
        if serde_json::from_value::<serde_json::Value>(event.payload.clone()).is_err() {
            check.valid = false;
            check.issues.push("Payload deserialization failed".to_string());
        }

        // Check metadata is present
        if event.metadata.is_empty() {
            check.issues.push("Missing metadata".to_string());
        }

        // Check event ordering within stream
        if self.verify_event_ordering(event).await.is_err() {
            check.valid = false;
            check.issues.push("Event ordering violation".to_string());
        }

        Ok(check)
    }
}

#[derive(Debug)]
pub struct IntegrityReport {
    pub check_time: DateTime<Utc>,
    pub check_duration: chrono::Duration,
    pub events_checked: u64,
    pub corrupted_events: u64,
    pub integrity_ok: bool,
    pub corruption_details: Vec<EventIntegrityCheck>,
}

#[derive(Debug)]
pub struct EventIntegrityCheck {
    pub event_id: EventId,
    pub stream_id: StreamId,
    pub valid: bool,
    pub issues: Vec<String>,
}
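`sample_events` mentions reservoir sampling. Algorithm R is the textbook version: a single pass over the event stream, O(k) memory, and each event lands in the sample with probability k/n. This sketch uses a small LCG in place of a real RNG crate so it stays std-only:

```rust
// Reservoir sampling (Algorithm R): fill the reservoir with the first k
// items, then replace a random slot with probability k / (i + 1) for
// each later item i.
fn reservoir_sample(items: impl Iterator<Item = u64>, k: usize, seed: u64) -> Vec<u64> {
    let mut rng = seed;
    let mut sample: Vec<u64> = Vec::with_capacity(k);
    for (i, item) in items.enumerate() {
        if i < k {
            sample.push(item);
        } else {
            // Advance a simple LCG; production code should use a proper RNG.
            rng = rng
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let j = (rng % (i as u64 + 1)) as usize;
            if j < k {
                sample[j] = item;
            }
        }
    }
    sample
}

fn main() {
    // Exactly k items survive, all drawn from the input.
    let sample = reservoir_sample(0u64..1000, 10, 42);
    assert_eq!(sample.len(), 10);
    assert!(sample.iter().all(|&x| x < 1000));
    // Fewer items than k: everything is kept, in order.
    assert_eq!(reservoir_sample(0u64..5, 10, 1), vec![0, 1, 2, 3, 4]);
}
```

The advantage over "read all event IDs, then pick some" is that the sampler never needs the total count or a second pass, which matters when the event table is large.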
Backup Testing and Validation
Automated Backup Testing
use std::sync::Arc;
use std::time::Duration;

use chrono::{DateTime, Utc};

#[derive(Debug, Clone)]
pub struct BackupTestSuite {
    backup_manager: BackupManager,
    test_event_store: Arc<dyn EventStore>,
    test_config: BackupTestConfig,
}

#[derive(Debug, Clone)]
pub struct BackupTestConfig {
    pub test_frequency: Duration,
    pub full_restore_test_frequency: Duration,
    pub sample_restore_percentage: f64,
    pub cleanup_test_data: bool,
}

impl BackupTestSuite {
    pub async fn run_comprehensive_backup_tests(&self) -> Result<TestResults, TestError> {
        tracing::info!("Starting comprehensive backup tests");

        let mut results = TestResults::default();

        // Test 1: Backup creation
        results.backup_creation = self.test_backup_creation().await?;

        // Test 2: Backup verification
        results.backup_verification = self.test_backup_verification().await?;

        // Test 3: Partial restore
        results.partial_restore = self.test_partial_restore().await?;

        // Test 4: Full restore (if scheduled)
        if self.should_run_full_restore_test().await? {
            results.full_restore = Some(self.test_full_restore().await?);
        }

        // Test 5: Point-in-time recovery
        results.point_in_time_recovery = self.test_point_in_time_recovery().await?;

        // Test 6: Cross-region restore
        results.cross_region_restore = self.test_cross_region_restore().await?;

        results.overall_success = results.all_tests_passed();
        results.test_time = Utc::now();

        if results.overall_success {
            tracing::info!("All backup tests passed");
        } else {
            tracing::error!(results = ?results, "Some backup tests failed");
        }

        Ok(results)
    }

    async fn test_backup_creation(&self) -> Result<TestResult, TestError> {
        let start_time = Utc::now();

        // Create test data
        let test_events = self.create_test_events(1000).await?;
        self.write_test_events(&test_events).await?;

        // Create backup
        let backup_result = self.backup_manager.create_full_backup().await;
        let duration = Utc::now().signed_duration_since(start_time);

        match backup_result {
            Ok(metadata) => Ok(TestResult {
                test_name: "backup_creation".to_string(),
                success: true,
                duration,
                details: format!("Backup created: {}", metadata.backup_id),
                error: None,
            }),
            Err(e) => Ok(TestResult {
                test_name: "backup_creation".to_string(),
                success: false,
                duration,
                details: "Backup creation failed".to_string(),
                error: Some(e.to_string()),
            }),
        }
    }

    async fn test_full_restore(&self) -> Result<TestResult, TestError> {
        let start_time = Utc::now();

        // Get latest backup
        let latest_backup = self.backup_manager.get_latest_backup().await?;

        // Create clean test environment
        let test_store = self.create_clean_test_store().await?;

        // Perform restore
        let restore_result = self.restore_backup_to_store(
            &latest_backup.backup_id,
            &test_store,
        ).await;
        let duration = Utc::now().signed_duration_since(start_time);

        match restore_result {
            Ok(_) => {
                // Verify restore completeness
                let verification = self.verify_restore_completeness(&test_store).await?;

                Ok(TestResult {
                    test_name: "full_restore".to_string(),
                    success: verification.complete,
                    duration,
                    details: format!(
                        "Events restored: {}, Streams restored: {}",
                        verification.events_count, verification.streams_count
                    ),
                    error: None,
                })
            }
            Err(e) => Ok(TestResult {
                test_name: "full_restore".to_string(),
                success: false,
                duration,
                details: "Full restore failed".to_string(),
                error: Some(e.to_string()),
            }),
        }
    }
}

#[derive(Debug)]
pub struct TestResults {
    pub backup_creation: TestResult,
    pub backup_verification: TestResult,
    pub partial_restore: TestResult,
    pub full_restore: Option<TestResult>,
    pub point_in_time_recovery: TestResult,
    pub cross_region_restore: TestResult,
    pub overall_success: bool,
    pub test_time: DateTime<Utc>,
}

// `DateTime<Utc>` and `chrono::Duration` do not implement `Default`, so the
// placeholder values are provided manually instead of via `#[derive(Default)]`.
impl Default for TestResults {
    fn default() -> Self {
        Self {
            backup_creation: TestResult::default(),
            backup_verification: TestResult::default(),
            partial_restore: TestResult::default(),
            full_restore: None,
            point_in_time_recovery: TestResult::default(),
            cross_region_restore: TestResult::default(),
            overall_success: false,
            test_time: Utc::now(),
        }
    }
}

impl TestResults {
    fn all_tests_passed(&self) -> bool {
        self.backup_creation.success
            && self.backup_verification.success
            && self.partial_restore.success
            && self.full_restore.as_ref().map_or(true, |t| t.success)
            && self.point_in_time_recovery.success
            && self.cross_region_restore.success
    }
}

#[derive(Debug)]
pub struct TestResult {
    pub test_name: String,
    pub success: bool,
    pub duration: chrono::Duration,
    pub details: String,
    pub error: Option<String>,
}

impl Default for TestResult {
    fn default() -> Self {
        Self {
            test_name: String::new(),
            success: false,
            duration: chrono::Duration::zero(),
            details: String::new(),
            error: None,
        }
    }
}
Best Practices
- Regular backups - Automated, frequent backup schedules
- Multiple strategies - Full, incremental, and WAL-based backups
- Geographic distribution - Multi-region backup storage
- Regular testing - Automated backup and restore testing
- Integrity verification - Continuous data integrity monitoring
- Recovery planning - Documented disaster recovery procedures
- Retention policies - Appropriate data retention and archival
- Security - Encrypted backups and secure storage
Summary
EventCore backup and recovery:
- ✅ Comprehensive backups - Full, incremental, and point-in-time
- ✅ Disaster recovery - Multi-region failover capabilities
- ✅ Data integrity - Continuous verification and monitoring
- ✅ Automated testing - Regular backup and restore validation
- ✅ Recovery orchestration - Automated disaster recovery procedures
Key components:
- Implement automated backup strategies with multiple approaches
- Design disaster recovery procedures for various failure scenarios
- Continuously monitor data integrity with automated verification
- Test backup and recovery procedures regularly
- Maintain geographic distribution of backups for resilience
Next, let’s explore Troubleshooting →
Chapter 6.4: Troubleshooting
This chapter provides comprehensive troubleshooting guidance for EventCore applications in production. From common issues to advanced debugging techniques, you’ll learn to diagnose and resolve problems quickly.
Common Issues and Solutions
Command Execution Failures
Issue: Commands timing out
Symptoms:
- Commands taking longer than expected
- Timeout errors in logs
- Degraded system performance
Debugging steps:
// Enable detailed command tracing
#[tracing::instrument(skip(command, executor), level = "debug")]
async fn debug_command_execution<C: Command>(
    command: &C,
    executor: &CommandExecutor,
) -> CommandResult<ExecutionResult> {
    let start = std::time::Instant::now();

    tracing::debug!(
        command_type = std::any::type_name::<C>(),
        "Starting command execution"
    );

    // Check stream access patterns
    let read_streams = command.read_streams();
    tracing::debug!(
        stream_count = read_streams.len(),
        streams = ?read_streams,
        "Command will read from streams"
    );

    // Execute and time the command
    let result = executor.execute(command).await;
    let total_duration = start.elapsed();

    match &result {
        Ok(execution_result) => {
            tracing::info!(
                total_duration_ms = total_duration.as_millis(),
                events_written = execution_result.events_written.len(),
                "Command completed successfully"
            );
        }
        Err(error) => {
            tracing::error!(
                total_duration_ms = total_duration.as_millis(),
                error = %error,
                "Command failed"
            );
        }
    }

    result
}
Common causes and solutions:
- Database connection pool exhaustion
// Check connection pool metrics
async fn diagnose_connection_pool(pool: &sqlx::PgPool) {
    let pool_options = pool.options();
    let pool_size = pool.size() as usize;
    let idle_connections = pool.num_idle();

    tracing::info!(
        max_connections = pool_options.get_max_connections(),
        current_size = pool_size,
        idle_connections = idle_connections,
        active_connections = pool_size - idle_connections,
        "Connection pool status"
    );

    // Alert if pool utilization is high
    let utilization = (pool_size as f64) / (pool_options.get_max_connections() as f64);
    if utilization > 0.8 {
        tracing::warn!(
            utilization_percent = utilization * 100.0,
            "High connection pool utilization"
        );
    }
}
- Long-running database queries
-- PostgreSQL: Check for long-running queries
SELECT
    pid,
    now() - pg_stat_activity.query_start AS duration,
    query,
    state
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes'
    AND state = 'active';
- Lock contention on streams
use std::time::Duration;

// Implement lock timeout and retry
async fn execute_with_lock_retry<C: Command>(
    command: &C,
    executor: &CommandExecutor,
    max_retries: u32,
) -> CommandResult<ExecutionResult> {
    let mut retry_count = 0;

    loop {
        match executor.execute(command).await {
            Ok(result) => return Ok(result),
            Err(CommandError::ConcurrencyConflict(streams)) => {
                retry_count += 1;
                if retry_count >= max_retries {
                    return Err(CommandError::ConcurrencyConflict(streams));
                }

                // Exponential backoff
                let delay = Duration::from_millis(100 * 2_u64.pow(retry_count - 1));
                tokio::time::sleep(delay).await;

                tracing::warn!(
                    retry_attempt = retry_count,
                    delay_ms = delay.as_millis(),
                    conflicting_streams = ?streams,
                    "Retrying command due to concurrency conflict"
                );
            }
            Err(other_error) => return Err(other_error),
        }
    }
}
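The retry loop above sleeps for 100ms times 2^(attempt-1). That schedule can be pulled out and unit-tested on its own; the cap added here is an assumption, but most production retry policies want one so late attempts do not sleep for minutes:

```rust
use std::time::Duration;

// Exponential backoff schedule: base * 2^(attempt - 1), clamped to `cap`.
// `saturating_*` keeps huge attempt numbers from overflowing.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let exp = base.saturating_mul(2u32.saturating_pow(attempt.saturating_sub(1)));
    exp.min(cap)
}

fn main() {
    let base = Duration::from_millis(100);
    let cap = Duration::from_secs(5);
    assert_eq!(backoff_delay(1, base, cap), Duration::from_millis(100));
    assert_eq!(backoff_delay(2, base, cap), Duration::from_millis(200));
    assert_eq!(backoff_delay(4, base, cap), Duration::from_millis(800));
    // Attempt 10 would be 51.2s uncapped; the cap clamps it.
    assert_eq!(backoff_delay(10, base, cap), Duration::from_secs(5));
}
```

Adding random jitter on top of this schedule also helps when many commands conflict on the same streams, since it prevents all retries from firing at the same instant.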
Issue: Command validation failures
Symptoms:
- Validation errors in command processing
- Business rule violations
- Data consistency issues
Debugging approach:
// Enhanced validation with detailed error reporting
#[derive(Debug, thiserror::Error)]
pub enum DetailedValidationError {
    #[error("Field validation failed: {field} - {reason}")]
    FieldValidation { field: String, reason: String },

    #[error("Business rule violation: {rule} - {context}")]
    BusinessRule { rule: String, context: String },

    #[error("State precondition failed: expected {expected}, found {actual}")]
    StatePrecondition { expected: String, actual: String },

    #[error("Reference validation failed: {reference_type} {reference_id} not found")]
    ReferenceNotFound { reference_type: String, reference_id: String },
}

// Validation with detailed context
pub fn validate_transfer_command(
    command: &TransferMoney,
    state: &AccountState,
) -> Result<(), DetailedValidationError> {
    // Check amount
    if command.amount <= Money::zero() {
        return Err(DetailedValidationError::FieldValidation {
            field: "amount".to_string(),
            reason: format!("Amount must be positive, got {}", command.amount),
        });
    }

    // Check account state
    if !state.is_active {
        return Err(DetailedValidationError::StatePrecondition {
            expected: "active account".to_string(),
            actual: "inactive account".to_string(),
        });
    }

    // Check sufficient balance
    if state.balance < command.amount {
        return Err(DetailedValidationError::BusinessRule {
            rule: "sufficient_balance".to_string(),
            context: format!(
                "Balance {} insufficient for transfer {}",
                state.balance, command.amount
            ),
        });
    }

    Ok(())
}
Event Store Issues
Issue: High event store latency
Diagnosis tools:
use std::collections::HashMap;
use std::future::Future;
use std::sync::Arc;
use std::time::Duration;

use chrono::{DateTime, Utc};
use tokio::sync::Mutex;

// Event store performance monitor
#[derive(Debug, Clone)]
pub struct EventStoreMonitor {
    latency_tracker: Arc<Mutex<LatencyTracker>>,
}

impl EventStoreMonitor {
    pub async fn monitor_operation<F, T>(
        &self,
        operation_name: &str,
        operation: F,
    ) -> Result<T, EventStoreError>
    where
        F: Future<Output = Result<T, EventStoreError>>,
    {
        let start = std::time::Instant::now();
        let result = operation.await;
        let duration = start.elapsed();

        // Record latency
        {
            let mut tracker = self.latency_tracker.lock().await;
            tracker.record_operation(operation_name, duration, result.is_ok());
        }

        // Alert on high latency
        if duration > Duration::from_millis(1000) {
            tracing::warn!(
                operation = operation_name,
                duration_ms = duration.as_millis(),
                success = result.is_ok(),
                "High latency event store operation"
            );
        }

        result
    }

    pub async fn get_performance_report(&self) -> PerformanceReport {
        let tracker = self.latency_tracker.lock().await;
        tracker.generate_report()
    }
}

#[derive(Debug)]
pub struct LatencyTracker {
    operations: HashMap<String, Vec<OperationMetric>>,
}

#[derive(Debug, Clone)]
struct OperationMetric {
    duration: Duration,
    success: bool,
    timestamp: DateTime<Utc>,
}

impl LatencyTracker {
    pub fn record_operation(&mut self, operation: &str, duration: Duration, success: bool) {
        let metric = OperationMetric {
            duration,
            success,
            timestamp: Utc::now(),
        };

        self.operations
            .entry(operation.to_string())
            .or_insert_with(Vec::new)
            .push(metric);

        // Keep only recent metrics (last hour)
        let cutoff = Utc::now() - chrono::Duration::hours(1);
        for metrics in self.operations.values_mut() {
            metrics.retain(|m| m.timestamp > cutoff);
        }
    }

    pub fn generate_report(&self) -> PerformanceReport {
        let mut report = PerformanceReport::default();

        for (operation, metrics) in &self.operations {
            if metrics.is_empty() {
                continue;
            }

            let durations: Vec<_> = metrics.iter().map(|m| m.duration).collect();
            let success_rate =
                metrics.iter().filter(|m| m.success).count() as f64 / metrics.len() as f64;

            let operation_stats = OperationStats {
                operation_name: operation.clone(),
                total_operations: metrics.len(),
                success_rate,
                avg_duration: durations.iter().sum::<Duration>() / durations.len() as u32,
                p95_duration: calculate_percentile(&durations, 0.95),
                p99_duration: calculate_percentile(&durations, 0.99),
            };

            report.operations.push(operation_stats);
        }

        report
    }
}

fn calculate_percentile(durations: &[Duration], percentile: f64) -> Duration {
    let mut sorted = durations.to_vec();
    sorted.sort();
    let index = ((sorted.len() as f64 - 1.0) * percentile) as usize;
    sorted[index]
}
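`calculate_percentile` above uses the nearest-rank index, the floor of (n-1)*p over the sorted samples. A std-only copy makes the edge behavior easy to check: p=0 returns the minimum, p=1 the maximum, and p=0.95 over five samples lands on index 3.

```rust
use std::time::Duration;

// Same nearest-rank percentile as the tracker's helper: sort, then index
// at floor((n - 1) * p). Assumes a non-empty slice.
fn percentile(durations: &[Duration], p: f64) -> Duration {
    let mut sorted = durations.to_vec();
    sorted.sort();
    let index = ((sorted.len() as f64 - 1.0) * p) as usize;
    sorted[index]
}

fn main() {
    let ms = |n| Duration::from_millis(n);
    let samples = vec![ms(10), ms(20), ms(30), ms(40), ms(100)];
    assert_eq!(percentile(&samples, 0.0), ms(10));   // minimum
    // (5 - 1) * 0.95 = 3.8, truncated to index 3
    assert_eq!(percentile(&samples, 0.95), ms(40));
    assert_eq!(percentile(&samples, 1.0), ms(100));  // maximum
}
```

Note the truncation makes small samples optimistic: with five data points, p95 ignores the worst observation entirely. With high-cardinality latency data, interpolating between ranks (or keeping more samples) gives a fairer tail estimate.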
PostgreSQL-specific debugging:
-- Check for blocking queries
SELECT
blocked_locks.pid AS blocked_pid,
blocked_activity.usename AS blocked_user,
blocking_locks.pid AS blocking_pid,
blocking_activity.usename AS blocking_user,
blocked_activity.query AS blocked_statement,
blocking_activity.query AS blocking_statement
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity
ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
ON blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity
ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.GRANTED;
-- Check index usage
SELECT
schemaname,
tablename,
indexname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan < 100
ORDER BY idx_scan;
-- Check table and index sizes
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
Issue: Event store corruption
Detection and recovery:
#![allow(unused)] fn main() { // Corruption detection pub struct CorruptionDetector { event_store: Arc<dyn EventStore>, } impl CorruptionDetector { pub async fn scan_for_corruption(&self) -> Result<CorruptionReport, ScanError> { let mut report = CorruptionReport::default(); // Scan all streams let all_streams = self.event_store.list_all_streams().await?; for stream_id in all_streams { match self.scan_stream(&stream_id).await { Ok(stream_report) => { if stream_report.has_issues() { report.corrupted_streams.push(stream_report); } } Err(e) => { tracing::error!( stream_id = %stream_id, error = %e, "Failed to scan stream for corruption" ); report.scan_errors.push(ScanError::StreamScanFailed { stream_id: stream_id.clone(), error: e.to_string(), }); } } } report.scan_completed_at = Utc::now(); Ok(report) } async fn scan_stream(&self, stream_id: &StreamId) -> Result<StreamCorruptionReport, ScanError> { let mut report = StreamCorruptionReport { stream_id: stream_id.clone(), issues: Vec::new(), }; let events = self.event_store.read_stream(stream_id, ReadOptions::default()).await?; // Check version sequence for (i, event) in events.events.iter().enumerate() { let expected_version = EventVersion::from(i as u64 + 1); if event.version != expected_version { report.issues.push(CorruptionIssue::VersionGap { event_id: event.id, expected_version, actual_version: event.version, }); } // Check event structure if let Err(e) = self.validate_event_structure(event) { report.issues.push(CorruptionIssue::StructuralError { event_id: event.id, error: e, }); } } Ok(report) } fn validate_event_structure(&self, event: &StoredEvent) -> Result<(), String> { // Check UUID format if event.id.is_nil() { return Err("Nil event ID".to_string()); } // Check payload can be deserialized match serde_json::from_value::<serde_json::Value>(event.payload.clone()) { Ok(_) => {} Err(e) => return Err(format!("Invalid payload JSON: {}", e)), } // Check timestamp is reasonable let now = Utc::now(); if 
event.occurred_at > now + chrono::Duration::minutes(5) { return Err("Event timestamp is in the future".to_string()); } if event.occurred_at < (now - chrono::Duration::days(10 * 365)) { return Err("Event timestamp is too old".to_string()); } Ok(()) } } #[derive(Debug, Default)] pub struct CorruptionReport { pub corrupted_streams: Vec<StreamCorruptionReport>, pub scan_errors: Vec<ScanError>, pub scan_completed_at: DateTime<Utc>, } #[derive(Debug)] pub struct StreamCorruptionReport { pub stream_id: StreamId, pub issues: Vec<CorruptionIssue>, } impl StreamCorruptionReport { pub fn has_issues(&self) -> bool { !self.issues.is_empty() } } #[derive(Debug)] pub enum CorruptionIssue { VersionGap { event_id: EventId, expected_version: EventVersion, actual_version: EventVersion, }, StructuralError { event_id: EventId, error: String, }, DuplicateEvent { event_id: EventId, duplicate_id: EventId, }, } }
Projection Issues
Issue: Projection lag
Monitoring and diagnosis:
#![allow(unused)] fn main() { // Projection lag monitor #[derive(Debug, Clone)] pub struct ProjectionLagMonitor { event_store: Arc<dyn EventStore>, projection_manager: Arc<ProjectionManager>, } impl ProjectionLagMonitor { pub async fn check_all_projections(&self) -> Result<Vec<ProjectionLagReport>, MonitorError> { let mut reports = Vec::new(); let projections = self.projection_manager.list_projections().await?; let latest_event_time = self.get_latest_event_time().await?; for projection_name in projections { let report = self.check_projection_lag(&projection_name, latest_event_time).await?; reports.push(report); } Ok(reports) } async fn check_projection_lag( &self, projection_name: &str, latest_event_time: DateTime<Utc>, ) -> Result<ProjectionLagReport, MonitorError> { let checkpoint = self.projection_manager .get_checkpoint(projection_name) .await?; let lag = match checkpoint.last_processed_at { Some(last_processed) => latest_event_time.signed_duration_since(last_processed), None => chrono::Duration::max_value(), // Never processed }; let status = if lag > chrono::Duration::minutes(30) { ProjectionStatus::Critical } else if lag > chrono::Duration::minutes(5) { ProjectionStatus::Warning } else { ProjectionStatus::Healthy }; Ok(ProjectionLagReport { projection_name: projection_name.to_string(), lag_duration: lag, status, last_processed_event: checkpoint.last_event_id, last_processed_at: checkpoint.last_processed_at, events_processed: checkpoint.events_processed, }) } async fn get_latest_event_time(&self) -> Result<DateTime<Utc>, MonitorError> { // Get the timestamp of the most recent event across all streams self.event_store.get_latest_event_time().await .map_err(MonitorError::EventStoreError) } } #[derive(Debug)] pub struct ProjectionLagReport { pub projection_name: String, pub lag_duration: chrono::Duration, pub status: ProjectionStatus, pub last_processed_event: Option<EventId>, pub last_processed_at: Option<DateTime<Utc>>, pub events_processed: u64, } 
#[derive(Debug, Clone)] pub enum ProjectionStatus { Healthy, Warning, Critical, } }
Projection rebuild when corrupted:
#![allow(unused)] fn main() { // Safe projection rebuild pub struct ProjectionRebuilder { event_store: Arc<dyn EventStore>, projection_manager: Arc<ProjectionManager>, } impl ProjectionRebuilder { pub async fn rebuild_projection( &self, projection_name: &str, strategy: RebuildStrategy, ) -> Result<RebuildResult, RebuildError> { tracing::info!( projection_name = projection_name, strategy = ?strategy, "Starting projection rebuild" ); let start_time = Utc::now(); // Create backup of current projection state let backup_id = self.backup_projection_state(projection_name).await?; // Reset projection state self.projection_manager.reset_projection(projection_name).await?; // Rebuild based on strategy let rebuild_result = match strategy { RebuildStrategy::Full => { self.rebuild_from_beginning(projection_name).await } RebuildStrategy::FromCheckpoint { checkpoint_time } => { self.rebuild_from_checkpoint(projection_name, checkpoint_time).await } RebuildStrategy::FromEvent { event_id } => { self.rebuild_from_event(projection_name, event_id).await } }; match rebuild_result { Ok(stats) => { // Rebuild successful - clean up backup self.cleanup_projection_backup(backup_id).await?; let duration = Utc::now().signed_duration_since(start_time); tracing::info!( projection_name = projection_name, events_processed = stats.events_processed, duration_seconds = duration.num_seconds(), "Projection rebuild completed successfully" ); Ok(RebuildResult { success: true, events_processed: stats.events_processed, duration, backup_id: Some(backup_id), }) } Err(e) => { // Rebuild failed - restore from backup tracing::error!( projection_name = projection_name, error = %e, "Projection rebuild failed, restoring from backup" ); self.restore_projection_from_backup(projection_name, backup_id).await?; Err(RebuildError::RebuildFailed { original_error: Box::new(e), backup_restored: true, }) } } } async fn rebuild_from_beginning(&self, projection_name: &str) -> Result<RebuildStats, RebuildError> { let mut stats 
= RebuildStats::default(); // Get all events in chronological order let events = self.event_store.read_all_events_ordered().await?; // Process events in batches let batch_size = 1000; for chunk in events.chunks(batch_size) { self.projection_manager .process_events_batch(projection_name, chunk) .await?; stats.events_processed += chunk.len() as u64; // Checkpoint every batch self.projection_manager .save_checkpoint(projection_name) .await?; // Progress reporting if stats.events_processed % 10000 == 0 { tracing::info!( projection_name = projection_name, events_processed = stats.events_processed, "Rebuild progress" ); } } Ok(stats) } } #[derive(Debug)] pub enum RebuildStrategy { Full, FromCheckpoint { checkpoint_time: DateTime<Utc> }, FromEvent { event_id: EventId }, } #[derive(Debug, Default)] pub struct RebuildStats { pub events_processed: u64, } #[derive(Debug)] pub struct RebuildResult { pub success: bool, pub events_processed: u64, pub duration: chrono::Duration, pub backup_id: Option<Uuid>, } }
Debugging Tools
Command Execution Tracer
#![allow(unused)] fn main() { // Detailed command execution tracer #[derive(Debug, Clone)] pub struct CommandTracer { traces: Arc<Mutex<HashMap<Uuid, CommandTrace>>>, } #[derive(Debug, Clone)] pub struct CommandTrace { pub trace_id: Uuid, pub command_type: String, pub start_time: DateTime<Utc>, pub phases: Vec<TracePhase>, pub completed: bool, pub result: Option<Result<String, String>>, } #[derive(Debug, Clone)] pub struct TracePhase { pub phase_name: String, pub start_time: DateTime<Utc>, pub duration: Option<Duration>, pub details: HashMap<String, String>, } impl CommandTracer { pub fn start_trace<C: Command>(&self, command: &C) -> Uuid { let trace_id = Uuid::new_v4(); let trace = CommandTrace { trace_id, command_type: std::any::type_name::<C>().to_string(), start_time: Utc::now(), phases: Vec::new(), completed: false, result: None, }; let mut traces = self.traces.lock().unwrap(); traces.insert(trace_id, trace); tracing::info!( trace_id = %trace_id, command_type = std::any::type_name::<C>(), "Started command trace" ); trace_id } pub fn add_phase(&self, trace_id: Uuid, phase_name: &str, details: HashMap<String, String>) { let mut traces = self.traces.lock().unwrap(); if let Some(trace) = traces.get_mut(&trace_id) { trace.phases.push(TracePhase { phase_name: phase_name.to_string(), start_time: Utc::now(), duration: None, details, }); } } pub fn complete_phase(&self, trace_id: Uuid) { let mut traces = self.traces.lock().unwrap(); if let Some(trace) = traces.get_mut(&trace_id) { if let Some(last_phase) = trace.phases.last_mut() { last_phase.duration = Some( Utc::now().signed_duration_since(last_phase.start_time).to_std().unwrap_or_default() ); } } } pub fn complete_trace(&self, trace_id: Uuid, result: Result<String, String>) { let mut traces = self.traces.lock().unwrap(); if let Some(trace) = traces.get_mut(&trace_id) { trace.completed = true; trace.result = Some(result); let total_duration = Utc::now().signed_duration_since(trace.start_time); tracing::info!( 
trace_id = %trace_id, duration_ms = total_duration.num_milliseconds(), phases = trace.phases.len(), success = trace.result.as_ref().unwrap().is_ok(), "Completed command trace" ); } } pub fn get_trace(&self, trace_id: Uuid) -> Option<CommandTrace> { let traces = self.traces.lock().unwrap(); traces.get(&trace_id).cloned() } pub fn get_recent_traces(&self, limit: usize) -> Vec<CommandTrace> { let traces = self.traces.lock().unwrap(); let mut trace_list: Vec<_> = traces.values().cloned().collect(); trace_list.sort_by(|a, b| b.start_time.cmp(&a.start_time)); trace_list.into_iter().take(limit).collect() } } // Usage in command executor pub async fn execute_with_tracing<C: Command>( command: &C, executor: &CommandExecutor, tracer: &CommandTracer, ) -> CommandResult<ExecutionResult> { let trace_id = tracer.start_trace(command); // Phase 1: Stream Reading tracer.add_phase(trace_id, "stream_reading", HashMap::from([( "streams_to_read".to_string(), command.read_streams().len().to_string(), )])); let result = executor.execute(command).await; tracer.complete_phase(trace_id); // Complete trace let trace_result = match &result { Ok(execution_result) => Ok(format!( "Events written: {}, Streams affected: {}", execution_result.events_written.len(), execution_result.affected_streams.len() )), Err(e) => Err(e.to_string()), }; tracer.complete_trace(trace_id, trace_result); result } }
Performance Profiler
#![allow(unused)] fn main() { // Built-in performance profiler #[derive(Debug, Clone)] pub struct PerformanceProfiler { profiles: Arc<Mutex<HashMap<String, PerformanceProfile>>>, enabled: bool, } #[derive(Debug, Clone)] pub struct PerformanceProfile { pub operation_name: String, pub samples: Vec<PerformanceSample>, pub statistics: ProfileStatistics, } #[derive(Debug, Clone)] pub struct PerformanceSample { pub timestamp: DateTime<Utc>, pub duration: Duration, pub memory_before: usize, pub memory_after: usize, pub success: bool, pub metadata: HashMap<String, String>, } #[derive(Debug, Clone, Default)] pub struct ProfileStatistics { pub total_samples: usize, pub success_rate: f64, pub avg_duration: Duration, pub min_duration: Duration, pub max_duration: Duration, pub p95_duration: Duration, pub avg_memory_delta: i64, } impl PerformanceProfiler { pub fn new(enabled: bool) -> Self { Self { profiles: Arc::new(Mutex::new(HashMap::new())), enabled, } } pub async fn profile_operation<F, T>(&self, operation_name: &str, operation: F) -> T where F: Future<Output = T>, { if !self.enabled { return operation.await; } let memory_before = self.get_current_memory_usage(); let start_time = Utc::now(); let start_instant = std::time::Instant::now(); let result = operation.await; let duration = start_instant.elapsed(); let memory_after = self.get_current_memory_usage(); let sample = PerformanceSample { timestamp: start_time, duration, memory_before, memory_after, success: true, // Would need to be determined by operation type metadata: HashMap::new(), }; // Record sample let mut profiles = self.profiles.lock().await; let profile = profiles.entry(operation_name.to_string()).or_insert_with(|| { PerformanceProfile { operation_name: operation_name.to_string(), samples: Vec::new(), statistics: ProfileStatistics::default(), } }); profile.samples.push(sample); // Update statistics self.update_statistics(profile); // Keep only recent samples (last hour) let cutoff = Utc::now() - 
chrono::Duration::hours(1); profile.samples.retain(|s| s.timestamp > cutoff); result } fn update_statistics(&self, profile: &mut PerformanceProfile) { if profile.samples.is_empty() { return; } let mut durations: Vec<_> = profile.samples.iter().map(|s| s.duration).collect(); durations.sort(); let success_count = profile.samples.iter().filter(|s| s.success).count(); profile.statistics = ProfileStatistics { total_samples: profile.samples.len(), success_rate: success_count as f64 / profile.samples.len() as f64, avg_duration: durations.iter().sum::<Duration>() / durations.len() as u32, min_duration: durations[0], max_duration: durations[durations.len() - 1], p95_duration: durations[(durations.len() as f64 * 0.95) as usize], avg_memory_delta: profile.samples.iter() .map(|s| s.memory_after as i64 - s.memory_before as i64) .sum::<i64>() / profile.samples.len() as i64, }; } fn get_current_memory_usage(&self) -> usize { // Platform-specific memory usage detection // This is a simplified implementation 0 } pub async fn get_profile_report(&self) -> HashMap<String, ProfileStatistics> { let profiles = self.profiles.lock().await; profiles.iter() .map(|(name, profile)| (name.clone(), profile.statistics.clone())) .collect() } } }
Log Analysis Tools
#![allow(unused)] fn main() { // Automated log analysis for common issues #[derive(Debug, Clone)] pub struct LogAnalyzer { log_patterns: Vec<LogPattern>, } #[derive(Debug, Clone)] pub struct LogPattern { pub name: String, pub pattern: String, pub severity: LogSeverity, pub action: String, } #[derive(Debug, Clone)] pub enum LogSeverity { Info, Warning, Error, Critical, } impl LogAnalyzer { pub fn new() -> Self { Self { log_patterns: Self::default_patterns(), } } fn default_patterns() -> Vec<LogPattern> { vec![ LogPattern { name: "connection_pool_exhaustion".to_string(), pattern: r"(?i)connection.*pool.*exhausted|too many connections".to_string(), severity: LogSeverity::Critical, action: "Scale up connection pool or check for connection leaks".to_string(), }, LogPattern { name: "command_timeout".to_string(), pattern: r"(?i)command.*timeout|execution.*timeout".to_string(), severity: LogSeverity::Error, action: "Check database performance and query optimization".to_string(), }, LogPattern { name: "concurrency_conflict".to_string(), pattern: r"(?i)concurrency.*conflict|version.*conflict".to_string(), severity: LogSeverity::Warning, action: "Consider optimizing command patterns or retry strategies".to_string(), }, LogPattern { name: "memory_pressure".to_string(), pattern: r"(?i)out of memory|memory.*limit|allocation.*failed".to_string(), severity: LogSeverity::Critical, action: "Scale up memory or check for memory leaks".to_string(), }, LogPattern { name: "projection_lag".to_string(), pattern: r"(?i)projection.*lag|projection.*behind".to_string(), severity: LogSeverity::Warning, action: "Check projection performance and consider scaling".to_string(), }, ] } pub async fn analyze_logs(&self, log_entries: &[LogEntry]) -> LogAnalysisReport { let mut report = LogAnalysisReport::default(); for entry in log_entries { for pattern in &self.log_patterns { if self.matches_pattern(&entry.message, &pattern.pattern) { let issue = LogIssue { pattern_name: pattern.name.clone(), 
severity: pattern.severity.clone(), message: entry.message.clone(), timestamp: entry.timestamp, action: pattern.action.clone(), occurrences: 1, }; // Aggregate similar issues if let Some(existing) = report.issues.iter_mut() .find(|i| i.pattern_name == issue.pattern_name) { existing.occurrences += 1; if entry.timestamp > existing.timestamp { existing.timestamp = entry.timestamp; existing.message = entry.message.clone(); } } else { report.issues.push(issue); } } } } // Sort by severity and occurrence count report.issues.sort_by(|a, b| { match (&a.severity, &b.severity) { (LogSeverity::Critical, LogSeverity::Critical) => b.occurrences.cmp(&a.occurrences), (LogSeverity::Critical, _) => std::cmp::Ordering::Less, (_, LogSeverity::Critical) => std::cmp::Ordering::Greater, (LogSeverity::Error, LogSeverity::Error) => b.occurrences.cmp(&a.occurrences), (LogSeverity::Error, _) => std::cmp::Ordering::Less, (_, LogSeverity::Error) => std::cmp::Ordering::Greater, _ => b.occurrences.cmp(&a.occurrences), } }); report } fn matches_pattern(&self, message: &str, pattern: &str) -> bool { use regex::Regex; if let Ok(regex) = Regex::new(pattern) { regex.is_match(message) } else { false } } } #[derive(Debug, Default)] pub struct LogAnalysisReport { pub issues: Vec<LogIssue>, } #[derive(Debug)] pub struct LogIssue { pub pattern_name: String, pub severity: LogSeverity, pub message: String, pub timestamp: DateTime<Utc>, pub action: String, pub occurrences: u32, } #[derive(Debug)] pub struct LogEntry { pub timestamp: DateTime<Utc>, pub level: String, pub message: String, pub metadata: HashMap<String, String>, } }
Troubleshooting Runbooks
Common Runbooks
Runbook 1: High Command Latency
1. Check connection pool status:
curl http://localhost:9090/metrics | grep eventcore_connection_pool
2. Analyze slow queries (on PostgreSQL 12 and earlier the column is mean_time):
SELECT query, mean_exec_time, calls FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;
3. Check for lock contention:
SELECT * FROM pg_locks WHERE NOT granted;
4. Scale resources if needed:
kubectl scale deployment eventcore-app --replicas=6
Runbook 2: Projection Lag
1. Check projection status:
curl http://localhost:8080/health/projections
2. Identify lagging projections:
curl http://localhost:9090/metrics | grep projection_lag
3. Restart projection processing:
kubectl delete pod -l app=eventcore-projections
4. Rebuild the projection if corruption is detected:
kubectl exec -it eventcore-app -- eventcore-cli projection rebuild user-summary
Runbook 3: Memory Issues
1. Check memory usage:
kubectl top pods -l app=eventcore
2. Analyze memory patterns:
curl http://localhost:9090/metrics | grep memory_usage
3. Generate a heap dump if needed:
kubectl exec -it eventcore-app -- kill -USR1 1
4. Scale up memory limits:
resources:
  limits:
    memory: "1Gi"
Best Practices
- Comprehensive monitoring - Monitor all system components
- Automated diagnostics - Use tools to detect issues early
- Detailed logging - Include context and correlation IDs
- Performance profiling - Regular performance analysis
- Runbook maintenance - Keep troubleshooting guides updated
- Incident response - Defined escalation procedures
- Root cause analysis - Learn from every incident
- Preventive measures - Address issues before they become problems
Summary
EventCore troubleshooting:
- ✅ Systematic diagnosis - Structured approach to problem identification
- ✅ Comprehensive tools - Built-in debugging and monitoring tools
- ✅ Automated analysis - Log analysis and pattern detection
- ✅ Performance profiling - Detailed performance insights
- ✅ Runbook automation - Standardized troubleshooting procedures
Key components:
- Use comprehensive monitoring to detect issues early
- Implement systematic debugging approaches for complex problems
- Maintain detailed logs with proper correlation and context
- Use automated tools for log analysis and pattern detection
- Document and automate common troubleshooting procedures
Next, let’s explore Production Checklist →
Chapter 6.5: Production Checklist
This chapter provides a comprehensive checklist for deploying EventCore applications to production. Use this as a final validation before going live and as a periodic review for existing production systems.
Pre-Deployment Checklist
Security
Authentication and Authorization
- JWT secret key configured and secured
- Token expiration properly configured
- Role-based access control implemented and tested
- API rate limiting configured
- CORS origins restricted to known domains
- HTTPS enforced for all endpoints
- Security headers configured (HSTS, CSP, etc.)
#![allow(unused)] fn main() { // Security configuration validation #[derive(Debug)] pub struct SecurityAudit { pub findings: Vec<SecurityFinding>, } #[derive(Debug)] pub struct SecurityFinding { pub category: SecurityCategory, pub severity: SecuritySeverity, pub description: String, pub recommendation: String, } #[derive(Debug)] pub enum SecurityCategory { Authentication, Authorization, Encryption, NetworkSecurity, DataProtection, } #[derive(Debug)] pub enum SecuritySeverity { Critical, High, Medium, Low, } pub struct SecurityAuditor; impl SecurityAuditor { pub fn audit_configuration(config: &AppConfig) -> SecurityAudit { let mut findings = Vec::new(); // Check JWT configuration if config.jwt.secret_key.len() < 32 { findings.push(SecurityFinding { category: SecurityCategory::Authentication, severity: SecuritySeverity::Critical, description: "JWT secret key is too short".to_string(), recommendation: "Use a secret key of at least 256 bits (32 bytes)".to_string(), }); } // Check CORS configuration if config.cors.allowed_origins.contains(&"*".to_string()) { findings.push(SecurityFinding { category: SecurityCategory::NetworkSecurity, severity: SecuritySeverity::High, description: "CORS allows all origins".to_string(), recommendation: "Restrict CORS to specific trusted domains".to_string(), }); } // Check HTTPS enforcement if !config.server.force_https { findings.push(SecurityFinding { category: SecurityCategory::NetworkSecurity, severity: SecuritySeverity::High, description: "HTTPS not enforced".to_string(), recommendation: "Enable HTTPS enforcement for all endpoints".to_string(), }); } // Check rate limiting if config.rate_limiting.requests_per_minute == 0 { findings.push(SecurityFinding { category: SecurityCategory::NetworkSecurity, severity: SecuritySeverity::Medium, description: "Rate limiting not configured".to_string(), recommendation: "Configure appropriate rate limits for API endpoints".to_string(), }); } SecurityAudit { findings } } } }
Database Security
- Database credentials stored in secrets management
- Connection encryption (SSL/TLS) enabled
- Database user permissions follow principle of least privilege
- Database firewall rules restrict access
- Connection pooling properly configured
- Query parameterization used (prevent SQL injection)
-- PostgreSQL security checklist queries
-- Check SSL is enforced
SHOW ssl;
-- Check user permissions (psql meta-command)
\du
-- Check database-level permissions
SELECT datname, datacl FROM pg_database;
-- Check table-level permissions
SELECT schemaname, tablename, tableowner, tablespace, hasindexes, hasrules, hastriggers
FROM pg_tables
WHERE schemaname = 'public';
-- Verify no wildcard permissions
SELECT * FROM information_schema.table_privileges
WHERE grantee = 'PUBLIC';
Performance
Resource Limits
- CPU limits set appropriately
- Memory limits configured with buffer
- Database connection pool sized correctly
- Request timeouts configured
- Circuit breakers implemented
- Resource quotas set at namespace level
# Kubernetes resource configuration checklist
apiVersion: v1
kind: LimitRange
metadata:
  name: eventcore-limits
  namespace: eventcore
spec:
  limits:
  - type: Container
    default:
      memory: "512Mi"
      cpu: "500m"
    defaultRequest:
      memory: "256Mi"
      cpu: "250m"
    max:
      memory: "2Gi"
      cpu: "2000m"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: eventcore-quota
  namespace: eventcore
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    persistentvolumeclaims: "4"
Performance Benchmarks
- Load testing completed with realistic scenarios
- Performance baselines established
- Scalability limits identified
- Database query performance optimized
- Index usage analyzed and optimized
#![allow(unused)] fn main() { // Performance validation pub struct PerformanceValidator { target_metrics: PerformanceTargets, } #[derive(Debug, Clone)] pub struct PerformanceTargets { pub max_p95_latency_ms: u64, pub min_throughput_rps: f64, pub max_error_rate: f64, pub max_memory_usage_mb: f64, } impl PerformanceValidator { pub async fn validate_performance(&self) -> Result<PerformanceValidationResult, ValidationError> { let mut results = PerformanceValidationResult::default(); // Test command latency let latency_test = self.test_command_latency().await?; results.latency_passed = latency_test.p95_latency_ms <= self.target_metrics.max_p95_latency_ms; // Test throughput let throughput_test = self.test_throughput().await?; results.throughput_passed = throughput_test.requests_per_second >= self.target_metrics.min_throughput_rps; // Test error rate let error_test = self.test_error_rate().await?; results.error_rate_passed = error_test.error_rate <= self.target_metrics.max_error_rate; // Test memory usage let memory_test = self.test_memory_usage().await?; results.memory_passed = memory_test.peak_memory_mb <= self.target_metrics.max_memory_usage_mb; results.overall_passed = results.latency_passed && results.throughput_passed && results.error_rate_passed && results.memory_passed; Ok(results) } async fn test_command_latency(&self) -> Result<LatencyTestResult, ValidationError> { // Implement latency testing // Execute sample commands and measure response times Ok(LatencyTestResult { p95_latency_ms: 50, // Example result avg_latency_ms: 25, }) } async fn test_throughput(&self) -> Result<ThroughputTestResult, ValidationError> { // Implement throughput testing // Execute concurrent commands and measure RPS Ok(ThroughputTestResult { requests_per_second: 150.0, // Example result peak_concurrent_requests: 50, }) } } #[derive(Debug, Default)] pub struct PerformanceValidationResult { pub latency_passed: bool, pub throughput_passed: bool, pub error_rate_passed: bool, pub 
memory_passed: bool, pub overall_passed: bool, } }
Reliability
High Availability
- Multiple replicas deployed
- Pod disruption budgets configured
- Health checks implemented and tested
- Readiness probes properly configured
- Liveness probes tuned appropriately
- Rolling update strategy configured
# High availability configuration
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: eventcore-pdb
  namespace: eventcore
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: eventcore
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eventcore-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    spec:
      containers:
      - name: eventcore-app
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 5
          failureThreshold: 3
Backup and Recovery
- Automated backups configured and tested
- Backup verification automated
- Recovery procedures documented and tested
- Point-in-time recovery capability verified
- Cross-region backup replication configured
- Backup retention policies implemented
#![allow(unused)] fn main() { // Backup validation pub struct BackupValidator; impl BackupValidator { pub async fn validate_backup_system(&self) -> Result<BackupValidationResult, ValidationError> { let mut result = BackupValidationResult::default(); // Test backup creation result.backup_creation = self.test_backup_creation().await?; // Test backup verification result.backup_verification = self.test_backup_verification().await?; // Test restore functionality result.restore_capability = self.test_restore_capability().await?; // Test backup schedule result.backup_schedule = self.verify_backup_schedule().await?; // Test retention policy result.retention_policy = self.verify_retention_policy().await?; result.overall_passed = result.backup_creation && result.backup_verification && result.restore_capability && result.backup_schedule && result.retention_policy; Ok(result) } } #[derive(Debug, Default)] pub struct BackupValidationResult { pub backup_creation: bool, pub backup_verification: bool, pub restore_capability: bool, pub backup_schedule: bool, pub retention_policy: bool, pub overall_passed: bool, } }
Monitoring and Observability
Metrics Collection
- Application metrics exported to Prometheus
- Business metrics tracked
- Infrastructure metrics monitored
- Custom dashboards created for key metrics
- SLI/SLO defined and monitored
#![allow(unused)] fn main() { // Metrics validation pub struct MetricsValidator { prometheus_client: PrometheusClient, } impl MetricsValidator { pub async fn validate_metrics(&self) -> Result<MetricsValidationResult, ValidationError> { let mut result = MetricsValidationResult::default(); // Check core application metrics result.core_metrics = self.check_core_metrics().await?; // Check business metrics result.business_metrics = self.check_business_metrics().await?; // Check infrastructure metrics result.infrastructure_metrics = self.check_infrastructure_metrics().await?; // Verify metric freshness result.metrics_current = self.check_metrics_freshness().await?; result.overall_passed = result.core_metrics && result.business_metrics && result.infrastructure_metrics && result.metrics_current; Ok(result) } async fn check_core_metrics(&self) -> Result<bool, ValidationError> { let required_metrics = vec![ "eventcore_commands_total", "eventcore_command_duration_seconds", "eventcore_events_written_total", "eventcore_active_streams", "eventcore_projection_lag_seconds", ]; for metric in required_metrics { if !self.prometheus_client.metric_exists(metric).await? { return Ok(false); } } Ok(true) } } }
Logging
- Structured logging implemented
- Log aggregation configured
- Log retention policies set
- Correlation IDs used throughout
- Log levels appropriately configured
- Sensitive data excluded from logs
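The last item, keeping sensitive data out of logs, is the easiest to get wrong. A minimal sketch of a field-redaction helper (stdlib only; the field list and `redact` function are illustrative, not part of EventCore's API):

```rust
use std::collections::HashMap;

/// Field-name fragments that must never appear in log output verbatim.
/// Illustrative list; a real deployment would drive this from configuration.
const SENSITIVE_FIELDS: &[&str] = &["password", "ssn", "card_number", "authorization"];

/// Replace sensitive values with a fixed mask before the entry is serialized.
fn redact(fields: &mut HashMap<String, String>) {
    for (key, value) in fields.iter_mut() {
        if SENSITIVE_FIELDS.iter().any(|s| key.to_lowercase().contains(s)) {
            *value = "[REDACTED]".to_string();
        }
    }
}

fn main() {
    let mut entry = HashMap::from([
        ("user_id".to_string(), "u-123".to_string()),
        ("password".to_string(), "hunter2".to_string()),
    ]);
    redact(&mut entry);
    assert_eq!(entry["password"], "[REDACTED]");
    assert_eq!(entry["user_id"], "u-123");
}
```

Running redaction at the log-formatting boundary, rather than at each call site, keeps the exclusion policy in one auditable place.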
Alerting
- Critical alerts configured
- Warning alerts tuned to reduce noise
- Alert routing configured for different severities
- Escalation policies defined
- Alert fatigue minimized through proper thresholds
```yaml
# Alerting validation checklist
groups:
  - name: eventcore-critical
    rules:
      - alert: EventCoreDown
        expr: up{job="eventcore"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "EventCore service is down"

      - alert: HighErrorRate
        expr: rate(eventcore_command_errors_total[5m]) / rate(eventcore_commands_total[5m]) > 0.05
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"

      - alert: DatabaseConnectionFailure
        expr: eventcore_connection_pool_errors_total > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Database connection issues"
```
Deployment Checklist
Environment Configuration
- Environment variables properly set
- Secrets configured and mounted
- Config maps updated
- Feature flags configured appropriately
- Resource limits applied
- Network policies configured
Database Setup
- Database migrations applied and verified
- Database indexes created and optimized
- Database monitoring configured
- Connection pooling tuned
- Backup strategy implemented
- Read replicas configured if needed
Infrastructure
- DNS records configured
- Load balancer configured
- SSL certificates installed and valid
- CDN configured if applicable
- Firewall rules applied
- Network segmentation implemented
Post-Deployment Verification
Functional Testing
- Smoke tests pass
- Critical user journeys work
- API endpoints respond correctly
- Authentication works
- Authorization enforced
- Error handling works properly
```rust
// Post-deployment validation suite
pub struct PostDeploymentValidator {
    base_url: String,
    auth_token: String,
}

impl PostDeploymentValidator {
    pub async fn run_validation_suite(&self) -> Result<ValidationSuite, ValidationError> {
        let mut suite = ValidationSuite::default();

        // Test 1: Health check
        suite.health_check = self.test_health_endpoint().await?;

        // Test 2: Authentication
        suite.authentication = self.test_authentication().await?;

        // Test 3: Core functionality
        suite.core_functionality = self.test_core_functionality().await?;

        // Test 4: Performance
        suite.performance = self.test_basic_performance().await?;

        // Test 5: Error handling
        suite.error_handling = self.test_error_handling().await?;

        suite.overall_passed = suite.health_check
            && suite.authentication
            && suite.core_functionality
            && suite.performance
            && suite.error_handling;

        Ok(suite)
    }

    async fn test_health_endpoint(&self) -> Result<bool, ValidationError> {
        let response = reqwest::get(&format!("{}/health", self.base_url)).await?;
        Ok(response.status().is_success())
    }

    async fn test_authentication(&self) -> Result<bool, ValidationError> {
        // Test with valid token
        let client = reqwest::Client::new();
        let response = client
            .get(&format!("{}/api/v1/test", self.base_url))
            .header("Authorization", format!("Bearer {}", self.auth_token))
            .send()
            .await?;

        if !response.status().is_success() {
            return Ok(false);
        }

        // Test without token (should fail)
        let response = client
            .get(&format!("{}/api/v1/test", self.base_url))
            .send()
            .await?;

        Ok(response.status() == 401)
    }

    async fn test_core_functionality(&self) -> Result<bool, ValidationError> {
        // Test a simple command execution
        let client = reqwest::Client::new();
        let create_user_payload = serde_json::json!({
            "email": "test@example.com",
            "first_name": "Test",
            "last_name": "User"
        });

        let response = client
            .post(&format!("{}/api/v1/users", self.base_url))
            .header("Authorization", format!("Bearer {}", self.auth_token))
            .json(&create_user_payload)
            .send()
            .await?;

        Ok(response.status().is_success())
    }
}

#[derive(Debug, Default)]
pub struct ValidationSuite {
    pub health_check: bool,
    pub authentication: bool,
    pub core_functionality: bool,
    pub performance: bool,
    pub error_handling: bool,
    pub overall_passed: bool,
}
```
Performance Validation
- Response times within acceptable limits
- Throughput meets requirements
- Resource usage within limits
- Memory leaks not detected
- CPU usage stable
- Database performance optimal
Monitoring Validation
- Metrics flowing to monitoring system
- Logs being collected and indexed
- Traces visible in tracing system
- Alerts triggering appropriately
- Dashboards showing correct data
- SLI/SLO monitoring active
Ongoing Operations Checklist
Daily Checks
- System health green across all services
- Error rates within acceptable thresholds
- Performance metrics meeting SLOs
- Resource utilization not approaching limits
- Log analysis for new error patterns
- Security alerts reviewed
Weekly Checks
- Backup verification completed successfully
- Performance trends analyzed
- Capacity planning reviewed
- Security patches evaluated and applied
- Dependency updates reviewed
- Documentation updated as needed
Monthly Checks
- Disaster recovery procedures tested
- Security audit completed
- Performance benchmarks updated
- Cost optimization opportunities identified
- Capacity forecasting updated
- Runbook accuracy verified
Automation Scripts
Deployment Validation Script
#!/bin/bash
# deployment-validation.sh
set -e
NAMESPACE="eventcore"
APP_NAME="eventcore-app"
BASE_URL="https://api.eventcore.example.com"
echo "🚀 Starting deployment validation..."
# Check deployment status
echo "📋 Checking deployment status..."
kubectl rollout status deployment/$APP_NAME -n $NAMESPACE --timeout=300s
# Check pod health
echo "🏥 Checking pod health..."
READY_PODS=$(kubectl get pods -l app=$APP_NAME -n $NAMESPACE -o jsonpath='{.items[?(@.status.phase=="Running")].metadata.name}' | wc -w)
DESIRED_PODS=$(kubectl get deployment $APP_NAME -n $NAMESPACE -o jsonpath='{.spec.replicas}')
if [ "$READY_PODS" -ne "$DESIRED_PODS" ]; then
echo "❌ Not all pods are ready: $READY_PODS/$DESIRED_PODS"
exit 1
fi
echo "✅ All pods are ready: $READY_PODS/$DESIRED_PODS"
# Check health endpoint
echo "🔍 Testing health endpoint..."
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" $BASE_URL/health)
if [ "$HTTP_STATUS" -ne 200 ]; then
echo "❌ Health check failed with status: $HTTP_STATUS"
exit 1
fi
echo "✅ Health check passed"
# Check metrics endpoint
echo "📊 Testing metrics endpoint..."
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" $BASE_URL/metrics)
if [ "$HTTP_STATUS" -ne 200 ]; then
echo "❌ Metrics endpoint failed with status: $HTTP_STATUS"
exit 1
fi
echo "✅ Metrics endpoint responding"
# Check database connectivity
echo "🗄️ Testing database connectivity..."
# Note: with `set -e`, a failing command would abort before a `$?` check runs,
# so test the command directly in the conditional.
if ! kubectl exec -n $NAMESPACE deployment/$APP_NAME -- eventcore-cli health-check database; then
echo "❌ Database connectivity check failed"
exit 1
fi
echo "✅ Database connectivity verified"
# Run smoke tests
echo "💨 Running smoke tests..."
if ! kubectl exec -n $NAMESPACE deployment/$APP_NAME -- eventcore-cli test smoke; then
echo "❌ Smoke tests failed"
exit 1
fi
echo "✅ Smoke tests passed"
echo "🎉 Deployment validation completed successfully!"
Health Check Script
#!/bin/bash
# health-check.sh
set -e
NAMESPACE="eventcore"
PROMETHEUS_URL="http://prometheus.monitoring.svc.cluster.local:9090"
echo "🔍 Running comprehensive health check..."
# Check application health
echo "📱 Checking application health..."
APP_UP=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=up{job=\"eventcore\"}" | jq '.data.result[0].value[1]' -r)
if [ "$APP_UP" != "1" ]; then
echo "❌ Application is down"
exit 1
fi
# Check error rate
echo "🚨 Checking error rate..."
ERROR_RATE=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=rate(eventcore_command_errors_total[5m])/rate(eventcore_commands_total[5m])" | jq '.data.result[0].value[1]' -r)
if (( $(echo "$ERROR_RATE > 0.05" | bc -l) )); then
echo "❌ High error rate detected: $ERROR_RATE"
exit 1
fi
# Check response time
echo "⏱️ Checking response time..."
P95_LATENCY=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=histogram_quantile(0.95, rate(eventcore_command_duration_seconds_bucket[5m]))" | jq '.data.result[0].value[1]' -r)
if (( $(echo "$P95_LATENCY > 1.0" | bc -l) )); then
echo "❌ High latency detected: ${P95_LATENCY}s"
exit 1
fi
# Check database connectivity
echo "🗄️ Checking database health..."
DB_CONNECTIONS=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=eventcore_connection_pool_size" | jq '.data.result[0].value[1]' -r)
MAX_CONNECTIONS=$(curl -s "$PROMETHEUS_URL/api/v1/query?query=eventcore_connection_pool_max_size" | jq '.data.result[0].value[1]' -r)
UTILIZATION=$(echo "scale=2; $DB_CONNECTIONS / $MAX_CONNECTIONS" | bc)
if (( $(echo "$UTILIZATION > 0.8" | bc -l) )); then
echo "⚠️ High database connection utilization: $UTILIZATION"
fi
echo "✅ All health checks passed!"
Emergency Procedures
Incident Response
- Assess severity using incident severity matrix
- Activate incident response team if critical
- Create incident tracking (ticket/channel)
- Implement immediate mitigation if possible
- Communicate status to stakeholders
- Investigate root cause after mitigation
- Document lessons learned and improvements
Rollback Procedures
#!/bin/bash
# emergency-rollback.sh
NAMESPACE="eventcore"
APP_NAME="eventcore-app"
echo "🚨 Emergency rollback initiated..."
# Roll back to the previous revision
# (kubectl rollout undo defaults to the previous revision when --to-revision is omitted)
echo "Rolling back to the previous revision..."
kubectl rollout undo deployment/$APP_NAME -n $NAMESPACE
# Wait for rollback to complete
kubectl rollout status deployment/$APP_NAME -n $NAMESPACE --timeout=300s
# Verify health
sleep 30
./health-check.sh
echo "✅ Emergency rollback completed"
Summary
Production readiness checklist for EventCore:
- ✅ Security - Authentication, authorization, encryption
- ✅ Performance - Resource limits, optimization, benchmarks
- ✅ Reliability - High availability, backup and recovery
- ✅ Monitoring - Metrics, logging, alerting, dashboards
- ✅ Operations - Deployment validation, health checks, incident response
Key principles:
- Validate everything - Don’t assume anything works in production
- Automate checks - Use scripts and tools for consistent validation
- Monitor continuously - Track all critical metrics and logs
- Plan for failure - Have rollback and recovery procedures ready
- Document procedures - Maintain up-to-date runbooks and checklists
This completes the EventCore Operations guide. You now have comprehensive documentation for deploying, monitoring, and maintaining EventCore applications in production environments.
Next, proceed to Part 7: Reference →
Part 7: Reference
This part provides comprehensive reference documentation for EventCore. Use this section to look up specific APIs, configuration options, error codes, and terminology.
Chapters in This Part
- API Documentation - Complete API reference
- Configuration Reference - All configuration options
- Error Reference - Error codes and troubleshooting
- Glossary - Definitions and terminology
What You’ll Find
- Complete API documentation with examples
- Exhaustive configuration option reference
- Comprehensive error code catalog
- Definitions of all EventCore terminology
Usage
This reference documentation is designed for:
- Quick lookups during development
- Understanding specific configuration options
- Troubleshooting error conditions
- Learning EventCore terminology
Organization
Each reference chapter is organized alphabetically or logically for easy navigation. Use the search functionality in your documentation viewer to quickly find specific information.
Ready to explore the reference? Start with API Documentation →
API Documentation
The complete EventCore API documentation is generated from the source code using rustdoc.
The API documentation includes:
- Complete type and trait references - All public types, traits, and functions
- Usage examples - Code examples demonstrating common patterns
- Module documentation - Overview and guidance for each module
- Cross-references - Links between related types and concepts
Key Modules
Core Library
- eventcore - Core library with command execution, event stores, and projections
- eventcore::prelude - Common imports for EventCore applications
Event Store Adapters
- eventcore_postgres - PostgreSQL event store adapter
- eventcore_memory - In-memory event store for testing
Derive Macros
- eventcore_macros - Derive macros for commands
Quick Reference
For quick access to commonly used items:
- Command - Core command trait
- CommandExecutor - Executes commands
- EventStore - Event persistence trait
- Projection - Read model projections
- StreamId - Stream identifier
- EventId - Event identifier
Chapter 7.2: Configuration Reference
This chapter provides a complete reference for all EventCore configuration options. Use this as a lookup guide when setting up and tuning your EventCore applications.
Core Configuration
EventStore Configuration
Configuration for event store implementations.
PostgresConfig
Configuration for PostgreSQL event store.
```rust
#[derive(Debug, Clone)]
pub struct PostgresConfig {
    pub database_url: String,
    pub pool_config: PoolConfig,
    pub migration_config: MigrationConfig,
    pub performance_config: PerformanceConfig,
    pub security_config: SecurityConfig,
}

impl PostgresConfig {
    pub fn new(database_url: String) -> Self
    pub fn from_env() -> Result<Self, ConfigError>
    pub fn with_pool_config(mut self, config: PoolConfig) -> Self
    pub fn with_migration_config(mut self, config: MigrationConfig) -> Self
}
```
Example:
```rust
let config = PostgresConfig::new("postgresql://localhost/eventcore".to_string())
    .with_pool_config(PoolConfig {
        max_connections: 20,
        min_connections: 5,
        connect_timeout: Duration::from_secs(10),
        idle_timeout: Some(Duration::from_secs(300)),
        max_lifetime: Some(Duration::from_secs(1800)),
    })
    .with_migration_config(MigrationConfig {
        auto_migrate: true,
        migration_timeout: Duration::from_secs(60),
    });
```
PoolConfig
Database connection pool configuration.
```rust
#[derive(Debug, Clone)]
pub struct PoolConfig {
    /// Maximum number of connections in the pool
    pub max_connections: u32,
    /// Minimum number of connections to maintain
    pub min_connections: u32,
    /// Timeout for establishing new connections
    pub connect_timeout: Duration,
    /// Maximum time a connection can be idle before being closed
    pub idle_timeout: Option<Duration>,
    /// Maximum lifetime of a connection
    pub max_lifetime: Option<Duration>,
    /// Test connections before use
    pub test_before_acquire: bool,
}

impl Default for PoolConfig {
    fn default() -> Self {
        Self {
            max_connections: 10,
            min_connections: 2,
            connect_timeout: Duration::from_secs(5),
            idle_timeout: Some(Duration::from_secs(600)),
            max_lifetime: Some(Duration::from_secs(3600)),
            test_before_acquire: true,
        }
    }
}
```
Tuning Guidelines:
- max_connections: 2-4x CPU cores for CPU-bound workloads, higher for I/O-bound
- min_connections: 10-20% of max_connections
- connect_timeout: 5-10 seconds for local databases, 15-30 seconds for remote
- idle_timeout: 5-10 minutes to balance connection reuse and resource usage
- max_lifetime: 30-60 minutes to prevent connection staleness
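These guidelines can be encoded as a starting-point calculator. This is a heuristic sketch, not an EventCore API; both function names are invented for illustration, and real values should be tuned from observed metrics:

```rust
/// Suggest an initial max pool size from core count and workload shape,
/// following the 2-4x-cores guideline above.
fn suggested_max_connections(cpu_cores: u32, io_bound: bool) -> u32 {
    if io_bound { cpu_cores * 4 } else { cpu_cores * 2 }
}

/// min_connections at roughly 20% of max_connections, but at least 1.
fn suggested_min_connections(max_connections: u32) -> u32 {
    (max_connections / 5).max(1)
}

fn main() {
    let max = suggested_max_connections(8, true);
    let min = suggested_min_connections(max);
    assert_eq!(max, 32);
    assert_eq!(min, 6);
    println!("starting point: max_connections={max}, min_connections={min}");
}
```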
MigrationConfig
Database migration configuration.
```rust
#[derive(Debug, Clone)]
pub struct MigrationConfig {
    /// Automatically run migrations on startup
    pub auto_migrate: bool,
    /// Timeout for migration operations
    pub migration_timeout: Duration,
    /// Lock timeout for migration coordination
    pub lock_timeout: Duration,
    /// Migration table name
    pub migration_table: String,
}

impl Default for MigrationConfig {
    fn default() -> Self {
        Self {
            auto_migrate: false,
            migration_timeout: Duration::from_secs(300),
            lock_timeout: Duration::from_secs(60),
            migration_table: "_sqlx_migrations".to_string(),
        }
    }
}
```
Command Execution Configuration
CommandExecutorConfig
Configuration for command execution behavior.
```rust
#[derive(Debug, Clone)]
pub struct CommandExecutorConfig {
    pub retry_config: RetryConfig,
    pub timeout_config: TimeoutConfig,
    pub concurrency_config: ConcurrencyConfig,
    pub metrics_config: MetricsConfig,
}

impl Default for CommandExecutorConfig {
    fn default() -> Self {
        Self {
            retry_config: RetryConfig::default(),
            timeout_config: TimeoutConfig::default(),
            concurrency_config: ConcurrencyConfig::default(),
            metrics_config: MetricsConfig::default(),
        }
    }
}
```
RetryConfig
Configuration for command retry behavior.
```rust
#[derive(Debug, Clone)]
pub struct RetryConfig {
    /// Maximum number of retry attempts
    pub max_attempts: u32,
    /// Initial delay before first retry
    pub initial_delay: Duration,
    /// Maximum delay between retries
    pub max_delay: Duration,
    /// Multiplier for exponential backoff
    pub backoff_multiplier: f64,
    /// Which types of errors to retry
    pub retry_policy: RetryPolicy,
    /// Add jitter to prevent thundering herd
    pub jitter: bool,
}

impl RetryConfig {
    pub fn none() -> Self {
        Self { max_attempts: 0, ..Default::default() }
    }

    pub fn aggressive() -> Self {
        Self {
            max_attempts: 10,
            initial_delay: Duration::from_millis(10),
            max_delay: Duration::from_secs(5),
            backoff_multiplier: 1.5,
            retry_policy: RetryPolicy::All,
            jitter: true,
        }
    }

    pub fn conservative() -> Self {
        Self {
            max_attempts: 3,
            initial_delay: Duration::from_millis(100),
            max_delay: Duration::from_secs(2),
            backoff_multiplier: 2.0,
            retry_policy: RetryPolicy::ConcurrencyConflictsOnly,
            jitter: true,
        }
    }
}

impl Default for RetryConfig {
    fn default() -> Self {
        Self {
            max_attempts: 5,
            initial_delay: Duration::from_millis(50),
            max_delay: Duration::from_secs(1),
            backoff_multiplier: 2.0,
            retry_policy: RetryPolicy::TransientErrorsOnly,
            jitter: true,
        }
    }
}

#[derive(Debug, Clone)]
pub enum RetryPolicy {
    /// Never retry
    None,
    /// Only retry concurrency conflicts
    ConcurrencyConflictsOnly,
    /// Only retry transient errors (connection issues, timeouts)
    TransientErrorsOnly,
    /// Retry all retryable errors
    All,
}
```
Retry Policy Guidelines:
- ConcurrencyConflictsOnly: Use for high-conflict scenarios where immediate retry is beneficial
- TransientErrorsOnly: Use for stable systems where business logic errors shouldn’t be retried
- All: Use for development or systems where any failure might be recoverable
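The interaction of initial_delay, backoff_multiplier, and max_delay can be sketched as a pure delay calculator (jitter is omitted here for determinism; a real implementation would randomize each delay to avoid thundering herds):

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based):
/// initial * multiplier^attempt, capped at max.
fn backoff_delay(attempt: u32, initial: Duration, multiplier: f64, max: Duration) -> Duration {
    let millis = initial.as_millis() as f64 * multiplier.powi(attempt as i32);
    Duration::from_millis(millis as u64).min(max)
}

fn main() {
    // Defaults from RetryConfig above: 50ms initial, 2.0 multiplier, 1s cap.
    let delays: Vec<u64> = (0..5)
        .map(|a| {
            backoff_delay(a, Duration::from_millis(50), 2.0, Duration::from_secs(1))
                .as_millis() as u64
        })
        .collect();
    assert_eq!(delays, vec![50, 100, 200, 400, 800]);

    // The sixth retry would compute 1600ms and hit the 1s cap.
    assert_eq!(
        backoff_delay(5, Duration::from_millis(50), 2.0, Duration::from_secs(1)),
        Duration::from_secs(1)
    );
}
```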
TimeoutConfig
Configuration for command timeouts.
```rust
#[derive(Debug, Clone)]
pub struct TimeoutConfig {
    /// Default timeout for command execution
    pub default_timeout: Duration,
    /// Timeout for reading streams
    pub read_timeout: Duration,
    /// Timeout for writing events
    pub write_timeout: Duration,
    /// Timeout for stream discovery
    pub discovery_timeout: Duration,
}

impl Default for TimeoutConfig {
    fn default() -> Self {
        Self {
            default_timeout: Duration::from_secs(30),
            read_timeout: Duration::from_secs(10),
            write_timeout: Duration::from_secs(15),
            discovery_timeout: Duration::from_secs(5),
        }
    }
}
```
ConcurrencyConfig
Configuration for concurrent command execution.
```rust
#[derive(Debug, Clone)]
pub struct ConcurrencyConfig {
    /// Maximum number of concurrent commands
    pub max_concurrent_commands: usize,
    /// Maximum iterations for stream discovery
    pub max_discovery_iterations: usize,
    /// Enable command batching
    pub enable_batching: bool,
    /// Maximum batch size for event writes
    pub max_batch_size: usize,
    /// Batch timeout
    pub batch_timeout: Duration,
}

impl Default for ConcurrencyConfig {
    fn default() -> Self {
        Self {
            max_concurrent_commands: 100,
            max_discovery_iterations: 10,
            enable_batching: true,
            max_batch_size: 1000,
            batch_timeout: Duration::from_millis(100),
        }
    }
}
```
Concurrency Tuning:
- max_concurrent_commands: Balance between throughput and resource usage
- max_discovery_iterations: Higher values allow more complex stream patterns but increase latency
- max_batch_size: Larger batches improve throughput but increase memory usage and latency
Projection Configuration
ProjectionConfig
Configuration for projection management.
```rust
#[derive(Debug, Clone)]
pub struct ProjectionConfig {
    pub checkpoint_config: CheckpointConfig,
    pub processing_config: ProcessingConfig,
    pub recovery_config: RecoveryConfig,
}
```
CheckpointConfig
Configuration for projection checkpointing.
```rust
#[derive(Debug, Clone)]
pub struct CheckpointConfig {
    /// How often to save checkpoints
    pub checkpoint_interval: Duration,
    /// Number of events to process before checkpointing
    pub events_per_checkpoint: usize,
    /// Store for checkpoint persistence
    pub checkpoint_store: CheckpointStoreConfig,
    /// Enable checkpoint compression
    pub compress_checkpoints: bool,
}

impl Default for CheckpointConfig {
    fn default() -> Self {
        Self {
            checkpoint_interval: Duration::from_secs(30),
            events_per_checkpoint: 1000,
            checkpoint_store: CheckpointStoreConfig::Database,
            compress_checkpoints: true,
        }
    }
}

#[derive(Debug, Clone)]
pub enum CheckpointStoreConfig {
    /// Store checkpoints in the main database
    Database,
    /// Store checkpoints in Redis
    Redis { connection_string: String },
    /// Store checkpoints in memory (testing only)
    InMemory,
    /// Custom checkpoint store
    Custom { store_type: String, config: HashMap<String, String> },
}
```
ProcessingConfig
Configuration for event processing.
```rust
#[derive(Debug, Clone)]
pub struct ProcessingConfig {
    /// Number of events to process in each batch
    pub batch_size: usize,
    /// Timeout for processing a single event
    pub event_timeout: Duration,
    /// Timeout for processing a batch
    pub batch_timeout: Duration,
    /// Number of parallel processors
    pub parallelism: usize,
    /// Buffer size for event queues
    pub buffer_size: usize,
    /// Error handling strategy
    pub error_handling: ErrorHandlingStrategy,
}

impl Default for ProcessingConfig {
    fn default() -> Self {
        Self {
            batch_size: 100,
            event_timeout: Duration::from_secs(5),
            batch_timeout: Duration::from_secs(30),
            parallelism: 1,
            buffer_size: 10000,
            error_handling: ErrorHandlingStrategy::SkipAndLog,
        }
    }
}

#[derive(Debug, Clone)]
pub enum ErrorHandlingStrategy {
    /// Skip failed events and log errors
    SkipAndLog,
    /// Stop processing on first error
    FailFast,
    /// Retry failed events with backoff
    Retry { max_attempts: u32, backoff: Duration },
    /// Send failed events to dead letter queue
    DeadLetter { queue_config: DeadLetterConfig },
}
```
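How a processor might act on the two simplest strategies can be sketched as follows (simplified: a plain two-variant enum and a string-based stand-in handler take the place of the real event and error types):

```rust
#[derive(Debug, PartialEq)]
enum Strategy {
    SkipAndLog,
    FailFast,
}

/// Process a batch, applying the chosen strategy to per-event failures.
/// Returns Ok(processed_count), or the first error under FailFast.
fn process_batch(events: &[&str], strategy: &Strategy) -> Result<usize, String> {
    let mut processed = 0;
    for event in events {
        // Stand-in handler: events named "bad" fail.
        let result: Result<(), String> = if *event == "bad" {
            Err(format!("failed to apply {event}"))
        } else {
            Ok(())
        };
        match (result, strategy) {
            (Ok(()), _) => processed += 1,
            (Err(e), Strategy::FailFast) => return Err(e),
            (Err(e), Strategy::SkipAndLog) => eprintln!("skipping event: {e}"),
        }
    }
    Ok(processed)
}

fn main() {
    // SkipAndLog keeps going past the failure; FailFast stops at it.
    assert_eq!(process_batch(&["a", "bad", "b"], &Strategy::SkipAndLog), Ok(2));
    assert!(process_batch(&["a", "bad", "b"], &Strategy::FailFast).is_err());
}
```

The Retry and DeadLetter variants extend the same match: a failed event is re-attempted with backoff, or serialized to a queue for offline inspection, instead of being skipped or aborting the batch.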
Monitoring Configuration
MetricsConfig
Configuration for metrics collection.
```rust
#[derive(Debug, Clone)]
pub struct MetricsConfig {
    /// Enable metrics collection
    pub enabled: bool,
    /// Metrics export format
    pub export_format: MetricsFormat,
    /// Export interval
    pub export_interval: Duration,
    /// Histogram buckets for latency metrics
    pub latency_buckets: Vec<f64>,
    /// Labels to add to all metrics
    pub default_labels: HashMap<String, String>,
    /// Metrics to collect
    pub collectors: Vec<MetricsCollector>,
}

impl Default for MetricsConfig {
    fn default() -> Self {
        Self {
            enabled: true,
            export_format: MetricsFormat::Prometheus,
            export_interval: Duration::from_secs(15),
            latency_buckets: vec![
                0.001, 0.005, 0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0,
            ],
            default_labels: HashMap::new(),
            collectors: vec![
                MetricsCollector::Commands,
                MetricsCollector::Events,
                MetricsCollector::Projections,
                MetricsCollector::System,
            ],
        }
    }
}

#[derive(Debug, Clone)]
pub enum MetricsFormat {
    Prometheus,
    OpenTelemetry,
    StatsD,
    Custom { format: String },
}

#[derive(Debug, Clone)]
pub enum MetricsCollector {
    Commands,
    Events,
    Projections,
    System,
    Custom { name: String },
}
```
TracingConfig
Configuration for distributed tracing.
```rust
#[derive(Debug, Clone)]
pub struct TracingConfig {
    /// Enable tracing
    pub enabled: bool,
    /// Tracing exporter configuration
    pub exporter: TracingExporter,
    /// Sampling configuration
    pub sampling: SamplingConfig,
    /// Resource attributes
    pub resource_attributes: HashMap<String, String>,
    /// Span attributes to add to all spans
    pub default_span_attributes: HashMap<String, String>,
}

impl Default for TracingConfig {
    fn default() -> Self {
        Self {
            enabled: true,
            exporter: TracingExporter::Jaeger {
                endpoint: "http://localhost:14268/api/traces".to_string(),
            },
            sampling: SamplingConfig::default(),
            resource_attributes: HashMap::new(),
            default_span_attributes: HashMap::new(),
        }
    }
}

#[derive(Debug, Clone)]
pub enum TracingExporter {
    Jaeger { endpoint: String },
    Zipkin { endpoint: String },
    OpenTelemetry { endpoint: String },
    Console,
    None,
}

#[derive(Debug, Clone)]
pub struct SamplingConfig {
    /// Sampling rate (0.0 to 1.0)
    pub sample_rate: f64,
    /// Always sample errors
    pub always_sample_errors: bool,
    /// Sampling strategy
    pub strategy: SamplingStrategy,
}

impl Default for SamplingConfig {
    fn default() -> Self {
        Self {
            sample_rate: 0.1,
            always_sample_errors: true,
            strategy: SamplingStrategy::Probabilistic,
        }
    }
}

#[derive(Debug, Clone)]
pub enum SamplingStrategy {
    /// Always sample
    Always,
    /// Never sample
    Never,
    /// Probabilistic sampling
    Probabilistic,
    /// Rate limiting sampling
    RateLimit { max_per_second: u32 },
}
```
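A probabilistic sampling decision honoring always_sample_errors can be sketched as below. Hashing the trace id into a bucket makes the decision deterministic per trace, so all spans of one trace are sampled consistently; this is an illustrative sketch, not EventCore's sampler:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Decide whether to sample a trace. Errors are always kept when configured;
/// otherwise the trace id is hashed into [0, 1) and compared to the rate.
fn should_sample(trace_id: &str, sample_rate: f64, is_error: bool, always_sample_errors: bool) -> bool {
    if is_error && always_sample_errors {
        return true;
    }
    let mut hasher = DefaultHasher::new();
    trace_id.hash(&mut hasher);
    let bucket = (hasher.finish() % 10_000) as f64 / 10_000.0;
    bucket < sample_rate
}

fn main() {
    // Errors bypass the rate entirely.
    assert!(should_sample("t-1", 0.0, true, true));
    // A rate of 1.0 keeps every non-error trace; 0.0 keeps none.
    assert!(should_sample("t-1", 1.0, false, true));
    assert!(!should_sample("t-1", 0.0, false, true));
}
```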
LoggingConfig
Configuration for structured logging.
```rust
#[derive(Debug, Clone)]
pub struct LoggingConfig {
    /// Log level
    pub level: LogLevel,
    /// Log format
    pub format: LogFormat,
    /// Output destination
    pub output: LogOutput,
    /// Include timestamps
    pub include_timestamps: bool,
    /// Include source code locations
    pub include_locations: bool,
    /// Correlation ID header name
    pub correlation_id_header: String,
    /// Fields to include in all log entries
    pub default_fields: HashMap<String, String>,
}

impl Default for LoggingConfig {
    fn default() -> Self {
        Self {
            level: LogLevel::Info,
            format: LogFormat::Json,
            output: LogOutput::Stdout,
            include_timestamps: true,
            include_locations: false,
            correlation_id_header: "x-correlation-id".to_string(),
            default_fields: HashMap::new(),
        }
    }
}

#[derive(Debug, Clone)]
pub enum LogLevel {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

#[derive(Debug, Clone)]
pub enum LogFormat {
    Json,
    Logfmt,
    Pretty,
    Compact,
}

#[derive(Debug, Clone)]
pub enum LogOutput {
    Stdout,
    Stderr,
    File { path: String, rotation: RotationConfig },
    Syslog { facility: String },
    Network { endpoint: String },
}

#[derive(Debug, Clone)]
pub struct RotationConfig {
    pub max_size_mb: u64,
    pub max_files: u32,
    pub compress: bool,
}
```
Security Configuration
SecurityConfig
Configuration for security features.
```rust
#[derive(Debug, Clone)]
pub struct SecurityConfig {
    pub tls_config: Option<TlsConfig>,
    pub auth_config: AuthConfig,
    pub encryption_config: EncryptionConfig,
}
```
TlsConfig
Configuration for TLS encryption.
```rust
#[derive(Debug, Clone)]
pub struct TlsConfig {
    /// Path to certificate file
    pub cert_file: String,
    /// Path to private key file
    pub key_file: String,
    /// Path to CA certificate file (for client verification)
    pub ca_file: Option<String>,
    /// Require client certificates
    pub require_client_cert: bool,
    /// Minimum TLS version
    pub min_version: TlsVersion,
    /// Allowed cipher suites
    pub cipher_suites: Vec<String>,
}

#[derive(Debug, Clone)]
pub enum TlsVersion {
    V1_2,
    V1_3,
}
```
AuthConfig
Configuration for authentication.
```rust
#[derive(Debug, Clone)]
pub struct AuthConfig {
    /// Authentication provider
    pub provider: AuthProvider,
    /// Token validation settings
    pub token_validation: TokenValidationConfig,
    /// Session configuration
    pub session_config: SessionConfig,
}

#[derive(Debug, Clone)]
pub enum AuthProvider {
    /// JWT-based authentication
    Jwt {
        secret_key: String,
        algorithm: JwtAlgorithm,
        issuer: Option<String>,
        audience: Option<String>,
    },
    /// OAuth2 authentication
    OAuth2 {
        client_id: String,
        client_secret: String,
        auth_url: String,
        token_url: String,
        scopes: Vec<String>,
    },
    /// API key authentication
    ApiKey {
        header_name: String,
        query_param: Option<String>,
    },
    /// Custom authentication
    Custom { provider_type: String, config: HashMap<String, String> },
}

#[derive(Debug, Clone)]
pub enum JwtAlgorithm {
    HS256, HS384, HS512,
    RS256, RS384, RS512,
    ES256, ES384, ES512,
}
```
EncryptionConfig
Configuration for data encryption.
```rust
#[derive(Debug, Clone)]
pub struct EncryptionConfig {
    /// Enable encryption at rest
    pub encrypt_at_rest: bool,
    /// Encryption algorithm
    pub algorithm: EncryptionAlgorithm,
    /// Key management configuration
    pub key_management: KeyManagementConfig,
    /// Fields to encrypt
    pub encrypted_fields: Vec<String>,
}

#[derive(Debug, Clone)]
pub enum EncryptionAlgorithm {
    AES256GCM,
    ChaCha20Poly1305,
    XChaCha20Poly1305,
}

#[derive(Debug, Clone)]
pub enum KeyManagementConfig {
    /// Environment variable
    Environment { key_var: String },
    /// AWS KMS
    AwsKms { key_id: String, region: String },
    /// HashiCorp Vault
    Vault { endpoint: String, token: String, key_path: String },
    /// File-based key storage
    File { key_file: String },
}
```
Environment Variables
EventCore supports configuration via environment variables with the EVENTCORE_ prefix:
Core Settings
# Database configuration
EVENTCORE_DATABASE_URL=postgresql://localhost/eventcore
EVENTCORE_DATABASE_MAX_CONNECTIONS=20
EVENTCORE_DATABASE_MIN_CONNECTIONS=5
EVENTCORE_DATABASE_CONNECT_TIMEOUT=10
EVENTCORE_DATABASE_IDLE_TIMEOUT=300
EVENTCORE_DATABASE_MAX_LIFETIME=1800
# Command execution
EVENTCORE_COMMAND_DEFAULT_TIMEOUT=30
EVENTCORE_COMMAND_MAX_RETRIES=5
EVENTCORE_COMMAND_RETRY_DELAY_MS=50
EVENTCORE_COMMAND_MAX_CONCURRENT=100
# Projections
EVENTCORE_PROJECTION_BATCH_SIZE=100
EVENTCORE_PROJECTION_CHECKPOINT_INTERVAL=30
EVENTCORE_PROJECTION_EVENTS_PER_CHECKPOINT=1000
# Metrics and monitoring
EVENTCORE_METRICS_ENABLED=true
EVENTCORE_METRICS_EXPORT_INTERVAL=15
EVENTCORE_TRACING_ENABLED=true
EVENTCORE_TRACING_SAMPLE_RATE=0.1
# Security
EVENTCORE_JWT_SECRET=your-secret-key
EVENTCORE_TLS_CERT_FILE=/path/to/cert.pem
EVENTCORE_TLS_KEY_FILE=/path/to/key.pem
EVENTCORE_ENCRYPT_AT_REST=true
Logging Configuration
EVENTCORE_LOG_LEVEL=info
EVENTCORE_LOG_FORMAT=json
EVENTCORE_LOG_OUTPUT=stdout
EVENTCORE_LOG_INCLUDE_TIMESTAMPS=true
EVENTCORE_LOG_INCLUDE_LOCATIONS=false
Development Settings
# Development mode settings
EVENTCORE_DEV_MODE=true
EVENTCORE_DEV_AUTO_MIGRATE=true
EVENTCORE_DEV_RESET_DB=false
EVENTCORE_DEV_SEED_DATA=true
# Testing settings
EVENTCORE_TEST_DATABASE_URL=postgresql://localhost/eventcore_test
EVENTCORE_TEST_PARALLEL=true
EVENTCORE_TEST_RESET_BETWEEN_TESTS=true
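Reading these prefixed variables with typed fallbacks can be sketched with the standard library alone. The `env_or` helper below is illustrative; EventCore's own loader may differ:

```rust
use std::env;
use std::str::FromStr;

/// Read EVENTCORE_<name>, falling back to a default when unset or unparsable.
fn env_or<T: FromStr>(name: &str, default: T) -> T {
    env::var(format!("EVENTCORE_{name}"))
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

fn main() {
    // Unset variables fall back to the documented defaults.
    let max_connections: u32 = env_or("DATABASE_MAX_CONNECTIONS_EXAMPLE", 10);
    let sample_rate: f64 = env_or("TRACING_SAMPLE_RATE_EXAMPLE", 0.1);
    println!("max_connections={max_connections}, sample_rate={sample_rate}");
}
```

Falling back silently on a parse failure is a design choice; a production loader would more likely surface a ConfigError so typos in values are caught at startup.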
Configuration Files
TOML Configuration Example
# eventcore.toml
[database]
url = "postgresql://localhost/eventcore"
max_connections = 20
min_connections = 5
connect_timeout = "10s"
idle_timeout = "5m"
max_lifetime = "30m"
[commands]
default_timeout = "30s"
max_retries = 5
retry_delay = "50ms"
max_concurrent = 100
max_discovery_iterations = 10
[projections]
batch_size = 100
checkpoint_interval = "30s"
events_per_checkpoint = 1000
parallelism = 1
[metrics]
enabled = true
export_format = "prometheus"
export_interval = "15s"
latency_buckets = [0.001, 0.005, 0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]
[tracing]
enabled = true
exporter = "jaeger"
jaeger_endpoint = "http://localhost:14268/api/traces"
sample_rate = 0.1
always_sample_errors = true
[logging]
level = "info"
format = "json"
output = "stdout"
include_timestamps = true
include_locations = false
[security]
encrypt_at_rest = true
jwt_secret = "${JWT_SECRET}"
[security.tls]
cert_file = "/etc/ssl/certs/eventcore.pem"
key_file = "/etc/ssl/private/eventcore.key"
require_client_cert = false
min_version = "1.3"
YAML Configuration Example
```yaml
# eventcore.yaml
database:
  url: postgresql://localhost/eventcore
  pool:
    max_connections: 20
    min_connections: 5
    connect_timeout: 10s
    idle_timeout: 5m
    max_lifetime: 30m
  migration:
    auto_migrate: false
    migration_timeout: 5m

commands:
  timeout:
    default_timeout: 30s
    read_timeout: 10s
    write_timeout: 15s
  retry:
    max_attempts: 5
    initial_delay: 50ms
    max_delay: 1s
    backoff_multiplier: 2.0
    policy: transient_errors_only
    jitter: true
  concurrency:
    max_concurrent_commands: 100
    max_discovery_iterations: 10
    enable_batching: true
    max_batch_size: 1000

projections:
  checkpoint:
    interval: 30s
    events_per_checkpoint: 1000
    store: database
    compress: true
  processing:
    batch_size: 100
    event_timeout: 5s
    batch_timeout: 30s
    parallelism: 1
    error_handling: skip_and_log

monitoring:
  metrics:
    enabled: true
    export_format: prometheus
    export_interval: 15s
    collectors:
      - commands
      - events
      - projections
      - system
  tracing:
    enabled: true
    exporter:
      type: jaeger
      endpoint: http://localhost:14268/api/traces
    sampling:
      sample_rate: 0.1
      always_sample_errors: true

logging:
  level: info
  format: json
  output: stdout
  correlation_id_header: x-correlation-id

security:
  auth:
    provider:
      type: jwt
      secret_key: ${JWT_SECRET}
      algorithm: HS256
  encryption:
    encrypt_at_rest: true
    algorithm: AES256GCM
    key_management:
      type: environment
      key_var: ENCRYPTION_KEY
```
Configuration Loading
EventCore supports multiple configuration sources with the following precedence order:
- Command line arguments (highest priority)
- Environment variables
- Configuration files (TOML, YAML, JSON)
- Default values (lowest priority)
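The precedence rules above amount to a layered lookup where earlier sources win. A minimal sketch of that logic (the `layered_get` helper and layer names are illustrative, not part of the EventCore API):

```rust
// Hypothetical sketch of layered configuration lookup:
// CLI args > environment variables > config file > default.
fn layered_get(
    cli: Option<&str>,
    env: Option<&str>,
    file: Option<&str>,
    default: &str,
) -> String {
    cli.or(env).or(file).unwrap_or(default).to_string()
}

fn main() {
    // The file sets 20 connections, the environment overrides with 50,
    // and no CLI flag was given, so the environment value wins.
    let max_connections = layered_get(None, Some("50"), Some("20"), "10");
    println!("{max_connections}");
}
```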
Loading Configuration in Code
#![allow(unused)]
fn main() {
use eventcore::config::{EventCoreConfig, ConfigBuilder};

// Load from environment and files
let config = EventCoreConfig::from_env()
    .expect("Failed to load configuration");

// Custom configuration loading
let config = ConfigBuilder::new()
    .load_from_file("config/eventcore.toml")?
    .load_from_env()?
    .override_with_args(std::env::args())?
    .build()?;

// Validate configuration
config.validate()?;
}
This completes the configuration reference. All EventCore configuration options are documented with examples, default values, and tuning guidelines.
Next, explore Error Reference →
Chapter 7.3: Error Reference
This chapter provides a comprehensive reference for all EventCore error types, error codes, and troubleshooting guidance. Use this reference to understand and resolve errors in your EventCore applications.
Error Categories
EventCore errors are organized into several categories based on their origin and nature:
- Command Errors - Errors during command execution
- Event Store Errors - Errors from event store operations
- Projection Errors - Errors in projection processing
- Validation Errors - Input validation failures
- Configuration Errors - Configuration and setup issues
- Network Errors - Network and connectivity issues
- Serialization Errors - Data serialization/deserialization issues
Command Errors
CommandError
The primary error type for command execution failures.
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum CommandError {
    #[error("Validation failed: {message}")]
    ValidationFailed { message: String },

    #[error("Business rule violation: {rule} - {message}")]
    BusinessRuleViolation { rule: String, message: String },

    #[error("Concurrency conflict on streams: {streams:?}")]
    ConcurrencyConflict { streams: Vec<StreamId> },

    #[error("Stream not found: {stream_id}")]
    StreamNotFound { stream_id: StreamId },

    #[error("Unauthorized: {permission}")]
    Unauthorized { permission: String },

    #[error("Timeout after {duration:?}")]
    Timeout { duration: Duration },

    #[error("Stream access denied: cannot write to {stream_id}")]
    StreamAccessDenied { stream_id: StreamId },

    #[error("Maximum discovery iterations exceeded: {iterations}")]
    MaxIterationsExceeded { iterations: usize },

    #[error("Event store error: {source}")]
    EventStoreError {
        #[from]
        source: EventStoreError,
    },

    #[error("Serialization error: {message}")]
    SerializationError { message: String },

    #[error("Internal error: {message}")]
    InternalError { message: String },
}
}
Error Codes and Solutions
CE001: ValidationFailed
Error: Validation failed: StreamId cannot be empty
Code: CE001
Cause: Input validation failed during command construction or execution. Solution:
- Check input parameters for correct format and constraints
- Ensure all required fields are provided
- Verify string lengths and format requirements
CE002: BusinessRuleViolation
Error: Business rule violation: insufficient_balance - Account balance $100.00 is less than transfer amount $150.00
Code: CE002
Cause: Business logic constraints were violated. Solution:
- Review business rules and ensure command logic respects them
- Check application state before executing commands
- Implement proper validation in command handlers
CE003: ConcurrencyConflict
Error: Concurrency conflict on streams: ["account-123", "account-456"]
Code: CE003
Cause: Multiple commands attempted to modify the same streams simultaneously. Solution:
- Implement retry logic with exponential backoff
- Consider command design to reduce conflicts
- Use optimistic concurrency control patterns
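"Retry with exponential backoff" means doubling the wait between attempts up to a cap. A small sketch of such a schedule (the `backoff_delays` helper is illustrative; EventCore's built-in retry configuration covers this in production):

```rust
use std::time::Duration;

// Illustrative sketch: exponential backoff schedule for retrying
// commands that failed with ConcurrencyConflict. The delay doubles
// on each attempt and is capped at `max_delay`.
fn backoff_delays(initial: Duration, max_delay: Duration, attempts: u32) -> Vec<Duration> {
    (0..attempts)
        .map(|n| {
            let d = initial.checked_mul(2u32.pow(n)).unwrap_or(max_delay);
            d.min(max_delay)
        })
        .collect()
}

fn main() {
    // Starting at 50ms with a 1s cap: 50ms, 100ms, 200ms, 400ms, 800ms, 1s.
    for d in backoff_delays(Duration::from_millis(50), Duration::from_secs(1), 6) {
        println!("{d:?}");
    }
}
```

Adding random jitter to each delay (as the `jitter: true` retry setting suggests) further reduces the chance that competing commands retry in lock-step.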
CE004: StreamNotFound
Error: Stream not found: account-nonexistent
Code: CE004
Cause: Command attempted to read from a stream that doesn’t exist. Solution:
- Verify stream IDs are correct
- Check if the resource exists before referencing it
- Implement proper error handling for missing resources
CE005: Unauthorized
Error: Unauthorized: write_account_events
Code: CE005
Cause: Insufficient permissions to execute the command. Solution:
- Verify user authentication and authorization
- Check role-based access control configuration
- Ensure proper security context is set
CE006: Timeout
Error: Timeout after 30s
Code: CE006
Cause: Command execution exceeded configured timeout. Solution:
- Check system performance and database connectivity
- Increase timeout configuration if appropriate
- Optimize command logic and database queries
CE007: StreamAccessDenied
Error: Stream access denied: cannot write to protected-stream-123
Code: CE007
Cause: Command attempted to write to a stream it didn’t declare. Solution:
- Add the stream to the command's read_streams() method
- Verify command design follows EventCore stream access patterns
- Check for typos in stream ID generation
CE008: MaxIterationsExceeded
Error: Maximum discovery iterations exceeded: 10
Code: CE008
Cause: Stream discovery loop exceeded configured maximum iterations. Solution:
- Review command logic for potential infinite discovery loops
- Increase max_discovery_iterations if the extra iterations are legitimate
- Optimize stream discovery patterns
Command Execution Flow Errors
These errors occur during specific phases of command execution:
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ExecutionPhaseError {
    #[error("Stream reading failed: {message}")]
    StreamReadError { message: String },

    #[error("State reconstruction failed: {message}")]
    StateReconstructionError { message: String },

    #[error("Command handling failed: {message}")]
    CommandHandlingError { message: String },

    #[error("Event writing failed: {message}")]
    EventWritingError { message: String },

    #[error("Stream discovery failed: {message}")]
    StreamDiscoveryError { message: String },
}
}
Event Store Errors
EventStoreError
Errors from event store operations.
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum EventStoreError {
    #[error("Version conflict: expected {expected:?}, got {actual}")]
    VersionConflict {
        expected: ExpectedVersion,
        actual: EventVersion,
    },

    #[error("Stream not found: {stream_id}")]
    StreamNotFound { stream_id: StreamId },

    #[error("Connection failed: {message}")]
    ConnectionFailed { message: String },

    #[error("Database error: {source}")]
    DatabaseError {
        #[from]
        source: sqlx::Error,
    },

    #[error("Serialization error: {message}")]
    SerializationError { message: String },

    #[error("Transaction failed: {message}")]
    TransactionError { message: String },

    #[error("Migration error: {message}")]
    MigrationError { message: String },

    #[error("Configuration error: {message}")]
    ConfigurationError { message: String },

    #[error("Timeout error: operation timed out after {duration:?}")]
    TimeoutError { duration: Duration },

    #[error("Connection pool exhausted")]
    ConnectionPoolExhausted,

    #[error("Invalid event data: {message}")]
    InvalidEventData { message: String },
}
}
Error Codes and Solutions
ES001: VersionConflict
Error: Version conflict: expected Exact(5), got 7
Code: ES001
Cause: Optimistic concurrency control detected concurrent modification. Solution:
- Implement retry logic in command execution
- Consider command design to reduce conflicts
- Use an appropriate ExpectedVersion strategy
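The optimistic check behind this error can be modeled simply: the write succeeds only if the stream's actual version matches the version the writer expected. The variant names below mirror the error message above ("expected Exact(5), got 7"); EventCore's actual ExpectedVersion type may differ.

```rust
// Illustrative model of optimistic concurrency control.
#[derive(Debug, PartialEq)]
enum ExpectedVersion {
    Any,        // skip the version check entirely
    Exact(u64), // the stream must be at exactly this version
}

fn check_version(expected: &ExpectedVersion, actual: u64) -> Result<(), String> {
    match expected {
        ExpectedVersion::Any => Ok(()),
        ExpectedVersion::Exact(v) if *v == actual => Ok(()),
        ExpectedVersion::Exact(v) => {
            Err(format!("Version conflict: expected Exact({v}), got {actual}"))
        }
    }
}

fn main() {
    assert!(check_version(&ExpectedVersion::Exact(5), 5).is_ok());
    // A concurrent writer advanced the stream to version 7 in the meantime:
    println!("{}", check_version(&ExpectedVersion::Exact(5), 7).unwrap_err());
}
```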
ES002: ConnectionFailed
Error: Connection failed: Failed to connect to database at postgresql://localhost/eventcore
Code: ES002
Cause: Unable to establish database connection. Solution:
- Verify database is running and accessible
- Check connection string configuration
- Verify network connectivity and firewall rules
ES003: ConnectionPoolExhausted
Error: Connection pool exhausted
Code: ES003
Cause: All database connections in the pool are in use. Solution:
- Increase max_connections in the pool configuration
- Check for connection leaks in application code
- Monitor connection usage patterns
ES004: TransactionError
Error: Transaction failed: serialization failure
Code: ES004
Cause: Database transaction could not be completed due to conflicts. Solution:
- Implement transaction retry logic
- Review transaction isolation levels
- Consider reducing transaction scope
ES005: MigrationError
Error: Migration error: Migration 20231201_001_create_events failed
Code: ES005
Cause: Database migration failed during startup. Solution:
- Check database permissions
- Verify migration scripts are valid
- Review database schema state
PostgreSQL-Specific Errors
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum PostgresError {
    #[error("Unique constraint violation: {constraint}")]
    UniqueConstraintViolation { constraint: String },

    #[error("Foreign key constraint violation: {constraint}")]
    ForeignKeyViolation { constraint: String },

    #[error("Check constraint violation: {constraint}")]
    CheckConstraintViolation { constraint: String },

    #[error("Deadlock detected: {message}")]
    DeadlockDetected { message: String },

    #[error("Query timeout: query exceeded {timeout:?}")]
    QueryTimeout { timeout: Duration },

    #[error("Connection limit exceeded")]
    ConnectionLimitExceeded,
}
}
Projection Errors
ProjectionError
Errors from projection operations.
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ProjectionError {
    #[error("Projection not found: {name}")]
    NotFound { name: String },

    #[error("Projection already exists: {name}")]
    AlreadyExists { name: String },

    #[error("Event processing failed: {message}")]
    ProcessingFailed { message: String },

    #[error("Checkpoint save failed: {message}")]
    CheckpointFailed { message: String },

    #[error("Rebuild failed: {message}")]
    RebuildFailed { message: String },

    #[error("Subscription error: {message}")]
    SubscriptionError { message: String },

    #[error("State corruption detected: {message}")]
    StateCorruption { message: String },

    #[error("Projection timeout: {projection} timed out after {duration:?}")]
    Timeout { projection: String, duration: Duration },

    #[error("Configuration error: {message}")]
    ConfigurationError { message: String },
}
}
Error Codes and Solutions
PR001: ProcessingFailed
Error: Event processing failed: Failed to apply UserCreated event
Code: PR001
Cause: Projection failed to process an event. Solution:
- Check projection logic for errors
- Verify event format matches expectations
- Implement proper error handling in projections
PR002: CheckpointFailed
Error: Checkpoint save failed: Database connection lost
Code: PR002
Cause: Unable to save projection checkpoint. Solution:
- Check database connectivity
- Verify checkpoint storage configuration
- Implement checkpoint retry logic
PR003: RebuildFailed
Error: Rebuild failed: Out of memory during rebuild
Code: PR003
Cause: Projection rebuild encountered an error. Solution:
- Increase memory allocation for rebuild operations
- Implement incremental rebuild strategies
- Check for memory leaks in projection code
PR004: StateCorruption
Error: State corruption detected: Checksum mismatch
Code: PR004
Cause: Projection state integrity check failed. Solution:
- Rebuild projection from beginning
- Investigate potential data corruption causes
- Verify checkpoint storage integrity
Validation Errors
ValidationError
Input validation errors from the nutype validation system.
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ValidationError {
    #[error("Required field is empty: {field}")]
    Empty { field: String },

    #[error("Value too long: {field} length {length} exceeds maximum {max}")]
    TooLong { field: String, length: usize, max: usize },

    #[error("Value too short: {field} length {length} below minimum {min}")]
    TooShort { field: String, length: usize, min: usize },

    #[error("Invalid format: {field} does not match expected format")]
    InvalidFormat { field: String },

    #[error("Invalid range: {field} value {value} outside range [{min}, {max}]")]
    OutOfRange { field: String, value: String, min: String, max: String },

    #[error("Predicate failed: {field} failed validation rule")]
    PredicateFailed { field: String },

    #[error("Parse error: {field} could not be parsed - {message}")]
    ParseError { field: String, message: String },
}
}
Error Codes and Solutions
VE001: Empty
Error: Required field is empty: stream_id
Code: VE001
Cause: Required field was empty or contained only whitespace. Solution:
- Ensure all required fields have values
- Check for null or empty string inputs
- Verify string trimming behavior
VE002: TooLong
Error: Value too long: stream_id length 300 exceeds maximum 255
Code: VE002
Cause: Input value exceeded maximum length constraint. Solution:
- Reduce input length to meet constraints
- Consider using shorter identifiers
- Review length requirements
VE003: InvalidFormat
Error: Invalid format: email does not match expected format
Code: VE003
Cause: Input value didn’t match expected format pattern. Solution:
- Verify input format matches requirements
- Check regular expression patterns
- Validate input on client side before submission
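The VE001/VE002 constraints can be captured as a smart constructor that validates at the boundary, so invalid values never exist past construction. EventCore uses the nutype crate to generate equivalent types; this hand-rolled version is illustrative only:

```rust
// Sketch of the validation the VE001/VE002 errors describe,
// written as a plain smart constructor.
#[derive(Debug, PartialEq)]
struct StreamId(String);

#[derive(Debug, PartialEq)]
enum ValidationError {
    Empty,                                 // VE001
    TooLong { length: usize, max: usize }, // VE002
}

impl StreamId {
    fn try_new(raw: &str) -> Result<Self, ValidationError> {
        let trimmed = raw.trim();
        if trimmed.is_empty() {
            return Err(ValidationError::Empty);
        }
        if trimmed.len() > 255 {
            return Err(ValidationError::TooLong { length: trimmed.len(), max: 255 });
        }
        Ok(StreamId(trimmed.to_string()))
    }
}

fn main() {
    assert!(StreamId::try_new("account-123").is_ok());
    // Whitespace-only input trims to empty and is rejected (VE001).
    assert_eq!(StreamId::try_new("   "), Err(ValidationError::Empty));
}
```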
Configuration Errors
ConfigError
Configuration and setup errors.
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum ConfigError {
    #[error("Missing required configuration: {key}")]
    MissingRequired { key: String },

    #[error("Invalid configuration value: {key} = {value}")]
    InvalidValue { key: String, value: String },

    #[error("Configuration file not found: {path}")]
    FileNotFound { path: String },

    #[error("Configuration parse error: {message}")]
    ParseError { message: String },

    #[error("Environment variable error: {message}")]
    EnvironmentError { message: String },

    #[error("Validation error: {message}")]
    ValidationError { message: String },
}
}
Error Codes and Solutions
CF001: MissingRequired
Error: Missing required configuration: DATABASE_URL
Code: CF001
Cause: Required configuration parameter not provided. Solution:
- Set missing environment variable or configuration value
- Check configuration file completeness
- Verify environment setup
CF002: InvalidValue
Error: Invalid configuration value: MAX_CONNECTIONS = -5
Code: CF002
Cause: Configuration value is invalid for the parameter type. Solution:
- Check value format and type requirements
- Verify numeric ranges and constraints
- Review configuration documentation
Network Errors
NetworkError
Network and connectivity related errors.
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum NetworkError {
    #[error("Connection timeout: {endpoint}")]
    ConnectionTimeout { endpoint: String },

    #[error("DNS resolution failed: {hostname}")]
    DnsResolutionFailed { hostname: String },

    #[error("TLS error: {message}")]
    TlsError { message: String },

    #[error("HTTP error: {status} - {message}")]
    HttpError { status: u16, message: String },

    #[error("Network unreachable: {endpoint}")]
    NetworkUnreachable { endpoint: String },

    #[error("Connection refused: {endpoint}")]
    ConnectionRefused { endpoint: String },
}
}
Serialization Errors
SerializationError
Data serialization and deserialization errors.
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum SerializationError {
    #[error("JSON serialization failed: {message}")]
    JsonSerializationFailed { message: String },

    #[error("JSON deserialization failed: {message}")]
    JsonDeserializationFailed { message: String },

    #[error("Invalid JSON format: {message}")]
    InvalidJsonFormat { message: String },

    #[error("Missing required field: {field}")]
    MissingField { field: String },

    #[error("Unknown field: {field}")]
    UnknownField { field: String },

    #[error("Type mismatch: expected {expected}, found {found}")]
    TypeMismatch { expected: String, found: String },

    #[error("Schema version mismatch: expected {expected}, found {found}")]
    SchemaVersionMismatch { expected: String, found: String },
}
}
Error Handling Patterns
Retry Strategies
EventCore provides different retry strategies for different error types:
#![allow(unused)]
fn main() {
// Automatic retry for transient errors
match command_executor.execute(&command).await {
    Ok(result) => result,
    Err(CommandError::ConcurrencyConflict { .. }) => {
        // Retry with exponential backoff
        retry_with_backoff(|| command_executor.execute(&command)).await?
    }
    Err(CommandError::Timeout { .. }) => {
        // Retry with a longer timeout
        command_executor.execute_with_timeout(&command, increased_timeout).await?
    }
    Err(other) => return Err(other), // Don't retry business logic errors
}
}
Error Conversion
Common error conversion patterns:
#![allow(unused)]
fn main() {
// Convert EventStore errors to Command errors
impl From<EventStoreError> for CommandError {
    fn from(err: EventStoreError) -> Self {
        match err {
            EventStoreError::VersionConflict { .. } => {
                CommandError::ConcurrencyConflict { streams: vec![] }
            }
            EventStoreError::StreamNotFound { stream_id } => {
                CommandError::StreamNotFound { stream_id }
            }
            other => CommandError::EventStoreError { source: other },
        }
    }
}
}
Error Context
Adding context to errors for better debugging:
#![allow(unused)]
fn main() {
use anyhow::{Context, Result};

async fn execute_command<C: Command>(command: &C) -> Result<ExecutionResult> {
    command_executor
        .execute(command)
        .await
        .with_context(|| format!("Failed to execute command: {}", std::any::type_name::<C>()))
        .with_context(|| "Command execution failed in main handler")
}
}
Troubleshooting Guide
Quick Reference
Performance Issues:
- Check CE006: Timeout errors → Review system performance
- Check ES003: ConnectionPoolExhausted → Increase pool size or fix connection leaks
- Check PR003: RebuildFailed → Optimize memory usage
Data Issues:
- Check CE003: ConcurrencyConflict → Implement retry logic
- Check ES001: VersionConflict → Review optimistic concurrency
- Check PR004: StateCorruption → Rebuild projections
Configuration Issues:
- Check CF001: MissingRequired → Set required configuration
- Check ES002: ConnectionFailed → Verify database connectivity
- Check CF002: InvalidValue → Review configuration values
Security Issues:
- Check CE005: Unauthorized → Verify permissions
- Check CE007: StreamAccessDenied → Fix stream access patterns
Diagnostic Commands
# Check EventCore health
eventcore-cli health-check
# Validate configuration
eventcore-cli config validate
# Test database connectivity
eventcore-cli database ping
# Check projection status
eventcore-cli projections status
# Verify stream access
eventcore-cli commands validate <command-type>
Log Analysis
Common log patterns to look for:
# High error rates
grep "ERROR" logs/eventcore.log | grep -c "CommandError"
# Concurrency conflicts
grep "ConcurrencyConflict" logs/eventcore.log | tail -10
# Performance issues
grep "Timeout\|slow query" logs/eventcore.log
# Connection issues
grep "ConnectionFailed\|ConnectionPoolExhausted" logs/eventcore.log
Error Prevention
Best Practices
- Input Validation: Use type-safe domain types with validation
- Error Handling: Implement comprehensive error handling strategies
- Monitoring: Set up alerts for error rate thresholds
- Testing: Include error scenarios in integration tests
- Documentation: Document expected error conditions
Type Safety
EventCore’s type system prevents many error categories:
#![allow(unused)]
fn main() {
// Good: Type-safe stream access
#[derive(Command)]
struct TransferMoney {
    #[stream]
    source_account: StreamId, // Guaranteed valid
    #[stream]
    target_account: StreamId, // Guaranteed valid
    amount: Money,            // Guaranteed valid currency/amount
}
// Prevents: CE007 StreamAccessDenied, VE001-VE003 validation errors
}
This completes the error reference documentation. Use this guide to understand, diagnose, and resolve EventCore errors effectively.
Next, explore the Glossary →
Chapter 7.4: Glossary
This glossary defines all terms and concepts used throughout EventCore documentation. Use this as a reference to understand EventCore terminology and concepts.
Core Concepts
Aggregate
In traditional event sourcing, an aggregate is a cluster of domain objects that can be treated as a single unit. EventCore eliminates traditional aggregates in favor of dynamic consistency boundaries defined by commands.
Command
A request to change the state of the system by writing events to one or more streams. Commands in EventCore can read from and write to multiple streams atomically, defining their own consistency boundaries.
Command Executor
The component responsible for executing commands against the event store. It handles stream reading, state reconstruction, command logic execution, and event writing.
Consistency Boundary
The scope within which ACID properties are maintained. In EventCore, each command defines its own consistency boundary by specifying which streams it needs to read from and write to.
CQRS (Command Query Responsibility Segregation)
An architectural pattern that separates read and write operations. EventCore naturally supports CQRS through its command system (writes) and projection system (reads).
Dynamic Consistency Boundaries
EventCore’s approach where consistency boundaries are determined at runtime by individual commands, rather than being fixed by aggregate design.
Event
An immutable fact that represents something that happened in the system. Events are stored in streams and contain a payload, metadata, and system-generated identifiers.
Event Sourcing
A data storage pattern where the state of entities is derived from a sequence of events, rather than storing current state directly.
Event Store
The database or storage system that persists events in streams. EventCore provides abstractions for different event store implementations.
Event Stream
See Stream.
Multi-Stream Event Sourcing
EventCore’s approach where a single command can atomically read from and write to multiple event streams, enabling complex business operations across multiple entities.
Projection
A read model built by processing events from one or more streams. Projections transform event data into formats optimized for querying.
Stream
A sequence of events identified by a unique StreamId. Streams represent the event history for a particular entity or concept.
EventCore Specific Terms
CommandStreams Trait
A trait that defines which streams a command needs to read from. Typically implemented automatically by the #[derive(Command)] macro.
CommandLogic Trait
A trait containing the domain logic for command execution. Separates business logic from infrastructure concerns.
EventId
A UUIDv7 identifier for events that provides both uniqueness and chronological ordering across the entire event store.
EventVersion
A monotonically increasing number representing the position of an event within its stream, starting from 0.
ExecutionResult
The result of executing a command, containing information about events written, affected streams, and execution metadata.
ReadStreams
A type-safe container providing access to stream data during command execution. Prevents commands from accessing streams they didn’t declare.
StreamData
The collection of events from a single stream, along with metadata like the current version.
StreamId
A validated identifier for event streams. Must be non-empty and under 255 characters.
StreamResolver
A component that allows commands to dynamically request additional streams during execution.
StreamWrite
A type-safe wrapper for writing events to streams that enforces stream access control.
TypeState Pattern
A compile-time safety pattern used in EventCore’s execution engine to prevent race conditions and ensure proper execution flow.
Architecture Terms
Functional Core, Imperative Shell
An architectural pattern where pure business logic (functional core) is separated from side effects and I/O operations (imperative shell).
Phantom Types
Types that exist only at compile time to provide additional type safety. EventCore uses phantom types to track stream access permissions.
Smart Constructor
A constructor function that validates input and returns a Result type, ensuring that successfully constructed values are always valid.
Type-Driven Development
A development approach where types are designed first to make illegal states unrepresentable, followed by implementation guided by the type system.
Event Store Terms
Checkpoint
A saved position in an event stream indicating how far a projection has processed events.
Expected Version
A constraint used for optimistic concurrency control when writing events to a stream.
Optimistic Concurrency Control
A concurrency control method that assumes conflicts are rare and checks for conflicts only when committing changes.
Position
The global ordering position of an event across all streams in the event store.
Snapshot
A saved state of an entity at a particular point in time, used to optimize event replay performance.
Subscription
A mechanism for receiving real-time notifications of new events as they’re written to streams.
WAL (Write-Ahead Log)
A logging mechanism where changes are written to a log before being applied to the main database.
Patterns and Techniques
Circuit Breaker
A pattern that prevents cascading failures by temporarily disabling operations that are likely to fail.
Dead Letter Queue
A queue that stores messages that could not be processed successfully, allowing for later analysis and reprocessing.
Event Envelope
A wrapper around event data that includes metadata like event type, version, and timestamps.
Event Upcasting
The process of transforming old event formats to new formats when event schemas evolve.
Idempotency
The property where performing an operation multiple times has the same effect as performing it once.
Process Manager
A component that coordinates long-running business processes by reacting to events and issuing commands.
Railway-Oriented Programming
A functional programming pattern for chaining operations that might fail, using Result types to handle errors gracefully.
Saga
A pattern for managing complex business transactions that span multiple services or aggregates.
Temporal Coupling
A coupling between components based on timing, which EventCore helps avoid through its event-driven architecture.
Database and Storage Terms
ACID Properties
Atomicity (all or nothing), Consistency (valid state), Isolation (concurrent safety), Durability (persistent storage).
Connection Pool
A cache of database connections that can be reused across multiple requests to improve performance.
Connection Pooling
The practice of maintaining a pool of reusable database connections.
Index
A database structure that improves query performance by creating ordered access paths to data.
Materialized View
A database object that contains the results of a query, physically stored and periodically refreshed.
PostgreSQL
The primary database system supported by EventCore for production event storage.
Transaction
A unit of work that is either completed entirely or not at all, maintaining database consistency.
UUIDv7
A UUID variant that includes a timestamp component, providing both uniqueness and chronological ordering.
Monitoring and Operations Terms
Alert
A notification triggered when monitored metrics exceed predefined thresholds.
Dashboard
A visual display of key metrics and system status information.
Health Check
An endpoint or service that reports the operational status of a system component.
Metrics
Quantitative measurements of system behavior and performance.
Observability
The ability to understand the internal state of a system based on its external outputs.
SLI (Service Level Indicator)
A metric that measures the performance of a service.
SLO (Service Level Objective)
A target value or range for an SLI.
Telemetry
The automated collection and transmission of data from remote sources.
Tracing
The practice of tracking requests through distributed systems to understand performance and behavior.
Security Terms
Authentication
The process of verifying the identity of a user or system.
Authorization
The process of determining what actions an authenticated user is allowed to perform.
JWT (JSON Web Token)
A standard for securely transmitting information between parties as a JSON object.
RBAC (Role-Based Access Control)
An access control method where permissions are associated with roles, and users are assigned roles.
TLS (Transport Layer Security)
A cryptographic protocol for securing communications over a network.
Development Terms
Cargo
Rust’s package manager and build system.
CI/CD (Continuous Integration/Continuous Deployment)
Practices for automating the integration, testing, and deployment of code changes.
Integration Test
A test that verifies the interaction between multiple components or systems.
Mock
A test double that simulates the behavior of real objects in controlled ways.
Property-Based Testing
A testing approach that verifies system properties hold for a wide range of generated inputs.
Regression Test
A test that ensures previously working functionality continues to work after changes.
Unit Test
A test that verifies the behavior of individual components in isolation.
Error Handling Terms
Backoff
A delay mechanism that increases wait time between retry attempts.
Circuit Breaker
See Patterns and Techniques section.
Error Boundary
A component that catches and handles errors from child components.
Exponential Backoff
A backoff strategy where delays increase exponentially with each retry attempt.
Failure Mode
A specific way in which a system can fail.
Graceful Degradation
The ability of a system to continue operating with reduced functionality when components fail.
Retry Logic
Code that automatically retries failed operations with appropriate delays and limits.
Timeout
A limit on how long an operation is allowed to run before being considered failed.
Configuration Terms
Environment Variable
A value set in the operating system environment that can be read by applications.
Configuration File
A file containing settings and parameters for application behavior.
Secret
Sensitive configuration data like passwords or API keys that must be protected.
TOML
A configuration file format that is easy to read and write.
YAML
A human-readable data serialization standard often used for configuration files.
Performance Terms
Benchmark
A test that measures system performance under specific conditions.
Bottleneck
The component or operation that limits overall system performance.
Latency
The time it takes for a single operation to complete.
Load Test
A test that simulates expected system load to verify performance characteristics.
Profiling
The process of analyzing system performance to identify optimization opportunities.
Scalability
The ability of a system to handle increased load by adding resources.
Throughput
The number of operations a system can handle per unit of time.
Data Terms
Immutable
Data that cannot be changed after creation.
Normalization
The process of organizing data to reduce redundancy and improve integrity.
Payload
The actual data content of an event, excluding metadata.
Schema
The structure and constraints that define how data is organized.
Serialization
The process of converting data structures into a format that can be stored or transmitted.
Validation
The process of checking that data meets specified requirements and constraints.
Rust-Specific Terms
Async/Await
Rust’s asynchronous programming model for non-blocking operations.
Borrow Checker
Rust’s compile-time mechanism that ensures memory safety without garbage collection.
Cargo.toml
The manifest file for Rust projects that specifies dependencies and metadata.
Crate
A compilation unit in Rust; equivalent to a library or package in other languages.
Derive Macro
A Rust macro that automatically generates implementations of traits for types.
Lifetime
A construct in Rust that tracks how long references are valid.
Option
Rust’s type for representing optional values, similar to nullable types in other languages.
Result
Rust’s type for representing operations that might fail, containing either a success value or an error.
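A minimal plain-Rust illustration of the Option and Result entries above (these functions are hypothetical examples, not EventCore APIs):

```rust
/// Absence modeled with Option: there may be no even number.
fn find_even(xs: &[i32]) -> Option<i32> {
    xs.iter().copied().find(|x| x % 2 == 0)
}

/// Failure modeled with Result: parsing can fail with a concrete error.
fn parse_port(s: &str) -> Result<u16, std::num::ParseIntError> {
    s.parse::<u16>()
}
```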
Trait
Rust’s mechanism for defining shared behavior that types can implement.
Ownership
Rust’s system for managing memory through compile-time tracking of resource ownership.
Acronyms and Abbreviations
API - Application Programming Interface
CI - Continuous Integration
CLI - Command Line Interface
CQRS - Command Query Responsibility Segregation
CRUD - Create, Read, Update, Delete
DDD - Domain-Driven Design
DNS - Domain Name System
HTTP - Hypertext Transfer Protocol
HTTPS - HTTP Secure
I/O - Input/Output
JSON - JavaScript Object Notation
JWT - JSON Web Token
ORM - Object-Relational Mapping
REST - Representational State Transfer
SQL - Structured Query Language
SSL - Secure Sockets Layer
TDD - Test-Driven Development
TLS - Transport Layer Security
UUID - Universally Unique Identifier
XML - eXtensible Markup Language
EventCore Command Reference
Common EventCore CLI commands and their purposes:
eventcore-cli
The main command-line interface for EventCore operations.
health-check
Verify system health and connectivity.
migrate
Run database migrations.
config validate
Validate configuration files and settings.
projections status
Check the status of all projections.
projections rebuild
Rebuild projections from event history.
streams list
List available event streams.
events export
Export events for backup or analysis.
Common Patterns
Builder Pattern
A creational pattern for constructing complex objects step by step.
Factory Pattern
A creational pattern for creating objects without specifying their exact classes.
Observer Pattern
A behavioral pattern where objects notify observers of state changes.
Repository Pattern
A design pattern that encapsulates data access logic.
Strategy Pattern
A behavioral pattern that enables selecting algorithms at runtime.
Best Practices
Fail Fast
The practice of detecting and reporting errors as early as possible.
Immutable Infrastructure
Infrastructure that is never modified after deployment, only replaced.
Least Privilege
The security principle of granting minimum necessary permissions.
Separation of Concerns
The principle of dividing software into distinct sections with specific responsibilities.
Single Responsibility Principle
Each class or module should have only one reason to change.
This glossary provides comprehensive coverage of EventCore terminology and related concepts. Use it as a reference when working with EventCore or reading the documentation.
That completes Part 7: Reference documentation for EventCore!
Banking Example
The banking example demonstrates EventCore’s multi-stream atomic operations by implementing a double-entry bookkeeping system.
Key Features
- Atomic Transfers: Move money between accounts with ACID guarantees
- Balance Validation: Prevent overdrafts with compile-time safe types
- Audit Trail: Complete history of all transactions
- Account Lifecycle: Open, close, and freeze accounts
Running the Example
cargo run --example banking
Code Structure
The example includes:
- `types.rs` - Domain types with validation (AccountId, Money, etc.)
- `events.rs` - Account events (Opened, Deposited, Withdrawn, etc.)
- `commands.rs` - Business operations (OpenAccount, Transfer, etc.)
- `projections.rs` - Read models for account balances and history
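As a sketch of the kind of validated domain type `types.rs` contains, a money type can make overdrafts unrepresentable; this is a simplified assumption (the real example builds such types with `nutype`):

```rust
/// Simplified sketch of a validated money type. Stored in minor units
/// (cents) as an unsigned integer, so a negative amount cannot exist.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Money(u64);

impl Money {
    pub fn from_cents(cents: u64) -> Self {
        Money(cents)
    }

    /// Withdrawal returns None rather than producing a negative balance.
    pub fn checked_sub(self, other: Money) -> Option<Money> {
        self.0.checked_sub(other.0).map(Money)
    }
}
```

A transfer command would treat `None` here as the "Insufficient funds" business-rule failure.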
E-Commerce Example
The e-commerce example shows how to build a complete order processing system with inventory management using EventCore.
Key Features
- Order Processing: Multi-step order workflow with validation
- Inventory Management: Real-time stock tracking across warehouses
- Dynamic Pricing: Apply discounts and calculate totals
- Multi-Stream Operations: Coordinate between orders, inventory, and customers
Running the Example
cargo run --example ecommerce
Code Walkthrough
The example demonstrates:
- Complex state machines for order lifecycle
- Compensation patterns for failed operations
- Projection-based inventory queries
- Integration with external payment systems
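The order-lifecycle state machine can be sketched as a fold over events; the states and events below are assumptions for illustration, not the example's actual types:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OrderState {
    Draft,
    Placed,
    Paid,
    Shipped,
    Cancelled,
}

#[derive(Debug, Clone, Copy)]
enum OrderEvent {
    Placed,
    PaymentCaptured,
    Shipped,
    Cancelled,
}

/// Fold one event into the current state. Illegal transitions leave the
/// state unchanged here; a real command would reject them up front.
fn apply(state: OrderState, event: OrderEvent) -> OrderState {
    match (state, event) {
        (OrderState::Draft, OrderEvent::Placed) => OrderState::Placed,
        (OrderState::Placed, OrderEvent::PaymentCaptured) => OrderState::Paid,
        (OrderState::Paid, OrderEvent::Shipped) => OrderState::Shipped,
        (OrderState::Draft | OrderState::Placed, OrderEvent::Cancelled) => OrderState::Cancelled,
        (s, _) => s,
    }
}
```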
Sagas Example
The saga example implements distributed transaction patterns using EventCore’s multi-stream capabilities.
What are Sagas?
Sagas are a pattern for managing long-running business processes that span multiple bounded contexts or services. EventCore makes implementing sagas straightforward with its multi-stream atomicity.
Example Scenario
This example implements a travel booking saga that coordinates:
- Flight reservation
- Hotel booking
- Car rental
- Payment processing
Each step can fail, triggering compensating actions to maintain consistency.
Running the Example
cargo run --example sagas
Implementation Details
- Orchestration: Central saga coordinator manages the workflow
- Compensation: Automatic rollback on failures
- Idempotency: Safe retries with exactly-once semantics
- Monitoring: Built-in observability for saga progress
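The orchestration-with-compensation idea can be sketched in plain Rust; the `Step` type and error handling here are hypothetical simplifications of what the saga example builds on top of EventCore streams:

```rust
/// One saga step: an action that can fail, plus its compensating action.
struct Step {
    name: &'static str,
    action: fn() -> Result<(), String>,
    compensate: fn(),
}

/// Run steps in order; on the first failure, run the compensations of
/// every completed step in reverse order, then report the failure.
fn run_saga(steps: &[Step]) -> Result<(), String> {
    let mut completed: Vec<&Step> = Vec::new();
    for step in steps {
        match (step.action)() {
            Ok(()) => completed.push(step),
            Err(e) => {
                for done in completed.iter().rev() {
                    (done.compensate)();
                }
                return Err(format!("{} failed: {}", step.name, e));
            }
        }
    }
    Ok(())
}
```

In the actual example, the coordinator records each step and compensation as events, so progress survives restarts and retries stay idempotent.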
Contributing to EventCore
Thank you for your interest in contributing to EventCore! We welcome contributions from the community.
Code of Conduct
Contributor Covenant Code of Conduct
Our Pledge
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
Our Standards
Examples of behavior that contributes to a positive environment for our community include:
- Demonstrating empathy and kindness toward other people
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
- Focusing on what is best not just for us as individuals, but for the overall community
Examples of unacceptable behavior include:
- The use of sexualized language or imagery, and sexual attention or advances of any kind
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others’ private information, such as a physical or email address, without their explicit permission
- Other conduct which could reasonably be considered inappropriate in a professional setting
Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
Scope
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at john@johnwilger.com. All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
1. Correction
Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
2. Warning
Community Impact: A violation through a single incident or series of actions.
Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
3. Temporary Ban
Community Impact: A serious violation of community standards, including sustained inappropriate behavior.
Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
4. Permanent Ban
Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
Consequence: A permanent ban from any sort of public interaction within the community.
Attribution
This Code of Conduct is adapted from the Contributor Covenant, version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.
For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
Changelog
All notable changes to the EventCore project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]
Added
- Comprehensive interactive documentation tutorials
- Enhanced error diagnostics with miette integration
- Fluent CommandExecutorBuilder API for configuration
- Command definition macros for cleaner code
- Multi-stream event sourcing with dynamic consistency boundaries
- Type-safe command system with compile-time stream access control
- Flexible command-controlled dynamic stream discovery
- PostgreSQL adapter with full type safety
- In-memory event store adapter for testing
- Comprehensive benchmarking suite
- Complete examples for banking, e-commerce, and saga pattern domains
- Property-based testing throughout the codebase
- Extensive monitoring and observability features
- Projection system with checkpointing and recovery
- Event serialization with schema evolution support
- Command retry mechanisms with configurable policies
- Developer experience improvements with macros
- Complete CI/CD pipeline with PostgreSQL integration
Changed
- Replaced aggregate-per-command terminology with multi-stream event sourcing
- Made PostgreSQL adapter generic over event type for better type safety
- Updated Command trait to include StreamResolver for flexible stream discovery
- Enhanced concurrency control to check all read stream versions
- Improved CI configuration with PostgreSQL services and coverage optimization
Fixed
- PostgreSQL schema initialization concurrency issues in CI
- All pre-commit hook failures across the codebase
- CI workflow syntax errors and configuration issues
- Test isolation with unique stream IDs and database cleanup
- Race conditions in concurrent command execution
Security
- Forbid unsafe code throughout the workspace
- Comprehensive security audit integration in CI
- Protection against dependency vulnerabilities
[0.1.3] - 2025-07-07
Fixed
- PostgreSQL test configuration (missing TEST_DATABASE_URL) in PR #31
- Documentation sync script symlink issue in PR #31
- Cargo.toml version specifications for crates.io publishing in PR #33
- CSS directory creation for documentation builds in PR #33
- Workspace dependency syntax errors (`rand.workspace = true` → `rand = { workspace = true }`)
- Version conflicts preventing re-release after partial crates.io publishing
- Circular dependency in eventcore-macros preventing crates.io release
- Publishing order to resolve dev-dependency circular dependencies
Changed
- Implemented workspace dependencies for internal crates to enable automatic lockstep versioning
- Updated publishing order to macros → memory → postgres → eventcore
- Added PR template compliance rules to CLAUDE.md
- Improved PR validation workflow with debouncing and comment deduplication
- Replaced PR validation workflow with Definition of Done bot
- Removed version numbers from internal workspace dependencies for cleaner dependency management
Added
- PR template compliance enforcement in development workflow
- Definition of Done bot configuration for automatic PR checklists
- Critical rule #4 to CLAUDE.md: Always stop and ask for help rather than taking shortcuts
[0.1.2] - 2025-07-05
Fixed
- Rand crate v0.9.1 deprecation errors:
  - Updated `thread_rng()` to `rng()` across the codebase
  - Updated `gen()` to `random()` and `gen_range()` to `random_range()`
  - Fixed ThreadRng Send issue in stress tests
- OpenTelemetry v0.30.0 API breaking changes:
  - Updated `Resource::new()` to the `Resource::builder()` pattern
  - Removed unnecessary runtime parameter from `PeriodicReader::builder()`
  - Added required `grpc-tonic` feature to the opentelemetry-otlp dependency
- Bincode v2.0.1 API breaking changes:
  - Updated to use `bincode::serde::encode_to_vec()` and `bincode::serde::decode_from_slice()`
  - Added the "serde" feature to the bincode dependency
  - Replaced deprecated `bincode::serialize()` and `bincode::deserialize()` functions
Changed
- Updated actions/configure-pages from v4 to v5 (PR #3)
- Updated codecov/codecov-action from v3 to v5 (PR #4)
[0.1.1] - 2025-07-05
Added
- Modern documentation website with mdBook
- GitHub Pages deployment workflow
- Custom EventCore branding and responsive design
- Automated documentation synchronization from markdown sources
- Deployment on releases with version information
- Comprehensive security infrastructure:
- SECURITY.md with vulnerability reporting via GitHub Security Advisories
- Improved cargo-audit CI job using rustsec/audit-check action
- Dependabot configuration for automated dependency updates
- CONTRIBUTING.md with GPG signing documentation
- Security guide in user manual covering authentication, encryption, validation, and compliance
- COMPLIANCE_CHECKLIST.md mapping to OWASP/NIST/SOC2/PCI/GDPR/HIPAA
- Pull request template with security and performance review checklists
- GitHub Copilot instructions for automated PR reviews
- Pre-commit hook improvements:
- Added doctests to pre-commit hooks
- Auto-format and stage files instead of failing
- GitHub MCP server integration for all GitHub operations
Fixed
- Outdated Command trait references (now CommandLogic) in documentation
- Broken documentation links in README.md
- License information to reflect MIT-only licensing
- Doctest compilation error in resource.rs
Changed
- Reorganized documentation structure (renumbered operations to 07, reference to 08)
- Consolidated documentation to single source (symlinked docs/manual to website/src/manual)
- Updated PR template to remove redundant pre-merge checklist and add Review Focus section
- Enhanced CLAUDE.md with GitHub MCP integration and PR-based workflow documentation
- Simplified PR template by consolidating multiple checklists into single Submitter Checklist
[0.1.0] - Initial Release
Added
- Core Event Sourcing Foundation
  - `StreamId`, `EventId`, `EventVersion` types with validation
  - Command trait system with type-safe execution
  - Event store abstraction with pluggable backends
  - Multi-stream atomicity with optimistic concurrency control
  - Event metadata tracking (causation, correlation, user)
- Type-Driven Development
  - Extensive use of `nutype` for domain type validation
  - Smart constructors that make illegal states unrepresentable
  - Result types for all fallible operations
  - Property-based testing with `proptest`
- PostgreSQL Adapter (`eventcore-postgres`)
  - Full PostgreSQL event store implementation
  - Database schema migrations
  - Transaction-based multi-stream writes
  - Optimistic concurrency control with version checking
  - Connection pooling and error mapping
- In-Memory Adapter (`eventcore-memory`)
  - Fast in-memory event store for testing
  - Thread-safe storage with Arc
  - Complete EventStore trait implementation
  - Version tracking per stream
- Command System
  - Type-safe command execution
  - Automatic state reconstruction from events
  - Multi-stream read/write operations
  - Retry mechanisms with exponential backoff
  - Command context and metadata support
- Projection System
  - Projection trait for building read models
  - Checkpoint management for resume capability
  - Projection manager with lifecycle control
  - Event subscription and processing
  - Error recovery and retry logic
- Monitoring & Observability
  - Metrics collection (counters, gauges, timers)
  - Health checks for event store and projections
  - Structured logging with tracing integration
  - Performance monitoring and alerts
- Serialization & Persistence
  - JSON event serialization with schema evolution
  - Type registry for dynamic deserialization
  - Unknown event type handling
  - Migration chain support
- Developer Experience
  - Comprehensive test utilities and fixtures
  - Property test generators for all domain types
  - Event and command builders
  - Assertion helpers for testing
  - Test harness for end-to-end scenarios
- Macro System (`eventcore-macros`)
  - `#[derive(Command)]` procedural macro
  - Automatic stream field detection
  - Type-safe StreamSet generation
  - Declarative `command!` macro
- Examples (`eventcore-examples`)
  - Banking domain with money transfers
  - E-commerce domain with order management
  - Order fulfillment saga with distributed transaction coordination
  - Complete integration tests
  - Usage patterns and best practices
- Benchmarks (`eventcore-benchmarks`)
  - Command execution performance tests
  - Event store read/write benchmarks
  - Projection processing benchmarks
  - Memory allocation profiling
- Documentation
  - Comprehensive rustdoc for all public APIs
  - Interactive tutorials for common scenarios
  - Usage examples in documentation
  - Migration guides and best practices
Performance
- Target: 5,000-10,000 single-stream commands/sec
- Target: 2,000-5,000 multi-stream commands/sec
- Target: 20,000+ events/sec (batched writes)
- Target: P95 command latency < 10ms
Breaking Changes
- N/A (initial release)
Migration Guide
- N/A (initial release)
Dependencies
- Rust: Minimum supported version 1.70.0
- PostgreSQL: Version 15+ (for PostgreSQL adapter)
- Key Dependencies:
  - `tokio` 1.45+ for async runtime
  - `sqlx` 0.8+ for PostgreSQL integration
  - `uuid` 1.17+ with v7 support for event ordering
  - `serde` 1.0+ for serialization
  - `nutype` 0.6+ for type safety
  - `miette` 7.6+ for enhanced error diagnostics
  - `proptest` 1.7+ for property-based testing
Architecture Highlights
- Multi-Stream Event Sourcing: Commands define their own consistency boundaries
- Type-Driven Development: Leverage Rust’s type system for domain modeling
- Functional Core, Imperative Shell: Pure business logic with side effects at boundaries
- Parse, Don’t Validate: Transform unstructured data at system boundaries only
- Railway-Oriented Programming: Chain operations using Result types
Versioning Strategy
EventCore follows Semantic Versioning with the following guidelines:
Major Version (X.0.0)
- Breaking changes to public APIs
- Changes to the Command trait signature
- Database schema changes requiring migration
- Changes to serialization format requiring migration
Minor Version (0.X.0)
- New features and capabilities
- New optional methods on traits
- New crates in the workspace
- Performance improvements
- New configuration options
Patch Version (0.0.X)
- Bug fixes
- Documentation improvements
- Dependency updates (compatible versions)
- Internal refactoring without API changes
Workspace Versioning
All crates in the EventCore workspace share the same version number to ensure compatibility:
- `eventcore` (core library)
- `eventcore-postgres` (PostgreSQL adapter)
- `eventcore-memory` (in-memory adapter)
- `eventcore-examples` (example implementations)
- `eventcore-benchmarks` (performance benchmarks)
- `eventcore-macros` (procedural macros)
Pre-release Versions
- Alpha: `X.Y.Z-alpha.N` - Early development, APIs may change
- Beta: `X.Y.Z-beta.N` - Feature complete, testing phase
- RC: `X.Y.Z-rc.N` - Release candidate, final testing
Compatibility Promise
- Patch versions: Fully compatible, safe to upgrade
- Minor versions: Backward compatible, safe to upgrade
- Major versions: May contain breaking changes, migration guide provided
Contributing
When contributing to EventCore:
- Follow the conventional commits format
- Update this CHANGELOG.md with your changes
- Ensure all tests pass and coverage remains high
- Update documentation for any API changes
- Add property-based tests for new functionality
Commit Message Format
type(scope): description
[optional body]
[optional footer]
Types: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore`
Scopes: `core`, `postgres`, `memory`, `examples`, `macros`, `benchmarks`
MIT License
Copyright (c) 2025 John Wilger
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.