A high-performance distributed task scheduler built with Go, capable of handling millions of scheduled tasks with fault tolerance and horizontal scalability.

Overview

The scheduler reliably schedules and executes tasks across a distributed system, providing exactly-once execution guarantees and automatic failover.
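
In practice, exactly-once execution is usually built from at-least-once delivery plus an atomic "claim" step, so that a redelivered task is recognized and skipped. The sketch below illustrates that idea only. `ClaimStore`, `NewClaimStore`, and `TryClaim` are hypothetical names, and the in-memory map is a stand-in for a durable store; none of this is taken from the project's actual code.

```go
package main

import "sync"

// ClaimStore records which task runs have already been claimed.
// Hypothetical type: an in-memory stand-in for a durable store.
type ClaimStore struct {
	mu      sync.Mutex
	claimed map[string]bool
}

// NewClaimStore returns an empty store ready for use.
func NewClaimStore() *ClaimStore {
	return &ClaimStore{claimed: make(map[string]bool)}
}

// TryClaim returns true only for the first caller that presents a given
// run key. Later callers (for example, after a redelivery) get false and
// should skip execution.
func (s *ClaimStore) TryClaim(runKey string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.claimed[runKey] {
		return false
	}
	s.claimed[runKey] = true
	return true
}
```

A worker would call `TryClaim` with a key such as the task ID plus its scheduled time, and run the task only when the call returns true. A production version would also need the claim to survive restarts and to expire if the claiming worker dies.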

Features

  • High Throughput: Process 100k+ tasks per second
  • Fault Tolerant: Automatic failover and task reassignment
  • Flexible Scheduling: Cron expressions, one-time, and recurring tasks
  • Priority Queues: Execute critical tasks first
  • Monitoring: Real-time metrics and alerting
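
To illustrate the priority queue feature above, here is a minimal sketch built on the standard library's `container/heap`. The `Task` fields and the "higher number runs first" convention are assumptions made for the example, not the project's actual data model.

```go
package main

import "container/heap"

// Task is a hypothetical unit of work; the field names are illustrative.
type Task struct {
	ID       string
	Priority int // assumption: higher values run first
}

// taskHeap implements heap.Interface as a max-heap on Priority.
type taskHeap []Task

func (h taskHeap) Len() int           { return len(h) }
func (h taskHeap) Less(i, j int) bool { return h[i].Priority > h[j].Priority }
func (h taskHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *taskHeap) Push(x any)        { *h = append(*h, x.(Task)) }
func (h *taskHeap) Pop() any {
	old := *h
	n := len(old)
	t := old[n-1]
	*h = old[:n-1]
	return t
}

// PriorityQueue wraps the heap so callers never use heap.Interface directly.
// It is not safe for concurrent use without external locking.
type PriorityQueue struct{ h taskHeap }

// Enqueue adds a task in O(log n).
func (q *PriorityQueue) Enqueue(t Task) { heap.Push(&q.h, t) }

// Dequeue removes and returns the highest-priority task,
// or false if the queue is empty.
func (q *PriorityQueue) Dequeue() (Task, bool) {
	if q.h.Len() == 0 {
		return Task{}, false
	}
	return heap.Pop(&q.h).(Task), true
}
```

With tasks enqueued at priorities 1, 9, and 5, successive `Dequeue` calls return them in the order 9, 5, 1. Tasks with equal priority come out in no guaranteed order.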

Technical Highlights

Architecture

  • Leader election using Raft consensus
  • Sharding for horizontal scalability
  • Message queue for task distribution
  • State management with PostgreSQL
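
One common way to shard tasks is to hash the task ID and take the result modulo the shard count. The sketch below shows that approach with the standard `hash/fnv` package. `ShardFor` is a hypothetical helper, and the project may well use a different scheme.

```go
package main

import "hash/fnv"

// ShardFor maps a task ID to a shard index in [0, numShards) using
// FNV-1a hashing. Hypothetical helper for illustration only.
func ShardFor(taskID string, numShards uint32) uint32 {
	if numShards == 0 {
		return 0 // avoid a divide-by-zero panic on misconfiguration
	}
	h := fnv.New32a()
	h.Write([]byte(taskID)) // Write on a hash.Hash never returns an error
	return h.Sum32() % numShards
}
```

The same task ID always lands on the same shard for a fixed shard count. Changing the shard count remaps most IDs, which is why systems that resize often prefer consistent hashing instead.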

Performance Optimizations

  • Connection pooling and reuse
  • Batch processing for database operations
  • In-memory caching with Redis
  • Worker pool with dynamic sizing
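
As an illustration of the worker pool idea, the sketch below runs a function over a slice of jobs with a fixed number of goroutines. `RunPool` is a hypothetical name, and the pool size is fixed here for simplicity; dynamic sizing is not shown.

```go
package main

import "sync"

// RunPool applies fn to every job using a fixed number of worker
// goroutines and returns the results in input order.
// Hypothetical helper; requires Go 1.18+ for generics.
func RunPool[T, R any](jobs []T, workers int, fn func(T) R) []R {
	if workers < 1 {
		workers = 1
	}
	results := make([]R, len(jobs))
	indexes := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range indexes {
				// Each index is delivered to exactly one worker,
				// so these writes never overlap.
				results[i] = fn(jobs[i])
			}
		}()
	}

	for i := range jobs {
		indexes <- i
	}
	close(indexes) // lets the workers' range loops finish
	wg.Wait()
	return results
}
```

Calling `RunPool([]int{1, 2, 3, 4}, 2, square)` with a squaring function yields the squares in the original order, regardless of which worker handled which job.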

Metrics

  • Latency: P99 < 50ms for task submission
  • Availability: 99.99% uptime
  • Scale: Tested with 10M+ concurrent scheduled tasks
  • Recovery: < 30 seconds failover time
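
For readers unfamiliar with the P99 figure, the sketch below shows one way to compute a percentile from latency samples using the nearest-rank method. `Percentile` is a hypothetical helper, samples are assumed to be in milliseconds, and this is not necessarily how the numbers above were measured.

```go
package main

import (
	"math"
	"sort"
)

// Percentile returns the p-th percentile (0 < p <= 100) of samples using
// the nearest-rank method. It returns 0 for an empty input.
// Hypothetical helper; samples are assumed to be latencies in milliseconds.
func Percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)

	// Nearest rank: the smallest rank covering at least p% of the samples.
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}
```

A P99 below 50 ms means that at most 1% of submissions took 50 ms or longer. Production systems typically compute this from histograms rather than raw samples, since storing every latency is expensive.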

Use Cases

  1. ETL Pipelines: Schedule data extraction and transformation jobs
  2. Notifications: Send time-based alerts and reminders
  3. Maintenance: Automated cleanup and backup tasks
  4. Reporting: Generate scheduled reports

Lessons Learned

  • Importance of observability in distributed systems
  • Trade-offs between consistency and availability
  • Effective testing strategies for concurrent systems