NativeLink is a high-performance, distributed build cache and remote execution system designed to accelerate software compilation and testing. The system follows the Remote Execution API v2 protocol and provides a modular architecture that scales from single-machine setups to large distributed deployments.

Architecture Overview

A NativeLink deployment is made up of the following components, which work together to provide caching and remote execution:

Core Components

Build Clients

Build tools like Bazel, Buck2, Goma, and Reclient interact with NativeLink through the Remote Execution API:
  • Submit build actions to the scheduler
  • Upload input files to the Content Addressable Storage (CAS)
  • Query the Action Cache (AC) for previously computed results
  • Download output artifacts from CAS

Schedulers

The scheduler manages the execution lifecycle of build actions. The primary scheduler implementation handles action queuing, worker matching, and task distribution.
Key Features:
  • Platform property-based worker matching
  • Configurable allocation strategies (LRU/MRU)
  • Action timeout and retry logic
  • Worker health monitoring
Configuration: See schedulers.rs:88-169
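The LRU/MRU allocation strategies can be illustrated with a small sketch. This is not NativeLink's implementation; the `WorkerPool` class and its method names are hypothetical, and the sketch assumes the strategy only decides which idle worker is handed the next action. Under that assumption, LRU spreads work across the pool, while MRU keeps reusing the most recently active worker.

```python
from collections import OrderedDict


class WorkerPool:
    """Hypothetical pool of idle workers, ordered by last use (oldest first)."""

    def __init__(self, strategy="lru"):
        if strategy not in ("lru", "mru"):
            raise ValueError(f"unknown strategy: {strategy}")
        self.strategy = strategy
        self._idle = OrderedDict()  # worker_id -> None, oldest use first

    def release(self, worker_id):
        """Mark a worker idle; it becomes the most recently used entry."""
        self._idle.pop(worker_id, None)
        self._idle[worker_id] = None

    def acquire(self):
        """Return an idle worker chosen by the strategy, or None if none is idle."""
        if not self._idle:
            return None
        # last=False pops the oldest entry (LRU); last=True pops the newest (MRU).
        worker_id, _ = self._idle.popitem(last=(self.strategy == "mru"))
        return worker_id
```

With workers released in the order `a`, `b`, `c`, an LRU pool hands out `a` first and an MRU pool hands out `c` first.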

Workers

Worker nodes execute build actions in isolated environments:
  • Connect to the scheduler and advertise their capabilities via platform properties
  • Download action inputs from CAS
  • Execute commands in controlled environments
  • Upload outputs back to CAS
  • Report execution results to the scheduler
Worker Capabilities:
  • Multi-action concurrency (configurable max_inflight_tasks)
  • Resource management (CPU, memory, disk)
  • Precondition scripts for dynamic resource checks
  • Graceful draining and shutdown
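As a rough illustration of how a concurrency limit and graceful draining might interact, the sketch below models a worker that accepts actions until it reaches `max_inflight_tasks` and stops accepting new ones once draining begins. The `Worker` class and its methods are hypothetical and only mirror the behavior described in the list above; they are not NativeLink's worker code.

```python
class Worker:
    """Hypothetical worker that limits concurrent actions and supports draining."""

    def __init__(self, max_inflight_tasks):
        self.max_inflight_tasks = max_inflight_tasks
        self.inflight = set()   # IDs of actions currently executing
        self.draining = False

    def can_accept(self):
        """A worker accepts work only if it is not draining and has a free slot."""
        return not self.draining and len(self.inflight) < self.max_inflight_tasks

    def start(self, action_id):
        """Try to start an action; return False if the worker cannot take it."""
        if not self.can_accept():
            return False
        self.inflight.add(action_id)
        return True

    def finish(self, action_id):
        """Record that an action completed and free its slot."""
        self.inflight.discard(action_id)

    def drain(self):
        """Stop accepting new actions; running actions are allowed to finish."""
        self.draining = True

    def ready_to_shut_down(self):
        """Shutdown is safe once draining has started and nothing is running."""
        return self.draining and not self.inflight
```

A worker created with `max_inflight_tasks=2` accepts two actions, rejects a third, and reports `ready_to_shut_down()` only after `drain()` has been called and both actions have finished.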

Storage Backends

NativeLink provides a flexible storage abstraction supporting multiple backends and composition strategies. See Stores for details.

Data Flow

Remote Execution Flow

Build Cache Flow

When a build tool performs a build:
  1. Hash Computation: Compute action digest from inputs, command, and platform properties
  2. Cache Check: Query Action Cache with digest
  3. Cache Hit: Download outputs from CAS and skip execution
  4. Cache Miss: Execute locally or remotely, then populate cache
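The four steps above can be sketched with two in-memory dictionaries standing in for the Action Cache and the CAS. This is a simplified model, not the real protocol: the Remote Execution API hashes a serialized `Action` message, whereas this sketch hashes a JSON document as a stand-in, and the names `BuildCache`, `action_digest`, and `run_action` are hypothetical.

```python
import hashlib
import json


def action_digest(command, input_digests, platform):
    """Step 1: derive a stable digest from command, inputs, and platform properties."""
    payload = json.dumps(
        {"command": command, "inputs": sorted(input_digests), "platform": platform},
        sort_keys=True,
    ).encode()
    return hashlib.sha256(payload).hexdigest()


class BuildCache:
    """In-memory stand-ins for the Action Cache (AC) and the CAS."""

    def __init__(self):
        self.ac = {}   # action digest -> output blob digest
        self.cas = {}  # blob digest -> blob bytes


def run_action(cache, command, input_digests, platform, execute):
    """Return (output, cache_hit). `execute` is called only on a cache miss."""
    digest = action_digest(command, input_digests, platform)

    # Step 2: query the Action Cache with the digest.
    output_digest = cache.ac.get(digest)

    # Step 3: on a hit, fetch the output from CAS and skip execution.
    if output_digest is not None and output_digest in cache.cas:
        return cache.cas[output_digest], True

    # Step 4: on a miss, execute, then populate CAS and the Action Cache.
    output = execute()
    output_digest = hashlib.sha256(output).hexdigest()
    cache.cas[output_digest] = output
    cache.ac[digest] = output_digest
    return output, False
```

Calling `run_action` twice with identical arguments runs `execute` once; the second call is served from the cache. Changing any input, the command, or a platform property yields a different digest and therefore a miss.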

Deployment Patterns

Single-Node Setup

All components run on a single machine. Ideal for local development and CI runners.
  • Scheduler + Worker + Storage on one node
  • In-memory or filesystem storage
  • Minimal configuration

Distributed Cluster

Components distributed across multiple machines for scalability.
  • Dedicated scheduler nodes
  • Worker pool (10s-1000s of nodes)
  • Shared cloud storage (S3, GCS)
  • Redis for metadata

Hybrid Cloud

Local caching with remote execution.
  • Local CAS/AC stores
  • gRPC scheduler forwarding to cloud
  • FastSlow store for cache tiers
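The idea behind a fast/slow cache tier can be shown with a minimal sketch. It assumes a read-through policy (check the fast tier, fall back to the slow tier, then copy the blob into the fast tier) and writes that go to both tiers. The `TieredStore` class is hypothetical and uses plain dictionaries in place of real store backends.

```python
class TieredStore:
    """Hypothetical two-tier store: a small fast tier in front of a slow tier."""

    def __init__(self, fast, slow):
        self.fast = fast  # e.g. local memory or disk
        self.slow = slow  # e.g. remote or cloud storage

    def get(self, digest):
        """Read from the fast tier, falling back to the slow tier on a miss."""
        if digest in self.fast:
            return self.fast[digest]
        data = self.slow.get(digest)
        if data is not None:
            # Populate the fast tier so the next read is served locally.
            self.fast[digest] = data
        return data

    def put(self, digest, data):
        """Write to both tiers so the blob survives fast-tier eviction."""
        self.fast[digest] = data
        self.slow[digest] = data
```

After a blob is read once from the slow tier, later reads are served from the fast tier without touching the slow one.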

Multi-Region

Geographically distributed deployment.
  • Regional schedulers and workers
  • Shared global CAS (S3/GCS)
  • Compression for network efficiency

Communication Protocols

NativeLink implements the following Remote Execution API v2 services:

Execution Service

  • Execute - Submit actions for execution
  • WaitExecution - Monitor execution progress

Content Addressable Storage Service

  • FindMissingBlobs - Check blob existence
  • BatchUpdateBlobs - Upload small blobs
  • BatchReadBlobs - Download small blobs
  • GetTree - Retrieve directory trees
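A client typically combines these calls so that it uploads only the blobs the server lacks. The sketch below models that pattern against an in-memory store. It is not a gRPC client: `InMemoryCas` and `upload_inputs` are hypothetical names, and a real Remote Execution API digest also carries the blob size, which is omitted here for brevity.

```python
import hashlib


def blob_digest(data):
    """Content address of a blob (hash only; real digests also include the size)."""
    return hashlib.sha256(data).hexdigest()


class InMemoryCas:
    """Hypothetical stand-in for a CAS server, keyed by content digest."""

    def __init__(self):
        self._blobs = {}

    def find_missing_blobs(self, digests):
        """Return the digests the store does not hold yet."""
        return [d for d in digests if d not in self._blobs]

    def batch_update_blobs(self, blobs):
        """Store each blob under its own digest; identical blobs deduplicate."""
        for data in blobs:
            self._blobs[blob_digest(data)] = data

    def batch_read_blobs(self, digests):
        """Return the stored blobs for the digests that exist."""
        return {d: self._blobs[d] for d in digests if d in self._blobs}


def upload_inputs(cas, blobs):
    """Upload only the blobs the CAS is missing; return how many were sent."""
    by_digest = {blob_digest(b): b for b in blobs}
    missing = cas.find_missing_blobs(list(by_digest))
    cas.batch_update_blobs([by_digest[d] for d in missing])
    return len(missing)
```

Because blobs are addressed by content, uploading a file that is already stored costs one existence check and no data transfer.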

ByteStream Service

  • Read - Stream blob downloads
  • Write - Stream blob uploads

Action Cache Service

  • GetActionResult - Retrieve cached results
  • UpdateActionResult - Store action results

Capabilities Service

  • GetCapabilities - Query server capabilities

Platform Properties

Platform properties enable fine-grained worker matching:
Example Configuration:
{
  "supported_platform_properties": {
    "cpu_count": "minimum",
    "cpu_arch": "exact",
    "OSFamily": "exact"
  }
}
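One way to read this configuration is that `exact` properties must match the worker's value exactly, while `minimum` properties require the worker to offer at least the requested numeric amount. The sketch below encodes that reading. The function `worker_matches` is hypothetical, and rejecting properties that are not listed in `supported_platform_properties` is an assumption of this sketch rather than documented behavior.

```python
def worker_matches(supported, worker_props, action_props):
    """Return True if a worker satisfies every property an action requests.

    supported:    property name -> "exact" or "minimum" (as in the config above)
    worker_props: property name -> value the worker advertises
    action_props: property name -> value the action requests
    """
    for name, requested in action_props.items():
        kind = supported.get(name)
        offered = worker_props.get(name)
        # Assumption: unknown or unadvertised properties never match.
        if kind is None or offered is None:
            return False
        if kind == "exact":
            if offered != requested:
                return False
        elif kind == "minimum":
            # Numeric comparison: the worker must offer at least the request.
            if int(offered) < int(requested):
                return False
        else:
            raise ValueError(f"unsupported property kind: {kind}")
    return True


supported = {"cpu_count": "minimum", "cpu_arch": "exact", "OSFamily": "exact"}
worker = {"cpu_count": "8", "cpu_arch": "x86_64", "OSFamily": "linux"}
```

With these values, an action requesting `{"cpu_count": "4", "cpu_arch": "x86_64"}` matches the worker, while one requesting `{"cpu_count": "16"}` or `{"cpu_arch": "arm64"}` does not.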

Configuration Files

NativeLink uses JSON5 configuration files that define:
  • Stores: CAS and AC backend configurations
  • Schedulers: Task scheduling and worker management
  • Workers: Execution capabilities and resources
  • Servers: gRPC service endpoints
See the configuration examples for reference deployments.

Performance Characteristics

NativeLink is trusted in production to handle over 1 billion requests per month for customers including Samsung.
Key Performance Features:
  • Content-addressed deduplication eliminates redundant storage
  • Incremental builds reuse cached artifacts
  • Parallel remote execution distributes workload
  • Store composition (compression, dedup, fast/slow tiers)
  • Efficient binary protocols (gRPC + protobuf)

Metrics and Observability

NativeLink provides extensive metrics and tracing:
  • Prometheus Metrics: Component-level performance data
  • OpenTelemetry Tracing: Distributed request tracing
  • Origin Events: Action lifecycle tracking
  • Health Endpoints: Service status monitoring

Next Steps

  • Build Cache: Learn how build caching accelerates builds
  • Remote Execution: Understand distributed task execution
  • Storage Backends: Explore storage options and composition