Waterfall Engine

waterfall_v1 is a rule-based engine for multi-manager deployments. You define a CSV-like policy_content where each line is one manager rule.

Quick Start (copy/paste)

This example uses two native worker managers to simulate two tiers.

[scheduler]
scheduler_address = "tcp://127.0.0.1:8516"
# for following object_storage_address
# - if omitted, object storage is auto-started at scheduler port + 1
# - if specified, scheduler will connect to specified address without start one
# object_storage_address = "tcp://127.0.0.1:8517"
policy_engine_type = "waterfall_v1"
policy_content = """
#  priority, worker_manager_id, max_task_concurrency
         1,        NAT|local1,                   8
         2,        NAT|burst1,                  50
"""

[[worker_manager]]
type = "baremetal_native"
scheduler_address = "tcp://127.0.0.1:8516"
object_storage_address = "tcp://127.0.0.1:8517"
worker_manager_id = "NAT|local1"
max_task_concurrency = 8

[[worker_manager]]
type = "baremetal_native"
scheduler_address = "tcp://127.0.0.1:8516"
object_storage_address = "tcp://127.0.0.1:8517"
worker_manager_id = "NAT|burst1"
max_task_concurrency = 50

Run command:

$ scaler config.toml

Policy Content (CSV-like)

policy_content is a newline-separated CSV-like list. Each non-empty line is one rule:

priority,worker_manager_id,max_task_concurrency

Example:

1,NAT|local1,8
2,NAT|burst1,50

Rules are evaluated as a priority ladder.

  • Scale-up goes from smaller priority values to larger values.

  • Scale-down is the reverse, so higher-priority-number tiers drain first.

  • Empty or malformed rule lines fail fast.

priority

priority defines tier order for each manager rule.

  • Type: integer.

  • Smaller number = higher preference.

  • 1 is preferred before 2 for scale-up decisions.

  • For scale-down, higher values are reduced before lower values.

Use priority to encode cost or latency tiers, for example local capacity first and burst capacity second.

worker_manager_id

worker_manager_id binds a rule to a specific connected worker manager.

  • Must match the manager heartbeat ID exactly.

  • Must be non-empty.

  • Must be unique per scheduler.

  • Each ID can appear only once in policy_content.

If a manager connects without a matching rule, it is ignored by waterfall scaling.

max_task_concurrency

max_task_concurrency defines the rule-level capacity cap for a manager.

  • Type: integer.

  • The effective cap is min(rule.max_task_concurrency, heartbeat.max_task_concurrency).

  • A lower-priority manager only scales up after all higher-priority managers reach their effective caps.

  • A higher-priority manager only scales down after lower-priority managers are drained.

Engine behavior and limits

  • Allocation policy is fixed to even_load.

  • Ratio thresholds are fixed at 10 for scale-up and 1 for scale-down.

  • Threshold values are not runtime-configurable.