
// Chapter 04 · Protocol

Request Routing

The four signals, in priority order.

5 min · 4 sections · Protocol

A deterministic function of four signals

// 4.0 · same inputs, same outcome, every time

// Key claim

The order is the policy.

Scheduling in ParalleliX is not a heuristic. Four signals are applied in fixed priority order; each stage narrows the candidate pool before the next stage runs. Capability runs first because it eliminates the largest cohort with the cheapest test. Reputation runs last because it is a tie-breaker, not a gate.
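The fixed-order narrowing described above can be sketched as a small pipeline. This is a minimal illustration, not the coordinator's code: `run_stages`, the `Stage` type, and the toy predicates are all hypothetical names, and nodes are reduced to plain ID strings.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration: a node is just an ID string, and a
# stage is any function that returns a subset of the pool it was given.
Stage = Callable[[Sequence[str]], List[str]]


def run_stages(
    pool: Sequence[str], stages: Sequence[Tuple[str, Stage]]
) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Apply each stage in the given order and record the pool size after it."""
    current = list(pool)
    trace: List[Tuple[str, int]] = []
    for name, stage in stages:
        current = stage(current)  # each stage sees only what survived so far
        trace.append((name, len(current)))
        if not current:
            break  # nothing left to narrow
    return current, trace


# Toy stages standing in for the four signals (placeholder predicates only).
stages = [
    ("capability", lambda p: [n for n in p if n.startswith("node-")]),
    ("capacity", lambda p: [n for n in p if n != "node-09"]),
    ("best-fit", lambda p: sorted(p)[:2]),
    ("reputation", lambda p: sorted(p)[:1]),
]

pool = ["node-22", "node-04", "node-09", "gpu-x", "node-15"]
winner, trace = run_stages(pool, stages)
print(winner, trace)
```

Because the stage list is ordered data and every stage is a pure function of its input, calling `run_stages` twice with the same pool returns the same winner and the same trace.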

// Request-level fan-out · live

// ParalleliX AI · inbound requests

  • req-4a1
  • req-4a2
  • req-4a3
  • req-4a4
  • req-4a5
  • req-4a6
scheduler · capability + capacity + best-fit + reputation

// node pool · each node serves one whole request

  • node-04 · req-4a1
  • node-09 · req-4a2
  • node-15 · req-4a3
  • node-22 · req-4a4
  • node-31 · req-4a5
  • node-07 · req-4a6

A single inference is served whole by one node, never split across machines. Throughput scales with the number of online nodes.

// Priority order · top → bottom
  1. // 01 · Binary

    Capability match

    Available now
  2. // 02 · Priority

    Capacity-aware queue

    Available now
  3. // 03 · Fit

    Best-fit allocation

    Available now
  4. // 04 · Weighted

    Reputation weighting

    Planned

The four signals

// 4.1 · 4.2 · 4.3 · 4.4 · in priority order

// Signal register · coordinator scheduler · live now

  1. // 01

    Capability match

    · Binary · Available now

    Reads each node's declared workload classes and eliminates every node that did not register the unit of work's class. At launch every capable node is eligible; class-specific optimisation is planned.

  2. // 02

    Capacity-aware queue

    · Priority · Available now

    Reads each node's active work counter against its declared limit. Higher-complexity work jumps ahead of lower-complexity work in the queue, and no node is subscribed beyond its declared capacity.

  3. // 03

    Best-fit allocation

    · Fit · Available now

    Among capable, available nodes, picks the smallest one that fits the unit of work. Larger nodes stay free for larger work, and capacity fragmentation drops.

  4. // 04

    Reputation weighting

    · Weighted · Planned

    Reads rolling uptime, validated-vs-dispatched completion rate, and 90-day validation success. At launch the multiplier is a constant 1.0; once the reputation layer ships, it breaks ties between nodes of equal capability and capacity.
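The four signals above can be sketched as a single node-selection function. This is a sketch under stated assumptions rather than the scheduler itself: the `Node` and `Request` shapes, the field names, and the use of `node_id` as a final deterministic tie-break are all hypothetical. Queue ordering by complexity is left out; only node selection is shown.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Node:  # hypothetical shape, not the protocol's schema
    node_id: str
    classes: FrozenSet[str]   # declared workload classes
    active: int               # active work counter
    limit: int                # declared capacity limit
    vram_gb: int
    cpu_cores: int
    reputation: float = 1.0   # constant 1.0 at launch


@dataclass(frozen=True)
class Request:  # hypothetical shape
    workload_class: str
    vram_gb: int
    cpu_cores: int


def select_node(req: Request, pool: List[Node]) -> Optional[Node]:
    # 01 Capability: drop nodes that did not declare the request's class.
    capable = [n for n in pool if req.workload_class in n.classes]
    # 02 Capacity: drop nodes already at their declared limit.
    available = [n for n in capable if n.active < n.limit]
    # 03 Best-fit: keep nodes large enough for the request.
    fitting = [
        n for n in available
        if n.vram_gb >= req.vram_gb and n.cpu_cores >= req.cpu_cores
    ]
    if not fitting:
        return None
    # Smallest adequate node first; 04 Reputation only breaks ties in size.
    # node_id is an assumed last key so the result is fully deterministic.
    return min(
        fitting,
        key=lambda n: (n.vram_gb, n.cpu_cores, -n.reputation, n.node_id),
    )


ai = frozenset({"ai.inference"})
pool = [
    Node("node-04", ai, active=1, limit=2, vram_gb=24, cpu_cores=8),
    Node("node-09", ai, active=0, limit=1, vram_gb=8, cpu_cores=2),
    Node("node-15", ai, active=2, limit=2, vram_gb=8, cpu_cores=2),  # at limit
    Node("node-22", frozenset({"render"}), active=0, limit=4, vram_gb=8, cpu_cores=4),
    Node("node-31", ai, active=0, limit=1, vram_gb=8, cpu_cores=2),
]
req = Request("ai.inference", vram_gb=8, cpu_cores=2)
chosen = select_node(req, pool)
print(chosen.node_id if chosen else None)
```

With every reputation at 1.0, the two equally sized candidates are separated only by the assumed `node_id` key. Raising one node's reputation changes the winner without touching the earlier stages, which is the sense in which reputation is a tie-breaker and not a gate.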

Worked example

// 4.5 · ai.inference · tier 2 · expedited · n=247

A single inference request, dispatched whole to one node. Watch the candidate pool shrink at each signal.

// Request specification

workload_class: ai.inference
hardware_tier: 2 (mid-GPU)
priority: expedited
resource_req: 8 GB VRAM · 2 CPU cores
node_pool_n: 247

// Candidate funnel · 247 → 1 · expedited · tier 2

  1. // 01 · Capability · 247 nodes declared ai.inference · 247 nodes
  2. // 02 · Capacity · 89 with available capacity for an expedited slot · 89 nodes · 158 eliminated
  3. // 03 · Best-fit · 73 with smallest-adequate resource_req · 73 nodes · 16 eliminated
  4. // 04 · Reputation · 10 top by uptime + validation success · 10 nodes · 63 eliminated
  5. // 05 · Assigned · Top-ranked candidate wins the request · nd_8e91
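The trailing figures in the funnel (158, 16, 63) are the drops between consecutive stages. A short check of that arithmetic, using only the pool sizes listed above; the helper name `eliminated_per_stage` is hypothetical.

```python
from typing import List, Tuple

# Pool size remaining after each stage, taken from the funnel above.
funnel: List[Tuple[str, int]] = [
    ("capability", 247),
    ("capacity", 89),
    ("best-fit", 73),
    ("reputation", 10),
    ("assigned", 1),
]


def eliminated_per_stage(pool_n: int, stages: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return how many candidates each stage removed from the pool."""
    out = []
    previous = pool_n
    for name, remaining in stages:
        out.append((name, previous - remaining))
        previous = remaining
    return out


drops = eliminated_per_stage(247, funnel)
print(drops)
```

In this example, capability removes no nodes because all 247 declared ai.inference. The narrowing starts at the capacity stage.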

Note · Routing in practice at launch

At launch, capability declarations are live, capacity and best-fit are fully enforced, and reputation is set to constant 1.0. Class-specific scheduler optimisation is planned, not yet built.
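One way to picture the launch behaviour is a multiplier that ignores its inputs until the reputation layer is switched on. The sketch below is an assumption throughout. The source names the three inputs but gives no formula, so the unweighted mean is a placeholder only, and `ReputationInputs`, `reputation_multiplier`, and `layer_enabled` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReputationInputs:  # hypothetical container for the three inputs
    uptime: float              # rolling uptime, 0.0 to 1.0
    completion_rate: float     # validated / dispatched, 0.0 to 1.0
    validation_success: float  # 90-day validation success, 0.0 to 1.0


def reputation_multiplier(inputs: ReputationInputs, layer_enabled: bool = False) -> float:
    """Return the tie-break multiplier for one node."""
    if not layer_enabled:
        # Launch behaviour from the text: constant 1.0 for every node.
        return 1.0
    # ASSUMPTION: placeholder blend, not the protocol's actual formula.
    return (inputs.uptime + inputs.completion_rate + inputs.validation_success) / 3.0


strong = ReputationInputs(uptime=0.99, completion_rate=0.97, validation_success=0.98)
weak = ReputationInputs(uptime=0.90, completion_rate=0.60, validation_success=0.90)

at_launch = (reputation_multiplier(strong), reputation_multiplier(weak))
with_layer = (
    reputation_multiplier(strong, layer_enabled=True),
    reputation_multiplier(weak, layer_enabled=True),
)
print(at_launch, with_layer)
```

With the layer off, both nodes score 1.0 and the multiplier cannot reorder them, so routing at launch depends only on capability, capacity, and best-fit.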

Signal inputs and behaviour over time

// 4.6 · per-signal schema

What each signal reads, what it outputs, and what changes as planned work lands.

// Signal arity · 4 axes × 4 columns

  • 4.1 Capability · declared workload classes · all capable = eligible · unchanged · class-specific optimisation
  • 4.2 Capacity · active counter per node · priority queue by complexity · unchanged · unchanged
  • 4.3 Best-fit · resource_req vs available · smallest adequate · unchanged · unchanged
  • 4.4 Reputation · uptime / completion / validation · constant 1.0 · rolling multiplier · unchanged
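The 4.2 row describes a priority queue ordered by complexity. A minimal sketch of that ordering with Python's `heapq`, assuming complexity is an integer score and that equal-complexity requests are served in arrival order. Both are assumptions, and `ComplexityQueue` is a hypothetical name.

```python
import heapq
import itertools
from typing import List, Tuple


class ComplexityQueue:
    """Pops the highest-complexity request first; ties pop in arrival order."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = itertools.count()  # monotonically increasing sequence

    def push(self, request_id: str, complexity: int) -> None:
        # heapq is a min-heap, so complexity is negated to pop the largest first.
        heapq.heappush(self._heap, (-complexity, next(self._arrival), request_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = ComplexityQueue()
queue.push("req-4a1", complexity=1)
queue.push("req-4a2", complexity=3)
queue.push("req-4a3", complexity=3)
queue.push("req-4a4", complexity=2)

order = [queue.pop() for _ in range(len(queue))]
print(order)
```

The arrival counter keeps the ordering deterministic when two requests share a complexity score, in line with the chapter's claim that the same inputs always give the same outcome.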
