// Section 4.2 · Protocol

Capacity-Aware Queueing


// 2 of 4 · request routing

Priority by complexity, not arrival order.

// 4.2 · capacity-aware queueing · signal 2 of 4

Higher-complexity requests are not blocked behind lower-complexity ones.

After capability filtering, the surviving node pool shares a queue. Requests enter that queue ordered by complexity, not by arrival order. On the live AI path each entry is a whole inference request, dispatched intact to one node; on the general parallelizable-workload path each entry is a sub-task. Each node keeps its own counter of active entries, and the scheduler never assigns a node more work than its declared limit.
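The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the protocol's implementation: the names `Node`, `ComplexityQueue` and `dispatch` are hypothetical, complexity is assumed to be an integer score, ties are assumed to fall back to arrival order, and the node is chosen by a simple first-with-room scan (the source leaves node selection to a later signal).

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Node:
    node_id: str
    limit: int        # declared maximum number of concurrent entries
    active: int = 0   # entries currently running on this node

    def has_room(self) -> bool:
        return self.active < self.limit


class ComplexityQueue:
    """Queue that pops the highest-complexity entry first (hypothetical helper)."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = count()  # arrival sequence, used only to break ties (assumption)

    def push(self, entry_id: str, complexity: int) -> None:
        # heapq is a min-heap, so the complexity is negated to pop the heaviest first.
        heapq.heappush(self._heap, (-complexity, next(self._seq), entry_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def dispatch(queue: ComplexityQueue, nodes: list) -> list:
    """Assign queued entries to nodes with spare capacity, heaviest entry first."""
    assignments = []
    while len(queue):
        # Assumption: take the first node with room; the real selection rule is not given here.
        target = next((n for n in nodes if n.has_room()), None)
        if target is None:
            break  # every node is at its declared limit, so the rest keep waiting
        entry_id = queue.pop()
        target.active += 1  # the counter never exceeds the node's limit
        assignments.append((entry_id, target.node_id))
    return assignments
```

Entries that cannot be placed stay in the queue rather than being pushed onto a full node, which is the over-subscription guarantee the paragraph describes.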

Per-node capacity counter

// each node carries its own limit · scheduler respects declared maximum

nd_4f7c · 3 / 4 · ACCEPTING
nd_8e91 · 4 / 4 · SATURATED
nd_c027 · 2 / 8 · ACCEPTING
nd_d104 · 1 / 4 · ACCEPTING

// capacity counter is per-node · the scheduler skips nd_8e91 entirely while it sits at the max
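A per-node counter of this kind might look like the sketch below. It assumes the status label is derived purely from `active` versus `limit`; the class name `NodeCapacity` and the `try_acquire` / `release` methods are hypothetical, introduced only for illustration. The node IDs and counts are the four shown above.

```python
from dataclasses import dataclass


@dataclass
class NodeCapacity:
    node_id: str
    active: int   # entries currently running
    limit: int    # declared maximum for this node

    @property
    def status(self) -> str:
        # Assumption: a node is SATURATED exactly when active has reached its limit.
        return "SATURATED" if self.active >= self.limit else "ACCEPTING"

    def try_acquire(self) -> bool:
        """Reserve one slot; refuse instead of exceeding the declared limit."""
        if self.active >= self.limit:
            return False
        self.active += 1
        return True

    def release(self) -> None:
        """Free one slot when an entry finishes."""
        if self.active > 0:
            self.active -= 1


pool = [
    NodeCapacity("nd_4f7c", active=3, limit=4),
    NodeCapacity("nd_8e91", active=4, limit=4),
    NodeCapacity("nd_c027", active=2, limit=8),
    NodeCapacity("nd_d104", active=1, limit=4),
]

# The scheduler only considers nodes that still have room; nd_8e91 is skipped.
accepting = [n.node_id for n in pool if n.status == "ACCEPTING"]
```

Because each node tracks its own limit, nd_c027 (limit 8) and nd_d104 (limit 4) can sit in the same pool without any shared capacity figure.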

Priority queue order

// complexity-ranked · not FIFO

// ORDER · COMPLEXITY · ARRIVAL · DISPATCH ETA
1 · req_a3f1 large-context inference request · 14:03:12 · next
2 · sub_b2c0 mid render frame · 14:03:05 · next-of-tier
3 · sub_c1d8 small ETL partition · 14:03:08 · queued
4 · req_d4e2 short inference request · 14:03:11 · queued
5 · sub_e5f9 small data window · 14:03:14 · queued

// the heaviest request arrived fourth of five (14:03:12) but dispatches first · capacity-aware queueing prevents head-of-line blocking behind lower-complexity work
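The ordering in the table can be reproduced with a two-part sort key. The sketch below is illustrative only: the numeric tiers in `TIER` are an assumption (the source gives labels such as "large-context", "mid", "small" and "short", not scores), "small" and "short" are assumed to share the lowest tier, and arrival time is assumed to break ties within a tier.

```python
# Hypothetical tier ranks; the source does not define a numeric complexity scale.
TIER = {"large": 3, "mid": 2, "small": 1}

# (entry id, assumed tier, arrival time) listed in arrival order.
arrivals = [
    ("sub_b2c0", "mid", "14:03:05"),
    ("sub_c1d8", "small", "14:03:08"),
    ("req_d4e2", "small", "14:03:11"),   # "short inference request", assumed lowest tier
    ("req_a3f1", "large", "14:03:12"),
    ("sub_e5f9", "small", "14:03:14"),
]


def dispatch_order(entries):
    """Sort by tier (highest first), then by arrival time within a tier."""
    # Zero-padded HH:MM:SS strings compare correctly as plain text.
    return sorted(entries, key=lambda e: (-TIER[e[1]], e[2]))


order = [entry_id for entry_id, _, _ in dispatch_order(arrivals)]
# order == ["req_a3f1", "sub_b2c0", "sub_c1d8", "req_d4e2", "sub_e5f9"]
```

Under these assumptions the result matches rows 1 to 5 of the table: the large-context request moves ahead of everything that arrived before it, while the three lowest-tier entries keep their arrival order among themselves.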

// Reading path

Sub-Task Routing hub
Best-Fit Allocation · signal 3 of 4
Per-Step Detail · scheduling step in the task lifecycle