
// Section 4.3 · Protocol

Best-Fit Allocation

// 3 of 4 · request routing

Smallest adequate node wins.

// 4.3 · best-fit allocation · signal 3 of 4

Larger nodes stay free for larger work. Capacity fragmentation drops.

Among capable, available nodes, the scheduler picks the smallest adequate one for the dispatched item. The intent is structural: leave larger nodes free for larger work, prevent capacity fragmentation, and keep the heaviest work routable when it arrives. The example below routes a whole inference request (the live AI path); the same allocation applies to a sub-task on the general parallelizable-workload path.
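The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the scheduler's actual implementation: the `Node` fields, the `pick_best_fit` name, and the choice to measure "smallest" by tier first and free capacity second are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical node record; field names are illustrative only."""
    node_id: str
    tier: int
    free_capacity: float  # free capacity in arbitrary, assumed units
    available: bool = True


def pick_best_fit(nodes: Iterable[Node], required_tier: int,
                  demand: float) -> Optional[Node]:
    """Return the smallest adequate node, or None if none qualifies.

    Assumption: a node is adequate when it is available, its tier is at
    least the required tier, and its free capacity covers the demand.
    Assumption: "smallest" means lowest tier, then least free capacity.
    """
    adequate = [
        n for n in nodes
        if n.available and n.tier >= required_tier and n.free_capacity >= demand
    ]
    return min(adequate, key=lambda n: (n.tier, n.free_capacity), default=None)


# Usage with made-up nodes: "tiny" cannot hold the demand, "big" is over-spec.
nodes = [
    Node("big", tier=4, free_capacity=8.0),
    Node("mid", tier=3, free_capacity=2.0),
    Node("tiny", tier=3, free_capacity=0.5),
]
chosen = pick_best_fit(nodes, required_tier=3, demand=1.0)
print(chosen.node_id if chosen else "no adequate node")
```

Sorting on the pair `(tier, free_capacity)` keeps higher-tier nodes as a last resort, which matches the stated intent of leaving larger nodes free for larger work.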

Worked example

// medium inference request · 4 candidate nodes · best fit wins

REQUEST TO ROUTE
req_a3f1
required: tier 3, ~mid-sized
estimated capacity: 0.4 nodes

CANDIDATE NODES
nd_4f7c · Tier 3 · 4×H100 · 80% free · fit score 6.5
    Adequate but over-spec · saves for larger work
nd_8e91 · Tier 3 · 4×A100 · 75% free · fit score 3.2
    BEST FIT · adequate, smallest adequate
nd_c027 · Tier 3 · 2×H100 · 90% free · fit score 1.0
    Insufficient · rejected
nd_d104 · Tier 4 · TPU v5 · 60% free · fit score 9.8
    Over-spec · would fragment Tier 4 capacity

DISPATCH DECISION: nd_8e91
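The dispatch decision in the worked example can be reproduced with a short sketch. The page does not say how the fit score is computed, so the scores are taken as given inputs here. Two things are assumptions: that a lower score is better among adequate nodes (which is consistent with the four values listed), and that the over-spec Tier 4 node counts as adequate because it is not marked rejected. The `Candidate` type and `dispatch` function are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    node_id: str
    fit_score: float  # copied from the worked example; formula not given
    adequate: bool    # False only for the node marked "Insufficient"


# Values copied from the worked example above.
CANDIDATES = [
    Candidate("nd_4f7c", 6.5, True),   # adequate but over-spec
    Candidate("nd_8e91", 3.2, True),   # best fit
    Candidate("nd_c027", 1.0, False),  # insufficient, rejected
    Candidate("nd_d104", 9.8, True),   # over-spec Tier 4 (assumed adequate)
]


def dispatch(candidates: List[Candidate]) -> Optional[str]:
    """Pick the adequate candidate with the lowest fit score.

    Assumption: lower score means a tighter fit. Rejected nodes are
    filtered out before scores are compared.
    """
    adequate = [c for c in candidates if c.adequate]
    if not adequate:
        return None
    return min(adequate, key=lambda c: c.fit_score).node_id


decision = dispatch(CANDIDATES)
print(decision)  # prints "nd_8e91"
```

Filtering before scoring matters here: nd_c027 has the lowest score of all four nodes, but it never competes because it cannot serve the request.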

The fragmentation cost

// what naive scheduling looks like

// WITHOUT BEST-FIT

Capacity fragments.

If the scheduler greedily picks the first capable node, large nodes fill with small work. When a tier-4 heavy request arrives, no adequate node is free. Throughput drops; the request waits.

// WITH BEST-FIT

Capacity stays accessible.

Best-fit routes small work to small adequate nodes. Large nodes stay available for large work. When a heavy request arrives, an adequate node is free immediately. Throughput holds; the heaviest work is never starved.
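The contrast between the two policies can be shown with a toy simulation. Everything in it is assumed for illustration: two nodes with made-up capacities (16 and 4 units), a request stream of two small items followed by one heavy item, and a simplified notion of fit based on free capacity alone.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    node_id: str
    free: int  # free capacity in arbitrary, assumed units


def first_fit(nodes: List[Node], demand: int) -> Optional[Node]:
    """Naive policy: take the first node that can hold the demand."""
    for node in nodes:
        if node.free >= demand:
            return node
    return None


def best_fit(nodes: List[Node], demand: int) -> Optional[Node]:
    """Best-fit policy: take the adequate node with the least free capacity."""
    adequate = [n for n in nodes if n.free >= demand]
    return min(adequate, key=lambda n: n.free, default=None)


def run(policy: Callable[[List[Node], int], Optional[Node]],
        demands: List[int]) -> List[int]:
    """Route each demand with the policy; return the demands left waiting."""
    nodes = [Node("large", 16), Node("small", 4)]  # fresh cluster per run
    waiting = []
    for demand in demands:
        node = policy(nodes, demand)
        if node is None:
            waiting.append(demand)
        else:
            node.free -= demand
    return waiting


demands = [2, 2, 16]  # two small requests, then one heavy request
first_fit_waiting = run(first_fit, demands)
best_fit_waiting = run(best_fit, demands)
print("first-fit waiting:", first_fit_waiting)  # [16]
print("best-fit waiting:", best_fit_waiting)    # []
```

Under first-fit the two small requests land on the large node and leave it with 12 free units, so the heavy request has nowhere to go. Under best-fit they fill the small node instead, and the large node is still whole when the heavy request arrives.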