
// Section 1.1.1 · Concepts

The Parallel-First Principle

// 3 of 4 · first principles

The inversion

// 1.1.1 · the inversion

// Load-bearing sentence

The unit of work is the request, not the machine.

Parallelism is request-level: many requests run at once across the pool. Genuinely parallelizable workloads can be segmented into sub-tasks, but a single inference request is dispatched whole to one node. Hold that distinction for the rest of the document.

// Three-way contrast · execution model

// Centralised cloud

Instance

Throughput capped at instance size. Cost scales with rental time, not work done. Geography is the provider's choice.

// rented machine · one job · one ceiling

// Token-wrapped DePIN

Instance

Decentralisation lives in the billing column. The job still lands on a single box. Execution did not move.

// rented machine · billing changed

// ParalleliX

Request

Each request routes to a capable node and runs whole. Many requests run concurrently across machines the user never sees. Throughput scales with the count of online nodes, not with any one instance.

// dispatched whole · mesh · ceiling moves
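The contrast above can be sketched in a few lines. This is a minimal illustration, not the ParalleliX scheduler: the `Node` class, `dispatch_whole`, and `rounds_needed` are hypothetical names, and least-loaded placement is an assumed stand-in for whatever routing policy the coordinator actually uses. It shows two things only: no request is ever split, and the number of service rounds falls as nodes come online.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical pool member; field names are illustrative only."""
    node_id: str
    assigned: list[str] = field(default_factory=list)


def dispatch_whole(request_ids: list[str], nodes: list[Node]) -> dict[str, str]:
    """Assign every request, unsplit, to the currently least-loaded node."""
    if not nodes:
        raise ValueError("no online nodes")
    placement: dict[str, str] = {}
    for request_id in request_ids:
        # Assumed policy: pick the node with the fewest assigned requests.
        target = min(nodes, key=lambda n: len(n.assigned))
        target.assigned.append(request_id)
        placement[request_id] = target.node_id  # exactly one node per request
    return placement


def rounds_needed(num_requests: int, num_nodes: int) -> int:
    """Service rounds if each node serves one whole request per round."""
    if num_nodes < 1:
        raise ValueError("need at least one node")
    return -(-num_requests // num_nodes)  # ceiling division
```

Under this toy model, doubling the online nodes from 3 to 6 halves the rounds needed for 12 requests, which is the "ceiling moves" claim in miniature.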

Six downstream effects

// 1.1.1 · six downstream effects

// Downstream effects · 6 entries

  • // 01 Scheduling

    Routes per request. Capability match and best-fit allocation operate on whole inference requests; many route across the pool at once.

  • // 02 Validation

    Verifies per request. Proof-of-Execution binds a result to a specific request and the node that served it.

  • // 03 Payment

    Settles off-chain. ParalleliX AI credits are metered per request with no on-chain gas; operators earn reward weight from uptime.

  • // 04 Pricing

    Scales with the requests served, weighted by the serving node's hardware tier and uptime, not with the wall-clock time of one rented machine.

  • // 05 Telemetry

    Surfaces per-node, per-request state. The node and the requests it serves, not a task ID, are the primary observability unit.

  • // 06 Failure recovery

    Redispatches per request. A failing node loses that request and its uptime credit; the coordinator re-routes the request to a healthy node.
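Two of the effects above, scheduling and failure recovery, lend themselves to a short sketch. Everything here is an assumption made for illustration: the `Node` and `Request` shapes, VRAM as the single capability dimension, "least spare capacity" as the meaning of best-fit, and `RuntimeError` as the failure signal. It is not the coordinator's actual logic, only one way a whole request could be matched to a capable node and redispatched when that node fails.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Node:
    node_id: str
    vram_gb: int          # hypothetical capability dimension
    healthy: bool = True


@dataclass(frozen=True)
class Request:
    request_id: str
    min_vram_gb: int      # hypothetical requirement


def best_fit(request: Request, nodes: list[Node],
             excluded: frozenset[str] = frozenset()) -> Optional[Node]:
    """Capability match, then choose the capable node with least spare VRAM."""
    capable = [
        n for n in nodes
        if n.healthy and n.node_id not in excluded
        and n.vram_gb >= request.min_vram_gb
    ]
    if not capable:
        return None
    return min(capable, key=lambda n: n.vram_gb - request.min_vram_gb)


def serve_with_redispatch(request: Request, nodes: list[Node],
                          run: Callable[[Node, Request], str],
                          max_attempts: int = 3) -> tuple[str, str]:
    """Run the whole request on one node; on failure, re-route it elsewhere."""
    failed: set[str] = set()
    for _ in range(max_attempts):
        node = best_fit(request, nodes, frozenset(failed))
        if node is None:
            break
        try:
            return node.node_id, run(node, request)
        except RuntimeError:
            failed.add(node.node_id)  # this node loses the request
    raise RuntimeError(f"no healthy node could serve {request.request_id}")
```

Note that the retry unit is the request itself: nothing is resumed mid-request, and the result is always attributable to the single node that finally served it.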

Note

Read every section of this document with that lens. The live demand is ParalleliX AI: each inference request is the unit of work, served whole by one node, with many requests running in parallel across the pool. Segmentation into sub-tasks is the general path for genuinely parallelizable workloads (rendering, scientific sweeps, map-reduce) and the future third-party submitter marketplace, not the inference path.
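The distinction between the two paths can be made concrete with a small sketch. The function names and the `workload_kind` string are hypothetical, and contiguous near-equal chunking is just one assumed segmentation strategy: a parameter sweep is split into sub-tasks, while an inference request is always planned as a single unit.

```python
from __future__ import annotations


def segment(items: list, num_subtasks: int) -> list[list]:
    """Split items into at most num_subtasks contiguous, near-equal chunks."""
    if num_subtasks < 1:
        raise ValueError("num_subtasks must be at least 1")
    size, extra = divmod(len(items), num_subtasks)
    chunks: list[list] = []
    start = 0
    for i in range(num_subtasks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when items are scarce
            chunks.append(items[start:end])
        start = end
    return chunks


def plan(workload_kind: str, payload: list, num_subtasks: int) -> list[list]:
    """Inference stays one whole unit; other workloads are segmented."""
    if workload_kind == "inference":
        return [payload]
    return segment(payload, num_subtasks)
```

For example, `plan("sweep", list(range(10)), 3)` yields three sub-tasks, whereas `plan("inference", ["prompt"], 3)` yields exactly one unit regardless of the requested sub-task count.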