Cost-efficient private AI inference

Darkbloom routes encrypted requests to hardware-verified Apple Silicon providers, delivering comparable model performance at about 50% lower cost than typical API providers. Prompts stay hidden from operators, and Mac owners earn from compute they already own.

Start building ↗ Estimate earnings ↗ Read the Paper ↗

01 - What You Get

For developers

Private inference without a new SDK

Change the base URL and keep your existing OpenAI client. Requests are encrypted before they leave your app and routed to verified Apple Silicon providers.

Open Console ↗

For Mac owners

Turn idle Apple Silicon into earnings

Run a provider on hardware you already own. Darkbloom matches your Mac with inference demand, and operators keep 100% of inference revenue during the public alpha.

Start Earning ↗

02 - Why It Costs Less

Most inference pricing includes several layers between silicon and the developer.

Capacity is bought, rented, repackaged, and metered before it reaches an API call. Each layer adds margin. Darkbloom routes demand to idle Apple Silicon instead, where the hardware is already paid for and the marginal cost is mostly electricity.
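To make the "each layer adds margin" point concrete, here is a minimal arithmetic sketch. The markup values are hypothetical and chosen only for illustration; the source does not publish per-layer margins, and `stacked_price` is an illustrative helper name.

```python
from math import prod


def stacked_price(base_cost: float, markups: list[float]) -> float:
    """Price after each layer applies its fractional markup in turn."""
    return base_cost * prod(1 + m for m in markups)


# Hypothetical numbers for illustration only: a unit of compute costing 1.00
# passes through three layers that each add a 30% markup.
price = stacked_price(1.00, [0.30, 0.30, 0.30])
print(round(price, 3))
```

Under these assumed numbers the price more than doubles before it reaches the developer, because the markups multiply rather than add.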

Typical API supply chain (diagram): several upstream layers → API providers → End users

Apple has shipped over 100 million machines with serious ML hardware: unified memory, high bandwidth, Neural Engines, and enough RAM in high-end systems to serve large MoE models. Most of that capacity sits idle for long stretches every day.

Darkbloom turns that idle capacity into a private inference market.

Developers get lower prices without changing SDKs. Mac owners earn from machines they already own. The coordinator matches demand to providers, but prompts stay encrypted and hidden from the operator.

100M+ Apple Silicon machines shipped since 2020

50% lower cost at comparable model performance

18 hrs average daily idle time per machine

100% of inference revenue goes to the hardware owner

03 - The Privacy Problem

Routing to idle machines is only useful if the operator cannot read the request.

Prompts can contain customer conversations, internal plans, source code, and other sensitive context. A marketplace promise is not enough when inference runs on hardware you do not own.

Darkbloom is designed around a stricter guarantee: the coordinator can route requests and the provider can serve them, but neither the coordinator nor the provider's operator should get a usable view of the prompt.

Private inference requires privacy that can be verified, not just promised.

04 - Privacy Architecture

Operator-blind by design

Darkbloom removes the practical software paths an operator could use to observe inference data. Four layers work together, each independently verifiable.

Encryption

Encrypted end-to-end

Requests are encrypted before transmission. The coordinator routes ciphertext, and only the matched provider's hardware-bound key can decrypt the request.

Hardware

Hardware-verified

Each provider uses a key generated inside Apple's tamper-resistant secure hardware. The attestation chain traces back to Apple's root certificate authority.

Runtime

Hardened runtime

The inference process is locked down at the OS level. Debugger attachment and memory inspection are blocked so the operator cannot inspect a running request.

Output

Traceable to hardware

Responses are signed by the specific machine that produced them. The attestation chain is public, so users can verify the hardware behind the result.

The operator contributes compute, not visibility.

Your prompt is encrypted before it leaves your app. The coordinator routes traffic it cannot read. The provider serves the request inside a hardened process the operator cannot inspect.

Read the paper ↗

05 - Developer Experience

OpenAI-compatible API

Keep your SDK, request shape, and streaming code. Point the client at Darkbloom and start routing private inference.

python

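from openai import OpenAI

# Reuse the standard OpenAI client; only the base URL and key change.
client = OpenAI(
    base_url="https://api.darkbloom.dev/v1",
    api_key="your-api-key",
)

# Request a streamed chat completion.
response = client.chat.completions.create(
    model="gemma-4-26b",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

# Some chunks carry no text (for example the final one), so skip empty deltas.
for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")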

Streaming - SSE in the OpenAI format

Large MoE - selected models up to 239B params
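For clients that read the raw stream instead of using the SDK, the streaming feature listed above can be decoded by hand. The sketch below assumes the usual OpenAI chat-completions convention of `data: {json}` lines ending with a `data: [DONE]` sentinel. The helper name `iter_sse_text` and the sample lines are illustrative and were not captured from a live Darkbloom endpoint.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        # Ignore blank separators, comments, and non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        event = json.loads(payload)
        for choice in event.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:  # role-only or final chunks carry no text
                yield text


# Illustrative sample stream.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]
print("".join(iter_sse_text(sample)))
```

In practice the OpenAI SDK performs this decoding itself, so a parser like this only matters when calling the HTTP endpoint directly.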

06 - Pricing

50% lower cost, comparable performance

Idle Apple Silicon keeps the cost structure simple. Pay per token with no subscription or minimum, with selected model prices set around 50% below typical API-provider rates for comparable models.

Model	Input	Output	Typical API	vs typical API
Gemma 4 26B (MoE · 128K context)	$0.03	$0.165	$0.33	50% lower
GPT-OSS 20B (MoE · 128K context)	$0.015	$0.07	$0.14	50% lower

Prices per million tokens. Typical API means published list rates for comparable models from major API providers.
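As a rough guide to what per-token pricing means for a single request, the sketch below applies the input and output prices from the table above. The dictionary keys are display names rather than API model IDs, and `request_cost` is an illustrative helper, not part of any Darkbloom SDK.

```python
# Prices in USD per million tokens, copied from the pricing table.
PRICES = {
    "Gemma 4 26B": {"input": 0.03, "output": 0.165},
    "GPT-OSS 20B": {"input": 0.015, "output": 0.07},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (illustrative helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Example: 2,000 prompt tokens and 500 completion tokens on Gemma 4 26B.
cost = request_cost("Gemma 4 26B", 2_000, 500)
print(f"Estimated cost: ${cost:.7f}")
```

Actual billing may round or meter tokens differently; treat the result as an estimate.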

07 - Earn

Earn from your Mac

Install the provider, choose when your Mac is available, and earn from inference jobs matched by the network. During the public alpha, operators keep 100% of inference revenue.

100% of inference revenue goes to you

Low marginal cost on Apple Silicon

Install via Terminal

Downloads the provider binary and configures a background launchd service.

terminal

$ curl -fsSL https://api.darkbloom.dev/install.sh | bash

No dependencies · Auto-updates · Runs as launchd service

Earnings estimate

Usage earnings (estimated at 80% utilization) plus the base-reward floor your machine earns for staying online. Select your hardware, model, online hours, and electricity cost.

Inputs: 1. Mac type · 2. Chip · 3. Memory · Model (auto-selected: most profitable for your hardware) · Electricity cost ($/kWh)

US avg: $0.15 · EU avg: $0.25 · CA avg: $0.22 · usage assumes 80% utilization, always-on

Base reward by memory tier (paid on top of usage)

Memory	Base reward
24GB	$10
32GB	$12
48GB	$16
64GB	$18
96GB	$22
128GB	$26
192GB	$30
512GB	$40

Estimates only. Usage assumes 80% utilization with continuous batching (4×); the live network runs lower today. Base rewards go to attested machines online ≥90% of the month, up to a fixed monthly budget. They are not guaranteed, and they taper as the network grows. Actual earnings depend on demand, model popularity, reputation, uptime, and local electricity cost.
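The pieces of the estimate can be combined in a simple net-earnings formula: usage revenue plus base reward minus electricity. The sketch below is a hypothetical model, not the calculator's actual logic. It assumes the base reward is a monthly amount, takes usage revenue and average power draw as inputs you supply, and ignores attestation, the budget cap, and tapering.

```python
# Base reward by memory tier in USD, copied from the tier table above.
BASE_REWARD_USD = {24: 10, 32: 12, 48: 16, 64: 18, 96: 22, 128: 26, 192: 30, 512: 40}


def estimate_monthly_net(
    memory_gb: int,
    usage_revenue_usd: float,
    avg_power_watts: float,
    hours_online: float,
    usd_per_kwh: float,
    uptime_fraction: float,
) -> float:
    """Rough net estimate: usage + base reward - electricity (hypothetical model)."""
    # Base rewards require being online at least 90% of the month.
    base = BASE_REWARD_USD[memory_gb] if uptime_fraction >= 0.90 else 0.0
    # Energy cost: kilowatts x hours x price per kWh.
    electricity = avg_power_watts / 1000 * hours_online * usd_per_kwh
    return usage_revenue_usd + base - electricity


# Illustrative inputs only: a 64GB machine, $50 of usage revenue, 40 W average
# draw, online 700 hours in the month at the US average of $0.15/kWh.
net = estimate_monthly_net(64, 50.0, 40.0, 700.0, 0.15, uptime_fraction=0.95)
print(f"{net:.2f}")
```

The usage revenue and power figures here are placeholders; real values depend on demand, the model served, and the specific machine.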

Read the technical paper

Architecture, threat model, security analysis, and economic model for private inference on distributed Apple Silicon.

Download PDF ↗