API Reference

DevToks exposes a verified OpenAI-compatible API surface for chat, Responses, Claude Messages, embeddings, rerank, media generation, asset management, and model discovery.

Base URL

Use the SDK base URL for OpenAI-compatible SDKs. The endpoint tables below show full HTTP paths under the API origin, so do not add /v1 twice.

OpenAI-compatible SDK base URL

https://api.devtoks.com/v1

HTTP API origin

https://api.devtoks.com

Example: with the SDK base URL shown here, POST /v1/chat/completions is called as /chat/completions from OpenAI SDK clients.
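
The "do not add /v1 twice" rule can be sketched as a small URL helper. This is a minimal illustration, not part of any DevToks SDK; the function name endpoint_url is hypothetical.

```python
SDK_BASE_URL = "https://api.devtoks.com/v1"
API_ORIGIN = "https://api.devtoks.com"


def endpoint_url(base_url: str, path: str) -> str:
    """Join a base URL and an endpoint path without repeating the /v1 prefix."""
    base = base_url.rstrip("/")
    path = "/" + path.lstrip("/")
    # If the base already ends in /v1 and the path starts with /v1 as well,
    # drop the duplicate prefix from the path.
    if base.endswith("/v1") and (path == "/v1" or path.startswith("/v1/")):
        path = path[len("/v1"):]
    return base + path
```

With this helper, both the SDK-style path /chat/completions and the full HTTP path /v1/chat/completions resolve to the same URL. Routes outside /v1, such as /mcp, should be joined against the API origin rather than the SDK base URL.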

Authentication

Authenticate every API request with a project API key.

Authorization: Bearer sk-your-key

Recommended header: Authorization: Bearer sk-your-key.
Anthropic-compatible clients may also send X-Api-Key: sk-your-key.
Never expose API keys in browser code, mobile apps, public repositories, or client-side logs.
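
As a sketch of the two header styles above, the helper below builds either header from a key loaded on the server side. The environment variable name DEVTOKS_API_KEY and the function name auth_headers are assumptions made for illustration only.

```python
import os


def auth_headers(api_key: str, anthropic_style: bool = False) -> dict:
    """Return the authentication header for a request.

    The Bearer form is the recommended one; X-Api-Key is the alternative
    that Anthropic-compatible clients may send.
    """
    if not api_key:
        raise ValueError("API key is empty")
    if anthropic_style:
        return {"X-Api-Key": api_key}
    return {"Authorization": f"Bearer {api_key}"}


# DEVTOKS_API_KEY is a hypothetical variable name; read the key from the
# server environment instead of hard-coding it in source files.
api_key = os.environ.get("DEVTOKS_API_KEY", "sk-your-key")
headers = auth_headers(api_key)
```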

Request Contract

Requests and responses use JSON unless a specific endpoint requires multipart upload, binary download, Server-Sent Events, or WebSocket upgrade semantics.

Send Content-Type: application/json for JSON requests.
Use stream: true on supported generation endpoints to receive Server-Sent Events.
Use GET /v1/models with the same API key to discover the exact model IDs available to that key.
Provider-specific features depend on the selected model and channel capability; unsupported options return a structured error.
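
The JSON request contract can be illustrated with the Python standard library. This is a minimal sketch that only builds the request object; the helper name build_json_request is hypothetical, and sending is left to the caller.

```python
import json
import urllib.request


def build_json_request(url: str, api_key: str, payload: dict,
                       stream: bool = False) -> urllib.request.Request:
    """Build a POST request with a JSON body and Bearer authentication."""
    body = dict(payload)  # copy so the caller's dict is not modified
    if stream:
        # Ask a supported generation endpoint for Server-Sent Events.
        body["stream"] = True
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_json_request(
    "https://api.devtoks.com/v1/chat/completions",
    "sk-your-key",
    {"model": "gpt-4.1", "messages": [{"role": "user", "content": "Hi"}]},
)
# To send it: urllib.request.urlopen(request)
```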

Verified Examples

These examples use the same public routes registered by the API gateway.

Chat Completions

OpenAI SDK-compatible request for text generation.

cURL

curl https://api.devtoks.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-key" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "user", "content": "Say hello in one sentence."}
    ]
  }'

Responses API

OpenAI Responses request. Native-only actions require a native Responses channel.

cURL

curl https://api.devtoks.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-key" \
  -d '{
    "model": "gpt-4.1",
    "input": "Summarize the benefits of API routing."
  }'

Claude Messages

Anthropic Messages format routed through the same API key.

cURL

curl https://api.devtoks.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-key" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Write a concise integration checklist."}
    ]
  }'

Model Discovery

Use this before production rollout to confirm model IDs and permissions.

cURL

curl https://api.devtoks.com/v1/models \
  -H "Authorization: Bearer sk-your-key"
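
A pre-rollout check against the model list might look like the sketch below. It assumes the response follows the OpenAI-style list shape, an object with a data array whose entries carry an id field; this page does not spell out that shape, so verify it against a real response. Both function names are hypothetical.

```python
def available_model_ids(listing: dict) -> list:
    """Return the sorted model IDs found in a model list response.

    Assumes the OpenAI-style shape {"object": "list", "data": [{"id": ...}]}.
    """
    return sorted(item["id"] for item in listing.get("data", []) if "id" in item)


def missing_models(listing: dict, wanted: list) -> list:
    """Return the wanted model IDs that the current key cannot see."""
    available = set(available_model_ids(listing))
    return [model for model in wanted if model not in available]
```

Running missing_models against the parsed GET /v1/models body before deployment gives an explicit list of IDs to fix, instead of a 400 at request time.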

Endpoints / Parameters and Responses

The gateway accepts the public compatibility contracts below. Unknown or provider-specific fields are forwarded only when supported by the selected model, channel, and sanitizer rules.

Chat & Text Completions

POST /v1/chat/completions - Chat completions (primary endpoint)
POST /v1/completions - Legacy text completions
POST /v1/edits - Legacy edits endpoint

Chat Completions — POST /v1/chat/completions

OpenAI-compatible chat request body.

Required

model: string
messages: array of { role, content }

Common optional fields

max_tokens or max_completion_tokens
temperature, top_p, top_k, stop
stream, stream_options.include_usage
tools, tool_choice, parallel_tool_calls
response_format, reasoning_effort, verbosity
metadata, user, service_tier, seed

Successful response

Non-streaming responses use choices[].message, choices[].finish_reason, model, created, and usage.
Streaming responses use Server-Sent Events with choices[].delta and optional usage chunks.

Notes

Values are validated for known ranges such as penalties, token limits, service tier, and reasoning effort.
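
Reading a non-streaming result can be sketched as follows. The helper name first_reply is hypothetical, and the sketch relies only on the choices[].message and choices[].finish_reason fields named above.

```python
def first_reply(response: dict) -> tuple:
    """Return (content, finish_reason) from the first choice, or (None, None)."""
    choices = response.get("choices") or []
    if not choices:
        return None, None
    choice = choices[0]
    message = choice.get("message") or {}
    return message.get("content"), choice.get("finish_reason")
```

Checking finish_reason alongside the content makes it easier to notice replies that were cut off by a token limit.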

Responses API

Create Responses API calls. Retrieval, input item, input token, cancel, delete, and compact actions require native Responses API upstream support.

POST /v1/responses - Create a response
POST /v1/responses/compact - Compact a response context (native Responses only)
POST /v1/responses/input_tokens - Count input tokens (native Responses only)
GET /v1/responses/:id - Retrieve a response
GET /v1/responses/:id/input_items - List response input items (native Responses only)
DELETE /v1/responses/:id - Delete a response
POST /v1/responses/:id/cancel - Cancel an in-progress response

Responses API — POST /v1/responses

OpenAI Responses contract. Native channels pass through Responses payloads; other compatible channels may use the chat fallback.

Required

model: string
input or prompt

Common optional fields

instructions, previous_response_id, store, background
max_output_tokens, temperature, top_p
tools, tool_choice, parallel_tool_calls, max_tool_calls
reasoning, text, include, truncation
stream, metadata, user, service_tier, safety_identifier

Successful response

Native Responses channels return upstream Responses API objects or typed SSE events.
Fallback channels return a Responses-shaped object converted from Chat Completions output.

Notes

Retrieve, input_items, input_tokens, cancel, delete, and compact actions require native Responses API upstream support.

Messages (Anthropic Native)

Use Claude models with the Anthropic Messages format directly.

POST /v1/messages - Anthropic-compatible Messages endpoint

Claude Messages — POST /v1/messages

Anthropic Messages-compatible request body.

Required

model: string
max_tokens: number
messages: array of { role, content }

Common optional fields

system, temperature, top_p, top_k
stream, stop_sequences
tools, tool_choice, thinking
reasoning_effort, effort, output_config, metadata

Successful response

JSON responses use id, type, role, model, content[], stop_reason, stop_sequence, and usage.
Streaming responses use Anthropic Messages SSE events.
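
Extracting text from a JSON Messages response can be sketched like this. It assumes content[] holds Anthropic-style blocks in which text blocks look like {"type": "text", "text": ...}; the helper name message_text is hypothetical.

```python
def message_text(response: dict) -> str:
    """Concatenate the text blocks of a Messages response.

    Non-text blocks (for example tool_use or thinking blocks) are skipped.
    """
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
```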

Embeddings

POST /v1/embeddings - Create text embeddings
POST /v1/engines/:model/embeddings - Legacy path (OpenAI v1 compat)

Embeddings — POST /v1/embeddings

OpenAI-compatible embeddings request body.

Required

model: string
input: string or array

Common optional fields

encoding_format
dimensions
user

Successful response

Responses use object, data[].embedding, data[].index, model, and usage.
Numeric embedding arrays and base64-encoded embedding payloads are normalized for compatible clients.
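
A client that requests encoding_format base64 has to decode the payload itself. The sketch below accepts either form of data[].embedding; it assumes base64 payloads are packed little-endian 32-bit floats, as in the OpenAI API, which this page does not state explicitly. The helper name decode_embedding is hypothetical.

```python
import base64
import struct


def decode_embedding(value) -> list:
    """Return an embedding as a list of floats.

    Accepts a numeric array or a base64 string of little-endian float32 values.
    """
    if isinstance(value, str):
        raw = base64.b64decode(value)
        if len(raw) % 4 != 0:
            raise ValueError("payload length is not a multiple of 4 bytes")
        count = len(raw) // 4
        return list(struct.unpack(f"<{count}f", raw))
    return [float(x) for x in value]
```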

Rerank

POST /v1/rerank - Rerank documents by relevance
POST /v2/rerank - V2 endpoint (Cohere-compatible)

Rerank — POST /v1/rerank and /v2/rerank

Canonical rerank request body with Cohere-style compatibility.

Required

model: string
query: string
documents: string[]

Common optional fields

top_n
max_tokens_per_doc
priority
input (legacy alias for query)

Successful response

Responses are provider-shaped rerank results containing ranked document indexes and relevance scores when supported upstream.
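
Mapping ranked indexes back to the submitted documents can be sketched as below. Because results are provider-shaped, the field names here are an assumption: the sketch expects a Cohere-style results array whose entries carry index and relevance_score. The helper name ranked_documents is hypothetical.

```python
def ranked_documents(documents: list, response: dict) -> list:
    """Return (document, score) pairs ordered from most to least relevant.

    Assumes a Cohere-style body: {"results": [{"index": 0, "relevance_score": 0.9}]}.
    """
    results = response.get("results", [])
    ordered = sorted(results, key=lambda r: r.get("relevance_score", 0.0), reverse=True)
    return [(documents[r["index"]], r.get("relevance_score")) for r in ordered]
```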

Images

Requires a channel with image generation capability.

POST /v1/images/generations - Generate images from text
POST /v1/images/edits - Edit an existing image

Images — POST /v1/images/generations and /v1/images/edits

Image generation uses JSON or form fields. Image edits require multipart form upload.

Required

generations: prompt
edits: image, mask, model, prompt

Common optional fields

model, n, size, quality, style
response_format, aspect_ratio, output_format, background
moderation, user, image_prompt, input_fidelity

Successful response

Responses use created, data[] with url or b64_json, revised_prompt when provided, and usage when upstream supplies it.

Audio

Requires a channel with audio capability.

POST /v1/audio/speech - Text-to-speech (TTS)
POST /v1/audio/transcriptions - Audio to text (Whisper)
POST /v1/audio/translations - Transcribe and translate to English

Audio — speech, transcriptions, translations

Speech uses JSON. Transcription and translation use multipart form upload.

Required

speech: model, input, voice
transcriptions/translations: file, model

Common optional fields

speech: speed, response_format
transcriptions/translations: prompt, response_format, temperature
transcriptions: language, timestamp_granularity

Successful response

Speech returns audio bytes in the requested format when supported.
Transcription and translation return text, JSON, verbose_json, srt, or vtt according to response_format.

Video

Requires a channel with video generation capability.

POST /v1/videos - Submit a video task
GET /v1/videos - List video tasks
GET /v1/videos/:id - Get task status
GET /v1/videos/:id/content - Download completed video
DELETE /v1/videos/:id - Delete a video task
POST /v1/video/generations - Legacy video generation submit path
GET /v1/video/generations/:id - Legacy video generation status path

Video — /v1/videos and legacy /v1/video/generations

Video tasks are asynchronous on supported video channels.

Required

model and/or prompt, depending on selected channel

Common optional fields

seconds, duration, duration_seconds
size, resolution, aspect_ratio
remix_id, reference_id, reference_assets
generate_audio, seed, return_last_frame, callback_url

Successful response

Create and status responses are provider-compatible task objects.
GET /v1/videos/:id/content streams completed video content when available.
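
Because video tasks are asynchronous, a client typically polls the status route until the task finishes. The loop below is a sketch under stated assumptions: the status field name and the "completed" and "failed" values are hypothetical, since task objects are provider-compatible and may use other names. The caller supplies fetch_status, a function that performs GET /v1/videos/:id and returns the parsed JSON.

```python
import time


def wait_for_task(fetch_status, task_id: str,
                  done_states=("completed",), failed_states=("failed",),
                  interval: float = 5.0, max_polls: int = 60,
                  sleep=time.sleep) -> dict:
    """Poll a task until it reaches a done or failed state.

    The state names are assumptions; adjust them to the selected channel.
    """
    for _ in range(max_polls):
        task = fetch_status(task_id)
        status = task.get("status")
        if status in done_states:
            return task
        if status in failed_states:
            raise RuntimeError(f"video task {task_id} ended with status {status!r}")
        sleep(interval)
    raise TimeoutError(f"video task {task_id} did not finish after {max_polls} polls")
```

Once the task is done, the content route can be used to download the result. Where a channel supports callback_url, a callback can replace polling.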

Moderation

POST /v1/moderations - Classify text for policy violations

Moderation — POST /v1/moderations

OpenAI-compatible moderation request body.

Required

model: string
input: string or array

Common optional fields

metadata or provider-specific moderation options when supported

Successful response

Responses contain provider-compatible moderation results and category scores when the upstream supplies them.

Realtime

GET /v1/realtime - WebSocket for real-time sessions
GET /v1/responses - WebSocket upgrade for native Responses sessions

MCP (Model Context Protocol)

POST /mcp - MCP proxy (routes tool calls)

MCP — POST /mcp

Model Context Protocol proxy endpoint.

Required

Provider-specific JSON-RPC or tool payload

Common optional fields

Tool call metadata supported by the selected upstream channel

Successful response

MCP responses follow the selected upstream capability and are returned in the provider-compatible shape.

Asset Management

Manage Volcengine video reference assets (images). Requires a channel with asset management capability. All endpoints use POST with a JSON body.

POST /volc/asset/CreateAssetGroup - Create an asset group
POST /volc/asset/CreateAsset - Upload a reference asset to a group
POST /volc/asset/ListAssetGroups - List your asset groups
POST /volc/asset/ListAssets - List assets in a group
POST /volc/asset/GetAssetGroup - Get asset group details
POST /volc/asset/GetAsset - Get a single asset
POST /volc/asset/UpdateAssetGroup - Rename or update an asset group
POST /volc/asset/UpdateAsset - Update asset metadata

Assets — POST /volc/asset/*

Volcengine reference asset management for video-capable channels.

Required

Asset group or asset JSON payload required by the selected asset operation

Common optional fields

Asset metadata, group metadata, pagination, or update fields depending on operation

Successful response

Asset management responses follow the selected upstream capability.

Models

GET /v1/models - List all available models
GET /v1/models/:model - Retrieve model details

Models — GET /v1/models

Model discovery for the current API key.

Required

No request body

Common optional fields

GET /v1/models/:model returns one model detail when available

Successful response

The response contains the model list filtered for the current API key, or the detail record for a single model.

Special Endpoints

POST /api/paas/v4/layout_parsing - Zhipu OCR / document parsing

OCR — POST /api/paas/v4/layout_parsing

Zhipu OCR and document layout parsing compatibility endpoint.

Required

Provider-specific JSON payload or document reference

Common optional fields

request_id, user_id, page range, crop, and visualization flags

Successful response

OCR responses follow the selected upstream capability and document parsing format.

Endpoint availability is enforced by API key permissions, model access, channel capability, and account balance. A registered route can still return an error when the selected model or channel does not support that operation.

Streaming and Realtime

Streaming is supported on compatible text, Responses, Claude Messages, and media routes when the upstream channel supports the requested mode.

For Chat Completions and Responses, set stream: true to receive Server-Sent Events.
Claude Messages streaming returns Anthropic-shaped SSE events.
GET /v1/realtime and WebSocket upgrades on /v1/responses are WebSocket flows, not normal JSON GET requests.
If a stream has already started and an error occurs, the gateway sends a protocol-appropriate SSE error event instead of corrupting the stream with a JSON body.
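
Consuming a Chat Completions stream can be sketched as follows. The sketch assumes OpenAI-style framing, where each event carries one data: line holding a JSON chunk and the stream ends with data: [DONE]. It also assumes that a mid-stream error arrives as a chunk with a top-level error object; this page does not specify the exact shape of the SSE error event. Both function names are hypothetical.

```python
import json


def iter_sse_data(lines):
    """Yield decoded JSON payloads from SSE data: lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and event: lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_text(lines) -> str:
    """Concatenate choices[].delta.content across a streamed completion."""
    parts = []
    for chunk in iter_sse_data(lines):
        error = chunk.get("error")
        if isinstance(error, dict):
            # Assumed error shape; stop instead of returning partial text.
            raise RuntimeError(error.get("message", "stream error"))
        for choice in chunk.get("choices", []):
            parts.append((choice.get("delta") or {}).get("content") or "")
    return "".join(parts)
```

Usage chunks, which may arrive with an empty choices array, pass through collect_text without contributing text.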

Model Discovery

The model list is dynamic and filtered by the API key, user entitlement, channel status, and configured model mappings. Treat GET /v1/models as the source of truth for customer integrations.

Special Compatibility

Claude Code Path Normalization

Claude Code may send Messages API requests through different path prefixes. DevToks rewrites all of the following to POST /v1/messages:

/openai/v1/messages
/v1/v1/messages
/openai/v1/v1/messages
/api/v1/v1/messages

No client-side configuration required.

API Format Auto-Detection

If a request body is sent to the wrong endpoint, for example a Responses API payload to /v1/chat/completions, DevToks detects the mismatch and routes it to the correct handler by default. Deployments may be configured to return a 302 redirect instead.

Error Codes

Relay endpoints return sanitized public error messages. Most JSON errors use the OpenAI-compatible { error: { message, type, param, code } } envelope; Claude Messages JSON errors use the Anthropic { type: "error", error: { type, message } } envelope.

400: Invalid request, unsupported parameter, unsupported channel capability, or model mismatch.
401: Authentication failed. Check the API key and header format.
403: Permission denied or account balance is insufficient.
404: The requested resource or route was not found.
408/504: The request timed out.
413: The request body or model input is too large.
429: Rate limit, concurrency limit, or service busy condition.
500/502/503: Temporary service or upstream provider failure.

Error Shape

{
  "error": {
    "message": "The request is too long for the selected model.",
    "type": "invalid_request_error",
    "param": "",
    "code": "context_window_exceeded"
  }
}
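
Since two envelopes are possible, a client can normalize them before logging or branching. The sketch below covers only the two shapes described in this section; the helper name parse_error and the returned dict layout are hypothetical.

```python
def parse_error(body: dict) -> dict:
    """Normalize an OpenAI-style or Anthropic-style error body."""
    error = body.get("error") or {}
    if body.get("type") == "error":
        # Anthropic envelope: {"type": "error", "error": {"type", "message"}}
        return {
            "format": "anthropic",
            "type": error.get("type"),
            "message": error.get("message"),
            "param": None,
            "code": None,
        }
    # OpenAI envelope: {"error": {"message", "type", "param", "code"}}
    return {
        "format": "openai",
        "type": error.get("type"),
        "message": error.get("message"),
        "param": error.get("param"),
        "code": error.get("code"),
    }
```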

Retry Guidance

Retry 408, 429, 500, 502, 503, and 504 with exponential backoff and jitter.
Do not retry 400, 401, 403, 404, or 413 until the request, credentials, permissions, balance, or input size is corrected.
Include the request ID from the error message or response headers when contacting support.
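
The retry guidance above can be sketched as a small wrapper. This is one possible policy, using capped exponential backoff with full jitter; the base delay, cap, and attempt count are illustrative values, and send stands for any caller-supplied function that performs the request and returns (status, body).

```python
import random
import time

RETRYABLE_STATUSES = {408, 429, 500, 502, 503, 504}


def should_retry(status: int) -> bool:
    """Return True for statuses that are worth retrying."""
    return status in RETRYABLE_STATUSES


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng=random.random) -> float:
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(send, max_attempts: int = 5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if not should_retry(status) or attempt == max_attempts - 1:
            return status, body
        sleep(backoff_delay(attempt))
```

Statuses such as 400, 401, 403, 404, and 413 are returned immediately, so the caller can correct the request instead of repeating it.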

Registered but Unsupported

The following OpenAI-compatible route families are registered only so that they return a clear 501 response. They are not supported product features.

/v1/files
/v1/fine_tuning/jobs
/v1/assistants
/v1/threads
/v1/images/variations
DELETE /v1/models/:model
