Chat Completion
Send messages to a chat model and receive a response. Compatible with the OpenAI Chat Completions protocol. Supports multi-turn conversations when you include the message history in the request.
Authorizations
Authenticate using a Bearer token. Get your API key from the WeryAI Console.
Example: Authorization: Bearer sk-xxxxxxxxxxxxxxxx
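A minimal sketch of building that header in Python. The helper name `build_auth_headers` is hypothetical, the `Content-Type` header is an assumption based on OpenAI-style JSON requests, and the key is the placeholder shown above.

```python
def build_auth_headers(api_key: str) -> dict:
    """Return HTTP headers carrying the API key as a Bearer token."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }


headers = build_auth_headers("sk-xxxxxxxxxxxxxxxx")
```

In practice the key would usually be read from an environment variable rather than written into source code.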
Body
Chat model key. Use the /v1/chat/models endpoint to get available models.
| Model Name | Model Key |
|---|---|
| GPT-5.5 | GPT_5_5 |
| GPT-5.4 | GPT_5_4 |
| Claude-Fable-5 | CLAUDE_FABLE_5 |
| Claude-4.8-Opus | CLAUDE_4_8_OPUS |
| Claude-4.6-Opus | CLAUDE_4_6_OPUS |
| Gemini-3.5-Flash | GEMINI_3_5_FLASH |
| Gemini-3.1-Pro | GEMINI_3_1_PRO |
| GPT-5.1 | GPT_5_1 |
| Claude-4.5-Opus | CLAUDE_4_5_OPUS |
| GPT-5 | GPT_5 |
| DeepSeek-R1 | DEEPSEEK_R1 |
| Kimi K2 Thinking | KIMI_K2_THINKING |
| QwQ 32B | QWEN_QWQ_32B |
| Grok-4 | GROK_4 |
| Claude-Sonnet-4.6 | CLAUDE_SONNET_4_6 |
| Gemini-3.1-Flash-Lite | GEMINI_3_1_FLASH_LITE |
| Qwen3.5 Plus | QWEN_3_5_PLUS |
| GLM 5 | GLM_5 |
| Kimi K2.5 | KIMI_K2_5 |
| Claude-Sonnet-4.5 | CLAUDE_SONNET_4_5 |
| GPT-4o | GPT_4O |
| GPT-4.1 | GPT_4_1 |
| Gemini-2.5-Pro | GEMINI_25_PRO |
| GLM 4.7 Flash | GLM_4_7_FLASH |
| Gemini-2.5-Flash | GEMINI_25_FLASH |
| Seed-2.0-Mini | SEED_2_0_MINI |
| Claude-4-Opus | CLAUDE_4_OPUS |
| Claude-4-Sonnet | CLAUDE_4_SONNET |
"GEMINI_25_FLASH"
Message list for the conversation. Supports multi-turn by including history.
1 - 50 elements
[
  {
    "role": "user",
    "content": "What is artificial intelligence?"
  }
]
Maximum number of tokens to generate. Default 1024. The upper limit depends on the model (use the model list endpoint to check).
x >= 1
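The message list and token limit described above can be assembled roughly as follows. This is a sketch under assumptions: the field names `model`, `messages`, and `max_tokens` and the `assistant` role are taken from the OpenAI Chat Completions protocol rather than confirmed here, the helper names are hypothetical, and dropping the oldest messages is only one possible way to stay within the 50-element limit.

```python
MAX_MESSAGES = 50  # documented upper bound on the message list


def append_turn(history, role, content):
    """Return a new history with one more message, capped at MAX_MESSAGES."""
    updated = history + [{"role": role, "content": content}]
    # Assumption: dropping the oldest messages is an acceptable trimming policy.
    return updated[-MAX_MESSAGES:]


def build_payload(model_key, history, max_tokens=1024):
    """Assemble a request body; field names assume the OpenAI-style schema."""
    if not 1 <= len(history) <= MAX_MESSAGES:
        raise ValueError("messages must contain between 1 and 50 elements")
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    return {"model": model_key, "messages": history, "max_tokens": max_tokens}


history = append_turn([], "user", "What is artificial intelligence?")
history = append_turn(history, "assistant", "(previous model reply goes here)")
history = append_turn(history, "user", "Give one everyday example.")
payload = build_payload("GEMINI_25_FLASH", history)
```

Each new request resends the full history, so the model sees earlier turns only if they are still in the list.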
Controls randomness of the output. Higher values produce more diverse results. Default 1.
0 <= x <= 2
Nucleus sampling parameter. Limits cumulative probability of candidate tokens. Default 1.
0 <= x <= 1
Penalizes tokens that have already appeared, which encourages new topics and reduces repetition.
-2 <= x <= 2
Penalizes frequent tokens to reduce repetition.
-2 <= x <= 2
Random seed. The same seed with the same input produces deterministic results.
Number of responses to generate. Default 1.
x >= 1
Whether to stream the response
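The numeric ranges listed above can be checked on the client before a request is sent. The sketch below assumes the OpenAI-style field names (`max_tokens`, `temperature`, `top_p`, `presence_penalty`, `frequency_penalty`, `n`), which are not confirmed here, and the function name is hypothetical.

```python
def validate_sampling_params(params: dict) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    # (low, high) bounds from the documented ranges; None means no upper bound.
    bounds = {
        "max_tokens": (1, None),
        "temperature": (0, 2),
        "top_p": (0, 1),
        "presence_penalty": (-2, 2),
        "frequency_penalty": (-2, 2),
        "n": (1, None),
    }
    problems = []
    for name, (low, high) in bounds.items():
        if name not in params:
            continue  # every field checked here is optional
        value = params[name]
        if value < low or (high is not None and value > high):
            problems.append(f"{name}={value} is outside the allowed range")
    return problems


ok = validate_sampling_params({"temperature": 0.7, "top_p": 0.9, "n": 1})
bad = validate_sampling_params({"temperature": 3, "max_tokens": 0})
```

The per-model upper limit on the token count is not checked here, since it has to be looked up from the model list endpoint.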
Optional plugins for the chat request.
Web Search
Some models support web search. Whether web search takes effect depends on the selected model and upstream provider capabilities. Gemini models are currently integrated with Google Search; for other models, the plugins parameter is passed through and upstream support determines whether it works.
Enable web search with:
{
"plugins": [
{ "id": "web" }
]
}
Notes:
- plugins[].id = "web" requests web search.
- When Gemini models use web search, response_format.type = "json_schema" cannot be used at the same time.
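The restriction in the notes can be caught before the request is sent. This is a sketch only: the function name is hypothetical, and detecting Gemini models by a `GEMINI_` key prefix is an assumption drawn from the model keys listed earlier, not a documented rule.

```python
def check_web_search_conflict(payload: dict) -> None:
    """Raise ValueError if web search and json_schema output are combined on Gemini."""
    uses_web = any(p.get("id") == "web" for p in payload.get("plugins", []))
    response_format = payload.get("response_format") or {}
    wants_schema = response_format.get("type") == "json_schema"
    # Assumption: Gemini model keys all start with "GEMINI_".
    is_gemini = str(payload.get("model", "")).startswith("GEMINI_")
    if uses_web and wants_schema and is_gemini:
        raise ValueError(
            "Gemini models cannot combine web search with "
            "response_format.type = 'json_schema'"
        )


# A request that only enables web search passes the check.
check_web_search_conflict(
    {"model": "GEMINI_25_FLASH", "plugins": [{"id": "web"}]}
)
```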
[{ "id": "web" }]
Response
Chat completed successfully
OpenAI-compatible Chat Completion response
Unique identifier for the chat completion
"chatcmpl-abc123def456"
Object type, always "chat.completion"
"chat.completion"
Unix timestamp (in seconds) of when the completion was created
1711929600
The model used for this completion
"GEMINI_25_FLASH"
List of completion choices
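As an illustration of reading the top-level response fields described so far, the sketch below assumes the OpenAI-style key names `id`, `object`, `created`, `model`, and `choices`, which are not spelled out here. The function name is hypothetical, and the sample reuses the example values shown above with an empty choices list.

```python
from datetime import datetime, timezone


def summarize_completion(response: dict) -> dict:
    """Extract basic metadata from a chat completion response."""
    if response.get("object") != "chat.completion":
        raise ValueError("unexpected object type: %r" % response.get("object"))
    # "created" is a Unix timestamp in seconds; convert it to UTC.
    created = datetime.fromtimestamp(response["created"], tz=timezone.utc)
    return {
        "id": response["id"],
        "model": response["model"],
        "created_at": created.isoformat(),
        "num_choices": len(response.get("choices", [])),
    }


sample = {
    "id": "chatcmpl-abc123def456",
    "object": "chat.completion",
    "created": 1711929600,
    "model": "GEMINI_25_FLASH",
    "choices": [],
}
summary = summarize_completion(sample)
```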
