POST /v1/chat/completions
Chat Completion
curl --request POST \
  --url https://api.weryai.com/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "GEMINI_25_FLASH",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is artificial intelligence?"
    }
  ],
  "max_tokens": 1024,
  "temperature": 1,
  "top_p": 1,
  "n": 1
}
'
{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1711929600,
  "model": "GEMINI_25_FLASH",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Artificial intelligence (AI) refers to the simulation of human intelligence in machines..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
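
The response above can be read in a few lines of Python. This is a minimal sketch, not official client code: `first_reply` and `usage_is_consistent` are hypothetical helper names, and the check that `total_tokens` equals `prompt_tokens + completion_tokens` is inferred from the sample values only.

```python
import json

# Sample response body copied from the example above.
SAMPLE_RESPONSE = """
{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1711929600,
  "model": "GEMINI_25_FLASH",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Artificial intelligence (AI) refers to the simulation of human intelligence in machines..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175}
}
"""


def first_reply(response: dict) -> str:
    """Return the assistant text of the first choice (hypothetical helper)."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


def usage_is_consistent(response: dict) -> bool:
    """Check total_tokens against the sum seen in the sample (an assumption)."""
    usage = response["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]


response = json.loads(SAMPLE_RESPONSE)
print(first_reply(response))
print(response["choices"][0]["finish_reason"])
```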

Authorizations

Authorization
string
header
required

Authenticate using a Bearer token. Get your API key from the WeryAI Console.

Example: Authorization: Bearer sk-xxxxxxxxxxxxxxxx
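
As a sketch of how the header fits into a request, the snippet below builds (but does not send) a POST request with Python's standard library. `build_request` is a hypothetical helper name and the `sk-...` key is a placeholder; only the URL, the two headers, and the JSON body shape come from this page.

```python
import json
import urllib.request

API_URL = "https://api.weryai.com/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated POST request; nothing is sent here."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "sk-xxxxxxxxxxxxxxxx",  # placeholder key, replace with your own
    {
        "model": "GEMINI_25_FLASH",
        "messages": [{"role": "user", "content": "What is artificial intelligence?"}],
    },
)
print(request.get_method(), request.full_url)
```

To actually send it, pass the object to `urllib.request.urlopen(request)` and decode the JSON body of the result.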

Body

application/json
model
string
required

Chat model key. Use the /v1/chat/models endpoint to get available models.

Model Name               Model Key
GPT-5.5                  GPT_5_5
GPT-5.4                  GPT_5_4
Claude-Fable-5           CLAUDE_FABLE_5
Claude-4.8-Opus          CLAUDE_4_8_OPUS
Claude-4.6-Opus          CLAUDE_4_6_OPUS
Gemini-3.5-Flash         GEMINI_3_5_FLASH
Gemini-3.1-Pro           GEMINI_3_1_PRO
GPT-5.1                  GPT_5_1
Claude-4.5-Opus          CLAUDE_4_5_OPUS
GPT-5                    GPT_5
DeepSeek-R1              DEEPSEEK_R1
Kimi K2 Thinking         KIMI_K2_THINKING
QwQ 32B                  QWEN_QWQ_32B
Grok-4                   GROK_4
Claude-Sonnet-4.6        CLAUDE_SONNET_4_6
Gemini-3.1-Flash-Lite    GEMINI_3_1_FLASH_LITE
Qwen3.5 Plus             QWEN_3_5_PLUS
GLM 5                    GLM_5
Kimi K2.5                KIMI_K2_5
Claude-Sonnet-4.5        CLAUDE_SONNET_4_5
GPT-4o                   GPT_4O
GPT-4.1                  GPT_4_1
Gemini-2.5-Pro           GEMINI_25_PRO
GLM 4.7 Flash            GLM_4_7_FLASH
Gemini-2.5-Flash         GEMINI_25_FLASH
Seed-2.0-Mini            SEED_2_0_MINI
Claude-4-Opus            CLAUDE_4_OPUS
Claude-4-Sonnet          CLAUDE_4_SONNET
Example:

"GEMINI_25_FLASH"

messages
object[]
required

Message list for the conversation. For a multi-turn conversation, include the earlier messages as history.

Required array length: 1 - 50 elements
Example:
[
  {
    "role": "user",
    "content": "What is artificial intelligence?"
  }
]
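
Multi-turn use amounts to resending the growing message list on every call. The sketch below shows one way to maintain that list while respecting the 1 to 50 element limit; `add_message` is a hypothetical helper, and raising an error at the limit (rather than trimming old turns) is a design choice of this example, not documented API behavior.

```python
MAX_MESSAGES = 50  # upper bound of the documented array length


def add_message(history: list, role: str, content: str) -> list:
    """Return a new history with one more message, enforcing the length limit."""
    updated = history + [{"role": role, "content": content}]
    if len(updated) > MAX_MESSAGES:
        raise ValueError(f"messages may contain at most {MAX_MESSAGES} elements")
    return updated


history = []
history = add_message(history, "system", "You are a helpful assistant.")
history = add_message(history, "user", "What is artificial intelligence?")
# After a call, append the assistant reply and the next user turn, then resend.
history = add_message(history, "assistant", "Artificial intelligence (AI) refers to...")
history = add_message(history, "user", "Give me one everyday example.")
print(len(history))
```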
max_tokens
integer
default:1024

Maximum number of tokens to generate. Default 1024. The upper limit depends on the model (use the model list endpoint to check).

Required range: x >= 1
Example:

1024

temperature
number
default:1

Controls randomness of the output. Higher values produce more diverse results.

Required range: 0 <= x <= 2
Example:

1

top_p
number
default:1

Nucleus sampling parameter. Sampling is restricted to the smallest set of candidate tokens whose cumulative probability reaches top_p.

Required range: 0 <= x <= 1
Example:

1

presence_penalty
number

Penalizes tokens that have already appeared in the text, which encourages the model to move on to new topics and reduces repetition.

Required range: -2 <= x <= 2
frequency_penalty
number

Penalizes tokens in proportion to how often they have already appeared, which reduces repetition.

Required range: -2 <= x <= 2
seed
integer

Random seed. Repeating a request with the same seed and the same input produces deterministic results.

n
integer
default:1

Number of responses to generate

Required range: x >= 1
Example:

1
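
The required ranges listed for the numeric parameters can be checked on the client before a request is sent. This is a minimal sketch under that assumption: `check_params` is a hypothetical helper, and how the server itself reports out-of-range values is not described on this page.

```python
# Closed ranges taken from the "Required range" lines above.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
}
# Parameters that only have a lower bound.
MINIMUMS = {"max_tokens": 1, "n": 1}


def check_params(params: dict) -> list:
    """Return a list of human-readable range violations (empty if none)."""
    errors = []
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            errors.append(f"{name} must be between {low} and {high}")
    for name, minimum in MINIMUMS.items():
        if name in params and params[name] < minimum:
            errors.append(f"{name} must be >= {minimum}")
    return errors


print(check_params({"max_tokens": 1024, "temperature": 1, "top_p": 1, "n": 1}))
print(check_params({"temperature": 2.5, "n": 0}))
```

Parameters that are absent from the dictionary are skipped, so the documented defaults still apply on the server side.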

stream
boolean
default:false

Whether to stream the response

plugins
object[]

Optional plugins for the chat request.

Web Search

Some models support web search. Whether web search takes effect depends on the selected model and upstream provider capabilities. Gemini models are currently integrated with Google Search; for other models, the plugins parameter is passed through and upstream support determines whether it works.

Enable web search with:

{
  "plugins": [
    { "id": "web" }
  ]
}

Notes:

  • plugins[].id = "web" requests web search.
  • When Gemini models use web search, response_format.type = "json_schema" cannot be used at the same time.
Example:
[{ "id": "web" }]
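
The two notes above can be combined into a small client-side guard. This sketch assumes that Gemini model keys start with `GEMINI_` (true of every Gemini key in the model table, but not stated as a rule) and that `response_format` is passed as a top-level body field; `with_web_search` is a hypothetical helper name.

```python
def with_web_search(payload: dict) -> dict:
    """Return a copy of the payload with the web search plugin enabled."""
    model = str(payload.get("model", ""))
    response_format = payload.get("response_format") or {}
    # Assumption: Gemini model keys share the "GEMINI_" prefix.
    if model.startswith("GEMINI_") and response_format.get("type") == "json_schema":
        raise ValueError(
            "Gemini web search cannot be combined with response_format.type = json_schema"
        )
    plugins = list(payload.get("plugins", []))
    if not any(plugin.get("id") == "web" for plugin in plugins):
        plugins.append({"id": "web"})
    return {**payload, "plugins": plugins}


payload = with_web_search(
    {
        "model": "GEMINI_25_FLASH",
        "messages": [{"role": "user", "content": "What happened in AI this week?"}],
    }
)
print(payload["plugins"])
```

For non-Gemini models the plugin is simply added, since the parameter is passed through and upstream support decides whether it takes effect.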

Response

Chat completed successfully

OpenAI-compatible Chat Completion response

id
string

Unique identifier for the chat completion

Example:

"chatcmpl-abc123def456"

object
string

Object type, always "chat.completion"

Example:

"chat.completion"

created
integer<int64>

Unix timestamp (in seconds) of when the completion was created

Example:

1711929600

model
string

The model used for this completion

Example:

"GEMINI_25_FLASH"

choices
object[]

List of completion choices

usage
object