
ZAI Backend

The ZAI backend provides access to Zhipu AI (Z.ai) models through an OpenAI-compatible API. ZAI offers strong Chinese-language models as well as models specialized for coding.

Overview

ZAI (Zhipu AI) is a Chinese AI company that provides access to various language models, including GLM models and specialized coding models. The proxy supports two ZAI backend configurations: standard zai and zai-coding-plan for coding-specific workflows.

Key Features

  • OpenAI-compatible API
  • Strong Chinese language support
  • Specialized coding models
  • Competitive pricing
  • Streaming and non-streaming responses

Configuration

Environment Variables

export ZAI_API_KEY="..."

CLI Arguments

# Start proxy with ZAI as default backend
python -m src.core.cli --default-backend zai

# Or use the coding plan backend
python -m src.core.cli --default-backend zai-coding-plan

YAML Configuration

# config.yaml
backends:
  zai:
    type: zai
  zai-coding-plan:
    type: zai-coding-plan

default_backend: zai
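As a rough sketch of how the YAML above resolves at request time (the proxy's real loader is internal; the dict here simply mirrors config.yaml, and `resolve_backend` is an illustrative helper, not proxy API):

```python
# Mirrors the config.yaml example above as a plain dict.
config = {
    "backends": {
        "zai": {"type": "zai"},
        "zai-coding-plan": {"type": "zai-coding-plan"},
    },
    "default_backend": "zai",
}

def resolve_backend(cfg, name=None):
    """Return (name, backend entry), falling back to default_backend."""
    name = name or cfg["default_backend"]
    return name, cfg["backends"][name]

name, backend = resolve_backend(config)
print(name, backend["type"])  # zai zai
```

A request that names no backend falls through to `default_backend`; naming `zai-coding-plan` explicitly selects the coding variant.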

Backend Variants

Standard ZAI Backend

The standard zai backend provides access to all ZAI models:

python -m src.core.cli --default-backend zai

ZAI Coding Plan Backend

The zai-coding-plan backend is optimized for coding workflows and works with any supported front-end and coding agent:

python -m src.core.cli --default-backend zai-coding-plan

This backend is specifically designed for:

  • Code generation and completion
  • Code review and refactoring
  • Debugging assistance
  • Technical documentation

Usage Examples

Basic Chat Completion

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_PROXY_KEY" \
  -d '{
    "model": "glm-4",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Coding Task

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_PROXY_KEY" \
  -d '{
    "model": "glm-4",
    "messages": [
      {"role": "user", "content": "Write a Python function to sort a list"}
    ]
  }'

Streaming Response

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_PROXY_KEY" \
  -d '{
    "model": "glm-4",
    "messages": [
      {"role": "user", "content": "Explain async programming"}
    ],
    "stream": true
  }'
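With "stream": true the proxy returns OpenAI-style server-sent events. A minimal client-side sketch of reassembling the streamed deltas (the sample `data:` lines below are illustrative, not captured output):

```python
import json

# Illustrative SSE lines in the OpenAI-compatible streaming shape.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Async "}}]}',
    'data: {"choices": [{"delta": {"content": "programming..."}}]}',
    "data: [DONE]",
]

def assemble(lines):
    """Concatenate delta.content fields from SSE data lines."""
    parts = []
    for line in lines:
        payload = line.removeprefix("data: ").strip()
        if payload == "[DONE]":  # terminal sentinel ends the stream
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

print(assemble(sample_stream))  # Async programming...
```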

Chinese Language Task

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_PROXY_KEY" \
  -d '{
    "model": "glm-4",
    "messages": [
      {"role": "user", "content": "请解释一下机器学习的基本概念"}
    ]
  }'

Use Cases

Chinese Language Applications

ZAI models excel at:

  • Chinese language understanding and generation
  • Chinese-English translation
  • Chinese text analysis
  • Chinese content creation

Coding Workflows

The zai-coding-plan backend is ideal for:

  • Code generation in multiple languages
  • Code review and suggestions
  • Debugging assistance
  • Technical documentation generation
  • Integration with coding agents

Cost-Effective Alternative

Use ZAI as:

  • A cost-effective alternative to Western providers
  • A specialized option for Chinese language tasks
  • A coding-focused backend for development workflows

Known Limitations

Reasoning Payloads

Reasoning payloads are currently stripped due to provider limitations. This means:

  • Reasoning models may not work as expected
  • Thinking/reasoning output is not preserved
  • Use standard models for best results

If you need reasoning capabilities, consider using other backends like OpenAI (o1 models) or the Hybrid Backend.
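To make the limitation concrete, here is a sketch of what "stripped" means for a response message. The field name `reasoning_content` is an assumption for illustration; the proxy's actual field handling is internal:

```python
def strip_reasoning(message):
    """Drop the (assumed) reasoning field, keeping only standard keys."""
    return {k: v for k, v in message.items() if k != "reasoning_content"}

msg = {"role": "assistant", "content": "42", "reasoning_content": "..."}
print(strip_reasoning(msg))  # {'role': 'assistant', 'content': '42'}
```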

Model Parameters

You can specify model parameters using URI syntax:

# With temperature
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zai:glm-4?temperature=0.7",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

See URI Model Parameters for more details.
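The model string decomposes as backend, model name, and query-string parameters. The proxy parses this internally; the helper below is an illustrative sketch only (and assumes numeric parameter values, which holds for temperature):

```python
from urllib.parse import parse_qsl

def parse_model_uri(uri):
    """Split 'backend:model?param=value' into its three parts."""
    backend, _, rest = uri.partition(":")
    model, _, query = rest.partition("?")
    params = {k: float(v) for k, v in parse_qsl(query)}
    return backend, model, params

print(parse_model_uri("zai:glm-4?temperature=0.7"))
# ('zai', 'glm-4', {'temperature': 0.7})
```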

Troubleshooting

401 Unauthorized

  • Verify your ZAI_API_KEY is set correctly
  • Check that the API key is valid and has credits
  • Ensure you're using the correct authentication header

429 Rate Limit Exceeded

  • ZAI has rate limits based on your account tier
  • Consider upgrading your account for higher limits
  • Use failover to switch to alternative models
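The proxy supports failover natively; purely to illustrate the idea, here is a hypothetical client-side fallback loop (`call_model`, `RateLimited`, and the fallback model name are all stand-ins, not proxy API):

```python
class RateLimited(Exception):
    """Stand-in for an HTTP 429 response."""

def call_model(model):
    # Stand-in for a real request; pretend glm-4 is rate limited.
    if model == "glm-4":
        raise RateLimited
    return f"ok from {model}"

def with_failover(models):
    """Try each model in order, skipping rate-limited ones."""
    for m in models:
        try:
            return call_model(m)
        except RateLimited:
            continue
    raise RuntimeError("all models rate limited")

print(with_failover(["glm-4", "glm-4-flash"]))  # ok from glm-4-flash
```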

Model Not Found

  • Verify the model name is correct (e.g., glm-4)
  • Check that your API key has access to the requested model
  • Some models may require special access

Chinese Character Encoding Issues

  • Ensure your client is using UTF-8 encoding
  • Check that the proxy is configured to handle UTF-8
  • Verify that your terminal/client supports Chinese characters
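A small sketch of the safe pattern on the client side: serialize the body with `ensure_ascii=False` so the Chinese characters stay readable, and send it as UTF-8 bytes:

```python
import json

prompt = "请解释一下机器学习的基本概念"
body = json.dumps(
    {"model": "glm-4", "messages": [{"role": "user", "content": prompt}]},
    ensure_ascii=False,   # keep Chinese characters unescaped
).encode("utf-8")         # send as UTF-8 bytes

# Round-trips cleanly when both sides speak UTF-8:
assert json.loads(body.decode("utf-8"))["messages"][0]["content"] == prompt
```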

Reasoning Not Working

  • Remember that reasoning payloads are stripped for ZAI
  • Use standard models instead of reasoning models
  • Consider using other backends for reasoning tasks

Integration with Coding Agents

The zai-coding-plan backend works seamlessly with coding agents:

# Point your coding agent to the proxy
export OPENAI_API_BASE=http://localhost:8000/v1
export OPENAI_API_KEY=YOUR_PROXY_KEY

# Start the proxy with ZAI coding plan
python -m src.core.cli --default-backend zai-coding-plan

# Your coding agent will now use ZAI models
