glm-z1-rumination-32b
Provider: openrouter
Type: completions
THUDM: GLM Z1 Rumination 32B is a 32B-parameter deep reasoning model from the GLM-4-Z1 series, optimized for complex, open-ended tasks requiring prolonged deliberation. It builds upon glm-4-32b-0414 with additional reinforcement learning phases and multi-stage alignment strategies, introducing “rumination” capabilities designed to emulate extended cognitive processing. This includes iterative reasoning, multi-hop analysis, and tool-augmented workflows such as search, retrieval, and citation-aware synthesis. The model excels in research-style writing, comparative analysis, and intricate question answering. It supports function calling for search and navigation primitives (`search`, `click`, `open`, `finish`), enabling use in agent-style pipelines. Rumination behavior is governed by multi-turn loops with rule-based reward shaping and delayed decision mechanisms, benchmarked against Deep Research frameworks such as OpenAI’s internal alignment stacks. This variant is suitable for scenarios requiring depth over speed.
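The model card above mentions function calling for the search and navigation primitives `search`, `click`, `open`, and `finish`. As a minimal sketch of how those primitives could be exposed through an OpenAI-compatible client, the example below defines them as tool schemas and sends a single request. The gateway base URL, the model identifier, and the tool parameter schemas are illustrative assumptions, not values confirmed by this page.

```python
# Hedged sketch: exposing the model's search/navigation primitives
# (search, click, open, finish) as OpenAI-style tools.
# Base URL, model slug, and tool schemas are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-langdb-gateway>/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_LANGDB_API_KEY",
)

# Minimal JSON-schema definitions for the primitives named in the model card.
tools = [
    {"type": "function", "function": {
        "name": "search",
        "description": "Run a web search query.",
        "parameters": {"type": "object",
                       "properties": {"query": {"type": "string"}},
                       "required": ["query"]}}},
    {"type": "function", "function": {
        "name": "click",
        "description": "Click a result by index from the last search.",
        "parameters": {"type": "object",
                       "properties": {"index": {"type": "integer"}},
                       "required": ["index"]}}},
    {"type": "function", "function": {
        "name": "open",
        "description": "Open a URL and return its contents.",
        "parameters": {"type": "object",
                       "properties": {"url": {"type": "string"}},
                       "required": ["url"]}}},
    {"type": "function", "function": {
        "name": "finish",
        "description": "End the rumination loop and emit the final answer.",
        "parameters": {"type": "object", "properties": {}}}},
]

response = client.chat.completions.create(
    model="openrouter/glm-z1-rumination-32b",  # exact routing slug may differ
    messages=[{"role": "user",
               "content": "Compare the main approaches to retrieval-augmented deep research."}],
    tools=tools,
)
print(response.choices[0].message)
```

In an agent-style pipeline, any returned tool calls would be executed by your own search/browse implementation and fed back as `tool` messages until the model emits `finish`.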

Input: $0.24 / 1M tokens
Output: $0.24 / 1M tokens
Context: 32K tokens
Modalities: text → text

Access glm-z1-rumination-32b through LangDB AI Gateway

Recommended

Integrate with THUDM's glm-z1-rumination-32b and 250+ other models through a unified API. Monitor usage, control costs, and enhance security.

Unified API
Cost Optimization
Enterprise Security
Get Started Now

Free tier available • No credit card required

Instant Setup
99.9% Uptime
10,000+ Monthly Requests
Code Example
Configuration
Base URL
API Keys
Headers
Project ID in header
X-Run-Id
X-Thread-Id
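As a rough sketch of how the configuration values above fit together, the snippet below passes the base URL, an API key, and the project/run/thread headers to an OpenAI-compatible client. The base URL and the exact project-ID header name are placeholders (assumptions); `X-Run-Id` and `X-Thread-Id` are the header names listed above.

```python
# Sketch of client configuration; the base URL and the project-ID header name
# are placeholders to replace with values from your LangDB project settings.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-langdb-gateway>/v1",   # Base URL
    api_key="YOUR_LANGDB_API_KEY",                 # API key
    default_headers={
        "x-project-id": "YOUR_PROJECT_ID",         # assumed header name for the project ID
        "X-Run-Id": "run-2025-05-01-001",          # optional: group calls into a run
        "X-Thread-Id": "thread-research-42",       # optional: stitch calls into a thread
    },
)
```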
Model Parameters (12 available)
  • frequency_penalty (min: -2, default: 0, max: 2)
  • include_reasoning
  • logit_bias
  • max_tokens
  • min_p (min: 0, default: 0, max: 1)
  • presence_penalty (min: -2, default: 0, max: 1.999)
  • repetition_penalty (min: 0, default: 1, max: 2)
  • seed
  • stop
  • temperature (min: 0, default: 1, max: 2)
  • top_k
  • top_p (min: 0, default: 1, max: 1)
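Reading each range above as min / default / max, a request that overrides a few of these parameters might look like the sketch below. Standard sampling controls map to native arguments of the OpenAI SDK, while OpenRouter-style extras such as `repetition_penalty`, `min_p`, `top_k`, and `include_reasoning` are forwarded through `extra_body`; treat those pass-through field names as assumptions to verify against the gateway's parameter reference.

```python
# Sketch of a completion request that stays within the parameter ranges listed above.
from openai import OpenAI

client = OpenAI(base_url="https://<your-langdb-gateway>/v1",  # placeholder endpoint
                api_key="YOUR_LANGDB_API_KEY")

response = client.chat.completions.create(
    model="openrouter/glm-z1-rumination-32b",
    messages=[{"role": "user",
               "content": "Summarize the trade-offs of multi-hop retrieval."}],
    max_tokens=2048,
    temperature=0.6,          # range 0–2, default 1
    top_p=0.95,               # range 0–1, default 1
    frequency_penalty=0.2,    # range -2–2, default 0
    presence_penalty=0.0,     # range -2–1.999, default 0
    seed=7,
    stop=["</answer>"],
    # Parameters without a native SDK argument are sent as-is in the request body;
    # the field names mirror the list above (assumed pass-through).
    extra_body={
        "repetition_penalty": 1.05,   # range 0–2, default 1
        "min_p": 0.05,                # range 0–1, default 0
        "top_k": 40,
        "include_reasoning": True,
    },
)
print(response.choices[0].message.content)
```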
Additional Configuration
Tools
Guards
User: Id, Name, Tags
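How the Tools, Guards, and User fields are attached to a request is gateway-specific and not specified on this page. The fragment below is a purely hypothetical sketch: `user` is a standard OpenAI-compatible request field, while the `extra_body` keys for the name and tags are invented placeholders that should be checked against LangDB's documentation before use.

```python
# Purely hypothetical sketch: attaching user identification to a request.
from openai import OpenAI

client = OpenAI(base_url="https://<your-langdb-gateway>/v1",  # placeholder endpoint
                api_key="YOUR_LANGDB_API_KEY")

response = client.chat.completions.create(
    model="openrouter/glm-z1-rumination-32b",
    messages=[{"role": "user", "content": "Draft a literature-review outline."}],
    user="user-8421",                       # Id (standard OpenAI-compatible field)
    extra_body={
        "user_name": "research-team",       # hypothetical field for Name
        "user_tags": ["research", "beta"],  # hypothetical field for Tags
    },
)
```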
Publicly Shared Threads (0)

Discover shared experiences

Shared threads will appear here, showcasing real-world applications and insights from the community. Check back soon for updates!

Share your threads to help others
Popular Models (10)
  • gpt-4o-mini-search-preview (openai)
    GPT-4o mini Search Preview is a specialized model trained to understand and execute web search queries with the Chat Completions API.
    Input: $0.15 / 1M tokens
    Output: $0.6 / 1M tokens
    Context: 128K tokens
    Modalities: text → text
  • gpt-4o-search-preview (openai)
    GPT-4o Search Preview is a specialized model trained to understand and execute web search queries with the Chat Completions API.
    Input: $2.5 / 1M tokens
    Output: $10 / 1M tokens
    Context: 128K tokens
    Modalities: text → text
  • claude-sonnet-4 (anthropic)
    Anthropic's high-performance model with exceptional reasoning and efficiency.
    Input: $3 / 1M tokens
    Output: $15 / 1M tokens
    Context: 200K tokens
    Capabilities: tools · Modalities: text, image → text
  • claude-opus-4 (anthropic)
    Anthropic's most capable and intelligent model yet. Claude Opus 4 sets new standards in complex reasoning and advanced coding.
    Input: $15 / 1M tokens
    Output: $75 / 1M tokens
    Context: 200K tokens
    Capabilities: tools · Modalities: text, image → text
  • gpt-4.1 (openai)
    GPT-4.1 is OpenAI's flagship model for complex tasks. It is well suited for problem solving across domains.
    Input: $2 / 1M tokens
    Output: $8 / 1M tokens
    Context: 1,047,576 tokens
    Capabilities: tools · Modalities: text, image → text
  • gemini-2.5-pro-preview (gemini)
    Gemini 2.5 Pro Experimental is Google's state-of-the-art thinking model, capable of reasoning over complex problems in code, math, and STEM, as well as analyzing large datasets, codebases, and documents using long context.
    Input: $1.25 / 1M tokens
    Output: $10 / 1M tokens
    Context: 1M tokens
    Capabilities: tools · Modalities: text, image, audio, video → text
  • gemini-2.5-flash-preview (gemini)
    Google's best model in terms of price-performance, offering well-rounded capabilities. Gemini 2.5 Flash rate limits are more restricted since it is an experimental / preview model.
    Input: $0.15 / 1M tokens
    Output: $0.6 / 1M tokens
    Context: 1M tokens
    Capabilities: tools · Modalities: text, image, audio, video → text
  • gemini-2.0-flash (gemini)
    Google's most capable multi-modal model, delivering strong performance across all tasks with a 1 million token context window, built for the era of agents.
    Input: $0.1 / 1M tokens
    Output: $0.4 / 1M tokens
    Context: 1M tokens
    Capabilities: tools · Modalities: text, image, audio, video → text
  • claude-3.7-sonnet (anthropic)
    Intelligent model with visible step-by-step reasoning.
    Input: $3 / 1M tokens
    Output: $15 / 1M tokens
    Context: 200K tokens
    Capabilities: tools · Modalities: text, image → text
  • gemini-2.0-flash-lite (gemini)
    Google's smallest and most cost-effective model, built for at-scale usage.
    Input: $0.07 / 1M tokens
    Output: $0.3 / 1M tokens
    Context: 1M tokens
    Modalities: text, image, audio, video → text

Related AI Model Resources

Explore more AI models, providers, and integration options:

  • Browse All AI Models
  • AI Providers Directory
  • More from openrouter
  • MCP Servers
  • Integration Documentation
  • Pricing & Plans
  • AI Industry Blog