GoRules AI requires an LLM provider to be configured on your BRMS instance. Your administrator sets this up via environment variables on the server.
## Supported LLM providers
| Provider | LLM_PROVIDER value | Supported models |
|---|---|---|
| OpenAI | openai | OpenAI models |
| Anthropic (Claude) | anthropic | Anthropic models |
| Google (Gemini) | google | Gemini models |
| Amazon Bedrock | amazon-bedrock | Anthropic models |
| Google Vertex AI | google-vertex | Gemini models |
| Azure OpenAI | azure-openai | OpenAI models |
Vertex AI and Azure currently support only their native model families. If you need cross-provider model support (e.g., Anthropic models on Vertex AI or Azure), please contact us; we are happy to add it based on customer demand.
## Environment variables
| Variable | Description | Default |
|---|---|---|
| LLM_PROVIDER | LLM provider to use | Required |
| LLM_MODEL | Model name (e.g., gpt-5.4, claude-sonnet-4-6, claude-opus-4-6, gemini-3.1-pro-preview, eu.anthropic.claude-opus-4-6-v1) | Required |
| LLM_API_KEY | API key for the provider (not required for Amazon Bedrock and Vertex providers) | Required |
| LLM_BASE_URL | Custom base URL for OpenAI-compatible endpoints | — |
| LLM_TEMPERATURE | Sampling temperature (applies to Gemini/Google providers only) | 0.4 |
| LLM_CONTEXT_WINDOW | Context window size in tokens | Provider default |
| LLM_MAX_OUTPUT_TOKENS | Maximum tokens per response | 32000 |
| LLM_THINKING_LEVEL | Extended thinking level: high or medium | medium |
| LLM_AZURE_RESOURCE_NAME | Azure OpenAI resource name (required for azure-openai) | — |
| LLM_GCP_PROJECT | GCP project ID (required for google-vertex) | — |
| LLM_GCP_LOCATION | GCP region for Vertex AI (required for google-vertex, e.g. global) | — |
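As an illustration, a minimal configuration for the Anthropic provider might look like the following. This is a sketch: the model name is one of the examples from the table above, and the API key is a placeholder, not a real credential.

```shell
# Minimal GoRules AI configuration using Anthropic directly.
export LLM_PROVIDER="anthropic"
export LLM_MODEL="claude-sonnet-4-6"
export LLM_API_KEY="placeholder-api-key"   # placeholder; not needed for amazon-bedrock / google-vertex
export LLM_THINKING_LEVEL="medium"         # optional; defaults to medium
export LLM_MAX_OUTPUT_TOKENS="32000"       # optional; defaults to 32000
```

For Amazon Bedrock or Google Vertex AI, omit LLM_API_KEY and rely on your cloud credentials, setting LLM_GCP_PROJECT and LLM_GCP_LOCATION for Vertex AI as noted above.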
## Prompt caching

GoRules AI uses prompt caching to reduce token usage and improve response times. Caching behavior depends on the provider:

| Provider | Caching |
|---|---|
| Anthropic (direct) | cacheControl: ephemeral |
| Amazon Bedrock (Anthropic models) | cachePoint on messages |
| OpenAI | Automatic (prefix caching) |
| Azure OpenAI | Automatic (prefix caching) |
| Gemini/Google (direct & Vertex AI) | Automatic (implicit caching) |
For self-hosted deployments, disable response buffering (or enable streaming) on your load balancer so that AI assistant responses stream to the browser as they are generated, rather than arriving all at once.
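For example, on an nginx-based load balancer the relevant directive is `proxy_buffering`. This is a sketch only; the upstream name `gorules-brms` is a placeholder, and you should adapt it to your own proxy setup.

```nginx
# Sketch: stream AI assistant responses through nginx instead of buffering them.
location / {
    proxy_pass http://gorules-brms;   # placeholder upstream for your BRMS instance
    proxy_buffering off;              # deliver response chunks as they arrive
    proxy_read_timeout 300s;          # allow long-running AI responses
}
```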