openai_chat_completion
beta
Generates responses to messages in a chat conversation, using the OpenAI API and external tools.
# Common configuration fields, showing default values
label: ""
openai_chat_completion:
  server_address: https://api.openai.com/v1
  api_key: "" # No default (required)
  model: gpt-4o # No default (required)
  prompt: "" # No default (optional)
  system_prompt: "" # No default (optional)
  history: "" # No default (optional)
  image: 'root = this.image.decode("base64") # decode base64 encoded image' # No default (optional)
  max_tokens: 0 # No default (optional)
  temperature: 0 # No default (optional)
  user: "" # No default (optional)
  response_format: text
  json_schema:
    name: "" # No default (required)
    schema: "" # No default (required)
  tools: [] # No default (required)
# All configuration fields, showing default values
label: ""
openai_chat_completion:
  server_address: https://api.openai.com/v1
  api_key: "" # No default (required)
  model: gpt-4o # No default (required)
  prompt: "" # No default (optional)
  system_prompt: "" # No default (optional)
  history: "" # No default (optional)
  image: 'root = this.image.decode("base64") # decode base64 encoded image' # No default (optional)
  max_tokens: 0 # No default (optional)
  temperature: 0 # No default (optional)
  user: "" # No default (optional)
  response_format: text
  json_schema:
    name: "" # No default (required)
    description: "" # No default (optional)
    schema: "" # No default (required)
  schema_registry:
    url: "" # No default (required)
    name_prefix: schema_registry_id_
    subject: "" # No default (required)
    refresh_interval: "" # No default (optional)
    tls:
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: "" # No default (optional)
      root_cas_file: "" # No default (optional)
      client_certs: [] # Optional
    oauth:
      enabled: false
      consumer_key: "" # No default (optional)
      consumer_secret: "" # No default (optional)
      access_token: "" # No default (optional)
      access_token_secret: "" # No default (optional)
    basic_auth:
      enabled: false
      username: "" # No default (optional)
      password: "" # No default (optional)
    jwt:
      enabled: false
      private_key_file: "" # No default (optional)
      signing_method: "" # No default (optional)
      claims: {} # Optional
      headers: {} # Optional
  top_p: 0 # No default (optional)
  frequency_penalty: 0 # No default (optional)
  presence_penalty: 0 # No default (optional)
  seed: 0 # No default (optional)
  stop: [] # No default (optional)
  tools: [] # No default (required)
This processor sends user prompts to the OpenAI API, and the specified large language model (LLM) generates responses using all available context, including supplementary data provided by external tools. By default, the processor submits the entire payload of each message as a string, unless you use the prompt configuration field to customize it.
To learn more about chat completion, see the OpenAI API documentation, and Examples.
Fields
server_address
The OpenAI API endpoint to which the processor sends requests. Update the default value to use a different OpenAI-compatible service.
Type: string
Default: "https://api.openai.com/v1"
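As a rough sketch, pointing the processor at a different OpenAI-compatible service might look like the following. The address and model name are placeholders chosen for illustration, not values taken from this reference.

```yaml
openai_chat_completion:
  # Hypothetical self-hosted endpoint that implements the OpenAI API
  server_address: http://localhost:8080/v1
  api_key: "${OPENAI_API_KEY}"
  # Placeholder: use whatever model name that service exposes
  model: my-local-model
```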
api_key
The secret API key for the OpenAI API.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.
Type: string
model
The name of the OpenAI model to use.
Type: string
# Examples
model: gpt-4o
model: gpt-4o-mini
model: gpt-4
model: gpt-4-turbo
prompt
The user prompt for which a response is generated. By default, the processor sends the entire payload as a string unless customized using this field.
Type: string
system_prompt
The system prompt to submit along with the user prompt. This field supports interpolation functions.
Type: string
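A minimal sketch of combining the prompt and system_prompt fields, assuming you want to wrap each message payload in a fixed instruction. The prompt wording is illustrative only.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  # Sets the assistant's overall behavior for every request
  system_prompt: "You are a support assistant. Answer in one short paragraph."
  # Interpolates the message payload into a larger user prompt
  prompt: "Summarize this customer message: ${!content().string()}"
```

Without the prompt field, the payload itself would be sent as the user prompt.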
history
Include messages from a prior conversation. You must use a Bloblang query to create an array of objects in the form of [{"role": "user", "content": "<text>"}, {"role":"assistant", "content":"<text>"}], where:
- role is the sender of the original messages, either system, user, or assistant.
- content is the text of the original messages.
For more information, see Examples.
Type: string
Default: ""
# Examples
history: [{"role": "user", "content": "My favorite color is blue"}, {"role":"assistant", "content":"Nice"}]
If the prompt is set to "What is my favorite color?", the specified model responds with blue.
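When the earlier turns arrive as fields on the message rather than as a ready-made array, a Bloblang mapping along these lines could assemble the history. This is a sketch only: previous_question, previous_answer, and question are hypothetical field names.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  prompt: "${!this.question}" # Hypothetical field holding the new question
  # Build a one-turn history from two hypothetical message fields
  history: |
    root = [
      {"role": "user", "content": this.previous_question},
      {"role": "assistant", "content": this.previous_answer}
    ]
```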
image
An optional image to submit along with the prompt. The result of the Bloblang mapping must be a byte array.
Type: string
# Examples
image: 'root = this.image.decode("base64")' # Decode base64 encoded image
temperature
Choose a sampling temperature between 0 and 2:
- Higher values, such as 0.8, make the output more random.
- Lower values, such as 0.2, make the output more focused and deterministic.
Redpanda recommends adding a value for this field or top_p, but not both.
Type: float
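For instance, a configuration that favors focused output might set a low temperature and leave top_p unset. The value 0.2 is only an illustration.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  # Low value for more focused, deterministic output; top_p is intentionally omitted
  temperature: 0.2
```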
user
A unique identifier that represents the end-user generating the prompt. This value can help OpenAI monitor and detect platform abuse. This field supports interpolation functions.
Type: string
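Because this field supports interpolation functions, the identifier can be taken from each message. The sketch below assumes the upstream input sets a metadata key named user_id, which is a hypothetical name.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  # "user_id" is a hypothetical metadata key set earlier in the pipeline
  user: '${! meta("user_id") }'
```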
response_format
Specify the configured model’s output format.
If you choose the json_schema option, you must also configure a json_schema or schema_registry.
Type: string
Default: text
Options: text, json, json_schema
json_schema
The JSON schema used by the model when generating responses in json_schema format. To learn more about supported JSON schema features, see the OpenAI documentation.
Type: object
json_schema.description
An optional description, which helps the model understand the schema’s purpose.
Type: string
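A sketch of an inline schema, assuming the response should be a single object with one string property. The schema name, description, and property are invented for illustration.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  response_format: json_schema
  json_schema:
    name: favorite_color # Hypothetical schema name
    description: The color the user says they like best
    # The schema is supplied as a string containing a JSON schema document
    schema: |
      {
        "type": "object",
        "properties": {
          "color": { "type": "string" }
        },
        "required": ["color"],
        "additionalProperties": false
      }
  tools: []
```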
schema_registry
The schema registry to dynamically load schemas for model responses in json_schema format. Schemas must be in JSON format. To learn more about supported JSON schema features, see the OpenAI documentation.
Type: object
schema_registry.name_prefix
A prefix to add to the schema registry name. To form the complete schema registry name, the schema ID is appended as a suffix.
Type: string
Default: schema_registry_id_
schema_registry.subject
The subject name used to fetch the schema from the schema registry.
Type: string
schema_registry.refresh_interval
How frequently to poll the schema registry for updates. If not specified, the schema does not refresh automatically.
Type: string
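As an alternative to an inline json_schema, a schema can be loaded from a registry. The sketch below uses a hypothetical registry address and subject, and assumes refresh_interval accepts a duration string such as 10m; this reference does not state the exact format.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  response_format: json_schema
  schema_registry:
    url: http://localhost:8081 # Hypothetical schema registry address
    subject: chat-response-value # Hypothetical subject name
    refresh_interval: 10m # Assumed duration-string format
  tools: []
```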
schema_registry.tls.skip_cert_verify
Whether to skip server-side certificate verification.
Type: bool
Default: false
schema_registry.tls.enable_renegotiation
Whether to allow the remote server to request renegotiation. Enable this option if you’re seeing the error message local error: tls: no renegotiation.
Type: bool
Default: false
schema_registry.tls.root_cas
Specify a certificate authority to use (optional). This is a string that represents a certificate chain from the parent trusted root certificate, through possible intermediate signing certificates, to the host certificate.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.
Type: string
Default: ""
# Examples
root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
schema_registry.tls.root_cas_file
Specify the path to a root certificate authority file (optional). This is a file, often with a .pem extension, that contains a certificate chain from the parent trusted root certificate, through possible intermediate signing certificates, to the host certificate.
Type: string
Default: ""
# Examples
root_cas_file: ./root_cas.pem
schema_registry.tls.client_certs
A list of client certificates to use. For each certificate, specify values for either the cert and key fields, or the cert_file and key_file fields.
Type: array
Default: []
# Examples
client_certs:
  - cert: foo
    key: bar
client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
schema_registry.tls.client_certs[].key
The plain text certificate key to use.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.
Type: string
Default: ""
schema_registry.tls.client_certs[].cert_file
The path to the certificate to use.
Type: string
Default: ""
schema_registry.tls.client_certs[].key_file
The path of a certificate key to use.
Type: string
Default: ""
schema_registry.tls.client_certs[].password
The plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete pbeWithMD5AndDES-CBC algorithm is not supported for the PKCS#8 format.
The pbeWithMD5AndDES-CBC algorithm does not authenticate ciphertext, and is vulnerable to padding oracle attacks which may allow an attacker to recover the plain text password.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.
Type: string
Default: ""
# Examples
password: foo
password: ${KEY_PASSWORD}
schema_registry.oauth
Configure OAuth version 1.0 to give this component authorized access to your schema registry.
Type: object
schema_registry.oauth.enabled
Whether to use OAuth version 1 in requests to the schema registry.
Type: bool
Default: false
schema_registry.oauth.consumer_key
The value used to identify this component or client to your schema registry.
Type: string
Default: ""
schema_registry.oauth.consumer_secret
The secret used to establish ownership of the consumer key.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.
Type: string
Default: ""
schema_registry.oauth.access_token
The value this component can use to gain access to the data in the schema registry.
Type: string
Default: ""
schema_registry.oauth.access_token_secret
The secret that establishes ownership of the oauth.access_token.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.
Type: string
Default: ""
schema_registry.basic_auth
Configure basic authentication for requests from this component to your schema registry.
Type: object
schema_registry.basic_auth.enabled
Whether to use basic authentication in requests.
Type: bool
Default: false
schema_registry.basic_auth.username
The username of the account credentials to authenticate as.
Type: string
Default: ""
schema_registry.basic_auth.password
The password of the account credentials to authenticate with.
This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.
Type: string
Default: ""
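A sketch of enabling basic authentication for schema registry requests, with the credentials read from environment variables instead of being written into the configuration. The variable names, registry address, and subject are hypothetical.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  response_format: json_schema
  schema_registry:
    url: http://localhost:8081 # Hypothetical schema registry address
    subject: chat-response-value # Hypothetical subject name
    basic_auth:
      enabled: true
      username: "${SCHEMA_REGISTRY_USER}" # Hypothetical environment variable
      password: "${SCHEMA_REGISTRY_PASSWORD}" # Hypothetical environment variable
  tools: []
```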
schema_registry.jwt
(beta)
Configure JSON Web Token (JWT) authentication for the secure transmission of data from your schema registry to this component.
Type: object
schema_registry.jwt.enabled
Whether to use JWT authentication in requests.
Type: bool
Default: false
schema_registry.jwt.private_key_file
A PEM-format file that contains a private key encoded using PKCS#1 or PKCS#8.
Type: string
Default: ""
schema_registry.jwt.signing_method
The method used to sign the token, such as RS256, RS384, RS512, or EdDSA.
Type: string
Default: ""
schema_registry.jwt.claims
Values used to pass the identity of the authenticated entity to the service provider, in this case from this component to the schema registry.
Type: object
Default: {}
schema_registry.jwt.headers
The key/value pairs that identify the type of token and signing algorithm (optional).
Type: object
Default: {}
top_p
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. For example, a top_p of 0.1 means only the tokens comprising the top 10% probability mass are sampled.
Redpanda recommends adding a value for this field or temperature, but not both.
Type: float
frequency_penalty
Specify a number between -2.0 and 2.0. Positive values penalize new tokens based on the frequency of their appearance in the text so far. This decreases the model’s likelihood to repeat the same line verbatim.
Type: float
presence_penalty
Specify a number between -2.0 and 2.0. Positive values penalize new tokens if they have appeared in the text so far. This increases the model’s likelihood to talk about new topics.
Type: float
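The two penalties can be set together. In this sketch the values are arbitrary positive numbers within the documented range, chosen only to show the shape of the configuration.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  frequency_penalty: 0.5 # Discourages repeating the same tokens
  presence_penalty: 0.5 # Encourages moving on to new topics
```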
seed
When set to a specific number, Redpanda Connect attempts to generate consistent responses for requests that use the same prompt, seed, and parameters.
Type: int
# Examples
seed: 42
stop
Specify up to four stop sequences to use. When the model encounters a stop pattern, it stops generating text and returns the final response.
Type: array
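A short sketch of a stop list, assuming generation should end at a blank line or at a literal END marker. Both sequences are illustrative.

```yaml
openai_chat_completion:
  model: gpt-4o
  api_key: "${OPENAI_API_KEY}"
  # Up to four sequences; generation halts when any one of them is produced
  stop: ["\n\n", "END"]
```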
tools
External tools the model can invoke, such as functions, APIs, or web browsing. You can build a series of processors that include definitions of these tools, and the specified model can choose when to invoke them to help answer a prompt. For more information, see Examples.
If you don’t want to use external tools, enter an empty array: tools: [].
Type: array
tools[].description
A description of what the tool does. The model uses this to decide when to invoke the tool.
Type: string
Examples
- Analyze an image and generate a description
- Generate chat history
- Make calls to external tools
This configuration fetches image URLs from the stdin input, and uses the GPT-4o LLM to describe the images.
input:
  stdin:
    scanner:
      lines: {}
pipeline:
  processors:
    - http:
        verb: GET
        url: "${!content().string()}"
    - openai_chat_completion:
        model: gpt-4o
        api_key: "${OPENAI_API_KEY}"
        prompt: "Describe the following image"
        image: "root = content()"
output:
  stdout:
    codec: lines
In this configuration, a pipeline executes a number of processors, including a cache, to generate and send chat history to a GPT-4o model.
input:
  stdin:
    scanner:
      lines: {}
pipeline:
  processors:
    - mapping: |
        root.prompt = content().string()
    - branch:
        processors:
          - cache:
              resource: mem
              operator: get
              key: history
          - catch:
              - mapping: 'root = []'
        result_map: 'root.history = this'
    - branch:
        processors:
          - openai_chat_completion:
              model: gpt-4o
              api_key: "${OPENAI_API_KEY}"
              prompt: "${!this.prompt}"
              history: 'root = this.history'
        result_map: 'root.response = content().string()'
    - mutation: |
        root.history = this.history.concat([
          {"role": "user", "content": this.prompt},
          {"role": "assistant", "content": this.response},
        ])
    - cache:
        resource: mem
        operator: set
        key: history
        value: '${!this.history}'
    - mapping: |
        root = this.response
output:
  stdout:
    codec: lines
cache_resources:
  - label: mem
    memory: {}
In this configuration, the GPT-4o model executes a number of processors, which make a tool call to retrieve weather data for a specific city.
input:
  generate:
    count: 1
    mapping: |
      root = "What is the weather like in Chicago?"
pipeline:
  processors:
    - openai_chat_completion:
        model: gpt-4o
        api_key: "${OPENAI_API_KEY}"
        prompt: "${!content().string()}"
        tools:
          - name: GetWeather
            description: "Retrieve the weather for a specific city"
            parameters:
              required: ["city"]
              properties:
                city:
                  type: string
                  description: the city to look up the weather for
            processors:
              - http:
                  verb: GET
                  url: 'https://wttr.in/${!this.city}?T'
                  headers:
                    User-Agent: curl/8.11.1 # Returns a text string from the weather website
output:
  stdout: {}