aiAssistant
The AI Assistant SDK was released in 2025.07.1. Usage requires a specific license.
The AI Assistant SDK helps developers build AI-powered conversations grounded in knowledge about the content of the current site.
This is a sibling to the ai SDK and is designed for server-side environments. Most aiAssistant methods require the current user to be authenticated (see the documentation for each method).
Requires AI Assistant site configuration
This SDK relies entirely on a proper AI Assistant (sv:aiAssistant) configuration on the current site. The AI Assistant configuration determines:
- What LLM to use for the conversation.
- What Semantic Index to use when extracting appropriate, site-specific knowledge needed to answer questions in the conversation.
- What system message/prompt to use in the conversation.
Check out the community article Working with AI Assistants for an overview of the workflow.
Methods
All methods require the current user to be authenticated, unless explicitly stated otherwise.
aiAssistant.createConversation(aiAssistant)
Returns a new, unique conversationIdentifier string - the key needed to identify a conversation for the current user.
This method does not require the current user to be authenticated.
Note that an actual conversation is identified by the combination of the conversationIdentifier and the current user. Hence, multiple users could in theory use an identical conversationIdentifier value, but doing so is strictly discouraged (conversational data for multiple users could easily be mixed up by accident).
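The composite identity (current user + conversationIdentifier) can be illustrated with a small sketch. The in-memory store below is a hypothetical stand-in for Sitevision's internal conversation storage, not part of the SDK:

```javascript
// Hypothetical stand-in for Sitevision's conversation storage.
// A conversation is keyed by BOTH the user and the conversationIdentifier.
const conversations = new Map();

function conversationKey(userId, conversationIdentifier) {
  return `${userId}::${conversationIdentifier}`;
}

function appendMessage(userId, conversationIdentifier, message) {
  const key = conversationKey(userId, conversationIdentifier);
  if (!conversations.has(key)) {
    conversations.set(key, []);
  }
  conversations.get(key).push(message);
}

// Two users may share the same identifier value without their data mixing,
// but reusing identifiers is discouraged - a lookup with the wrong user id
// silently reads a different user's conversation.
appendMessage('alice', 'conv-1', { role: 'user', content: 'Hello' });
appendMessage('bob', 'conv-1', { role: 'user', content: 'Hi there' });
console.log(conversations.get(conversationKey('alice', 'conv-1')).length); // 1
```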
aiAssistant.getConversationMemory(aiAssistant, conversationIdentifier)
Method to retrieve the conversation of the current user.
Returns an array of message entries.
The result is structurally identical to the messages option of the askLLM method, but it never contains a system message.
aiAssistant.getConversationKnowledge(aiAssistant, conversationIdentifier)
Returns the potential knowledge string used in the conversation of the current user.
aiAssistant.querySemanticIndex(aiAssistant, options)
Method to extract knowledge entries ("rag data chunks") for a given message.
This method does not require the current user to be authenticated.
| Property | Type | Description |
|---|---|---|
| query | string | The query, typically the user message to be used in a conversation (mandatory) |
| maxHits | number | Maximum number of chunks to return |
Returns
Returns an array of knowledge entries.
The type property and support for external entries were introduced in 2025.09.2. The score property was introduced in 2025.10.1.
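A typical pattern is to turn the returned entries into a knowledge value for askAssistant. The entry shape below (a content string plus score and type per chunk) is an assumption for this sketch; check the actual entry structure in your environment:

```javascript
// Hypothetical knowledge entries as returned by querySemanticIndex
// (the exact entry shape is an assumption for this sketch).
const entries = [
  { content: 'Opening hours: 8-17 on weekdays.', score: 0.91, type: 'internal' },
  { content: 'The office is closed on public holidays.', score: 0.84, type: 'internal' },
];

// Build a knowledge value for askAssistant: one string per chunk.
const knowledge = entries.map((entry) => entry.content);

console.log(knowledge.length); // 2
```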
aiAssistant.askAssistant(aiAssistant, options)
Method to ask a question and stream the answer from the LLM of an AI Assistant configuration.
This function doesn't return anything. The streamed response data from the remote LLM is consumed by two callbacks (onChunk and onFinish) specified in the options object.
Options
| Property | Type | Description |
|---|---|---|
| message | string | The user message in the conversation (mandatory) |
| conversationIdentifier | string | The key that identifies the conversation for the current user (mandatory) |
| knowledge | string or array<string> | The potential knowledge needed to answer the user question/message (optional). A provided knowledge value is included in the system message/prompt of this conversation and replaces any previous knowledge. Hint: this value is typically extracted via the querySemanticIndex method, and potential existing knowledge in the conversation can be retrieved via the getConversationKnowledge method. |
| additionalInstructions | string | Custom instructions that are appended to the Assistant instructions (optional) @since 2025.09.2 |
Handling the streaming response
The result of the streaming operation is handled via the onChunk and onFinish callback functions. Both properties and their functions must be supplied, or the askAssistant operation will fail.
| Property | Type | Description |
|---|---|---|
| onChunk | function(token: string) | Callback that is invoked whenever a token is received from the remote LLM of the AI Assistant |
Streamed tokens
The onChunk function is responsible for dispatching tokens to the askAssistant caller, typically an end user. This callback is invoked whenever a single token is received from the remote LLM. A token is typically a small fraction of the total result - often a single word. It can therefore be advisable to gather multiple tokens into a batch before dispatching them to the caller.
Note that the askAssistant response must be explicitly flushed if the caller should get the response data in a streamed manner. Without an explicit flush, the caller will typically receive all response data as a single chunk when the streaming operation has completed fully.
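Batching can be sketched as follows. `createTokenBatcher` is a hypothetical helper (not part of the SDK), and the flush callback is where you would write and flush the response to the caller:

```javascript
// Gather streamed tokens into batches of a given size before dispatching
// them, instead of forwarding every single token.
function createTokenBatcher(batchSize, flush) {
  let buffer = [];
  return {
    // Use as the onChunk callback: collects tokens, flushes full batches.
    onChunk(token) {
      buffer.push(token);
      if (buffer.length >= batchSize) {
        flush(buffer.join(''));
        buffer = [];
      }
    },
    // Call from onFinish so a trailing partial batch is not lost.
    finish() {
      if (buffer.length > 0) {
        flush(buffer.join(''));
        buffer = [];
      }
    },
  };
}

const batches = [];
const batcher = createTokenBatcher(3, (text) => batches.push(text));
['The', ' ', 'answer', ' ', 'is'].forEach((t) => batcher.onChunk(t));
batcher.finish();
console.log(batches); // ['The answer', ' is']
```

Remember that each flushed batch must also be explicitly flushed to the HTTP response if the caller should receive it in a streamed manner.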
| Property | Type | Description |
|---|---|---|
| onFinish | function({error:string, finishReason:string, usage: {promptTokens:number, completionTokens:number, totalTokens:number}}) | Callback that is invoked when the response from the remote LLM of the AI Assistant is finished |
The onFinish callback argument
The function argument is a result object that describes the stream operation result.
| Property | Type | Description |
|---|---|---|
| text | string | This is always an empty string. This property only exists to be structurally identical with the result object of the askLLM function |
| error | string | Contains an error message if something went wrong during the streaming operation. If the request was successful, this will be an empty string |
| finishReason | string | Indicates the reason why the streaming process stopped |
| usage | object | Contains token usage information ({promptTokens, completionTokens, totalTokens}) for the request. Note that usage may vary and may not be supported by some models |
Timeout
The default timeout and allowed timeout range may vary depending on the Sitevision environment. It’s typically advisable to adjust this based on your specific use case and the performance of the language model.
| Property | Type | Description | Default |
|---|---|---|---|
| timeout | number | The maximum amount of time, in milliseconds, that Sitevision will wait for a response before the connection times out. | 15000 (15s). May vary depending on Sitevision environment. |
Other options
Note that the optimal values and ranges for the following options may vary depending on the specific model being used.
| Property | Type | Description |
|---|---|---|
| maxTokens | number | Specifies the maximum number of tokens (words or word fragments) that the model can generate in the response. This includes both the prompt and the completion |
| temperature | number | Controls the randomness or creativity of the model's output. Lower values make the output more deterministic and focused, while higher values introduce more variability and creativity |
| frequencyPenalty | number | Penalizes the model for using the same tokens repeatedly. A higher value reduces the likelihood of repeating the same phrases, promoting more diverse language usage |
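Putting the pieces together, a call might look like the sketch below. In a real Sitevision app the SDK object comes from `require('aiAssistant')` and the assistant value comes from your site configuration; here a stub stands in for the SDK (the assistant name and canned tokens are made up) so the callback flow can be shown end to end:

```javascript
// Stub standing in for require('aiAssistant') - it emits a canned answer
// token by token so the onChunk/onFinish flow can be demonstrated.
const aiAssistantStub = {
  askAssistant(assistant, options) {
    ['Hello', ' ', 'world'].forEach((token) => options.onChunk(token));
    options.onFinish({
      error: '',
      finishReason: 'stop',
      usage: { promptTokens: 12, completionTokens: 3, totalTokens: 15 },
    });
  },
};

let answer = '';
let finished = false;

aiAssistantStub.askAssistant('my-assistant-config', {
  message: 'Say hello',
  conversationIdentifier: 'conv-123',
  knowledge: ['Greeting guidelines: be brief.'],
  timeout: 15000,
  onChunk(token) {
    answer += token; // in a real app: dispatch/flush to the caller
  },
  onFinish(result) {
    finished = result.error === '';
  },
});

console.log(answer); // 'Hello world'
```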
aiAssistant.askLLM(aiAssistant, options)
Method to generate text using the LLM of an AI Assistant configuration. Useful for non-interactive use cases such as summarization.
Options
| Property | Type | Description |
|---|---|---|
| messages | array<{role: string, content: string}> | Messages that represent a conversation. See example below. |
Message object structure
- role: A string indicating the role of the entity that generated the message. It can take the following values:
  - system: Represents the system or environment providing context or instructions.
  - user: Represents the user interacting with the language model.
  - assistant: Represents the language model or assistant generating a response.
- content: A string containing the actual message or content provided by the role.
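A minimal messages value for a summarization request could look like this (the message texts are placeholders):

```javascript
// A conversation for a one-shot summarization request:
// a system message with instructions, immediately followed by a user message.
const messages = [
  {
    role: 'system',
    content: 'You are a helpful assistant that summarizes text concisely.',
  },
  {
    role: 'user',
    content: 'Summarize the following text: (placeholder text)',
  },
];

console.log(messages[0].role); // 'system'
```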
Ordering rules
The conversation should be logically structured, ensuring that the roles follow a sequence that reflects a natural conversation.
- A system message, if present, must always be the first message in the sequence.
- A system message, if present, must always be immediately followed by a user message.
- This ordering is particularly important when implementing a message window with eviction, since eviction must not break the invariant of system -> user -> assistant message flow.
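The rules above can be checked programmatically. The validator and eviction helper below are illustrative sketches, not part of the SDK:

```javascript
// Validate the documented ordering rules:
// 1. A system message, if present, must be the first message.
// 2. A system message must be immediately followed by a user message.
function isValidMessageOrder(messages) {
  for (let i = 0; i < messages.length; i++) {
    if (messages[i].role === 'system') {
      if (i !== 0) return false; // system must come first
      const next = messages[i + 1];
      if (!next || next.role !== 'user') return false; // system -> user
    }
  }
  return true;
}

// Evict the oldest user/assistant pair while keeping the system message,
// so a message window never breaks the system -> user -> ... invariant.
function evictOldestExchange(messages) {
  const hasSystem = messages[0] && messages[0].role === 'system';
  const head = hasSystem ? messages.slice(0, 1) : [];
  const tail = hasSystem ? messages.slice(1) : messages.slice();
  return head.concat(tail.slice(2)); // drop the oldest user+assistant pair
}

const conversation = [
  { role: 'system', content: 'You are helpful.' },
  { role: 'user', content: 'First question' },
  { role: 'assistant', content: 'First answer' },
  { role: 'user', content: 'Second question' },
];

console.log(isValidMessageOrder(conversation)); // true
console.log(isValidMessageOrder(evictOldestExchange(conversation))); // true
```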
Timeout
The default timeout and allowed timeout range may vary depending on the Sitevision environment. It’s typically advisable to adjust this based on your specific use case and the performance of the language model.
| Property | Type | Description | Default |
|---|---|---|---|
| timeout | number | The maximum amount of time, in milliseconds, that Sitevision will wait for a response before the connection times out. | 30000 (30s). May vary depending on Sitevision environment. |
Other options
Note that the optimal values and ranges for the following options may vary depending on the specific model being used.
| Property | Type | Description |
|---|---|---|
| maxTokens | number | Specifies the maximum number of tokens (words or word fragments) that the model can generate in the response. This includes both the prompt and the completion |
| temperature | number | Controls the randomness or creativity of the model's output. Lower values make the output more deterministic and focused, while higher values introduce more variability and creativity |
| frequencyPenalty | number | Penalizes the model for using the same tokens repeatedly. A higher value reduces the likelihood of repeating the same phrases, promoting more diverse language usage |
Returns
The result object contains the output and metadata related to the askLLM request. It includes the following properties:
| Property | Type | Description |
|---|---|---|
| text | string | The generated text based on the conversation provided in the messages array, for example a generated summary |
| error | string | Contains an error message if something went wrong during the text generation process. If the request was successful, this will be an empty string |
| finishReason | string | Indicates the reason why the generation process stopped |
| usage | object | Contains token usage information for the request. Note that usage may vary and may not be supported by some models |
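A non-interactive call and its result handling can be sketched with a stub standing in for `require('aiAssistant')` (the assistant name, canned result values, and message texts are made up for illustration):

```javascript
// Stub standing in for require('aiAssistant') - it returns a canned result
// with the documented shape so the result handling can be shown.
const aiAssistantStub = {
  askLLM(assistant, options) {
    return {
      text: 'A short summary of the provided text.',
      error: '',
      finishReason: 'stop',
      usage: { promptTokens: 42, completionTokens: 9, totalTokens: 51 },
    };
  },
};

const result = aiAssistantStub.askLLM('my-assistant-config', {
  messages: [
    { role: 'system', content: 'Summarize user input concisely.' },
    { role: 'user', content: 'Summarize the following text: (placeholder)' },
  ],
  timeout: 30000,
  maxTokens: 200,
});

if (result.error) {
  // handle the failure (log, fall back, etc.)
} else {
  console.log(result.text); // the generated summary
}
```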