
aiAssistant

The AI Assistant SDK was released in 2025.07.1. Usage requires a specific license.

The AI Assistant SDK helps developers build AI-powered conversations grounded in knowledge about the content of the current site.

This is a sibling to the ai SDK and is designed for server-side environments. Most aiAssistant methods require the current user to be authenticated (see the specific method docs).

```js
import aiAssistant from "@sitevision/api/server/aiAssistant";
```

Requires AI Assistant site configuration

This SDK relies entirely on a proper AI Assistant (sv:aiAssistant) configuration on the current site. The AI Assistant configuration determines:

  • What LLM to use for the conversation.
  • What Semantic Index to use when extracting appropriate, site-specific knowledge needed to answer questions in the conversation.
  • What system message/prompt to use in the conversation.

Check out the community article Working with AI Assistants for an overview of the workflow.

Methods

All methods require the current user to be authenticated, unless explicitly stated otherwise.

aiAssistant.createConversation(aiAssistant)

Returns a new, unique conversationIdentifier string - the key needed to identify a conversation for the current user.

This method does not require the current user to be authenticated.

Note that an actual conversation is identified by the combination of the conversationIdentifier and the current user. Hence, multiple users could in theory use an identical conversationIdentifier value, but doing so is strongly discouraged since conversational data for multiple users could easily be mixed up.
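Since identifiers should never be shared between users, a common pattern is to keep one identifier per user in server-side storage. A minimal, hypothetical sketch (the Map-based store, the user id parameter, and the injected create function are illustrative, not part of the SDK):

```js
// Hypothetical per-user registry that hands out exactly one conversation
// identifier per user, so identifiers are never shared between users.
// The injected createConversation function stands in for a call to
// aiAssistant.createConversation(aiAssistant).
const conversationStore = new Map();

function getOrCreateIdentifier(userId, createConversation) {
  if (!conversationStore.has(userId)) {
    // Only create a new identifier the first time this user appears
    conversationStore.set(userId, createConversation());
  }
  return conversationStore.get(userId);
}

// Illustrative stand-in for the SDK call
let counter = 0;
const fakeCreateConversation = () => `conv-${++counter}`;

const a = getOrCreateIdentifier('alice', fakeCreateConversation);
const b = getOrCreateIdentifier('bob', fakeCreateConversation);
const a2 = getOrCreateIdentifier('alice', fakeCreateConversation);
```

Each user gets a stable identifier of their own, which avoids the accidental mixing described above.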

aiAssistant.getConversationMemory(aiAssistant, conversationIdentifier)

Method to retrieve the conversation history of the current user.

Returns an array of message entries.

The result is structurally identical to the messages option of the askLLM method, but it never contains a system message.

```js
[
  { role: 'user', content: 'What is the weather like today?' },
  { role: 'assistant', content: 'The weather is sunny with a high of 75°F.' },
]
```
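Because the memory array mirrors the messages structure of askLLM, it can be rendered directly, for example as a plain-text transcript. A small sketch (the formatting itself is an illustrative choice):

```js
// Formats conversation memory entries (shaped like the result of
// getConversationMemory) into a plain-text transcript, one line per turn.
function formatTranscript(memory) {
  return memory
    .map(({ role, content }) => `${role}: ${content}`)
    .join('\n');
}

const memory = [
  { role: 'user', content: 'What is the weather like today?' },
  { role: 'assistant', content: 'The weather is sunny with a high of 75°F.' },
];

const transcript = formatTranscript(memory);
```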

aiAssistant.getConversationKnowledge(aiAssistant, conversationIdentifier)

Returns the potential knowledge string used in the conversation of the current user.

aiAssistant.querySemanticIndex(aiAssistant, options)

Method to extract knowledge entries ("RAG data chunks") from the Semantic Index for a given message.

This method does not require the current user to be authenticated.

Query Semantic Index options

| Property | Type | Description |
| --- | --- | --- |
| query | string | The query, typically the user message to be used in a conversation (mandatory) |
| maxHits | number | Max number of knowledge chunks to return (optional) |

Returns

Returns an array of knowledge entries.

```js
[
  {
    id: '4.1212121212',
    text: 'Lorem ipsum from an internal sv:page of the site',
    type: 'internal',
    score: 0.9124
  },
  {
    id: '18.1212121212',
    text: 'Lorem ipsum from an internal sv:file of the site',
    type: 'internal',
    score: 0.8961
  },
  {
    id: 'ext_docId_477',
    text: 'Lorem ipsum from an external source',
    type: 'external',
    score: 0.87992,
    source: 'External System A',
    name: 'Lorem',
    timestamp: 599912121212,
    url: 'https://theexternalsystem.example/lorem.html'
  }
]
```

The type property and support for external entries were introduced in 2025.09.2. The score property was introduced in 2025.10.1.
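Entries returned by querySemanticIndex are typically post-processed before being passed as knowledge to askAssistant, e.g. filtered by score. A hedged sketch of such post-processing (the threshold value is an assumption for illustration, not an SDK default):

```js
// Keeps only entries at or above a relevance threshold and extracts their
// text, producing an array suitable for the askAssistant knowledge option.
function toKnowledge(entries, minScore = 0.85) {
  return entries
    .filter((entry) => entry.score >= minScore)
    .map((entry) => entry.text);
}

// Illustrative entries shaped like a querySemanticIndex result
const entries = [
  { id: '4.1212121212', text: 'Internal page chunk', type: 'internal', score: 0.9124 },
  { id: 'ext_docId_477', text: 'External source chunk', type: 'external', score: 0.87992 },
  { id: '18.1212121212', text: 'Low-relevance chunk', type: 'internal', score: 0.41 },
];

const knowledge = toKnowledge(entries);
```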

aiAssistant.askAssistant(aiAssistant, options)

Method to ask a question and stream the answer from the LLM of an AI Assistant configuration.

This function doesn't return anything. The streamed response data from the remote LLM is consumed by two callbacks (onChunk and onFinish) specified in the options object.

Options

Ask Assistant options

| Property | Type | Description |
| --- | --- | --- |
| message | string | The user message in the conversation (mandatory) |
| conversationIdentifier | string | The key that identifies the conversation for the current user (mandatory) |
| knowledge | string or array<string> | The potential knowledge needed to answer the user question/message (optional). A provided knowledge value is included in the system message/prompt of this conversation and replaces any previously provided knowledge. Hint: this value is typically extracted via the querySemanticIndex method, and existing knowledge in the conversation can be retrieved via the getConversationKnowledge method |
| additionalInstructions | string | Custom instructions that should be appended to the Assistant instructions (optional). @since 2025.09.2 |

Handling the streaming response

The result of the streaming operation is handled via the onChunk and onFinish callback functions. Both properties must be supplied, or the askAssistant operation will fail.

Ask Assistant options

| Property | Type | Description |
| --- | --- | --- |
| onChunk | function(token: string) | Callback that is invoked whenever a token is received from the remote LLM of the AI Assistant |

Streamed tokens

The onChunk function is responsible for dispatching tokens to the askAssistant caller, typically an end user. This callback is invoked whenever a single token is received from the remote LLM. A token is typically a small fraction of the total result - often a single word or word fragment. It can therefore be advisable to gather multiple tokens into a batch before dispatching them to the caller.

Note that the askAssistant response must be explicitly flushed if the caller should get the response data in a streamed manner. Without an explicit flush, the caller will typically receive all response data as a single chunk when the streaming operation has completed fully.
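Since a token is usually just a word fragment, batching tokens before dispatching (and flushing) reduces overhead. A minimal sketch of a buffering onChunk handler; the batch size and the dispatch function are illustrative choices, not SDK requirements:

```js
// Buffers streamed tokens and dispatches them in batches of a fixed size.
// flushFinal() dispatches whatever remains when the stream has ended,
// e.g. from within an onFinish callback.
function createBatcher(dispatch, batchSize = 5) {
  let buffer = [];
  return {
    onChunk(token) {
      buffer.push(token);
      if (buffer.length >= batchSize) {
        dispatch(buffer.join(''));
        buffer = [];
      }
    },
    flushFinal() {
      if (buffer.length > 0) {
        dispatch(buffer.join(''));
        buffer = [];
      }
    },
  };
}

// Illustrative usage: collect dispatched batches in an array
const batches = [];
const batcher = createBatcher((text) => batches.push(text), 3);
['The', ' weather', ' is', ' sunny', ' today'].forEach((t) => batcher.onChunk(t));
batcher.flushFinal();
```

In a real portlet, the dispatch function would write the batch to the response and perform the explicit flush described above.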

Ask Assistant options

| Property | Type | Description |
| --- | --- | --- |
| onFinish | function({error: string, finishReason: string, usage: {promptTokens: number, completionTokens: number, totalTokens: number}}) | Callback that is invoked when the response from the remote LLM of the AI Assistant is finished |

The onFinish callback argument

The function argument is a result object that describes the stream operation result.

The onFinish result object

| Property | Type | Description |
| --- | --- | --- |
| text | string | Always an empty string. The property only exists to keep the result structurally identical to the result object of the askLLM function |
| error | string | Contains an error message if something went wrong during the streaming operation. If the request was successful, this will be an empty string |
| finishReason | string | Indicates why the streaming process stopped. Common values: stop (the model reached the end of the streamed text or an expected stopping point), length (the operation was halted because it reached the maximum number of tokens allowed), error (the generation could not be performed at all or failed unexpectedly), other (the operation stopped for a reason not covered by the typical cases, such as specific model or system behavior) |
| usage | object {promptTokens: number, completionTokens: number, totalTokens: number} | Contains token usage information for the request. Usage may vary and may not be supported by some models. Token values are typically -1 when finishReason is error |
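An onFinish handler typically branches on the error string and the finishReason. A hedged sketch of such a handler; the returned status shape is illustrative, not prescribed by the SDK:

```js
// Summarizes an onFinish result object into a small status record,
// treating a non-empty error string as failure and a 'length'
// finishReason as a truncated (but still usable) response.
function summarizeFinish(result) {
  if (result.error !== '') {
    return { ok: false, reason: result.error };
  }
  const truncated = result.finishReason === 'length';
  return { ok: true, truncated, totalTokens: result.usage.totalTokens };
}

// Illustrative result object shaped like the onFinish callback argument
const status = summarizeFinish({
  text: '',
  error: '',
  finishReason: 'length',
  usage: { promptTokens: 40, completionTokens: 60, totalTokens: 100 },
});
```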

Timeout

The default timeout and allowed timeout range may vary depending on the Sitevision environment. It’s typically advisable to adjust this based on your specific use case and the performance of the language model.

Ask Assistant options

| Property | Type | Description | Default |
| --- | --- | --- | --- |
| timeout | number | The maximum amount of time, in milliseconds, that Sitevision will wait for a response before the connection times out | 15000 (15 s). May vary depending on Sitevision environment |

Other options

Note that the optimal values and ranges for the following options may vary depending on the specific model being used.

Ask Assistant options

| Property | Type | Description |
| --- | --- | --- |
| maxTokens | number | Specifies the maximum number of tokens (words or word fragments) that the model can generate in the response. This includes both the prompt and the completion |
| temperature | number | Controls the randomness or creativity of the model's output. Lower values make the output more deterministic and focused, while higher values introduce more variability and creativity |
| frequencePenalty | number | Penalizes the model for using the same tokens repeatedly. A higher value reduces the likelihood of repeating the same phrases, promoting more diverse language usage |


aiAssistant.askLLM(aiAssistant, options)

Method to generate text using the LLM of an AI Assistant configuration. Useful for non-interactive use cases such as summarization.

Options

Ask LLM options

| Property | Type | Description |
| --- | --- | --- |
| messages | array<{role: string, content: string}> | Messages that represent a conversation. See example below |

Message object structure

  • role: A string indicating the role of the entity that generated the message. It can take the following values:
    • system: Represents the system or environment providing context or instructions.
    • user: Represents the user interacting with the language model.
    • assistant: Represents the language model or assistant generating a response.
  • content: A string containing the actual message or content provided by the role.
```js
// messages example
[
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What is the weather like today?' },
  { role: 'assistant', content: 'The weather is sunny with a high of 75°F.' },
]
```

Ordering rules

The conversation should be logically structured, ensuring that the roles follow a sequence that reflects a natural conversation.

  • A system message, if present, must always be the first message in the sequence.
  • A system message, if present, must always be immediately followed by a user message.
  • This ordering is particularly important when implementing a message window with eviction, since eviction must not break the invariant of system -> user -> assistant message flow.
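The eviction rule above can be sketched as a small helper that trims the oldest non-system messages while keeping any system message first, immediately followed by a user message. The window size is an illustrative parameter; this is a sketch of the invariant, not SDK behavior:

```js
// Trims a conversation to at most maxMessages entries while preserving
// the invariant: an optional leading system message stays first and is
// immediately followed by a user message.
function evictMessages(messages, maxMessages) {
  if (messages.length <= maxMessages) return messages;
  const hasSystem = messages.length > 0 && messages[0].role === 'system';
  const head = hasSystem ? [messages[0]] : [];
  // Keep only the most recent messages that fit in the window
  let tail = messages.slice(head.length).slice(-(maxMessages - head.length));
  // Drop leading assistant messages so the window starts with a user turn
  while (tail.length > 0 && tail[0].role === 'assistant') {
    tail = tail.slice(1);
  }
  return [...head, ...tail];
}

const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'First question' },
  { role: 'assistant', content: 'First answer' },
  { role: 'user', content: 'Second question' },
  { role: 'assistant', content: 'Second answer' },
  { role: 'user', content: 'Third question' },
];

const window = evictMessages(messages, 4);
```

The resulting window keeps the system message in place and starts the remaining history on a user turn, so the system -> user -> assistant flow is never broken by eviction.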

Timeout

The default timeout and allowed timeout range may vary depending on the Sitevision environment. It’s typically advisable to adjust this based on your specific use case and the performance of the language model.

Ask LLM options

| Property | Type | Description | Default |
| --- | --- | --- | --- |
| timeout | number | The maximum amount of time, in milliseconds, that Sitevision will wait for a response before the connection times out | 30000 (30 s). May vary depending on Sitevision environment |

Other options

Note that the optimal values and ranges for the following options may vary depending on the specific model being used.

Ask LLM options

| Property | Type | Description |
| --- | --- | --- |
| maxTokens | number | Specifies the maximum number of tokens (words or word fragments) that the model can generate in the response. This includes both the prompt and the completion |
| temperature | number | Controls the randomness or creativity of the model's output. Lower values make the output more deterministic and focused, while higher values introduce more variability and creativity |
| frequencePenalty | number | Penalizes the model for using the same tokens repeatedly. A higher value reduces the likelihood of repeating the same phrases, promoting more diverse language usage |


Returns

The result object contains the output and metadata related to the ask LLM request. It includes the following properties.

Ask LLM result object

| Property | Type | Description |
| --- | --- | --- |
| text | string | The generated text based on the conversation provided in the messages array (e.g. a generated summary) |
| error | string | Contains an error message if something went wrong during the text generation process. If the request was successful, this will be an empty string |
| finishReason | string | Indicates why the generation process stopped. Common values: stop (the model reached the end of the generated text or an expected stopping point), length (the generation was halted because it reached the maximum number of tokens allowed), error (the generation could not be performed at all or failed unexpectedly), other (the generation stopped for a reason not covered by the typical cases, such as specific model or system behavior) |
| usage | object {promptTokens: number, completionTokens: number, totalTokens: number} | Contains token usage information for the request. Usage may vary and may not be supported by some models. Token values are typically -1 when finishReason is error |

Example

```js
{
  text: 'The key points are...',
  error: '',
  finishReason: 'stop',
  usage: { promptTokens: 50, completionTokens: 20, totalTokens: 70 }
}
```