Optional fields: Partial<OpenAIInput> & Partial<AzureOpenAIInput> & BaseLLMParams & {
Batch size to use when passing multiple documents to generate.
The async caller should be used by subclasses to make any async calls, which will thus benefit from the concurrency and retry logic.
A value that influences the probability of generated tokens appearing based on their cumulative frequency in generated text. Positive values will make tokens less likely to appear as their frequency increases and decrease the likelihood of the model repeating the same statements verbatim.
Maximum number of tokens to generate in the completion. -1 returns as many tokens as possible given the prompt and the model's maximum context size.
Model name to use
The number of completion choices that should be generated per provided prompt as part of an overall completions response. Because this setting can generate many completions, it may quickly consume your token quota. Use carefully and ensure reasonable settings for max_tokens and stop.
A value that influences the probability of generated tokens appearing based on their existing presence in generated text. Positive values will make tokens less likely to appear when they already exist and increase the model's likelihood to output new topics.
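As a rough illustration of how the frequency and presence penalties described above act on token scores, the sketch below follows the adjustment described in OpenAI's API documentation: each token's logit is reduced by its count times the frequency penalty, plus the presence penalty once if the token has appeared at all. The helper name `applyPenalties` and the record-based layout are assumptions made for this example; the real adjustment happens on the service side, not in this class.

```typescript
type Scores = Record<string, number>;

// Hypothetical helper (not part of the library): lowers the logit of tokens
// that already appear in the generated text.
function applyPenalties(
  logits: Scores,
  counts: Scores,
  frequencyPenalty: number,
  presencePenalty: number,
): Scores {
  const adjusted: Scores = {};
  for (const token of Object.keys(logits)) {
    const count = counts[token] ?? 0;
    // The presence penalty applies once, regardless of how often the token appeared.
    const presence = count > 0 ? 1 : 0;
    adjusted[token] =
      logits[token] - count * frequencyPenalty - presence * presencePenalty;
  }
  return adjusted;
}

// "the" appeared three times, "cat" once, "dog" never.
const logits: Scores = { the: 2.0, cat: 1.0, dog: 1.0 };
const counts: Scores = { the: 3, cat: 1 };
const penalized = applyPenalties(logits, counts, 0.5, 0.2);
console.log(penalized);
```

In this toy input, the frequently repeated token loses the most score, the token seen once loses a little, and the unseen token is left unchanged.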
Whether to stream the results. Enabling streaming disables tokenUsage reporting.
The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify temperature and top_p for the same completions request as the interaction of these two settings is difficult to predict.
An alternative to sampling with temperature called nucleus sampling. This value causes the model to consider the results of tokens with the provided probability mass. As an example, a value of 0.15 will cause only the tokens comprising the top 15% of probability mass to be considered. It is not recommended to modify temperature and top_p for the same completions request as the interaction of these two settings is difficult to predict.
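To make the nucleus sampling description above more concrete, here is a minimal sketch of a top-p filter over a toy distribution. The names `TokenProb` and `nucleusFilter` are hypothetical and exist only for this illustration; the service performs the actual sampling, and its exact tie-breaking and cutoff rules may differ.

```typescript
interface TokenProb {
  token: string;
  prob: number;
}

// Hypothetical helper: keeps the smallest set of highest-probability tokens
// whose cumulative probability reaches topP, then renormalizes them.
function nucleusFilter(candidates: TokenProb[], topP: number): TokenProb[] {
  const sorted = candidates.slice().sort((a, b) => b.prob - a.prob);
  const kept: TokenProb[] = [];
  let cumulative = 0;
  for (const candidate of sorted) {
    kept.push(candidate);
    cumulative += candidate.prob;
    if (cumulative >= topP) break;
  }
  const total = kept.reduce((sum, c) => sum + c.prob, 0);
  return kept.map((c) => ({ token: c.token, prob: c.prob / total }));
}

const distribution: TokenProb[] = [
  { token: "blue", prob: 0.5 },
  { token: "green", prob: 0.3 },
  { token: "red", prob: 0.15 },
  { token: "pink", prob: 0.05 },
];

// With topP = 0.75, only "blue" and "green" remain as candidates.
const nucleus = nucleusFilter(distribution, 0.75);
console.log(nucleus.map((c) => c.token));
```

With a small value such as the 0.15 mentioned above, this toy distribution would keep only its single most likely token, which is why low top_p values make output more focused.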
Whether to print out response text.
Optional azure
Azure OpenAI API deployment name to use for completions when making requests to Azure OpenAI. This is the name of the deployment you created in the Azure portal, e.g. "my-openai-deployment". It is used in the endpoint URL: https://{InstanceName}.openai.azure.com/openai/deployments/my-openai-deployment/
Optional azure
API key to use when making requests to Azure OpenAI.
Optional azure
Endpoint to use when making requests to Azure OpenAI
Optional best
A value that controls how many completions will be internally generated prior to response formulation. When used together with n, best_of controls the number of candidate completions and must be greater than n. Because this setting can generate many completions, it may quickly consume your token quota. Use carefully and ensure reasonable settings for max_tokens and stop.
Optional cache
Optional callbacks
Optional echo
A value specifying whether completions responses should include input prompts as prefixes to their generated output.
Optional logit
A map between GPT token IDs and bias scores that influences the probability of specific tokens appearing in a completions response. Token IDs are computed via external tokenizer tools, while bias scores reside in the range of -100 to 100 with minimum and maximum values corresponding to a full ban or exclusive selection of a token, respectively. The exact behavior of a given bias score varies by model.
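The effect of a logit bias map can be sketched as adding each bias score to the matching token's logit before the scores are turned into probabilities. The function `biasedProbabilities` and the token IDs below are made up for illustration, and the softmax step is a simplification of whatever the model does internally.

```typescript
// Hypothetical helper: adds per-token bias scores to raw logits and
// converts the result to probabilities with a numerically stable softmax.
function biasedProbabilities(
  logits: Record<string, number>,
  logitBias: Record<string, number>,
): Record<string, number> {
  const tokenIds = Object.keys(logits);
  const biased = tokenIds.map((id) => logits[id] + (logitBias[id] ?? 0));
  const max = Math.max(...biased);
  const exps = biased.map((value) => Math.exp(value - max));
  const total = exps.reduce((sum, value) => sum + value, 0);
  const probs: Record<string, number> = {};
  tokenIds.forEach((id, i) => {
    probs[id] = exps[i] / total;
  });
  return probs;
}

// Illustrative token IDs only; real IDs come from a tokenizer.
const tokenLogits = { "1001": 1.2, "1002": 0.8, "1003": 0.5 };

// A bias of -100 drives the token's probability to effectively zero.
const probabilities = biasedProbabilities(tokenLogits, { "1002": -100 });
console.log(probabilities);
```

A bias of 100 on a single token would have the opposite effect in this sketch, leaving that token with nearly all of the probability mass.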
Optional logprobs
A value that controls the emission of log probabilities for the provided number of most likely tokens within a completions response.
Optional metadata
Optional model
Holds any additional parameters that are valid to pass to openai.createCompletion that are not explicitly specified on this class.
Optional name
Optional stop
A collection of textual sequences that will end completions generation.
Optional tags
Optional timeout
Timeout to use when making requests to OpenAI.
Optional user
An identifier for the caller or end user of the operation. This may be used for tracking or rate-limiting purposes.
Keys that the language model accepts as call options.
Assigns new fields to the dict output of this runnable. Returns a new runnable.
Default implementation of batch, which calls invoke N times. Subclasses should override this method if they can batch more efficiently.
Array of inputs to each batch call.
Optional options: Partial<CallOptions> | Partial<CallOptions>[]
Either a single call options object to apply to each batch call or an array for each call.
Optional batchOptions: RunnableBatchOptions & {
An array of RunOutputs, or mixed RunOutputs and errors if batchOptions.returnExceptions is set
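The default batch behaviour described above (one invoke call per input, with errors optionally returned in place of outputs) can be sketched in a few lines. This is a standalone approximation, not the library's code: `Invoke` and `batchViaInvoke` are hypothetical names, inputs and outputs are narrowed to strings, and a plain boolean stands in for batchOptions.returnExceptions.

```typescript
type Invoke = (input: string) => Promise<string>;

// Hypothetical stand-in for the default batch: calls invoke once per input.
// When returnExceptions is true, a failed call yields its Error in that slot
// instead of rejecting the whole batch.
async function batchViaInvoke(
  invoke: Invoke,
  inputs: string[],
  returnExceptions = false,
): Promise<(string | Error)[]> {
  const calls = inputs.map(async (input) => {
    try {
      return await invoke(input);
    } catch (err) {
      if (!returnExceptions) throw err;
      return err instanceof Error ? err : new Error(String(err));
    }
  });
  return Promise.all(calls);
}

// Toy "model" that rejects empty input.
const shout: Invoke = async (text) => {
  if (text === "") throw new Error("empty input");
  return text.toUpperCase();
};

batchViaInvoke(shout, ["hi", "", "bye"], true).then((results) => {
  console.log(results);
});
```

A subclass that can send several prompts in one request would replace the per-input loop with a single call, which is the override the description above recommends.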
Optional options: Partial<CallOptions> | Partial<CallOptions>[]
Optional batchOptions: RunnableBatchOptions & {
Optional options: Partial<CallOptions> | Partial<CallOptions>[]
Optional batchOptions: RunnableBatchOptions
Bind arguments to a Runnable, returning a new Runnable.
A new RunnableBinding that, when invoked, will apply the bound args.
Optional options: string[] | CallOptions
Optional callbacks: Callbacks
Use .invoke() instead. Will be removed in 0.2.0.
Convenience wrapper for generate that takes in a single string prompt and returns a single string output.
This method takes prompt values, options, and callbacks, and generates a result based on the prompts.
Prompt values for the LLM.
Optional options: string[] | CallOptions
Options for the LLM call.
Optional callbacks: Callbacks
Callbacks for the LLM call.
An LLMResult based on the prompts.
This method takes an input and options, and returns a string. It converts the input to a prompt value and generates a result based on the prompt.
Input for the LLM.
Optional options: CallOptions
Options for the LLM call.
A string result based on the prompt.
Return a new Runnable that maps a list of inputs to a list of outputs, by calling invoke() with each input.
Pick keys from the dict output of this runnable. Returns a new runnable.
Create a new runnable sequence that runs each individual runnable in series, piping the output of one runnable into another runnable or runnable-like.
A runnable, function, or object whose values are functions or runnables.
A new runnable sequence.
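The idea of piping one step's output into the next can be illustrated with plain functions. This is a simplified, synchronous sketch: `Step`, `pipeSteps`, `toPrompt`, and `fakeModel` are hypothetical names, and real runnables are asynchronous and carry configuration that is omitted here.

```typescript
type Step<A, B> = (input: A) => B;

// Hypothetical helper: returns a new step that runs `first`, then feeds
// its output into `second`.
function pipeSteps<A, B, C>(first: Step<A, B>, second: Step<B, C>): Step<A, C> {
  return (input) => second(first(input));
}

// Stand-ins for a prompt template and a model.
const toPrompt = (topic: string) => `Tell me a joke about ${topic}`;
const fakeModel = (prompt: string) => prompt.toUpperCase();

const chain = pipeSteps(toPrompt, fakeModel);
const joke = chain("bears");
console.log(joke); // prints "TELL ME A JOKE ABOUT BEARS"
```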
Input text for the prediction.
Optional options: string[] | CallOptions
Options for the LLM call.
Optional callbacks: Callbacks
Callbacks for the LLM call.
A prediction based on the input text.
Use .invoke() instead. Will be removed in 0.2.0.
This feature is deprecated and will be removed in the future. It is not recommended for use.
This method is similar to call, but it's used for making predictions based on the input text.
A list of messages for the prediction.
Optional options: string[] | CallOptions
Options for the LLM call.
Optional callbacks: Callbacks
Callbacks for the LLM call.
A predicted message based on the list of messages.
Use .invoke() instead. Will be removed in 0.2.0.
This method takes a list of messages, options, and callbacks, and returns a predicted message.
Return a json-like object representing this LLM.
Stream output in chunks.
Optional options: Partial<CallOptions>
A readable stream that is also an iterable.
Generate a stream of events emitted by the internal steps of the runnable.
Use to create an iterator over StreamEvents that provide real-time information about the progress of the runnable, including StreamEvents from intermediate results.
A StreamEvent is a dictionary with the following schema:
event: string - Event names are of the format: on_[runnable_type]_(start|stream|end).
name: string - The name of the runnable that generated the event.
run_id: string - Randomly generated ID associated with the given execution of the runnable that emitted the event. A child runnable that gets invoked as part of the execution of a parent runnable is assigned its own unique ID.
tags: string[] - The tags of the runnable that generated the event.
metadata: Record<string, any> - The metadata of the runnable that generated the event.
data: Record<string, any>
Below is a table that illustrates some events that might be emitted by various chains. Metadata fields have been omitted from the table for brevity. Chain definitions have been included after the table.
| event | name | chunk | input | output |
|---|---|---|---|---|
| on_llm_start | [model name] | | {'input': 'hello'} | |
| on_llm_stream | [model name] | 'Hello' OR AIMessageChunk("hello") | | |
| on_llm_end | [model name] | | | 'Hello human!' |
| on_chain_start | format_docs | | | |
| on_chain_stream | format_docs | "hello world!, goodbye world!" | | |
| on_chain_end | format_docs | | [Document(...)] | "hello world!, goodbye world!" |
| on_tool_start | some_tool | | {"x": 1, "y": "2"} | |
| on_tool_stream | some_tool | {"x": 1, "y": "2"} | | |
| on_tool_end | some_tool | | | {"x": 1, "y": "2"} |
| on_retriever_start | [retriever name] | | {"query": "hello"} | |
| on_retriever_chunk | [retriever name] | {documents: [...]} | | |
| on_retriever_end | [retriever name] | | {"query": "hello"} | {documents: [...]} |
| on_prompt_start | [template_name] | | {"question": "hello"} | |
| on_prompt_end | [template_name] | | {"question": "hello"} | ChatPromptValue(messages: [SystemMessage, ...]) |
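Because event names follow the on_[runnable_type]_(start|stream|end) format, a consumer can split a name into its runnable type and phase before deciding how to handle the event. The helper below is a hypothetical sketch (`ParsedEventName` and `parseEventName` are not library exports). Note that a name such as on_retriever_chunk from the table does not fit the stated format, so this parser returns null for it.

```typescript
interface ParsedEventName {
  runnableType: string;
  phase: "start" | "stream" | "end";
}

// Hypothetical helper: splits an event name of the documented format
// into its runnable type and lifecycle phase.
function parseEventName(eventName: string): ParsedEventName | null {
  const match = /^on_([a-z_]+)_(start|stream|end)$/.exec(eventName);
  if (match === null) return null;
  return {
    runnableType: match[1],
    phase: match[2] as ParsedEventName["phase"],
  };
}

const sampleNames = ["on_llm_start", "on_llm_stream", "on_chain_end", "on_retriever_chunk"];

// Keep only the names that mark streamed chunks.
const streamNames = sampleNames.filter(
  (eventName) => parseEventName(eventName)?.phase === "stream",
);
console.log(streamNames);
```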
Optional streamOptions: Omit<LogStreamCallbackHandlerInput, "autoClose">
Stream all output from a runnable, as reported to the callback system. This includes all inner runs of LLMs, Retrievers, Tools, etc. Output is streamed as Log objects, which include a list of jsonpatch ops that describe how the state of the run has changed in each step, and the final state of the run. The jsonpatch ops can be applied in order to construct state.
Optional options: Partial<CallOptions>
Optional streamOptions: Omit<LogStreamCallbackHandlerInput, "autoClose">
Default implementation of transform, which buffers input and then calls stream. Subclasses should override this method if they can start producing output while input is still being generated.
Bind config to a Runnable, returning a new Runnable.
New configuration parameters to attach to the new runnable.
A new RunnableBinding with a config matching what's passed.
Create a new runnable from the current one that will try invoking other passed fallback runnables if the initial invocation fails.
Other runnables to call if the runnable errors.
A new RunnableWithFallbacks.
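The fallback behaviour described above amounts to trying each candidate in order until one succeeds. The sketch below shows that control flow with plain async functions. `Runner` and `withFallbackRunners` are hypothetical names for this illustration, and the real RunnableWithFallbacks may differ in details such as which errors it handles.

```typescript
type Runner = (input: string) => Promise<string>;

// Hypothetical helper: tries the primary runner first, then each fallback
// in order, and rethrows the last error if every runner fails.
function withFallbackRunners(primary: Runner, fallbacks: Runner[]): Runner {
  return async (input) => {
    let lastError: unknown;
    for (const runner of [primary, ...fallbacks]) {
      try {
        return await runner(input);
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  };
}

// Toy runners: the primary always fails, the backup succeeds.
const unreliable: Runner = async () => {
  throw new Error("primary unavailable");
};
const backup: Runner = async (input) => `backup handled: ${input}`;

const resilient = withFallbackRunners(unreliable, [backup]);
resilient("hello").then((output) => console.log(output));
```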
Bind lifecycle listeners to a Runnable, returning a new Runnable. The Run object contains information about the run, including its id, type, input, output, error, startTime, endTime, and any tags or metadata added to the run.
The object containing the callback functions.
Optional on
Called after the runnable finishes running, with the Run object.
Optional config: RunnableConfig
Optional on
Called if the runnable throws an error, with the Run object.
Optional config: RunnableConfig
Optional on
Called before the runnable starts running, with the Run object.
Optional config: RunnableConfig
Add retry logic to an existing runnable.
Optional fields: {
Optional on
Optional stop
A new RunnableRetry that, when invoked, will retry according to the parameters.
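Retry logic of this kind can be pictured as a wrapper that re-invokes a task until it succeeds or an attempt limit is reached. The following is a minimal sketch under stated assumptions: `Task`, `withRetrySketch`, and the `maxAttempts` parameter are hypothetical, and backoff delays and failure callbacks are left out for brevity.

```typescript
type Task = (input: string) => Promise<string>;

// Hypothetical helper: re-runs the task up to maxAttempts times and
// rethrows the last error if every attempt fails.
function withRetrySketch(task: Task, maxAttempts: number): Task {
  return async (input) => {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return await task(input);
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  };
}

// Toy task that fails twice before succeeding.
let attemptsMade = 0;
const flaky: Task = async (input) => {
  attemptsMade += 1;
  if (attemptsMade < 3) throw new Error(`attempt ${attemptsMade} failed`);
  return `ok: ${input}`;
};

const retrying = withRetrySketch(flaky, 3);
retrying("ping").then((output) => console.log(output, attemptsMade));
```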
Optional with
Optional method?: "functionCalling" | "jsonMode"
Optional include
Optional method?: "functionCalling" | "jsonMode"
Model wrapper that returns outputs formatted to match the given schema.
The input type for the Runnable, expected to be the same input for the LLM.
The output type for the Runnable, expected to be a Zod schema object for structured output validation.
Optional include
Optional method?: "functionCalling" | "jsonMode"
A new runnable that calls the LLM with structured output.
Static deserialize
Load an LLM from a json-like object describing it.
Static is
LLM Wrapper. Takes in a prompt (or prompts) and returns a string.