* The name of a compatible [Cohere model](https://docs.cohere.com/docs/models) or the ID of a [fine-tuned](https://docs.cohere.com/docs/chat-fine-tuning) model.
* When specified, the default Cohere preamble will be replaced with the provided one. Preambles are a part of the prompt used to adjust the model's overall behavior and conversation style, and use the `SYSTEM` role.
*
* The `SYSTEM` role is also used for the contents of the optional `chat_history=` parameter. When used with the `chat_history=` parameter, it adds content throughout the conversation; when used with the `preamble=` parameter, it adds content only at the start of the conversation.
* Each item represents a single message in the chat history, excluding the current user turn. It has two properties: `role` and `message`. The `role` identifies the sender (`CHATBOT`, `SYSTEM`, or `USER`), while the `message` contains the text content.
*
* In most cases, the `chat_history` parameter should not be used for `SYSTEM` messages. Instead, to add a `SYSTEM` message at the beginning of a conversation, use the `preamble` parameter.
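For illustration, a minimal sketch of how `preamble` and `chatHistory` might be combined, assuming the v7 `cohere-ai` TypeScript client; the model name, messages, and client setup are placeholders, not prescriptions:

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// The preamble plays the SYSTEM role once, at the start of the conversation;
// chatHistory carries the earlier USER/CHATBOT turns, excluding the current message.
const response = await cohere.chat({
  model: "command-r", // any compatible model name or fine-tuned model ID
  preamble: "You are a concise assistant that answers in one sentence.",
  chatHistory: [
    { role: "USER", message: "Who discovered penicillin?" },
    { role: "CHATBOT", message: "Alexander Fleming, in 1928." },
  ],
  message: "Where was he working at the time?",
});
console.log(response.text);
```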
* With `prompt_truncation` set to "AUTO_PRESERVE_ORDER", some elements from `chat_history` and `documents` will be dropped in an attempt to construct a prompt that fits within the model's context length limit. During this process the order of the documents and chat history will be preserved as they are inputted into the API.
*
* With `prompt_truncation` set to "OFF", no elements will be dropped. If the sum of the inputs exceeds the model's context length limit, a `TooManyTokens` error will be returned.
* Accepts `{"id": "web-search"}`, and/or the `"id"` for a custom [connector](https://docs.cohere.com/docs/connectors), if you've [created](https://docs.cohere.com/docs/creating-and-deploying-a-connector) one.
*
* When specified, the model's reply will be enriched with information found by querying each of the connectors (RAG).
*
* Compatible Deployments: Cohere Platform
*
*/
connectors?: Cohere.ChatConnector[];
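A hedged sketch of a connector-grounded request; `"web-search"` is the managed connector named above, while `"my-custom-connector"` is a hypothetical custom connector ID:

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// Ground the reply with retrieval: the managed web-search connector plus a
// hypothetical custom connector registered beforehand.
const response = await cohere.chat({
  message: "What did Cohere announce most recently?",
  connectors: [{ id: "web-search" }, { id: "my-custom-connector" }],
});
console.log(response.text); // reply enriched with retrieved information
```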
/**
* Defaults to `false`.
*
* When `true`, the response will only contain a list of generated search queries, but no search will take place, and no reply from the model to the user's `message` will be generated.
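A brief sketch of such a dry run; the `searchQueries` response field is an assumption based on the surrounding SDK typings:

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// Dry run: only the generated search queries come back; no search is executed
// and no reply to `message` is produced.
const response = await cohere.chat({
  message: "Compare the populations of Oslo and Helsinki.",
  searchQueriesOnly: true,
});
console.log(response.searchQueries);
```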
* An `_excludes` field (array of strings) can optionally be supplied to omit some key-value pairs from being shown to the model. The omitted fields will still show up in the citation object; the `_excludes` field itself is never passed to the model.
*
* See ['Document Mode'](https://docs.cohere.com/docs/retrieval-augmented-generation-rag#document-mode) in the guide for more information.
* Dictates the approach taken to generating citations as part of the RAG flow by allowing the user to specify whether they want `"accurate"` results or `"fast"` results.
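A sketch combining `documents`, `_excludes`, and `citationQuality`; the document fields are illustrative, and the cast is included because the string-map document typing can vary between SDK versions:

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// One grounding document; `internal_ref` is hidden from the model via _excludes
// but still appears in the citation object.
const report = {
  title: "Q3 financial report",
  snippet: "Revenue grew 12% quarter over quarter, driven by enterprise sales.",
  internal_ref: "doc-7831",
  _excludes: ["internal_ref"],
};

const response = await cohere.chat({
  message: "What does the report say about Q3 revenue?",
  // The string-map document typing may require a cast for the _excludes array.
  documents: [report as unknown as Record<string, string>],
  citationQuality: "accurate", // or "fast" to trade citation precision for latency
});
```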
* A non-negative float that tunes the degree of randomness in generation. Lower temperatures mean less random generations, and higher temperatures mean more random generations.
*
* Randomness can be further maximized by increasing the value of the `p` parameter.
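A minimal temperature sketch (values are illustrative):

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// Lower values make output more focused and repeatable; raising `p` alongside
// a high temperature pushes randomness further still.
const response = await cohere.chat({
  message: "Write a tagline for a neighborhood coffee shop.",
  temperature: 0.3,
});
```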
* The maximum number of input tokens to send to the model. If not specified, `max_input_tokens` is the model's context length limit minus a small buffer.
*
* Input will be truncated according to the `prompt_truncation` parameter.
*
* Compatible Deployments: Cohere Platform
*
*/
maxInputTokens?: number;
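A small sketch pairing `maxInputTokens` with `promptTruncation`; the cap is illustrative, and per the note above this field applies to Cohere Platform deployments:

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// Cap the prompt at an illustrative 2048 tokens; anything over the cap is
// truncated according to the promptTruncation strategy.
const response = await cohere.chat({
  message: "Summarize everything discussed so far.",
  maxInputTokens: 2048,
  promptTruncation: "AUTO_PRESERVE_ORDER",
});
```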
/**
* Ensures only the top `k` most likely tokens are considered for generation at each step.
* Defaults to `0`, min value of `0`, max value of `500`.
* Ensures that only the most likely tokens, with total probability mass of `p`, are considered for generation at each step. If both `k` and `p` are enabled, `p` acts after `k`.
* Defaults to `0.75`, min value of `0.01`, max value of `0.99`.
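A sketch of `k` and `p` together, using the bounds quoted above (values illustrative):

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// k first keeps the 40 most likely tokens; p then keeps the smallest subset of
// those whose probabilities sum to 0.9 (p acts after k when both are set).
const response = await cohere.chat({
  message: "Suggest a name for a sailing boat.",
  k: 40,  // integer in [0, 500]; 0 disables the filter
  p: 0.9, // float in [0.01, 0.99]; defaults to 0.75
});
```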
/** If specified, the backend will make a best effort to sample tokens deterministically, such that repeated requests with the same seed and parameters should return the same result. However, determinism cannot be totally guaranteed. */
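A sketch of best-effort determinism with `seed`; as the comment above warns, identical outputs are likely but not guaranteed:

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// Repeating a request with the same seed and parameters should usually, but
// not provably, reproduce the same text.
const params = { message: "Pick a random fruit.", temperature: 0.8, seed: 42 };
const first = await cohere.chat(params);
const second = await cohere.chat(params);
console.log(first.text === second.text); // true in most runs, not guaranteed
```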
* A list of up to 5 strings that the model will use to stop generation. If the model generates a string that matches any of the strings in the list, it will stop generating tokens and return the generated text up to that point, not including the stop sequence.
* Defaults to `0.0`, min value of `0.0`, max value of `1.0`.
*
* Used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation.
* Defaults to `0.0`, min value of `0.0`, max value of `1.0`.
*
* Used to reduce repetitiveness of generated tokens. Similar to `frequency_penalty`, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies.
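A combined sketch of `stopSequences`, `frequencyPenalty`, and `presencePenalty` (values illustrative):

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

const response = await cohere.chat({
  message: "List ten ice cream flavors, one per line.",
  stopSequences: ["\n\n"], // up to 5; generation halts before emitting a match
  frequencyPenalty: 0.4,   // scales with how often a token has already appeared
  presencePenalty: 0.2,    // flat penalty for any token that has appeared at all
});
```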
/** The prompt is returned in the `prompt` response field when this is enabled. */
returnPrompt?: boolean;
/**
* A list of available tools (functions) that the model may suggest invoking before producing a text response.
*
* When `tools` is passed (without `tool_results`), the `text` field in the response will be `""` and the `tool_calls` field in the response will be populated with a list of tool calls that need to be made. If no calls need to be made, the `tool_calls` array will be empty.
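A hedged sketch of a tool-use request; the tool name, the `parameterDefinitions` shape, and the `toolCalls` response field follow the v7 SDK typings as best understood, with placeholder content:

```ts
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// A single hypothetical tool. Because `tools` is passed without tool results,
// the response carries toolCalls to execute and an empty `text` field.
const response = await cohere.chat({
  message: "How many units did we sell on 29 September?",
  tools: [
    {
      name: "query_daily_sales_report",
      description: "Retrieves the sales report for a given day.",
      parameterDefinitions: {
        day: { description: "Date in YYYY-MM-DD format", type: "str", required: true },
      },
    },
  ],
});
console.log(response.toolCalls);
```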