| <table class="configuration table table-bordered"> |
| <thead> |
| <tr> |
| <th class="text-left" style="width: 20%">Key</th> |
| <th class="text-left" style="width: 15%">Default</th> |
| <th class="text-left" style="width: 10%">Type</th> |
| <th class="text-left" style="width: 55%">Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td><h5>api-key</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>String</td> |
| <td>OpenAI API key for authentication.</td> |
| </tr> |
| <tr> |
| <td><h5>context-overflow-action</h5></td> |
| <td style="word-wrap: break-word;">truncated-tail</td> |
| <td><p>Enum</p></td> |
| <td>Action to take when the context overflows.<br /><br />Possible values:<ul><li>"truncated-tail": Truncates the overflowing tokens from the tail of the context.</li><li>"truncated-tail-log": Truncates the overflowing tokens from the tail of the context and records the truncation in the log.</li><li>"truncated-head": Truncates the overflowing tokens from the head of the context.</li><li>"truncated-head-log": Truncates the overflowing tokens from the head of the context and records the truncation in the log.</li><li>"skipped": Skips the input row.</li><li>"skipped-log": Skips the input row and records the skip in the log.</li></ul></td> |
| </tr> |
| <tr> |
| <td><h5>dimension</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>Long</td> |
| <td>The size of the embedding result array.</td> |
| </tr> |
| <tr> |
| <td><h5>endpoint</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>String</td> |
| <td>Full URL of the OpenAI API endpoint, e.g., <code class="highlighter-rouge">https://api.openai.com/v1/chat/completions</code> or <code class="highlighter-rouge">https://api.openai.com/v1/embeddings</code></td> |
| </tr> |
| <tr> |
| <td><h5>error-handling-strategy</h5></td> |
| <td style="word-wrap: break-word;">RETRY</td> |
| <td><p>Enum</p></td> |
| <td>Strategy for handling errors during model requests.<br /><br />Possible values:<ul><li>"RETRY": Retries sending the request.</li><li>"FAILOVER": Throws an exception and fails the Flink job.</li><li>"IGNORE": Ignores the input that caused the error and continues; the error itself is recorded in the log.</li></ul></td> |
| </tr> |
| <tr> |
| <td><h5>max-context-size</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>Integer</td> |
| <td>Maximum number of tokens allowed in the context. If this threshold is exceeded, the configured context-overflow-action is triggered.</td> |
| </tr> |
| <tr> |
| <td><h5>max-tokens</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>Long</td> |
| <td>The maximum number of tokens that can be generated in the chat completion.</td> |
| </tr> |
| <tr> |
| <td><h5>model</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>String</td> |
| <td>Model name, e.g., <code class="highlighter-rouge">gpt-3.5-turbo</code>, <code class="highlighter-rouge">text-embedding-ada-002</code>.</td> |
| </tr> |
| <tr> |
| <td><h5>n</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>Long</td> |
| <td>How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.</td> |
| </tr> |
| <tr> |
| <td><h5>presence-penalty</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>Double</td> |
| <td>Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.</td> |
| </tr> |
| <tr> |
| <td><h5>response-format</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td><p>Enum</p></td> |
| <td>The format of the response, e.g., 'text' or 'json_object'.<br /><br />Possible values:<ul><li>"text"</li><li>"json_object"</li></ul></td> |
| </tr> |
| <tr> |
| <td><h5>retry-fallback-strategy</h5></td> |
| <td style="word-wrap: break-word;">FAILOVER</td> |
| <td><p>Enum</p></td> |
| <td>Fallback strategy to employ once the retry attempts are exhausted. This strategy only takes effect when error-handling-strategy is set to RETRY.<br /><br />Possible values:<ul><li>"FAILOVER": Throws an exception and fails the Flink job.</li><li>"IGNORE": Ignores the input that caused the error and continues; the error itself is recorded in the log.</li></ul></td> |
| </tr> |
| <tr> |
| <td><h5>retry-num</h5></td> |
| <td style="word-wrap: break-word;">100</td> |
| <td>Integer</td> |
| <td>Number of retries for OpenAI client requests.</td> |
| </tr> |
| <tr> |
| <td><h5>seed</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>Long</td> |
| <td>If specified, the model platform will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.</td> |
| </tr> |
| <tr> |
| <td><h5>stop</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>String</td> |
| <td>A comma-separated list of strings to pass as stop sequences to the model.</td> |
| </tr> |
| <tr> |
| <td><h5>system-prompt</h5></td> |
| <td style="word-wrap: break-word;">"You are a helpful assistant."</td> |
| <td>String</td> |
| <td>The system message of a chat.</td> |
| </tr> |
| <tr> |
| <td><h5>temperature</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>Double</td> |
| <td>Controls the randomness or “creativity” of the output. Typical values are between 0.0 and 1.0.</td> |
| </tr> |
| <tr> |
| <td><h5>top-p</h5></td> |
| <td style="word-wrap: break-word;">(none)</td> |
| <td>Double</td> |
| <td>The probability cutoff for token selection (nucleus sampling). Usually, either temperature or top-p is specified, but not both.</td> |
| </tr> |
| </tbody> |
| </table> |
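
The options above are typically passed as model properties in a Flink SQL `CREATE MODEL` statement. A minimal sketch of a chat-completion setup, assuming an OpenAI-style provider; the model name `my_chat_model`, the `'provider'` key, and the input/output schema here are illustrative, and the exact DDL shape may differ in your Flink version:

```sql
-- Hypothetical example: register a chat model using the option keys
-- documented in the table above. Values for endpoint, model, and
-- error handling mirror the defaults and examples given there.
CREATE MODEL my_chat_model
INPUT (prompt STRING)
OUTPUT (response STRING)
WITH (
  'provider' = 'openai',
  'endpoint' = 'https://api.openai.com/v1/chat/completions',
  'api-key' = '<your-api-key>',
  'model' = 'gpt-3.5-turbo',
  'system-prompt' = 'You are a helpful assistant.',
  'temperature' = '0.7',
  'max-tokens' = '512',
  'error-handling-strategy' = 'RETRY',
  'retry-num' = '3',
  'retry-fallback-strategy' = 'IGNORE'
);
```

For an embedding model, the same pattern applies with the embeddings endpoint (e.g. `https://api.openai.com/v1/embeddings`), an embedding model name such as `text-embedding-ada-002`, and optionally the `dimension` option.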