This repository was archived by the owner on Jul 4, 2025. It is now read-only.
2 files changed: +1 -9 lines changed
lines changed Original file line number Diff line number Diff line change @@ -70,8 +70,8 @@ In case you got error while loading models. Please check for the correct model p
 | `ctx_len` | Integer | The context length for the model operations. |
 | `embedding` | Boolean | Whether to use embedding in the model. |
 | `n_parallel` | Integer | The number of parallel operations. |
-| `cpu_threads` | Integer | The number of threads for CPU inference. |
 | `cont_batching` | Boolean | Whether to use continuous batching. |
+| `cpu_threads` | Integer | The number of threads for CPU inference. |
 | `user_prompt` | String | The prompt to use for the user. |
 | `ai_prompt` | String | The prompt to use for the AI assistant. |
 | `system_prompt` | String | The prompt for system rules. |
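The parameter table in this hunk lists a name and a type for each model-load setting. As a rough illustration of how a client could check a settings dict against those rows before sending it, here is a minimal Python sketch. The helper name `check_load_params` and the mapping `PARAM_TYPES` are hypothetical and not part of the repository; only the parameter names and types come from the table.

```python
# Expected Python types for the parameters listed in the table above.
# The table's "Integer", "Boolean" and "String" are mapped to int, bool and str.
# PARAM_TYPES and check_load_params are illustrative names, not project APIs.
PARAM_TYPES = {
    "ctx_len": int,
    "embedding": bool,
    "n_parallel": int,
    "cont_batching": bool,
    "cpu_threads": int,
    "user_prompt": str,
    "ai_prompt": str,
    "system_prompt": str,
}


def check_load_params(params):
    """Return a list of type problems found in a model-load parameter dict."""
    problems = []
    for name, value in params.items():
        expected = PARAM_TYPES.get(name)
        if expected is None:
            # Not one of the rows visible in this hunk; skip rather than guess.
            continue
        # bool is a subclass of int in Python, so reject True/False for Integer rows.
        if expected is int and isinstance(value, bool):
            problems.append(f"{name}: expected Integer, got Boolean")
        elif not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems
```

An empty list means every recognized key has the type the table states; keys outside the table are ignored because this hunk does not show the full parameter list.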
@@ -441,10 +441,6 @@ components:
           default: true
           nullable: true
           description: Determines if output generation is in a streaming manner.
-        cache_prompt:
-          type: boolean
-          default: true
-          description: Optimize performance in repeated or similar requests.
         temp:
           type: number
           default: 0.7
@@ -585,10 +581,6 @@ components:
           min: 0
           max: 1
           description: Set probability threshold for more relevant outputs
-        cache_prompt:
-          type: boolean
-          default: true
-          description: Optimize performance in repeated or similar requests.
     ChatCompletionResponse:
       type: object
       description: Description of the response structure
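Both hunks in this file drop the `cache_prompt` property from the documented schemas. The diff does not say whether the server still accepts the field, so the following is only a sketch for clients that want their request bodies to match the documented schema. `REMOVED_FIELDS` and `without_removed_fields` are hypothetical names chosen for illustration.

```python
# Fields that this change removes from the documented request schemas.
# Whether the server still honors them is not stated in the diff.
REMOVED_FIELDS = {"cache_prompt"}


def without_removed_fields(body):
    """Return a copy of a request body without the undocumented fields."""
    return {key: value for key, value in body.items() if key not in REMOVED_FIELDS}
```

For example, `without_removed_fields({"temp": 0.7, "cache_prompt": True})` leaves only the `temp` entry, and the input dict is not modified.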