Commit 0f89d32

Add example for extended Llama 2 context
1 parent 845a05e commit 0f89d32

1 file changed: 15 additions, 0 deletions
@@ -0,0 +1,15 @@
+Start-Process "http://127.0.0.1:8080"
+
+# We are increasing the context size of a Llama 2 model from 4096 tokens
+# to 32768 tokens, which is a ctx_scale of 8.0. The parameter formulas are:
+#
+#   --rope-freq-scale = 1 / ctx_scale
+#   --rope-freq-base = 10000 * ctx_scale
+#
+../vendor/llama.cpp/build/bin/Release/server `
+    --model "../vendor/llama.cpp/models/Phind-CodeLlama-34B-v2/model-quantized-q4_K_M.gguf" `
+    --ctx-size 16384 `
+    --rope-freq-scale 0.125 `
+    --rope-freq-base 80000 `
+    --threads 16 `
+    --n-gpu-layers 10
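
The formula in the script's comment can be sketched as a small PowerShell helper, which may be handy when trying other target context sizes. This is only an illustration under stated assumptions: `Get-RopeParams` and its parameter names are hypothetical and not part of this commit or of llama.cpp, and the defaults (4096 trained tokens, a base of 10000) are taken from the comment above rather than read from the model file.

```powershell
# Hypothetical helper (not part of this commit): applies the comment's formula
#   --rope-freq-scale = 1 / ctx_scale
#   --rope-freq-base  = 10000 * ctx_scale
function Get-RopeParams {
    param(
        [int]$TrainedCtx = 4096,          # native context size, per the comment
        [int]$TargetCtx = 32768,          # desired extended context size
        [double]$DefaultFreqBase = 10000  # base value used in the comment's formula
    )

    # ctx_scale is the ratio of the target context to the trained context.
    $ctxScale = $TargetCtx / $TrainedCtx

    [pscustomobject]@{
        CtxScale      = $ctxScale
        RopeFreqScale = 1 / $ctxScale
        RopeFreqBase  = $DefaultFreqBase * $ctxScale
    }
}

# Usage: compute the flag values for the 4096 -> 32768 case.
$p = Get-RopeParams -TrainedCtx 4096 -TargetCtx 32768
"--rope-freq-scale $($p.RopeFreqScale) --rope-freq-base $($p.RopeFreqBase)"
```

For a ctx_scale of 8.0 this gives the same 0.125 and 80000 that the script passes. Note that the script sets `--ctx-size 16384`, which is half of the 32768-token window that this scale corresponds to.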
