From 930596a40411ded3774419bb40808c7e91a5f9b2 Mon Sep 17 00:00:00 2001
From: kykim0
Date: Sat, 29 Jun 2024 00:06:20 -0700
Subject: [PATCH] Fix the get_info/grad path in README.md

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index e8300bd..38bb456 100644
--- a/README.md
+++ b/README.md
@@ -66,7 +66,7 @@
 MODEL_PATH=../out/llama2-7b-p0.05-lora-seed3/checkpoint-${CKPT}
 OUTPUT_PATH=../grads/llama2-7b-p0.05-lora-seed3/${TRAINING_DATA_NAME}-ckpt${CKPT}-${GRADIENT_TYPE}
 DIMS="8192"
-./less/scripts/get_info/get_train_lora_grads.sh "$TRAINING_DATA_FILE" "$MODEL_PATH" "$OUTPUT_PATH" "$DIMS" "$GRADIENT_TYPE"
+./less/scripts/get_info/grad/get_train_lora_grads.sh "$TRAINING_DATA_FILE" "$MODEL_PATH" "$OUTPUT_PATH" "$DIMS" "$GRADIENT_TYPE"
 ```
 
 Ideally, you would aim to create a datastore that encompasses a gradient of all the checkpoints and training data from which you wish to choose.
@@ -82,7 +82,7 @@
 OUTPUT_PATH=../grads/llama2-7b-p0.05-lora-seed3/${TASK}-ckpt${CKPT}-sgd # for validation data
 DATA_DIR=../data
 DIMS="4096 8192" # We use 8192 as our default projection dimension
-./less/scripts/get_info/get_eval_lora_grads.sh "$TASK" "$DATA_DIR" "$MODEL_PATH" $OUTPUT_PATH "$DIMS"
+./less/scripts/get_info/grad/get_eval_lora_grads.sh "$TASK" "$DATA_DIR" "$MODEL_PATH" $OUTPUT_PATH "$DIMS"
 ```
 
 You should gain the gradients of the validation data for all the checkpoints you used for building the gradient datastore in the previous step. After obtaining the gradients for the validation data, we can then select data for the task. The following script will calculate the influence score for each training data point, and select the top-k data points with the highest influence score.
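For anyone unfamiliar with applying a `git format-patch` file like the one above, the round trip can be sketched as follows. This is a minimal, self-contained sketch in a throwaway scratch repository (the repo contents, commit messages, and paths below are illustrative assumptions, not the actual LESS repository), showing how such a patch is produced with `git format-patch` and applied with `git am`:

```shell
#!/bin/sh
# Sketch: producing and applying a mailbox patch like the one above.
# Everything here runs in a temporary toy repo -- illustrative only.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Set up a toy repository whose README contains the "old" script path.
git init -q repo && cd repo
git config user.email you@example.com
git config user.name you
printf './less/scripts/get_info/get_train_lora_grads.sh\n' > README.md
git add README.md && git commit -qm "initial"

# Make the path fix and export the commit as a mailbox patch.
sed -i 's|get_info/|get_info/grad/|' README.md
git commit -qam "Fix the get_info/grad path in README.md"
git format-patch -1 -o .. HEAD >/dev/null

# Rewind, then apply the patch the way a maintainer would.
git reset -q --hard HEAD~1
git am ../0001-*.patch
```

`git am` recreates the commit with the original author and subject line, which is why patches in this format carry the `From:`, `Date:`, and `Subject:` headers seen above.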