docs/source/3x/PT_MixedPrecision.md (1 addition, 1 deletion)
@@ -18,7 +18,7 @@ The 4th Gen Intel® Xeon® Scalable processor supports FP16 instruction set arch
 Further details can be found in the [Intel AVX512 FP16 Guide](https://www.intel.com/content/www/us/en/content-details/669773/intel-avx-512-fp16-instruction-set-for-intel-xeon-processor-based-products-technology-guide.html) published by Intel.

 The latest Intel Xeon processors deliver the flexibility of Intel Advanced Matrix Extensions (Intel AMX), an accelerator that improves the performance of deep learning (DL) training and inference, making it ideal for workloads like NLP, recommender systems, and image recognition. Developers can code AI functionality to take advantage of the Intel AMX instruction set, and they can code non-AI functionality to use the processor's instruction set architecture (ISA). Intel has integrated the Intel® oneAPI Deep Neural Network Library (oneDNN), its oneAPI DL engine, into PyTorch.
-Further details can be found in the [Intel AMX Document](https://www.intel.com/content/www/us/en/content-details/785250/accelerate-artificial-intelligence-ai-workloads-with-intel-advanced-matrix-extensions-intel-amx.html) published by Intel.
+Further details can be found in the [Intel AMX Document](https://www.intel.com/content/www/us/en/content-details/785250/accelerate-artificial-intelligence-workloads-with-intel-advanced-matrix-extensions.html) published by Intel.
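The link fix above sits in a passage explaining how FP16 work reaches the AVX512-FP16 and AMX units through oneDNN in PyTorch. As a rough illustration only (plain PyTorch autocast, not the neural_compressor API this document covers; the model and input are made up, and FP16 CPU autocast requires a recent PyTorch build):

```python
import torch

# Hypothetical FP32 model and input; any torch.nn.Module works the same way.
model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
x = torch.randn(1, 64)

# Autocast-eligible ops run in FP16; on 4th Gen Xeon these kernels can be
# dispatched by oneDNN to the AVX512-FP16 / Intel AMX instruction sets.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.float16):
    y = model(x)

print(y.dtype)  # torch.float16 for ops covered by autocast
```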
-- Refer to this [example](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/body_analysis/onnx_model_zoo/ultraface/quantization/ptq_static) for how to define a customised dataloader.
+- Refer to this [example](https://github.com/intel/neural-compressor/blob/master/examples/deprecated/onnxrt/body_analysis/onnx_model_zoo/ultraface/quantization/ptq_static) for how to define a customised dataloader.

-- Refer to this [example](https://github.com/intel/neural-compressor/blob/master/examples/onnxrt/nlp/bert/quantization/ptq_static) for how to use internal dataloader.
+- Refer to this [example](https://github.com/intel/neural-compressor/blob/master/examples/deprecated/onnxrt/nlp/bert/quantization/ptq_static) for how to use internal dataloader.
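Both bullets concern calibration dataloaders for the deprecated ONNX Runtime static PTQ recipes. As a hedged sketch of the convention a customised dataloader follows in the legacy neural_compressor API (an iterable yielding (input, label) pairs with a batch_size attribute; the class name, shapes, and data below are made up):

```python
import numpy as np

class CustomDataloader:
    """Minimal calibration dataloader: iterable of (input, label) pairs
    plus a batch_size attribute, as the legacy API convention expects."""

    def __init__(self, batch_size=1, num_samples=8):
        self.batch_size = batch_size
        self.num_samples = num_samples

    def __iter__(self):
        for _ in range(self.num_samples):
            # Hypothetical NCHW image batch; real code would read a dataset.
            image = np.random.rand(self.batch_size, 3, 240, 320).astype(np.float32)
            yield image, None  # labels are unused during calibration
```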
-- Refer to this [example](https://github.com/intel/neural-compressor/tree/master/examples/onnxrt/body_analysis/onnx_model_zoo/arcface/quantization/ptq_static) for how to define a customised metric.
+- Refer to this [example](https://github.com/intel/neural-compressor/tree/master/examples/deprecated/onnxrt/body_analysis/onnx_model_zoo/arcface/quantization/ptq_static) for how to define a customised metric.

-- Refer to this [example](https://github.com/intel/neural-compressor/blob/master/examples/tensorflow/image_recognition/tensorflow_models/efficientnet-b0/quantization/ptq) for how to use internal metric.
+- Refer to this [example](https://github.com/intel/neural-compressor/tree/master/examples/deprecated/tensorflow/image_recognition/tensorflow_models/efficientnet-b0/quantization/ptq) for how to use internal metric.
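For context, a customised metric in the legacy neural_compressor API is conventionally a small class exposing update, reset, and result methods. A hedged, hypothetical top-1 accuracy sketch (not taken from the arcface example):

```python
class CustomMetric:
    """Sketch of the update/reset/result protocol for a user-defined metric."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, labels):
        # Hypothetical top-1 comparison; the real logic depends on the task.
        for pred, label in zip(preds, labels):
            self.correct += int(pred.argmax() == label)
            self.total += 1

    def reset(self):
        self.correct = 0
        self.total = 0

    def result(self):
        return self.correct / self.total if self.total else 0.0
```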
docs/source/pruning.md (1 addition, 1 deletion)
@@ -107,7 +107,7 @@ Pruning patterns defines the rules of pruned weights' arrangements in space. Int
 - Multi-head Attention Pruning

-The multi-head attention mechanism boosts transformer models' capability for contextual information analysis. However, different heads' contributions to the final output vary. In most situations, a number of heads can be removed without causing an accuracy drop. Head pruning can be applied in a wide range of scenarios, including BERT, GPT, and other large language models. **We do not support it in pruning yet, but we provide an experimental feature in Model Auto Slim.** Please refer to the [multi-head attention auto slim examples](https://github.com/intel/neural-compressor/blob/master/examples/pytorch/nlp/huggingface_models/question-answering/model_slim)
+The multi-head attention mechanism boosts transformer models' capability for contextual information analysis. However, different heads' contributions to the final output vary. In most situations, a number of heads can be removed without causing an accuracy drop. Head pruning can be applied in a wide range of scenarios, including BERT, GPT, and other large language models. **We do not support it in pruning yet, but we provide an experimental feature in Model Auto Slim.** Please refer to the [multi-head attention auto slim examples](https://github.com/intel/neural-compressor/blob/master/deprecated/examples/pytorch/nlp/huggingface_models/question-answering/model_slim)
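Head pruning itself can be illustrated independently of Model Auto Slim with the Hugging Face transformers API, which the linked examples build on. A hedged sketch; the layer and head indices are arbitrary:

```python
from transformers import BertModel

# Load a stock BERT; transformers PreTrainedModel supports prune_heads.
model = BertModel.from_pretrained("bert-base-uncased")

# Map of layer index -> head indices to remove (arbitrary choices here).
model.prune_heads({0: [0, 2], 5: [1]})

# Layer 0 now computes 10 of its original 12 attention heads.
print(model.encoder.layer[0].attention.self.num_attention_heads)
```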
-To get more information, please refer to [examples](https://github.com/intel/neural-compressor/blob/master/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm).
+To get more information, please refer to [examples](https://github.com/intel/neural-compressor/blob/master/examples/deprecated/pytorch/nlp/huggingface_models/language-modeling/quantization/llm).
neural_compressor/compression/pruner/README.md (1 addition, 1 deletion)
@@ -107,7 +107,7 @@ Pruning patterns defines the rules of pruned weights' arrangements in space. Int
 - Multi-head Attention Pruning

-The multi-head attention mechanism boosts transformer models' capability for contextual information analysis. However, different heads' contributions to the final output vary. In most situations, a number of heads can be removed without causing an accuracy drop. Head pruning can be applied in a wide range of scenarios, including BERT, GPT, and other large language models. **We do not support it in pruning yet, but we provide an experimental feature in Model Auto Slim.** Please refer to the [multi-head attention auto slim examples](https://github.com/intel/neural-compressor/blob/master/examples/pytorch/nlp/huggingface_models/question-answering/model_slim)
+The multi-head attention mechanism boosts transformer models' capability for contextual information analysis. However, different heads' contributions to the final output vary. In most situations, a number of heads can be removed without causing an accuracy drop. Head pruning can be applied in a wide range of scenarios, including BERT, GPT, and other large language models. **We do not support it in pruning yet, but we provide an experimental feature in Model Auto Slim.** Please refer to the [multi-head attention auto slim examples](https://github.com/intel/neural-compressor/blob/master/examples/deprecated/pytorch/nlp/huggingface_models/question-answering/model_slim)