mistralai/Mixtral-8x22B-Instruct-v0.1 / tokenizer #69

@ggbetz

Description


Same error as reported in #66:

2024-10-04:15:42:36,551 INFO     [__main__.py:364] Passed `--trust_remote_code`, setting environment variable `HF_DATASETS_TRUST_REMOTE_CODE=true`
2024-10-04:15:42:36,551 INFO     [__main__.py:376] Selected Tasks: ['logiqa2_base', 'logiqa_base', 'lsat-ar_base', 'lsat-lr_base', 'lsat-rc_base']
2024-10-04:15:42:36,553 INFO     [evaluator.py:161] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2024-10-04:15:42:36,553 INFO     [evaluator.py:198] Initializing local-completions model, with arguments: {'base_url': 'http://localhost:8080/v1/completions', 'num_concurrent': 1, 'max_retries': 3, 'tokenized_requests': False, 'model': 'mistralai/Mixtral-8x22B-Instruct-v0.1', 'trust_remote_code': True}
2024-10-04:15:42:36,553 INFO     [api_models.py:108] Using max length 2048 - 1
2024-10-04:15:42:36,553 INFO     [api_models.py:111] Concurrent requests are disabled. To enable concurrent requests, set `num_concurrent` > 1.
2024-10-04:15:42:36,553 INFO     [api_models.py:121] Using tokenizer huggingface
Traceback (most recent call last):
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2450, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 157, in __init__
    super().__init__(
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 107, in __init__
    raise ValueError(
ValueError: Cannot instantiate this tokenizer from a slow version. If it's based on sentencepiece, make sure you have sentencepiece installed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/bin/lm-eval", line 8, in <module>
    sys.exit(cli_evaluate())
             ^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/__main__.py", line 382, in cli_evaluate
    results = evaluator.simple_evaluate(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/utils.py", line 397, in _wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/evaluator.py", line 201, in simple_evaluate
    lm = lm_eval.api.registry.get_model(model).create_from_arg_string(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/api/model.py", line 147, in create_from_arg_string
    return cls(**args, **args2)
           ^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 18, in __init__
    super().__init__(
  File "/scratch/slurm_tmpdir/job_2674890/lm-evaluation-harness/lm_eval/models/api_models.py", line 130, in __init__
    self.tokenizer = transformers.AutoTokenizer.from_pretrained(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 907, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2216, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2451, in _from_pretrained
    except import_protobuf_decode_error():
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/slurm_tmpdir/job_2674890/venv-cot-eval/lib64/python3.11/site-packages/transformers/tokenization_utils_base.py", line 87, in import_protobuf_decode_error
    raise ImportError(PROTOBUF_IMPORT_ERROR.format(error_message))
ImportError:
 requires the protobuf library but it was not found in your environment. Checkout the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.
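The traceback names two optional tokenizer dependencies: `sentencepiece` (needed to load the slow tokenizer) and `protobuf`. A quick way to check whether the environment is missing either one, before re-running `lm-eval`, is a small sketch like the following (not part of the harness; the pip package names are assumptions based on the error text):

```python
import importlib

# Map importable module name -> pip package name (assumed mapping).
PACKAGES = {
    "sentencepiece": "sentencepiece",
    "google.protobuf": "protobuf",
}

def missing_packages(modules):
    """Return the subset of `modules` that cannot be imported."""
    missing = []
    for mod in modules:
        try:
            importlib.import_module(mod)
        except ImportError:
            missing.append(mod)
    return missing

if __name__ == "__main__":
    needed = missing_packages(PACKAGES)
    if needed:
        # Suggest the corresponding pip install command.
        print("pip install " + " ".join(PACKAGES[m] for m in needed))
    else:
        print("sentencepiece and protobuf are both importable")
```

If either package is reported missing, installing it into the same venv (`/scratch/slurm_tmpdir/.../venv-cot-eval` here) should get past this particular `ImportError`.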
