
Introduce a num_of_sample/evaluated_samples parameter to the evaluate function in the docling-eval module #141

@AndrewTsai0406

Description


I noticed that the DatasetEvaluation class includes an evaluated_samples field:

from typing import Dict

from pydantic import BaseModel

# EvaluationRejectionType is defined alongside this class in docling-eval
class DatasetEvaluation(BaseModel):
    evaluated_samples: int = -1
    rejected_samples: Dict[EvaluationRejectionType, int] = {}

However, the current evaluator classes only ever set this field after processing the entire test split of the benchmark dataset. I'm wondering if we could allow an arbitrary sample count to be passed during the evaluation dataset construction phase, as sketched below. This would help speed up evaluation for benchmarks like OmniDocBench, which currently takes about an hour to complete on my machine.
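A minimal sketch of what this could look like, assuming a hypothetical max_samples parameter and an iterable test split. The function name evaluate_subset and the update/finalize hooks are illustrative, not part of the current docling-eval API:

from itertools import islice

def evaluate_subset(dataset, evaluator, max_samples: int = -1):
    # -1 keeps today's behaviour: evaluate the full test split.
    samples = dataset if max_samples < 0 else islice(dataset, max_samples)
    evaluated = 0
    for record in samples:
        evaluator.update(record)  # hypothetical per-sample evaluation hook
        evaluated += 1
    result = evaluator.finalize()  # hypothetical aggregation into DatasetEvaluation
    result.evaluated_samples = evaluated
    return result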
