Can custom datasets only use pkl? I remember using csv before, but recently I get a prompt telling me to use pkl #68

@hdyzhuxun


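Where do I change the code so that I can train with a CSV-format dataset? I have searched for a long time and cannot find the place to change it.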

    def read_data(cls, input_file, quotechar=None):
        """Reads a tab separated value file."""
        if 'pkl' in str(input_file):  # change pkl to csv??
            lines = load_pickle(input_file)
        else:
            lines = input_file
        return lines

In run_bert.py:

    def run_train(args):
        # --------- data
        processor = BertProcessor(vocab_path=config['bert_vocab_path'], do_lower_case=args.do_lower_case)
        label_list = processor.get_labels()
        label2id = {label: i for i, label in enumerate(label_list)}
        id2label = {i: label for i, label in enumerate(label_list)}

        train_data = processor.get_train(config['data_dir'] / f"{args.data_name}.train.csv")
        train_examples = processor.create_examples(lines=train_data,
                                                   example_type='train',
                                                   cached_examples_file=config[
                                                       'data_dir'] / f"cached_train_examples_{args.arch}")

Could someone help clear this up?
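
For reference, in the `read_data` snippet quoted above, any path that does not contain `pkl` is returned unchanged, so a CSV path is never actually parsed. Below is a minimal sketch of what a CSV-aware version might look like. It is written as a standalone function rather than a method, and everything about the CSV layout is an assumption of mine, not something confirmed by the repository: a header row, the text in the first column, and 0/1 label values in the remaining columns. The `load_pickle` here is a stand-in for the project's helper, assumed to simply unpickle the file.

```python
import csv
import pickle
from pathlib import Path


def load_pickle(input_file):
    """Stand-in for the project's load_pickle helper (assumed to unpickle a file)."""
    with open(input_file, "rb") as f:
        return pickle.load(f)


def read_data(input_file, quotechar=None):
    """Load examples from a .pkl or .csv file, chosen by file extension."""
    suffix = Path(str(input_file)).suffix.lower()
    if suffix == ".pkl":
        return load_pickle(input_file)
    if suffix == ".csv":
        with open(input_file, newline="", encoding="utf-8") as f:
            # csv.reader rejects quotechar=None under default quoting,
            # so fall back to the standard double quote.
            reader = csv.reader(f, quotechar=quotechar or '"')
            next(reader, None)  # skip the header row (assumed to be present)
            # Assumed layout: first column is the text, the rest are 0/1 labels.
            return [(row[0], [int(x) for x in row[1:]]) for row in reader if row]
    raise ValueError(f"Unsupported data file type: {input_file}")
```

Whether `(text, labels)` tuples are the shape that `create_examples` expects is also an assumption; the return value would need to match whatever the pickled data contains.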
