default dtypes & core.construction #188

@smitkadvani

Looking at the tests, there is a lot of boilerplate that could be reduced, and the tests could be made more readable, if we could specify dtypes for the functions in core.construction (including from_any, from_list, and from_series).

For example:

    df1 = pd.DataFrame(
        [
            ['chr1', 1, 1]
        ],
        columns=['chrom','start','end']
    ).astype({"start": pd.Int64Dtype(), "end": pd.Int64Dtype()})

would become

    df1 = bf.from_any(['chr1', 1, 1])

We provide a dictionary of default column names in core.specs; however, there does not seem to be a dictionary (or other specification) for default dtypes.

One option would be to add them right after the default column names in core.specs:
https://github.com/open2c/bioframe/blob/main/bioframe/core/specs.py#L11C1-L12C1
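A minimal sketch of what this option might look like, written without importing bioframe. The names `DEFAULT_COLNAMES`, `DEFAULT_DTYPES`, and `make_bedframe` are hypothetical and only illustrate the idea of keeping a dtype mapping next to the column-name mapping; they are not existing bioframe API, and the dtype choices are placeholders pending the question below.

```python
import pandas as pd

# Hypothetical defaults for illustration only (not bioframe's actual spec).
DEFAULT_COLNAMES = {"chrom": "chrom", "start": "start", "end": "end"}
DEFAULT_DTYPES = {
    "chrom": object,
    "start": pd.Int64Dtype(),
    "end": pd.Int64Dtype(),
}


def make_bedframe(rows, cols=None, dtypes=None):
    """Build a 3-column interval frame and cast it to the default dtypes."""
    cols = list(DEFAULT_COLNAMES.values()) if cols is None else list(cols)
    dtypes = DEFAULT_DTYPES if dtypes is None else dtypes
    df = pd.DataFrame(rows, columns=cols)
    # Pair each role (chrom, start, end) with the actual column name by position.
    cast = {col: dtypes[role] for role, col in zip(DEFAULT_COLNAMES, cols)}
    return df.astype(cast)


df1 = make_bedframe([["chr1", 1, 1]])
```

With something along these lines, a test would only need to pass `dtypes=` when it wants to deviate from the defaults.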

If added, should they be int, pd.Int64Dtype(), or something else for start and end?
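One practical difference that may inform the choice: NumPy-backed `int64` columns cannot hold missing values, while the nullable `pd.Int64Dtype()` can. The sketch below uses plain pandas with made-up values to show the behaviour; it is not a statement about which dtype bioframe should adopt.

```python
import numpy as np
import pandas as pd

starts = [1, np.nan, 5]  # made-up coordinates with one missing value

# Nullable integer dtype: the missing value is stored as pd.NA.
as_nullable = pd.Series(starts, dtype=pd.Int64Dtype())

# Default inference: the column silently becomes float64 with NaN.
as_inferred = pd.Series(starts)

# NumPy int64 cannot represent the missing value, so the cast fails.
try:
    pd.Series(starts, dtype="int64")
    int64_failed = False
except (TypeError, ValueError):
    int64_failed = True
```

If interval frames with missing coordinates need to round-trip through the constructors, that would argue for the nullable dtype; if they are never expected, plain `int` keeps the frames closer to NumPy.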
