Feature request: support for making interaction variables #26

@jhelvy

Hi! First of all, thanks for making this. I just did some quick benchmarking and {fastDummies} really is fast! I have a use case where this will actually make a big difference: simulating several million combinations of models using simulated data where each iteration requires creating new dummy variables. So this will help a lot!

In my case though I also need to create interaction variables, and it would be great if there was a way to build on this package to make them. Here is an example to show what I mean.

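# Example data: a numeric column and a two-level categorical column
df <- data.frame(
    price = runif(100, 5, 10),
    brand = sample(c("Nike", "Adidas"), 100, replace = TRUE)
)

# Dummies only: adds brand_Adidas and brand_Nike, but no interactions
df_without_ints <- fastDummies::dummy_cols(df, "brand")

# Dummies plus price-by-brand interactions via model.matrix()
df_with_ints <- as.data.frame(
    model.matrix(
        object = ~ price + brand + price*brand - 1,
        data = df
    )
)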

The df_without_ints data frame uses fastDummies::dummy_cols() to generate dummies, but it doesn't include interactions between price and the brand dummy columns. In contrast, model.matrix() generates both (see the df_with_ints object). model.matrix() isn't as fast, but it works well when you need both dummies and interactions with other columns. Does this make sense, and do you think others might be interested in this?
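To make the request more concrete, here is a rough sketch of the kind of thing I have in mind, built on top of the dummy_cols() output. The add_interactions() helper and its arguments are hypothetical (nothing like it exists in {fastDummies} as far as I know), and the ":" separator in the new column names is just an assumption that mirrors how model.matrix() names interaction columns. The base R fallback is only there so the sketch runs without the package installed.

```r
# Hypothetical helper (not part of fastDummies): multiply one numeric column
# by each dummy column and append the products as new columns.
add_interactions <- function(data, numeric_col, dummy_names, sep = ":") {
    for (d in dummy_names) {
        new_name <- paste0(numeric_col, sep, d)
        data[[new_name]] <- data[[numeric_col]] * data[[d]]
    }
    data
}

set.seed(1)
df <- data.frame(
    price = runif(100, 5, 10),
    brand = sample(c("Nike", "Adidas"), 100, replace = TRUE)
)

if (requireNamespace("fastDummies", quietly = TRUE)) {
    df_dummies <- fastDummies::dummy_cols(df, "brand")
} else {
    # Base R fallback that produces the same brand_<level> column names
    df_dummies <- df
    for (lv in sort(unique(df$brand))) {
        df_dummies[[paste0("brand_", lv)]] <- as.integer(df$brand == lv)
    }
}

# The dummy columns are whatever was added to the original data frame
dummy_names <- setdiff(names(df_dummies), names(df))

df_ints <- add_interactions(df_dummies, "price", dummy_names)
```

This creates one interaction column per dummy (here price:brand_Adidas and price:brand_Nike), whereas the model.matrix() call above drops a level from the interaction term, so the two results are not column-for-column identical.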
