Skip to content

Conversation

IoannaNika
Copy link
Contributor

Added support for:

  1. date_build() - clock package
  2. difftime() - base package

Fixed:

  1. add_days()
  2. add_years()

The issue is shown below. When the functions are used with a variable being passed for the interval number to be added, the translation is wrong. Parenthesis are needed to support variables (see documentation https://duckdb.org/docs/sql/functions/date.html#date_adddate-interval)

library(clock)
library(DBI)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
  
con <- DBI::dbConnect(duckdb::duckdb())

date_df <- dplyr::tibble(
  date1 = as.Date(c("2000-12-01", "2000-12-01", "2000-12-01", "2000-12-01")),
  date2 = as.Date(c("2001-12-01", "2001-12-02", "2001-11-30", "2001-01-01")),
  y = 2000L, m = 10L, d = 11L
)

DBI::dbWriteTable(con,  "tmpdate", date_df)
date_tbl <- dplyr::tbl(con,  "tmpdate")


df <- dplyr::mutate(date_tbl,
  date7 = as.Date(add_years(date1, m)),
  date8 = as.Date(add_days(date1, m))
) 

df %>% show_query()
#> <SQL>
#> SELECT
#>   tmpdate.*,
#>   CAST(DATE_ADD(date1, INTERVAL 'm year') AS DATE) AS date7,
#>   CAST(DATE_ADD(date1, INTERVAL 'm day') AS DATE) AS date8
#> FROM tmpdate

df
#> Error in `collect()`:
#> ! Failed to collect lazy table.
#> Caused by error in `duckdb_result()`:
#> ! rapi_execute: Failed to run query
#> Error: Conversion Error: Could not convert string 'm year' to INTERVAL

sql <- "SELECT
   tmpdate.*,
   CAST(DATE_ADD(date1, INTERVAL (m) year) AS DATE) AS date7,
   CAST(DATE_ADD(date1, INTERVAL (m) day) AS DATE) AS date8
 FROM tmpdate"

dbGetQuery(conn = con, sql)
#>        date1      date2    y  m  d      date7      date8
#> 1 2000-12-01 2001-12-01 2000 10 11 2010-12-01 2000-12-11
#> 2 2000-12-01 2001-12-02 2000 10 11 2010-12-01 2000-12-11
#> 3 2000-12-01 2001-11-30 2000 10 11 2010-12-01 2000-12-11
#> 4 2000-12-01 2001-01-01 2000 10 11 2010-12-01 2000-12-11

sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: aarch64-apple-darwin20
#> Running under: macOS 15.0
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Amsterdam
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.1.4 DBI_1.2.3   clock_0.7.1
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.5       cli_3.6.3         knitr_1.48        rlang_1.1.4      
#>  [5] xfun_0.48         purrr_1.0.2       generics_0.1.3    glue_1.8.0       
#>  [9] dbplyr_2.5.0.9000 htmltools_0.5.8.1 fansi_1.0.6       rmarkdown_2.28   
#> [13] evaluate_1.0.0    tibble_3.2.1      tzdb_0.4.0        fastmap_1.2.0    
#> [17] yaml_2.3.10       lifecycle_1.0.4   duckdb_1.1.3-1    compiler_4.4.1   
#> [21] blob_1.2.4        fs_1.6.4          pkgconfig_2.0.3   rstudioapi_0.16.0
#> [25] digest_0.6.37     R6_2.5.1          tidyselect_1.2.1  reprex_2.1.1     
#> [29] utf8_1.2.4        pillar_1.9.0      magrittr_2.0.3    tools_4.4.1      
#> [33] withr_3.0.2

Created on 2024-12-12 with reprex v2.1.1

Copy link
Collaborator

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The bug fix and the new features could be two separate PRs. Also, medium-term, I'd like to switch to snapshot tests for the SQL translations, this PR could be a start.

# clock
add_days = function(x, n, ...) {
build_sql("DATE_ADD(", !!x, ", INTERVAL '", n ," day')")
build_sql("DATE_ADD(", !!x, ", INTERVAL (", n ,") day)")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now wonder why this says !!x with the bang-bang and n without.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess is a choice, I think x can be a function to be evaluated in this case, while n can't


},
date_build = function(year, month = 1L, day = 1L, ..., invalid = NULL) {
dbplyr:::check_unsupported_arg(invalid, allow_null = TRUE)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not access a function private to dbplyr here. How do other packages handle this? As a last resort, we could vendor a variant of check_unsupported_arg() here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would you implement such a function?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be a new dbplyr.R or check_unsupported_arg.R .

date_build = function(year, month = 1L, day = 1L, ..., invalid = NULL) {
dbplyr:::check_unsupported_arg(invalid, allow_null = TRUE)
rlang::check_dots_empty()
build_sql("MAKE_DATE(CAST(", year, " AS INTEGER), CAST(", month, " AS INTEGER), CAST(", day, " AS INTEGER))")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see how SELECT MAKE_DATE(2022.0, 3.0, 8.0); fails in duckdb. Still, casting shouldn't be a concern to MAKE_DATE() .

difftime = function(time1, time2, tz, units = "days") {
dbplyr:::check_unsupported_arg(tz)
dbplyr:::check_unsupported_arg(units, allowed = "days")
build_sql("DATEDIFF('day', ", !!time2, ", " ,!!time1, ")")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could use the recommended sql_expr() in the translation here and above.

@krlmlr
Copy link
Collaborator

krlmlr commented Sep 7, 2025

@IoannaNika: Do you want to take another stab?

@IoannaNika
Copy link
Contributor Author

@krlmlr I can try to make some time to work on this. Should I work on a separate PR?

If I recall correctly the tasks remaining are to add support for these functions:

  1. date_build() - clock package
  2. difftime() - base package

@krlmlr
Copy link
Collaborator

krlmlr commented Sep 8, 2025

Thanks. The comments above are still valid, adding to this PR is fine.

@IoannaNika
Copy link
Contributor Author

@krlmlr I have issues building the package, I get some error:

use of undeclared identifier 'ArrowTypeExtensionData ArrowTypeExtensionData::GetExtensionTypes(*options.client_context, scan_state.Types()));

Do you know why that is?

@krlmlr
Copy link
Collaborator

krlmlr commented Sep 15, 2025

When the code changes, I often need git clean -fdx src for the build to work.

@IoannaNika
Copy link
Contributor Author

IoannaNika commented Oct 15, 2025

I think It's working now @krlmlr

library(duckdb)
#> Loading required package: DBI
library(clock)
library(DBI)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

con <- DBI::dbConnect(duckdb::duckdb())

date_df <- dplyr::tibble(id = 1)

DBI::dbWriteTable(con, "tmpdate", date_df)
date_tbl <- dplyr::tbl(con, "tmpdate")


df <- dplyr::mutate(date_tbl,
  date7 = clock::date_build(as.integer(1999)),
  date8 = clock::date_build(as.integer(2000), as.integer(1), as.integer(1))
)

df
#> # Source:   SQL [?? x 3]
#> # Database: DuckDB 1.4.2-dev16 [root@Darwin 24.0.0:R 4.4.1/:memory:]
#>      id date7      date8     
#>   <dbl> <date>     <date>    
#> 1     1 1999-01-01 2000-01-01

df <- dplyr::mutate(df,
  diff_time = difftime(date8, date7)
)

df
#> # Source:   SQL [?? x 4]
#> # Database: DuckDB 1.4.2-dev16 [root@Darwin 24.0.0:R 4.4.1/:memory:]
#>      id date7      date8      diff_time
#>   <dbl> <date>     <date>         <dbl>
#> 1     1 1999-01-01 2000-01-01       365

Created on 2025-10-15 with reprex v2.1.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants