Description
Preprocessing with grep -v '^#' is one way to work around this issue, and a fix may not be simple inside cmdstanr if it depends on how these files are written during sampling. I'm raising it just in case it would be simple inside cmdstanr.
The files created by save_output_files have some comment lines at the start, some more between the column headers (parameter names) and the values, and some more at the end.
This is too much for poor data.table::fread() to handle: pending its long-awaited comment.char argument, it can only reliably skip lines that come together at the start of the file. Since data.table::fread() is the go-to for huge csv files, it would be nice if all the comment lines were put together at the start of the file, so that these files can be read as-is by fread.
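For reference, the grep workaround mentioned above can be sketched as follows. The file contents here are a made-up stand-in that only mimics the layout described (comments at the start, between the header and the values, and at the end); they are not real CmdStan output, and the column names are shortened for illustration.

```shell
# Hypothetical stand-in for a saved output file: comment lines at the start,
# between the header and the values, and at the end.
csv="$(mktemp)"
cat > "$csv" <<'EOF'
# model = foo_model
# method = sample (Default)
lp__,accept_stat__,m
# Adaptation terminated
# Step size = 0.5
-4.76425,0.999885,1.2
-4.66059,0.99997,0.8
#
# Elapsed Time: 0.01 seconds (Total)
EOF

# Drop every line that starts with '#', wherever it appears in the file.
clean="$(mktemp)"
grep -v '^#' "$csv" > "$clean"

# What remains is the header row followed by the data rows.
cat "$clean"
```

The cleaned file should then be readable by fread without the warnings. The same filter can probably be applied without a temporary file through fread's cmd argument, e.g. fread(cmd = "grep -v '^#' ~/foo-1.csv"), though that still means an extra pass over a potentially huge file.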
Example
library(data.table)
library(cmdstanr)
code <- "
data {
int N;
vector[N] x;
vector[N] y;
}
parameters {
real m;
real c;
real sigma;
}
model {
y ~ normal(m * x + c, sigma);
}
"
file <- write_stan_file(code)
model <- cmdstan_model(file)
samples <- model$sample(data = list(N = 1, x = 1, y = 1), iter_sampling = 10, iter_warmup = 10)
samples$save_output_files("~/", basename = "foo", timestamp = FALSE, random = FALSE)
df_ <- fread("~/foo-1.csv")
gives
Warning messages:
1: In fread("~/foo-1.csv") :
Detected 3 column names but the data has 10 columns (i.e. invalid file). Added 7 extra default column names at the end.
2: In fread("~/foo-1.csv") :
Stopped early on line 63. Expected 10 fields but found 1. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<# >>
and df_ is
         # 1        1       1    V4    V5    V6      V7        V8         V9       V10
       <num>    <num>   <num> <int> <int> <int>   <num>     <num>      <num>     <num>
 1: -4.76425 0.999885 2.74896     7   127     0 6.27136   184.340 -244.86300   95.1088
 2: -4.66059 0.999970 2.74896     6    63     0 5.03261   213.799 -171.11100   96.2315
 3: -5.09712 0.981220 2.74896     6    66     1 6.39704   166.980 -125.97500  158.4170
 4: -6.11659 0.999955 2.74896     7   127     0 6.23478   281.608   -7.43179  252.5850
 5: -6.35803 0.999807 2.74896     8   255     0 9.37554   242.192 -442.52700  538.0940
 6: -7.64953 0.999985 2.74896    10  1023     0 8.23180   444.349 -868.81500 2055.1400
 7: -7.33027 1.000000 2.74896    10  1023     0 8.37176  -251.299  922.23900 1348.7000
 8: -6.84893 1.000000 2.74896    10  1023     0 7.82562 -1589.190 1803.51000  917.7360
 9: -6.91315 0.999973 2.74896     8   511     0 7.26564 -1565.040 1840.16000  965.7080
10: -6.92680 0.999974 2.74896     9   831     0 7.81856 -2004.660 2241.34000  990.8020