Can't predict on randomForest when test set contains NA's in features

I don't know whether this is a bug of some sort or whether I'm overlooking something, but it baffled @ja-thomas and me a bit this morning.
Consider a simple case where the test set has a missing value somewhere, as in this example:

```r
lrn.rf = makeLearner("classif.randomForest")
mod = train(lrn.rf, iris.task)
test.df = getTaskData(iris.task)
test.df[1L, 1L] = NA
```
mlr then throws an error when you try to predict on this set, although randomForest's own predict method does not:
```r
# throws error: row names contain missing values
predict(mod, newdata = test.df)
# calling randomForest's predict method directly works
predict(mod$learner.model, test.df)
```
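
For anyone trying to reproduce this, it may help to confirm which rows of the test set actually contain missing values. The snippet below is only a sketch in base R: it uses the built-in `iris` data frame as a stand-in for `getTaskData(iris.task)`, and `na.rows` and `test.complete` are names introduced here for illustration. It does not explain or fix the error, it just isolates the affected rows.

```r
# Stand-in for the test set above (assumption: same columns as iris.task's data)
test.df = iris
test.df[1L, 1L] = NA

# Indices of rows with at least one missing value
na.rows = which(!complete.cases(test.df))
print(na.rows)

# Subset containing only complete rows, for comparing predictions
# with and without the NA row
test.complete = test.df[complete.cases(test.df), ]
print(nrow(test.complete))
```

Predicting on `test.complete` instead of `test.df` is one way to check whether the error depends only on the row containing the `NA`.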

I tried printing `.newdata` in `predictLearner.classif.randomForest` to see whether we do something unwanted to the data.frame before passing it to the learner's predict method, but the row names, `str()` output, etc. look fine.
Any ideas?
