Evaluating Weka classifier J48 with missing values in the test set (R, RWeka)

I get an error when evaluating a simple test set with evaluate_Weka_classifier. I am trying to learn how the R-to-Weka interface in RWeka works, but I can't figure this one out.

library("RWeka")
iris_input  <- iris[1:140,]
iris_test <- iris[-(1:140),]
iris_fit  <- J48(Species ~ ., data = iris_input)
evaluate_Weka_classifier(iris_fit, newdata = iris_test, numFolds=5)

No problems here, as expected (it is of course a silly test: no random holdout data, etc.). But now I want to simulate a lot of missing data, so I set Petal.Width to missing:

iris_test$Petal.Width <- NA
evaluate_Weka_classifier(iris_fit, newdata = iris_test, numFolds=5)

Which gives the error: Error in .jcall(evaluation, "S", "toSummaryString", complexity) : java.lang.IllegalArgumentException: Can't have more folds than instances!

Edit: This error suggests that I don't have enough instances, but I have 10.

Edit: If I use write.arff, the data can be exported and read in by Weka. I changed Petal.Width {} to Petal.Width numeric so that the two files declare the attribute identically. Then it works in Weka.

Is my reasoning wrong? Going by the book Data Mining: Practical Machine Learning Tools and Techniques, this seems to be legitimate. Maybe I just have to tell RWeka that I want to use fractions when a split uses a missing variable?

Thanks!

Answers


The issue is that you need to tell J48() what to do with missing values.

library(RWeka)
?J48

# pertinent output
J48(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)

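na.action tells R what to do with missing values. If you follow up on na.action, you will find that "The ‘factory-fresh’ default is na.omit", which drops every row that contains an NA. All 10 rows of the test set have Petal.Width missing, so under this setting none of them survive, and of course there are not enough instances for 5 folds!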

Instead of leaving na.action at its default, na.omit, I changed it as follows:

iris_fit <- J48(Species ~ ., data = iris_input, na.action = NULL)

and it works like a charm!
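
The difference between the two na.action settings can be checked in base R, without Weka. The sketch below builds a model frame from the test set in the question under each setting. That RWeka builds a comparable model frame from newdata is an assumption about its internals, not something shown in this thread; the names mf_omit and mf_keep are only for illustration.

```r
# Rebuild the test set from the question (base R only, no Weka needed).
iris_test <- iris[-(1:140), ]
iris_test$Petal.Width <- NA   # all 10 rows now have a missing value

# na.omit: drop every row that contains an NA.
mf_omit <- model.frame(Species ~ ., data = iris_test, na.action = na.omit)

# na.action = NULL: keep the rows, NAs included.
mf_keep <- model.frame(Species ~ ., data = iris_test, na.action = NULL)

nrow(mf_omit)   # 0 rows: nothing left to split into folds
nrow(mf_keep)   # 10 rows: all test instances survive

# Side note: assigning a plain NA turns the column into a logical vector.
class(iris_test$Petal.Width)   # "logical"
```

With na.omit there are zero rows left, which fits the "Can't have more folds than instances!" message. The last line shows that Petal.Width is no longer numeric after the assignment. That might be why the exported ARFF declared it as {}, but that link is a guess; assigning NA_real_ instead would keep the column numeric.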

