Handling NAs in the input data set
@carstenn I have a question about NA values in the input data. Once we request data from GTS2, we remove the dates for which a reference data point has an NA value (otherwise we would not be able to extract the values for the layer). However, we do not remove all the NAs, so some layer pixels can still have NA values. When that happens, randomForest fails with the message "missing values in object". We are talking about this piece of code:
model1 <- randomForest::randomForest(as.factor(classes) ~ .,
                                     data = data,
                                     mtry = mtry)
There are two ways to solve this: either remove all NAs from the input data, or run randomForest with the parameter na.action = na.omit. The latter is the easiest option:
model1 <- randomForest::randomForest(as.factor(classes) ~ .,
                                     na.action = na.omit,
                                     data = data,
                                     mtry = mtry)
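To make the two options concrete, here is a minimal sketch on made-up data. The data frame, the band names b1 to b3, and the mtry and ntree values are placeholders for illustration, not our real GTS2 layers. It assumes the randomForest package is installed.

```r
library(randomForest)

set.seed(42)
n <- 100

# Hypothetical training table: one class label plus three predictor layers.
data <- data.frame(
  classes = sample(c("a", "b"), n, replace = TRUE),
  b1 = rnorm(n),
  b2 = rnorm(n),
  b3 = rnorm(n)
)

# Inject a few NAs; rows 3, 17 and 50 become incomplete.
data$b1[c(3, 17)] <- NA
data$b2[c(17, 50)] <- NA

# Option A: drop incomplete rows ourselves before training.
clean <- data[complete.cases(data), ]
model_a <- randomForest(as.factor(classes) ~ .,
                        data = clean,
                        mtry = 2,
                        ntree = 50)

# Option B: let the formula interface drop incomplete rows via na.action.
model_b <- randomForest(as.factor(classes) ~ .,
                        na.action = na.omit,
                        data = data,
                        mtry = 2,
                        ntree = 50)

cat("rows in input:        ", nrow(data), "\n")
cat("rows used for training:", nrow(clean), "\n")
```

Both options train on the same set of complete rows; the difference is only where the filtering happens. Option A makes the number of dropped rows visible before training, which may be useful if many pixels are affected.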
We have tested that and, as expected, it leads to a result with holes wherever NAs exist; that is, the output looks like this:
@carstenn what is your opinion? Should we add this option or not? Currently we are not able to process large time series because some pixels often contain NAs.
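For reference, the holes can be reproduced on a small scale. This is only a sketch under assumptions: the training table, the five "pixels", and the band names b1 and b2 are invented, and the model is fitted through the formula interface as in our code. As far as I can tell, predict() for such a model returns NA for any row of newdata that has a missing predictor, which is what shows up as a hole in the output.

```r
library(randomForest)

set.seed(1)
n <- 60

# Hypothetical training table with two predictor layers.
train <- data.frame(
  classes = factor(sample(c("forest", "water"), n, replace = TRUE)),
  b1 = rnorm(n),
  b2 = rnorm(n)
)
train$b1[c(4, 9)] <- NA  # a couple of incomplete training rows

# Incomplete training rows are dropped by na.omit.
model <- randomForest(classes ~ .,
                      data = train,
                      na.action = na.omit,
                      mtry = 1,
                      ntree = 50)

# Five made-up pixels to classify; pixels 2 and 5 have a missing layer value.
pixels <- data.frame(
  b1 = c(0.1, NA, -0.4, 1.2, 0.3),
  b2 = c(0.5, 0.2, -1.0, 0.7, NA)
)

pred <- predict(model, newdata = pixels)

# TRUE marks a pixel that received no class, i.e. a hole in the result.
hole <- unname(is.na(pred))
print(data.frame(pixel = seq_along(hole), hole = hole))
```

So na.action = na.omit lets training go through, but any pixel with an NA in one of its layers still ends up unclassified.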