Optimized version.

Romulo Pereira Goncalves requested to merge classification_parallelism into master

In this version we have improved several things.

  1. We are now able to define the number of trees for randomForest.
  2. Run in optimized mode:
     a. Improve the code to only use raster objects in memory. Clean the temporary storage.
     b. Use matrices instead of rasters when possible.
  3. Add a new sample method, which only works in optimized_mode because it runs on matrices. The available methods are now random_raster (equivalent to the old random, which uses the raster::sampleRandom function), raster_regular (equivalent to the old regular, which uses the raster::sampleRegular function), and random_matrix (new; it uses matrices and the base::sample function, and samples only from the existing pixels rather than from all pixels as raster::sampleRandom does - more info here).
  4. Add last_ref_val, that is, the default reference value for the last step (default: 1000).
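
To illustrate the idea behind random_matrix, here is a minimal sketch in base R of sampling only from the cells of a matrix that hold data. It is not the code from this branch: the helper name sample_existing_pixels, its arguments, and the toy matrix are hypothetical, and NA is assumed to mark pixels that do not exist.

```r
# Hypothetical helper (not part of this branch): draw n cells from the
# non-NA cells of a matrix, assuming NA marks pixels that do not exist.
sample_existing_pixels <- function(m, n, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  valid <- which(!is.na(m))        # linear indices of cells that hold data
  n <- min(n, length(valid))       # never ask for more cells than exist
  # Index into `valid` instead of calling sample(valid, n): for a vector of
  # length one, sample() would draw from 1:valid rather than from `valid`.
  picked <- valid[sample.int(length(valid), n)]
  data.frame(cell = picked, value = m[picked])
}

# Toy 3x3 "raster" with four missing pixels
m <- matrix(c(NA, 3, 7, NA, NA, 2, 9, NA, 5), nrow = 3)
s <- sample_existing_pixels(m, n = 4, seed = 42)
print(s)
```

Because the draw is restricted to the valid indices, every returned sample carries a value, whatever the share of missing pixels in the input.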

@dara, @jknoch and @carstenn: the official reviewer will be Daniela, but it would be great if all of you could review the changes and test them. The documentation still needs to be updated, which I hope to get done tomorrow based on your feedback. I am aware that reviewing this branch will take a while because we made quite a few changes.

I have tested the optimized version several times and it all seemed to run as expected. If you want to repeat a previous run, restart R and run the function again with the same seed (by default we now recommend always setting the seed to the current time as an integer), the same init.samples, sample_type, and models, and the same thresholds; you will get exactly the same results (this works for random_raster and random_matrix as well). For more information about the seeds, see issues #59 (closed) and #58 (closed).
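
The seed recommendation can be sketched in base R as follows. This is only an illustration under assumptions: draw_samples is a hypothetical stand-in for the sampling step of the classification function, not its real name or signature.

```r
# Use the current time as an integer seed and record it, so the run can be
# repeated later with exactly the same value.
seed <- as.integer(Sys.time())

# Hypothetical stand-in for a seeded sampling step.
draw_samples <- function(seed, n = 5) {
  set.seed(seed)          # fix the random number generator state
  sample.int(1000, n)     # draw n values from 1..1000
}

first  <- draw_samples(seed)
second <- draw_samples(seed)   # same seed, same inputs

print(identical(first, second))
```

Keeping the seed alongside the other inputs (init.samples, sample_type, models, thresholds) is what makes a later run comparable with the original one.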

Closes issues #54 (closed), #53 (closed), #55 (closed), #56 (closed), #58 (closed), #59 (closed), #61 (closed)

Edited by Romulo Pereira Goncalves
