HabitatSampler, merge request !33

Optimized version.

Merged: Romulo Pereira Goncalves requested to merge classification_parallelism into master, Feb 09, 2022

In this version we have improved several things.

  1. We are now able to define the number of trees for randomForest.
  2. Run in optimized mode:
       a. Improve the code to use only raster objects held in memory, and clean the temporary storage.
       b. Use matrices instead of rasters when possible.
  3. Add a new sample method that works only in optimized_mode, since it runs on matrices. The available methods are now random_raster (equivalent to the old random, which uses the raster::sampleRandom function), raster_regular (equivalent to the old regular, which uses the raster::sampleRegular function), and random_matrix (new; it uses matrices and the base::sample function over only the existing pixels, not over all pixels as raster::sampleRandom does; more info here).
  4. Add last_ref_val, that is, the reference value used for the last step (default: 1000).
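
To make the idea behind random_matrix in item 3 concrete, here is a minimal base R sketch of sampling only from pixels that hold data. It is an illustration under assumptions, not the code in this branch: the matrix layout (one row per pixel, one column per band), the NA convention for missing pixels, and all variable names are hypothetical.

```r
# Hypothetical illustration; this is not the HabitatSampler implementation.
set.seed(42)

# A small "image" stored as a matrix: one row per pixel, one column per band.
# Rows set to NA stand for pixels that do not exist (e.g. masked out).
pixels <- matrix(runif(20 * 3), nrow = 20, ncol = 3,
                 dimnames = list(NULL, c("band1", "band2", "band3")))
pixels[c(2, 5, 11, 17), ] <- NA

# Indices of the pixels that actually hold data in every band.
valid_idx <- which(stats::complete.cases(pixels))

# Draw the sample points from the valid pixels only (without replacement).
n_samples <- 5
picked <- sample(valid_idx, size = n_samples)
sampled <- pixels[picked, , drop = FALSE]
```

Because the draw is restricted to valid_idx, every requested sample point lands on a pixel with data, so none of the draws are lost to missing values. Note that sample() treats a single number as the range 1:x, so a real implementation would need to guard the case where only one valid pixel remains.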

@dara, @jknoch and @carstenn: the official reviewer will be Daniela, but it would be great if all of you could review and test the changes. The documentation still needs to be updated, which I hope to do tomorrow based on your feedback. I am aware that reviewing this branch will take a while because we made quite a few changes.

I have tested the optimized version several times and everything ran as expected. If you want to repeat a previous run, restart R and run the function again with the same seed (by default, we now recommend always setting the seed to the current time as an integer), the same init.samples, sample_type, and models, and the same thresholds; you will get exactly the same results (this works for random_raster and random_matrix as well). For more on seeds, see issues #59 (closed) and #58 (closed).
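
The seed recommendation can be sketched in base R as follows. This is only an illustration of the pattern (take the current time as an integer seed, keep it, and reuse it to repeat the run); draw_samples is a hypothetical stand-in for a sampling step and is not a function of this package.

```r
# Hypothetical helper standing in for a sampling step; not part of the package.
draw_samples <- function(seed, n_pixels = 1000, n_samples = 10) {
  set.seed(seed)  # fix the random number generator state
  sample(seq_len(n_pixels), size = n_samples)
}

# Use the current time as the seed and store it so the run can be repeated.
seed <- as.integer(Sys.time())

first_run  <- draw_samples(seed)
second_run <- draw_samples(seed)

# Both runs use the same seed, so they select the same pixels.
identical(first_run, second_run)
```

Storing the seed alongside the other run parameters is what makes a time-based seed reproducible; without the stored value the run could not be repeated.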

Closes issues #54 (closed), #53 (closed), #55 (closed), #56 (closed), #58 (closed), #59 (closed), #61 (closed).

Edited Mar 21, 2022 by Romulo Pereira Goncalves
Source branch: classification_parallelism