Issue #62
Created Mar 21, 2022 by Romulo Pereira Goncalves (@romulo, Owner)

Update documentation

The following points should be covered in the documentation:

Sampling

  1. There are three sampling methods. We need to explain each of them and their optimization strategies.

    • regular_raster
    • random_raster
    • random_matrix
  2. The seed for both random_raster and random_matrix should be set to a different value in each run, for example seed=as.integer(Sys.time()), unless the user wants reproducible results. Reproducibility of results is possible in two ways:

    • At a specific step.
    • An entire classification run.
  3. If no models can be found, increasing init.samples is not always the solution. The user should also try re-sampling so that a new set of sample points is picked.
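
The seeding advice in points 2 and 3 can be illustrated with a minimal base R sketch. The helper draw_sample_points() is hypothetical and only stands in for a random sampling step; it is not part of the HabitatSampler API, and the cell and sample counts are made-up values.

```r
# Hypothetical stand-in for a random sampling step: draws n cell indices
# from a raster with n_cells cells. Not part of the HabitatSampler API.
draw_sample_points <- function(n_cells, n, seed) {
  set.seed(seed)
  sort(sample.int(n_cells, n))
}

# Exploratory run: a time-based seed gives a different sample on each run.
# Printing the seed allows the run to be reproduced later if needed.
run_seed <- as.integer(Sys.time())
message("seed used in this run: ", run_seed)
pts_explore <- draw_sample_points(n_cells = 10000, n = 50, seed = run_seed)

# Reproducible run: a fixed seed always returns the same sample points.
pts_a <- draw_sample_points(n_cells = 10000, n = 50, seed = 42L)
pts_b <- draw_sample_points(n_cells = 10000, n = 50, seed = 42L)
identical(pts_a, pts_b)  # TRUE

# Re-sampling: keep the number of points and change only the seed,
# so a new set of sample points is picked.
pts_c <- draw_sample_points(n_cells = 10000, n = 50, seed = 43L)
```

Recording the time-based seed keeps exploratory runs repeatable after the fact, which may be useful when documenting how to reproduce a specific step.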

Prediction

  1. For randomForest it is possible to set the number of trees, which should be 1/3 of the total number of predictors. For small values (below 100), the number should be odd so that the models can be used by different predict functions and remain reproducible between runs. Check the related issues for more information and for support material to add.
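
As a rough sketch of the rule above, the number of trees could be derived from the number of predictors as follows. The helper choose_ntree() is hypothetical and not part of HabitatSampler or randomForest; the rounding and the "add one if even" step are assumptions about how the rule might be applied.

```r
# Hypothetical helper: one third of the number of predictors,
# forced to an odd number when the result is below 100.
choose_ntree <- function(n_predictors) {
  ntree <- max(1L, as.integer(round(n_predictors / 3)))
  if (ntree < 100L && ntree %% 2L == 0L) {
    ntree <- ntree + 1L  # assumption: round an even count up to the next odd one
  }
  ntree
}

choose_ntree(30)   # 10 is even and below 100, so 11
choose_ntree(9)    # 3 is already odd
choose_ntree(600)  # 200 is not below 100, so it is left unchanged

# The result would then be passed as the ntree argument, for example
# randomForest::randomForest(x, y, ntree = choose_ntree(ncol(x)))
```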

Classification

  1. Add information covering issue #57

  2. Add information about issue #61 (closed)

Overall

  1. Describe the optimizations applied in the optimized_mode operation mode.
Edited Mar 21, 2022 by Romulo Pereira Goncalves