Commit 8544840d authored by Daniel Scheffler

Merge branch 'enhancement/improve_docs' into 'master'

Enhancement/improve docs

See merge request !12
parents 0c9ece4a 2ed4dfdf
Pipeline #16104 failed with stages
in 71 minutes and 11 seconds
......@@ -39,8 +39,12 @@ test_gms_preprocessing:
- pip check
# run tests
# run nosetests
- make nosetests
# create the docs
- pip install -U sphinx_rtd_theme # Read-the-docs theme for SPHINX documentation
- pip install -U sphinx-autodoc-typehints
- make docs
......@@ -135,6 +139,7 @@ pages: # this job must be called 'pages' to advise GitLab to upload content to
expire_in: 10 days
- master
- enhancement/improve_docs
......@@ -78,10 +78,10 @@ nosetests: clean-test ## Runs nosetests with coverage, xUnit and nose-html-outpu
docs: ## generate Sphinx HTML documentation, including API docs
rm -f docs/gms_preprocessing.rst
rm -f docs/modules.rst
sphinx-apidoc -o docs/ gms_preprocessing
sphinx-apidoc -o docs/ gms_preprocessing --doc-project 'API Reference'
$(MAKE) -C docs clean
$(MAKE) -C docs html
#$(BROWSER) docs/_build/html/index.html
# $(BROWSER) docs/_build/html/index.html
servedocs: docs ## compile the docs watching for changes
watchmedo shell-command -p '*.rst' -c '$(MAKE) -C docs html' -R -D .
......@@ -118,49 +118,12 @@ This is an example:
PC = ProcessController(jobID=123456, **configuration)
PC = ProcessController(job_ID=123456, **configuration)
Possible configuration arguments can be found `here <>`__.
gms_preprocessing depends on some open source packages which are usually installed without problems by the automatic install
routine. However, for some projects, we strongly recommend resolving the dependency before the automatic installer
is run. This approach avoids problems with conflicting versions of the same software.
Using conda_, the recommended approach is:
.. code:: bash
# create virtual environment for gms_preprocessing, this is optional
conda create -c conda-forge --name gms_preprocessing python=3
conda activate gms_preprocessing
# install some dependencies that cause trouble when installed via pip
conda install -c conda-forge numpy gdal scikit-image pyproj geopandas ipython matplotlib cartopy scikit-learn=0.23.2 shapely pyhdf python-fmask holoviews
# install not pip-installable deps of arosics
conda install -c conda-forge pyfftw pykrige
# install not pip-installable deps of sicor
conda install -c conda-forge glymur pygrib cachetools pyhdf h5py pytables llvmlite numba
# install gms_preprocessing
git clone
cd gms_preprocessing
pip install .
To enable lock functionality (needed for CPU / memory / disk IO management), install redis-server_:
.. code-block:: bash
sudo apt-get install redis-server
History / Changelog
.wy-nav-content {
max-width: 1200px !important;
The goal of the gms_preprocessing Python library is to provide a fully automatic
pre-processing pipeline for spatial and spectral fusion (i.e., homogenization)
of multispectral satellite image data. Currently it is compatible with
Landsat-5, Landsat-7, Landsat-8, Sentinel-2A and Sentinel-2B.
* Free software: GNU General Public License v3 or later (GPLv3+) (`license details <>`_)
* Documentation:
* Code history: Release notes for the current and earlier versions of gms_preprocessing can be found `here <./HISTORY.rst>`_.
* OS compatibility: Linux
Feature overview
Level-1 processing:
* data import and metadata homogenization (compatibility: Landsat-5/7/8, Sentinel-2A/2B)
* equalization of acquisition- and illumination geometry
* atmospheric correction (using `SICOR <>`_)
* correction of geometric errors (using `AROSICS <>`_)
Level-2 processing:
* spatial homogenization
* spectral homogenization (using `SpecHomo <>`_)
* estimation of accuracy layers
=> application oriented analysis dataset
.. _algorithm_description:
Algorithm descriptions
......@@ -13,14 +13,14 @@
# All configuration values have a default; values that are commented out
# serve to show the default.
import sys
import os
# If extensions (or modules to document with autodoc) are in another
# directory, add these directories to sys.path here. If the directory is
# relative to the documentation root, use os.path.abspath to make it
# absolute, like shown here.
#sys.path.insert(0, os.path.abspath('.'))
import os
import sys
sys.path.insert(0, os.path.abspath('..'))
# Get the project root dir, which is the parent dir of this
cwd = os.getcwd()
......@@ -40,12 +40,22 @@ import gms_preprocessing
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.viewcode', 'sphinx.ext.todo', 'sphinxarg.ext']
extensions = [
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
# The suffix(es) of source filenames.
# You can specify multiple suffixes as a list of strings:
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
# The encoding of source files.
......@@ -56,7 +66,8 @@ master_doc = 'index'
# General information about the project.
project = u'gms_preprocessing'
copyright = u"2017, Daniel Scheffler"
copyright = u"2017-2020, Daniel Scheffler"
author = u"Daniel Scheffler"
# The version info for the project you're documenting, acts as replacement
# for |version| and |release|, also used in various other places throughout
......@@ -69,7 +80,10 @@ release = gms_preprocessing.__version__
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#language = None
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# There are two options for replacing |today|: either, you set today to
# some non-false value, then it is used:
......@@ -79,7 +93,8 @@ release = gms_preprocessing.__version__
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['_build']
# These patterns also affect html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# The reST default role (used for this markup: `text`) to use for all
# documents.
......@@ -106,17 +121,52 @@ pygments_style = 'sphinx'
# documents.
#keep_warnings = False
# Define how to document class docstrings
# 'init' uses only the __init__ docstring, 'class' uses only the class docstring and 'both' concatenates both
autoclass_content = 'both'
# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = True
# Apply custom sphinx styles (e.g., increase content width of generated docs)
def setup(app):
# Add mappings for the intersphinx extension (allows linking to the API references of other Sphinx documentations)
intersphinx_mapping = {
'geoarray': ('', None),
'python': ('', None),
# -- Options for HTML output -------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'default'
# html_theme = 'default'
html_theme = 'sphinx_rtd_theme' # The one installed via pip install sphinx_rtd_theme in the .gitlab-ci.yml
# Theme options are theme-specific and customize the look and feel of a
# theme further. For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}
html_theme_options = {
'canonical_url': '',
'analytics_id': '',
'logo_only': False,
'display_version': True,
'prev_next_buttons_location': 'bottom',
'style_external_links': False,
'vcs_pageview_mode': 'view',
# Toc options
'collapse_navigation': True,
'sticky_navigation': True,
'navigation_depth': 4,
'includehidden': True,
'titles_only': False,
'set_type_checking_flag': True # option of sphinx_autodoc_typehints extension
# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []
......@@ -202,13 +252,16 @@ latex_elements = {
# Additional stuff for the LaTeX preamble.
#'preamble': '',
# Latex figure (float) alignment
# 'figure_align': 'htbp',
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass
# [howto/manual]).
# [howto, manual, or own class]).
latex_documents = [
('index', 'gms_preprocessing.tex',
(master_doc, 'gms_preprocessing.tex',
u'gms_preprocessing Documentation',
u'Daniel Scheffler', 'manual'),
......@@ -239,9 +292,9 @@ latex_documents = [
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
('index', 'gms_preprocessing',
(master_doc, 'gms_preprocessing',
u'gms_preprocessing Documentation',
[u'Daniel Scheffler'], 1)
[author], 1)
# If true, show URL addresses after external links.
......@@ -254,9 +307,9 @@ man_pages = [
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
('index', 'gms_preprocessing',
(master_doc, 'gms_preprocessing',
u'gms_preprocessing Documentation',
u'Daniel Scheffler',
'One line description of project.',
.. include:: ../HISTORY.rst
History / Changelog
You can find the log of recent changes to the gms_preprocessing package
`here <>`__.
Welcome to gms_preprocessing's documentation!
Documentation of the gms_preprocessing package
.. todo::
This documentation is not yet complete but will be continuously updated in the future.
Your contributions are always welcome!
.. toctree::
:maxdepth: 2
:maxdepth: 4
:caption: Contents:
Source code repository <>
This section gives an overview of what is needed to run pre-processing / homogenization jobs of gms_preprocessing
on your machine.
.. include:: infrastructure/ecmwf_db.rst
.. include:: infrastructure/postgresql_db.rst
ECMWF database
The atmospheric correction implemented in gms_preprocessing (`SICOR <>`_)
uses `ECMWF <>`_ data (European Centre for Medium-Range Weather Forecasts) to model the
atmospheric state for each scene processed.
These data are ...
* either **downloaded during runtime** for the current scene to process,
* or **downloaded in batch** for specific time intervals before running gms_preprocessing.
To be able to download the data, you need to create an account for the ECMWF Web API and save a file called
`.ecmwfapirc` to your home directory that contains your access token.
See `here <>`__ for further details.
The file path of your local ECMWF database (which is automatically created by gms_preprocessing when downloading ECMWF
data) can be set with a configuration parameter of gms_preprocessing.
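
As a rough illustration of the credentials step described above, the following sketch checks whether a
`.ecmwfapirc` file exists in the home directory and can be parsed. The helper name ``find_ecmwf_credentials``
is hypothetical and not part of gms_preprocessing. That the file is a JSON object with the fields ``url``,
``key`` and ``email`` is an assumption based on the usual ECMWF Web API client convention; please check the
ECMWF documentation for the authoritative format.

```python
import json
from pathlib import Path


def find_ecmwf_credentials(home=None):
    """Return the parsed content of <home>/.ecmwfapirc, or None if it is missing or invalid.

    Hypothetical helper for illustration only; not part of gms_preprocessing.
    """
    home = Path(home) if home is not None else Path.home()
    rc_file = home / '.ecmwfapirc'

    if not rc_file.is_file():
        return None

    try:
        # Assumption: the file contains a single JSON object.
        content = json.loads(rc_file.read_text())
    except ValueError:
        return None

    return content if isinstance(content, dict) else None


if __name__ == '__main__':
    credentials = find_ecmwf_credentials()
    if credentials is None:
        print('No usable .ecmwfapirc found in the home directory.')
    else:
        # Assumed field names; adjust them if the ECMWF documentation says otherwise.
        missing = [k for k in ('url', 'key', 'email') if k not in credentials]
        print('Missing fields: %s' % (', '.join(missing) or 'none'))
```

Running such a check before starting a job may help to detect a missing access token early instead of
failing during the atmospheric correction.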
PostgreSQL metadata database
.. highlight:: shell
Using Anaconda or Miniconda (recommended)
Stable release
Using conda_ (latest version recommended), gms_preprocessing is installed as follows:
To install gms_preprocessing, run this command in your terminal:
.. code-block:: console
1. Create virtual environment for gms_preprocessing (optional but recommended):
$ pip install gms_preprocessing
.. code-block:: bash
This is the preferred method to install gms_preprocessing, as it will always install the most recent stable release.
$ conda create -c conda-forge --name gms python=3
$ conda activate gms
If you don't have `pip`_ installed, this `Python installation guide`_ can guide
you through the process.
.. _pip:
.. _Python installation guide:
2. Then install gms_preprocessing itself:
.. code-block:: bash
$ conda install -c conda-forge gms_preprocessing
This is the preferred method to install gms_preprocessing, as it always installs the most recent stable release and
automatically resolves all the dependencies.
From sources
Using pip (not recommended)
The sources for gms_preprocessing can be downloaded from the `Github repo`_.
There is also a `pip`_ installer for gms_preprocessing. However, please note that gms_preprocessing depends on some
open source packages that may cause problems when installed with pip. Therefore, we strongly recommend
resolving the following dependencies before the pip installer is run:
You can either clone the public repository:
* gdal
* geopandas
* ipython
* matplotlib
* numpy
* pyhdf
* python-fmask
* pyproj
* scikit-image
* scikit-learn=0.23.2
* shapely
* scipy
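
To see which of the dependencies listed above are still unresolved in the active environment, a small check
like the following may help. This is only a sketch: the helper ``find_missing`` is hypothetical, and the
import names assigned to each package (e.g., ``osgeo`` for gdal or ``fmask`` for python-fmask) are assumptions
that should be verified for your setup. Package versions (such as the scikit-learn pin) are not checked.

```python
import importlib.util

# Package names as listed above, mapped to the module names they are
# presumably imported under (assumption, not taken from gms_preprocessing).
PREINSTALL_DEPS = {
    'gdal': 'osgeo',
    'geopandas': 'geopandas',
    'ipython': 'IPython',
    'matplotlib': 'matplotlib',
    'numpy': 'numpy',
    'pyhdf': 'pyhdf',
    'python-fmask': 'fmask',
    'pyproj': 'pyproj',
    'scikit-image': 'skimage',
    'scikit-learn': 'sklearn',
    'shapely': 'shapely',
    'scipy': 'scipy',
}


def find_missing(deps=PREINSTALL_DEPS):
    """Return the sorted package names whose module cannot be found."""
    return sorted(pkg for pkg, module in deps.items()
                  if importlib.util.find_spec(module) is None)


if __name__ == '__main__':
    missing = find_missing()
    if missing:
        print('Please install these packages first: ' + ', '.join(missing))
    else:
        print('All listed dependencies were found.')
```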
.. code-block:: console
Then, the pip installer can be run by:
$ git clone git://
.. code-block:: bash
Or download the `tarball`_:
$ pip install gms_preprocessing
To enable lock functionality (needed for CPU / memory / disk IO management), install redis-server_:
.. code-block:: bash
sudo apt-get install redis-server
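
After installing redis-server, it can be useful to verify that the server is actually reachable. The sketch
below simply tries to open a TCP connection. The function ``redis_is_reachable`` is hypothetical and not part
of gms_preprocessing; the host ``localhost`` and the port 6379 (the Redis default) are assumptions about the
local setup.

```python
import socket


def redis_is_reachable(host='localhost', port=6379, timeout=1.0):
    """Return True if a TCP connection to the given host and port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == '__main__':
    if redis_is_reachable():
        print('A server is listening on localhost:6379.')
    else:
        print('No server reachable on localhost:6379.')
```

Note that this only confirms that something accepts connections on that port; it does not verify that the
listener is a Redis server.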
.. code-block:: console
$ curl -OL
If you don't have `pip`_ installed, this `Python installation guide`_ can guide
you through the process.
Once you have a copy of the source, you can install it with:
.. code-block:: console
.. note::
$ python install
The gms_preprocessing package has been tested with Python 3.4+. It should be fully compatible with all Python
versions from 3.4 onwards.
.. _Github repo:
.. _tarball:
.. _pip:
.. _Python installation guide:
.. _conda:
.. _redis-server:
.. include:: ../README.rst
To use gms_preprocessing in a project::
In this section you can find advice on how to use gms_preprocessing
with regard to the Python API and the command line interface.
import gms_preprocessing
Python API
gms_preprocessing command line interface
.. toctree::
:maxdepth: 4
Command line interface
At the command line, gms_preprocessing provides the **** command:
.. argparse::
:filename: ./../bin/
:func: get_gms_argparser
.. toctree::
:maxdepth: 4
Add new data manually
You can also add datasets to the local GeoMultiSens data storage which you previously downloaded on your own
(e.g., via EarthExplorer_ or the `Copernicus Open Access Hub`_).
As an example, the following code snippet imports two Landsat-7 scenes into the GeoMultiSens database:
.. code-block:: python
from gms_preprocessing.options.config import get_conn_database
from gms_preprocessing.misc.database_tools import add_externally_downloaded_data_to_GMSDB
However, this currently only works for Landsat legacy data or if the given filenames are already known in the
GeoMultiSens metadata database.
In other cases, you have to:
1. copy the provider data archives to the GeoMultiSens data storage directory (choose the proper sub-directory
corresponding to the right sensor)
2. register the new datasets in the GeoMultiSens metadata database as follows:
.. code-block:: python
from gms_preprocessing.options.config import get_conn_database
from gms_preprocessing.misc.database_tools import update_records_in_postgreSQLdb
entityids = ["LE70450322008300EDC00",
filenames = ["LE07_L1TP_045032_20081026_20160918_01_T1.tar.gz",
for eN, fN in zip(entityids, filenames):
'filename': fN,
'proc_level': 'DOWNLOADED'},
'entityid': eN
.. _EarthExplorer:
.. _`Copernicus Open Access Hub`:
.. _ref__add_new_data_to_the_database:
Add new data to the database
There are three ways to add new satellite data to the locally stored database: you can use the **WebUI**,
run the **data downloader** from the command line, or **add the data manually**.
In each case, two steps have to be carried out:
* the downloaded provider archive data need to be physically copied to the **data storage directory** on disk
* the respective metadata entries need to be added to the GeoMultiSens **metadata database**
.. hint::
Regarding the metadata entry, these conditions must be fulfilled to make GeoMultiSens recognize a dataset as properly
* the **'scenes' table** of the GeoMultiSens metadata database **must contain a corresponding entry** at all
(if the entry is not there, the database needs to be updated by the metadata crawler which has to
be done by the database administrator)
* the **'filename' column** of the respective entry in the 'scenes' table must contain a **valid filename string**
* the **'proc_status' column** of the respective entry in the 'scenes' table must at least be **'DOWNLOADED'**
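
The conditions above can be sketched as a small validation function. This is an illustration under
assumptions, not the check gms_preprocessing performs internally: ``check_scene_record`` is a hypothetical
helper that receives a 'scenes' table entry as a plain dictionary (or ``None`` if there is no entry). The
name of the status column is passed as a parameter because this text names it 'proc_status' while the update
snippet in this changeset uses 'proc_level'. Since the full order of processing levels is not given here,
only 'DOWNLOADED' is accepted by default; higher levels would have to be added by the caller.

```python
def check_scene_record(record, level_column='proc_level',
                       accepted_levels=('DOWNLOADED',)):
    """Return a list of problems found in a 'scenes' table entry (empty if none).

    Hypothetical helper; `record` is a dict representing one table row or None.
    """
    if record is None:
        return ["no entry in the 'scenes' table"]

    problems = []

    # Condition: the 'filename' column must hold a non-empty string.
    filename = record.get('filename')
    if not isinstance(filename, str) or not filename.strip():
        problems.append("'filename' is not a valid filename string")

    # Condition: the processing level must be at least 'DOWNLOADED'.
    if record.get(level_column) not in accepted_levels:
        problems.append("'%s' is not one of %s" % (level_column, list(accepted_levels)))

    return problems


if __name__ == '__main__':
    example = {'filename': 'LE07_L1TP_045032_20081026_20160918_01_T1.tar.gz',
               'proc_level': 'DOWNLOADED'}
    print(check_scene_record(example) or 'record looks fine')
```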
.. include:: ./using_the_data_downloader.rst
.. include:: ./add_new_data_manually.rst
.. argparse::
:filename: ./../bin/
:func: get_gms_argparser
.. _ref__create_new_jobs:
Create new jobs
There are multiple ways to create new jobs depending on what you have. The section below gives a brief overview.
.. note::
Only those datasets that were correctly added to the local GeoMultiSens data storage before can be used to create a
new GeoMultiSens preprocessing job (see :ref:`ref__add_new_data_to_the_database`).
Create a job from a list of filenames
The list of filenames refers to the filenames of the previously downloaded provider archive data.
.. code-block:: python