Epic: Experiment System Requirements
Scope
The Earthquake Forecast Experiment for Italy (hereafter Experiment) is designed to test the performance of state-of-the-art short-term forecasting models in a prospective fashion. The general objectives are (i) to improve our understanding of the physics and statistics of earthquake occurrence and (ii) to validate the models and their components used in Operational Earthquake Forecasting. To perform the Experiment, a new software system should be designed, whose architecture satisfies current open-source scientific standards and the CSEP philosophy. However, the design of the software architecture is not sufficient on its own to achieve the objectives of the Experiment; the architecture must be tied closely to those objectives through the system requirements.
Requirements
A high-level definition of the system requirements. We define the system requirements based on current standards of scientific code development.
Reproducible
- Must allow any independent user to obtain the exact Experiment results
Re-runnable
- The system can execute the Experiment during the testing period and after its termination
- It can be run on-demand by any user during its duration
Accessible
- The system allows the Experiment results to be easily obtained and visualized
- The Experiment can be deployed by any user with minimum (tbd) programming knowledge
Reusable
- The software system can be applied to other case studies with minimal modifications
- New features (e.g. methods) can be easily added, or existing ones modified
Replicable
- The system must allow replicability of the Experiment, i.e. using different datasets
- Unambiguously defined, such that it could be replicated by a different developer
Specifications
How the requirements will be satisfied.
Reproducibility
Results of the Experiment can be obtained exactly by any independent user
- Bookkeeping of authoritative data
- Versioning management > e.g. the DOI of the Bollettino Sismico Italiano (BSI) for each catalogue release
- Storage > every run of the experiment should store the data used (a possible run manifest is sketched after this list)
- Model source code and parameters
- Models should be standalone code containers
- Models' codes should be identifiable and citable
- The Experiment system must be decoupled from the Model code, but must record/store the details of the Model's execution.
- Experiment Source code
- Experiment source code can be obtained from an official source (e.g. Zenodo) as-is, with versioning/labeling corresponding to each publication of results
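A minimal sketch of what such a run bookkeeping record could look like, assuming a hypothetical `write_run_manifest` helper; all field names are illustrative and the actual schema is still to be defined:

```python
import json
from datetime import datetime, timezone

def write_run_manifest(path, catalog_doi, catalog_version, model_images, experiment_version):
    """Store the provenance of a single Experiment run as a JSON manifest.

    All field names are illustrative; the actual schema is to be defined.
    """
    manifest = {
        "run_timestamp": datetime.now(timezone.utc).isoformat(),
        "authoritative_data": {
            "catalog_doi": catalog_doi,          # e.g. the DOI of the BSI release used
            "catalog_version": catalog_version,
        },
        "models": model_images,                  # e.g. {"model_a": "model_a:v1.2@sha256:..."}
        "experiment_code_version": experiment_version,  # e.g. a git tag / Zenodo DOI
    }
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest
```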
Re-runnable
Experiment can be run on-demand by any user during its duration
- Experiment can be deployed and run on any machine (Linux-based only?), as long as it satisfies the computational requirements. We will take advantage of the following:
- All models will be standalone dockerized code, defined by a Dockerfile (shared responsibility between modeler/tester); a minimal sketch of invoking such a container from the orchestration layer is given after this list
- All Evaluation codes will be handled by pyCSEP
- The experiment architecture (filepath management, forecast generation, pyCSEP interfacing, evaluations, results generation and visualization) will be orchestrated with docker-compose
- How much of the experiment design can be implemented in pycsep? THE MORE THE BETTER (for reusability)
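Purely as an illustration of how the orchestration layer could call a standalone, dockerized model, here is a minimal sketch; the image name, mount points and arguments are assumptions, not an agreed container interface:

```python
import subprocess
from pathlib import Path

def run_model_container(image, input_dir, output_dir, args=()):
    """Run a standalone, dockerized model and collect its forecast files.

    The image, the mount points and the extra arguments are placeholders:
    the actual container interface is part of the Experiment design.
    """
    input_dir, output_dir = Path(input_dir).resolve(), Path(output_dir).resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{input_dir}:/input:ro",    # authoritative data, read-only
        "-v", f"{output_dir}:/forecasts",  # forecast files written here
        image, *args,
    ]
    subprocess.run(cmd, check=True)
    return sorted(output_dir.glob("*"))
```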
It will be possible to run the experiment in the future, even after its termination
- Don't step too deep into dependency hell: clearly define the system requirements for the models and the experiment:
- The dockerization of the models should aim to reduce the possibility of dependency conflicts in the future
- Use tag (+push_id) versioning to create the Experiment from supported base docker-images (e.g. FROM python:3.8, FROM r-base:4.1.3)
- Identify and pin system library versions (e.g. apt install libgdal=1.20)
- Install Python/R packages with fully specified versions through pip or R's install.packages()
- Installations from git should specify a commit/tag
Accessible
Experiment results are easy to obtain, reproduce or visualize
- Experiment results distribution should be decoupled from the Experiment code architecture (e.g. results could be generated anywhere, but officially published on demand by an authorized user)
- The experiment code should be able to generate a simple report of the results (or several variants, with different levels of detail)
- As the Experiment is run on demand, a branch/tag could be created in a git repo to display the results of each run (similar to GEFE), with results shown in README.md. Maybe also GitHub Pages?
- The Experiment should also provide results in a human-readable format (e.g. a CSV sheet with evaluation values for interested users); a rough sketch of such output is given after this list
- MAYBE: the Experiment can create HTML/Flask pages, which could be hosted by any webserver (e.g. GFZ, etc.) in the future (or in the cloud?)
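As a rough illustration of the human-readable output mentioned above, evaluation values could be dumped both as a CSV sheet and a simple Markdown report; the `write_results` helper and its column names are placeholders:

```python
import csv

def write_results(rows, csv_path, md_path):
    """Write evaluation results as a CSV sheet and a minimal Markdown report.

    `rows` is assumed to be a list of dicts such as
    {"model": "model_a", "test": "N-test", "score": 0.42}; the real schema
    will follow the Experiment's evaluation outputs.
    """
    fields = ["model", "test", "score"]
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)

    with open(md_path, "w") as f:
        f.write("# Experiment results\n\n")
        f.write("| Model | Test | Score |\n|---|---|---|\n")
        for r in rows:
            f.write(f"| {r['model']} | {r['test']} | {r['score']} |\n")
```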
The Experiment can be deployed by any user with minimum (tbd) programming knowledge
- The Experiment architecture can be downloaded in full from an official source (e.g. Zenodo).
- The Experiment itself will be a docker-compose application, able to acquire the authoritative data, run the models, create the forecasts and evaluate them with a couple of lines of code.
- The code should be designed in favor of clarity and simplicity, rather than efficiency.
- Documentation is key
Reusable
Code can be applied to other case studies with minimal modifications
- Decoupling the Experiment System from the Experiment itself: for this reason, I am in favor of designing the Italy experiment in a generic way, with most of the code implemented in pyCSEP >> For example, a Model class that interfaces with a docker container (code, virtual environment, etc.) and generates the required forecast files, and an Experiment class that handles the Models, authoritative data, evaluations, bookkeeping, etc. An Experiment object can interface with a Model by specifying the Model's format (i.e. model type, filepath structure/storage), in the same way as a Catalog format. This probably implies that the system must be designed independently of the datasets themselves. A minimal class sketch is given below.
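A minimal sketch of the decoupling proposed above, assuming hypothetical `Model` and `Experiment` classes; names and interfaces are placeholders and would ultimately live in pyCSEP:

```python
class Model:
    """Wraps a single forecasting model, whatever its packaging (docker, venv, local code)."""

    def __init__(self, name, fmt, workdir):
        self.name = name          # e.g. "etas_italy" (illustrative)
        self.fmt = fmt            # Model format: type, filepath structure, storage
        self.workdir = workdir

    def create_forecast(self, start, end):
        """Generate the forecast files for a time window (delegates to the container/env)."""
        raise NotImplementedError


class Experiment:
    """Handles the Models, authoritative data, evaluations and bookkeeping."""

    def __init__(self, catalog_source, models, evaluations):
        self.catalog_source = catalog_source
        self.models = models              # list of Model objects
        self.evaluations = evaluations    # evaluation callables, e.g. from pyCSEP

    def run(self, start, end):
        results = {}
        for model in self.models:
            forecast = model.create_forecast(start, end)
            results[model.name] = [ev(forecast, self.catalog_source) for ev in self.evaluations]
        return results
```

This keeps the Experiment agnostic of how each Model is packaged, which is what the Reusable requirement asks for.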
New methods can be easily added or modified
- This goes hand in hand with pyCSEP, where new evaluation methods can be implemented. If we choose to include a Model class, it should be able to handle different types of Model abstractions (virtual environments, local-machine models, Docker containers, etc.)
Replicable
The Experiment can be replicated with a different dataset
- For instance, a different catalog
Unambiguously defined, such that it could be replicated by a different developer
- Documenting both the code and clearly stating the steps in the manuscript.
- Perhaps much of the experiment creation can be part of the pyCSEP wiki/documentation.
Tasks
There are at least three main areas that need to be developed for this experiment
Experiment Design
- Devise documenting strategy
- Design of architecture (e.g. identifying components and relations) > to be specified in this issue
- Design of features > to be specified in this issue
- Glossary?
- Code scaffolding
- Feature implementations. So far:
  - Model interfacing and management
  - Bookkeeping and Serialization: clear inventory of the information to be stored and how it will be stored.
  - Evaluation: most heavy-lifting already done in GEFE. Redesign the workflow for evaluations (e.g. tracking Model status (are forecasts already created? are evaluations already calculated?), with emphasis on time-dependence, etc.)
  - Results visualization: re-design GEFE MarkdownReports
Models preparation
- Dockerize all models, specify libraries, test code execution.
- Devise a template/instructions for modelers to handle their repos, and then Dockerize their models
- Design interface with Experiment
pyCSEP improving
- Determine which features will be included in pycsep nightly builds
- Implement ISIDE catalog web API accessing (a possible access sketch is shown below)
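As a starting point for the ISIDE access task (assuming the ISIDe catalog remains reachable through INGV's FDSN event web service; the endpoint and parameters should be double-checked), a sketch of the query could look like:

```python
import requests

INGV_FDSN_EVENT = "https://webservices.ingv.it/fdsnws/event/1/query"  # assumed endpoint

def query_iside(start, end, min_magnitude=2.5):
    """Fetch events from the ISIDe/INGV FDSN event service as plain text rows.

    Parameters follow the FDSN event standard; parsing into a pyCSEP
    catalog object is left to the (future) pyCSEP reader.
    """
    params = {
        "starttime": start,          # e.g. "2022-01-01T00:00:00"
        "endtime": end,
        "minmagnitude": min_magnitude,
        "format": "text",            # pipe-separated rows
    }
    response = requests.get(INGV_FDSN_EVENT, params=params, timeout=60)
    response.raise_for_status()
    return response.text.splitlines()
```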
Use cases
Examples of expected uses of the system
- A. Visualize results of the experiment during its testing time
Overview: A researcher working in OEF wants to find out the performance of the competing models one year after the experiment started. Pre-reqs: (i) The Experiment must have been run on demand at least once. (ii) Results' figures must have been published/uploaded to the web. Success: The researcher accesses the Zenodo repo of the Experiment, whose latest version points to a tagged commit on GitHub. From the README.md or the files within, the researcher obtains a high-quality, citable figure and the corresponding DOI, which can either be used internally in their organization or shown wherever it is required. The figure could also be obtained through the INGV or CSEP website, but that is out of the scope of the project.
- B. Obtain results of the experiment during its testing time
Overview: A researcher working in short-term hazard wants to retrieve the results' values of model performance (e.g. log-likelihoods) one year after the experiment started, in order to weight branches of logic trees or perform Bayesian updating. Pre-reqs: Same as before + results published in a human-readable format. Success: The researcher accesses the git repo tag for the last run and obtains there a CSV file containing the up-to-date experiment results, e.g. time series of evaluation metrics for the competing models, with all the information needed for citation.
- C. Reuse the code to perform a different experiment
Overview: A researcher working in Chile wants to design an experiment to evaluate their OEF models. In their seismological service, modeling time-dependent seismicity is relatively new, and they have no experience in performing experiments. Pre-reqs: (i) Source code completely available. (ii) Documentation for experiment implementation is available, at both the scientific and programming levels. (iii) An example mockup model is available. Success: (i) Feasibility check: the researcher downloads the source code from Zenodo, replaces the catalog and region data and the experiment specifications (e.g. time window), and tests a mockup model in the region. (ii) The researcher modifies a model of their own (e.g. a simple ETAS) to match our Model Template and runs the experiment pseudo-prospectively. (iii) The researcher builds a prospective experiment for Chile, following most of this experiment's guidelines, and uploads the source code to Zenodo.
- D. Reproduce results
Overview: Someone wants to reproduce the exact results of the experiment to date. It could be to assess the quality of the experiment, re-assess evaluation metrics, or test their new models pseudo-prospectively against the gold-standard models and the metrics from this experiment. Pre-reqs: (i) Source code available. (ii) Results available online, along with the authoritative dataset required to reproduce them. (iii) Experiment completely defined at each versioned run (e.g. RUN_2022_3, RUN_2023_3, etc.), such that the results can be reproduced by running the source code. Success: The researcher is able to run the Experiment in its totality. If there is a difference in results, it can be easily isolated (ambiguous model, catalog re-evaluation, etc.).