Investigate the architecture needed to optimize the database design, mainly for storing per-dataset attributes with varying data types
Part 1 - Test infrastructure setup
Create a setup for testing the various possible infrastructure designs.
Part 1.1. Setup monitoring infrastructure
To test and scale the candidate designs for this problem, the following will be explored; components that prove useful in initial tests can be integrated later.
- set up Docker-based infrastructure
- add PostgreSQL logging extensions
- Grafana + Prometheus for monitoring load (example setup in OSM)
- parametrized tests (1M entries, 10M, 100M, ... driven by Python scripts)
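The parametrized tests could be driven by a small generator like the sketch below. It yields synthetic asset rows with a JSON attribute blob at configurable scales; the schema, attribute keys, and scale constants are illustrative placeholders, not the real exposure schema.

```python
import json
import random


def make_entries(n, seed=0):
    """Generate n synthetic asset rows with a JSON attribute blob.

    Column and attribute names here are placeholders for illustration.
    """
    rng = random.Random(seed)
    for i in range(n):
        yield {
            "id": i,
            "taxonomy": rng.choice(["RC", "MUR", "W", "S"]),
            "attributes": json.dumps({
                "storeys": rng.randint(1, 30),
                "occupancy": rng.choice(["RES", "COM", "IND"]),
            }),
        }


# Parametrized scales for the load tests (trim for a smoke run).
SCALES = [1_000_000, 10_000_000, 100_000_000]

if __name__ == "__main__":
    sample = list(make_entries(5))
    print(len(sample), sample[0]["taxonomy"])
```

Fixing the seed keeps runs reproducible across the different infrastructure setups being compared.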
Part 1.2. Import database and test initial changes
- minimal changes to field types
- partitioning
- random separation
- openstreetmap_H3
- Citus for horizontal scaling
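For the partitioning experiments, the DDL for PostgreSQL declarative hash partitioning could be generated per test scale, along the lines of this sketch (table and column names are hypothetical, not the real schema):

```python
def partition_ddl(table, key, n_partitions):
    """Build PostgreSQL declarative hash-partitioning DDL.

    The column list is a placeholder; the real tables will differ.
    """
    stmts = [
        f"CREATE TABLE {table} (id bigint, dataset_id int, "
        f"attributes jsonb) PARTITION BY HASH ({key});"
    ]
    for i in range(n_partitions):
        stmts.append(
            f"CREATE TABLE {table}_p{i} PARTITION OF {table} "
            f"FOR VALUES WITH (MODULUS {n_partitions}, REMAINDER {i});"
        )
    return stmts


if __name__ == "__main__":
    for stmt in partition_ddl("asset", "dataset_id", 4):
        print(stmt)
```

Hash partitioning on a dataset key would also approximate the "random separation" variant; range or list partitioning only needs a different `PARTITION BY` clause and bounds.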
Part 2 - Database design
We need to store varying pieces of information, with different data types, per entity, asset, etc. The current proposal is to add a JSON
column per dataset to the respective tables and to store all varying information in that field. Other solutions should also be explored.
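A minimal sketch of the JSON-column approach, using Python's built-in `sqlite3` (SpatiaLite is built on SQLite, so SQLite's JSON1 functions approximate what a SpatiaLite deployment would offer; PostgreSQL/PostGIS would use `jsonb` and its `->`/`->>` operators instead, which is exactly the compatibility question raised below). The table and attribute keys are illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE asset (id INTEGER PRIMARY KEY, attributes TEXT)")
conn.executemany(
    "INSERT INTO asset (id, attributes) VALUES (?, ?)",
    [
        (1, json.dumps({"storeys": 4, "occupancy": "RES"})),
        (2, json.dumps({"storeys": 12, "occupancy": "COM"})),
    ],
)

# Search inside the JSON blob; requires SQLite built with the JSON1 functions.
rows = conn.execute(
    "SELECT id FROM asset "
    "WHERE json_extract(attributes, '$.occupancy') = 'RES'"
).fetchall()
print(rows)
```

Note that the equivalent PostgreSQL predicate would be `attributes ->> 'occupancy' = 'RES'`, so queries against the JSON field would need per-database variants (or a translation layer) unless restricted to a shared subset.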
The necessary conditions to agree on one solution are:
- Full compatibility of SQL queries between PostGIS and SpatiaLite; otherwise exposure-lib will balloon in size if every query has to be implemented separately for the two databases.
- Searching in the attributes should be fast.
- Inserting, updating, and deleting parts of the attributes should be fast.
- The difference in database size between the scenarios (tag table vs. JSON) should be estimated.
\rfc @all
Edited by Doren Calliku