Investigate the architecture needed to optimize the database design, mainly for storing per-dataset attributes with varying data types
Part 1 - Test infrastructure setup
Create a setup for testing the various possible infrastructure designs.
Part 1.1. Setup monitoring infrastructure
To test and scale the candidate designs for this problem, the following will be explored; components that prove useful in initial tests can be integrated later.
- set up Docker-based infrastructure
- add PostgreSQL logging extensions
- Grafana + Prometheus for monitoring load (example setup in OSM)
- parametrized tests (1M entries, 10M, 100M, ... driven by Python scripts)
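The parametrized tests could be driven by a small generator like the sketch below. It yields synthetic asset rows with a JSON attribute blob at configurable scales; the schema, attribute keys, and scale constants are illustrative placeholders, not the real exposure schema.

```python
import json
import random


def make_entries(n, seed=0):
    """Generate n synthetic asset rows with a JSON attribute blob.

    Column and attribute names here are placeholders for illustration.
    """
    rng = random.Random(seed)
    for i in range(n):
        yield {
            "id": i,
            "taxonomy": rng.choice(["RC", "MUR", "W", "S"]),
            "attributes": json.dumps({
                "storeys": rng.randint(1, 30),
                "occupancy": rng.choice(["RES", "COM", "IND"]),
            }),
        }


# Parametrized scales for the load tests (trim for a smoke run).
SCALES = [1_000_000, 10_000_000, 100_000_000]

if __name__ == "__main__":
    sample = list(make_entries(5))
    print(len(sample), sample[0]["taxonomy"])
```

Fixing the seed keeps runs reproducible across the different infrastructure setups being compared.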
Part 1.2. Import database and test initial changes
- minimal changes to field types
- partitioning
- random separation
- openstreetmap_H3
- Citus for horizontal scaling
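For the partitioning experiments, the DDL for PostgreSQL declarative hash partitioning could be generated per test scale, along the lines of this sketch (table and column names are hypothetical, not the real schema):

```python
def partition_ddl(table, key, n_partitions):
    """Build PostgreSQL declarative hash-partitioning DDL.

    The column list is a placeholder; the real tables will differ.
    """
    stmts = [
        f"CREATE TABLE {table} (id bigint, dataset_id int, "
        f"attributes jsonb) PARTITION BY HASH ({key});"
    ]
    for i in range(n_partitions):
        stmts.append(
            f"CREATE TABLE {table}_p{i} PARTITION OF {table} "
            f"FOR VALUES WITH (MODULUS {n_partitions}, REMAINDER {i});"
        )
    return stmts


if __name__ == "__main__":
    for stmt in partition_ddl("asset", "dataset_id", 4):
        print(stmt)
```

Hash partitioning on a dataset key would also approximate the "random separation" variant; range or list partitioning only needs a different `PARTITION BY` clause and bounds.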
Part 2 - Database design
We need to store varying pieces of information, with different data types, per entity, asset, etc. The current proposal is to add a JSON
column per dataset to the respective tables and to store all varying information in that field. Other solutions should also be explored.
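A minimal sketch of the JSON-column approach, using Python's built-in `sqlite3` (SpatiaLite is built on SQLite, so SQLite's JSON1 functions approximate what a SpatiaLite deployment would offer; PostgreSQL/PostGIS would use `jsonb` and its `->`/`->>` operators instead, which is exactly the compatibility question raised below). The table and attribute keys are illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE asset (id INTEGER PRIMARY KEY, attributes TEXT)")
conn.executemany(
    "INSERT INTO asset (id, attributes) VALUES (?, ?)",
    [
        (1, json.dumps({"storeys": 4, "occupancy": "RES"})),
        (2, json.dumps({"storeys": 12, "occupancy": "COM"})),
    ],
)

# Search inside the JSON blob; requires SQLite built with the JSON1 functions.
rows = conn.execute(
    "SELECT id FROM asset "
    "WHERE json_extract(attributes, '$.occupancy') = 'RES'"
).fetchall()
print(rows)
```

Note that the equivalent PostgreSQL predicate would be `attributes ->> 'occupancy' = 'RES'`, so queries against the JSON field would need per-database variants (or a translation layer) unless restricted to a shared subset.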
The necessary conditions to agree on one solution are:
- Full compatibility of SQL queries between PostGIS and SpatiaLite; otherwise exposure-lib will balloon in size if every query has to be implemented separately for the two databases.
- Searching in the attributes should be fast.
- Inserting, updating, and deleting parts of the attributes should be fast.
- The difference in database size between the scenarios (tag table vs. JSON) should be estimated.
\rfc @all
Edited by Doren Calliku