Skip to content
GitLab
Menu
Projects
Groups
Snippets
/
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
Menu
Open sidebar
Dynamic Exposure
OpenBuildingMap
quadtreegrid
Commits
eff87c88
Commit
eff87c88
authored
Jun 10, 2021
by
Nicolas Garcia Ospina
Browse files
Included grid generator scripts
parent
de53b79c
Pipeline
#28250
passed with stage
in 3 minutes and 32 seconds
Changes
5
Pipelines
1
Hide whitespace changes
Inline
Side-by-side
.gitlab-ci.yml
View file @
eff87c88
...
...
@@ -9,6 +9,8 @@ cache:
-
venv/
before_script
:
-
apt-get update -y
-
apt-get install -y python3-pip proj-bin libgeos-dev gdal-bin libgdal-dev
-
python3 -V
-
pip3 install virtualenv
-
virtualenv venv
...
...
README.md
View file @
eff87c88
# quadtreegrid
This repository contains code related to the creation and management of quadtree grids.
\ No newline at end of file
This repository contains code related to the creation and writing of quadtree grids.
# quadtreegrid.py examples
The
`quadtreegrid.py`
script allows the creation of a grid with all tiles that contain
a bounding box.
To launch this script, an input extent is always required, it may be inserted manually,
or it can be extracted from a shapefile.
## -in_bounds
For instance, a specific bounding box is introduced with the -in_bounds argument followed
by the West,South,East,North limits separated with single spaces:
```
python quadtreegrid.py -in_bounds 1 1 5 5
```
This would create a file called
`output_quadkeys.txt`
with all the
`zoom level 18`
tiles that
contain that bounding box.
## -in_file
The bounding box can be also extracted from a
`shapefile`
for example, this can be done with the
argument
`-in_file`
followed by the relative or absolute path of the target file. For example:
```
python quadtreegrid.py -in_file /path/to/my_shapefile.shp
```
This would create a file called
`output_quadkeys.txt`
with all the
`zoom level 18`
tiles that
contain the bounding box of the entire shapefile.
## Optional arguments
Two optional arguments to include are: 1)
`-output`
which allows to set a filename or filepath to
the resulting text file. If the resulting filename has the
`csv`
extension, a header with the name
`quadkey`
will be included. The default for this is
`output_quadkeys.txt`
. 2) The
`-zoom`
argument
sets the desired zoom level of the output tiles. Its default value is
`18`
.
```
python quadtreegrid.py -in_bounds 1 1 5 5 -output ./custom_filepath.csv -zoom 9
```
This would create a file called
`custom_filepath.csv`
with all the
`zoom level 9`
tiles that
contain the input bounding box and the header
`quadkey`
.
# quadtreegrid_polyfill.py examples
The
`quadtreegrid_polyfill.py`
script allows for the identification of the quadkeys of all tiles that
contain an irregular polygon or multipolygon buffered by 0.05 deg, this way tiles in the borders
won't be accidentally skipped. It is an optimization of the
`Babel("bing").polyfill()`
function from
[
Babelgrid
](
https://pypi.org/project/babelgrid/
)
. The current version of this script uses the country
borders from
[
GADM 3.6
](
https://gadm.org/
)
dataset. This dataset must be downloaded separately from
[
this_link
](
https://biogeo.ucdavis.edu/data/gadm3.6/gadm36_levels_shp.zip
)
, and the shapefile
`gadm36_0.shp`
must be placed in the
`data`
directory of this repository (
`quadtreegrid/data`
). Currently, this is hard
coded but may become more flexible in a future version.
## -country
To launch this script, an input territory is always required, it is inserted manually with the
`-country`
argument followed by the name of the country in english. Whenever the country name has more than one
word, they must be separated by an underscore
`_`
. Some examples may be:
```
python quadtreegrid_polyfill.py -country Italy
```
```
python quadtreegrid_polyfill.py -country Puerto_Rico
```
```
python quadtreegrid_polyfill.py -country Democratic_Republic_of_the_Congo
```
## Optional arguments
The
`-zoom`
and
`-output`
arguments work in the same way as in
`quadtreegrid.py`
. One characteristic of
the
`polyfill()`
function from
`Babelgrid`
is that it cannot allocate quadkeys if their size is larger than
a single polygon (e.g. a very small country, or an island with low zoom level quadkeys). This is partly solved
with the buffer applied to all polygons and using an adequate zoom level. An example that is likely to fail is:
```
python quadtreegrid_polyfill.py -country Vatican_City -zoom 8
```
To optimize the memory usage, and the program speed, a
`multiprocessing`
framework has been incorporated.
For it, the target geometry is gridded to produce smaller batches to be processed, this is done with the
`-max_batches`
argument, whose default value is 100. If the amount of batches is increased, the processing
speed for each one decreases. Keep in mind that their size also depends on the size of the original geometry
(country).
The other argument that controls the
`multiprocessing`
behaviour is
`-n_cores`
, with it the amount of parallel
jobs to be executed is established. This depends on the CPU thread count. The default value is 1, meaning that
it will not parallelize the execution. A complete setup to launch this program could be the following:
```
python quadtreegrid_polyfill.py -country Italy -max_batches 10000 -n_cores 12 -output Italy_quadkeys.txt
```
This will launch a program to find the zoom level 18 quadkeys in Italy with 12 parallel jobs where 10000
batches will be distributed and finally write a text file called
`Italy_quadkeys.txt`
.
quadtreegrid/quadtreegrid.py
View file @
eff87c88
...
...
@@ -19,18 +19,78 @@
import
logging
import
sys
import
geopandas
as
gpd
import
mercantile
import
argparse
# Add a logger printing error, warning, info and debug messages to the screen
logger
=
logging
.
getLogger
()
logger
.
setLevel
(
logging
.
DEBUG
)
logger
.
addHandler
(
logging
.
StreamHandler
(
sys
.
stdout
))
logger
.
info
(
"Launching quadtreegrid"
)
parser
=
argparse
.
ArgumentParser
(
description
=
"""Get all quadkeys from given bounds
or get the bounds of a vector file"""
)
parser
.
add_argument
(
"-in_bounds"
,
type
=
float
,
nargs
=
"+"
,
help
=
"Bounds in order West South East North, separated by spaces"
,
)
parser
.
add_argument
(
"-in_file"
,
type
=
str
,
help
=
"Shapefile with a given geometry to extract a bounding box"
)
parser
.
add_argument
(
"-zoom"
,
nargs
=
"?"
,
const
=
1
,
type
=
int
,
default
=
18
,
help
=
"Set zoom level of output tiles, default: 18"
,
)
parser
.
add_argument
(
"-output"
,
type
=
str
,
nargs
=
"?"
,
const
=
1
,
default
=
"./output_quadkeys.txt"
,
help
=
"""Set the output file path and extension as
text or CSV file path, default: ./output_quadkeys.txt"""
,
)
args
=
parser
.
parse_args
()
def
main
():
if
args
.
in_bounds
:
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
=
args
.
in_bounds
assert
sw_lon
<
ne_lon
,
"West longitude is greater than east longitude"
assert
sw_lat
<
ne_lat
,
"South latitude is greater than north latitude"
elif
args
.
in_file
:
outline
=
gpd
.
read_file
(
args
.
in_file
).
to_crs
(
"epsg:4326"
)
geometry
=
outline
.
envelope
.
values
[
0
]
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
=
geometry
.
bounds
else
:
raise
ValueError
(
"No input geometry found"
)
# Example logging output
logger
.
info
(
"quadtreegrid has started"
)
logger
.
info
(
"Zoom level {} quadkeys will be created for the bounding "
"box {} {} {} {}."
.
format
(
args
.
zoom
,
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
)
)
tiles_generator
=
mercantile
.
tiles
(
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
,
args
.
zoom
)
logger
.
info
(
"Writing to {}"
.
format
(
args
.
output
))
outfile
=
open
(
args
.
output
,
"w"
)
if
args
.
output
[
-
3
:]
==
"csv"
:
outfile
.
write
(
"quadkey
\n
"
)
for
tile
in
tiles_generator
:
outfile
.
write
(
mercantile
.
quadkey
(
tile
)
+
"
\n
"
)
outfile
.
close
()
# Example logging output
logger
.
info
(
"quadtreegrid is closing"
)
# Leave the program
sys
.
exit
()
...
...
quadtreegrid/quadtreegrid_polyfill.py
0 → 100644
View file @
eff87c88
#!/usr/bin/env python3
# Copyright (c) 2021:
# Helmholtz-Zentrum Potsdam Deutsches GeoForschungsZentrum GFZ
#
# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or (at
# your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero
# General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see http://www.gnu.org/licenses/.
import
logging
import
sys
import
geopandas
as
gpd
import
argparse
import
mercantile
from
multiprocessing
import
Pool
import
numpy
as
np
from
shapely.geometry
import
Polygon
import
os
logger
=
logging
.
getLogger
()
logger
.
setLevel
(
logging
.
INFO
)
logger
.
addHandler
(
logging
.
StreamHandler
(
sys
.
stdout
))
logger
.
info
(
"Launching quadtreegrid"
)
parser
=
argparse
.
ArgumentParser
(
description
=
"""Get all quadkeys from given polygon shapefile"""
)
parser
.
add_argument
(
"-country"
,
type
=
str
,
help
=
"Country name in GADM 3.6"
)
parser
.
add_argument
(
"-zoom"
,
nargs
=
"?"
,
const
=
1
,
type
=
int
,
default
=
18
,
help
=
"Set zoom level of output tiles, default: 18"
,
)
parser
.
add_argument
(
"-max_batches"
,
nargs
=
"?"
,
const
=
1
,
type
=
int
,
default
=
100
,
help
=
"""Set the maximum amount of batches for multiprocessing"""
,
)
parser
.
add_argument
(
"-n_cores"
,
nargs
=
"?"
,
const
=
1
,
type
=
int
,
default
=
1
,
help
=
"""Set the maximum amount of jobs to spawn in multiprocessing"""
,
)
parser
.
add_argument
(
"-output"
,
type
=
str
,
nargs
=
"?"
,
const
=
1
,
default
=
"./output_quadkeys.txt"
,
help
=
"""Set the output file path and extension as
text or CSV file path, default: ./output_quadkeys.txt"""
,
)
args
=
parser
.
parse_args
()
# Get a country polygon by name from the GADM dataset and do small buffer
country_name
=
args
.
country
.
replace
(
"_"
,
" "
)
world
=
gpd
.
read_file
(
os
.
path
.
join
(
".."
,
os
.
path
.
dirname
(
__file__
),
"data"
,
"gadm36_0.shp"
)
).
to_crs
(
"epsg:4326"
)
logger
.
info
(
country_name
)
country
=
world
[
world
[
"NAME_0"
]
==
country_name
]
geom
=
country
.
unary_union
del
world
country
=
gpd
.
GeoDataFrame
(
crs
=
"epsg:4326"
,
geometry
=
[
geom
])
# Build the batches from griding the input polygon
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
=
country
.
geometry
.
values
[
0
].
bounds
logger
.
info
(
"Creating batches"
)
width
=
abs
(
ne_lon
-
sw_lon
)
/
np
.
sqrt
(
args
.
max_batches
)
height
=
abs
(
ne_lat
-
sw_lat
)
/
np
.
sqrt
(
args
.
max_batches
)
cols
=
list
(
np
.
arange
(
int
(
np
.
floor
(
sw_lon
))
-
0.5
,
int
(
np
.
ceil
(
ne_lon
))
+
0.5
,
width
))
rows
=
list
(
np
.
arange
(
int
(
np
.
floor
(
sw_lat
))
-
0.5
,
int
(
np
.
ceil
(
ne_lat
))
+
0.5
,
height
))
rows
.
reverse
()
polygons
=
[]
for
x
in
cols
:
for
y
in
rows
:
polygons
.
append
(
Polygon
(
[
(
x
,
y
),
(
x
+
width
,
y
),
(
x
+
width
,
y
-
height
),
(
x
,
y
-
height
),
]
)
)
grid
=
gpd
.
GeoDataFrame
({
"geometry"
:
polygons
},
crs
=
"epsg:4326"
)
gridded_polygon
=
gpd
.
overlay
(
grid
,
country
,
how
=
"intersection"
)
batch_list
=
gridded_polygon
.
geometry
.
tolist
()
logger
.
info
(
"{} batches to be processed with {} parallel jobs"
.
format
(
len
(
batch_list
),
args
.
n_cores
)
)
def
get_tiles_from_polygon
(
in_geometry
):
"""Wrapper function that extracts quadkeys inside a given polygon.
Args:
in_geometry (shapely.geometry.polygon.Polygon): Polygon to fill with quadkeys.
Returns:
tiles_ids (list): List of quadkeys as strings.
"""
in_geometry
=
in_geometry
.
buffer
(
0.01
)
tiles
=
[]
if
in_geometry
.
geom_type
==
"Polygon"
:
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
=
in_geometry
.
bounds
tiles_generator
=
mercantile
.
tiles
(
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
,
args
.
zoom
)
tiles
.
extend
([
mercantile
.
quadkey
(
tile
)
for
tile
in
tiles_generator
])
elif
in_geometry
.
geom_type
==
"MultiPolygon"
:
geometries
=
list
(
in_geometry
)
for
geometry
in
geometries
:
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
=
geometry
.
bounds
tiles_generator
=
mercantile
.
tiles
(
sw_lon
,
sw_lat
,
ne_lon
,
ne_lat
,
args
.
zoom
)
tiles
.
extend
([
mercantile
.
quadkey
(
tile
)
for
tile
in
tiles_generator
])
else
:
raise
IOError
(
"Input is not a valid geometry."
)
if
len
(
tiles
)
>
0
:
logger
.
info
(
"Found {} tiles in batch"
.
format
(
len
(
tiles
)))
return
tiles
else
:
logger
.
info
(
"No quadkeys found in batch"
)
return
tiles
def
main
():
logger
.
info
(
"create pool"
)
p
=
Pool
(
processes
=
args
.
n_cores
)
logger
.
info
(
"execute pool"
)
data
=
p
.
map
(
get_tiles_from_polygon
,
batch_list
)
logger
.
info
(
"closing pool"
)
p
.
close
()
p
.
join
()
# Remove possible duplicated quadkeys
flat_list_ids
=
list
(
set
([
item
for
sublist
in
data
for
item
in
sublist
]))
if
len
(
flat_list_ids
)
>
0
:
logger
.
info
(
"Writing {} quadkeys to {}"
.
format
(
len
(
flat_list_ids
),
args
.
output
))
outfile
=
open
(
args
.
output
,
"w"
)
if
args
.
output
[
-
3
:]
==
"csv"
:
outfile
.
write
(
"quadkey
\n
"
)
for
element
in
flat_list_ids
:
outfile
.
write
(
element
+
"
\n
"
)
outfile
.
close
()
else
:
logger
.
info
(
"No output file will be produced"
)
if
__name__
==
"__main__"
:
main
()
setup.py
View file @
eff87c88
...
...
@@ -8,10 +8,18 @@ linters_require = ["black>=20.8b1", "pylint", "flake8"]
setup
(
name
=
"quadtreegrid"
,
version
=
"0.0.1"
,
description
=
"Quadtree grid creation
for OpenBuildingMap
"
,
description
=
"Quadtree grid creation"
,
keywords
=
"OpenBuildingMap, quadtree, grid"
,
author
=
"Helmholtz-Zentrum Potsdam Deutsches GeoForschungsZentrum GFZ"
,
install_requires
=
[
"numpy"
],
install_requires
=
[
"pyproj==3.0.0.post1"
,
"rtree"
,
"shapely"
,
"mercantile"
,
"geopandas"
,
"numpy"
,
"babelgrid"
,
],
extras_require
=
{
"tests"
:
tests_require
,
"linters"
:
linters_require
,
...
...
Write
Preview
Supports
Markdown
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment