Commit 5191447d authored by Maximilian Schanner

Add seeds to data for reproducibility.

parent 8441a04d
# This file was produced using the notebook Gen_Data.ipynb with the following parameters:
-# real_locs=False, n_points=412, r_inc=0.00, noise=False, ddec=4.50, dinc=4.50, dint=8250, kappa=650.0
+# real_locs=False, n_points=412, r_inc=0.00, noise=False, ddec=4.50, dinc=4.50, dint=8250, kappa=650.0, seed=161
,co-lat,lat,lon,rad,t,dt,D,I,F,dD,dI,dF
0,171.11206237329037,-81.11206237329037,49.93169867784993,6371.200000000001,2020,0,-68.29998556773754,-69.33365904373535,51029.559307795294,4.5,4.5,8250
1,77.24438250477797,12.755617495222026,-43.03162472501459,6371.199999999999,2020,0,-16.79784030155171,17.699876296870524,31343.977656929,4.5,4.5,8250
......
# This file was produced using the notebook Gen_Data.ipynb with the following parameters:
-# real_locs=True, n_points=412, r_inc=0.80, noise=True, ddec=4.50, dinc=4.50, dint=8250, kappa=650.0
+# real_locs=True, n_points=412, r_inc=0.80, noise=True, ddec=4.50, dinc=4.50, dint=8250, kappa=650.0, seed=161
,co-lat,lat,lon,rad,t,dt,D,I,F,dD,dI,dF
0,36.13,53.87,10.81,6371.2,2020,0,10.289285779595852,71.21679961664647,,4.5,4.5,8250
1,36.13,53.87,10.81,6371.2,2020,0,9.18034727167325,66.83457645580954,38487.70788018437,4.5,4.5,8250
......
%% Cell type:markdown id: tags:
 
# Synthetic tests
The purpose of this notebook is to show some synthetic tests for the `CORBASS` algorithm. Synthetic data are generated using the notebook `Gen_Data.ipynb`. First, some imports:
 
%% Cell type:code id: tags:
 
``` python
# Imports
import sys
import os
# relative import
sys.path.append(os.path.abspath('') + '/../')
 
import numpy as np
import pandas as pd
 
from matplotlib import pyplot as plt, colors, cm
from cartopy import crs as ccrs
 
from scipy.stats import gaussian_kde
 
import pyfield
from corbass.inversion import Inversion
from dip_lin_inversion import Dip_Lin_Inversion
 
glob_proj = ccrs.Mollweide(central_longitude=0)
# a handy plotting function
def plot_field(lat, lon, field, names=None, proj=glob_proj, cbar=True, cmap='RdBu',
               vmin=None, vmax=None, symm=False, cbarlabel=r'$\mu$T'):
    fig, ax = plt.subplots(1, 3, figsize=(17, 10), subplot_kw={'projection': proj})
    bnds = ax[0].get_position().bounds
    scl = bnds[3]
    spc = 0.2*scl
    cbar_hght = 0.07*scl
    # place the colorbar label to the left of the colorbars
    if cbar and cbarlabel:
        fig.text(bnds[0]-0.1*spc, bnds[1]+spc-0.5*cbar_hght, cbarlabel,
                 va='center', ha='right')
    # one panel per field component
    for it in range(3):
        bnds = ax[it].get_position().bounds
        ax[it].tripcolor(lat, lon, field[it::3], cmap=cmap)
        ax[it].coastlines(zorder=4)
        if names:
            ax[it].set_title(names[it])
        if cbar:
            if vmin is not None:
                _vmin = vmin
            else:
                _vmin = min(field[it::3])
            if vmax is not None:
                _vmax = vmax
            else:
                _vmax = max(field[it::3])
 
            # symmetrize the color range around zero
            if symm:
                _vmax = max(abs(_vmax), abs(_vmin))
                _vmin = -_vmax
 
            # horizontal colorbar below each panel
            colax = fig.add_axes([bnds[0],
                                  bnds[1]+spc-cbar_hght,
                                  bnds[2],
                                  cbar_hght])
            norm = colors.Normalize(vmin=_vmin,
                                    vmax=_vmax)
            cbar = fig.colorbar(cm.ScalarMappable(norm=norm, cmap=cmap), cax=colax,
                                orientation='horizontal')
    return fig, ax
```
 
%% Cell type:markdown id: tags:
 
We start by setting some basic variables that we use throughout the notebook. As in `Gen_Data.ipynb`, we use the IGRF-13 model as a reference. You can get it [here](https://www.ngdc.noaa.gov/IAGA/vmod/coeffs/igrf13coeffs.txt); place the file in the `dat/` folder of the repository.
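If the file is not already in place, the following sketch downloads it (assuming the URL above is still reachable and that the notebook is run from its own folder, so that `../dat/` points to the repository's `dat/` directory):
 
%% Cell type:code id: tags:
 
``` python
# sketch: download the IGRF-13 coefficient file into dat/ if it is missing
# (assumes the URL above is reachable)
import os
import urllib.request

igrf_url = 'https://www.ngdc.noaa.gov/IAGA/vmod/coeffs/igrf13coeffs.txt'
igrf_path = '../dat/igrf13coeffs.txt'
if not os.path.isfile(igrf_path):
    urllib.request.urlretrieve(igrf_url, igrf_path)
```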
 
%% Cell type:code id: tags:
 
``` python
# the reference coefficients from IGRF
IGRF = pd.read_csv('../dat/igrf13coeffs.txt', header=0, delim_whitespace=True, skiprows=3)
ref_coeffs = IGRF[['2020.0']].to_numpy().flatten()
# retrieve the maximal degree using pyfield and the index of the last entry in ref_coeffs
l_max = pyfield.i2lm_l(len(ref_coeffs)-1)
 
# the approximate number of design points
n_points = 2000
# parameters for the inversions
r_ref = 2800
lamb = 16000
epsilon = 1.34
rho = 5000
# the axial dipole to linearize around in nT
lin_dip = -23e3
 
# various data files, generated using the notebook Gen_Data.ipynb
# data without noise, at random locations, no records missing
data_clean_complete = pd.read_csv('../dat/synth_data_clean_complete.csv', skiprows=2)
# data without noise, at random locations, 80% are incomplete (i.e. at least one component is missing)
data_clean_incomplete = pd.read_csv('../dat/synth_data_clean_incomplete.csv', skiprows=2)
# same as above, but at locations taken from GEOMAGIA
data_clean_incomplete_real = pd.read_csv('../dat/synth_data_clean_incomplete_real.csv', skiprows=2)
# data with artificial noise, at locations taken from GEOMAGIA, no records missing
data_noisy_complete = pd.read_csv('../dat/synth_data_noisy_complete.csv', skiprows=2)
# same as above, but records that are missing in GEOMAGIA have been excluded
data_noisy_incomplete = pd.read_csv('../dat/synth_data_noisy_incomplete.csv', skiprows=2)
# same as above, but the noise level has been significantly increased
data_very_noisy_incomplete = pd.read_csv('../dat/synth_data_very_noisy_incomplete.csv', skiprows=2)
 
data_labels = ['Clean Complete',
               'Clean Incomplete',
               'Clean Incomplete Real',
               'Noisy Complete',
               'Noisy Incomplete',
               'Very Noisy Incomplete']
data_lst = [data_clean_complete,
            data_clean_incomplete,
            data_clean_incomplete_real,
            data_noisy_complete,
            data_noisy_incomplete,
            data_very_noisy_incomplete]
```
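 
%% Cell type:markdown id: tags:
 
To see what "incomplete" means in practice, the following sketch counts, for each dataset, the records in which at least one of the components D, I or F is missing (it only uses the variables defined above):
 
%% Cell type:code id: tags:
 
``` python
# sketch: count records with at least one missing component per dataset
for label, df in zip(data_labels, data_lst):
    n_incomplete = df[['D', 'I', 'F']].isna().any(axis=1).sum()
    print(f'{label:<22s} {n_incomplete:4d} of {len(df)} records incomplete')
```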
 
%% Cell type:markdown id: tags:
 
## Preliminaries
We perform a detailed test for one dataset. The other datasets are compared in a table at the end of this section.
 
%% Cell type:code id: tags:
 
``` python
data_detail = data_clean_complete
```
 
%% Cell type:markdown id: tags:
 
We start by generating design points and constructing the reference field at these design points.
 
%% Cell type:code id: tags:
 
``` python
# get design points using CORBASS routines
x_desi, n_act = Inversion.desi_points(None, n_points)
# latitude and longitude of reference points for plotting
lat, lon, _ = glob_proj.transform_points(ccrs.Geodetic(),
                                         x_desi[1],
                                         90-x_desi[0]).T
 
# construct a basis using pyfield
dspharm = np.empty((len(ref_coeffs), 3*n_act), order='F')
pyfield.dspharm(src=pyfield.SOURCE_INTERNAL,
                gSys=pyfield.SYS_GEO,
                atSys=pyfield.SYS_GEO,
                atForm=pyfield.COOR_CLR,
                bSys=pyfield.SYS_GEO,
                bForm=pyfield.FIELD_NED,
                lmax=l_max,
                R=pyfield.REARTH,
                t=0.,
                at=x_desi[:3, :],
                B=dspharm)
# reference field is the scalar product of coefficients and basis
ref_field = np.dot(ref_coeffs, dspharm)
```
 
%% Cell type:markdown id: tags:
 
Let's have a look at the **reference field** using the plotting routine defined above. On top of the north component, we also plot the record locations in pink.
 
%% Cell type:code id: tags:
 
``` python
fig, ax = plot_field(lat, lon, ref_field/1000, names='NEZ', symm=True);
ax[0].scatter(data_detail[['lon']], data_detail[['lat']],
              s=12, marker='o', lw=0., transform=ccrs.Geodetic(),
              c='C6', zorder=5);
```
 
%% Output
 
 
%% Cell type:markdown id: tags:
 
## Detailed Test
Now we can get to the actual tests. First, set up the inversion classes:
 
%% Cell type:code id: tags:
 
``` python
# CORBASS strategy
ours = Inversion(data_detail)
# linearization about a given axial dipole
dipl = Dip_Lin_Inversion(data_detail)
```
 
%% Cell type:markdown id: tags:
 
Now we can run the inversions:
 
%% Cell type:code id: tags:
 
``` python
ours_mean, ours_cov = ours.field_inversion(r_ref, lamb, epsilon, rho, n_points)
dipl_mean, dipl_cov = dipl.field_inversion(r_ref, lamb, epsilon, rho, lin_dip, n_points)
```
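 
%% Cell type:markdown id: tags:
 
Before plotting, a quick quantitative check: the root-mean-square deviation of each posterior mean from the reference field at the design points (a minimal sketch using the variables defined above; `ours_rms` and `dipl_rms` are just helper names):
 
%% Cell type:code id: tags:
 
``` python
# sketch: RMS deviation of the posterior mean fields from the reference field, in uT
ours_rms = np.sqrt(np.mean((ours_mean - ref_field)**2)) / 1000
dipl_rms = np.sqrt(np.mean((dipl_mean - ref_field)**2)) / 1000
print(f'CORBASS:              {ours_rms:.2f} uT RMS deviation')
print(f'Linearized (dipole):  {dipl_rms:.2f} uT RMS deviation')
```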
 
%% Cell type:markdown id: tags:
 
We can again look at the resulting fields using the plotting routine. The third row shows the absolute difference between the CORBASS mean and the reference field.
 
%% Cell type:code id: tags:
 
``` python
# rescale to uT to have smaller numbers in the legend
plot_field(lat, lon, ours_mean/1000, names='NEZ', symm=True);
plot_field(lat, lon, dipl_mean/1000, symm=True);
plot_field(lat, lon, np.abs(ours_mean-ref_field)/1000, vmin=0, cmap='binary');
```
 
%% Output
 
 
 
 
%% Cell type:markdown id: tags:
 
It is also interesting to look at the posterior standard deviations; the last row shows the difference between the two approaches:
 
%% Cell type:code id: tags:
 
``` python
ours_sd = np.sqrt(np.diag(ours_cov))
dipl_sd = np.sqrt(np.diag(dipl_cov))
# rescale to uT to have smaller numbers in the legend
_, ours_ax = plot_field(lat, lon, ours_sd/1000, names='NEZ', vmin=0, cmap='binary')
_, dipl_ax = plot_field(lat, lon, dipl_sd/1000, vmin=0, cmap='binary')
plot_field(lat, lon, (ours_sd-dipl_sd)/1000, symm=True);
 
# switch for plotting the data locations on top of the standard deviation
if False:
    for it in range(3):
        ours_ax[it].scatter(data_detail[['lon']], data_detail[['lat']],
                            s=12, marker='o', lw=0, transform=ccrs.Geodetic(),
                            c='C1', zorder=5);
        dipl_ax[it].scatter(data_detail[['lon']], data_detail[['lat']],
                            s=12, marker='o', lw=0, transform=ccrs.Geodetic(),
                            c='C1', zorder=5);
```
 
%% Output