Zonal Statistics

The wradlib.zonalstats module provides classes and functions for calculation of zonal statistics for data on arbitrary grids and projections.

It provides classes for:

  • managing georeferenced data (grid points or grid polygons, zonal polygons),
  • calculation of geographic intersections and managing resulting vector data
  • calculation of zonal statistics and managing result data as vector attributes
  • output to vector and raster files available within ogr/gdal
In [2]:
import wradlib as wrl
import matplotlib.pyplot as pl
import matplotlib as mpl
import warnings
warnings.filterwarnings('ignore')
try:
    get_ipython().magic("matplotlib inline")
except:
    pl.ion()
import numpy as np

DataSource

The wradlib.zonalstats.DataSource class handles point or polygon vector data by wrapping ogr.DataSource with special functions.

The following example shows how to create different DataSource objects:

In [3]:
from osgeo import osr

# create gk zone 2 projection osr object
proj_gk2 = osr.SpatialReference()
proj_gk2.ImportFromEPSG(31466)

# Setting up DataSource
box0 = np.array([[2600000., 5630000.],[2600000., 5640000.],
                 [2610000., 5640000.],[2610000., 5630000.],
                 [2600000., 5630000.]])
box1 = np.array([[2610000., 5630000.],[2610000., 5640000.],
                 [2620000., 5640000.],[2620000., 5630000.],
                 [2610000., 5630000.]])
box2 = np.array([[2600000., 5640000.],[2600000., 5650000.],
                 [2610000., 5650000.],[2610000., 5640000.],
                 [2600000., 5640000.]])
box3 = np.array([[2610000., 5640000.],[2610000., 5650000.],
                 [2620000., 5650000.],[2620000., 5640000.],
                 [2610000., 5640000.]])

point0 = np.array(wrl.zonalstats.get_centroid(box0))
point1 = np.array(wrl.zonalstats.get_centroid(box1))
point2 = np.array(wrl.zonalstats.get_centroid(box2))
point3 = np.array(wrl.zonalstats.get_centroid(box3))

# creates Polygons in Datasource
poly = wrl.zonalstats.DataSource(np.array([box0, box1, box2, box3]), srs=proj_gk2, name='poly')

# creates Points in Datasource
point = wrl.zonalstats.DataSource(np.vstack((point0, point1, point2, point3)),
                                  srs=proj_gk2, name='point')

Let’s have a look at the data, which will be exported as numpy arrays. The property data exports all available data:

In [4]:
print(poly.data)
print(point.data)
[[[ 2600000.  5630000.]
  [ 2600000.  5640000.]
  [ 2610000.  5640000.]
  [ 2610000.  5630000.]
  [ 2600000.  5630000.]]

 [[ 2610000.  5630000.]
  [ 2610000.  5640000.]
  [ 2620000.  5640000.]
  [ 2620000.  5630000.]
  [ 2610000.  5630000.]]

 [[ 2600000.  5640000.]
  [ 2600000.  5650000.]
  [ 2610000.  5650000.]
  [ 2610000.  5640000.]
  [ 2600000.  5640000.]]

 [[ 2610000.  5640000.]
  [ 2610000.  5650000.]
  [ 2620000.  5650000.]
  [ 2620000.  5640000.]
  [ 2610000.  5640000.]]]
[[ 2605000.  5635000.]
 [ 2615000.  5635000.]
 [ 2605000.  5645000.]
 [ 2615000.  5645000.]]

Currently data can also be retrieved by:

Now, with the DataSource being created, we can add/set attribute data of the features:

In [5]:
# add attribute
poly.set_attribute('mean', np.array([10.1, 20.2, 30.3, 40.4]))
point.set_attribute('mean', np.array([10.1, 20.2, 30.3, 40.4]))

Attributes associated with features can also be retrieved:

In [6]:
# get attributes
print(poly.get_attributes(['mean']))
# get attributes filtered
print(poly.get_attributes(['mean'], filt=('index', 2)))
[[10.1, 20.2, 30.3, 40.4]]
[[30.3]]

Finally, we can export the contained data to OGR/GDAL supported vector and raster files:

In [7]:
# dump as 'ESRI Shapefile', default
poly.dump_vector('test_poly.shp')
point.dump_vector('test_point.shp')
# dump as 'GeoJSON'
poly.dump_vector('test_poly.geojson', 'GeoJSON')
point.dump_vector('test_point.geojson', 'GeoJSON')
# dump as 'GTiff', default
poly.dump_raster('test_poly_raster.tif', attr='mean', pixel_size=100.)
# dump as 'netCDF'
poly.dump_raster('test_poly_raster.nc', 'netCDF', attr='mean', pixel_size=100.)
Rasterize layers
Rasterize layers

ZonalData

ZonalData is usually available as georeferenced regular gridded data. Here the wradlib.zonalstats.ZonalDataBase class manages the grid data, the zonal data (target polygons) and the intersection data of source grid and target polygons. Because the calculation of intersection is different for point grids and polygon grids, we have subclasses wradlib.zonalstats.ZonalDataPoly and wradlib.zonalstats.ZonalDataPoint.

Basically, wradlib.zonalstats.ZonalDataBase encapsulates three wradlib.zonalstats.DataSource objects:

  • source grid (points/polygons)
  • target polygons
  • destination (intersection) (points/polygons)

The destination DataSource object is created from the provided source grid and target polygons at initialisation time.

As an example the creation of a wradlib.zonalstats.ZonalDataPoly class instance is shown:

In [8]:
# setup test grid and catchment
lon = 7.071664
lat = 50.730521
r = np.array(range(50, 100*1000 + 50 , 100))
a = np.array(range(0, 90, 1))
rays = a.shape[0]
bins = r.shape[0]
# create polar grid polygon vertices in lat,lon
radar_ll = wrl.georef.polar2polyvert(r, a, (lon, lat))

# setup OSR objects
proj_gk = osr.SpatialReference()
proj_gk.ImportFromEPSG(31466)
proj_ll = osr.SpatialReference()
proj_ll.ImportFromEPSG(4326)

# project ll grids to GK2
radar_gk = wrl.georef.reproject(radar_ll, projection_source=proj_ll,
                            projection_target=proj_gk)

# reshape
radar_gk.shape = (rays * bins, 5, 2)

box0 = np.array([[2600000., 5630000.],[2600000., 5640000.],
                 [2610000., 5640000.],[2610000., 5630000.],
                 [2600000., 5630000.]])

box1 = np.array([[2610000., 5630000.],[2610000., 5640000.],
                 [2620000., 5640000.],[2620000., 5630000.],
                 [2610000., 5630000.]])

targets = np.array([box0, box1])

zdpoly = wrl.zonalstats.ZonalDataPoly(radar_gk, targets, srs=proj_gk)
Calculate Intersection source/target-layers

When calculationg the intersection, also weights are calculated for every source grid feature and attributed to the destination features.

With the property isecs it is possible to retrieve the intersection geometries as numpy array, further get-functions add to the functionality:

In [9]:
# get intersections as numpy array
isecs = zdpoly.isecs
# get intersections for target polygon 0
isec0 = zdpoly.get_isec(0)
# get source indices referring to target polygon 0
ind0 = zdpoly.get_source_index(0)

print(isecs.shape, isec0.shape, ind0.shape)
((2,), (2147,), (2147,))

There are import/export functions using ESRI-Shapfile Format as data format. Next export and import is shown: wradlib.zonalstats.ZonalDataBase

In [10]:
zdpoly.dump_vector('test_zdpoly')
zdpoly_new = wrl.zonalstats.ZonalDataPoly('test_zdpoly')

ZonalStats

For ZonalStats the wradlib.zonalstats.ZonalStatsBase class and the two subclasses wradlib.zonalstats.GridCellsToPoly and wradlib.zonalstats.GridPointsToPoly are available. ZonalStatsBase encapsulates one ZonalData object. Properties for simple access of ZonalData, intersection indices and weights are provided. The following code will add mean and var attributes to the target DataSource:

In [11]:
# create GridCellsToPoly instance
gc = wrl.zonalstats.GridCellsToPoly(zdpoly_new)
# create some artificial data for processing using the features indices
count = radar_gk.shape[0]
data = 1000000. / np.array(range(count))
# calculate mean and variance
mean = gc.mean(data)
var = gc.var(data)

print("Average:", mean)
print("Variance:", var)
('Average:', array([ 13.04174018,  12.43947455]))
('Variance:', array([ 4.13158686,  1.89511724]))

Next we can export the resulting zonal statistics to vector and raster files:

In [12]:
# export to vector GeoJSON
gc.zdata.trg.dump_vector('test_zonal_json.geojson', 'GeoJSON')
# export 'mean' to raster netCDF
gc.zdata.trg.dump_raster('test_zonal_hdr.nc', 'netCDF', 'mean', pixel_size=100.)
Rasterize layers

The ZonalStats classes can also be used without any ZonalData by instantiating with precalculated index and weight values. Be sure to use matching ix, w and data arrays:

In [13]:
# get ix, and weight arrays
ix = gc.ix
w = gc.w
# instantiate new ZonlaStats object
gc1 = wrl.zonalstats.GridCellsToPoly(ix=ix, w=w)
# caclulate statistics
avg = gc1.mean(data)
var = gc1.var(data)

print("Average:", avg)
print("Variance:", var)
('Average:', array([ 13.04174018,  12.43947455]))
('Variance:', array([ 4.13158686,  1.89511724]))

Examples

Examples of using Zonal Statistics working with rectangular grids as well as polar grids is shown in the Zonal Statistics Example notebook.