Coordinate Reference Systems with xproj.CRSIndex

Coordinate Reference Systems with xproj.CRSIndex#

See also

Learn more at the xproj documentation

Highlights#

  1. xproj provides a CRSIndex to handle “coordinate reference system” (CRS) information.

  2. CRS information is commonly stored in the attributes of a scalar coordinate variable, usually called "spatial_ref" or one that is annotated as a “grid mapping variable” according to the CF conventions.

  3. The purpose of CRSIndex is to handle alignment, not indexing. That is, we would like errors to be raised when adding or concatenating (for example) two datasets with different CRS. Such operations are nonsensical without reprojecting the data to a common CRS.

Example#

Assigning#

%xmode minimal

import xarray as xr
import xproj  # registers the .proj accessor

xr.set_options(display_expand_data=False, display_expand_attrs=False, display_expand_indexes=True)

ds = xr.tutorial.load_dataset("air_temperature")
ds
Exception reporting mode: Minimal
<xarray.Dataset> Size: 31MB
Dimensions:  (time: 2920, lat: 25, lon: 53)
Coordinates:
  * time     (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
Data variables:
    air      (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Attributes: (5)

This dataset has no CRS information. xproj uses Xarray’s accessor interface to provide functionality. We use that to assign a CRS — the common WGS84 global geodetic CRS (EPSG code: 4326).

ds_wgs84 = ds.proj.assign_crs(spatial_ref="epsg:4326")
ds_wgs84.spatial_ref.attrs = ds_wgs84.proj.crs.to_cf()  # TODO: add to xproj
ds_wgs84
<xarray.Dataset> Size: 31MB
Dimensions:      (time: 2920, lat: 25, lon: 53)
Coordinates:
  * time         (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * lat          (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
  * lon          (lon) float32 212B 200.0 202.5 205.0 ... 325.0 327.5 330.0
  * spatial_ref  int64 8B 0
Data variables:
    air          (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Indexes:
    spatial_ref  CRSIndex (crs=EPSG:4326)
Attributes: (5)

Note the CRS information in the repr for CRSIndex associated with spatial_ref above.

Alignment#

Alignment means making sure the objects conform to the same coordinate reference frame. When executing multi-object operations (e.g. binary operations, concatenation, merging etc.), Xarray automatically aligns these objects using Indexes.

To illustrate we will artificially create a new dataset with a different CRS:

ds_wgs72 = ds_wgs84.proj.assign_crs(spatial_ref="epsg:4322", allow_override=True)
ds_wgs72.spatial_ref.attrs = ds_wgs72.proj.crs.to_cf()  # TODO: add to xproj
ds_wgs72
<xarray.Dataset> Size: 31MB
Dimensions:      (time: 2920, lat: 25, lon: 53)
Coordinates:
  * time         (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * lat          (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
  * lon          (lon) float32 212B 200.0 202.5 205.0 ... 325.0 327.5 330.0
  * spatial_ref  int64 8B 0
Data variables:
    air          (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Indexes:
    spatial_ref  CRSIndex (crs=EPSG:4322)
Attributes: (5)

Now lets add the two:

ds_wgs84 + ds_wgs72
AlignmentError: Objects to align do not have the same CRS
first index:
CRSIndex
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich


second index:
CRSIndex
<Geographic 2D CRS: EPSG:4322>
Name: WGS 72
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1972
- Ellipsoid: WGS 72
- Prime Meridian: Greenwich

How about concatenating?

xr.concat([ds_wgs84, ds_wgs72], dim="newdim")
/tmp/ipykernel_980/3843994017.py:1: FutureWarning: In a future version of xarray the default value for join will change from join='outer' to join='exact'. This change will result in the following ValueError: cannot be aligned with join='exact' because index/labels/sizes are not equal along these coordinates (dimensions): 'spatial_ref' () The recommendation is to set join explicitly for this case.
  xr.concat([ds_wgs84, ds_wgs72], dim="newdim")
AlignmentError: Objects to align do not have the same CRS
first index:
CRSIndex
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich


second index:
CRSIndex
<Geographic 2D CRS: EPSG:4322>
Name: WGS 72
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1972
- Ellipsoid: WGS 72
- Prime Meridian: Greenwich