Floating point ranges with RangeIndex#
Highlights#
Pandas has no equivalent of pandas.RangeIndex for floating point ranges. Fortunately, there is xarray.indexes.RangeIndex, which works with real numbers.
Xarray’s RangeIndex is built on top of xarray.indexes.CoordinateTransformIndex (see Functional transformations with CoordinateTransformIndex) and therefore supports very large ranges represented as lazy coordinate variables.
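To illustrate why a range can be huge and still cheap to hold, here is a minimal pure-Python sketch. The `LazyRange` class is hypothetical and is not xarray's implementation; it only assumes that a regular range is fully described by a start, a step and a size, so any label can be computed on demand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyRange:
    """Hypothetical stand-in for a lazily evaluated floating point range."""

    start: float
    step: float
    size: int

    def __getitem__(self, i: int) -> float:
        # Compute a single label on demand instead of storing every value.
        if not -self.size <= i < self.size:
            raise IndexError(i)
        if i < 0:
            i += self.size
        return self.start + i * self.step


# One trillion labels are described by just three numbers.
r = LazyRange(start=0.0, step=1e-9, size=1_000_000_000_000)
print(r[3])   # label at position 3
print(r[-1])  # last label, just below 1000.0
```

Only the three parameters are stored, so the memory cost does not grow with the length of the range.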
Example#
Assigning#
import xarray as xr
Using xarray.indexes.RangeIndex.arange().
idx1 = xr.indexes.RangeIndex.arange(0.0, 1000.0, 1e-9, dim="x")
ds1 = xr.Dataset(coords=xr.Coordinates.from_xindex(idx1))
ds1
<xarray.Dataset> Size: 8TB
Dimensions: (x: 1000000000000)
Coordinates:
* x (x) float64 8TB 0.0 1e-09 2e-09 3e-09 ... 1e+03 1e+03 1e+03 1e+03
Data variables:
*empty*
Indexes:
x RangeIndex (start=0, stop=1e+03, step=1e-09)
Using xarray.indexes.RangeIndex.linspace().
idx2 = xr.indexes.RangeIndex.linspace(
0.0, 1000.0, 1_000_000_000_000, dim="x"
)
ds2 = xr.Dataset(coords=xr.Coordinates.from_xindex(idx2))
ds2
<xarray.Dataset> Size: 8TB
Dimensions: (x: 1000000000000)
Coordinates:
* x (x) float64 8TB 0.0 1e-09 2e-09 3e-09 ... 1e+03 1e+03 1e+03 1e+03
Data variables:
*empty*
Indexes:
x RangeIndex (start=0, stop=1e+03, step=1e-09)
Lazy coordinate#
The x coordinate variable associated with the range index is lazy (i.e., its
values are not materialized in memory until they are requested).
ds1.x
<xarray.DataArray 'x' (x: 1000000000000)> Size: 8TB
[1000000000000 values with dtype=float64]
Coordinates:
* x (x) float64 8TB 0.0 1e-09 2e-09 3e-09 ... 1e+03 1e+03 1e+03 1e+03
Indexes:
x RangeIndex (start=0, stop=1e+03, step=1e-09)
If materialized, this would be a very large array!
ds1.x.nbytes / 1024**4 # about 7.3 TiB!
7.275957614183426
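The two figures (8TB in the repr, about 7.3 here) describe the same number of bytes in different units. A quick check of the arithmetic, assuming 8 bytes per float64 value:

```python
n_values = 1_000_000_000_000  # size of the x dimension
itemsize = 8                  # bytes per float64 value

n_bytes = n_values * itemsize

# Decimal terabytes (powers of 1000), the unit shown in the repr.
print(n_bytes / 1000**4)

# Binary tebibytes (powers of 1024), the unit computed above.
print(n_bytes / 1024**4)
```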
Important
ds1.x.values will materialize all values in memory! x may behave like a “coordinate variable bomb” 💣.
Indexing#
Slicing along the x dimension preserves the range index, although with a new
range, and keeps the associated coordinate variable lazy.
sliced = ds1.isel(x=slice(1_000, 50_000, 100))
sliced.x
<xarray.DataArray 'x' (x: 490)> Size: 4kB
[490 values with dtype=float64]
Coordinates:
* x (x) float64 4kB 1e-06 1.1e-06 1.2e-06 ... 4.98e-05 4.99e-05
Indexes:
x RangeIndex (start=1e-06, stop=5e-05, step=1e-07)
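The new range parameters can be checked by hand. The sketch below is plain Python, not xarray code, and assumes the convention that stop equals start plus size times step, which is consistent with the repr above.

```python
# Parameters of the original range, as shown in its RangeIndex repr.
start, step, size = 0.0, 1e-9, 1_000_000_000_000

# The positional slice used in the example.
sl = slice(1_000, 50_000, 100)

# slice.indices() clips the slice to the dimension size; range() then
# describes the selected positions without building a list.
positions = range(*sl.indices(size))

new_size = len(positions)
new_start = start + positions.start * step
new_step = positions.step * step
new_stop = new_start + new_size * new_step  # assumed convention

print(new_size, new_start, new_step, new_stop)
```

Up to floating point rounding, this gives 490 values with start 1e-06, step 1e-07 and stop 5e-05, matching the sliced index.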