smart_geocubes
¶
Smart-Geocubes: A high-performance library for intelligent loading and caching of remote geospatial raster data, built with xarray and zarr.
Modules:
-
accessors
–Smart-Geocubes accessor implementations.
-
datasets
–Predefined datasets for the smart_geocubes package.
-
exceptions
–Exceptions for the smart_geocubes package.
Classes:
-
ArcticDEM10m
–Accessor for ArcticDEM 10m data.
-
ArcticDEM2m
–Accessor for ArcticDEM 2m data.
-
ArcticDEM32m
–Accessor for ArcticDEM 32m data.
-
TCTrend
–Accessor for TCTrend data.
ArcticDEM10m
¶
Bases: ArcticDEMABC
Accessor for ArcticDEM 10m data.
Attributes:
-
extent
(GeoBox
) –The extent of the datacube represented by a GeoBox.
-
chunk_size
(int
) –The chunk size of the datacube.
-
channels
(list
) –The channels of the datacube.
-
storage
(Storage
) –The icechunk storage.
-
repo
(Repository
) –The icechunk repository.
-
title
(str
) –The title of the datacube.
-
stopuhr
(StopUhr
) –The benchmarking timer from the stopuhr library.
-
zgeobox
(GeoBox
) –The geobox of the underlying zarr array. Should be equal to the extent geobox. However, this property is used to find the target index of the downloaded data, so better safe than sorry.
-
created
(bool
) –True if the datacube already exists in the storage.
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage
¶Storage
) –The icechunk storage of the datacube.
-
create_icechunk_storage
¶bool
, default:True
) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
Raises:
-
ValueError
–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_tiles
–Get adjacent tiles from a STAC API.
-
assert_created
–Assert that the datacube exists in the storage.
-
create
–Create an empty datacube and write it to the store.
-
current_state
–Get info about currently stored tiles.
-
download
–Download the data for the given region of interest which can be provided either as GeoBox or GeoDataFrame.
-
download_tile
–Download a tile from a STAC API and write it to a zarr datacube.
-
load
–Load the data for the given geobox.
-
load_like
–Load the data for the extent of a reference dataset or dataarray.
-
log_benchmark_summary
–Log the benchmark summary.
-
open_xarray
–Open the xarray datacube in read-only mode.
-
open_zarr
–Open the zarr datacube in read-only mode.
-
post_create
–Download the ArcticDEM mosaic extent info and store it in the datacube.
-
procedural_download
–Download the data for the given geobox.
-
procedural_download_blocking
–Download tiles procedurally in blocking mode.
-
procedural_download_threading
–Download tiles procedurally in threading mode.
-
visualize_state
–Visualize the extent of the datacube, i.e. the data that has already been downloaded and filled.
Source code in src/smart_geocubes/accessors/base.py
adjacent_tiles
¶
Get adjacent tiles from a STAC API.
Overrides the default implementation of the STAC accessor to use pre-downloaded extent files instead of querying the STAC API. This results in faster loading, but requires the extent files to be downloaded beforehand, which happens in the post_create step.
Parameters:
Returns:
-
list[TileWrapper]
–list[TileWrapper]: List of adjacent tiles, wrapped in a dedicated data structure for easier processing.
Source code in src/smart_geocubes/datasets/arcticdem.py
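The lookup against pre-downloaded extents can be pictured as a local bounding-box intersection test. The following is a minimal sketch and not the library's implementation: `TileExtent`, `find_adjacent_tiles`, and the tile ids are hypothetical names, and real footprints would come from the extent files stored in the post_create step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileExtent:
    """Hypothetical record for one pre-downloaded tile footprint."""

    tile_id: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(tile, xmin, ymin, xmax, ymax):
    """Return True if the tile overlaps the query box (touching edges do not count)."""
    return tile.xmin < xmax and tile.xmax > xmin and tile.ymin < ymax and tile.ymax > ymin


def find_adjacent_tiles(extents, xmin, ymin, xmax, ymax):
    """Select the ids of all tiles overlapping the query box, without any remote request."""
    return [t.tile_id for t in extents if intersects(t, xmin, ymin, xmax, ymax)]


# Illustrative footprints in an arbitrary projected coordinate system.
extents = [
    TileExtent("tile_a", 0, 0, 100, 100),
    TileExtent("tile_b", 100, 0, 200, 100),
    TileExtent("tile_c", 0, 100, 100, 200),
]
adjacent = find_adjacent_tiles(extents, 50, 50, 150, 90)
print(adjacent)
```

Because the test runs purely on local data, its cost depends only on the number of stored footprints, which is the trade-off the description above refers to.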
assert_created
¶
Assert that the datacube exists in the storage.
Raises:
-
FileNotFoundError
–If the datacube does not exist.
Source code in src/smart_geocubes/accessors/base.py
create
¶
Create an empty datacube and write it to the store.
Parameters:
Raises:
-
FileExistsError
–If a datacube already exists at the storage location.
Source code in src/smart_geocubes/accessors/base.py
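The contract between create and assert_created can be illustrated with a toy stand-in. This is a sketch under stated assumptions: `ToyCube` is hypothetical, and a plain dict plays the role of the icechunk storage; only the exception types are taken from this reference.

```python
class ToyCube:
    """Hypothetical stand-in for an accessor; a dict plays the role of the storage."""

    def __init__(self, store):
        self.store = store

    @property
    def created(self):
        # True once a datacube entry exists in the store.
        return "cube" in self.store

    def assert_created(self):
        if not self.created:
            raise FileNotFoundError("Datacube does not exist; call create() first.")

    def create(self):
        if self.created:
            raise FileExistsError("A datacube already exists at this location.")
        self.store["cube"] = {}


cube = ToyCube({})
try:
    cube.assert_created()
except FileNotFoundError as err:
    print(f"not created yet: {err}")

cube.create()
cube.assert_created()  # passes silently now

try:
    cube.create()
except FileExistsError as err:
    print(f"second create refused: {err}")
```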
current_state
¶
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None
–gpd.GeoDataFrame: Tile info from pystac. None if datacube is empty.
Source code in src/smart_geocubes/accessors/stac.py
download
¶
Download the data for the given region of interest which can be provided either as GeoBox or GeoDataFrame.
Parameters:
-
roi
¶GeoBox | GeoDataFrame
) –The reference geobox or reference geodataframe to download the data for.
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
download_tile
¶
Download a tile from a STAC API and write it to a zarr datacube.
Parameters:
-
tile
¶TileWrapper
) –The tile to download and write.
Returns:
-
Dataset
–xr.Dataset: The downloaded tile data.
Source code in src/smart_geocubes/accessors/stac.py
load
¶
load(
geobox: GeoBox,
buffer: int = 0,
persist: bool = True,
create: bool = False,
concurrency_mode: ConcurrencyModes = "blocking",
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
geobox
¶GeoBox
) –The reference geobox to load the data for.
-
buffer
¶int
, default:0
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
¶bool
, default:True
) –If the data should be persisted in memory. If not, this will return a Dask-backed Dataset. Defaults to True.
-
create
¶bool
, default:False
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
¶ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
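Since buffer is given in pixels, its effect in map units depends on the resolution of the datacube. The arithmetic can be sketched as follows; `buffered_bounds` is a hypothetical helper that assumes square pixels and a symmetric buffer, and it is not how the library itself computes the projected geobox.

```python
def buffered_bounds(xmin, ymin, xmax, ymax, resolution, buffer=0):
    """Grow a bounding box by `buffer` pixels of size `resolution` on every side."""
    pad = buffer * resolution  # pixel count converted to map units
    return (xmin - pad, ymin - pad, xmax + pad, ymax + pad)


# A 320 m x 320 m box on a 32 m grid, buffered by 2 pixels (64 m) per side.
bounds = buffered_bounds(1000.0, 2000.0, 1320.0, 2320.0, resolution=32.0, buffer=2)
print(bounds)
```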
load_like
¶
Load the data for the extent of a reference dataset or dataarray.
Parameters:
-
ref
¶Dataset | DataArray
) –The reference dataarray or dataset to load the data for.
-
**kwargs
¶Unpack[LoadParams]
, default:{}
) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer
(int
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
(bool
) –If the data should be persisted in memory. If not, this will return a Dask-backed Dataset. Defaults to True.
-
create
(bool
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
(ConcurrencyModes
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
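The keyword arguments accepted by load_like mirror the optional parameters of load. One way to picture such a parameter bundle is a TypedDict merged with the documented defaults. This is a sketch only: the `LoadParams` layout, the members of `ConcurrencyModes`, and `resolve_load_params` are assumptions made for illustration, based on the names and defaults listed above.

```python
from typing import Literal, TypedDict

# Assumed members, taken from the two modes named in this reference.
ConcurrencyModes = Literal["blocking", "threading"]


class LoadParams(TypedDict, total=False):
    """Optional load parameters; every key may be omitted."""

    buffer: int
    persist: bool
    create: bool
    concurrency_mode: ConcurrencyModes


# Defaults as documented for load().
DEFAULTS: LoadParams = {
    "buffer": 0,
    "persist": True,
    "create": False,
    "concurrency_mode": "blocking",
}


def resolve_load_params(**kwargs):
    """Merge caller-supplied keyword arguments over the documented defaults."""
    unknown = set(kwargs) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"Unexpected load parameters: {sorted(unknown)}")
    return {**DEFAULTS, **kwargs}


params = resolve_load_params(buffer=5, persist=False)
print(params)
```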
log_benchmark_summary
¶
open_xarray
¶
Open the xarray datacube in read-only mode.
Returns:
-
Dataset
–xr.Dataset: The xarray datacube.
Source code in src/smart_geocubes/accessors/base.py
open_zarr
¶
Open the zarr datacube in read-only mode.
Returns:
-
Group
–zarr.Group: The zarr datacube.
Source code in src/smart_geocubes/accessors/base.py
post_create
¶
procedural_download
¶
Download the data for the given geobox.
Note
The "threading" concurrency mode requires Python 3.13 or higher.
Parameters:
-
geobox
¶GeoBox
) –The reference geobox to download the data for.
-
concurrency_mode
¶ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Raises:
-
ValueError
–If an unknown concurrency mode is provided.
Source code in src/smart_geocubes/accessors/base.py
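The dispatch on concurrency_mode described above can be sketched as a small function. The function name and the returned strings are hypothetical placeholders for the real download routines; only the two mode names, the Python 3.13 requirement, and the exception types come from this reference.

```python
import sys


def procedural_download_sketch(geobox, concurrency_mode="blocking"):
    """Route a download request to the routine matching the concurrency mode."""
    if concurrency_mode == "blocking":
        return f"blocking download for {geobox}"
    if concurrency_mode == "threading":
        # Mirrors the documented requirement of the threading mode.
        if sys.version_info < (3, 13):
            raise RuntimeError("The threading mode requires Python 3.13 or higher.")
        return f"threaded download for {geobox}"
    raise ValueError(f"Unknown concurrency mode: {concurrency_mode!r}")


print(procedural_download_sketch("roi-1"))

try:
    procedural_download_sketch("roi-1", concurrency_mode="async")
except ValueError as err:
    print(err)
```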
procedural_download_blocking
¶
Download tiles procedurally in blocking mode.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. There, however, multiple processes may try to write concurrently, which results in a conflict. In such cases, the download is retried until it succeeds or the maximum number of tries is reached.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
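The retry behaviour described in the warning can be sketched as a bounded loop. `WriteConflict`, `download_with_retries`, and `flaky_write` are hypothetical names; the actual conflict error and retry count used by the library are not stated in this reference.

```python
class WriteConflict(Exception):
    """Hypothetical error signalling that another process committed first."""


def download_with_retries(write, max_tries=3):
    """Call `write()` until it succeeds or `max_tries` attempts are used up."""
    for attempt in range(1, max_tries + 1):
        try:
            return write()
        except WriteConflict:
            print(f"conflict on attempt {attempt}, retrying")
    raise ValueError("No tries left.")


attempts = {"count": 0}


def flaky_write():
    """Simulated write that conflicts twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise WriteConflict
    return "written"


result = download_with_retries(flaky_write, max_tries=5)
print(result, attempts["count"])
```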
procedural_download_threading
¶
Download tiles procedurally in threading mode.
Note
This method ensures that only a single download is running at a time. It uses a SetQueue to prevent duplicate downloads. The threading mode requires Python 3.13 or higher.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
RuntimeError
–If the Python version is lower than 3.13.
Source code in src/smart_geocubes/accessors/base.py
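The idea of a queue that refuses duplicates can be sketched with the standard library. This is not the library's SetQueue, whose implementation is not shown here; it is a common recipe built on the `_init`, `_put`, and `_get` hooks of `queue.Queue`, and it only illustrates the de-duplication aspect.

```python
import queue


class SetQueue(queue.Queue):
    """FIFO queue that silently drops items which are already waiting.

    Caveat of this sketch: queue.Queue.put() still counts dropped duplicates
    as unfinished tasks, so join()/task_done() bookkeeping is not reliable here.
    """

    def _init(self, maxsize):
        super()._init(maxsize)
        self._pending = set()  # items currently waiting in the queue

    def _put(self, item):
        if item not in self._pending:
            self._pending.add(item)
            super()._put(item)

    def _get(self):
        item = super()._get()
        self._pending.discard(item)
        return item


q = SetQueue()
for tile_id in ["t1", "t2", "t1", "t3", "t2"]:
    q.put(tile_id)

order = []
while not q.empty():
    order.append(q.get())
print(order)
```

Once an item has been taken out of the queue it may be queued again, so a tile whose download failed could be re-submitted.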
visualize_state
¶
Visualize the extent of the datacube, i.e. the data that has already been downloaded and filled.
Parameters:
-
ax
¶Axes | None
, default:None
) –The axes to draw on. If None, a new figure and axes are created.
Returns:
-
Figure | Axes
–plt.Figure | plt.Axes: The figure with the visualization if no axes were provided, else the axes.
Raises:
-
ValueError
–If the datacube is empty.
Source code in src/smart_geocubes/datasets/arcticdem.py
ArcticDEM2m
¶
Bases: ArcticDEMABC
Accessor for ArcticDEM 2m data.
Attributes:
-
extent
(GeoBox
) –The extent of the datacube represented by a GeoBox.
-
chunk_size
(int
) –The chunk size of the datacube.
-
channels
(list
) –The channels of the datacube.
-
storage
(Storage
) –The icechunk storage.
-
repo
(Repository
) –The icechunk repository.
-
title
(str
) –The title of the datacube.
-
stopuhr
(StopUhr
) –The benchmarking timer from the stopuhr library.
-
zgeobox
(GeoBox
) –The geobox of the underlying zarr array. Should be equal to the extent geobox. However, this property is used to find the target index of the downloaded data, so better safe than sorry.
-
created
(bool
) –True if the datacube already exists in the storage.
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage
¶Storage
) –The icechunk storage of the datacube.
-
create_icechunk_storage
¶bool
, default:True
) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
Raises:
-
ValueError
–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_tiles
–Get adjacent tiles from a STAC API.
-
assert_created
–Assert that the datacube exists in the storage.
-
create
–Create an empty datacube and write it to the store.
-
current_state
–Get info about currently stored tiles.
-
download
–Download the data for the given region of interest which can be provided either as GeoBox or GeoDataFrame.
-
download_tile
–Download a tile from a STAC API and write it to a zarr datacube.
-
load
–Load the data for the given geobox.
-
load_like
–Load the data for the extent of a reference dataset or dataarray.
-
log_benchmark_summary
–Log the benchmark summary.
-
open_xarray
–Open the xarray datacube in read-only mode.
-
open_zarr
–Open the zarr datacube in read-only mode.
-
post_create
–Download the ArcticDEM mosaic extent info and store it in the datacube.
-
procedural_download
–Download the data for the given geobox.
-
procedural_download_blocking
–Download tiles procedurally in blocking mode.
-
procedural_download_threading
–Download tiles procedurally in threading mode.
-
visualize_state
–Visualize the extent of the datacube, i.e. the data that has already been downloaded and filled.
Source code in src/smart_geocubes/accessors/base.py
adjacent_tiles
¶
Get adjacent tiles from a STAC API.
Overrides the default implementation of the STAC accessor to use pre-downloaded extent files instead of querying the STAC API. This results in faster loading, but requires the extent files to be downloaded beforehand, which happens in the post_create step.
Parameters:
Returns:
-
list[TileWrapper]
–list[TileWrapper]: List of adjacent tiles, wrapped in a dedicated data structure for easier processing.
Source code in src/smart_geocubes/datasets/arcticdem.py
assert_created
¶
Assert that the datacube exists in the storage.
Raises:
-
FileNotFoundError
–If the datacube does not exist.
Source code in src/smart_geocubes/accessors/base.py
create
¶
Create an empty datacube and write it to the store.
Parameters:
Raises:
-
FileExistsError
–If a datacube already exists at the storage location.
Source code in src/smart_geocubes/accessors/base.py
current_state
¶
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None
–gpd.GeoDataFrame: Tile info from pystac. None if datacube is empty.
Source code in src/smart_geocubes/accessors/stac.py
download
¶
Download the data for the given region of interest which can be provided either as GeoBox or GeoDataFrame.
Parameters:
-
roi
¶GeoBox | GeoDataFrame
) –The reference geobox or reference geodataframe to download the data for.
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
download_tile
¶
Download a tile from a STAC API and write it to a zarr datacube.
Parameters:
-
tile
¶TileWrapper
) –The tile to download and write.
Returns:
-
Dataset
–xr.Dataset: The downloaded tile data.
Source code in src/smart_geocubes/accessors/stac.py
load
¶
load(
geobox: GeoBox,
buffer: int = 0,
persist: bool = True,
create: bool = False,
concurrency_mode: ConcurrencyModes = "blocking",
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
geobox
¶GeoBox
) –The reference geobox to load the data for.
-
buffer
¶int
, default:0
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
¶bool
, default:True
) –If the data should be persisted in memory. If not, this will return a Dask-backed Dataset. Defaults to True.
-
create
¶bool
, default:False
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
¶ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
load_like
¶
Load the data for the extent of a reference dataset or dataarray.
Parameters:
-
ref
¶Dataset | DataArray
) –The reference dataarray or dataset to load the data for.
-
**kwargs
¶Unpack[LoadParams]
, default:{}
) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer
(int
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
(bool
) –If the data should be persisted in memory. If not, this will return a Dask-backed Dataset. Defaults to True.
-
create
(bool
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
(ConcurrencyModes
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
log_benchmark_summary
¶
open_xarray
¶
Open the xarray datacube in read-only mode.
Returns:
-
Dataset
–xr.Dataset: The xarray datacube.
Source code in src/smart_geocubes/accessors/base.py
open_zarr
¶
Open the zarr datacube in read-only mode.
Returns:
-
Group
–zarr.Group: The zarr datacube.
Source code in src/smart_geocubes/accessors/base.py
post_create
¶
procedural_download
¶
Download the data for the given geobox.
Note
The "threading" concurrency mode requires Python 3.13 or higher.
Parameters:
-
geobox
¶GeoBox
) –The reference geobox to download the data for.
-
concurrency_mode
¶ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Raises:
-
ValueError
–If an unknown concurrency mode is provided.
Source code in src/smart_geocubes/accessors/base.py
procedural_download_blocking
¶
Download tiles procedurally in blocking mode.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. There, however, multiple processes may try to write concurrently, which results in a conflict. In such cases, the download is retried until it succeeds or the maximum number of tries is reached.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
procedural_download_threading
¶
Download tiles procedurally in threading mode.
Note
This method ensures that only a single download is running at a time. It uses a SetQueue to prevent duplicate downloads. The threading mode requires Python 3.13 or higher.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
RuntimeError
–If the Python version is lower than 3.13.
Source code in src/smart_geocubes/accessors/base.py
visualize_state
¶
Visualize the extent of the datacube, i.e. the data that has already been downloaded and filled.
Parameters:
-
ax
¶Axes | None
, default:None
) –The axes to draw on. If None, a new figure and axes are created.
Returns:
-
Figure | Axes
–plt.Figure | plt.Axes: The figure with the visualization if no axes were provided, else the axes.
Raises:
-
ValueError
–If the datacube is empty.
Source code in src/smart_geocubes/datasets/arcticdem.py
ArcticDEM32m
¶
Bases: ArcticDEMABC
Accessor for ArcticDEM 32m data.
Attributes:
-
extent
(GeoBox
) –The extent of the datacube represented by a GeoBox.
-
chunk_size
(int
) –The chunk size of the datacube.
-
channels
(list
) –The channels of the datacube.
-
storage
(Storage
) –The icechunk storage.
-
repo
(Repository
) –The icechunk repository.
-
title
(str
) –The title of the datacube.
-
stopuhr
(StopUhr
) –The benchmarking timer from the stopuhr library.
-
zgeobox
(GeoBox
) –The geobox of the underlying zarr array. Should be equal to the extent geobox. However, this property is used to find the target index of the downloaded data, so better safe than sorry.
-
created
(bool
) –True if the datacube already exists in the storage.
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage
¶Storage
) –The icechunk storage of the datacube.
-
create_icechunk_storage
¶bool
, default:True
) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
Raises:
-
ValueError
–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_tiles
–Get adjacent tiles from a STAC API.
-
assert_created
–Assert that the datacube exists in the storage.
-
create
–Create an empty datacube and write it to the store.
-
current_state
–Get info about currently stored tiles.
-
download
–Download the data for the given region of interest which can be provided either as GeoBox or GeoDataFrame.
-
download_tile
–Download a tile from a STAC API and write it to a zarr datacube.
-
load
–Load the data for the given geobox.
-
load_like
–Load the data for the extent of a reference dataset or dataarray.
-
log_benchmark_summary
–Log the benchmark summary.
-
open_xarray
–Open the xarray datacube in read-only mode.
-
open_zarr
–Open the zarr datacube in read-only mode.
-
post_create
–Download the ArcticDEM mosaic extent info and store it in the datacube.
-
procedural_download
–Download the data for the given geobox.
-
procedural_download_blocking
–Download tiles procedurally in blocking mode.
-
procedural_download_threading
–Download tiles procedurally in threading mode.
-
visualize_state
–Visualize the extent of the datacube, i.e. the data that has already been downloaded and filled.
Source code in src/smart_geocubes/accessors/base.py
adjacent_tiles
¶
Get adjacent tiles from a STAC API.
Overrides the default implementation of the STAC accessor to use pre-downloaded extent files instead of querying the STAC API. This results in faster loading, but requires the extent files to be downloaded beforehand, which happens in the post_create step.
Parameters:
Returns:
-
list[TileWrapper]
–list[TileWrapper]: List of adjacent tiles, wrapped in a dedicated data structure for easier processing.
Source code in src/smart_geocubes/datasets/arcticdem.py
assert_created
¶
Assert that the datacube exists in the storage.
Raises:
-
FileNotFoundError
–If the datacube does not exist.
Source code in src/smart_geocubes/accessors/base.py
create
¶
Create an empty datacube and write it to the store.
Parameters:
Raises:
-
FileExistsError
–If a datacube already exists at the storage location.
Source code in src/smart_geocubes/accessors/base.py
current_state
¶
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None
–gpd.GeoDataFrame: Tile info from pystac. None if datacube is empty.
Source code in src/smart_geocubes/accessors/stac.py
download
¶
Download the data for the given region of interest which can be provided either as GeoBox or GeoDataFrame.
Parameters:
-
roi
¶GeoBox | GeoDataFrame
) –The reference geobox or reference geodataframe to download the data for.
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
download_tile
¶
Download a tile from a STAC API and write it to a zarr datacube.
Parameters:
-
tile
¶TileWrapper
) –The tile to download and write.
Returns:
-
Dataset
–xr.Dataset: The downloaded tile data.
Source code in src/smart_geocubes/accessors/stac.py
load
¶
load(
geobox: GeoBox,
buffer: int = 0,
persist: bool = True,
create: bool = False,
concurrency_mode: ConcurrencyModes = "blocking",
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
geobox
¶GeoBox
) –The reference geobox to load the data for.
-
buffer
¶int
, default:0
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
¶bool
, default:True
) –If the data should be persisted in memory. If not, this will return a Dask-backed Dataset. Defaults to True.
-
create
¶bool
, default:False
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
¶ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
load_like
¶
Load the data for the extent of a reference dataset or dataarray.
Parameters:
-
ref
¶Dataset | DataArray
) –The reference dataarray or dataset to load the data for.
-
**kwargs
¶Unpack[LoadParams]
, default:{}
) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer
(int
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
(bool
) –If the data should be persisted in memory. If not, this will return a Dask-backed Dataset. Defaults to True.
-
create
(bool
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
(ConcurrencyModes
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
log_benchmark_summary
¶
open_xarray
¶
Open the xarray datacube in read-only mode.
Returns:
-
Dataset
–xr.Dataset: The xarray datacube.
Source code in src/smart_geocubes/accessors/base.py
open_zarr
¶
Open the zarr datacube in read-only mode.
Returns:
-
Group
–zarr.Group: The zarr datacube.
Source code in src/smart_geocubes/accessors/base.py
post_create
¶
procedural_download
¶
Download the data for the given geobox.
Note
The "threading" concurrency mode requires Python 3.13 or higher.
Parameters:
-
geobox
¶GeoBox
) –The reference geobox to download the data for.
-
concurrency_mode
¶ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Raises:
-
ValueError
–If an unknown concurrency mode is provided.
Source code in src/smart_geocubes/accessors/base.py
procedural_download_blocking
¶
Download tiles procedurally in blocking mode.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. There, however, multiple processes may try to write concurrently, which results in a conflict. In such cases, the download is retried until it succeeds or the maximum number of tries is reached.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
procedural_download_threading
¶
Download tiles procedurally in threading mode.
Note
This method ensures that only a single download is running at a time. It uses a SetQueue to prevent duplicate downloads. The threading mode requires Python 3.13 or higher.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
RuntimeError
–If the Python version is lower than 3.13.
Source code in src/smart_geocubes/accessors/base.py
visualize_state
¶
Visualize the extent of the datacube, i.e. the data that has already been downloaded and filled.
Parameters:
-
ax
¶Axes | None
, default:None
) –The axes to draw on. If None, a new figure and axes are created.
Returns:
-
Figure | Axes
–plt.Figure | plt.Axes: The figure with the visualization if no axes were provided, else the axes.
Raises:
-
ValueError
–If the datacube is empty.
Source code in src/smart_geocubes/datasets/arcticdem.py
TCTrend
¶
Bases: GEEAccessor
Accessor for TCTrend data.
Attributes:
-
extent
(GeoBox
) –The extent of the datacube represented by a GeoBox.
-
chunk_size
(int
) –The chunk size of the datacube.
-
channels
(list
) –The channels of the datacube.
-
storage
(Storage
) –The icechunk storage.
-
repo
(Repository
) –The icechunk repository.
-
title
(str
) –The title of the datacube.
-
stopuhr
(StopUhr
) –The benchmarking timer from the stopuhr library.
-
zgeobox
(GeoBox
) –The geobox of the zarr array. Should be equal to the extent geobox.
-
created
(bool
) –True if the datacube already exists in the storage.
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage
¶Storage
) –The icechunk storage of the datacube.
-
create_icechunk_storage
¶bool
, default:True
) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
Raises:
-
ValueError
–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_tiles
–Get adjacent tiles from Google Earth Engine.
-
assert_created
–Assert that the datacube exists in the storage.
-
create
–Create an empty datacube and write it to the store.
-
current_state
–Get info about currently stored tiles.
-
download
–Download the data for the given region of interest which can be provided either as GeoBox or GeoDataFrame.
-
download_tile
–Download a tile from Google Earth Engine.
-
load
–Load the data for the given geobox.
-
load_like
–Load the data for the extent of a reference dataset or dataarray.
-
log_benchmark_summary
–Log the benchmark summary.
-
open_xarray
–Open the xarray datacube in read-only mode.
-
open_zarr
–Open the zarr datacube in read-only mode.
-
post_create
–Post-create actions. Can be overridden by the dataset accessor.
-
procedural_download
–Download the data for the given geobox.
-
procedural_download_blocking
–Download tiles procedurally in blocking mode.
-
procedural_download_threading
–Download tiles procedurally in threading mode.
-
visualize_state
–Visualize the extent of the datacube, i.e. the data that has already been downloaded and filled.
Source code in src/smart_geocubes/accessors/base.py
adjacent_tiles
¶
Get adjacent tiles from Google Earth Engine.
Parameters:
Returns:
-
list[TileWrapper]
–list[TileWrapper]: List of adjacent tiles, wrapped in a dedicated data structure for easier processing.
Source code in src/smart_geocubes/accessors/gee.py
assert_created
¶
Assert that the datacube exists in the storage.
Raises:
-
FileNotFoundError
–If the datacube does not exist.
Source code in src/smart_geocubes/accessors/base.py
create
¶
Create an empty datacube and write it to the store.
Parameters:
Raises:
-
FileExistsError
–If a datacube already exists at the storage location.
Source code in src/smart_geocubes/accessors/base.py
current_state
¶
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None
–gpd.GeoDataFrame: Tiles from odc.geo.GeoboxTiles. None if datacube is empty.
Source code in src/smart_geocubes/accessors/gee.py
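For this accessor the tile info comes from a regular tiling of the datacube grid rather than from catalogue items. What a regular tiling implies for a requested pixel window can be sketched in plain Python. `tiles_for_window` is a hypothetical helper and does not use odc.geo; it only shows the index arithmetic for square tiles of a fixed size.

```python
def tiles_for_window(row_start, row_stop, col_start, col_stop, tile_size):
    """Return (tile_row, tile_col) indices of all tiles touched by a half-open pixel window."""
    first_row, last_row = row_start // tile_size, (row_stop - 1) // tile_size
    first_col, last_col = col_start // tile_size, (col_stop - 1) // tile_size
    return [
        (r, c)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]


# Rows 900..1099 span tile rows 1 and 2; columns 0..511 stay inside tile column 0.
tiles = tiles_for_window(900, 1100, 0, 512, tile_size=512)
print(tiles)
```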
download
¶
Download the data for the given region of interest which can be provided either as GeoBox or GeoDataFrame.
Parameters:
-
roi
¶GeoBox | GeoDataFrame
) –The reference geobox or reference geodataframe to download the data for.
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
download_tile
¶
Download a tile from Google Earth Engine.
Parameters:
-
tile
¶TileWrapper
) –The tile to download.
Returns:
-
Dataset
–xr.Dataset: The downloaded tile data.
Source code in src/smart_geocubes/accessors/gee.py
load
¶
load(
geobox: GeoBox,
buffer: int = 0,
persist: bool = True,
create: bool = False,
concurrency_mode: ConcurrencyModes = "blocking",
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
geobox
¶GeoBox
) –The reference geobox to load the data for.
-
buffer
¶int
, default:0
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
¶bool
, default:True
) –If the data should be persisted in memory. If not, this will return a Dask-backed Dataset. Defaults to True.
-
create
¶bool
, default:False
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
¶ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
load_like
¶
Load the data for the extent of a reference dataset or dataarray.
Parameters:
-
ref
¶Dataset | DataArray
) –The reference dataarray or dataset to load the data for.
-
**kwargs
¶Unpack[LoadParams]
, default:{}
) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer
(int
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
(bool
) –If the data should be persisted in memory. If not, this will return a Dask-backed Dataset. Defaults to True.
-
create
(bool
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
(ConcurrencyModes
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
log_benchmark_summary
¶
open_xarray
¶
Open the xarray datacube in read-only mode.
Returns:
-
Dataset
–xr.Dataset: The xarray datacube.
Source code in src/smart_geocubes/accessors/base.py
open_zarr
¶
Open the zarr datacube in read-only mode.
Returns:
-
Group
–zarr.Group: The zarr datacube.
Source code in src/smart_geocubes/accessors/base.py
post_create
¶
procedural_download
¶
Download the data for the given geobox.
Note
The "threading" concurrency mode requires Python 3.13 or higher.
Parameters:
-
geobox
¶GeoBox
) –The reference geobox to download the data for.
-
concurrency_mode
¶ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Raises:
-
ValueError
–If an unknown concurrency mode is provided.
Source code in src/smart_geocubes/accessors/base.py
procedural_download_blocking
¶
Download tiles procedurally in blocking mode.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. There, however, multiple processes may try to write concurrently, which results in a conflict. In such cases, the download is retried until it succeeds or the maximum number of tries is reached.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
procedural_download_threading
¶
Download tiles procedurally in threading mode.
Note
This method ensures that only a single download is running at a time. It uses a SetQueue to prevent duplicate downloads. The threading mode requires Python 3.13 or higher.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
RuntimeError
–If the Python version is lower than 3.13.
Source code in src/smart_geocubes/accessors/base.py
visualize_state
¶
Visualize the extent of the datacube, i.e. the data that has already been downloaded and filled.
Parameters:
-
ax
¶Axes | None
, default:None
) –The axes to draw on. If None, a new figure and axes are created.
Returns:
-
Figure | Axes
–plt.Figure | plt.Axes: The figure with the visualization if no axes were provided, else the axes.
Raises:
-
ValueError
–If the datacube is empty.