smart_geocubes.accessors.stac
STAC Accessor for Smart Geocubes.
Classes:
-
STACAccessor
–Accessor for STAC data.
Functions:
-
correct_bounds
–Correct the bounds of a tile to fit within a GeoBox.
STACAccessor
Bases: RemoteAccessor
Accessor for STAC data.
Attributes:
-
extent
(GeoBox
) –The extent of the datacube represented by a GeoBox.
-
chunk_size
(int
) –The chunk size of the datacube.
-
channels
(list
) –The channels of the datacube.
-
storage
(Storage
) –The icechunk storage.
-
repo
(Repository
) –The icechunk repository.
-
title
(str
) –The title of the datacube.
-
stopuhr
(StopUhr
) –The benchmarking timer from the stopuhr library.
-
zgeobox
(GeoBox
) –The geobox of the underlying zarr array. This should be equal to the extent geobox. However, this property is used to find the target index of the downloaded data, so better safe than sorry.
-
created
(bool
) –True if the datacube already exists in the storage.
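The zgeobox attribute above is described as the reference for finding the target index of downloaded data. A minimal sketch of that idea follows, assuming north-up grids that share one resolution. The `Grid` class and `target_slices` function are hypothetical stand-ins written for illustration; they are not part of smart_geocubes or odc-geo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    """Hypothetical stand-in for a north-up GeoBox (not the odc-geo class)."""

    left: float        # x coordinate of the top-left corner
    top: float         # y coordinate of the top-left corner
    resolution: float  # pixel edge length in CRS units
    width: int         # number of columns
    height: int        # number of rows


def target_slices(cube: Grid, tile: Grid) -> tuple[slice, slice]:
    """Return (row_slice, col_slice) locating `tile` inside `cube`."""
    if cube.resolution != tile.resolution:
        raise ValueError("tile and cube must share the same resolution")
    # Columns grow to the east, rows grow to the south (north-up grid).
    col = round((tile.left - cube.left) / cube.resolution)
    row = round((cube.top - tile.top) / cube.resolution)
    fits_rows = 0 <= row and row + tile.height <= cube.height
    fits_cols = 0 <= col and col + tile.width <= cube.width
    if not (fits_rows and fits_cols):
        raise ValueError("tile does not fit inside the cube")
    return slice(row, row + tile.height), slice(col, col + tile.width)


cube = Grid(left=0.0, top=1000.0, resolution=10.0, width=100, height=100)
tile = Grid(left=200.0, top=700.0, resolution=10.0, width=20, height=30)
rows, cols = target_slices(cube, tile)
print(rows, cols)
```

Computing the offsets against the grid of the stored array, rather than against the nominal extent, keeps the write position correct even if the two ever differ.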
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage
(Storage
) –The icechunk storage of the datacube.
-
create_icechunk_storage
(bool
, default:True
) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
Raises:
-
ValueError
–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_tiles
–Get adjacent tiles from a STAC API.
-
assert_created
–Assert that the datacube exists in the storage.
-
create
–Create an empty datacube and write it to the store.
-
current_state
–Get info about currently stored tiles.
-
download
–Download the data for the given region of interest, which can be provided as either a GeoBox or a GeoDataFrame.
-
download_tile
–Download a tile from a STAC API and write it to a zarr datacube.
-
load
–Load the data for the given geobox.
-
load_like
–Load the data for the geobox of the given reference dataset or dataarray.
-
log_benchmark_summary
–Log the benchmark summary.
-
open_xarray
–Open the xarray datacube in read-only mode.
-
open_zarr
–Open the zarr datacube in read-only mode.
-
post_create
–Post-create actions. Can be overridden by the dataset accessor.
-
procedural_download
–Download the data for the given geobox.
-
procedural_download_blocking
–Download tiles procedurally in blocking mode.
-
procedural_download_threading
–Download tiles procedurally in threading mode.
-
visualize_state
–Visualize currently stored tiles / chunks.
Source code in src/smart_geocubes/accessors/base.py
adjacent_tiles
Get adjacent tiles from a STAC API.
Parameters:
Returns:
-
list[TileWrapper]
–list[TileWrapper]: List of adjacent tiles, wrapped in a dedicated data structure for easier processing.
Source code in src/smart_geocubes/accessors/stac.py
assert_created
Assert that the datacube exists in the storage.
Raises:
-
FileNotFoundError
–If the datacube does not exist.
Source code in src/smart_geocubes/accessors/base.py
create
Create an empty datacube and write it to the store.
Parameters:
Raises:
-
FileExistsError
–If a datacube already exists at the storage location.
Source code in src/smart_geocubes/accessors/base.py
current_state
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None
–gpd.GeoDataFrame: Tile info from pystac. None if the datacube is empty.
Source code in src/smart_geocubes/accessors/stac.py
download
Download the data for the given region of interest, which can be provided as either a GeoBox or a GeoDataFrame.
Parameters:
-
roi
(GeoBox | GeoDataFrame
) –The reference geobox or reference geodataframe to download the data for.
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
download_tile
Download a tile from a STAC API and write it to a zarr datacube.
Parameters:
-
tile
(TileWrapper
) –The tile to download and write.
Returns:
-
Dataset
–xr.Dataset: The downloaded tile data.
Source code in src/smart_geocubes/accessors/stac.py
load
load(
    geobox: GeoBox,
    buffer: int = 0,
    persist: bool = True,
    create: bool = False,
    concurrency_mode: ConcurrencyModes = "blocking",
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
geobox
(GeoBox
) –The reference geobox to load the data for.
-
buffer
(int
, default:0
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
(bool
, default:True
) –If the data should be persisted in memory. If not, this will return a Dask backed Dataset. Defaults to True.
-
create
(bool
, default:False
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
(ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
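The buffer parameter of load is given in pixels around the projected geobox. As a rough illustration of what such a buffer means in map units, here is a minimal sketch using plain tuples. The helper names `buffered_bounds` and `buffered_shape` are hypothetical, and the sketch assumes square pixels; it does not reproduce how the library applies the buffer internally.

```python
def buffered_bounds(
    bounds: tuple[float, float, float, float],
    resolution: float,
    buffer: int = 0,
) -> tuple[float, float, float, float]:
    """Grow (left, bottom, right, top) by `buffer` pixels on every side."""
    left, bottom, right, top = bounds
    pad = buffer * resolution  # pixels converted to CRS units
    return (left - pad, bottom - pad, right + pad, top + pad)


def buffered_shape(shape: tuple[int, int], buffer: int = 0) -> tuple[int, int]:
    """Grow a (rows, cols) shape by `buffer` pixels on every side."""
    rows, cols = shape
    return (rows + 2 * buffer, cols + 2 * buffer)


# A 10 x 5 pixel area at 10 m resolution, buffered by 2 pixels.
padded_bounds = buffered_bounds((0.0, 0.0, 100.0, 50.0), resolution=10.0, buffer=2)
padded_shape = buffered_shape((5, 10), buffer=2)
print(padded_bounds, padded_shape)
```

With the default buffer of 0, both helpers return their input unchanged.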
load_like
Load the data for the geobox of the given reference dataset or dataarray.
Parameters:
-
ref
(Dataset | DataArray
) –The reference dataarray or dataset to load the data for.
-
**kwargs
(Unpack[LoadParams]
, default:{}
) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer
(int
) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist
(bool
) –If the data should be persisted in memory. If not, this will return a Dask backed Dataset. Defaults to True.
-
create
(bool
) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
-
concurrency_mode
(ConcurrencyModes
) –The concurrency mode for the download. Defaults to "blocking".
Returns:
-
Dataset
–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/accessors/base.py
log_benchmark_summary
open_xarray
Open the xarray datacube in read-only mode.
Returns:
-
Dataset
–xr.Dataset: The xarray datacube.
Source code in src/smart_geocubes/accessors/base.py
open_zarr
Open the zarr datacube in read-only mode.
Returns:
-
Group
–zarr.Group: The zarr datacube.
Source code in src/smart_geocubes/accessors/base.py
post_create
procedural_download
Download the data for the given geobox.
Note
The "threading" concurrency mode requires Python 3.13 or higher.
Parameters:
-
geobox
(GeoBox
) –The reference geobox to download the data for.
-
concurrency_mode
(ConcurrencyModes
, default:'blocking'
) –The concurrency mode for the download. Defaults to "blocking".
Raises:
-
ValueError
–If an unknown concurrency mode is provided.
Source code in src/smart_geocubes/accessors/base.py
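The dispatch described for procedural_download (pick a strategy by concurrency mode, reject unknown modes, and require Python 3.13 for threading) can be sketched as follows. This is only an illustration: the function `choose_download_strategy`, its `python_version` parameter, and the `Literal` definition of `ConcurrencyModes` are assumptions made for the example, not the library's code.

```python
import sys
from typing import Literal, Optional

# Assumed definition; the documentation only names "blocking" and "threading".
ConcurrencyModes = Literal["blocking", "threading"]


def choose_download_strategy(
    concurrency_mode: str = "blocking",
    python_version: Optional[tuple[int, int]] = None,
) -> str:
    """Return the name of the method that would handle the download."""
    # `python_version` is injectable only so the check can be demonstrated.
    version = sys.version_info[:2] if python_version is None else python_version
    if concurrency_mode == "blocking":
        return "procedural_download_blocking"
    if concurrency_mode == "threading":
        if version < (3, 13):
            raise RuntimeError("The threading mode requires Python 3.13 or higher.")
        return "procedural_download_threading"
    raise ValueError(f"Unknown concurrency mode: {concurrency_mode!r}")


print(choose_download_strategy("blocking"))
print(choose_download_strategy("threading", python_version=(3, 13)))
```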
procedural_download_blocking
Download tiles procedurally in blocking mode.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. However, in a multi-process environment, multiple processes may try to write concurrently, which results in a conflict. In such cases, the download is retried until it succeeds or the maximum number of tries is reached.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError
–If no tries are left.
Source code in src/smart_geocubes/accessors/base.py
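The retry behaviour described in the warning above (retry on a write conflict until success or until no tries are left) follows a common pattern, sketched here in isolation. `WriteConflict`, `write_with_retries`, and the `tries` parameter are hypothetical names chosen for the example; the exception type the library actually catches is not stated in this page.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class WriteConflict(Exception):
    """Hypothetical stand-in for a concurrent-write conflict error."""


def write_with_retries(write_tile: Callable[[], T], tries: int = 5) -> T:
    """Call `write_tile` until it succeeds or no tries are left."""
    while tries > 0:
        try:
            return write_tile()
        except WriteConflict:
            # Another writer committed first; try again with a fresh attempt.
            tries -= 1
    raise ValueError("No tries left.")


# Simulated writer that conflicts twice before succeeding.
attempts = {"count": 0}


def flaky_write() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise WriteConflict("another process committed first")
    return "written"


result = write_with_retries(flaky_write, tries=5)
print(result, attempts["count"])
```

Bounding the number of tries turns a persistent conflict into a visible error instead of an endless loop.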
procedural_download_threading
Download tiles procedurally in threading mode.
Note
This method ensures that only a single download is running at a time. It uses a SetQueue to prevent duplicate downloads. The threading mode requires Python 3.13 or higher.
Parameters:
Raises:
-
ValueError
–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
RuntimeError
–If the Python version is lower than 3.13.
Source code in src/smart_geocubes/accessors/base.py
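The note above mentions a SetQueue that prevents duplicate downloads while a single download runs at a time. One way such a queue can be built on the standard library is sketched below; this is an assumption about the general technique, not the library's own SetQueue implementation. It overrides the `_init`, `_put`, and `_get` hooks of `queue.Queue`.

```python
import queue
import threading


class SetQueue(queue.Queue):
    """FIFO queue that ignores items already waiting in the queue.

    Illustrative sketch only. Note that `put()` still counts ignored
    duplicates as unfinished tasks, so avoid `join()` with this class.
    """

    def _init(self, maxsize: int) -> None:
        super()._init(maxsize)  # creates the underlying deque
        self._pending = set()   # items currently waiting in the queue

    def _put(self, item) -> None:
        if item not in self._pending:
            self._pending.add(item)
            super()._put(item)

    def _get(self):
        item = super()._get()
        self._pending.discard(item)
        return item


tiles = SetQueue()
for tile_id in ["a", "b", "a", "c", "b"]:
    tiles.put(tile_id)  # the second "a" and "b" are dropped

downloaded = []


def worker() -> None:
    """Single consumer, so only one (simulated) download runs at a time."""
    while True:
        try:
            tile_id = tiles.get_nowait()
        except queue.Empty:
            return
        downloaded.append(tile_id)


thread = threading.Thread(target=worker)
thread.start()
thread.join()
print(downloaded)
```

Because an item leaves the pending set when it is taken from the queue, the same tile can be queued again later, for example after a failed download.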
visualize_state
abstractmethod
Visualize currently stored tiles / chunks.
Must be implemented by the DatasetAccessor.
Parameters:
-
ax
(Axes | None
, default:None
) –The axes to draw on. If None, a new figure and axes are created. Defaults to None.
Returns:
-
Figure | Axes
–plt.Figure | plt.Axes: The figure with the visualization
Source code in src/smart_geocubes/accessors/base.py
correct_bounds
Correct the bounds of a tile to fit within a GeoBox.
Parameters:
Raises:
-
ValueError
–If the tile is out of the geobox's bounds.
Returns:
-
Dataset
–xr.Dataset: The corrected tile.
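The idea behind correct_bounds, fitting a tile into a target extent and failing when the two do not overlap, can be illustrated on plain bounding boxes. The sketch below works on `(left, bottom, right, top)` tuples rather than on an xr.Dataset and a GeoBox, and the name `clip_bounds` is hypothetical; it shows the geometric step only, not the library function.

```python
Bounds = tuple[float, float, float, float]  # (left, bottom, right, top)


def clip_bounds(tile: Bounds, extent: Bounds) -> Bounds:
    """Clip `tile` to `extent`, raising if they do not overlap."""
    left = max(tile[0], extent[0])
    bottom = max(tile[1], extent[1])
    right = min(tile[2], extent[2])
    top = min(tile[3], extent[3])
    # An empty or inverted box means the tile lies outside the extent.
    if left >= right or bottom >= top:
        raise ValueError("The tile is out of the extent's bounds.")
    return (left, bottom, right, top)


extent = (0.0, 0.0, 100.0, 100.0)
clipped = clip_bounds((-10.0, 20.0, 50.0, 120.0), extent)
print(clipped)
```

A tile that already lies inside the extent is returned unchanged, so the correction is safe to apply to every tile.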