smart_geocubes.datasets.arcticdem
Predefined accessors for ArcticDEM 32m, 10m and 2m data.
Classes:
-
ArcticDEM10m–Accessor for ArcticDEM 10m data.
-
ArcticDEM2m–Accessor for ArcticDEM 2m data.
-
ArcticDEM32m–Accessor for ArcticDEM 32m data.
-
ArcticDEMABC–Abstract base class (ABC) for ArcticDEM data.
-
LazyStacPatchIndex–Lazy wrapper for a PatchIndex containing a STAC Item.
ArcticDEM10m
ArcticDEM10m(
storage: Storage | Path | str,
create_icechunk_storage: bool = True,
backend: Literal["threaded", "simple"] = "threaded",
)
Bases: ArcticDEMABC
Accessor for ArcticDEM 10m data.
Attributes:
-
extent(GeoBox) –The extent of the datacube represented by a GeoBox.
-
chunk_size(int) –The chunk size of the datacube.
-
channels(list) –The channels of the datacube.
-
storage(Storage) –The icechunk storage.
-
repo(Repository) –The icechunk repository.
-
title(str) –The title of the datacube.
-
stopuhr(StopUhr) –The benchmarking timer from the stopuhr library.
-
zgeobox(GeoBox) –The geobox of the underlying zarr array. Should be equal to the extent geobox. However, this property is used to find the target index of the downloaded data, so better safe than sorry.
-
created(bool) –True if the datacube already exists in the storage.
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage(Storage) –The icechunk storage of the datacube.
-
create_icechunk_storage(bool, default:True) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
-
backend(Literal['threaded', 'simple'], default:'threaded') –The backend to use for downloading data. Currently, only "threaded" is supported. Defaults to "threaded".
Raises:
-
ValueError–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_patches–Get adjacent patch indexes from a STAC API.
-
assert_created–Assert that the datacube exists in the storage.
-
assert_temporal_cube–Assert that the datacube has a temporal dimension.
-
create–Create an empty datacube and write it to the store.
-
current_state–Get info about currently stored tiles.
-
download_patch–Download the data for the given patch.
-
load–Load the data for the given geobox.
-
load_like–Load the data for the given geobox.
-
loaded_patches–Get the ids of already (down-)loaded patches.
-
log_benchmark_summary–Log the benchmark summary.
-
open_xarray–Open the xarray datacube in read-only mode.
-
open_zarr–Open the zarr datacube in read-only mode.
-
post_create–Download the ArcticDEM mosaic extent info and store it in the datacube.
-
post_init–Check if the ArcticDEM mosaic extent info is already downloaded and download it if not.
-
procedural_download–Download tiles procedurally.
-
visualize_state–Visualize the extent of the datacube, i.e. the already downloaded and filled data.
Source code in src/smart_geocubes/core/accessor.py
created
property
Check if the datacube already exists in the storage.
Returns:
-
bool(bool) –True if the datacube already exists in the storage.
is_temporal
property
Check if the datacube has a temporal dimension.
Returns:
-
bool(bool) –True if the datacube has a temporal dimension.
adjacent_patches
Get adjacent patch indexes from a STAC API.
Overrides the default implementation of the STAC accessor to use pre-downloaded extent files instead of querying the STAC API. This speeds up loading, but requires the extent files to be downloaded beforehand, which happens in the post_create step.
Parameters:
-
roi(Geometry | GeoBox | GeoDataFrame) –The reference geometry, geobox or geodataframe.
-
toi(TOI) –The time of interest to download. Not used in this implementation since ArcticDEM is not temporal.
Returns:
-
list[PatchIndex]–list[PatchIndex]: List of adjacent patches, wrapped in a dedicated data structure for easier processing.
Raises:
-
ValueError–If the roi is not a GeoBox or a GeoDataFrame.
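As a rough illustration of the lookup described above, the following sketch finds tiles by intersecting a region of interest with locally stored tile bounding boxes instead of querying a remote API. It is not the library's implementation: the `Extent` class, the `adjacent_tile_ids` function and the sample coordinates are hypothetical, and only axis-aligned bounding boxes in a single CRS are assumed.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Bounding box of one tile (hypothetical stand-in for a row of the extent file)."""

    tile_id: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def adjacent_tile_ids(roi: tuple[float, float, float, float], extents: list[Extent]) -> list[str]:
    """Return the ids of all tiles whose bounding box overlaps the ROI bounding box."""
    rxmin, rymin, rxmax, rymax = roi
    return [
        e.tile_id
        for e in extents
        if e.xmin < rxmax and e.xmax > rxmin and e.ymin < rymax and e.ymax > rymin
    ]


# Two neighbouring tiles and one far away from the region of interest
extents = [
    Extent("tile_a", 0, 0, 100_000, 100_000),
    Extent("tile_b", 100_000, 0, 200_000, 100_000),
    Extent("tile_c", 500_000, 500_000, 600_000, 600_000),
]

# The ROI straddles the border between tile_a and tile_b
hits = adjacent_tile_ids((90_000, 10_000, 110_000, 20_000), extents)
print(hits)
```

Because the extents are available locally, the lookup needs no network round trip, which is the speed-up the description refers to.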
Source code in src/smart_geocubes/datasets/arcticdem.py
assert_created
assert_temporal_cube
Assert that the datacube has a temporal dimension.
Raises:
-
ValueError–If the datacube has no temporal dimension.
Source code in src/smart_geocubes/core/accessor.py
create
Create an empty datacube and write it to the store.
Parameters:
-
overwrite(bool, default:False) –Allow overwriting an existing datacube. Has no effect if exists_ok is True. Defaults to False.
-
exists_ok(bool, default:False) –Do not raise an error if the datacube already exists.
Raises:
-
FileExistsError–If a datacube already exists at location and exists_ok is False.
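The interplay of the two flags can be sketched with a small in-memory stand-in. This is only an illustration of the documented precedence (exists_ok wins over overwrite); `FakeStore` and `create_cube` are hypothetical names, and the assumption that overwrite=True replaces an existing cube instead of raising is not confirmed by the text above.

```python
class FakeStore:
    """In-memory stand-in for a storage location (hypothetical, for illustration only)."""

    def __init__(self):
        self.cube = None


def create_cube(store, overwrite=False, exists_ok=False):
    """Mimic the documented flag semantics and report what happened."""
    exists = store.cube is not None
    if exists and exists_ok:
        # The existing cube is accepted as-is; overwrite has no effect here
        return "kept"
    if exists and not overwrite:
        raise FileExistsError("a datacube already exists at this location")
    # Assumption: overwrite=True replaces the existing cube
    store.cube = {}
    return "overwritten" if exists else "created"


store = FakeStore()
results = [
    create_cube(store),                                  # nothing there yet
    create_cube(store, exists_ok=True),                  # exists, tolerated
    create_cube(store, overwrite=True, exists_ok=True),  # exists_ok takes precedence
    create_cube(store, overwrite=True),                  # replaced
]
print(results)
```

Calling `create_cube(store)` once more with both flags left at False raises FileExistsError in this sketch.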
Source code in src/smart_geocubes/core/accessor.py
current_state
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None–gpd.GeoDataFrame: Tile info from pystac. None if datacube is empty.
Source code in src/smart_geocubes/accessors/stac.py
download_patch
Download the data for the given patch.
Must be implemented by the Accessor.
Parameters:
-
idx(PatchIndex[Item]) –The reference patch to download the data for.
Returns:
-
Dataset–xr.Dataset: The downloaded patch data.
Source code in src/smart_geocubes/accessors/stac.py
load
load(
aoi: Geometry | GeoBox,
toi: TOI = None,
persist: bool = True,
create: bool = False,
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
aoi(Geometry | GeoBox) –The reference geometry to load the data for. If a GeoBox is provided, its extent is used.
-
toi(TOI, default:None) –The temporal slice to load. Defaults to None.
-
persist(bool, default:True) –Whether the data should be persisted in memory. If not, a Dask-backed Dataset is returned. Defaults to True.
-
create(bool, default:False) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
Returns:
-
Dataset–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/core/accessor.py
load_like
Load the data for the given geobox.
Parameters:
-
ref(Dataset | DataArray) –The reference dataarray or dataset to load the data for.
-
**kwargs(Unpack[LoadParams], default:{}) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer(int) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist(bool) –Whether the data should be persisted in memory. If not, a Dask-backed Dataset is returned. Defaults to True.
-
create(bool) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
Returns:
-
Dataset–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
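Since the buffer is given in pixels, its effect in map units depends on the pixel size. The following sketch shows that arithmetic for a plain bounding box. It assumes square pixels and an axis-aligned box; `buffered_bounds` is a hypothetical helper and not part of the library.

```python
def buffered_bounds(bounds, resolution, buffer=0):
    """Expand (xmin, ymin, xmax, ymax) by `buffer` pixels of size `resolution` on every side."""
    xmin, ymin, xmax, ymax = bounds
    pad = buffer * resolution  # buffer converted from pixels to map units
    return (xmin - pad, ymin - pad, xmax + pad, ymax + pad)


# A 1 km x 1 km box at 10 m resolution, padded by 5 pixels (50 m) on each side
padded = buffered_bounds((0.0, 0.0, 1000.0, 1000.0), resolution=10.0, buffer=5)
print(padded)  # (-50.0, -50.0, 1050.0, 1050.0)
```

With the default buffer of 0 the bounds are returned unchanged.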
Source code in src/smart_geocubes/core/accessor.py
loaded_patches
Get the ids of already (down-)loaded patches.
Returns:
Source code in src/smart_geocubes/core/accessor.py
log_benchmark_summary
open_xarray
open_zarr
Open the zarr datacube in read-only mode.
Returns:
-
Group–zarr.Group: The zarr datacube.
post_create
post_init
Check if the ArcticDEM mosaic extent info is already downloaded and download it if not.
Source code in src/smart_geocubes/datasets/arcticdem.py
procedural_download
Download tiles procedurally.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. In that case, however, multiple processes may try to write concurrently, which results in a conflict. The download is then retried until it succeeds or the maximum number of tries is reached.
Parameters:
-
aoi(Geometry | GeoBox) –The geometry of the AOI to download. If a GeoBox is provided, its extent is used.
-
toi(TOI) –The time of interest to download.
Raises:
-
ValueError–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError–If not all downloads were successful.
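The retry behaviour mentioned in the warning can be pictured as a bounded retry loop around the write step. The sketch below is a generic version of that pattern, not the library's code: `ConflictError`, `write_with_retry` and the `max_tries` parameter are hypothetical names chosen for illustration.

```python
class ConflictError(Exception):
    """Hypothetical error raised when two writers commit at the same time."""


def write_with_retry(write, max_tries=5):
    """Call write() until it succeeds or max_tries attempts have failed."""
    for attempt in range(1, max_tries + 1):
        try:
            return write(), attempt
        except ConflictError:
            if attempt == max_tries:
                # Give up and surface the conflict to the caller
                raise


# Simulate a writer that hits a conflict twice before succeeding
calls = {"n": 0}


def flaky_write():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConflictError("concurrent commit")
    return "committed"


result, attempts = write_with_retry(flaky_write)
print(result, attempts)  # committed 3
```

A production version would typically re-read the current state before each new attempt, which this sketch leaves out.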
Source code in src/smart_geocubes/core/accessor.py
visualize_state
Visualize the extent of the datacube, i.e. the already downloaded and filled data.
Parameters:
-
ax(Axes | None, default:None) –The axes to draw on. If None, a new figure and axes are created.
Returns:
-
Figure | Axes–plt.Figure | plt.Axes: The figure with the visualization if no axes was provided, else the axes.
Raises:
-
ValueError–If the datacube is empty
Source code in src/smart_geocubes/datasets/arcticdem.py
ArcticDEM2m
ArcticDEM2m(
storage: Storage | Path | str,
create_icechunk_storage: bool = True,
backend: Literal["threaded", "simple"] = "threaded",
)
Bases: ArcticDEMABC
Accessor for ArcticDEM 2m data.
Attributes:
-
extent(GeoBox) –The extent of the datacube represented by a GeoBox.
-
chunk_size(int) –The chunk size of the datacube.
-
channels(list) –The channels of the datacube.
-
storage(Storage) –The icechunk storage.
-
repo(Repository) –The icechunk repository.
-
title(str) –The title of the datacube.
-
stopuhr(StopUhr) –The benchmarking timer from the stopuhr library.
-
zgeobox(GeoBox) –The geobox of the underlying zarr array. Should be equal to the extent geobox. However, this property is used to find the target index of the downloaded data, so better safe than sorry.
-
created(bool) –True if the datacube already exists in the storage.
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage(Storage) –The icechunk storage of the datacube.
-
create_icechunk_storage(bool, default:True) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
-
backend(Literal['threaded', 'simple'], default:'threaded') –The backend to use for downloading data. Currently, only "threaded" is supported. Defaults to "threaded".
Raises:
-
ValueError–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_patches–Get adjacent patch indexes from a STAC API.
-
assert_created–Assert that the datacube exists in the storage.
-
assert_temporal_cube–Assert that the datacube has a temporal dimension.
-
create–Create an empty datacube and write it to the store.
-
current_state–Get info about currently stored tiles.
-
download_patch–Download the data for the given patch.
-
load–Load the data for the given geobox.
-
load_like–Load the data for the given geobox.
-
loaded_patches–Get the ids of already (down-)loaded patches.
-
log_benchmark_summary–Log the benchmark summary.
-
open_xarray–Open the xarray datacube in read-only mode.
-
open_zarr–Open the zarr datacube in read-only mode.
-
post_create–Download the ArcticDEM mosaic extent info and store it in the datacube.
-
post_init–Check if the ArcticDEM mosaic extent info is already downloaded and download it if not.
-
procedural_download–Download tiles procedurally.
-
visualize_state–Visualize the extent of the datacube, i.e. the already downloaded and filled data.
Source code in src/smart_geocubes/core/accessor.py
created
property
Check if the datacube already exists in the storage.
Returns:
-
bool(bool) –True if the datacube already exists in the storage.
is_temporal
property
Check if the datacube has a temporal dimension.
Returns:
-
bool(bool) –True if the datacube has a temporal dimension.
adjacent_patches
Get adjacent patch indexes from a STAC API.
Overrides the default implementation of the STAC accessor to use pre-downloaded extent files instead of querying the STAC API. This speeds up loading, but requires the extent files to be downloaded beforehand, which happens in the post_create step.
Parameters:
-
roi(Geometry | GeoBox | GeoDataFrame) –The reference geometry, geobox or geodataframe.
-
toi(TOI) –The time of interest to download. Not used in this implementation since ArcticDEM is not temporal.
Returns:
-
list[PatchIndex]–list[PatchIndex]: List of adjacent patches, wrapped in a dedicated data structure for easier processing.
Raises:
-
ValueError–If the roi is not a GeoBox or a GeoDataFrame.
Source code in src/smart_geocubes/datasets/arcticdem.py
assert_created
assert_temporal_cube
Assert that the datacube has a temporal dimension.
Raises:
-
ValueError–If the datacube has no temporal dimension.
Source code in src/smart_geocubes/core/accessor.py
create
Create an empty datacube and write it to the store.
Parameters:
-
overwrite(bool, default:False) –Allow overwriting an existing datacube. Has no effect if exists_ok is True. Defaults to False.
-
exists_ok(bool, default:False) –Do not raise an error if the datacube already exists.
Raises:
-
FileExistsError–If a datacube already exists at location and exists_ok is False.
Source code in src/smart_geocubes/core/accessor.py
current_state
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None–gpd.GeoDataFrame: Tile info from pystac. None if datacube is empty.
Source code in src/smart_geocubes/accessors/stac.py
download_patch
Download the data for the given patch.
Must be implemented by the Accessor.
Parameters:
-
idx(PatchIndex[Item]) –The reference patch to download the data for.
Returns:
-
Dataset–xr.Dataset: The downloaded patch data.
Source code in src/smart_geocubes/accessors/stac.py
load
load(
aoi: Geometry | GeoBox,
toi: TOI = None,
persist: bool = True,
create: bool = False,
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
aoi(Geometry | GeoBox) –The reference geometry to load the data for. If a GeoBox is provided, its extent is used.
-
toi(TOI, default:None) –The temporal slice to load. Defaults to None.
-
persist(bool, default:True) –Whether the data should be persisted in memory. If not, a Dask-backed Dataset is returned. Defaults to True.
-
create(bool, default:False) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
Returns:
-
Dataset–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/core/accessor.py
load_like
Load the data for the given geobox.
Parameters:
-
ref(Dataset | DataArray) –The reference dataarray or dataset to load the data for.
-
**kwargs(Unpack[LoadParams], default:{}) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer(int) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist(bool) –Whether the data should be persisted in memory. If not, a Dask-backed Dataset is returned. Defaults to True.
-
create(bool) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
Returns:
-
Dataset–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/core/accessor.py
loaded_patches
Get the ids of already (down-)loaded patches.
Returns:
Source code in src/smart_geocubes/core/accessor.py
log_benchmark_summary
open_xarray
open_zarr
Open the zarr datacube in read-only mode.
Returns:
-
Group–zarr.Group: The zarr datacube.
post_create
post_init
Check if the ArcticDEM mosaic extent info is already downloaded and download it if not.
Source code in src/smart_geocubes/datasets/arcticdem.py
procedural_download
Download tiles procedurally.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. In that case, however, multiple processes may try to write concurrently, which results in a conflict. The download is then retried until it succeeds or the maximum number of tries is reached.
Parameters:
-
aoi(Geometry | GeoBox) –The geometry of the AOI to download. If a GeoBox is provided, its extent is used.
-
toi(TOI) –The time of interest to download.
Raises:
-
ValueError–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError–If not all downloads were successful.
Source code in src/smart_geocubes/core/accessor.py
visualize_state
Visualize the extent of the datacube, i.e. the already downloaded and filled data.
Parameters:
-
ax(Axes | None, default:None) –The axes to draw on. If None, a new figure and axes are created.
Returns:
-
Figure | Axes–plt.Figure | plt.Axes: The figure with the visualization if no axes was provided, else the axes.
Raises:
-
ValueError–If the datacube is empty
Source code in src/smart_geocubes/datasets/arcticdem.py
ArcticDEM32m
ArcticDEM32m(
storage: Storage | Path | str,
create_icechunk_storage: bool = True,
backend: Literal["threaded", "simple"] = "threaded",
)
Bases: ArcticDEMABC
Accessor for ArcticDEM 32m data.
Attributes:
-
extent(GeoBox) –The extent of the datacube represented by a GeoBox.
-
chunk_size(int) –The chunk size of the datacube.
-
channels(list) –The channels of the datacube.
-
storage(Storage) –The icechunk storage.
-
repo(Repository) –The icechunk repository.
-
title(str) –The title of the datacube.
-
stopuhr(StopUhr) –The benchmarking timer from the stopuhr library.
-
zgeobox(GeoBox) –The geobox of the underlying zarr array. Should be equal to the extent geobox. However, this property is used to find the target index of the downloaded data, so better safe than sorry.
-
created(bool) –True if the datacube already exists in the storage.
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage(Storage) –The icechunk storage of the datacube.
-
create_icechunk_storage(bool, default:True) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
-
backend(Literal['threaded', 'simple'], default:'threaded') –The backend to use for downloading data. Currently, only "threaded" is supported. Defaults to "threaded".
Raises:
-
ValueError–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_patches–Get adjacent patch indexes from a STAC API.
-
assert_created–Assert that the datacube exists in the storage.
-
assert_temporal_cube–Assert that the datacube has a temporal dimension.
-
create–Create an empty datacube and write it to the store.
-
current_state–Get info about currently stored tiles.
-
download_patch–Download the data for the given patch.
-
load–Load the data for the given geobox.
-
load_like–Load the data for the given geobox.
-
loaded_patches–Get the ids of already (down-)loaded patches.
-
log_benchmark_summary–Log the benchmark summary.
-
open_xarray–Open the xarray datacube in read-only mode.
-
open_zarr–Open the zarr datacube in read-only mode.
-
post_create–Download the ArcticDEM mosaic extent info and store it in the datacube.
-
post_init–Check if the ArcticDEM mosaic extent info is already downloaded and download it if not.
-
procedural_download–Download tiles procedurally.
-
visualize_state–Visualize the extent of the datacube, i.e. the already downloaded and filled data.
Source code in src/smart_geocubes/core/accessor.py
created
property
Check if the datacube already exists in the storage.
Returns:
-
bool(bool) –True if the datacube already exists in the storage.
is_temporal
property
Check if the datacube has a temporal dimension.
Returns:
-
bool(bool) –True if the datacube has a temporal dimension.
adjacent_patches
Get adjacent patch indexes from a STAC API.
Overrides the default implementation of the STAC accessor to use pre-downloaded extent files instead of querying the STAC API. This speeds up loading, but requires the extent files to be downloaded beforehand, which happens in the post_create step.
Parameters:
-
roi(Geometry | GeoBox | GeoDataFrame) –The reference geometry, geobox or geodataframe.
-
toi(TOI) –The time of interest to download. Not used in this implementation since ArcticDEM is not temporal.
Returns:
-
list[PatchIndex]–list[PatchIndex]: List of adjacent patches, wrapped in a dedicated data structure for easier processing.
Raises:
-
ValueError–If the roi is not a GeoBox or a GeoDataFrame.
Source code in src/smart_geocubes/datasets/arcticdem.py
assert_created
assert_temporal_cube
Assert that the datacube has a temporal dimension.
Raises:
-
ValueError–If the datacube has no temporal dimension.
Source code in src/smart_geocubes/core/accessor.py
create
Create an empty datacube and write it to the store.
Parameters:
-
overwrite(bool, default:False) –Allow overwriting an existing datacube. Has no effect if exists_ok is True. Defaults to False.
-
exists_ok(bool, default:False) –Do not raise an error if the datacube already exists.
Raises:
-
FileExistsError–If a datacube already exists at location and exists_ok is False.
Source code in src/smart_geocubes/core/accessor.py
current_state
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None–gpd.GeoDataFrame: Tile info from pystac. None if datacube is empty.
Source code in src/smart_geocubes/accessors/stac.py
download_patch
Download the data for the given patch.
Must be implemented by the Accessor.
Parameters:
-
idx(PatchIndex[Item]) –The reference patch to download the data for.
Returns:
-
Dataset–xr.Dataset: The downloaded patch data.
Source code in src/smart_geocubes/accessors/stac.py
load
load(
aoi: Geometry | GeoBox,
toi: TOI = None,
persist: bool = True,
create: bool = False,
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
aoi(Geometry | GeoBox) –The reference geometry to load the data for. If a GeoBox is provided, its extent is used.
-
toi(TOI, default:None) –The temporal slice to load. Defaults to None.
-
persist(bool, default:True) –Whether the data should be persisted in memory. If not, a Dask-backed Dataset is returned. Defaults to True.
-
create(bool, default:False) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
Returns:
-
Dataset–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/core/accessor.py
load_like
Load the data for the given geobox.
Parameters:
-
ref(Dataset | DataArray) –The reference dataarray or dataset to load the data for.
-
**kwargs(Unpack[LoadParams], default:{}) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer(int) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist(bool) –Whether the data should be persisted in memory. If not, a Dask-backed Dataset is returned. Defaults to True.
-
create(bool) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
Returns:
-
Dataset–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/core/accessor.py
loaded_patches
Get the ids of already (down-)loaded patches.
Returns:
Source code in src/smart_geocubes/core/accessor.py
log_benchmark_summary
open_xarray
open_zarr
Open the zarr datacube in read-only mode.
Returns:
-
Group–zarr.Group: The zarr datacube.
post_create
post_init
Check if the ArcticDEM mosaic extent info is already downloaded and download it if not.
Source code in src/smart_geocubes/datasets/arcticdem.py
procedural_download
Download tiles procedurally.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. In that case, however, multiple processes may try to write concurrently, which results in a conflict. The download is then retried until it succeeds or the maximum number of tries is reached.
Parameters:
-
aoi(Geometry | GeoBox) –The geometry of the AOI to download. If a GeoBox is provided, its extent is used.
-
toi(TOI) –The time of interest to download.
Raises:
-
ValueError–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError–If not all downloads were successful.
Source code in src/smart_geocubes/core/accessor.py
visualize_state
Visualize the extent of the datacube, i.e. the already downloaded and filled data.
Parameters:
-
ax(Axes | None, default:None) –The axes to draw on. If None, a new figure and axes are created.
Returns:
-
Figure | Axes–plt.Figure | plt.Axes: The figure with the visualization if no axes was provided, else the axes.
Raises:
-
ValueError–If the datacube is empty
Source code in src/smart_geocubes/datasets/arcticdem.py
ArcticDEMABC
ArcticDEMABC(
storage: Storage | Path | str,
create_icechunk_storage: bool = True,
backend: Literal["threaded", "simple"] = "threaded",
)
Bases: STACAccessor
Abstract base class (ABC) for ArcticDEM data.
Attributes:
-
extent(GeoBox) –The extent of the datacube represented by a GeoBox.
-
chunk_size(int) –The chunk size of the datacube.
-
channels(list) –The channels of the datacube.
-
storage(Storage) –The icechunk storage.
-
repo(Repository) –The icechunk repository.
-
title(str) –The title of the datacube.
-
stopuhr(StopUhr) –The benchmarking timer from the stopuhr library.
-
zgeobox(GeoBox) –The geobox of the underlying zarr array. Should be equal to the extent geobox. However, this property is used to find the target index of the downloaded data, so better safe than sorry.
-
created(bool) –True if the datacube already exists in the storage.
Initialize base class for remote accessors.
Warning
In a multiprocessing environment, it is strongly recommended to set create_icechunk_storage=False.
Parameters:
-
storage(Storage) –The icechunk storage of the datacube.
-
create_icechunk_storage(bool, default:True) –Whether an icechunk repository should be created at the provided storage if none exists. This should be disabled in a multiprocessing environment. Defaults to True.
-
backend(Literal['threaded', 'simple'], default:'threaded') –The backend to use for downloading data. Currently, only "threaded" is supported. Defaults to "threaded".
Raises:
-
ValueError–If the storage is not an icechunk.Storage.
Methods:
-
adjacent_patches–Get adjacent patch indexes from a STAC API.
-
assert_created–Assert that the datacube exists in the storage.
-
assert_temporal_cube–Assert that the datacube has a temporal dimension.
-
create–Create an empty datacube and write it to the store.
-
current_state–Get info about currently stored tiles.
-
download_patch–Download the data for the given patch.
-
load–Load the data for the given geobox.
-
load_like–Load the data for the given geobox.
-
loaded_patches–Get the ids of already (down-)loaded patches.
-
log_benchmark_summary–Log the benchmark summary.
-
open_xarray–Open the xarray datacube in read-only mode.
-
open_zarr–Open the zarr datacube in read-only mode.
-
post_create–Download the ArcticDEM mosaic extent info and store it in the datacube.
-
post_init–Check if the ArcticDEM mosaic extent info is already downloaded and download it if not.
-
procedural_download–Download tiles procedurally.
-
visualize_state–Visualize the extent of the datacube, i.e. the already downloaded and filled data.
Source code in src/smart_geocubes/core/accessor.py
created
property
Check if the datacube already exists in the storage.
Returns:
-
bool(bool) –True if the datacube already exists in the storage.
is_temporal
property
Check if the datacube has a temporal dimension.
Returns:
-
bool(bool) –True if the datacube has a temporal dimension.
adjacent_patches
Get adjacent patch indexes from a STAC API.
Overrides the default implementation of the STAC accessor to use pre-downloaded extent files instead of querying the STAC API. This speeds up loading, but requires the extent files to be downloaded beforehand, which happens in the post_create step.
Parameters:
-
roi(Geometry | GeoBox | GeoDataFrame) –The reference geometry, geobox or geodataframe.
-
toi(TOI) –The time of interest to download. Not used in this implementation since ArcticDEM is not temporal.
Returns:
-
list[PatchIndex]–list[PatchIndex]: List of adjacent patches, wrapped in a dedicated data structure for easier processing.
Raises:
-
ValueError–If the roi is not a GeoBox or a GeoDataFrame.
Source code in src/smart_geocubes/datasets/arcticdem.py
assert_created
assert_temporal_cube
Assert that the datacube has a temporal dimension.
Raises:
-
ValueError–If the datacube has no temporal dimension.
Source code in src/smart_geocubes/core/accessor.py
create
Create an empty datacube and write it to the store.
Parameters:
-
overwrite(bool, default:False) –Allow overwriting an existing datacube. Has no effect if exists_ok is True. Defaults to False.
-
exists_ok(bool, default:False) –Do not raise an error if the datacube already exists.
Raises:
-
FileExistsError–If a datacube already exists at location and exists_ok is False.
Source code in src/smart_geocubes/core/accessor.py
current_state
Get info about currently stored tiles.
Returns:
-
GeoDataFrame | None–gpd.GeoDataFrame: Tile info from pystac. None if datacube is empty.
Source code in src/smart_geocubes/accessors/stac.py
download_patch
Download the data for the given patch.
Must be implemented by the Accessor.
Parameters:
-
idx(PatchIndex[Item]) –The reference patch to download the data for.
Returns:
-
Dataset–xr.Dataset: The downloaded patch data.
Source code in src/smart_geocubes/accessors/stac.py
load
load(
aoi: Geometry | GeoBox,
toi: TOI = None,
persist: bool = True,
create: bool = False,
) -> xr.Dataset
Load the data for the given geobox.
Parameters:
-
aoi(Geometry | GeoBox) –The reference geometry to load the data for. If a GeoBox is provided, its extent is used.
-
toi(TOI, default:None) –The temporal slice to load. Defaults to None.
-
persist(bool, default:True) –Whether the data should be persisted in memory. If not, a Dask-backed Dataset is returned. Defaults to True.
-
create(bool, default:False) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
Returns:
-
Dataset–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/core/accessor.py
load_like
Load the data for the given geobox.
Parameters:
-
ref(Dataset | DataArray) –The reference dataarray or dataset to load the data for.
-
**kwargs(Unpack[LoadParams], default:{}) –The load parameters (buffer, persist, create, concurrency_mode).
Other Parameters:
-
buffer(int) –The buffer around the projected geobox in pixels. Defaults to 0.
-
persist(bool) –Whether the data should be persisted in memory. If not, a Dask-backed Dataset is returned. Defaults to True.
-
create(bool) –Create a new zarr array at the defined storage if it does not exist. This is not recommended, because it can have side effects in a multi-process environment. Defaults to False.
Returns:
-
Dataset–xr.Dataset: The loaded dataset with the same resolution and extent as the geobox.
Source code in src/smart_geocubes/core/accessor.py
loaded_patches
Get the ids of already (down-)loaded patches.
Returns:
Source code in src/smart_geocubes/core/accessor.py
log_benchmark_summary
open_xarray
open_zarr
Open the zarr datacube in read-only mode.
Returns:
-
Group–zarr.Group: The zarr datacube.
post_create
post_init
Check if the ArcticDEM mosaic extent info is already downloaded and download it if not.
Source code in src/smart_geocubes/datasets/arcticdem.py
procedural_download
Download tiles procedurally.
Warning
This method is meant for single-process use, but can (in theory) be used in a multi-process environment. In that case, however, multiple processes may try to write concurrently, which results in a conflict. The download is then retried until it succeeds or the maximum number of tries is reached.
Parameters:
-
aoi(Geometry | GeoBox) –The geometry of the AOI to download. If a GeoBox is provided, its extent is used.
-
toi(TOI) –The time of interest to download.
Raises:
-
ValueError–If no adjacent tiles are found. This can happen if the geobox is out of the dataset bounds.
-
ValueError–If not all downloads were successful.
Source code in src/smart_geocubes/core/accessor.py
visualize_state
Visualize the extent of the datacube, i.e. the already downloaded and filled data.
Parameters:
-
ax(Axes | None, default:None) –The axes to draw on. If None, a new figure and axes are created.
Returns:
-
Figure | Axes–plt.Figure | plt.Axes: The figure with the visualization if no axes was provided, else the axes.
Raises:
-
ValueError–If the datacube is empty
Source code in src/smart_geocubes/datasets/arcticdem.py
LazyStacPatchIndex
Lazy wrapper for a PatchIndex containing a STAC Item.
This is necessary since the download function of the STAC accessor expects a TileWrapper object containing a pystac.Item.
However, creating such a pystac Item always fetches the metadata from the STAC API. For just loading the ArcticDEM data, we don't need this pystac Item. Hence, we create it lazily when it is actually needed.
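The lazy-creation idea can be sketched with the standard library alone. In the sketch below, `fetch_item` stands in for the expensive metadata request and `LazyPatch` for the wrapper; both names are hypothetical, and the real class may be structured differently. `functools.cached_property` defers the fetch until the attribute is first read and then reuses the result.

```python
from functools import cached_property

fetch_count = 0


def fetch_item(item_id):
    """Stand-in for an expensive metadata request (hypothetical)."""
    global fetch_count
    fetch_count += 1
    return {"id": item_id}


class LazyPatch:
    """Holds only the id; the full item is fetched on first access."""

    def __init__(self, item_id):
        self.id = item_id

    @cached_property
    def item(self):
        # Runs once, on first access; the result is cached on the instance
        return fetch_item(self.id)


patch = LazyPatch("tile_a")
before = fetch_count  # nothing fetched at construction time
first = patch.item    # triggers the fetch
second = patch.item   # served from the cache
after = fetch_count
print(before, after)  # 0 1
```

Code paths that only need the patch id never pay for the fetch, which matches the motivation given above.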