icechunk.config#

Repository configuration, manifest settings, compression, and caching.

Classes:

Name Description
CachingConfig

Configuration for how Icechunk caches its metadata files

CompressionAlgorithm

Enum for selecting the compression algorithm used by Icechunk to write its metadata files

CompressionConfig

Configuration for how Icechunk compresses its metadata files

ManifestConfig

Configuration for Icechunk manifests

ManifestPreloadCondition

Configuration for conditions under which manifests will preload on session creation

ManifestPreloadConfig

Configuration for how Icechunk preloads manifests on session creation

ManifestSplitCondition

Configuration for conditions under which manifests will be split

ManifestSplitDimCondition

Conditions for specifying dimensions along which to shard manifests.

ManifestSplittingConfig

Configuration for manifest splitting.

ManifestVirtualChunkLocationCompressionConfig

Configuration for zstd dictionary-based compression of virtual chunk location URLs in manifests.

RepositoryConfig

Configuration for an Icechunk repository

Functions:

Name Description
initialize_logs

Initialize the logging system for the library.

set_logs_filter

Set filters and log levels for the different modules.

CachingConfig #

Configuration for how Icechunk caches its metadata files

Methods:

Name Description
__new__

Create a new CachingConfig object

Attributes:

Name Type Description
num_bytes_attributes int | None

The number of bytes of attributes to cache.

num_bytes_chunks int | None

The number of bytes of chunks to cache.

num_chunk_refs int | None

The number of chunk references to cache.

num_snapshot_nodes int | None

The number of snapshot nodes to cache.

num_transaction_changes int | None

The number of transaction changes to cache.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class CachingConfig:
    """Configuration for how Icechunk caches its metadata files"""

    def __new__(
        cls,
        num_snapshot_nodes: int | None = None,
        num_chunk_refs: int | None = None,
        num_transaction_changes: int | None = None,
        num_bytes_attributes: int | None = None,
        num_bytes_chunks: int | None = None,
    ) -> CachingConfig:
        """
        Create a new `CachingConfig` object

        Parameters
        ----------
        num_snapshot_nodes: int | None
            The number of snapshot nodes to cache.
            Default: 500,000
        num_chunk_refs: int | None
            The number of chunk references to cache.
            Default: 15,000,000
        num_transaction_changes: int | None
            The number of transaction changes to cache.
            Default: 0
        num_bytes_attributes: int | None
            The number of bytes of attributes to cache.
            Default: 0
        num_bytes_chunks: int | None
            The number of bytes of chunks to cache.
            Default: 0
        """
        ...
    def __repr__(self, /) -> str: ...
    def __str__(self, /) -> str: ...
    def _repr_html_(self, /) -> str: ...
    @property
    def num_snapshot_nodes(self) -> int | None:
        """
        The number of snapshot nodes to cache.

        Default: 500,000

        Returns
        -------
        int | None
            The number of snapshot nodes to cache.
        """
        ...
    @num_snapshot_nodes.setter
    def num_snapshot_nodes(self, value: int | None) -> None:
        """
        Set the number of snapshot nodes to cache.

        Parameters
        ----------
        value: int | None
            The number of snapshot nodes to cache.
        """
        ...
    @property
    def num_chunk_refs(self) -> int | None:
        """
        The number of chunk references to cache.

        Default: 15,000,000

        Returns
        -------
        int | None
            The number of chunk references to cache.
        """
        ...
    @num_chunk_refs.setter
    def num_chunk_refs(self, value: int | None) -> None:
        """
        Set the number of chunk references to cache.

        Parameters
        ----------
        value: int | None
            The number of chunk references to cache.
        """
        ...
    @property
    def num_transaction_changes(self) -> int | None:
        """
        The number of transaction changes to cache.

        Default: 0

        Returns
        -------
        int | None
            The number of transaction changes to cache.
        """
        ...
    @num_transaction_changes.setter
    def num_transaction_changes(self, value: int | None) -> None:
        """
        Set the number of transaction changes to cache.

        Parameters
        ----------
        value: int | None
            The number of transaction changes to cache.
        """
        ...
    @property
    def num_bytes_attributes(self) -> int | None:
        """
        The number of bytes of attributes to cache.

        Default: 0

        Returns
        -------
        int | None
            The number of bytes of attributes to cache.
        """
        ...
    @num_bytes_attributes.setter
    def num_bytes_attributes(self, value: int | None) -> None:
        """
        Set the number of bytes of attributes to cache.

        Parameters
        ----------
        value: int | None
            The number of bytes of attributes to cache.
        """
        ...
    @property
    def num_bytes_chunks(self) -> int | None:
        """
        The number of bytes of chunks to cache.

        Default: 0

        Returns
        -------
        int | None
            The number of bytes of chunks to cache.
        """
        ...
    @num_bytes_chunks.setter
    def num_bytes_chunks(self, value: int | None) -> None:
        """
        Set the number of bytes of chunks to cache.

        Parameters
        ----------
        value: int | None
            The number of bytes of chunks to cache.
        """
        ...

num_bytes_attributes property writable #

num_bytes_attributes

The number of bytes of attributes to cache.

Default: 0

Returns:

Type Description
int | None

The number of bytes of attributes to cache.

num_bytes_chunks property writable #

num_bytes_chunks

The number of bytes of chunks to cache.

Default: 0

Returns:

Type Description
int | None

The number of bytes of chunks to cache.

num_chunk_refs property writable #

num_chunk_refs

The number of chunk references to cache.

Default: 15,000,000

Returns:

Type Description
int | None

The number of chunk references to cache.

num_snapshot_nodes property writable #

num_snapshot_nodes

The number of snapshot nodes to cache.

Default: 500,000

Returns:

Type Description
int | None

The number of snapshot nodes to cache.

num_transaction_changes property writable #

num_transaction_changes

The number of transaction changes to cache.

Default: 0

Returns:

Type Description
int | None

The number of transaction changes to cache.

__new__ #

__new__(
    num_snapshot_nodes=None,
    num_chunk_refs=None,
    num_transaction_changes=None,
    num_bytes_attributes=None,
    num_bytes_chunks=None,
)

Create a new CachingConfig object

Parameters:

Name Type Description Default
num_snapshot_nodes int | None

The number of snapshot nodes to cache. Default: 500,000

None
num_chunk_refs int | None

The number of chunk references to cache. Default: 15,000,000

None
num_transaction_changes int | None

The number of transaction changes to cache. Default: 0

None
num_bytes_attributes int | None

The number of bytes of attributes to cache. Default: 0

None
num_bytes_chunks int | None

The number of bytes of chunks to cache. Default: 0

None
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __new__(
    cls,
    num_snapshot_nodes: int | None = None,
    num_chunk_refs: int | None = None,
    num_transaction_changes: int | None = None,
    num_bytes_attributes: int | None = None,
    num_bytes_chunks: int | None = None,
) -> CachingConfig:
    """
    Create a new `CachingConfig` object

    Parameters
    ----------
    num_snapshot_nodes: int | None
        The number of snapshot nodes to cache.
        Default: 500,000
    num_chunk_refs: int | None
        The number of chunk references to cache.
        Default: 15,000,000
    num_transaction_changes: int | None
        The number of transaction changes to cache.
        Default: 0
    num_bytes_attributes: int | None
        The number of bytes of attributes to cache.
        Default: 0
    num_bytes_chunks: int | None
        The number of bytes of chunks to cache.
        Default: 0
    """
    ...
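
As a rough illustration of how the `None` arguments relate to the documented defaults, the sketch below resolves a set of overrides against the default values listed above. It is a pure-Python stand-in and does not call Icechunk: `DOCUMENTED_DEFAULTS` and `effective_caching` are hypothetical names, and it assumes that passing `None` means "use the documented default", which this page suggests but does not state outright.

```python
# Defaults copied from the CachingConfig parameter descriptions on this page.
DOCUMENTED_DEFAULTS = {
    "num_snapshot_nodes": 500_000,
    "num_chunk_refs": 15_000_000,
    "num_transaction_changes": 0,
    "num_bytes_attributes": 0,
    "num_bytes_chunks": 0,
}


def effective_caching(**overrides):
    """Return the cache sizes that would apply (hypothetical helper).

    Assumption: a value of None means "fall back to the documented default".
    """
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown caching option(s): {sorted(unknown)}")
    resolved = dict(DOCUMENTED_DEFAULTS)
    for name, value in overrides.items():
        if value is not None:
            resolved[name] = value
    return resolved


# Override one setting; leave another explicitly unset.
settings = effective_caching(num_chunk_refs=1_000_000, num_snapshot_nodes=None)
print(settings["num_chunk_refs"], settings["num_snapshot_nodes"])
```

The same keyword names are the ones accepted by `CachingConfig.__new__`, so a dictionary like this could be reviewed before being passed on to the real constructor.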

CompressionAlgorithm #

Bases: Enum

Enum for selecting the compression algorithm used by Icechunk to write its metadata files

Attributes:

Name Type Description
Zstd int

The Zstd compression algorithm.

Methods:

Name Description
default

The default compression algorithm used by Icechunk to write its metadata files.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class CompressionAlgorithm(Enum):
    """Enum for selecting the compression algorithm used by Icechunk to write its metadata files

    Attributes
    ----------
    Zstd: int
        The Zstd compression algorithm.
    """

    Zstd = 0

    def __new__(cls) -> CompressionAlgorithm: ...
    @staticmethod
    def default() -> CompressionAlgorithm:
        """
        The default compression algorithm used by Icechunk to write its metadata files.

        Returns
        -------
        CompressionAlgorithm
            The default compression algorithm.
        """
        ...

default staticmethod #

default()

The default compression algorithm used by Icechunk to write its metadata files.

Returns:

Type Description
CompressionAlgorithm

The default compression algorithm.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def default() -> CompressionAlgorithm:
    """
    The default compression algorithm used by Icechunk to write its metadata files.

    Returns
    -------
    CompressionAlgorithm
        The default compression algorithm.
    """
    ...

CompressionConfig #

Configuration for how Icechunk compresses its metadata files

Methods:

Name Description
__new__

Create a new CompressionConfig object

default

The default compression configuration used by Icechunk to write its metadata files.

Attributes:

Name Type Description
algorithm CompressionAlgorithm | None

The compression algorithm used by Icechunk to write its metadata files.

level int | None

The compression level used by Icechunk to write its metadata files.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class CompressionConfig:
    """Configuration for how Icechunk compresses its metadata files"""

    def __new__(
        cls, algorithm: CompressionAlgorithm | None = None, level: int | None = None
    ) -> CompressionConfig:
        """
        Create a new `CompressionConfig` object

        Parameters
        ----------
        algorithm: CompressionAlgorithm | None
            The compression algorithm to use.
            Default: Zstd
        level: int | None
            The compression level to use.
            Default: 3
        """
        ...
    @property
    def algorithm(self) -> CompressionAlgorithm | None:
        """
        The compression algorithm used by Icechunk to write its metadata files.

        Default: Zstd

        Returns
        -------
        CompressionAlgorithm | None
            The compression algorithm used by Icechunk to write its metadata files.
        """
        ...
    @algorithm.setter
    def algorithm(self, value: CompressionAlgorithm | None) -> None:
        """
        Set the compression algorithm used by Icechunk to write its metadata files.

        Parameters
        ----------
        value: CompressionAlgorithm | None
            The compression algorithm to use.
        """
        ...
    @property
    def level(self) -> int | None:
        """
        The compression level used by Icechunk to write its metadata files.

        Default: 3

        Returns
        -------
        int | None
            The compression level used by Icechunk to write its metadata files.
        """
        ...
    @level.setter
    def level(self, value: int | None) -> None:
        """
        Set the compression level used by Icechunk to write its metadata files.

        Parameters
        ----------
        value: int | None
            The compression level to use.
        """
        ...
    @staticmethod
    def default() -> CompressionConfig:
        """
        The default compression configuration used by Icechunk to write its metadata files.

        Returns
        -------
        CompressionConfig
        """

algorithm property writable #

algorithm

The compression algorithm used by Icechunk to write its metadata files.

Default: Zstd

Returns:

Type Description
CompressionAlgorithm | None

The compression algorithm used by Icechunk to write its metadata files.

level property writable #

level

The compression level used by Icechunk to write its metadata files.

Default: 3

Returns:

Type Description
int | None

The compression level used by Icechunk to write its metadata files.

__new__ #

__new__(algorithm=None, level=None)

Create a new CompressionConfig object

Parameters:

Name Type Description Default
algorithm CompressionAlgorithm | None

The compression algorithm to use. Default: Zstd

None
level int | None

The compression level to use. Default: 3

None
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __new__(
    cls, algorithm: CompressionAlgorithm | None = None, level: int | None = None
) -> CompressionConfig:
    """
    Create a new `CompressionConfig` object

    Parameters
    ----------
    algorithm: CompressionAlgorithm | None
        The compression algorithm to use.
        Default: Zstd
    level: int | None
        The compression level to use.
        Default: 3
    """
    ...

default staticmethod #

default()

The default compression configuration used by Icechunk to write its metadata files.

Returns:

Type Description
CompressionConfig
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def default() -> CompressionConfig:
    """
    The default compression configuration used by Icechunk to write its metadata files.

    Returns
    -------
    CompressionConfig
    """

ManifestConfig #

Configuration for Icechunk manifests

Methods:

Name Description
__new__

Create a new ManifestConfig object

Attributes:

Name Type Description
preload ManifestPreloadConfig | None

The configuration for how Icechunk manifests will be preloaded.

splitting ManifestSplittingConfig | None

The configuration for how Icechunk manifests will be split.

virtual_chunk_location_compression ManifestVirtualChunkLocationCompressionConfig | None

The configuration for zstd compression of virtual chunk location URLs.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class ManifestConfig:
    """Configuration for Icechunk manifests"""

    def __new__(
        cls,
        preload: ManifestPreloadConfig | None = None,
        splitting: ManifestSplittingConfig | None = None,
        virtual_chunk_location_compression: ManifestVirtualChunkLocationCompressionConfig
        | None = None,
    ) -> ManifestConfig:
        """
        Create a new `ManifestConfig` object

        Parameters
        ----------
        preload: ManifestPreloadConfig | None
            The configuration for how Icechunk manifests will be preloaded. When
            None, the default `ManifestPreloadConfig` is used.
            Default: None
        splitting: ManifestSplittingConfig | None
            The configuration for how Icechunk manifests will be split. When
            None, the default `ManifestSplittingConfig` is used.
            Default: None
        virtual_chunk_location_compression: ManifestVirtualChunkLocationCompressionConfig | None
            The configuration for zstd compression of virtual chunk location URLs.
            When None, the default `ManifestVirtualChunkLocationCompressionConfig` is used.
            Default: None
        """
        ...
    @property
    def preload(self) -> ManifestPreloadConfig | None:
        """
        The configuration for how Icechunk manifests will be preloaded.

        Default: None (uses the default `ManifestPreloadConfig`)

        Returns
        -------
        ManifestPreloadConfig | None
            The configuration for how Icechunk manifests will be preloaded.
        """
        ...
    @preload.setter
    def preload(self, value: ManifestPreloadConfig | None) -> None:
        """
        Set the configuration for how Icechunk manifests will be preloaded.

        Parameters
        ----------
        value: ManifestPreloadConfig | None
            The configuration for how Icechunk manifests will be preloaded.
        """
        ...

    @property
    def splitting(self) -> ManifestSplittingConfig | None:
        """
        The configuration for how Icechunk manifests will be split.

        Default: None (uses the default `ManifestSplittingConfig`)

        Returns
        -------
        ManifestSplittingConfig | None
            The configuration for how Icechunk manifests will be split.
        """
        ...

    @splitting.setter
    def splitting(self, value: ManifestSplittingConfig | None) -> None:
        """
        Set the configuration for how Icechunk manifests will be split.

        Parameters
        ----------
        value: ManifestSplittingConfig | None
            The configuration for how Icechunk manifests will be split.
        """
        ...

    @property
    def virtual_chunk_location_compression(
        self,
    ) -> ManifestVirtualChunkLocationCompressionConfig | None:
        """
        The configuration for zstd compression of virtual chunk location URLs.

        Default: None (uses the default `ManifestVirtualChunkLocationCompressionConfig`)

        Returns
        -------
        ManifestVirtualChunkLocationCompressionConfig | None
            The compression configuration.
        """
        ...

    @virtual_chunk_location_compression.setter
    def virtual_chunk_location_compression(
        self, value: ManifestVirtualChunkLocationCompressionConfig | None
    ) -> None:
        """
        Set the configuration for zstd compression of virtual chunk location URLs.

        Parameters
        ----------
        value: ManifestVirtualChunkLocationCompressionConfig | None
            The compression configuration.
        """
        ...

preload property writable #

preload

The configuration for how Icechunk manifests will be preloaded.

Default: None (uses the default ManifestPreloadConfig)

Returns:

Type Description
ManifestPreloadConfig | None

The configuration for how Icechunk manifests will be preloaded.

splitting property writable #

splitting

The configuration for how Icechunk manifests will be split.

Default: None (uses the default ManifestSplittingConfig)

Returns:

Type Description
ManifestSplittingConfig | None

The configuration for how Icechunk manifests will be split.

virtual_chunk_location_compression property writable #

virtual_chunk_location_compression

The configuration for zstd compression of virtual chunk location URLs.

Default: None (uses the default ManifestVirtualChunkLocationCompressionConfig)

Returns:

Type Description
ManifestVirtualChunkLocationCompressionConfig | None

The compression configuration.

__new__ #

__new__(
    preload=None,
    splitting=None,
    virtual_chunk_location_compression=None,
)

Create a new ManifestConfig object

Parameters:

Name Type Description Default
preload ManifestPreloadConfig | None

The configuration for how Icechunk manifests will be preloaded. When None, the default ManifestPreloadConfig is used. Default: None

None
splitting ManifestSplittingConfig | None

The configuration for how Icechunk manifests will be split. When None, the default ManifestSplittingConfig is used. Default: None

None
virtual_chunk_location_compression ManifestVirtualChunkLocationCompressionConfig | None

The configuration for zstd compression of virtual chunk location URLs. When None, the default ManifestVirtualChunkLocationCompressionConfig is used. Default: None

None
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __new__(
    cls,
    preload: ManifestPreloadConfig | None = None,
    splitting: ManifestSplittingConfig | None = None,
    virtual_chunk_location_compression: ManifestVirtualChunkLocationCompressionConfig
    | None = None,
) -> ManifestConfig:
    """
    Create a new `ManifestConfig` object

    Parameters
    ----------
    preload: ManifestPreloadConfig | None
        The configuration for how Icechunk manifests will be preloaded. When
        None, the default `ManifestPreloadConfig` is used.
        Default: None
    splitting: ManifestSplittingConfig | None
        The configuration for how Icechunk manifests will be split. When
        None, the default `ManifestSplittingConfig` is used.
        Default: None
    virtual_chunk_location_compression: ManifestVirtualChunkLocationCompressionConfig | None
        The configuration for zstd compression of virtual chunk location URLs.
        When None, the default `ManifestVirtualChunkLocationCompressionConfig` is used.
        Default: None
    """
    ...

ManifestPreloadCondition #

Configuration for conditions under which manifests will preload on session creation

Methods:

Name Description
__and__

Create a preload condition that matches if both this condition and other match.

__or__

Create a preload condition that matches if either this condition or other matches.

and_conditions

Create a preload condition that matches only if all passed conditions match

false

Create a preload condition that never matches any manifests

name_matches

Create a preload condition that matches if the array's name matches the passed regex.

num_refs

Create a preload condition that matches only if the number of chunk references in the manifest is within the given range.

or_conditions

Create a preload condition that matches if any of the passed conditions matches

path_matches

Create a preload condition that matches if the full path to the array matches the passed regex.

true

Create a preload condition that always matches any manifest

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class ManifestPreloadCondition:
    """Configuration for conditions under which manifests will preload on session creation"""

    @staticmethod
    def or_conditions(
        conditions: list[ManifestPreloadCondition],
    ) -> ManifestPreloadCondition:
        """Create a preload condition that matches if any of `conditions` matches"""
        ...
    @staticmethod
    def and_conditions(
        conditions: list[ManifestPreloadCondition],
    ) -> ManifestPreloadCondition:
        """Create a preload condition that matches only if all passed `conditions` match"""
        ...
    @staticmethod
    def path_matches(regex: str) -> ManifestPreloadCondition:
        """Create a preload condition that matches if the full path to the array matches the passed regex.

        Array paths are absolute, as in `/path/to/my/array`
        """
        ...
    @staticmethod
    def name_matches(regex: str) -> ManifestPreloadCondition:
        """Create a preload condition that matches if the array's name matches the passed regex.

        For example, for an array `/model/outputs/temperature`, the following will match:
        ```
        name_matches(".*temp.*")
        ```
        """
        ...
    @staticmethod
    def num_refs(from_refs: int | None, to_refs: int | None) -> ManifestPreloadCondition:
        """Create a preload condition that matches only if the number of chunk references in the manifest is within the given range.

        from_refs is inclusive, to_refs is exclusive.
        """
        ...
    @staticmethod
    def true() -> ManifestPreloadCondition:
        """Create a preload condition that always matches any manifest"""
        ...
    @staticmethod
    def false() -> ManifestPreloadCondition:
        """Create a preload condition that never matches any manifests"""
        ...
    def __and__(self, other: ManifestPreloadCondition, /) -> ManifestPreloadCondition:
        """Create a preload condition that matches if both this condition and `other` match."""
        ...
    def __or__(self, other: ManifestPreloadCondition, /) -> ManifestPreloadCondition:
        """Create a preload condition that matches if either this condition or `other` matches."""
        ...

__and__ #

__and__(other)

Create a preload condition that matches if both this condition and other match.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __and__(self, other: ManifestPreloadCondition, /) -> ManifestPreloadCondition:
    """Create a preload condition that matches if both this condition and `other` match."""
    ...

__or__ #

__or__(other)

Create a preload condition that matches if either this condition or other matches.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __or__(self, other: ManifestPreloadCondition, /) -> ManifestPreloadCondition:
    """Create a preload condition that matches if either this condition or `other` matches."""
    ...
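
To make the operator semantics concrete, here is a minimal pure-Python model of condition composition. It does not use Icechunk: `Cond` is a hypothetical stand-in that wraps a predicate over a small dictionary describing an array, and the `name` and `num_refs` keys are made up for the illustration. It only mirrors the documented behaviour that `&` requires both sides to match, `|` requires at least one, and `and_conditions` / `or_conditions` generalise these to lists.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Cond:
    """Hypothetical stand-in for a preload condition (not the Icechunk class)."""

    matches: Callable[[dict], bool]

    def __and__(self, other: Cond) -> Cond:
        # Matches only if both operands match.
        return Cond(lambda rec: self.matches(rec) and other.matches(rec))

    def __or__(self, other: Cond) -> Cond:
        # Matches if at least one operand matches.
        return Cond(lambda rec: self.matches(rec) or other.matches(rec))

    @staticmethod
    def and_conditions(conditions: list[Cond]) -> Cond:
        return Cond(lambda rec: all(c.matches(rec) for c in conditions))

    @staticmethod
    def or_conditions(conditions: list[Cond]) -> Cond:
        return Cond(lambda rec: any(c.matches(rec) for c in conditions))


small = Cond(lambda rec: rec["num_refs"] < 1000)
is_time = Cond(lambda rec: rec["name"] == "time")

both = small & is_time
either = small | is_time

record = {"name": "time", "num_refs": 5000}  # right name, too many refs
print(both.matches(record), either.matches(record))
```

In this model `small & is_time` behaves the same as `Cond.and_conditions([small, is_time])`; the list forms are mainly convenient when the conditions are built programmatically.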

and_conditions staticmethod #

and_conditions(conditions)

Create a preload condition that matches only if all passed conditions match

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def and_conditions(
    conditions: list[ManifestPreloadCondition],
) -> ManifestPreloadCondition:
    """Create a preload condition that matches only if all passed `conditions` match"""
    ...

false staticmethod #

false()

Create a preload condition that never matches any manifests

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def false() -> ManifestPreloadCondition:
    """Create a preload condition that never matches any manifests"""
    ...

name_matches staticmethod #

name_matches(regex)

Create a preload condition that matches if the array's name matches the passed regex.

For example, for an array /model/outputs/temperature, the following will match:

name_matches(".*temp.*")

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def name_matches(regex: str) -> ManifestPreloadCondition:
    """Create a preload condition that matches if the array's name matches the passed regex.

    For example, for an array `/model/outputs/temperature`, the following will match:
    ```
    name_matches(".*temp.*")
    ```
    """
    ...

num_refs staticmethod #

num_refs(from_refs, to_refs)

Create a preload condition that matches only if the number of chunk references in the manifest is within the given range.

from_refs is inclusive, to_refs is exclusive.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def num_refs(from_refs: int | None, to_refs: int | None) -> ManifestPreloadCondition:
    """Create a preload condition that matches only if the number of chunk references in the manifest is within the given range.

    from_refs is inclusive, to_refs is exclusive.
    """
    ...
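
The half-open range described above can be written out as a small check. This is a sketch in plain Python, not Icechunk code: `in_ref_range` is a hypothetical helper, and treating `None` as "no bound on that side" is an assumption based on the `int | None` parameter types.

```python
from __future__ import annotations


def in_ref_range(num_refs: int, from_refs: int | None, to_refs: int | None) -> bool:
    """Return True if num_refs falls in [from_refs, to_refs).

    from_refs is inclusive and to_refs is exclusive, as documented.
    Assumption: None leaves that side of the range unbounded.
    """
    lower_ok = from_refs is None or num_refs >= from_refs
    upper_ok = to_refs is None or num_refs < to_refs
    return lower_ok and upper_ok


print(in_ref_range(10, 10, 20))  # lower bound is included
print(in_ref_range(20, 10, 20))  # upper bound is excluded
```

Because the upper bound is exclusive, a condition meant to include manifests with exactly 1,000 references would need `to_refs=1001`.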

or_conditions staticmethod #

or_conditions(conditions)

Create a preload condition that matches if any of the passed conditions matches

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def or_conditions(
    conditions: list[ManifestPreloadCondition],
) -> ManifestPreloadCondition:
    """Create a preload condition that matches if any of `conditions` matches"""
    ...

path_matches staticmethod #

path_matches(regex)

Create a preload condition that matches if the full path to the array matches the passed regex.

Array paths are absolute, as in /path/to/my/array

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def path_matches(regex: str) -> ManifestPreloadCondition:
    """Create a preload condition that matches if the full path to the array matches the passed regex.

    Array paths are absolute, as in `/path/to/my/array`
    """
    ...
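
The difference between matching on the array name and on the full path can be illustrated with Python's `re` module. This sketch does not call Icechunk, and it rests on two assumptions that the page does not confirm: that the array "name" is the last component of its path, and that the regex may match anywhere in the string (unanchored search). The helper names are hypothetical.

```python
import re


def array_name(path: str) -> str:
    """Last component of an absolute array path (assumed meaning of 'name')."""
    return path.rstrip("/").rsplit("/", 1)[-1]


def name_matches(regex: str, path: str) -> bool:
    # Test the regex against the array name only.
    return re.search(regex, array_name(path)) is not None


def path_matches(regex: str, path: str) -> bool:
    # Test the regex against the full absolute path.
    return re.search(regex, path) is not None


path = "/model/outputs/temperature"
print(name_matches(".*temp.*", path))  # the documented example
print(name_matches("outputs", path))   # a parent group is not part of the name
print(path_matches("outputs", path))   # but it is part of the path
```

If anchoring matters for a given pattern, writing it with explicit `^` and `$` avoids depending on whichever matching mode the library uses.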

true staticmethod #

true()

Create a preload condition that always matches any manifest

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def true() -> ManifestPreloadCondition:
    """Create a preload condition that always matches any manifest"""
    ...

ManifestPreloadConfig #

Configuration for how Icechunk preloads manifests on session creation

Methods:

Name Description
__new__

Create a new ManifestPreloadConfig object

Attributes:

Name Type Description
max_arrays_to_scan int | None

The maximum number of arrays to scan when looking for manifests to preload.

max_total_refs int | None

The maximum number of references to preload.

preload_if ManifestPreloadCondition | None

The condition under which manifests will be preloaded.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class ManifestPreloadConfig:
    """Configuration for how Icechunk preloads manifests on session creation"""

    def __new__(
        cls,
        max_total_refs: int | None = None,
        preload_if: ManifestPreloadCondition | None = None,
        max_arrays_to_scan: int | None = None,
    ) -> ManifestPreloadConfig:
        """
        Create a new `ManifestPreloadConfig` object

        Parameters
        ----------
        max_total_refs: int | None
            The maximum number of references to preload.
            Default: 10,000
        preload_if: ManifestPreloadCondition | None
            The condition under which manifests will be preloaded. When None,
            preloads arrays whose name matches CF-like coordinate names
            (e.g. ``time``, ``lat``, ``lon``, ``x``, ``y``, ``z``) and whose
            manifest has at most 1,000 chunk references. The name-matching
            regexes are lifted from cf-xarray's coordinate-axis heuristics;
            see https://cf-xarray.readthedocs.io/en/latest/generated/cf_xarray.accessor.CFAccessor.html#cf_xarray.accessor.CFAccessor.guess_coord_axis.
            Default: None
        max_arrays_to_scan: int | None
            The maximum number of arrays to scan when looking for manifests to preload.
            Increase for repositories with many nested groups.
            Default: 50
        """
        ...
    @property
    def max_total_refs(self) -> int | None:
        """
        The maximum number of references to preload.

        Default: 10,000

        Returns
        -------
        int | None
            The maximum number of references to preload.
        """
        ...
    @max_total_refs.setter
    def max_total_refs(self, value: int | None) -> None:
        """
        Set the maximum number of references to preload.

        Parameters
        ----------
        value: int | None
            The maximum number of references to preload.
        """
        ...
    @property
    def preload_if(self) -> ManifestPreloadCondition | None:
        """
        The condition under which manifests will be preloaded.

        Default: None (preload arrays with CF-like coordinate names and at most 1,000 chunk references)

        Returns
        -------
        ManifestPreloadCondition | None
            The condition under which manifests will be preloaded.
        """
        ...
    @preload_if.setter
    def preload_if(self, value: ManifestPreloadCondition | None) -> None:
        """
        Set the condition under which manifests will be preloaded.

        Parameters
        ----------
        value: ManifestPreloadCondition | None
            The condition under which manifests will be preloaded.
        """
        ...
    @property
    def max_arrays_to_scan(self) -> int | None:
        """
        The maximum number of arrays to scan when looking for manifests to preload.

        Default: 50

        Returns
        -------
        int | None
            The maximum number of arrays to scan.
        """
        ...
    @max_arrays_to_scan.setter
    def max_arrays_to_scan(self, value: int | None) -> None:
        """
        Set the maximum number of arrays to scan when looking for manifests to preload.

        Parameters
        ----------
        value: int | None
            The maximum number of arrays to scan.
        """
        ...

max_arrays_to_scan property writable #

max_arrays_to_scan

The maximum number of arrays to scan when looking for manifests to preload.

Default: 50

Returns:

Type Description
int | None

The maximum number of arrays to scan.

max_total_refs property writable #

max_total_refs

The maximum number of references to preload.

Default: 10,000

Returns:

Type Description
int | None

The maximum number of references to preload.

preload_if property writable #

preload_if

The condition under which manifests will be preloaded.

Default: None (preload arrays with CF-like coordinate names and at most 1,000 chunk references)

Returns:

Type Description
ManifestPreloadCondition | None

The condition under which manifests will be preloaded.

__new__ #

__new__(
    max_total_refs=None,
    preload_if=None,
    max_arrays_to_scan=None,
)

Create a new ManifestPreloadConfig object

Parameters:

Name Type Description Default
max_total_refs int | None

The maximum number of references to preload. Default: 10,000

None
preload_if ManifestPreloadCondition | None

The condition under which manifests will be preloaded. When None, preloads arrays whose name matches CF-like coordinate names (e.g. time, lat, lon, x, y, z) and whose manifest has at most 1,000 chunk references. The name-matching regexes are lifted from cf-xarray's coordinate-axis heuristics; see https://cf-xarray.readthedocs.io/en/latest/generated/cf_xarray.accessor.CFAccessor.html#cf_xarray.accessor.CFAccessor.guess_coord_axis. Default: None

None
max_arrays_to_scan int | None

The maximum number of arrays to scan when looking for manifests to preload. Increase for repositories with many nested groups. Default: 50

None
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __new__(
    cls,
    max_total_refs: int | None = None,
    preload_if: ManifestPreloadCondition | None = None,
    max_arrays_to_scan: int | None = None,
) -> ManifestPreloadConfig:
    """
    Create a new `ManifestPreloadConfig` object

    Parameters
    ----------
    max_total_refs: int | None
        The maximum number of references to preload.
        Default: 10,000
    preload_if: ManifestPreloadCondition | None
        The condition under which manifests will be preloaded. When None,
        preloads arrays whose name matches CF-like coordinate names
        (e.g. ``time``, ``lat``, ``lon``, ``x``, ``y``, ``z``) and whose
        manifest has at most 1,000 chunk references. The name-matching
        regexes are lifted from cf-xarray's coordinate-axis heuristics;
        see https://cf-xarray.readthedocs.io/en/latest/generated/cf_xarray.accessor.CFAccessor.html#cf_xarray.accessor.CFAccessor.guess_coord_axis.
        Default: None
    max_arrays_to_scan: int | None
        The maximum number of arrays to scan when looking for manifests to preload.
        Increase for repositories with many nested groups.
        Default: 50
    """
    ...
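
The defaults documented for `ManifestPreloadConfig` can be read as a small selection procedure. The sketch below is a pure-Python model of that reading and not Icechunk's implementation: `ArrayInfo`, `default_preload_if`, and `arrays_to_preload` are hypothetical names, the coordinate-name regex is a much-simplified stand-in for the cf-xarray patterns, and the way the reference budget is spent (skipping any array that would exceed it) is an assumption.

```python
import re
from dataclasses import dataclass

# Documented defaults that apply when an argument is left as None.
DEFAULT_MAX_TOTAL_REFS = 10_000
DEFAULT_MAX_ARRAYS_TO_SCAN = 50
DEFAULT_MAX_REFS_PER_ARRAY = 1_000

# Hypothetical, simplified stand-in for the cf-xarray coordinate-name regexes.
COORD_NAME = r"time|lat|lon|latitude|longitude|x|y|z"


@dataclass
class ArrayInfo:
    name: str      # array name (last path component)
    num_refs: int  # number of chunk references in the array's manifest


def default_preload_if(array: ArrayInfo) -> bool:
    """Model of the documented default: CF-like name and at most 1,000 refs."""
    name_ok = re.fullmatch(COORD_NAME, array.name) is not None
    return name_ok and array.num_refs <= DEFAULT_MAX_REFS_PER_ARRAY


def arrays_to_preload(arrays, max_total_refs=None, max_arrays_to_scan=None):
    """Pick array names to preload, resolving None to the documented defaults."""
    if max_total_refs is None:
        max_total_refs = DEFAULT_MAX_TOTAL_REFS
    if max_arrays_to_scan is None:
        max_arrays_to_scan = DEFAULT_MAX_ARRAYS_TO_SCAN
    chosen, total = [], 0
    # Only the first `max_arrays_to_scan` arrays are examined at all.
    for array in arrays[:max_arrays_to_scan]:
        # Assumption: an array that would exceed the budget is skipped.
        if default_preload_if(array) and total + array.num_refs <= max_total_refs:
            chosen.append(array.name)
            total += array.num_refs
    return chosen


arrays = [
    ArrayInfo("time", 365),
    ArrayInfo("lat", 1),
    ArrayInfo("temperature", 250_000),  # not a coordinate-like name
    ArrayInfo("lon", 5_000),            # too many references
]
print(arrays_to_preload(arrays))
```

Under this model, lowering `max_arrays_to_scan` can hide coordinate arrays that sit late in the scan order, which is consistent with the advice to increase it for repositories with many nested groups.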

ManifestSplitCondition #

Configuration for the conditions under which manifests will be split

Methods:

Name Description
AnyArray

Create a splitting condition that matches any array.

__and__

Create a splitting condition that matches if both this condition and other match

__or__

Create a splitting condition that matches if either this condition or other matches

and_conditions

Create a splitting condition that matches only if all passed conditions match

name_matches

Create a splitting condition that matches if the array's name matches the passed regex.

or_conditions

Create a splitting condition that matches if any of the passed conditions matches

path_matches

Create a splitting condition that matches if the full path to the array matches the passed regex.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class ManifestSplitCondition:
    """Configuration for the conditions under which manifests will be split"""

    @staticmethod
    def or_conditions(
        conditions: list[ManifestSplitCondition],
    ) -> ManifestSplitCondition:
        """Create a splitting condition that matches if any of `conditions` matches"""
        ...
    @staticmethod
    def and_conditions(
        conditions: list[ManifestSplitCondition],
    ) -> ManifestSplitCondition:
        """Create a splitting condition that matches only if all passed `conditions` match"""
        ...
    @staticmethod
    def path_matches(regex: str) -> ManifestSplitCondition:
        """Create a splitting condition that matches if the full path to the array matches the passed regex.

        Array paths are absolute, as in `/path/to/my/array`
        """
        ...
    @staticmethod
    def name_matches(regex: str) -> ManifestSplitCondition:
        """Create a splitting condition that matches if the array's name matches the passed regex.

        For example, for an array `/model/outputs/temperature`, the following will match:
        ```
        name_matches(".*temp.*")
        ```
        """
        ...

    @staticmethod
    def AnyArray() -> ManifestSplitCondition:
        """Create a splitting condition that matches any array."""
        ...

    def __or__(self, other: ManifestSplitCondition, /) -> ManifestSplitCondition:
        """Create a splitting condition that matches if either this condition or `other` matches"""
        ...

    def __and__(self, other: ManifestSplitCondition, /) -> ManifestSplitCondition:
        """Create a splitting condition that matches if both this condition and `other` match"""
        ...
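
The combinators listed above follow ordinary boolean logic: `&` and `and_conditions` require every condition to match, while `|` and `or_conditions` require at least one. The sketch below models that behaviour with a toy `Condition` class; it is a hypothetical stand-in written for illustration and does not use or reproduce Icechunk's own `ManifestSplitCondition`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Condition:
    """Toy stand-in for a split condition: a predicate over an array path."""

    test: Callable[[str], bool]

    def __and__(self, other: "Condition") -> "Condition":
        # Matches only if both conditions match.
        return Condition(lambda path: self.test(path) and other.test(path))

    def __or__(self, other: "Condition") -> "Condition":
        # Matches if either condition matches.
        return Condition(lambda path: self.test(path) or other.test(path))


def and_conditions(conditions: list) -> Condition:
    return Condition(lambda path: all(c.test(path) for c in conditions))


def or_conditions(conditions: list) -> Condition:
    return Condition(lambda path: any(c.test(path) for c in conditions))


# Three illustrative predicates (hypothetical paths and names).
any_array = Condition(lambda path: True)
in_model = Condition(lambda path: path.startswith("/model/"))
is_temp = Condition(lambda path: "temp" in path.rsplit("/", 1)[-1])

both = in_model & is_temp
either = or_conditions([in_model, is_temp])

print(both.test("/model/outputs/temperature"))
print(both.test("/obs/temperature"))
print(either.test("/obs/temperature"))
```

In this model `a & b` is equivalent to `and_conditions([a, b])`; the list forms are mainly convenient when the conditions are built programmatically.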

AnyArray staticmethod #

AnyArray()

Create a splitting condition that matches any array.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def AnyArray() -> ManifestSplitCondition:
    """Create a splitting condition that matches any array."""
    ...

__and__ #

__and__(other)

Create a splitting condition that matches if both this condition and other match

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __and__(self, other: ManifestSplitCondition, /) -> ManifestSplitCondition:
    """Create a splitting condition that matches if both this condition and `other` match"""
    ...

__or__ #

__or__(other)

Create a splitting condition that matches if either this condition or other matches

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __or__(self, other: ManifestSplitCondition, /) -> ManifestSplitCondition:
    """Create a splitting condition that matches if either this condition or `other` matches"""
    ...

and_conditions staticmethod #

and_conditions(conditions)

Create a splitting condition that matches only if all passed conditions match

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def and_conditions(
    conditions: list[ManifestSplitCondition],
) -> ManifestSplitCondition:
    """Create a splitting condition that matches only if all passed `conditions` match"""
    ...

name_matches staticmethod #

name_matches(regex)

Create a splitting condition that matches if the array's name matches the passed regex.

For example, for an array /model/outputs/temperature, the following will match:

name_matches(".*temp.*")

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def name_matches(regex: str) -> ManifestSplitCondition:
    """Create a splitting condition that matches if the array's name matches the passed regex.

    For example, for an array `/model/outputs/temperature`, the following will match:
    ```
    name_matches(".*temp.*")
    ```
    """
    ...

or_conditions staticmethod #

or_conditions(conditions)

Create a splitting condition that matches if any of the passed conditions matches

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def or_conditions(
    conditions: list[ManifestSplitCondition],
) -> ManifestSplitCondition:
    """Create a splitting condition that matches if any of `conditions` matches"""
    ...

path_matches staticmethod #

path_matches(regex)

Create a splitting condition that matches if the full path to the array matches the passed regex.

Array paths are absolute, as in /path/to/my/array

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def path_matches(regex: str) -> ManifestSplitCondition:
    """Create a splitting condition that matches if the full path to the array matches the passed regex.

    Array paths are absolute, as in `/path/to/my/array`
    """
    ...
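
The difference between `name_matches` and `path_matches` is only which string the regex is tested against. The sketch below illustrates this with Python's `re` module. It is a model, not the library's code: Icechunk evaluates these patterns itself, this reference does not state whether patterns are anchored, and treating the name as the last path component is an assumption. The example patterns are chosen so that the result is the same with or without anchoring.

```python
import re
from posixpath import basename


def name_matches(regex: str, array_path: str) -> bool:
    # Assumption: the array "name" is the last component of its absolute path.
    return re.fullmatch(regex, basename(array_path)) is not None


def path_matches(regex: str, array_path: str) -> bool:
    # The regex is tested against the full absolute path.
    return re.fullmatch(regex, array_path) is not None


path = "/model/outputs/temperature"

print(name_matches(".*temp.*", path))   # tested against "temperature"
print(path_matches("/model/.*", path))  # tested against the whole path
print(name_matches("/model/.*", path))  # the name alone has no "/model/" prefix
```

Use a name-based condition to target a variable wherever it lives in the hierarchy, and a path-based condition to target everything under a particular group.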

ManifestSplitDimCondition #

Conditions for specifying dimensions along which to shard manifests.

Classes:

Name Description
Any

Split along any other unspecified dimension.

Axis

Split along specified integer axis.

DimensionName

Split along specified named dimension.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class ManifestSplitDimCondition:
    """Conditions for specifying dimensions along which to shard manifests."""
    class Axis:
        """Split along specified integer axis."""
        def __new__(cls, axis: int) -> ManifestSplitDimCondition.Axis: ...

    class DimensionName:
        """Split along specified named dimension."""
        def __new__(cls, regex: str) -> ManifestSplitDimCondition.DimensionName: ...

    class Any:
        """Split along any other unspecified dimension."""
        def __new__(cls) -> ManifestSplitDimCondition.Any: ...

Any #

Split along any other unspecified dimension.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class Any:
    """Split along any other unspecified dimension."""
    def __new__(cls) -> ManifestSplitDimCondition.Any: ...

Axis #

Split along specified integer axis.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class Axis:
    """Split along specified integer axis."""
    def __new__(cls, axis: int) -> ManifestSplitDimCondition.Axis: ...

DimensionName #

Split along specified named dimension.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class DimensionName:
    """Split along specified named dimension."""
    def __new__(cls, regex: str) -> ManifestSplitDimCondition.DimensionName: ...
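
One way to picture how the three dimension conditions combine is as an ordered list of rules applied to each dimension of an array. The sketch below is a pure-Python model under two assumptions that this reference does not confirm: rules are tried in order with the first match winning, and a dimension with no matching rule is left unsplit. `Axis`, `DimensionName`, and `AnyDim` here are toy dataclasses, not the Icechunk classes.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Axis:
    axis: int  # integer position of the dimension


@dataclass(frozen=True)
class DimensionName:
    regex: str  # pattern tested against the dimension name


@dataclass(frozen=True)
class AnyDim:
    pass  # fallback for any dimension not matched earlier


def split_size_per_dim(dim_names, rules):
    """Return the split size for each dimension, or None if no rule applies."""
    sizes = []
    for index, name in enumerate(dim_names):
        size = None
        # Assumption: rules are tried in order and the first match wins.
        for condition, split in rules:
            if isinstance(condition, Axis):
                matched = condition.axis == index
            elif isinstance(condition, DimensionName):
                matched = re.fullmatch(condition.regex, name) is not None
            else:
                matched = True  # AnyDim matches every dimension
            if matched:
                size = split
                break
        sizes.append(size)
    return sizes


rules = [(DimensionName("longitude"), 3), (Axis(0), 12), (AnyDim(), 1)]
print(split_size_per_dim(["time", "latitude", "longitude"], rules))
```

Placing the fallback rule last keeps it from shadowing the more specific axis and name rules.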

ManifestSplittingConfig #

Configuration for manifest splitting.

Methods:

Name Description
__new__

Configuration for how Icechunk manifests will be split.

Attributes:

Name Type Description
split_sizes _SplitSizes

Configuration for how Icechunk manifests will be split.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class ManifestSplittingConfig:
    """Configuration for manifest splitting."""

    @staticmethod
    def from_dict(
        split_sizes: dict[
            ManifestSplitCondition,
            dict[
                ManifestSplitDimCondition.Axis
                | ManifestSplitDimCondition.DimensionName
                | ManifestSplitDimCondition.Any,
                int,
            ],
        ],
    ) -> ManifestSplittingConfig: ...
    def to_dict(
        self,
    ) -> dict[
        ManifestSplitCondition,
        dict[
            ManifestSplitDimCondition.Axis
            | ManifestSplitDimCondition.DimensionName
            | ManifestSplitDimCondition.Any,
            int,
        ],
    ]: ...
    def __new__(cls, split_sizes: _SplitSizes | None = None) -> ManifestSplittingConfig:
        """Configuration for how Icechunk manifests will be split.

        Parameters
        ----------
        split_sizes: tuple[tuple[ManifestSplitCondition, tuple[tuple[ManifestSplitDimCondition, int], ...]], ...]
            The configuration for how Icechunk manifests will be split.
            Default: None (a single rule matching every array with no splitting, i.e. one manifest per array)

        Examples
        --------

        Split manifests for the `temperature` array, with 3 chunks per shard along the `longitude` dimension.
        >>> ManifestSplittingConfig.from_dict(
        ...     {
        ...         ManifestSplitCondition.name_matches("temperature"): {
        ...             ManifestSplitDimCondition.DimensionName("longitude"): 3
        ...         }
        ...     }
        ... )
        """
        pass

    @property
    def split_sizes(self) -> _SplitSizes:
        """
        Configuration for how Icechunk manifests will be split.

        Default: None (a single rule matching every array with no splitting, i.e. one manifest per array)

        Returns
        -------
        tuple[tuple[ManifestSplitCondition, tuple[tuple[ManifestSplitDimCondition, int], ...]], ...]
            The configuration for how Icechunk manifests will be split.
        """
        ...

    @split_sizes.setter
    def split_sizes(self, value: _SplitSizes) -> None:
        """
        Set the configuration for how Icechunk manifests will be split.

        Parameters
        ----------
        value: tuple[tuple[ManifestSplitCondition, tuple[tuple[ManifestSplitDimCondition, int], ...]], ...]
            The configuration for how Icechunk manifests will be split.
        """
        ...

split_sizes property writable #

split_sizes

Configuration for how Icechunk manifests will be split.

Default: None (a single rule matching every array with no splitting, i.e. one manifest per array)

Returns:

Type Description
tuple[tuple[ManifestSplitCondition, tuple[tuple[ManifestSplitDimCondition, int], ...]], ...]

The configuration for how Icechunk manifests will be split.

__new__ #

__new__(split_sizes=None)

Configuration for how Icechunk manifests will be split.

Parameters:

Name Type Description Default
split_sizes _SplitSizes | None

The configuration for how Icechunk manifests will be split. Default: None (a single rule matching every array with no splitting, i.e. one manifest per array)

None

Examples:

Split manifests for the temperature array, with 3 chunks per shard along the longitude dimension.

>>> ManifestSplittingConfig.from_dict(
...     {
...         ManifestSplitCondition.name_matches("temperature"): {
...             ManifestSplitDimCondition.DimensionName("longitude"): 3
...         }
...     }
... )
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __new__(cls, split_sizes: _SplitSizes | None = None) -> ManifestSplittingConfig:
    """Configuration for how Icechunk manifests will be split.

    Parameters
    ----------
    split_sizes: tuple[tuple[ManifestSplitCondition, tuple[tuple[ManifestSplitDimCondition, int], ...]], ...]
        The configuration for how Icechunk manifests will be split.
        Default: None (a single rule matching every array with no splitting, i.e. one manifest per array)

    Examples
    --------

    Split manifests for the `temperature` array, with 3 chunks per shard along the `longitude` dimension.
    >>> ManifestSplittingConfig.from_dict(
    ...     {
    ...         ManifestSplitCondition.name_matches("temperature"): {
    ...             ManifestSplitDimCondition.DimensionName("longitude"): 3
    ...         }
    ...     }
    ... )
    """
    pass
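
To get a feel for what a split size means in practice, it helps to count the manifests it produces. The sketch below assumes that a split size of `n` along a dimension cuts the chunk grid every `n` chunks along that dimension, so the number of manifests is the product of the per-dimension ceilings. `num_manifests` and the chunk grid shape are hypothetical, and this is an arithmetic illustration, not a call into Icechunk.

```python
import math


def num_manifests(chunk_grid_shape, split_sizes):
    """Count manifests when each split dimension is cut every `split` chunks.

    `split_sizes[i]` is None for a dimension that is not split.
    """
    count = 1
    for n_chunks, split in zip(chunk_grid_shape, split_sizes):
        if split is not None:
            count *= math.ceil(n_chunks / split)
    return count


# Hypothetical `temperature` array whose chunk grid is 10 x 4 x 7 along
# (time, latitude, longitude), split every 3 chunks along longitude.
print(num_manifests((10, 4, 7), (None, None, 3)))

# Also splitting every 5 chunks along time multiplies the count.
print(num_manifests((10, 4, 7), (5, None, 3)))
```

Smaller split sizes give more, smaller manifests; with no splitting at all the count stays at one manifest per array, which matches the documented default.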

ManifestVirtualChunkLocationCompressionConfig #

Configuration for zstd dictionary-based compression of virtual chunk location URLs in manifests.

Methods:

Name Description
__new__

Create a new ManifestVirtualChunkLocationCompressionConfig object

Attributes:

Name Type Description
compression_level int | None

Zstd compression level applied to virtual chunk location URLs.

dictionary_max_size_bytes int | None

Maximum size of the trained compression dictionary in bytes.

dictionary_max_training_samples int | None

Maximum number of URL samples used to train the compression dictionary.

min_num_chunks int | None

Minimum number of virtual chunks required to enable compression.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class ManifestVirtualChunkLocationCompressionConfig:
    """Configuration for zstd dictionary-based compression of virtual chunk location URLs in manifests."""

    def __new__(
        cls,
        min_num_chunks: int | None = None,
        *,
        dictionary_max_training_samples: int | None = None,
        dictionary_max_size_bytes: int | None = None,
        compression_level: int | None = None,
    ) -> ManifestVirtualChunkLocationCompressionConfig:
        """
        Create a new `ManifestVirtualChunkLocationCompressionConfig` object

        Parameters
        ----------
        min_num_chunks: int | None
            Minimum number of virtual chunks required to enable compression.
            Default: 1,000
        dictionary_max_training_samples: int | None
            Maximum number of URL samples used to train the compression dictionary.
            Default: 100
        dictionary_max_size_bytes: int | None
            Maximum size of the trained compression dictionary in bytes.
            Default: 2,048
        compression_level: int | None
            Zstd compression level.
            Default: 3
        """
        ...

    @property
    def min_num_chunks(self) -> int | None:
        """
        Minimum number of virtual chunks required to enable compression.

        Default: 1,000

        Returns
        -------
        int | None
            The threshold below which virtual chunk locations are not compressed.
        """
        ...
    @min_num_chunks.setter
    def min_num_chunks(self, value: int | None) -> None: ...
    @property
    def dictionary_max_training_samples(self) -> int | None:
        """
        Maximum number of URL samples used to train the compression dictionary.

        Default: 100

        Returns
        -------
        int | None
            The maximum number of URL samples used during dictionary training.
        """
        ...
    @dictionary_max_training_samples.setter
    def dictionary_max_training_samples(self, value: int | None) -> None: ...
    @property
    def dictionary_max_size_bytes(self) -> int | None:
        """
        Maximum size of the trained compression dictionary in bytes.

        Default: 2,048

        Returns
        -------
        int | None
            The maximum dictionary size in bytes.
        """
        ...
    @dictionary_max_size_bytes.setter
    def dictionary_max_size_bytes(self, value: int | None) -> None: ...
    @property
    def compression_level(self) -> int | None:
        """
        Zstd compression level applied to virtual chunk location URLs.

        Default: 3

        Returns
        -------
        int | None
            The zstd compression level.
        """
        ...
    @compression_level.setter
    def compression_level(self, value: int | None) -> None: ...

compression_level property writable #

compression_level

Zstd compression level applied to virtual chunk location URLs.

Default: 3

Returns:

Type Description
int | None

The zstd compression level.

dictionary_max_size_bytes property writable #

dictionary_max_size_bytes

Maximum size of the trained compression dictionary in bytes.

Default: 2,048

Returns:

Type Description
int | None

The maximum dictionary size in bytes.

dictionary_max_training_samples property writable #

dictionary_max_training_samples

Maximum number of URL samples used to train the compression dictionary.

Default: 100

Returns:

Type Description
int | None

The maximum number of URL samples used during dictionary training.

min_num_chunks property writable #

min_num_chunks

Minimum number of virtual chunks required to enable compression.

Default: 1,000

Returns:

Type Description
int | None

The threshold below which virtual chunk locations are not compressed.

__new__ #

__new__(
    min_num_chunks=None,
    *,
    dictionary_max_training_samples=None,
    dictionary_max_size_bytes=None,
    compression_level=None,
)

Create a new ManifestVirtualChunkLocationCompressionConfig object

Parameters:

Name Type Description Default
min_num_chunks int | None

Minimum number of virtual chunks required to enable compression. Default: 1,000

None
dictionary_max_training_samples int | None

Maximum number of URL samples used to train the compression dictionary. Default: 100

None
dictionary_max_size_bytes int | None

Maximum size of the trained compression dictionary in bytes. Default: 2,048

None
compression_level int | None

Zstd compression level. Default: 3

None
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __new__(
    cls,
    min_num_chunks: int | None = None,
    *,
    dictionary_max_training_samples: int | None = None,
    dictionary_max_size_bytes: int | None = None,
    compression_level: int | None = None,
) -> ManifestVirtualChunkLocationCompressionConfig:
    """
    Create a new `ManifestVirtualChunkLocationCompressionConfig` object

    Parameters
    ----------
    min_num_chunks: int | None
        Minimum number of virtual chunks required to enable compression.
        Default: 1,000
    dictionary_max_training_samples: int | None
        Maximum number of URL samples used to train the compression dictionary.
        Default: 100
    dictionary_max_size_bytes: int | None
        Maximum size of the trained compression dictionary in bytes.
        Default: 2,048
    compression_level: int | None
        Zstd compression level.
        Default: 3
    """
    ...
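
Dictionary-based compression pays off for virtual chunk locations because the URLs usually share long prefixes. The sketch below shows the idea using only the standard library: it substitutes zlib preset dictionaries for zstd, and replaces zstd's dictionary trainer with a crude concatenation of sample URLs. The four constants mirror the documented defaults; the function names and the example bucket are hypothetical, and none of this is Icechunk's actual code path.

```python
import zlib

MIN_NUM_CHUNKS = 1_000        # documented default for min_num_chunks
MAX_TRAINING_SAMPLES = 100    # documented default for dictionary_max_training_samples
MAX_DICT_SIZE_BYTES = 2_048   # documented default for dictionary_max_size_bytes
COMPRESSION_LEVEL = 3         # documented default for compression_level


def build_dictionary(urls):
    """Crude stand-in for dictionary training: join a few samples, then cap the size."""
    samples = urls[:MAX_TRAINING_SAMPLES]
    blob = "\n".join(samples).encode()
    return blob[-MAX_DICT_SIZE_BYTES:]


def compress_locations(urls):
    """Compress each URL against a shared dictionary if there are enough URLs."""
    if len(urls) < MIN_NUM_CHUNKS:
        # Below the threshold: locations are kept uncompressed.
        return None, [u.encode() for u in urls]
    dictionary = build_dictionary(urls)
    compressed = []
    for url in urls:
        comp = zlib.compressobj(level=COMPRESSION_LEVEL, zdict=dictionary)
        compressed.append(comp.compress(url.encode()) + comp.flush())
    return dictionary, compressed


def decompress_location(dictionary, payload):
    decomp = zlib.decompressobj(zdict=dictionary)
    return (decomp.decompress(payload) + decomp.flush()).decode()


urls = [f"s3://example-bucket/era5/2020/temperature_{i:05d}.nc" for i in range(2_000)]
dictionary, compressed = compress_locations(urls)

# Compare against compressing the same URL with no dictionary at all.
plain = zlib.compress(urls[1500].encode(), COMPRESSION_LEVEL)
print(len(urls[1500]), len(plain), len(compressed[1500]))
```

A single short URL barely compresses on its own, whereas the shared dictionary lets most of it be encoded as a reference to the common prefix. The `min_num_chunks` threshold exists in this model because a dictionary is not worth its own storage cost for a handful of locations.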

RepositoryConfig #

Configuration for an Icechunk repository

Methods:

Name Description
__new__

Create a new RepositoryConfig object

clear_virtual_chunk_containers

Clear all virtual chunk containers from the repository.

default

Create a default repository config instance

get_virtual_chunk_container

Get the virtual chunk container for the repository associated with the given name.

merge

Merge another RepositoryConfig with this one.

set_virtual_chunk_container

Add or update a virtual chunk container in the repository configuration.

Attributes:

Name Type Description
caching CachingConfig | None

The caching configuration for the repository.

compression CompressionConfig | None

The compression configuration for the repository.

get_partial_values_concurrency int | None

The number of concurrent requests to make when getting partial values from storage.

inline_chunk_threshold_bytes int | None

The maximum size of a chunk that will be stored inline in the repository. Chunks larger than this size will be written to storage.

manifest ManifestConfig | None

The manifest configuration for the repository.

max_concurrent_requests int | None

The maximum number of concurrent HTTP requests Icechunk will do for this repo.

num_updates_per_repo_info_file int | None

Maximum number of updates stored in a single repo info file. When this limit is reached, a new repo info file is created.

repo_update_retries RepoUpdateRetryConfig | None

Retry configuration for repo info update operations.

storage StorageSettings | None

The storage configuration for the repository.

virtual_chunk_containers dict[str, VirtualChunkContainer] | None

The virtual chunk containers for the repository.
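
Two of the integer settings above describe simple thresholds. The sketch below models them in plain Python using the defaults documented for `RepositoryConfig` (512 bytes and 1,000 updates). `chunk_placement` and `repo_info_files_needed` are hypothetical helpers, and the second one assumes that reconstructing the ops log touches roughly one repo info file per `num_updates_per_repo_info_file` updates.

```python
import math

DEFAULT_INLINE_CHUNK_THRESHOLD_BYTES = 512
DEFAULT_NUM_UPDATES_PER_REPO_INFO_FILE = 1_000


def chunk_placement(chunk_size_bytes, threshold=None):
    """Chunks up to the threshold are stored inline; larger ones go to storage."""
    if threshold is None:
        threshold = DEFAULT_INLINE_CHUNK_THRESHOLD_BYTES
    return "inline" if chunk_size_bytes <= threshold else "storage"


def repo_info_files_needed(num_updates, per_file=None):
    """Rough number of repo info files holding `num_updates` updates (assumed model)."""
    if per_file is None:
        per_file = DEFAULT_NUM_UPDATES_PER_REPO_INFO_FILE
    return math.ceil(num_updates / per_file)


print(chunk_placement(512), chunk_placement(513))
print(repo_info_files_needed(2_500), repo_info_files_needed(2_500, per_file=100))
```

The second line of output illustrates the documented trade-off: a lower per-file limit keeps each file smaller but means more objects to fetch.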

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
class RepositoryConfig:
    """Configuration for an Icechunk repository"""

    def __new__(
        cls,
        inline_chunk_threshold_bytes: int | None = None,
        get_partial_values_concurrency: int | None = None,
        compression: CompressionConfig | None = None,
        max_concurrent_requests: int | None = None,
        caching: CachingConfig | None = None,
        storage: StorageSettings | None = None,
        virtual_chunk_containers: dict[str, VirtualChunkContainer] | None = None,
        manifest: ManifestConfig | None = None,
        repo_update_retries: RepoUpdateRetryConfig | None = None,
        num_updates_per_repo_info_file: int | None = None,
    ) -> RepositoryConfig:
        """
        Create a new `RepositoryConfig` object

        Parameters
        ----------
        inline_chunk_threshold_bytes: int | None
            The maximum size of a chunk that will be stored inline in the repository.
            Default: 512
        get_partial_values_concurrency: int | None
            The number of concurrent requests to make when getting partial values from storage.
            Default: 10
        compression: CompressionConfig | None
            The compression configuration for the repository. When None, the
            default `CompressionConfig` is used.
            Default: None
        max_concurrent_requests: int | None
            The maximum number of concurrent HTTP requests Icechunk will do for this repo.
            Default: 256
        caching: CachingConfig | None
            The caching configuration for the repository. When None, the default
            `CachingConfig` is used.
            Default: None
        storage: StorageSettings | None
            The storage configuration for the repository. When None, the
            storage backend's own default settings apply.
            Default: None
        virtual_chunk_containers: dict[str, VirtualChunkContainer] | None
            The virtual chunk containers for the repository.
            Default: None
        manifest: ManifestConfig | None
            The manifest configuration for the repository. When None, the
            default `ManifestConfig` is used.
            Default: None
        repo_update_retries: RepoUpdateRetryConfig | None
            Retry configuration for repo info update operations. When None,
            the default `RepoUpdateRetryConfig` is used.
            Default: None
        num_updates_per_repo_info_file: int | None
            Maximum number of updates stored in a single repo info file. When this
            limit is reached, a new repo info file is created. Lower values produce
            slightly smaller repo info files but require more object fetches to
            reconstruct the ops log.
            Default: 1,000
        """
        ...
    def __repr__(self, /) -> str: ...
    def __str__(self, /) -> str: ...
    def _repr_html_(self, /) -> str: ...
    @staticmethod
    def default() -> RepositoryConfig:
        """Create a default repository config instance"""
        ...
    @property
    def inline_chunk_threshold_bytes(self) -> int | None:
        """
        The maximum size of a chunk that will be stored inline in the repository. Chunks larger than this size will be written to storage.

        Default: 512
        """
        ...
    @inline_chunk_threshold_bytes.setter
    def inline_chunk_threshold_bytes(self, value: int | None) -> None:
        """
        Set the maximum size of a chunk that will be stored inline in the repository. Chunks larger than this size will be written to storage.
        """
        ...
    @property
    def get_partial_values_concurrency(self) -> int | None:
        """
        The number of concurrent requests to make when getting partial values from storage.

        Default: 10

        Returns
        -------
        int | None
            The number of concurrent requests to make when getting partial values from storage.
        """
        ...
    @get_partial_values_concurrency.setter
    def get_partial_values_concurrency(self, value: int | None) -> None:
        """
        Set the number of concurrent requests to make when getting partial values from storage.

        Parameters
        ----------
        value: int | None
            The number of concurrent requests to make when getting partial values from storage.
        """
        ...
    @property
    def compression(self) -> CompressionConfig | None:
        """
        The compression configuration for the repository.

        Default: None (uses the default `CompressionConfig`)

        Returns
        -------
        CompressionConfig | None
            The compression configuration for the repository.
        """
        ...
    @compression.setter
    def compression(self, value: CompressionConfig | None) -> None:
        """
        Set the compression configuration for the repository.

        Parameters
        ----------
        value: CompressionConfig | None
            The compression configuration for the repository.
        """
        ...
    @property
    def max_concurrent_requests(self) -> int | None:
        """
        The maximum number of concurrent HTTP requests Icechunk will do for this repo.

        Default: 256

        Returns
        -------
        int | None
            The maximum number of concurrent HTTP requests Icechunk will do for this repo.
        """
        ...
    @max_concurrent_requests.setter
    def max_concurrent_requests(self, value: int | None) -> None:
        """
        Set the maximum number of concurrent HTTP requests Icechunk should do for this repo.

        Parameters
        ----------
        value: int | None
            The maximum number of concurrent HTTP requests allowed.
        """
        ...
    @property
    def caching(self) -> CachingConfig | None:
        """
        The caching configuration for the repository.

        Default: None (uses the default `CachingConfig`)

        Returns
        -------
        CachingConfig | None
            The caching configuration for the repository.
        """
        ...
    @caching.setter
    def caching(self, value: CachingConfig | None) -> None:
        """
        Set the caching configuration for the repository.

        Parameters
        ----------
        value: CachingConfig | None
            The caching configuration for the repository.
        """
        ...
    @property
    def storage(self) -> StorageSettings | None:
        """
        The storage configuration for the repository.

        Default: None (the storage backend's own default settings apply)

        Returns
        -------
        StorageSettings | None
            The storage configuration for the repository.
        """
        ...
    @storage.setter
    def storage(self, value: StorageSettings | None) -> None:
        """
        Set the storage configuration for the repository.

        Parameters
        ----------
        value: StorageSettings | None
            The storage configuration for the repository.
        """
        ...
    @property
    def manifest(self) -> ManifestConfig | None:
        """
        The manifest configuration for the repository.

        Default: None (uses the default `ManifestConfig`)

        Returns
        -------
        ManifestConfig | None
            The manifest configuration for the repository.
        """
        ...
    @manifest.setter
    def manifest(self, value: ManifestConfig | None) -> None:
        """
        Set the manifest configuration for the repository.

        Parameters
        ----------
        value: ManifestConfig | None
            The manifest configuration for the repository.
        """
        ...
    @property
    def virtual_chunk_containers(self) -> dict[str, VirtualChunkContainer] | None:
        """
        The virtual chunk containers for the repository.

        Default: None

        Returns
        -------
        dict[str, VirtualChunkContainer] | None
            The virtual chunk containers for the repository.
        """
        ...
    def get_virtual_chunk_container(self, name: str) -> VirtualChunkContainer | None:
        """
        Get the virtual chunk container for the repository associated with the given name.

        Parameters
        ----------
        name: str
            The name of the virtual chunk container to get.

        Returns
        -------
        VirtualChunkContainer | None
            The virtual chunk container for the repository associated with the given name.
        """
        ...
    def set_virtual_chunk_container(self, cont: VirtualChunkContainer) -> None:
        """
        Add or update a virtual chunk container in the repository configuration.

        For named containers, the name is the identity: if a container with the
        same name already exists (even with a different url_prefix), it will be
        replaced. For unnamed containers, the url_prefix is the key.

        Parameters
        ----------
        cont: VirtualChunkContainer
            The virtual chunk container to set.
        """
        ...
    def clear_virtual_chunk_containers(self) -> None:
        """
        Clear all virtual chunk containers from the repository.
        """
        ...
    @property
    def repo_update_retries(self) -> RepoUpdateRetryConfig | None:
        """Retry configuration for repo info update operations.

        Default: None (uses the default `RepoUpdateRetryConfig`)
        """
        ...
    @repo_update_retries.setter
    def repo_update_retries(self, value: RepoUpdateRetryConfig | None) -> None: ...
    @property
    def num_updates_per_repo_info_file(self) -> int | None:
        """Maximum number of updates stored in a single repo info file. When this
        limit is reached, a new repo info file is created. Lower values produce
        slightly smaller repo info files but require more object fetches to
        reconstruct the ops log.

        Default: 1,000
        """
        ...
    @num_updates_per_repo_info_file.setter
    def num_updates_per_repo_info_file(self, value: int | None) -> None: ...
    def merge(self, other: RepositoryConfig) -> RepositoryConfig:
        """
        Merge another RepositoryConfig with this one.

        When merging, values from the other config take precedence. For nested configs
        (compression, caching, manifest, storage), the merge is applied recursively.
        For virtual_chunk_containers, entries from the other config extend this one.

        Parameters
        ----------
        other: RepositoryConfig
            The configuration to merge with this one.

        Returns
        -------
        RepositoryConfig
            A new merged configuration.
        """
        ...

caching property writable #

caching

The caching configuration for the repository.

Default: None (uses the default CachingConfig)

Returns:

Type Description
CachingConfig | None

The caching configuration for the repository.

compression property writable #

compression

The compression configuration for the repository.

Default: None (uses the default CompressionConfig)

Returns:

Type Description
CompressionConfig | None

The compression configuration for the repository.

get_partial_values_concurrency property writable #

get_partial_values_concurrency

The number of concurrent requests to make when getting partial values from storage.

Default: 10

Returns:

Type Description
int | None

The number of concurrent requests to make when getting partial values from storage.

inline_chunk_threshold_bytes property writable #

inline_chunk_threshold_bytes

The maximum size, in bytes, of a chunk that will be stored inline in the repository. Chunks larger than this size will be written to storage.

Default: 512
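
The rule described above can be illustrated with a small, self-contained sketch. This is a toy model rather than Icechunk's implementation: the helper `chunk_placement` is hypothetical, and the treatment of a chunk whose size equals the threshold (kept inline) is read from the wording "maximum size", not confirmed against the library.

```python
from __future__ import annotations

# Documented default for inline_chunk_threshold_bytes.
DEFAULT_INLINE_CHUNK_THRESHOLD_BYTES = 512


def chunk_placement(chunk: bytes, threshold: int | None = None) -> str:
    """Return "inline" or "storage" for a chunk.

    Hypothetical helper for illustration only; it is not part of Icechunk.
    """
    # None means "use the default", mirroring the int | None property type.
    limit = DEFAULT_INLINE_CHUNK_THRESHOLD_BYTES if threshold is None else threshold
    # Chunks up to the threshold stay inline; larger chunks go to storage.
    return "inline" if len(chunk) <= limit else "storage"
```

Under this model, raising the threshold keeps more small chunks inside the repository metadata, while lowering it pushes them out to separate objects in storage.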

manifest property writable #

manifest

The manifest configuration for the repository.

Default: None (uses the default ManifestConfig)

Returns:

Type Description
ManifestConfig | None

The manifest configuration for the repository.

max_concurrent_requests property writable #

max_concurrent_requests

The maximum number of concurrent HTTP requests Icechunk will make for this repo.

Default: 256

Returns:

Type Description
int | None

The maximum number of concurrent HTTP requests Icechunk will make for this repo.
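
To make the idea of a concurrency cap concrete, here is a minimal sketch of one common way to bound in-flight requests, using an `asyncio.Semaphore`. It is an assumption-laden illustration, not a description of how Icechunk enforces the limit internally: `peak_concurrency` and `fake_request` are hypothetical names, and the "request" is a stand-in that performs no network I/O.

```python
import asyncio


async def _run_requests(n_requests: int, limit: int) -> int:
    """Run n_requests fake requests, never more than `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)
    active = 0  # requests currently in flight
    peak = 0    # highest value `active` ever reached

    async def fake_request() -> None:
        nonlocal active, peak
        async with semaphore:  # blocks once `limit` requests are in flight
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0)  # stand-in for network I/O
            active -= 1

    await asyncio.gather(*(fake_request() for _ in range(n_requests)))
    return peak


def peak_concurrency(n_requests: int, limit: int = 256) -> int:
    """Return the peak number of simultaneous fake requests (hypothetical helper)."""
    return asyncio.run(_run_requests(n_requests, limit))
```

The default of 256 in the sketch mirrors the documented default; calling `peak_concurrency(20, limit=4)` never reports more than 4 requests in flight.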

num_updates_per_repo_info_file property writable #

num_updates_per_repo_info_file

Maximum number of updates stored in a single repo info file. When this limit is reached, a new repo info file is created. Lower values produce slightly smaller repo info files but require more object fetches to reconstruct the ops log.

Default: 1,000
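
The trade-off described above can be estimated with simple arithmetic. The sketch below assumes that every repo info file is filled to the limit before a new one is created, and that reconstructing the full ops log reads each file once; both are simplifying assumptions, and `repo_info_files_needed` is a hypothetical helper, not an Icechunk function.

```python
import math


def repo_info_files_needed(n_updates: int, updates_per_file: int = 1000) -> int:
    """Estimate how many repo info files hold n_updates updates.

    Back-of-envelope model only: assumes each file is filled to the limit.
    """
    if updates_per_file <= 0:
        raise ValueError("updates_per_file must be positive")
    # Ceiling division: a partially filled file still counts as one object.
    return math.ceil(n_updates / updates_per_file)
```

For example, under these assumptions 2,500 updates fit in 3 files at the default limit, but need 25 files if the limit is lowered to 100.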

repo_update_retries property writable #

repo_update_retries

Retry configuration for repo info update operations.

Default: None (uses the default RepoUpdateRetryConfig)

storage property writable #

storage

The storage configuration for the repository.

Default: None (the storage backend's own default settings apply)

Returns:

Type Description
StorageSettings | None

The storage configuration for the repository.

virtual_chunk_containers property #

virtual_chunk_containers

The virtual chunk containers for the repository.

Default: None

Returns:

Type Description
dict[str, VirtualChunkContainer] | None

The virtual chunk containers for the repository.

__new__ #

__new__(
    inline_chunk_threshold_bytes=None,
    get_partial_values_concurrency=None,
    compression=None,
    max_concurrent_requests=None,
    caching=None,
    storage=None,
    virtual_chunk_containers=None,
    manifest=None,
    repo_update_retries=None,
    num_updates_per_repo_info_file=None,
)

Create a new RepositoryConfig object

Parameters:

Name Type Description Default
inline_chunk_threshold_bytes int | None

The maximum size of a chunk that will be stored inline in the repository. Default: 512

None
get_partial_values_concurrency int | None

The number of concurrent requests to make when getting partial values from storage. Default: 10

None
compression CompressionConfig | None

The compression configuration for the repository. When None, the default CompressionConfig is used. Default: None

None
max_concurrent_requests int | None

The maximum number of concurrent HTTP requests Icechunk will make for this repo. Default: 256

None
caching CachingConfig | None

The caching configuration for the repository. When None, the default CachingConfig is used. Default: None

None
storage StorageSettings | None

The storage configuration for the repository. When None, the storage backend's own default settings apply. Default: None

None
virtual_chunk_containers dict[str, VirtualChunkContainer] | None

The virtual chunk containers for the repository. Default: None

None
manifest ManifestConfig | None

The manifest configuration for the repository. When None, the default ManifestConfig is used. Default: None

None
repo_update_retries RepoUpdateRetryConfig | None

Retry configuration for repo info update operations. When None, the default RepoUpdateRetryConfig is used. Default: None

None
num_updates_per_repo_info_file int | None

Maximum number of updates stored in a single repo info file. When this limit is reached, a new repo info file is created. Lower values produce slightly smaller repo info files but require more object fetches to reconstruct the ops log. Default: 1,000

None
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def __new__(
    cls,
    inline_chunk_threshold_bytes: int | None = None,
    get_partial_values_concurrency: int | None = None,
    compression: CompressionConfig | None = None,
    max_concurrent_requests: int | None = None,
    caching: CachingConfig | None = None,
    storage: StorageSettings | None = None,
    virtual_chunk_containers: dict[str, VirtualChunkContainer] | None = None,
    manifest: ManifestConfig | None = None,
    repo_update_retries: RepoUpdateRetryConfig | None = None,
    num_updates_per_repo_info_file: int | None = None,
) -> RepositoryConfig:
    """
    Create a new `RepositoryConfig` object

    Parameters
    ----------
    inline_chunk_threshold_bytes: int | None
        The maximum size of a chunk that will be stored inline in the repository.
        Default: 512
    get_partial_values_concurrency: int | None
        The number of concurrent requests to make when getting partial values from storage.
        Default: 10
    compression: CompressionConfig | None
        The compression configuration for the repository. When None, the
        default `CompressionConfig` is used.
        Default: None
    max_concurrent_requests: int | None
        The maximum number of concurrent HTTP requests Icechunk will make for this repo.
        Default: 256
    caching: CachingConfig | None
        The caching configuration for the repository. When None, the default
        `CachingConfig` is used.
        Default: None
    storage: StorageSettings | None
        The storage configuration for the repository. When None, the
        storage backend's own default settings apply.
        Default: None
    virtual_chunk_containers: dict[str, VirtualChunkContainer] | None
        The virtual chunk containers for the repository.
        Default: None
    manifest: ManifestConfig | None
        The manifest configuration for the repository. When None, the
        default `ManifestConfig` is used.
        Default: None
    repo_update_retries: RepoUpdateRetryConfig | None
        Retry configuration for repo info update operations. When None,
        the default `RepoUpdateRetryConfig` is used.
        Default: None
    num_updates_per_repo_info_file: int | None
        Maximum number of updates stored in a single repo info file. When this
        limit is reached, a new repo info file is created. Lower values produce
        slightly smaller repo info files but require more object fetches to
        reconstruct the ops log.
        Default: 1,000
    """
    ...

clear_virtual_chunk_containers #

clear_virtual_chunk_containers()

Clear all virtual chunk containers from the repository.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def clear_virtual_chunk_containers(self) -> None:
    """
    Clear all virtual chunk containers from the repository.
    """
    ...

default staticmethod #

default()

Create a default repository config instance

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
@staticmethod
def default() -> RepositoryConfig:
    """Create a default repository config instance"""
    ...

get_virtual_chunk_container #

get_virtual_chunk_container(name)

Get the virtual chunk container for the repository associated with the given name.

Parameters:

Name Type Description Default
name str

The name of the virtual chunk container to get.

required

Returns:

Type Description
VirtualChunkContainer | None

The virtual chunk container for the repository associated with the given name.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def get_virtual_chunk_container(self, name: str) -> VirtualChunkContainer | None:
    """
    Get the virtual chunk container for the repository associated with the given name.

    Parameters
    ----------
    name: str
        The name of the virtual chunk container to get.

    Returns
    -------
    VirtualChunkContainer | None
        The virtual chunk container for the repository associated with the given name.
    """
    ...

merge #

merge(other)

Merge another RepositoryConfig with this one.

When merging, values from the other config take precedence. For nested configs (compression, caching, manifest, storage), the merge is applied recursively. For virtual_chunk_containers, entries from the other config extend this one.

Parameters:

Name Type Description Default
other RepositoryConfig

The configuration to merge with this one.

required

Returns:

Type Description
RepositoryConfig

A new merged configuration.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def merge(self, other: RepositoryConfig) -> RepositoryConfig:
    """
    Merge another RepositoryConfig with this one.

    When merging, values from the other config take precedence. For nested configs
    (compression, caching, manifest, storage), the merge is applied recursively.
    For virtual_chunk_containers, entries from the other config extend this one.

    Parameters
    ----------
    other: RepositoryConfig
        The configuration to merge with this one.

    Returns
    -------
    RepositoryConfig
        A new merged configuration.
    """
    ...
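
The merge rules above can be modelled with plain dictionaries. This is a sketch of the documented semantics, not the library's code: `merge_config` is a hypothetical function, configs are represented as dicts instead of `RepositoryConfig` objects, and treating `None` in the other config as "unset" (so it does not overwrite an existing value) is an assumption.

```python
from __future__ import annotations

from typing import Any

# Nested configs that the documentation says are merged recursively.
NESTED_KEYS = {"compression", "caching", "manifest", "storage"}


def merge_config(base: dict[str, Any], other: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `other` take precedence."""
    merged = dict(base)  # shallow copy, so `base` itself is never mutated
    for key, value in other.items():
        if value is None:
            continue  # assumption: None means "unset" and keeps the base value
        current = merged.get(key)
        if key in NESTED_KEYS and isinstance(current, dict) and isinstance(value, dict):
            # Nested configs: merge field by field instead of replacing wholesale.
            merged[key] = merge_config(current, value)
        elif key == "virtual_chunk_containers" and isinstance(current, dict):
            # Containers from `other` extend (and may override) those in `base`.
            merged[key] = {**current, **value}
        else:
            merged[key] = value
    return merged
```

In this model, merging `{"caching": {"num_chunk_refs": 500}}` into a config that also sets `num_snapshot_nodes` changes only `num_chunk_refs` and leaves the other caching field intact.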

set_virtual_chunk_container #

set_virtual_chunk_container(cont)

Add or update a virtual chunk container in the repository configuration.

For named containers, the name is the identity: if a container with the same name already exists (even with a different url_prefix), it will be replaced. For unnamed containers, the url_prefix is the key.

Parameters:

Name Type Description Default
cont VirtualChunkContainer

The virtual chunk container to set.

required
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def set_virtual_chunk_container(self, cont: VirtualChunkContainer) -> None:
    """
    Add or update a virtual chunk container in the repository configuration.

    For named containers, the name is the identity: if a container with the
    same name already exists (even with a different url_prefix), it will be
    replaced. For unnamed containers, the url_prefix is the key.

    Parameters
    ----------
    cont: VirtualChunkContainer
        The virtual chunk container to set.
    """
    ...
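
The identity rule above (name for named containers, url_prefix for unnamed ones) can be sketched with a dictionary keyed accordingly. The `Container` dataclass and `set_container` function are hypothetical stand-ins for illustration; they are not Icechunk's `VirtualChunkContainer` or its storage layout.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """Toy stand-in for a virtual chunk container (not the Icechunk class)."""

    url_prefix: str
    name: str | None = None


def set_container(registry: dict[str, Container], cont: Container) -> None:
    """Add or replace a container, keyed by name or, if unnamed, by url_prefix."""
    key = cont.name if cont.name is not None else cont.url_prefix
    # An existing entry under the same key is replaced, even if its
    # url_prefix differs from the new container's.
    registry[key] = cont


registry: dict[str, Container] = {}
set_container(registry, Container("s3://bucket-a/", name="era5"))
set_container(registry, Container("s3://bucket-b/", name="era5"))  # replaces the first
set_container(registry, Container("gs://other/"))                  # keyed by url_prefix
```

After these calls the toy registry holds two entries: the named container now points at `s3://bucket-b/`, and the unnamed one is stored under its prefix.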

initialize_logs #

initialize_logs()

Initialize the logging system for the library.

Reads the value of the environment variable ICECHUNK_LOG to obtain the filters. This is automatically called on import icechunk.

Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def initialize_logs() -> None:
    """
    Initialize the logging system for the library.

    Reads the value of the environment variable ICECHUNK_LOG to obtain the filters.
    This is automatically called on `import icechunk`.
    """
    ...

set_logs_filter #

set_logs_filter(log_filter_directive)

Set filters and log levels for the different modules.

Examples:

- set_logs_filter("trace") # trace level for all modules
- set_logs_filter("error") # error level for all modules
- set_logs_filter("icechunk=debug,info") # debug level for icechunk, info for everything else

Full spec for the log_filter_directive syntax is documented in https://docs.rs/tracing-subscriber/latest/tracing_subscriber/filter/struct.EnvFilter.html#directives

Parameters:

Name Type Description Default
log_filter_directive str | None

The comma-separated list of directives for modules and log levels. If None, the directive will be read from the environment variable ICECHUNK_LOG.

required
Source code in icechunk-python/python/icechunk/_icechunk_python.pyi
def set_logs_filter(log_filter_directive: str | None) -> None:
    """
    Set filters and log levels for the different modules.

    Examples:
      - set_logs_filter("trace")  # trace level for all modules
      - set_logs_filter("error")  # error level for all modules
      - set_logs_filter("icechunk=debug,info")  # debug level for icechunk, info for everything else

    Full spec for the log_filter_directive syntax is documented in
    https://docs.rs/tracing-subscriber/latest/tracing_subscriber/filter/struct.EnvFilter.html#directives

    Parameters
    ----------
    log_filter_directive: str | None
        The comma-separated list of directives for modules and log levels.
        If None, the directive will be read from the environment variable
        ICECHUNK_LOG.
    """
    ...
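
To show how a directive string such as "icechunk=debug,info" breaks down, here is a deliberately simplified parser. It is an illustration only: `split_directives` is a hypothetical helper, it handles just the `target=level` and bare `level` forms used in the examples above, and it treats every bare token as a level. The full EnvFilter grammar linked above also supports span and field directives, which this sketch ignores.

```python
from __future__ import annotations


def split_directives(log_filter_directive: str) -> tuple[str | None, dict[str, str]]:
    """Split a directive string into (default_level, {target: level}).

    Simplified model: only `target=level` and bare `level` entries are handled.
    """
    default_level: str | None = None
    per_target: dict[str, str] = {}
    for part in log_filter_directive.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "=" in part:
            target, level = part.split("=", 1)
            per_target[target] = level
        else:
            default_level = part  # a bare entry applies to everything else
    return default_level, per_target
```

With this model, "icechunk=debug,info" yields a default level of "info" and a per-target override of "debug" for `icechunk`, which matches the comment in the third example above.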