Quickstart

Icechunk is designed to stay mostly in the background. As a Python user, you'll interact primarily with Zarr. If you're not familiar with Zarr, you may want to start with the Zarr Tutorial.

Installation

Install Icechunk with pip:

pip install icechunk

Note

Icechunk is currently designed to support the Zarr V3 Specification. Using it today requires installing the latest pre-release of Zarr Python 3.
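
Since the note above depends on which Zarr release is installed, a quick environment check can save some confusion. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `major_version` are illustrative and not part of Icechunk or Zarr.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Extract the leading major number, e.g. '3.0.0b1' -> 3."""
    return int(version_string.split(".")[0])


# Report what is installed in the current environment.
for name in ("icechunk", "zarr"):
    found = installed_version(name)
    print(name, found if found is not None else "not installed")

# Warn if an older Zarr major version is present.
zarr_version = installed_version("zarr")
if zarr_version is not None and major_version(zarr_version) < 3:
    print("Zarr Python 3 is required; found", zarr_version)
```

If the check reports a Zarr version below 3, upgrade Zarr before continuing with the rest of this page.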

Create a new store

To get started, let's create a new Icechunk store. We recommend creating your store on S3 to get the most out of Icechunk's cloud-native design. However, you can also create a store on your local filesystem.

import icechunk

# Option 1: S3 storage
storage_config = icechunk.StorageConfig.s3_from_env(
    bucket="icechunk-test",
    prefix="quickstart-demo-1"
)
store = icechunk.IcechunkStore.create(storage_config)

# Option 2: local filesystem storage (use this instead of Option 1)
storage_config = icechunk.StorageConfig.filesystem("./icechunk-local")
store = icechunk.IcechunkStore.create(storage_config)

Write some data and commit

We can now use our Icechunk store with Zarr. Let's first create a group and an array within it.

import zarr

group = zarr.group(store)
array = group.create("my_array", shape=10, dtype=int)

Now let's write some data:

array[:] = 1

Now let's commit our update:

store.commit("first commit")

🎉 Congratulations! You just made your first Icechunk snapshot.

Make a second commit

Let's now put some new data into our array, overwriting the first five elements.

array[:5] = 2

...and commit the changes:

store.commit("overwrite some values")

Explore version history

We can see the full version history of our repo:

hist = store.ancestry()
for anc in hist:
    print(anc.id, anc.message, anc.written_at)

# Output:
# AHC3TSP5ERXKTM4FCB5G overwrite some values 2024-10-14 14:07:27.328429+00:00
# Q492CAPV7SF3T1BC0AA0 first commit 2024-10-14 14:07:26.152193+00:00
# T7SMDT9C5DZ8MP83DNM0 Repository initialized 2024-10-14 14:07:22.338529+00:00

...and we can go back in time to the earlier version.

# latest version
assert array[0] == 2
# check out earlier snapshot
store.checkout(snapshot_id=hist[1].id)
# verify data matches first version
assert array[0] == 1
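
To build intuition for what commit, ancestry, and checkout are doing, here is a toy in-memory model of the same workflow. It is only a sketch of the general idea (immutable snapshots linked to their parents) and is not how Icechunk is implemented. `ToyStore` and `Snapshot` are hypothetical names invented for this example, and a plain Python list stands in for the Zarr array.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    id: str
    message: str
    data: tuple                      # immutable copy of the array contents
    parent: Optional["Snapshot"]     # previous snapshot, or None for the first


class ToyStore:
    """Hypothetical stand-in for a versioned store; NOT Icechunk's API."""

    def __init__(self, size):
        self._ids = itertools.count()
        self.working = [0] * size    # mutable working copy of the data
        self.head = None
        self.commit("Repository initialized")

    def commit(self, message):
        # Freeze the working copy into a new snapshot that points at its parent.
        snap = Snapshot(f"snap-{next(self._ids)}", message,
                        tuple(self.working), self.head)
        self.head = snap
        return snap.id

    def ancestry(self):
        # Walk parent links from the newest snapshot back to the first.
        history, snap = [], self.head
        while snap is not None:
            history.append(snap)
            snap = snap.parent
        return history

    def checkout(self, snapshot_id):
        # Toy simplification: restore the working copy and leave head alone.
        for snap in self.ancestry():
            if snap.id == snapshot_id:
                self.working = list(snap.data)
                return
        raise KeyError(snapshot_id)


store = ToyStore(10)
store.working[:] = [1] * 10
store.commit("first commit")
store.working[:5] = [2] * 5
store.commit("overwrite some values")

hist = store.ancestry()
print([snap.message for snap in hist])
# ['overwrite some values', 'first commit', 'Repository initialized']

store.checkout(hist[1].id)
print(store.working[0])  # 1
```

The point of the model is that every commit leaves earlier snapshots untouched, so checking out an older snapshot ID is just a lookup and never a destructive rewrite.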

That's it! You now know how to use Icechunk! For your next step, dig deeper into configuration, explore the version control system, or learn how to use Icechunk with Xarray.