Migrate to 2.0
Icechunk 2.0 uses a new storage format (spec version 2). Existing repositories created with Icechunk 1.x can be upgraded automatically to spec version 2 with the `upgrade_icechunk_repository()` function. The migration touches only repository metadata; it does not read or write any chunks.
Warning
This is an administrative operation. It must be executed in isolation: no other readers or writers should access the repository during the migration. Any writes made during the migration may be lost.
import icechunk as ic
# Open the v1 repository
storage = ic.s3_storage(
    bucket="my-bucket",
    prefix="my-repo",
    region="us-east-1",
)
repo = ic.Repository.open(storage)
# Pass `dry_run=True` to test the migration process without writing the
# icechunk version 2 specific files. With `dry_run=False` the migration runs
# for real and cannot be reversed once it completes.
migrated_repo = ic.upgrade_icechunk_repository(repo, dry_run=False)
assert migrated_repo.spec_version == 2
# Use migrated_repo from here on — the original `repo` object is invalidated
session = migrated_repo.writable_session("main")
Parameters
| Parameter | Default | Description |
|---|---|---|
| `repo` | (required) | The v1 repository to migrate. |
| `dry_run` | (required) | If `True`, attempts the migration without writing version 2 specific files, leaving the v1 repo in place. |
| `delete_unused_v1_files` | `True` | Remove legacy v1 metadata files after migration. |
| `prefetch_concurrency` | `64` | Number of snapshots to fetch concurrently while migrating. Lower this for environments that cannot fit many snapshots in memory. |
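The parameters above can be combined into a rehearse-then-migrate workflow. The following is a minimal sketch, not part of Icechunk: `migrate_in_two_phases` and its `open_repo` argument are hypothetical names, and `upgrade` is assumed to behave like `upgrade_icechunk_repository()` as described in the table. Because the text does not say whether a repository object stays usable after a dry run, the sketch opens a fresh one for each phase.

```python
from typing import Any, Callable


def migrate_in_two_phases(
    open_repo: Callable[[], Any],
    upgrade: Callable[..., Any],
    prefetch_concurrency: int = 64,
) -> Any:
    """Rehearse the migration with dry_run=True, then run it for real.

    Hypothetical helper: `open_repo` returns a freshly opened v1 repository,
    and `upgrade` is assumed to accept the parameters listed in the table.
    """
    # Phase 1: rehearsal. An exception here stops us before the real run.
    upgrade(open_repo(), dry_run=True, prefetch_concurrency=prefetch_concurrency)

    # Phase 2: the irreversible migration, on a freshly opened repository so
    # we make no assumption about the state of the object used in phase 1.
    return upgrade(
        open_repo(),
        dry_run=False,
        delete_unused_v1_files=True,
        prefetch_concurrency=prefetch_concurrency,
    )
```

With a real repository this would presumably be called as `migrate_in_two_phases(lambda: ic.Repository.open(storage), ic.upgrade_icechunk_repository, prefetch_concurrency=16)`, where the lower concurrency suits a memory-constrained machine.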
How it works
The migration reads all snapshots pointed to by branches and tags, and collects them into the new version 2 schema with a reconstructed ops log. If the migration succeeds and `delete_unused_v1_files=True`, the legacy v1 metadata files are then removed.
If something goes wrong at any step, the repository is left in a working state.
Note
The migration is usually fast, but can take several minutes for repositories with thousands of snapshots.
After migration
The original `repo` object is invalidated after migration. Any attempt to use it will raise a `RuntimeError`. Always use the new `Repository` object returned by `upgrade_icechunk_repository()`.
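One way to avoid reaching the invalidated object by accident is to rebind the same variable name to the returned repository. The sketch below runs without Icechunk: `FakeRepo` and `fake_upgrade` are hypothetical stand-ins that only mimic the behaviour described above (a stale object raising `RuntimeError`), not the library's implementation.

```python
class FakeRepo:
    """Hypothetical stand-in for a repository; not part of icechunk."""

    def __init__(self, spec_version: int) -> None:
        self.spec_version = spec_version
        self.invalidated = False

    def writable_session(self, branch: str) -> str:
        # Mimics the documented behaviour: a stale object raises RuntimeError.
        if self.invalidated:
            raise RuntimeError("repository object was invalidated by migration")
        return f"session on {branch} (spec v{self.spec_version})"


def fake_upgrade(repo: FakeRepo) -> FakeRepo:
    """Hypothetical stand-in for a real (non-dry-run) migration."""
    repo.invalidated = True
    return FakeRepo(spec_version=2)


repo = FakeRepo(spec_version=1)
stale = repo  # a second reference that is about to go stale

# Rebind the name so later code cannot reach the invalidated object through it.
repo = fake_upgrade(repo)

session = repo.writable_session("main")  # works: this is the migrated repository

try:
    stale.writable_session("main")
    stale_error = None
except RuntimeError as exc:
    stale_error = str(exc)  # any other reference to the old object fails
```

Rebinding only protects the one variable; references stored elsewhere (caches, closures, other modules) still point at the invalidated object and need to be refreshed separately.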