icechunk.session#
Sessions for reading and writing data. Includes ForkSession for distributed writes and SessionMode.
Classes:
| Name | Description |
|---|---|
| Session | A session object that allows for reading and writing data from an Icechunk repository. |
| SessionMode | Enum for session access modes. |
Session #
A session object that allows for reading and writing data from an Icechunk repository.
Methods:
| Name | Description |
|---|---|
| all_virtual_chunk_locations | Return the location URLs of all virtual chunks. |
| all_virtual_chunk_locations_async | Return the location URLs of all virtual chunks (async version). |
| amend | Commit the changes in the session to the repository, by amending/overwriting the previous commit. |
| amend_async | Commit the changes in the session to the repository, by amending/overwriting the previous commit (async version). |
| chunk_coordinates | Return an async iterator over all initialized chunks for the array at array_path. |
| chunk_type | Return the chunk type for the specified coordinates. |
| chunk_type_async | Return the chunk type for the specified coordinates (async version). |
| commit | Commit the changes in the session to the repository. |
| commit_async | Commit the changes in the session to the repository (async version). |
| discard_changes | When the session is writable, discard any uncommitted changes. |
| flush | Save the changes in the session to a new snapshot without modifying the current branch. |
| flush_async | Save the changes in the session to a new snapshot without modifying the current branch (async version). |
| fork | Create a child session that can be pickled to a worker job and later merged. |
| get_node_id | Return the node ID for the array or group at the given path. |
| get_node_id_async | Return the node ID for the array or group at the given path (async version). |
| merge | Merge the changes for this session with the changes from another session. |
| merge_async | Merge the changes for this session with the changes from another session (async version). |
| move | Move or rename a node (array or group) in the hierarchy. |
| move_async | Async version of move. |
| rebase | Rebase the session to the latest ancestry of the branch. |
| rebase_async | Rebase the session to the latest ancestry of the branch (async version). |
| reindex_array | Reindex chunks in an array by applying a transformation function. |
| shift_array | Shift all chunks in an array by the given chunk offset. |
| status | Compute an overview of the current session changes. |
Attributes:
| Name | Type | Description |
|---|---|---|
| branch | str \| None | The branch that the session is based on. This is only set if the session is writable. |
| config | RepositoryConfig | Get the repository configuration. |
| has_uncommitted_changes | bool | Whether the session has uncommitted changes. This can only be true if the session is writable. |
| mode | SessionMode | The mode of this session. |
| read_only | bool | Whether the session is read-only. |
| snapshot_id | str | The base snapshot ID of the session. |
| store | IcechunkStore | Get a zarr Store object for reading and writing data from the repository using zarr python. |
Source code in icechunk-python/python/icechunk/session.py
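As a rough illustration of the attributes listed above, the helper below gathers a session's state into a dictionary. The name `summarize_session` is hypothetical and not part of icechunk; it only reads the documented properties and accepts any object that exposes them.

```python
from typing import Any, Dict


def summarize_session(session: Any) -> Dict[str, Any]:
    """Collect the documented read-only properties of a session.

    `session` is assumed to expose the attributes listed for Session.
    This helper is illustrative only and is not part of icechunk.
    """
    return {
        "branch": session.branch,  # None unless the session is writable
        "snapshot_id": session.snapshot_id,
        "read_only": session.read_only,
        "has_uncommitted_changes": session.has_uncommitted_changes,
    }
```

A summary like this can be handy in logs before deciding whether to commit or discard changes.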
branch property #
The branch that the session is based on. This is only set if the session is writable.
Returns:
| Type | Description |
|---|---|
| str or None | The branch that the session is based on if the session is writable, None otherwise. |
config property #
Get the repository configuration.
Note that changes to the returned object have no effect on the repository. To change configuration values, use Repository.reopen.
Returns:
| Type | Description |
|---|---|
| RepositoryConfig | The config for the repository that owns this session. |
has_uncommitted_changes property #
Whether the session has uncommitted changes. This can only be true if the session is writable.
Returns:
| Type | Description |
|---|---|
| bool | True if the session has uncommitted changes, False otherwise. |
mode property #
The mode of this session.
Returns:
| Type | Description |
|---|---|
| SessionMode | The session mode: one of READONLY, WRITABLE, or REARRANGE. |
read_only property #
Whether the session is read-only.
Returns:
| Type | Description |
|---|---|
| bool | True if the session is read-only, False otherwise. |
snapshot_id property #
The base snapshot ID of the session.
Returns:
| Type | Description |
|---|---|
| str | The base snapshot ID of the session. |
store property #
Get a zarr Store object for reading and writing data from the repository using zarr python.
Returns:
| Type | Description |
|---|---|
| IcechunkStore | A zarr Store object for reading and writing data from the repository. |
all_virtual_chunk_locations #
Return the location URLs of all virtual chunks.
Returns:
| Type | Description |
|---|---|
| list of str | The location URLs of all virtual chunks. |
Source code in icechunk-python/python/icechunk/session.py
all_virtual_chunk_locations_async async #
Return the location URLs of all virtual chunks (async version).
Returns:
| Type | Description |
|---|---|
| list of str | The location URLs of all virtual chunks. |
Source code in icechunk-python/python/icechunk/session.py
amend #
Commit the changes in the session to the repository, by amending/overwriting the previous commit.
When successful, the writable session is completed and the session is now read-only and based on the new commit. The snapshot ID of the new commit is returned.
If the session is out of date, this raises a ConflictError describing the conflict that occurred. The session must be rebased before committing.
This operation doesn't create a new commit in the repo ancestry. It replaces the previous commit.
The first commit to the repo cannot be amended.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | The message to write with the commit. | required |
| metadata | dict[str, Any] \| None | Additional metadata to store with the commit snapshot. | None |
| allow_empty | bool | If True, allow amending even if no data changes have been made to the session. This is useful when you only want to update the commit message. | False |
Returns:
| Type | Description |
|---|---|
| str | The snapshot ID of the new commit. |
Raises:
| Type | Description |
|---|---|
| ConflictError | If the session is out of date and a conflict occurs. |
Source code in icechunk-python/python/icechunk/session.py
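The `allow_empty` flag described above makes it possible to change only the message of the previous commit. The sketch below wraps that call; `reword_last_commit` is a hypothetical helper name, and `session` is assumed to be a writable Session with the `amend` signature documented in this section.

```python
from typing import Any, Dict, Optional


def reword_last_commit(
    session: Any,
    new_message: str,
    metadata: Optional[Dict[str, Any]] = None,
) -> str:
    """Replace the previous commit's message without writing new data.

    Illustrative helper, not part of icechunk. It relies only on the
    documented `amend(message, metadata, allow_empty)` parameters.
    """
    # allow_empty=True lets the amend go through with no data changes.
    return session.amend(new_message, metadata=metadata, allow_empty=True)
```

Because the first commit of a repository cannot be amended, a helper like this only applies once at least one later commit exists.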
amend_async async #
Commit the changes in the session to the repository, by amending/overwriting the previous commit (async version).
When successful, the writable session is completed and the session is now read-only and based on the new commit. The snapshot ID of the new commit is returned.
If the session is out of date, this raises a ConflictError describing the conflict that occurred. The session must be rebased before committing.
This operation doesn't create a new commit in the repo ancestry. It replaces the previous commit.
The first commit to the repo cannot be amended.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | The message to write with the commit. | required |
| metadata | dict[str, Any] \| None | Additional metadata to store with the commit snapshot. | None |
| allow_empty | bool | If True, allow amending even if no data changes have been made to the session. This is useful when you only want to update the commit message. | False |
Returns:
| Type | Description |
|---|---|
| str | The snapshot ID of the new commit. |
Raises:
| Type | Description |
|---|---|
| ConflictError | If the session is out of date and a conflict occurs. |
Source code in icechunk-python/python/icechunk/session.py
chunk_coordinates async #
Return an async iterator over all initialized chunks for the array at array_path.
Returns:
| Type | Description |
|---|---|
| async iterator | The coordinates of each initialized chunk, as tuples. |
Source code in icechunk-python/python/icechunk/session.py
chunk_type #
Return the chunk type for the specified coordinates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| array_path | str | The path to the array inside the Zarr store. Example: "/groupA/groupB/outputs/my-array". | required |
| chunk_coordinates | Sequence[int] | A sequence of integers (list or tuple) used to locate the chunk. Example: [0, 1, 5]. | required |
Returns:
| Type | Description |
|---|---|
| ChunkType | One of the supported chunk types. |
Source code in icechunk-python/python/icechunk/session.py
chunk_type_async async #
Return the chunk type for the specified coordinates (async version).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| array_path | str | The path to the array inside the Zarr store. Example: "/groupA/groupB/outputs/my-array". | required |
| chunk_coordinates | Sequence[int] | A sequence of integers (list or tuple) used to locate the chunk. Example: [0, 1, 5]. | required |
Returns:
| Type | Description |
|---|---|
| ChunkType | One of the supported chunk types. |
Source code in icechunk-python/python/icechunk/session.py
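The two async methods above can be combined to tally how an array's initialized chunks are stored. This is a sketch under assumptions: `count_chunk_types` is a hypothetical helper, `session.chunk_coordinates(array_path)` is assumed to be usable directly in `async for`, and the chunk type is keyed by its string form because hashability of ChunkType is not stated here.

```python
import asyncio
from collections import Counter
from typing import Any


async def count_chunk_types(session: Any, array_path: str) -> Counter:
    """Count initialized chunks of one array, grouped by chunk type.

    Illustrative helper, not part of icechunk. Uses only the documented
    `chunk_coordinates` and `chunk_type_async` methods.
    """
    counts: Counter = Counter()
    # Walk every initialized chunk of the array.
    async for coords in session.chunk_coordinates(array_path):
        chunk_type = await session.chunk_type_async(array_path, coords)
        # Key by str() so the tally does not depend on ChunkType hashing.
        counts[str(chunk_type)] += 1
    return counts
```

From synchronous code, the coroutine can be driven with `asyncio.run(count_chunk_types(session, "/groupA/groupB/outputs/my-array"))`.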
commit #
Commit the changes in the session to the repository.
When successful, the writable session is completed and the session is now read-only and based on the new commit. The snapshot ID of the new commit is returned.
If the session is out of date, this raises a ConflictError describing the conflict that occurred. The session must be rebased before committing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | The message to write with the commit. | required |
| metadata | dict[str, Any] \| None | Additional metadata to store with the commit snapshot. | None |
| rebase_with | ConflictSolver \| None | If another session committed while the current session was writing, use Session.rebase with this solver. | None |
| rebase_tries | int | If another session committed while the current session was writing, use Session.rebase up to this many times in a loop. | 1000 |
| allow_empty | bool | If True, allow creating a commit even if there are no changes. | False |
Returns:
| Type | Description |
|---|---|
| str | The snapshot ID of the new commit. |
Raises:
| Type | Description |
|---|---|
| ConflictError | If the session is out of date and a conflict occurs. |
| NoChangesToCommitError | If there are no changes to commit and allow_empty is False. |
Source code in icechunk-python/python/icechunk/session.py
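The commit, conflict, rebase cycle described above can be written out as an explicit loop. This is a minimal sketch, not icechunk's implementation: `commit_with_rebase` is a hypothetical helper, and the exception type is passed in (for example `icechunk.ConflictError`) so the block runs without importing the library. The `rebase_with` and `rebase_tries` parameters of `commit` already cover this case; the loop only makes the flow visible.

```python
from typing import Any, Type


def commit_with_rebase(
    session: Any,
    message: str,
    solver: Any,
    conflict_error: Type[BaseException],
    max_tries: int = 3,
) -> str:
    """Commit, rebasing with `solver` whenever the session is out of date.

    Illustrative helper, not part of icechunk. `conflict_error` is the
    exception class that `session.commit` raises on conflict.
    """
    for attempt in range(max_tries):
        try:
            return session.commit(message)
        except conflict_error:
            if attempt == max_tries - 1:
                raise  # out of attempts: surface the conflict
            # Bring the session up to date with the branch, then retry.
            session.rebase(solver)
    raise ValueError("max_tries must be at least 1")
```

If the solver cannot resolve a conflict, the rebase call itself raises, and that error propagates to the caller unchanged.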
commit_async async #
Commit the changes in the session to the repository (async version).
When successful, the writable session is completed and the session is now read-only and based on the new commit. The snapshot ID of the new commit is returned.
If the session is out of date, this raises a ConflictError describing the conflict that occurred. The session must be rebased before committing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | The message to write with the commit. | required |
| metadata | dict[str, Any] \| None | Additional metadata to store with the commit snapshot. | None |
| rebase_with | ConflictSolver \| None | If another session committed while the current session was writing, use Session.rebase with this solver. | None |
| rebase_tries | int | If another session committed while the current session was writing, use Session.rebase up to this many times in a loop. | 1000 |
| allow_empty | bool | If True, allow creating a commit even if there are no changes. | False |
Returns:
| Type | Description |
|---|---|
| str | The snapshot ID of the new commit. |
Raises:
| Type | Description |
|---|---|
| ConflictError | If the session is out of date and a conflict occurs. |
| NoChangesToCommitError | If there are no changes to commit and allow_empty is False. |
Source code in icechunk-python/python/icechunk/session.py
discard_changes #
When the session is writable, discard any uncommitted changes.
flush #
Save the changes in the session to a new snapshot without modifying the current branch.
When successful, the writable session is completed and the session is now read-only and based on the new snapshot. The ID of the new snapshot is returned.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | The message to write with the commit. | required |
| metadata | dict[str, Any] \| None | Additional metadata to store with the commit snapshot. | None |
Returns:
| Type | Description |
|---|---|
| str | The ID of the new snapshot. |
Source code in icechunk-python/python/icechunk/session.py
flush_async async #
Save the changes in the session to a new snapshot without modifying the current branch (async version).
When successful, the writable session is completed and the session is now read-only and based on the new snapshot. The ID of the new snapshot is returned.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | The message to write with the commit. | required |
| metadata | dict[str, Any] \| None | Additional metadata to store with the commit snapshot. | None |
Returns:
| Type | Description |
|---|---|
| str | The ID of the new snapshot. |
Source code in icechunk-python/python/icechunk/session.py
fork #
Create a child session that can be pickled to a worker job and later merged.
This method supports Icechunk's distributed, collaborative jobs. A coordinator task creates a new session using Repository.writable_session. Then Session.fork is called repeatedly to create as many serializable sessions as worker jobs. Each new ForkSession is pickled to the worker that uses it to do all its writes. Finally, the ForkSessions are pickled back to the coordinator that uses ForkSession.merge to merge them back into the original session and commit.
Learn more about collaborative writes at https://icechunk.io/en/latest/parallel/
Raises:
| Type | Description |
|---|---|
ValueError | When |
ValueError | When |
Source code in icechunk-python/python/icechunk/session.py
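The coordinator and worker flow described above can be sketched in a single process, with `pickle` standing in for the transport a real job scheduler would provide. The names `run_worker`, `coordinate`, and `write` are hypothetical; `session` is assumed to offer the documented `fork`, `merge`, and `commit` methods, and `write(fork, task)` is whatever the worker does with its forked session.

```python
import pickle
from typing import Any, Callable, Sequence


def run_worker(payload: bytes, task: Any, write: Callable[[Any, Any], None]) -> bytes:
    """Worker side: unpickle the fork, perform the writes, pickle it back."""
    fork = pickle.loads(payload)
    write(fork, task)
    return pickle.dumps(fork)


def coordinate(
    session: Any,
    tasks: Sequence[Any],
    write: Callable[[Any, Any], None],
    message: str,
) -> str:
    """Coordinator side: fork once per task, merge the results, commit.

    Illustrative only. Workers run sequentially here; a real job would
    ship each payload to a separate process or machine.
    """
    # One serializable fork per worker job.
    payloads = [pickle.dumps(session.fork()) for _ in tasks]
    results = [run_worker(p, t, write) for p, t in zip(payloads, tasks)]
    # Bring the forks back and fold their changes into the original session.
    forks = [pickle.loads(r) for r in results]
    session.merge(*forks)
    return session.commit(message)
```

Only the coordinator commits in this sketch: each worker writes through its own fork and returns it for merging.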
get_node_id #
Return the node ID for the array or group at the given path.
Each node is assigned an opaque ID when it is created. This ID is stable across moves and renames: a node keeps the same ID for its entire lifetime. See the icechunk spec (https://icechunk.io/en/stable/spec/) for details on node identity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Absolute path to the node (e.g., "/data/temperature"). | required |
Returns:
| Type | Description |
|---|---|
| str | The node ID as an opaque string. |
Raises:
| Type | Description |
|---|---|
| IcechunkError | If no node exists at the given path. |
Source code in icechunk-python/python/icechunk/session.py
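The stability guarantee above can be checked around a move. The sketch below is illustrative: `move_preserving_id` is a hypothetical helper, and `session` is assumed to be a rearrange session exposing the documented `get_node_id` and `move` methods.

```python
from typing import Any


def move_preserving_id(session: Any, from_path: str, to_path: str) -> str:
    """Move a node and confirm its ID is unchanged at the new path.

    Illustrative helper, not part of icechunk.
    """
    node_id = session.get_node_id(from_path)
    session.move(from_path, to_path)
    # The ID is documented as stable across moves and renames.
    if session.get_node_id(to_path) != node_id:
        raise RuntimeError(f"node ID changed while moving {from_path} to {to_path}")
    return node_id
```

Storing the returned ID lets later code recognize the same node even after further renames.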
get_node_id_async async #
Return the node ID for the array or group at the given path (async version).
Each node is assigned an opaque ID when it is created. This ID is stable across moves and renames: a node keeps the same ID for its entire lifetime. See the icechunk spec (https://icechunk.io/en/stable/spec/) for details on node identity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Absolute path to the node (e.g., "/data/temperature"). | required |
Returns:
| Type | Description |
|---|---|
| str | The node ID as an opaque string. |
Raises:
| Type | Description |
|---|---|
| IcechunkError | If no node exists at the given path. |
Source code in icechunk-python/python/icechunk/session.py
merge #
Merge the changes for this session with the changes from another session.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| others | ForkSession | The forked sessions to merge changes from. | () |
Source code in icechunk-python/python/icechunk/session.py
merge_async async #
Merge the changes for this session with the changes from another session (async version).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| others | ForkSession | The forked sessions to merge changes from. | () |
Source code in icechunk-python/python/icechunk/session.py
move #
Move or rename a node (array or group) in the hierarchy.
This is a metadata-only operation—no data is copied. Requires a rearrange session:
session = repo.rearrange_session("main")
session.move("/data/raw", "/data/v1")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| from_path | str | The current path of the node (e.g., "/data/raw"). | required |
| to_path | str | The new path for the node (e.g., "/data/v1"). | required |
Source code in icechunk-python/python/icechunk/session.py
move_async async #
Async version of move.
rebase #
Rebase the session to the latest ancestry of the branch.
This method will iteratively crawl the ancestry of the branch and apply the changes from the branch to the session. If a conflict is detected, the conflict solver will be used to optionally resolve the conflict. When complete, the session will be based on the latest commit of the branch and the session will be ready to attempt another commit.
When a conflict is detected and a resolution is not possible with the provided solver, a RebaseFailedError exception will be raised. This exception will contain the snapshot ID that the rebase failed on and a list of conflicts that occurred.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| solver | ConflictSolver | The conflict solver to use when a conflict is detected. | required |
Raises:
| Type | Description |
|---|---|
| RebaseFailedError | When a conflict is detected and the solver fails to resolve it. |
Source code in icechunk-python/python/icechunk/session.py
rebase_async async #
Rebase the session to the latest ancestry of the branch (async version).
This method will iteratively crawl the ancestry of the branch and apply the changes from the branch to the session. If a conflict is detected, the conflict solver will be used to optionally resolve the conflict. When complete, the session will be based on the latest commit of the branch and the session will be ready to attempt another commit.
When a conflict is detected and a resolution is not possible with the provided solver, a RebaseFailedError exception will be raised. This exception will contain the snapshot ID that the rebase failed on and a list of conflicts that occurred.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| solver | ConflictSolver | The conflict solver to use when a conflict is detected. | required |
Raises:
| Type | Description |
|---|---|
| RebaseFailedError | When a conflict is detected and the solver fails to resolve it. |
Source code in icechunk-python/python/icechunk/session.py
reindex_array #
Reindex chunks in an array by applying a transformation function.
Only existing (non-empty) chunks are visited — empty positions are skipped. This means that if an empty chunk would have shifted into an occupied position, that position retains stale data unless a backward function is also provided.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| array_path | str | Path to the array. | required |
| forward | Callable[[Iterable[int]], Iterable[int] \| None] | Function that maps old chunk coordinates to new coordinates. Receives a list of non-negative integers (the current chunk index) and must return either a new index (as a list/tuple of non-negative integers within the array's chunk grid bounds) or | required |
| backward | Callable[[Iterable[int]], Iterable[int] \| None] | Inverse of | None |
Source code in icechunk-python/python/icechunk/session.py
shift_array #
Shift all chunks in an array by the given chunk offset.
Out-of-bounds chunks are discarded. To preserve them, resize the array first to make room. Vacated source positions are cleared (reset to fill value).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| array_path | str | The path to the array to shift. | required |
| chunk_offset | Iterable[int] | The number of chunks to shift by in each dimension. Positive values shift right/down, negative values shift left/up. | required |
Notes
To shift right while preserving all data, first resize the array using zarr's array.resize(), then shift.
Source code in icechunk-python/python/icechunk/session.py
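To make the shift semantics above concrete, here is a small pure-Python model that works on a dictionary of chunk coordinates. It is not icechunk's implementation: `shift_chunks` and `grid_shape` are hypothetical names, and a missing key stands for a chunk at the fill value.

```python
from typing import Dict, Tuple, TypeVar

T = TypeVar("T")
Coord = Tuple[int, ...]


def shift_chunks(
    chunks: Dict[Coord, T], grid_shape: Coord, chunk_offset: Coord
) -> Dict[Coord, T]:
    """Model of a chunk shift on a mapping of coordinates to chunk data.

    `grid_shape` is the number of chunks along each dimension.
    """
    shifted: Dict[Coord, T] = {}
    for coord, value in chunks.items():
        target = tuple(c + o for c, o in zip(coord, chunk_offset))
        # Chunks that land outside the grid are discarded.
        if all(0 <= t < n for t, n in zip(target, grid_shape)):
            shifted[target] = value
    # Positions with no incoming chunk are absent, i.e. cleared.
    return shifted


chunks = {(0,): "a", (1,): "b", (2,): "c"}
# Three-chunk grid: "c" falls off the end and position 0 is vacated.
print(shift_chunks(chunks, (3,), (1,)))  # {(1,): 'a', (2,): 'b'}
# Grid grown to four chunks first, as the note suggests: nothing is lost.
print(shift_chunks(chunks, (4,), (1,)))  # {(1,): 'a', (2,): 'b', (3,): 'c'}
```

The second call mirrors the advice in the note: growing the grid before shifting leaves room for the chunks that would otherwise be discarded.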
SessionMode #
Bases: Enum
Enum for session access modes.
Attributes:
| Name | Type | Description |
|---|---|---|
| readonly | int | Session can only read data. |
| writable | int | Session can read and write data. |
| rearrange | int | Session can only move nodes and reindex arrays. |