[Kernel] Implement catalogManaged table feature in Kernel #4573

@scottsand-db

Overview

Please see #4381

Design

[Public] [External] Design Doc: Delta Kernel <> catalogManaged Tables

Project Tracking

Merged = ✅
Needs Review = 👀
Waiting for merge = ☑
Abandoned = 🛑

[COMPLETE ✅] Milestone 0.1: E2E read MVP

| Description | PR | Status | Created | Merged |
| --- | --- | --- | --- | --- |
| ParsedLogData | #4579 | ✅ | 05/19 | 05/21 |
| ParsedCheckpointData ordering | #4597 | ✅ | 05/21 | 05/27 |
| APIs - TableManager, ResolvedTableBuilder, ResolvedTable | #4614 | ✅ | 05/22 | 05/28 |
| ResolvedTable impl; Builder impl; Factory impl; Include parsedLogData when constructing LogSegment | #4615 | ✅ | 05/22 | 06/02 |
| Load the protocol, metadata, and LogSegment only as needed | #4644 | ✅ | 05/27 | 06/03 |
| Refactor: Make LogReplay load P&M lazily, without impacting any existing code paths today | #4641 | ✅ | 05/27 | 05/27 |
| Refactor: Refactor SnapshotQueryContext error reporting to instance method | #4654 | ✅ | 05/28 | 05/28 |
| Refactor: Move testMetadata creation helper to separate test trait | #4657 | ✅ | 05/28 | 05/28 |
| ResolvedTableBuilder input validation | #4664 | ✅ | 05/29 | 06/16 |
| @mmmyr -- #4639 -- Move static util assertLogFilesBelongToTable into LogSegment constructor validation | #4682 | ✅ | 05/30 | 06/02 |
| Refactor: Make AbstractTestUtils so we can run tests using both old and new APIs | #4676 | ✅ | 05/30 | 06/14 |
| Refactor: Make LogReplay take in a lazy LogSegment | #4690 | ✅ | 06/02 | 06/03 |
| catalogManaged preview table feature support | #4686 | ✅ | 06/02 | 06/14 |
| Fix deltas + commits merging logic to favor the ratified commits | #4768 | ✅ | 06/13 | 06/18 |
| Create ScanBuilder from ResolvedTable; add E2E ResolvedTable read tests | #4663 | ✅ | 06/16 | 06/18 |
| Simple E2E read suite with real table and real staged commits | #4761 | ✅ | 06/16 | 06/18 |
| Mock the P & M in unit tests | N/A | TODO | xx | xx |
| Support a table of only log data | N/A | TODO | xx | xx |
| If table constructed with ratified staged commit log data, verify table feature catalogManaged is supported | N/A | TODO | xx | xx |
| AbstractCatalogManagedE2ESuite and AbstractCatalogManagedTestClient | #4745 | 🛑 | xx | xx |

[COMPLETE ✅] Milestone 0.2: E2E read MVP that can read real UC managed tables

| Description | PR | Status | Created | Merged |
| --- | --- | --- | --- | --- |
| Simple UCCatalogManagedClient with UCCatalogManagedClientSuite | #4780 | ✅ | 06/17 | 06/25 |
| Simple InMemoryUCClient with InMemoryUCClientSuite | #4835 | ✅ | 06/17 | 06/27 |
| UCCatalogManagedClient loadTable tests | #4838 | ✅ | 06/26 | 06/30 |
| GitHub workflow for unity tests | #4857 | ✅ | 06/30 | 06/30 |
| UCCatalogManagedClient: minor fixes (e.g. table version is optional) | #4944 | 👀 | 07/17 | xx |

[In Progress 🛠] Milestone 0.3: E2E basic write MVP with in-memory client

| Description | PR | Status | Created | Merged |
| --- | --- | --- | --- | --- |
| New Transaction and Committer etc. APIs | #4814 | ✅ | 06/24 | 07/14 |
| Basic TransactionV2 and CommitContext implementation | #4916 | 👀 | 07/11 | xx |
| Set the committer on ResolvedTableBuilder; No-op DefaultCommitter | #4936 | 👀 | 07/15 | xx |
| ImmutableInternalTransactionState | #4911 | 🛑 | 07/10 | xx |

[In Progress 🛠] Milestone 0.4: E2E write MVP that can write to real UC managed tables

| Description | PR | Status | Created | Merged |
| --- | --- | --- | --- | --- |
| [Do not merge] E2E Write Prototype | #4921 | 🛑 | 07/14 | xx |

Followups / TODOs

| Description | PR | Status | Created | Merged |
| --- | --- | --- | --- | --- |
| [Refactor] ParsedLogData refactor (remove isMaterialized, remove enums) | #4805 | 👀 | 06/23 | xx |
| [Docs] Better class/method docs for TableManager, ResolvedTableBuilder, ResolvedTable | #4822 | ✅ | 06/24 | 06/25 |
| Implement ResolvedTable::getTimestamp (refactor existing ICT utilities) | xx | xx | xx | xx |
| #4908: Include conflicting, and winning, catalog ratified commits in CommitFailedException | xx | xx | xx | xx |
| #4816: Public ParsedLogData API | xx | xx | xx | xx |
| #4820: Public Protocol API | xx | xx | xx | xx |
| #4821: Public Metadata API | xx | xx | xx | xx |
| #4817: Update UCCatalogManagedClient::loadTable to take in the TableInfo result from UC | xx | xx | xx | xx |
| #4763 -- Support official catalogManaged table feature name when RFC accepted | xx | xx | xx | xx |
| #4764 -- Update CatalogManagedEnablementSuite to use new APIs | xx | xx | xx | xx |
| #4765 -- Update ResolvedTableBuilder to accept other types of ParsedLogDatas, not only Staged Ratified Commits | xx | xx | xx | xx |
| #4770 -- Implement getVersionBeforeOrAtTimestamp (etc.) APIs | xx | xx | xx | xx |
| #4787 -- Refactor SnapshotManager::getLogSegmentForVersion | xx | xx | xx | xx |


Status

In Progress
