Skip to content

Enable WriteSerializable Isolation Level #1262

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

koertkuipers
Copy link
Contributor

@koertkuipers koertkuipers commented Jul 9, 2022

Description

Resolves #1261

How was this patch tested?

Added tests for isolation level WriteSerializable in OptimisticTransactionSuite

Does this PR introduce any user-facing changes?

Yes. it enabled setting config delta.isolationLevel=WriteSerializable

@koertkuipers koertkuipers changed the title Expose isolation level [WIP] Expose isolation level Jul 10, 2022
@koertkuipers
Copy link
Contributor Author

this pullreq is WIP because i am waiting to hear if there is any good reason this should not be exposed...

@tdas tdas requested review from tdas and zsxwing July 13, 2022 18:12
Serializable.toString,
IsolationLevel.fromString(_),
_ => true,
"needs to be Serializable or WriteSerializable"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't have the implementation for WriteSerializable yet, could you make this config only accept Serializable?

We can target this PR to allow users to set delta.isolationLevel to Serializable. This is correct and there is no point to block people.

Supporting WriteSerializable can be a separate PR that would require a large change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain what is missing for the implementation for WriteSerializable? that's what i am trying to understand.
what i can see is that the logic for WriteSerializable is already present in ConflictChecker and OptimisticTransaction, but i am unsure what else would be needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just took a quick look at the existing code for WriteSerializable. It might not be correct. I need to take a detail look and think through it. Haven't read these codes for a while. This is the most complicated part in Delta.

Meanwhile, you can change the code to enable Serializable only. Will open it to allow WriteSerializable when we fix the bugs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #1276 to fix #1265
This is just to allow setting Serializable isolation level only

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i did an experiment where i changed default isolation level to WriteSerializable and ran all the tests. the only tests that failed where the ones that should fail due to the change in behavior.
we also started testing multiple jobs writing concurrently to same table using WriteSerializable isolation level on real cluster and everything is behaving as expected.
i am in no way saying the code for WriteSerializable is correct, but if its not i haven't spotted the issues so far.
just FYI

@koertkuipers
Copy link
Contributor Author

ok i will create new pullreq to just expose isolation level and only allow Serializable.
i will let this one remain WIP just in case it turns out current code for WriteSerializable is correct.

@koertkuipers koertkuipers changed the title [WIP] Expose isolation level [WIP] Expose isolation level and enable WriteSerializable Jul 16, 2022
Copy link
Contributor

@jaceklaskowski jaceklaskowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two more nits and we're done. Sorry for dragging it so long.

@scottsand-db scottsand-db requested a review from zsxwing August 1, 2022 18:15
@scottsand-db
Copy link
Collaborator

ok i will create new pullreq to just expose isolation level and only allow Serializable. i will let this one remain WIP just in case it turns out current code for WriteSerializable is correct.

What's the status here? Did you create that new PR? Can we close this one?

Thanks!

@koertkuipers koertkuipers changed the title [WIP] Expose isolation level and enable WriteSerializable [WIP] Enable WriteSerializable Isolation Level Oct 15, 2022
@koertkuipers
Copy link
Contributor Author

koertkuipers commented Oct 15, 2022

ok i will create new pullreq to just expose isolation level and only allow Serializable. i will let this one remain WIP just in case it turns out current code for WriteSerializable is correct.

What's the status here? Did you create that new PR? Can we close this one?

Thanks!

the pullrequest to enable just setting Serializable isolation level has been merged. it fixes #1265

this pullrequest enables WriteSerializable isolation level, which is a separate issue that has not been resolved yet. see #1261
i am waiting for @zsxwing to take a detailed look at current code for WriteSerializable which this pullreq depends on (and thats why its marked as [WIP]).

@koertkuipers
Copy link
Contributor Author

@zsxwing can you take a look please?

this passes all tests without issues, and we have been running this for years now on clusters (HDFS and S3). the logic for WriteSerializable in ConflictChecker seems sound to me (but i am not the expert on this...)

WriteSerializable is the far more useful isolation level (and the default in databricks platform). without WriteSerializable isolation even the most basic concurrent writes will fail. allowing only Serializable isolation is making open source delta a bit of a second class citizen, so i think this is high priority.

@felipepessoto
Copy link
Contributor

Is this still a WIP?

@koertkuipers
Copy link
Contributor Author

Is this still a WIP?

i think it is not. let me change title. i does still need to be reviewed though by people that understand this part of codebase.

@koertkuipers koertkuipers changed the title [WIP] Enable WriteSerializable Isolation Level Enable WriteSerializable Isolation Level Jul 26, 2023
@numiy
Copy link

numiy commented Sep 24, 2024

Is there a plan to merge this ?

@danieljsc
Copy link

When will this be made available for non-databricks users? Thought deltalake was meant to be open source, not making parts of the code non accessible by open source users/systems.
Hopefully you will stop trying to kill delta lake table format and the open source standard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Enable WriteSerializable Isolation Level
7 participants