-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Issue #809: Fix Creating an external table options #820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
f038c28
to
2de06d6
Compare
Hi Lucian, thank you for contributing and submitting this PR. This is definitely a bug we want to fix and we are reviewing it now. |
Thanks for making the PR. Could you just change the above place to split table properties and options properly? Similar to delta/core/src/main/scala/org/apache/spark/sql/delta/catalog/DeltaCatalog.scala Lines 354 to 368 in d9e3877
|
d28b58b
to
59b4293
Compare
val props = if (conf.getConf(DeltaSQLConf.DELTA_LEGACY_STORE_WRITER_OPTIONS_AS_PROPS)) { | ||
// Legacy behavior | ||
writeOptions.foreach { case (k, v) => props.put(k, v) } | ||
catalogProp.properties ++ writeOptions | ||
} else { | ||
writeOptions.foreach { case (k, v) => | ||
// Continue putting in Delta prefixed options to avoid breaking workloads | ||
if (k.toLowerCase(Locale.ROOT).startsWith("delta.")) { | ||
props.put(k, v) | ||
} | ||
} | ||
val deltaPrefix: Map[String, String] = writeOptions.filter { case (k, _) => | ||
k.toLowerCase(Locale.ROOT).startsWith("delta.") } | ||
|
||
catalogProp.properties ++ deltaPrefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also be added to DeltaTableProperties
. We would like to be consistent in all places.
…s as for the underlying data delta-io#809 Signed-off-by: Lucian Neghina <[email protected]>
59b4293
to
76ed376
Compare
Sorry for being late. Although the change looks good, I will need to take some time to think about the backward compatibility. |
Sorry for the delay. I'm currently working on adding a new test suite to check the existing behaviors in all APIs that we can set options. |
Just to give a update. @scottsand-db is working on adding tests for the following complicated test matrix to check what are the behaviors in all combinations (Unfortunately, we have too many ways to create/write a Delta table today and they are not sharing the same code path): Two types of options:
Two ways to refer a Delta table
APIs to write a Delta table
We want to have these tests first so that we can make sure any new changes will be consistent in all of these cases. And thanks a lot, @scottsand-db for offering the help to do this. |
Here is the test suite which shows all the different ways to write configs to a delta table. |
Hi @nflucian, #1120 has been merged. It adds a test suite that shows the status of all the different combinations of writing configs into delta. Can you update your branch with master? Then, could you please run `build/sbt 'core/testOnly *DeltaWriteConfigsSuite' and double check that your changes solve the problem? Can you then please update the relevant comment tables (e.g. here) with the latest result. Let me know if you have any questions? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments. There's also 3 other big changes this PR needs.
- We should only write options that are prefixed with
delta
. BUT we also need to be able to read old tables that do have options that aren't prefixed with delta. (Backwards compatibility) - We need a feature flag, which when enabled rolls back to the previous (i.e. current) behaviour.
- A unit test that tests this backwards compatibility. See EvolvabilitySuite
new DeltaTableProperties(tableProperties, writeOptions) | ||
} | ||
|
||
private def deltaProperties(writeOptions: Map[String, String]): Map[String, String] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scaladoc. describe the input and the output
new DeltaTableProperties(tableProperties, writeOptions) | ||
} | ||
|
||
private def deltaProperties(writeOptions: Map[String, String]): Map[String, String] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scaladoc. describe the input and the output
} | ||
} | ||
|
||
private def split(properties: Map[String, String]): (Map[String, String], Map[String, String]) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scaladoc: describe the output. what are the tuples?
} | ||
} | ||
|
||
private def split(properties: Map[String, String]): (Map[String, String], Map[String, String]) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scaladoc: describe the output. what are the tuples?
withTempDir { tempDir => | ||
val path = tempDir.getCanonicalPath | ||
val options = Map( | ||
"header"-> "true", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After syncing with @zsxwing , we've confirmed that we do not want to write properties that are not prefixed with delta
. Thus, to be clear, we only want to write to table metadata those properties which start with delta
.
withTempDir { tempDir => | ||
val path = tempDir.getCanonicalPath | ||
val options = Map( | ||
"header"-> "true", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After syncing with @zsxwing , we've confirmed that we do not want to write properties that are not prefixed with delta
. Thus, to be clear, we only want to write to table metadata those properties which start with delta
.
This PR fix metadata.configuration inconsistency between write into delta and create delta table. #809