Replies: 1 comment
Hey @gero90, thanks for bringing this up! Created an issue to track this feature request's progress: #3823
If there is any way to estimate Parquet file size in `df.write_iceberg()`, it would be really nice to try to get Parquet files close in size to the Iceberg table property `write.target-file-size-bytes` (default is 512 MiB). Having Parquet files close to that size makes Iceberg reads more efficient, and there is less table maintenance (compaction) to perform.

As an example, where I know the total data is small, I'm doing `df.into_partitions(1)` right before `df.write_iceberg()` to get a single file per write. Thanks in advance for taking a look and for making Daft awesome!