Description
Currently the way to create or update a delta table with partitions is to have the partitions as columns in the data and then call write_deltalake with partition_by set to the name(s) of the partition column(s). This makes sense as the partition value can change per row; however, if the partition values are the same for all rows (and not in-band), it would be useful to be able to specify them explicitly out-of-band, e.g.:

```python
write_deltalake(path, data, partition_on=dict(partition_name=DataType.string()))
```
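For reference, a minimal sketch of the current in-band approach, where the static value has to be injected as a real column before writing (the path and column names here are placeholders):

```python
import pyarrow as pa
from deltalake import write_deltalake

# Example data; in practice this comes from the batch pipeline.
table = pa.table({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# Inject the static partition value as an extra column -- the step
# this issue proposes to make unnecessary.
table = table.append_column(
    "partition_column", pa.array(["static value"] * table.num_rows)
)

# write_deltalake partitions by the injected column.
write_deltalake("path/to/table", table, partition_by=["partition_column"])
```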
Use Case
We currently use pyspark to write delta tables as part of our batch processing, where the partition columns are static per batch and not part of the data itself. This involves doing something along the lines of:
```python
from pyspark.sql.functions import lit

df = df.withColumn("partition_column", lit("static value"))
writer = df.write.format("delta").partitionBy("partition_column")
...
writer.save(path)
```
However, doing something similar with delta-rs isn't as simple, since write_deltalake expects an ArrowStreamExportable, so we can no longer inject the partition columns in pure python code.
It also seems slightly inefficient to add extra columns to the data only for them ultimately not to be written out to the parquet files in the delta table.
Activity
ion-elgreco commented on Jun 21, 2025
What do you mean? If you use an engine with a lazy execution model, you can add the partitions. Use datafusion-python for example to add a literal column and then pass the df into deltalake
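A minimal sketch of that suggestion (assuming a recent datafusion-python version whose DataFrame exports the Arrow C stream interface, so it can be passed straight to write_deltalake; path and column names are placeholders):

```python
from datafusion import SessionContext, lit
from deltalake import write_deltalake

ctx = SessionContext()
# Example data; in practice this comes from the batch pipeline.
df = ctx.from_pydict({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# Add the static partition value as a literal column; with a lazy
# engine this is part of the plan, not materialized eagerly.
df = df.with_column("partition_column", lit("static value"))

# Recent datafusion-python DataFrames are ArrowStreamExportable,
# so they can be handed to write_deltalake directly.
write_deltalake("path/to/table", df, partition_by=["partition_column"])
```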
johnnyg commented on Jun 22, 2025
Sorry, I did mean to say "in pure python code without an external library" (although technically the engines aren't using pure python to do this stuff).
I have gotten it to work by using a lazy execution engine (so far only ibis/duckdb, as both datafusion and pyarrow's datasets throw errors), but not without a significant performance impact, which is why it would be great if we didn't need to do this column "injection" in the first place.
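For reference, the ibis/duckdb variant might look roughly like this (a sketch, assuming recent ibis and deltalake versions, with ibis's default duckdb backend; table, path, and column names are placeholders):

```python
import ibis
from deltalake import write_deltalake

# Example data; in practice this comes from the batch pipeline.
t = ibis.memtable({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# Add the static partition value lazily via mutate.
t = t.mutate(partition_column=ibis.literal("static value"))

# Stream the result out as Arrow record batches and write the table.
write_deltalake(
    "path/to/table",
    t.to_pyarrow_batches(),
    partition_by=["partition_column"],
)
```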