
Commit ed431e2

add flag
1 parent a8733b7 commit ed431e2

2 files changed: 5 additions and 2 deletions

python/docs/source/migration_guide/pyspark_upgrade.rst

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ Upgrading PySpark
 Upgrading from PySpark 4.0 to 4.1
 ---------------------------------

-* In Spark 4.1, ``DataFrame['column_name']`` on Spark Connect Python Client will not eagerly validate the column name which might need an RPC. To restore the legacy behavior, set ``spark.sql.pyspark.legacy.eagerColumnNameValidation.enabled`` to ``true``.
+* In Spark 4.1, ``DataFrame['name']`` and ``DataFrame.name`` on Spark Connect Python Client no longer eagerly validate the column name. To restore the legacy behavior, set ``spark.sql.pyspark.legacy.eagerColumnNameValidation.enabled`` to ``true``.
 * In Spark 4.1, Arrow-optimized Python UDF supports UDT input / output instead of falling back to the regular UDF. To restore the legacy behavior, set ``spark.sql.execution.pythonUDF.arrow.legacy.fallbackOnUDT`` to ``true``.
 * In Spark 4.1, unnecessary conversion to pandas instances is removed when ``spark.sql.execution.pythonUDTF.arrow.enabled`` is enabled. As a result, the type coercion changes when the produced output has a schema different from the specified schema. To restore the previous behavior, enable ``spark.sql.legacy.execution.pythonUDTF.pandas.conversion.enabled``.
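The practical effect of the reworded migration note can be sketched without a Spark session. The snippet below is a rough model only: `DeferredFrame`, its `eager` argument, and `resolve` are hypothetical names invented for illustration and are not part of PySpark. It assumes that, once eager validation is off, an unknown column name is accepted at access time and only surfaces as an error when the reference is later resolved.

```python
class DeferredFrame:
    """Hypothetical model of the two validation modes; not the PySpark API."""

    def __init__(self, columns, eager=False):
        self._columns = list(columns)
        self._eager = eager

    def __getitem__(self, name):
        if self._eager and name not in self._columns:
            # Legacy-style behavior: fail at the point of access.
            raise KeyError(name)
        # Default-style behavior: hand back an unchecked reference.
        return name

    def resolve(self, *names):
        # Stand-in for the later analysis step, where unknown names surface.
        missing = [n for n in names if n not in self._columns]
        if missing:
            raise KeyError(f"unknown columns: {missing}")
        return list(names)


def fails(action):
    """Return True if calling `action` raises KeyError."""
    try:
        action()
    except KeyError:
        return True
    return False


lazy = DeferredFrame(["id", "value"])
eager = DeferredFrame(["id", "value"], eager=True)

ref = lazy["typo"]                                    # accepted without a check
fails_on_access = fails(lambda: eager["typo"])        # legacy mode fails here
fails_on_resolve = fails(lambda: lazy.resolve(ref))   # lazy mode fails later
```

Code that relied on the early failure, for example a `try`/`except` wrapped around the column access itself, would under this model need to move its error handling to the point where the query is analyzed.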

python/pyspark/sql/connect/dataframe.py

Lines changed: 4 additions & 1 deletion
@@ -1704,7 +1704,10 @@ def __getattr__(self, name: str) -> "Column":
                 errorClass="JVM_ATTRIBUTE_NOT_SUPPORTED", messageParameters={"attr_name": name}
             )

-        if name not in self.columns:
+        if (
+            os.environ.get("spark.sql.pyspark.legacy.eagerColumnNameValidation.enabled") == "true"
+            and name not in self.columns
+        ):
             raise PySparkAttributeError(
                 errorClass="ATTRIBUTE_NOT_SUPPORTED", messageParameters={"attr_name": name}
             )
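The gating condition in this hunk can be tried in isolation. The sketch below is a simplified stand-in, not the PySpark implementation: `FrameSketch` and the `unresolved(...)` placeholder are hypothetical, and a plain `AttributeError` replaces `PySparkAttributeError`. It keeps one detail from the diff as written: the flag is looked up in the process environment through `os.environ`, under the same key string that the migration note names.

```python
import os

# Key copied from the diff; the check reads the process environment.
LEGACY_FLAG = "spark.sql.pyspark.legacy.eagerColumnNameValidation.enabled"


class FrameSketch:
    """Hypothetical stand-in for the Connect DataFrame; not PySpark code."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if os.environ.get(LEGACY_FLAG) == "true" and name not in self.columns:
            # Legacy path: reject unknown names at access time.
            raise AttributeError(f"Attribute `{name}` is not supported.")
        # Lazy path: return an unresolved reference without checking it.
        return f"unresolved({name})"


df = FrameSketch(["id", "value"])

os.environ.pop(LEGACY_FLAG, None)
lazy_result = df.missing  # no error: validation is deferred

os.environ[LEGACY_FLAG] = "true"
try:
    df.missing
    eager_raised = False
except AttributeError:
    eager_raised = True
finally:
    os.environ.pop(LEGACY_FLAG, None)
```

Because the lookup happens on every attribute access in this sketch, toggling the environment entry takes effect immediately, with no need to rebuild the object.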
