
Commit 34c29cf

LuciferYang authored and dongjoon-hyun committed
[SPARK-51365][SQL][TESTS] Add Envs to control the number of SHUFFLE_EXCHANGE/RESULT_QUERY_STAGE threads used in test cases related to SharedSparkSession/TestHive
### What changes were proposed in this pull request?

This PR adds the following environment variables:

- `SPARK_TEST_SQL_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD`: controls `SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD` for test cases related to `SharedSparkSession`.
- `SPARK_TEST_SQL_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD`: controls `RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD` for test cases related to `SharedSparkSession`.
- `SPARK_TEST_HIVE_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD`: controls `SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD` for test cases related to `TestHive`.
- `SPARK_TEST_HIVE_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD`: controls `RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD` for test cases related to `TestHive`.

This allows the maximum number of `SHUFFLE_EXCHANGE`/`RESULT_QUERY_STAGE` threads used in test cases related to `SharedSparkSession`/`TestHive` to be controlled by setting environment variables.

Additionally, because the macOS + Apple Silicon runners among the standard GitHub-hosted runners have only half the memory of the other specifications (7 GB vs 14 GB), this PR configures the following settings in `build_maven_java21_macos15.yml`:

```
"SPARK_TEST_SQL_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD": "256",
"SPARK_TEST_SQL_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD": "256",
"SPARK_TEST_HIVE_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD": "48",
"SPARK_TEST_HIVE_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD": "48"
```

This avoids test failures like the following in the daily tests on macOS:

```
Warning: [343.044s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 4096k, guardsize: 16k, detached.
Warning: [343.044s][warning][os,thread] Failed to start the native thread for java.lang.Thread "shuffle-exchange-1529"
*** RUN ABORTED ***
An exception or error caused a run to abort: unable to create native thread: possibly out of memory or process/resource limits reached
  java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
  at java.base/java.lang.Thread.start0(Native Method)
  at java.base/java.lang.Thread.start(Thread.java:1553)
  at java.base/java.lang.System$2.start(System.java:2577)
  at java.base/jdk.internal.vm.SharedThreadContainer.start(SharedThreadContainer.java:152)
  at java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:953)
  at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1364)
  at scala.concurrent.impl.ExecutionContextImpl.execute(ExecutionContextImpl.scala:21)
  at java.base/java.util.concurrent.CompletableFuture.asyncSupplyStage(CompletableFuture.java:1782)
  at java.base/java.util.concurrent.CompletableFuture.supplyAsync(CompletableFuture.java:2005)
  at org.apache.spark.sql.execution.SQLExecution$.withThreadLocalCaptured(SQLExecution.scala:329)
  ...
```

### Why are the changes needed?

The default values of both `SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD` and `RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD` are 1024. Additionally, since the `-Xss` values used in Spark test cases are relatively large by default (`-Xss4m` for the SQL module and `-Xss64m` for the Hive module), it is necessary to provide a way to adjust the maximum number of these threads to accommodate different test environments, such as the daily tests on macOS.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- Pass GitHub Actions
- Tested on macOS via GitHub Actions: https://github.com/LuciferYang/spark/actions/runs/13745222147

![image](https://github.com/user-attachments/assets/b7dc09bd-4450-4272-aa01-e64013c8aab4)

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #50206 from LuciferYang/SPARK-51365.

Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
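As a rough sanity check on why the 1024-thread defaults are risky on a 7 GB runner, the worst-case thread stack reservation can be estimated by multiplying the thread cap by the `-Xss` size. The sketch below is illustrative arithmetic only: `StackBudget` is a hypothetical helper, not part of the Spark codebase, and the actual memory a thread pins depends on the OS and JVM (stacks are typically reserved lazily).

```scala
object StackBudget {
  // Worst case if every pool thread reserved its full -Xss stack, in GB.
  def worstCaseStackGb(maxThreads: Int, xssMb: Int): Double =
    maxThreads.toDouble * xssMb / 1024

  def main(args: Array[String]): Unit = {
    // SQL module defaults: 1024 threads x -Xss4m = 4.0 GB
    println(worstCaseStackGb(1024, 4))
    // Hive module defaults: 1024 threads x -Xss64m = 64.0 GB,
    // far beyond the 7 GB available on the macOS Apple Silicon runner
    println(worstCaseStackGb(1024, 64))
    // With the 48-thread cap from this PR: 3.0 GB
    println(worstCaseStackGb(48, 64))
  }
}
```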
1 parent 0e3b7af commit 34c29cf

File tree

3 files changed (+19, −3 lines)

.github/workflows/build_maven_java21_macos15.yml

Lines changed: 5 additions & 1 deletion

```diff
@@ -36,5 +36,9 @@ jobs:
       os: macos-15
       envs: >-
         {
-          "OBJC_DISABLE_INITIALIZE_FORK_SAFETY": "YES"
+          "OBJC_DISABLE_INITIALIZE_FORK_SAFETY": "YES",
+          "SPARK_TEST_SQL_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD": "256",
+          "SPARK_TEST_SQL_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD": "256",
+          "SPARK_TEST_HIVE_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD": "48",
+          "SPARK_TEST_HIVE_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD": "48"
         }
```

sql/core/src/test/scala/org/apache/spark/sql/test/SharedSparkSession.scala

Lines changed: 6 additions & 0 deletions

```diff
@@ -79,6 +79,12 @@ trait SharedSparkSessionBase
       StaticSQLConf.WAREHOUSE_PATH,
       conf.get(StaticSQLConf.WAREHOUSE_PATH) + "/" + getClass.getCanonicalName)
     conf.set(StaticSQLConf.LOAD_SESSION_EXTENSIONS_FROM_CLASSPATH, false)
+    conf.set(StaticSQLConf.SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD,
+      sys.env.getOrElse("SPARK_TEST_SQL_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD",
+        StaticSQLConf.SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD.defaultValueString).toInt)
+    conf.set(StaticSQLConf.RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD,
+      sys.env.getOrElse("SPARK_TEST_SQL_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD",
+        StaticSQLConf.RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD.defaultValueString).toInt)
   }

   /**
```
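The env-var override idiom used in both test harnesses (`sys.env.getOrElse(envKey, defaultValueString).toInt`) can be exercised standalone. `ThreadThresholdEnv.resolveThreshold` below is a hypothetical helper written for this sketch, not part of the Spark codebase; it takes the environment as a parameter so the fallback behavior is testable without touching real env vars:

```scala
object ThreadThresholdEnv {
  // Mirror of the PR's pattern: use the env var's value if present,
  // otherwise fall back to the config entry's default (kept as a String,
  // like ConfigEntry.defaultValueString), then parse to Int.
  def resolveThreshold(
      envKey: String,
      defaultValueString: String,
      env: Map[String, String] = sys.env): Int =
    env.getOrElse(envKey, defaultValueString).toInt

  def main(args: Array[String]): Unit = {
    // No override present: the default (1024) wins.
    println(resolveThreshold(
      "SPARK_TEST_SQL_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD", "1024", Map.empty))
    // Override set, as in build_maven_java21_macos15.yml: the env value wins.
    println(resolveThreshold(
      "SPARK_TEST_SQL_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD", "1024",
      Map("SPARK_TEST_SQL_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD" -> "256")))
  }
}
```

Passing the environment as a `Map` parameter (defaulting to `sys.env`) keeps the real call sites unchanged while making both branches verifiable in a unit test.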

sql/hive/src/test/scala/org/apache/spark/sql/hive/test/TestHive.scala

Lines changed: 8 additions & 2 deletions

```diff
@@ -44,7 +44,7 @@ import org.apache.spark.sql.execution.{CommandExecutionMode, QueryExecution, SQL
 import org.apache.spark.sql.hive._
 import org.apache.spark.sql.hive.client.HiveClient
 import org.apache.spark.sql.internal.{SessionState, SharedState, SQLConf, WithTestConf}
-import org.apache.spark.sql.internal.StaticSQLConf.{CATALOG_IMPLEMENTATION, WAREHOUSE_PATH}
+import org.apache.spark.sql.internal.StaticSQLConf.{CATALOG_IMPLEMENTATION, RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD, SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD, WAREHOUSE_PATH}
 import org.apache.spark.util.{ShutdownHookManager, Utils}

 // SPARK-3729: Test key required to check for initialization errors with config.
@@ -70,7 +70,13 @@ object TestHive
     // LocalRelation will exercise the optimization rules better by disabling it as
     // this rule may potentially block testing of other optimization rules such as
     // ConstantPropagation etc.
-    .set(SQLConf.OPTIMIZER_EXCLUDED_RULES.key, ConvertToLocalRelation.ruleName))) {
+    .set(SQLConf.OPTIMIZER_EXCLUDED_RULES.key, ConvertToLocalRelation.ruleName)
+    .set(SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD,
+      sys.env.getOrElse("SPARK_TEST_HIVE_SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD",
+        SHUFFLE_EXCHANGE_MAX_THREAD_THRESHOLD.defaultValueString).toInt)
+    .set(RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD,
+      sys.env.getOrElse("SPARK_TEST_HIVE_RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD",
+        RESULT_QUERY_STAGE_MAX_THREAD_THRESHOLD.defaultValueString).toInt))) {
   override def conf: SQLConf = sparkSession.sessionState.conf
 }
```