Commit bb41e19
[SPARK-53525][CONNECT] Spark Connect ArrowBatch Result Chunking
### What changes were proposed in this pull request?
Currently, we enforce gRPC message limits on both the client and the server. These limits are largely meant to protect both sides from potential OOMs by rejecting abnormally large messages. However, there are cases in which the server incorrectly sends oversized messages that exceed these limits and cause execution failures.
Specifically, the oversized server-to-client message we address here comes from the Arrow batch data in ExecutePlanResponse being too large. This happens when a single Arrow row exceeds the 128MB message limit: Arrow cannot partition a batch below one row, so it has to return the single large row in one gRPC message.
To improve Spark Connect stability, this PR splits large Arrow batches into chunks when returning query results from the server to the client, so that each ExecutePlanResponse chunk stays within the size limit. The client reassembles the chunks of a batch before parsing them as an Arrow batch.
(Scala client changes are being implemented in a follow-up PR.)
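The chunk-and-reassemble idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in this PR: `BatchChunk`, `split_batch`, `reassemble`, and the `chunk_index` / `num_chunks` fields are hypothetical names, and a plain `bytes` value stands in for a serialized Arrow batch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class BatchChunk:
    # Hypothetical stand-in for one response message carrying part of a batch.
    chunk_index: int
    num_chunks: int
    data: bytes


def split_batch(batch: bytes, max_chunk_size: int) -> List[BatchChunk]:
    """Split serialized batch bytes into chunks of at most max_chunk_size bytes."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    # Ceiling division; an empty batch still yields one (empty) chunk.
    num_chunks = max(1, -(-len(batch) // max_chunk_size))
    return [
        BatchChunk(i, num_chunks, batch[i * max_chunk_size:(i + 1) * max_chunk_size])
        for i in range(num_chunks)
    ]


def reassemble(chunks: Iterable[BatchChunk]) -> Iterator[bytes]:
    """Yield each complete batch once all of its chunks have arrived in order."""
    parts: List[bytes] = []
    for chunk in chunks:
        if chunk.chunk_index != len(parts):
            raise ValueError(f"expected chunk {len(parts)}, got {chunk.chunk_index}")
        parts.append(chunk.data)
        if len(parts) == chunk.num_chunks:
            yield b"".join(parts)
            parts = []
    if parts:
        raise ValueError("stream ended in the middle of a batch")


batch = b"Apache Spark " * 1000  # 13,000 bytes of stand-in batch data
chunks = split_batch(batch, max_chunk_size=4096)
restored = next(reassemble(chunks))
```

Carrying the chunk index and the total chunk count in every chunk lets the receiver detect missing or out-of-order chunks without any extra bookkeeping message; whether the real protocol does the same is an assumption of this sketch.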
To reproduce the existing issue we are solving here, run this code on Spark Connect:
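```python
# Assumes `spark` is an active Spark Connect session.
# Number of repetitions of the 13-character string that fit in 1 MiB.
repeat_num_per_mb = 1024 * 1024 // len('Apache Spark ')
# Build a single row whose only column is ~300MB, then fetch it to the client.
res = spark.sql(f"select repeat('Apache Spark ', {repeat_num_per_mb * 300}) as huge_col from range(1)").collect()
print(len(res))
```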
It fails with a `StatusCode.RESOURCE_EXHAUSTED` error and the message `Received message larger than max (314570608 vs. 134217728)`, because the server tries to send a single ExecutePlanResponse of ~300MB to the client, well above the 128MB limit.
With the improvement introduced by the PR, the above code runs successfully and prints the expected result.
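The sizes in the error message can be checked with a little arithmetic. The snippet below is a minimal sketch that only uses numbers quoted above; it assumes, purely for illustration, that a chunk could be as large as the full 128MB limit, which gives a lower bound on the number of chunks rather than the count the server actually produces.

```python
import math

MAX_MESSAGE_SIZE = 128 * 1024 * 1024  # 134217728, the limit in the error message
unit = "Apache Spark "                # 13 characters, 1 byte each in UTF-8

repeat_num_per_mb = 1024 * 1024 // len(unit)
payload_size = len(unit) * repeat_num_per_mb * 300  # bytes in huge_col
message_size = 314570608                            # taken from the error message

# Bytes in the message beyond the raw column data (Arrow and protobuf framing).
overhead = message_size - payload_size
# Lower bound on chunk count if every chunk filled the whole limit.
min_chunks = math.ceil(message_size / MAX_MESSAGE_SIZE)

print(payload_size, overhead, min_chunks)  # prints: 314570100 508 3
```

So the column data accounts for all but about 500 bytes of the rejected message, and at least three messages are needed to carry it within the limit.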
### Why are the changes needed?
It improves Spark Connect stability when returning large rows.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
New tests on both the server side and the client side.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52271 from xi-db/arrow-batch-chunking.
Authored-by: Xi Lyu <[email protected]>
Signed-off-by: Herman van Hovell <[email protected]>
1 parent a8f56d4 commit bb41e19
File tree
9 files changed, +644 -147 lines changed
- python/pyspark/sql
  - connect
    - client
    - proto
  - tests/connect
- sql/connect
  - common/src/main/protobuf/spark/connect
  - server/src
    - main/scala/org/apache/spark/sql/connect
      - config
      - execution
      - service
    - test/scala/org/apache/spark/sql/connect/planner