Open
31 commits
8188bb8
[SPARK-XXXXX][CONNECT][PYTHON] Add SQLContext wrapper for Spark Connect
dbtsai Apr 27, 2026
63db047
Address review comments on Connect SQLContext wrapper
dbtsai Apr 28, 2026
b867ac5
Fix mypy errors in Connect SQLContext wrapper
dbtsai Apr 28, 2026
d605e53
Fix CI: register test_connect_context in modules.py and fix ruff form…
dbtsai Apr 30, 2026
94047fc
Add API reference docs for SQLContext and HiveContext legacy entry po…
dbtsai May 1, 2026
aa3e4e3
Fix newSession() in Connect SQLContext to use cloneSession()
dbtsai May 1, 2026
ee88276
Address review comments: fix tables() parity, remove type: ignore, fi…
dbtsai May 26, 2026
e4c94a3
Block HiveContext bypass via getOrCreate by overriding _from_session
dbtsai May 26, 2026
adea4d4
Fix ruff format: collapse inner cast onto one line
dbtsai May 26, 2026
b4329f6
Trigger CI
dbtsai May 27, 2026
b8bb147
Address review feedback: remove public getOrCreate from Connect SQLCo…
dbtsai May 28, 2026
30f3056
Fix HiveContext _instantiatedContext bypass and register missing test…
dbtsai May 28, 2026
c95a81f
Fix SQLContext.__init__ to use type(self) for cache, not hardcoded cl…
dbtsai May 28, 2026
bb48a0b
Remove misleading deprecated annotation from registerJavaFunction in …
dbtsai May 28, 2026
cc15e98
Fix tables() namespace truncation and stop() cache leak in Connect SQ…
dbtsai May 28, 2026
631339f
address review feedback: version tags, __all__, comment fix, getOrCre…
dbtsai May 29, 2026
4e28c40
fix: versionadded tags should be 4.2.0, not 5.0.0
dbtsai May 29, 2026
dc19a21
address review feedback: 4.3.0 version tags, session-cache comment, m…
dbtsai Jun 1, 2026
9e37701
fix: properly validate cached session in _get_or_create_from_session,…
dbtsai Jun 1, 2026
02fb0b0
fix: ruff format on context.py getattr, fix registerFunction test ret…
dbtsai Jun 2, 2026
b89f921
address review feedback (viirya): widen getOrCreate dispatch to is_re…
dbtsai Jun 4, 2026
a0a9c37
address review feedback (viirya round 2): replace assert with PySpark…
dbtsai Jun 4, 2026
4bd0da7
test cleanup from self-review: drop dead warnings filter, stop cloned…
dbtsai Jun 4, 2026
4ab909d
fix: do not stop cloned Connect session in test (terminates shared lo…
dbtsai Jun 4, 2026
edfd561
address review feedback (HyukjinKwon): bump .. deprecated:: directive…
dbtsai Jun 8, 2026
9fdf721
fix: release cloned Connect session in test to avoid atexit hang
dbtsai Jun 9, 2026
327fd80
address review feedback (hvanhovell): newSession() returns a fresh se…
dbtsai Jun 9, 2026
fb8931d
address review feedback (cloud-fan): namespace parity, em-dash, subcl…
dbtsai Jun 9, 2026
5df84a4
fix: set release_session_on_close on sessions created via object.__new__
dbtsai Jun 9, 2026
2b7fd64
fix: release leaked Connect client in parity newSession test
dbtsai Jun 9, 2026
392a4ae
address review feedback (Codex): preserve session hooks and RPC deadl…
dbtsai Jun 11, 2026
3 changes: 3 additions & 0 deletions dev/sparktestsupport/modules.py
@@ -572,6 +572,7 @@ def __hash__(self):
"pyspark.sql.tests.test_column",
"pyspark.sql.tests.test_conf",
"pyspark.sql.tests.test_context",
"pyspark.sql.tests.test_sql_context",
"pyspark.sql.tests.test_dataframe",
"pyspark.sql.tests.test_collection",
"pyspark.sql.tests.test_creation",
@@ -1165,6 +1166,8 @@ def __hash__(self):
"pyspark.sql.tests.connect.test_parity_geometrytype",
"pyspark.sql.tests.connect.test_parity_datasources",
"pyspark.sql.tests.connect.test_parity_errors",
"pyspark.sql.tests.connect.test_connect_context",
"pyspark.sql.tests.connect.test_parity_sql_context",
"pyspark.sql.tests.connect.test_parity_catalog",
"pyspark.sql.tests.connect.test_parity_conf",
"pyspark.sql.tests.connect.test_parity_serde",
1 change: 1 addition & 0 deletions python/docs/source/reference/pyspark.sql/index.rst
@@ -45,3 +45,4 @@ This page gives an overview of all public Spark SQL API.
protobuf
datasource
stateful_processor
legacy
78 changes: 78 additions & 0 deletions python/docs/source/reference/pyspark.sql/legacy.rst
@@ -0,0 +1,78 @@
..  Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership.  The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at

..    http://www.apache.org/licenses/LICENSE-2.0

..  Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied.  See the License for the
    specific language governing permissions and limitations
    under the License.


====================
Legacy Entry Points
====================
.. currentmodule:: pyspark.sql

:class:`SQLContext` was the primary entry point for Spark SQL in Spark 1.x.
As of Spark 2.0, it has been replaced by :class:`SparkSession`.
These classes are retained for backward compatibility only.

.. deprecated:: 3.0.0
    Use :func:`SparkSession.builder.getOrCreate` instead.

.. note::
    Under Spark Connect, :meth:`SQLContext.registerJavaFunction` and the whole
    :class:`HiveContext` are not supported and raise
    :class:`~pyspark.errors.PySparkNotImplementedError`,
    since they rely on a JVM ``SparkContext`` that does not exist in Connect mode.

SQLContext
----------

.. autosummary::
    :toctree: api/

    SQLContext

.. autosummary::
    :toctree: api/

    SQLContext.getOrCreate
    SQLContext.newSession
    SQLContext.setConf
    SQLContext.getConf
    SQLContext.udf
    SQLContext.udtf
    SQLContext.range
    SQLContext.registerFunction
    SQLContext.registerJavaFunction
    SQLContext.createDataFrame
    SQLContext.registerDataFrameAsTable
    SQLContext.dropTempTable
    SQLContext.createExternalTable
    SQLContext.sql
    SQLContext.table
    SQLContext.tables
    SQLContext.tableNames
    SQLContext.cacheTable
    SQLContext.uncacheTable
    SQLContext.clearCache
    SQLContext.read
    SQLContext.readStream
    SQLContext.streams

HiveContext
-----------

.. autosummary::
    :toctree: api/

    HiveContext
29 changes: 29 additions & 0 deletions python/pyspark/sql/connect/client/core.py
@@ -2633,3 +2633,32 @@ def clone(self, new_session_id: Optional[str] = None) -> "SparkConnectClient":
        # Ensure the session ID is correctly set from the response
        new_client._session_id = response.new_session_id
        return new_client

    def newSession(self) -> "SparkConnectClient":
        """
        Create a new client against the same endpoint with a fresh, independent server-side
        session that does NOT inherit any state from this client's session. Unlike
        :meth:`clone`, no state (SQL configurations, temporary views, registered functions,
        catalog state) is copied over, and no server round-trip is made: the new client is
        built from a copy of this client's connection configuration with the session ID
        cleared, so a fresh session ID is generated and the server lazily creates an empty
        isolated session for it.

        Returns
        -------
        SparkConnectClient
            A new SparkConnectClient instance bound to a fresh, empty session.
        """
        # Reuse the same connection configuration (endpoint, channel options, metadata,
        # user) but drop the session ID so the constructor generates a fresh UUID.
        new_connection = copy.deepcopy(self._builder)
        new_connection._params.pop(ChannelBuilder.PARAM_SESSION_ID, None)
        # Only server-side session state is left behind: client-side behavior such as
        # registered session hooks and RPC deadlines carries over, as in clone().
        return SparkConnectClient(
            connection=new_connection,
            user_id=self._user_id,
            use_reattachable_execute=self._use_reattachable_execute,
            session_hooks=self._session_hooks,
            rpc_deadlines=self._rpc_deadlines,
        )
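
The approach described in the `newSession` docstring (copy the connection configuration, drop the pinned session ID, and let the constructor generate a new one) can be illustrated with a small standalone sketch. Everything in it is a hypothetical stand-in rather than a PySpark API: `ToyBuilder`, `ToyClient`, `new_session`, and the `PARAM_SESSION_ID` constant are assumptions made for illustration, and nothing here contacts a server.

```python
import copy
import uuid
from typing import Dict, Optional

# Hypothetical stand-in for ChannelBuilder.PARAM_SESSION_ID.
PARAM_SESSION_ID = "session_id"


class ToyBuilder:
    """Hypothetical stand-in for the connection configuration object."""

    def __init__(self, endpoint: str, params: Optional[Dict[str, str]] = None) -> None:
        self.endpoint = endpoint
        self._params: Dict[str, str] = dict(params or {})


class ToyClient:
    """Hypothetical stand-in for a client; it makes no network calls."""

    def __init__(self, connection: ToyBuilder, user_id: Optional[str] = None) -> None:
        self._builder = connection
        self._user_id = user_id
        # Reuse a session ID pinned in the configuration, else generate a fresh UUID.
        self._session_id = connection._params.get(PARAM_SESSION_ID) or str(uuid.uuid4())

    def new_session(self) -> "ToyClient":
        # Deep-copy the configuration so the two clients share no mutable state.
        new_connection = copy.deepcopy(self._builder)
        # Drop any pinned session ID so the constructor generates a new one.
        new_connection._params.pop(PARAM_SESSION_ID, None)
        return ToyClient(connection=new_connection, user_id=self._user_id)


original = ToyClient(
    ToyBuilder("sc://localhost:15002", {PARAM_SESSION_ID: "pinned-id"}),
    user_id="alice",
)
fresh = original.new_session()
```

In this sketch `fresh` keeps the endpoint and user of `original` but carries a different session ID, and the original configuration is left unmodified because the pop happens on the deep copy. Whether the real client resolves a pinned session ID in exactly this way is not shown in the diff, so treat that part as an assumption.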