Add spark session extension for Hyperspace#504
Conversation
c62f5b6 to
ab14466
Compare
923872a to
84b2969
Compare
0cd2a5b to
c0d439f
Compare
c0d439f to
35d4ffe
Compare
82820db to
daf9d08
Compare
add configure to control enabling hyperspace add dummy rule to avoid different behavior of Extensions / apply hyperspace add test for hyperspace extension
89cdb02 to
fa80dba
Compare
| override def apply(extensions: SparkSessionExtensions): Unit = { | ||
| extensions.injectOptimizerRule { sparkSession => | ||
| // Enable Hyperspace to leverage indexes. | ||
| sparkSession.enableHyperspace() |
There was a problem hiding this comment.
According to the new model, enableHyperspace only exists for backward compatibility. I think it's better to factor out the rule insertion code out of this method and invoke the method here and from enableHyperspace.
There was a problem hiding this comment.
@clee704 Do you mean this part?
if (!sparkSession.sessionState.experimentalMethods.extraOptimizations.contains(
ApplyHyperspace)) {
sparkSession.sessionState.experimentalMethods.extraOptimizations ++=
ApplyHyperspace :: Nil
}
if (!sparkSession.sessionState.experimentalMethods.extraStrategies.contains(
BucketUnionStrategy)) {
sparkSession.sessionState.experimentalMethods.extraStrategies ++=
BucketUnionStrategy :: Nil
}
Where should I put this code because package object hyperspace is quite customer side interfaces, so not sure if it is ok to create a function like
package object hyperspace {
/**
* Hyperspace-specific implicit class on SparkSession.
*/
implicit class Implicits(sparkSession: SparkSession) {
def enableHyperspace(): SparkSession = {
HyperspaceConf.setHyperspaceApplyEnabled(sparkSession, true)
addOptimizationsIfNeeded()
sparkSession
}
private def addOptimizationsIfNeeded(): Unit = {
if (!sparkSession.sessionState.experimentalMethods.extraOptimizations.contains(
ApplyHyperspace)) {
sparkSession.sessionState.experimentalMethods.extraOptimizations ++=
ApplyHyperspace :: Nil
}
if (!sparkSession.sessionState.experimentalMethods.extraStrategies.contains(
BucketUnionStrategy)) {
sparkSession.sessionState.experimentalMethods.extraStrategies ++=
BucketUnionStrategy :: Nil
}
}
}
}
There was a problem hiding this comment.
You can put addOptimizationsIfNeeded() in a companion object HyperspaceSparkSessionExtension and call the method from here and enableHyperspace().
There was a problem hiding this comment.
@clee704 can you check whether I implemented as you expected?
| * | ||
| * @param sparkSession Spark session that will use Hyperspace | ||
| */ | ||
| def addOptimizationsIfNeeded(sparkSession: SparkSession): Unit = { |
There was a problem hiding this comment.
I think it's better to locate the function in package.scala
What is the context for this pull request?
What changes were proposed in this pull request?
Added a feature to enable Hyperspace with SparkSessionExtention
Does this PR introduce any user-facing change?
Yes. Users now can enable Hyperspace as follows
or
How was this patch tested?
Manually with spark-shell. If an automated test is required, I will add it.