[SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories#27130
[SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories#27130kevinyu98 wants to merge 5 commits into
Conversation
|
ok to test |
|
Test build #116300 has finished for PR 27130 at commit
|
| |LOCATION '${s"${path.getCanonicalPath}"}'""".stripMargin | ||
| sql(topDirStatement) | ||
| if (parquetConversion == "true") { | ||
| checkAnswer(sql("select * from tbl1"), Nil) |
There was a problem hiding this comment.
Shall we capitalize the SQL statement like SELECT * FROM tbl1?
| if (parquetConversion == "true") { | ||
| checkAnswer(sql("select * from tbl1"), Nil) | ||
| } else { | ||
| intercept[IOException](sql("select * from tbl1").show()) |
There was a problem hiding this comment.
We had better check the exception message.
| |LOCATION '${s"${path.getCanonicalPath}/l1/"}'""".stripMargin | ||
| sql(l1DirStatement) | ||
| if (parquetConversion == "true") { | ||
| checkAnswer(sql("select * from tbl2"), |
| checkAnswer(sql("select * from tbl2"), | ||
| (1 to 2).map(i => Row(i, i, s"parq$i"))) | ||
| } else { | ||
| intercept[IOException](sql("select * from tbl2").show()) |
There was a problem hiding this comment.
Please check the exception message.
There was a problem hiding this comment.
the whole exception message is
Not a file: file:/Users/qianyangyu/IdeaProjects/spark/target/tmp/spark-abc8c1ad-4a3a-420f-b4fc-58d995be9bb0/l1, I will check the first part Not a file:.
| |LOCATION '${s"${path.getCanonicalPath}/l1/l2/"}'""".stripMargin | ||
| sql(l2DirStatement) | ||
| if (parquetConversion == "true") { | ||
| checkAnswer(sql("select * from tbl3"), |
| checkAnswer(sql("select * from tbl3"), | ||
| (3 to 4).map(i => Row(i, i, s"parq$i"))) | ||
| } else { | ||
| intercept[IOException](sql("select * from tbl3").show()) |
There was a problem hiding this comment.
Please check the exception message.
There was a problem hiding this comment.
added the checking
| withTempDir { dir => | ||
| try { | ||
| hiveClient.runSqlHive("USE default") | ||
| hiveClient.runSqlHive( |
There was a problem hiding this comment.
Do we need to use runSqlHive?
There was a problem hiding this comment.
sure, I will change to sql.
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Thank you for keeping working on this. Sorry for your long waiting. At this time, I hope we can merge your PR.
|
Test build #116421 has finished for PR 27130 at commit
|
| if (parquetConversion == "true") { | ||
| checkAnswer(sql("SELECT * FROM tbl1"), Nil) | ||
| } else { | ||
| val msg = intercept[IOException] {sql("SELECT * FROM tbl1").show() |
There was a problem hiding this comment.
sql("SELECT * FROM tbl1").show() seems to need to be in the next line.
| } else { | ||
| val msg = intercept[IOException] {sql("SELECT * FROM tbl1").show() | ||
| }.getMessage | ||
| assert(msg.contains("Not a file:")) |
| sql(l1DirStatement) | ||
| if (parquetConversion == "true") { | ||
| checkAnswer(sql("SELECT * FROM tbl2"), | ||
| (1 to 2).map(i => Row(i, i, s"parq$i"))) |
There was a problem hiding this comment.
Can we merge 269 and 270 into one line here?
| checkAnswer(sql("SELECT * FROM tbl3"), | ||
| (3 to 4).map(i => Row(i, i, s"parq$i"))) | ||
| } else { | ||
| val msg = intercept[IOException] {sql("SELECT * FROM tbl3").show() |
There was a problem hiding this comment.
sql("SELECT * FROM tbl3").show()?
There was a problem hiding this comment.
moved to next line.
| checkAnswer(sql("SELECT * FROM tbl5"), | ||
| (1 to 4).map(i => Row(i, i, s"parq$i"))) | ||
| } else { | ||
| val msg = intercept[IOException] {sql("SELECT * FROM tbl5").show() |
| sql("USE default") | ||
| sql( | ||
| """ | ||
| |CREATE EXTERNAL TABLE hive_orc( |
There was a problem hiding this comment.
I'm a little confused here.
@kevinyu98 . Do you want to get a table created by Hive here?
Usually, we use the table name, hive_orc, for that table. Please see https://github.com/apache/spark/pull/27130/files#diff-a8c26a35def87a13e6b59db19d9fb8a1R68 .
And, you still using hiveClient.runSqlHive at line 192. I'm wondering what is the test target in this PR~.
There was a problem hiding this comment.
@dongjoon-hyun Thanks for pointing out this. I was using other test cases without thinking too much. I have changed the name. I also replaced the hiveClient.runSqlHive for the insert stmt.
|
Test build #116548 has finished for PR 27130 at commit
|
|
@dongjoon-hyun can we re-test it? I ran |
|
retest this please |
|
Test build #116657 has finished for PR 27130 at commit
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Hi, @kevinyu98 .
I agree with HiveOrcSourceSuite.scala. Could you update HiveParquetSourceSuite.scala like HiveOrcSourceSuite.scala? There is no need to be different.
|
@dongjoon-hyun sure, thanks. |
| val l1DirSqlStatement = s"SELECT * FROM tbl2" | ||
| if (convertMetastore) { | ||
| checkAnswer(sql(l1DirSqlStatement), | ||
| (1 to 2).map(i => Row(i, i, s"orc$i"))) |
There was a problem hiding this comment.
Let's make line 250 and 251 as one line like HiveParquetSourceSuite.
| (1 to 2).map(i => Row(i, i, s"orc$i"))) | ||
| } else { | ||
| checkAnswer(sql(l1DirSqlStatement), | ||
| (1 to 6).map(i => Row(i, i, s"orc$i"))) |
| val l2DirSqlStatement = s"SELECT * FROM tbl3" | ||
| if (convertMetastore) { | ||
| checkAnswer(sql(l2DirSqlStatement), | ||
| (3 to 4).map(i => Row(i, i, s"orc$i"))) |
| (3 to 4).map(i => Row(i, i, s"orc$i"))) | ||
| } else { | ||
| checkAnswer(sql(l2DirSqlStatement), | ||
| (3 to 6).map(i => Row(i, i, s"orc$i"))) |
| val wildcardTopDirSqlStatement = s"SELECT * FROM tbl4" | ||
| if (convertMetastore) { | ||
| checkAnswer(sql(wildcardTopDirSqlStatement), | ||
| (1 to 2).map(i => Row(i, i, s"orc$i"))) |
| val wildcardL1DirSqlStatement = s"SELECT * FROM tbl5" | ||
| if (convertMetastore) { | ||
| checkAnswer(sql(wildcardL1DirSqlStatement), | ||
| (1 to 4).map(i => Row(i, i, s"orc$i"))) |
| val wildcardL2SqlStatement = s"SELECT * FROM tbl6" | ||
| if (convertMetastore) { | ||
| checkAnswer(sql(wildcardL2SqlStatement), | ||
| (3 to 6).map(i => Row(i, i, s"orc$i"))) |
| checkAnswer(sql(wildcardL2SqlStatement), Nil) | ||
| } | ||
| } | ||
| } finally { |
There was a problem hiding this comment.
Shall we convert this try..finally with withTable like HiveParquetSourceSuite?
|
Test build #116894 has finished for PR 27130 at commit
|
|
Test build #116967 has finished for PR 27130 at commit
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
+1, LGTM. Thank you, @kevinyu98 .
Merged to master.
What changes were proposed in this pull request?
This PR aims to add these test cases for resolution of ORC table location reported by SPARK-25993
also add corresponding test cases for Parquet table.
Why are the changes needed?
The current behavior is complex, this test case suites are designed to prevent the accidental behavior change. This pr is rebased on master, the original pr is 23108
Does this PR introduce any user-facing change?
No. This adds test cases only.
How was this patch tested?
This is a new test case.