[SPARK-48411][SS][PYTHON] Add E2E test for DropDuplicateWithinWatermark#46740

Closed
eason-yuchen-liu wants to merge 5 commits into apache:master from eason-yuchen-liu:DropDuplicateWithinWatermark_test

@eason-yuchen-liu
Contributor

@eason-yuchen-liu eason-yuchen-liu commented May 24, 2024

What changes were proposed in this pull request?

This PR adds an end-to-end test for the `dropDuplicatesWithinWatermark` API in Python, which was previously missing.

Why are the changes needed?

To verify the correctness of the `dropDuplicatesWithinWatermark` API.
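
For readers unfamiliar with the API, here is a minimal sketch of the kind of streaming query such a test exercises. It is not the code from this PR. The `time` column and the 2-second watermark mirror the snippet quoted later in the review thread; the `id` column, the DDL schema string, the query name, the memory sink, and the helper name `build_dedup_query` are assumptions made for illustration.

```python
def build_dedup_query(spark, input_dir, query_name="dedup_sketch"):
    """Start a streaming query that drops duplicate ids within the watermark delay.

    `spark` is expected to be an active SparkSession. The schema string,
    the `id` column, and the query name are hypothetical.
    """
    # Streaming file sources need an explicit schema; a DDL string is accepted.
    df = (
        spark.readStream
        .schema("time TIMESTAMP, id INT")
        .csv(input_dir)
    )
    return (
        # A watermark on the event-time column must be set before deduplicating.
        df.withWatermark("time", "2 seconds")
        # Rows sharing an `id` are deduplicated within the watermark delay.
        .dropDuplicatesWithinWatermark(["id"])
        .writeStream
        .queryName(query_name)
        # The memory sink exposes results as a temp table named after the query.
        .format("memory")
        .outputMode("append")
        .start()
    )
```

A test built on this would typically call `processAllAvailable()` on the returned query, then `stop()`, and finally inspect the in-memory table with `spark.sql("SELECT * FROM dedup_sketch")`.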

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Passed:

python/run-tests --testnames pyspark.sql.tests.streaming.test_streaming
python/run-tests --testnames pyspark.sql.tests.connect.streaming.test_parity_streaming

Was this patch authored or co-authored using generative AI tooling?

No.

Contributor Author

20 seconds is necessary to make sure the task runs at least once.

@eason-yuchen-liu eason-yuchen-liu marked this pull request as ready for review May 28, 2024 20:36
@anishshri-db
Contributor

@eason-yuchen-liu - same here. It seems like GitHub Actions is not running the tests as expected?

@anishshri-db
Contributor

cc - @HeartSaVioR

@WweiL
Contributor

WweiL commented May 29, 2024

@eason-yuchen-liu, could you please set up the CI as shown in the error message: https://github.com/apache/spark/pull/46740/checks?check_run_id=25522033987
Thank you!

@eason-yuchen-liu eason-yuchen-liu force-pushed the DropDuplicateWithinWatermark_test branch from c80987f to 7d031b8 Compare May 29, 2024 20:26
Comment thread python/pyspark/sql/tests/streaming/test_streaming.py
    .csv("python/test_support/sql/streaming/time")
)
q1 = (
    df.withWatermark("time", "2 seconds")
Contributor

Nit: can you move `withWatermark` and `outputMode` to new lines? Thanks.

Contributor

@HeartSaVioR HeartSaVioR left a comment

+1

@HeartSaVioR
Contributor

Thanks, merging to master!
