ELE-740 - paginate upload of source freshness data #850
elongl merged 2 commits into elementary-data:master
elongl left a comment
Thanks a lot for opening this PR.
I checked and it does indeed resolve the issue!
Added one very small suggestion but it's not a must, let me know if you want to add it, else I'll merge.
    quiet=True,
)
chunk_size = 100
for chunk in range(0, len(results), chunk_size):
I think it would be cool to add a progress bar like we do here. What do you think?
Hey, makes sense! I will do that.
Hey, @elongl
I added the progress bar. I tested and it's working as expected, but let me know if there is something I should change.
I based the title on the completion message: Uploaded source freshness results successfully
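For readers following the thread, here is a minimal sketch of the pattern being discussed: split the results into fixed-size chunks, upload each chunk, and report progress after each one. The names `iter_chunks`, `upload_in_chunks`, `upload_chunk`, and `on_progress` are hypothetical and are not taken from the Elementary codebase; the PR itself presumably uses the project's existing progress bar rather than a callback.

```python
from typing import Callable, Iterator, Optional, Sequence, TypeVar

T = TypeVar("T")


def iter_chunks(items: Sequence[T], chunk_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `chunk_size` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start : start + chunk_size]


def upload_in_chunks(
    results: Sequence[T],
    upload_chunk: Callable[[Sequence[T]], None],  # hypothetical uploader
    chunk_size: int = 100,
    on_progress: Optional[Callable[[int, int], None]] = None,  # hypothetical hook
) -> int:
    """Upload `results` chunk by chunk and return the number of items sent."""
    total = len(results)
    uploaded = 0
    for chunk in iter_chunks(results, chunk_size):
        upload_chunk(chunk)
        uploaded += len(chunk)
        if on_progress is not None:
            # A progress bar would be advanced here instead of a callback.
            on_progress(uploaded, total)
    return uploaded
```

Keeping the slicing in its own helper means the chunk boundaries can be checked without running any real upload command.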
Thanks a lot for contributing!
This solves issue ELE-740
I am not really sure what the best way to determine the chunk_size for the upload of the source data is. What I did:
- [...] MAX_ARG for Linux systems is, and according to some Stack Overflow threads it's generally somewhere between 100k and 200k characters. It's bigger on macOS (which finally explains why I wasn't getting the error locally while it kept appearing in the GH Action), but since we are running it on a GitHub instance, and to be on the safe side, it makes sense to go with the lower value.
- [...] sources.json file from our project we have: [...]
I normally don't like the idea of hard-coding it. We could also:
- [...] getconf ARG_MAX on Unix-based systems)
but IMO this would make the code hard to read and over-engineered.
That said, if you have other suggestions, I'd be happy to hear them.
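As an illustration of the alternative mentioned above (reading the limit from the system instead of hard-coding a chunk size), here is a rough sketch. It assumes a Unix-like system where `os.sysconf("SC_ARG_MAX")` is available and falls back to a default otherwise. The function names, the 131072 fallback, the per-item size estimate, and the 0.5 safety factor are all assumptions for illustration, not values from this PR.

```python
import os


def read_arg_max(default: int = 131072) -> int:
    """Return the system's ARG_MAX, or `default` (an assumed value) if unavailable."""
    try:
        value = os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on some platforms, or the name is unsupported.
        return default
    return value if value > 0 else default


def chunk_size_for(arg_max: int, max_item_chars: int, safety_factor: float = 0.5) -> int:
    """Estimate how many items fit in one command invocation.

    `safety_factor` leaves headroom because the limit also has to cover
    the environment and the other command-line arguments.
    """
    if max_item_chars <= 0:
        raise ValueError("max_item_chars must be positive")
    budget = int(arg_max * safety_factor)
    return max(1, budget // max_item_chars)
```

With a 100,000-character limit and items of at most 500 characters, this yields a chunk size of 100. Whether the extra logic is worth it compared to a constant is exactly the judgment call raised above.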
Then, I am simply running a for loop to call the command uploading the sources multiple times. Is there a better way to do this?
Testing
I didn't find any testing guidelines for the CLI (just for the dbt package). To test it, I installed the altered package and ran the command on our dbt project. It's working as expected. Let me know how I can better test it.
Comments
From what I see the Elementary code isn't over commented, so I also didn't comment the steps in the code.
Please let me know what I could do better or am doing wrong.