This issue pertains to ads-github-cache as included in ads-github-tools-0.3.3.
Updating paged chunks of data associated with a given GitHub v3 API resource path (e.g., /user/repos) can take a significant amount of time when the number of pages is large. This is true even though the current mechanism checks the "ETag" of the cached object with the GitHub API to avoid pulling down pages of data that have not changed. Most of the time is spent waiting on the round-trip latency of the individual calls to the GitHub API, which are currently all performed sequentially.
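For reference, the per-page check amounts to a conditional GET. The following is a minimal sketch in Python (not necessarily the implementation language of ads-github-cache; the function name and parameters are illustrative, not part of the tool):

```python
import requests  # third-party HTTP client, used here only for illustration

API_ACCEPT = "application/vnd.github.v3+json"

def fetch_page_if_changed(url, cached_etag, token):
    """Conditionally fetch one page of a paged GitHub v3 API resource.

    Sending the cached ETag in an If-None-Match header lets GitHub
    answer 304 Not Modified when the page is unchanged, so the page
    body is only transferred when it actually differs.
    """
    headers = {"Accept": API_ACCEPT, "Authorization": "token " + token}
    if cached_etag:
        headers["If-None-Match"] = cached_etag
    resp = requests.get(url, headers=headers)
    if resp.status_code == 304:
        return cached_etag, None  # cached copy is still current
    resp.raise_for_status()
    return resp.headers.get("ETag"), resp.json()
```

Even with this in place, each page still costs one full round trip, and those round trips currently happen one after another.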
It is expected that the total time could be significantly reduced if the ads-github-cache program were structured in such a way that it could make N requests at a time against the GitHub API, where N would be dynamically determined based on some TBD criteria (such as the number of available CPU cores, etc.)
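Continuing the sketch above, one plausible way to structure the N-at-a-time requests is a thread pool, since the work is I/O-bound. The default N below is derived from the CPU count purely as a placeholder for the TBD criteria; a value informed by GitHub's rate-limit headers might be a better choice:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def refresh_pages(pages, token, max_workers=None):
    """Check every cached page concurrently rather than sequentially.

    `pages` is a list of (url, cached_etag) tuples.  Results come back
    in the same order as the input, so they can be merged back into the
    cache positionally.
    """
    # Placeholder heuristic for N; the real criteria are still TBD.
    n = max_workers or min(32, (os.cpu_count() or 1) * 4)
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(
            lambda page: fetch_page_if_changed(page[0], page[1], token),
            pages,
        ))
```

Because unchanged pages come back as cheap 304 responses, the dominant cost is latency rather than bandwidth, which is exactly the case where overlapping the requests should help most.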