Conversation
JiushengChen
left a comment
Is #26 still an issue with this multi-process change?
feihugis
left a comment
@NickNickGo It looks good to me in general. One major question is about the output order: we need to make sure the output order stays the same as before.
fastseq_cli/transformers_generate.py
Outdated
    data_queue = Queue()
    msg_queue = Queue()
    p_list = []
    threads = cpu_count()
It may be better to allow users to specify the number of CPUs.
It shouldn't make a big difference, right? Although I can create an argument for it.
There should be some difference. Spawning too many processes wastes CPU resources, and it also brings overhead to create and manage these processes and to sync data across them.
There is a parameter defined for this in fairseq's parallel support. GPU machines have 32/64 or more CPUs. Do you get better speed with threads > 1?
I didn't notice significant changes in overall time when the number of threads was changed.
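The reviewer's suggestion of making the worker count user-configurable could be sketched as below. This is only an illustration: the flag name `--preprocess-workers` is an assumption, not part of the actual fastseq_cli interface.

```python
# Hypothetical sketch: expose the worker count as a CLI argument instead of
# always using cpu_count(). The --preprocess-workers flag name is assumed
# for this example only.
import argparse
from multiprocessing import cpu_count

parser = argparse.ArgumentParser()
parser.add_argument(
    "--preprocess-workers",
    type=int,
    default=cpu_count(),  # fall back to the old behavior by default
    help="number of preprocessing worker processes")

# In the real script this would parse sys.argv; [] keeps the sketch runnable.
args = parser.parse_args([])
threads = args.preprocess_workers
```

Defaulting to `cpu_count()` preserves the current behavior while letting users cap the process pool on shared machines.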
fastseq_cli/transformers_generate.py
Outdated
    class IOProcess (Process) :
        """ Write detokenized output to file in order."""
        def __init__ (self, msg_queue, fout):
Suggested change:
-        def __init__ (self, msg_queue, fout):
+        def __init__(self, msg_queue, fout):
Please remove the similar extra spaces in the other places as well.
fastseq_cli/transformers_generate.py
Outdated
    def run (self) :
        while (True) :
            ind, dec = self.msg_queue.get()
            if dec == GENERATE_FINISHED :
Suggested change:
-            if dec == GENERATE_FINISHED :
+            if dec == GENERATE_FINISHED:
Linting checks are clean. Could you please add any additional formatting requirements to the rcfile? This will reduce formatting iterations.
Good suggestion. The rcfile is enhanced in #38. One thing it does not cover is the whitespace between 1) a function name and its parentheses and 2) a variable and a colon, which you will need to check and remove manually, but that should be easy.
Multi-worker preprocess: BART Large, BS 128, 1k samples; throughput improved from 11.8 (from #40) to 12.3.
feihugis
left a comment
Will the numbers in the benchmarking scripts need to be updated?
fastseq_cli/transformers_generate.py
Outdated
        return_tensors="pt",
        truncation=True,
        padding="max_length")
Add these parameters to the constructor instead of hard-coding them.
@feihugis Thanks, I incorporated all the nitpicks.
feihugis
left a comment
Last comments: 1) update the benchmarking scripts, as this PR will change the performance of all the transformers models; 2) add docs for the new classes and public APIs (e.g. a short description of the API, and the types and meaning of the input args and returns).
fastseq_cli/transformers_generate.py
Outdated
    def __init__(self, examples, tokenizer, model_name, prefix):
        self.examples = examples
        self.tokenizer= tokenizer
        self.model_name = model_name
        self.prefix = prefix
        self.return_tensors="pt"
        self.truncation=True
        self.padding="max_length"
I mean something like:

    def __init__(self, examples, tokenizer, model_name, prefix,
                 return_tensors, truncation, padding):
        ...
        self.return_tensors = return_tensors
        ...
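A fuller version of this suggested refactor could keep the old hard-coded values as defaults, so existing call sites keep working. This is a sketch only; the class name `PreprocessDataset` is an assumption, not the name used in the PR.

```python
# Illustrative refactor: tokenization options become constructor parameters,
# with the previously hard-coded values as defaults. PreprocessDataset is a
# placeholder name for this example.
class PreprocessDataset:
    def __init__(self, examples, tokenizer, model_name, prefix,
                 return_tensors="pt", truncation=True,
                 padding="max_length"):
        self.examples = examples
        self.tokenizer = tokenizer
        self.model_name = model_name
        self.prefix = prefix
        # Callers can now override these instead of editing the class.
        self.return_tensors = return_tensors
        self.truncation = truncation
        self.padding = padding
```

Existing callers get the old behavior for free, while tests or new models can pass e.g. `padding=False`.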
Only HF benchmarks:
After:
@NickNickGo One minor question: are the numbers in the benchmark scripts based on #40? If not, the benchmark script may fail when both PRs are merged.
        self.return_tensors="pt"
        self.truncation=True
Why hard-code these here? We can make these two parameters of the constructor.
    class IOProcess (Process):
        """ Write detokenized output to file in order."""
        def __init__(self, msg_queue, fout):
    class PostProcess(Process):
        """ Parallel detokenization """
        def __init__(self, tokenizer, data_queue, msg_queue,
Nit: "Async detokenization" may be a more accurate docstring here.
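The `PostProcess` worker quoted above could look roughly like the sketch below. Assumptions: `data_queue` carries `(index, token_ids)` items plus a `GENERATE_FINISHED` sentinel, and results are forwarded to `msg_queue` for the ordered writer; the sentinel handling and the `tokenizer.decode` call are illustrative, not the PR's exact implementation.

```python
# Hedged sketch of a parallel/async detokenization worker. The sentinel is
# re-queued so that sibling workers and the writer also see it and stop.
from multiprocessing import Process, Queue

GENERATE_FINISHED = "done"  # sentinel value, assumed for this sketch

class PostProcess(Process):
    """Detokenize generated token ids in a separate process."""

    def __init__(self, tokenizer, data_queue, msg_queue):
        super().__init__()
        self.tokenizer = tokenizer
        self.data_queue = data_queue
        self.msg_queue = msg_queue

    def run(self):
        while True:
            ind, token_ids = self.data_queue.get()
            if token_ids == GENERATE_FINISHED:
                # Propagate the sentinel to other workers and the writer.
                self.data_queue.put((ind, GENERATE_FINISHED))
                self.msg_queue.put((ind, GENERATE_FINISHED))
                break
            text = self.tokenizer.decode(token_ids)
            self.msg_queue.put((ind, text))
```

Keeping detokenization out of the GPU loop is what produces the throughput gain reported earlier in the thread; the indices attached to each message let the writer restore the original order.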