I generated a large number of videos with Wan2.1-14B and scored them with the 13B model, and almost all of them received good scores. Does this mean that the videos generated by Wan2.1-14B are already better than the data prepared for your training set, so that your reward model can no longer be used to improve Wan2.1-14B?