eos early stop

#6
by Serpient - opened

While checking your EOS early-stop mechanism, I noticed some odd logic, and I'm not sure whether it is intentional. If transfer_index.any(), cur_x is updated; then, if all the EOS early-stop conditions are met, final_x is returned. However, final_x here is x[:, :total_length][:, : eos_pos + 1]. Given that x is only updated once each block is completely unmasked, the tokens decoded in the current block have not yet been written to x when final_x is assigned.

Just a note: cur_x is a view of x, so any modifications to cur_x will also affect x.

Therefore, the explicit assignment back to x on this line is redundant and could cause some confusion.

I suggest we remove it to improve clarity. WDYT @utdawn
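
For anyone reading along, here is a minimal sketch of the view behaviour described above. It assumes x is a PyTorch tensor and that cur_x is taken with basic slicing; the shapes, block bounds, mask, and token values are made up for illustration and are not from the repo.

```python
import torch

# Hypothetical shapes and values, chosen only for illustration.
x = torch.zeros(1, 8, dtype=torch.long)
block_start, block_end = 2, 6

# Basic slicing returns a view that shares storage with x.
cur_x = x[:, block_start:block_end]

# Stand-ins for the transfer mask and the newly decoded tokens.
transfer_index = torch.tensor([[True, False, True, False]])
new_tokens = torch.tensor([[11, 12, 13, 14]])

# An in-place masked write through the view also updates x,
# so no explicit assignment back to x is needed.
cur_x[transfer_index] = new_tokens[transfer_index]

# Same underlying memory as the corresponding slice of x.
shares_storage = cur_x.data_ptr() == x[:, block_start:block_end].data_ptr()

# Counter-example: torch.where builds a NEW tensor, so writes to it
# do not reach x. Only in-place writes through the view do.
rebound = torch.where(transfer_index, new_tokens, cur_x)
rebound[0, 1] = 99

print(x.tolist())
print(shares_storage)
```

Note that this only holds while cur_x is modified in place. If cur_x were ever rebound to a freshly created tensor, x would no longer see the changes and an explicit write-back would be required.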

Oh, I see, thanks for the clarification.
