Loading
use 2d dynamic positional bias for fine transformer, to try to improve...
use 2d dynamic positional bias for fine transformer, to try to improve training at greater number of fine quantizers
use 2d dynamic positional bias for fine transformer, to try to improve training at greater number of fine quantizers