create a specially engineered relative positional bias for the fine transformer, so that coarse and fine sequences learn to attend to each other based on their relative distance
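
A minimal sketch of one way such a bias could be built, not the code from this commit. It assumes the coarse and fine token sequences are concatenated, that each is flattened timestep-major (so every `num_quantizers` consecutive tokens share a timestep), and that the bias is looked up from a table indexed by clipped relative timestep distance. All function names, parameters, and the plain-list `table` (a stand-in for learned parameters) are hypothetical.

```python
def timestep_positions(seq_len, num_quantizers):
    # Assumption: tokens are flattened timestep-major, so each group of
    # `num_quantizers` consecutive tokens belongs to the same timestep.
    return [i // num_quantizers for i in range(seq_len)]


def relative_distance_matrix(coarse_len, fine_len, num_coarse_q, num_fine_q):
    # Put coarse and fine tokens on a shared timestep axis, then take
    # pairwise differences (query timestep minus key timestep).
    positions = (
        timestep_positions(coarse_len, num_coarse_q)
        + timestep_positions(fine_len, num_fine_q)
    )
    return [[q - k for k in positions] for q in positions]


def build_bias(coarse_len, fine_len, num_coarse_q, num_fine_q, max_distance, table):
    # `table` holds 2 * max_distance + 1 values, one per clipped distance
    # in [-max_distance, max_distance]. In a real model these would be
    # learned (and usually per attention head); here it is a plain list.
    rel = relative_distance_matrix(coarse_len, fine_len, num_coarse_q, num_fine_q)
    bias = []
    for row in rel:
        clipped = [max(-max_distance, min(max_distance, d)) for d in row]
        bias.append([table[d + max_distance] for d in clipped])
    return bias


# Usage: 2 timesteps, with 2 coarse and 3 fine quantizers per timestep.
example_table = [-1.0, 0.0, 1.0]  # hypothetical values for distances -1, 0, +1
example_bias = build_bias(4, 6, 2, 3, max_distance=1, table=example_table)
```

The resulting matrix would be added to the attention logits before the softmax. Because positions are measured in timesteps rather than raw token indices, a coarse token and a fine token from the same timestep sit at distance zero even though they are far apart in the concatenated sequence.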