Attention is all you get

For the past decade, there has been a new major architectural fad in deep learning every year or two.
One such fad of the past two years has been the transformer model, an implementation of the attention mechanism that has superseded recurrent neural networks (RNNs) in most sequence-learning applications. I'll give an overview of the model, discuss some non-physics applications, and hint at some possibilities for physics.