# AttentionLayer is a thing in Mathematica and that's awesome

Posted 10 days ago
Today I was trying to understand Transformer architectures, and while watching a video explanation of the seminal paper "Attention Is All You Need", it occurred to me that the most difficult part to understand is the attention layer. So I thought, "Wouldn't it be nice if there were an AttentionLayer function available in Mathematica?", and then I checked whether there happens to be one, just in case. And sure enough, there is one.

I think the way Wolfram Research models machine learning concepts is very cool, and I just wanted to share my enthusiasm. You guys make implementing machine learning clean and elegant, keep it up!