Dear @Kotaro Okazaki, thank you for sharing this interesting work and for trying out the Wolfram Neural Network framework! I am looking forward to more of your contributions!
Perhaps you would also be interested in a related project that was done at the Wolfram Summer School, "Generating Music with Expressive Timing and Dynamics":
http://community.wolfram.com/groups/-/m/t/1380021
It also works with MIDI but, like most approaches, as you already noticed, it uses recurrent neural networks.