




Real-time music accompaniment generation has a wide range of applications in the music industry, such as music education and live performances. However, automatic real-time music accompaniment generation is still understudied and often faces a trade-off between logical latency and exposure bias. In this paper, we propose SongDriver, a real-time music accompaniment generation system with neither logical latency nor exposure bias. Specifically, SongDriver divides one accompaniment generation task into two phases: 1) the arrangement phase, where a Transformer model first arranges chords for the input melody in real time and caches these chords for the next phase instead of playing them out; and 2) the prediction phase, where a CRF model generates playable multi-track accompaniment for the upcoming melody based on the previously cached chords. With this two-phase strategy, SongDriver directly generates the accompaniment for the upcoming melody, achieving zero logical latency. Furthermore, when predicting chords for a timestep, SongDriver refers to the cached chords from the first phase rather than its own previous predictions, which avoids the exposure bias problem. Since the input length is often constrained under real-time conditions, another potential problem is the loss of long-term sequential information. To make up for this disadvantage, we extract four musical features from the long-term music piece preceding the current time step as global information. In the experiments, we train SongDriver on several open-source datasets and an original àiMusic Dataset built from Chinese-style modern pop music sheets. The results show that SongDriver outperforms existing state-of-the-art (SOTA) models on both objective and subjective metrics, while significantly reducing physical latency.
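The two-phase strategy above can be sketched as a simple real-time loop. This is a minimal illustration, not the paper's implementation: `arrange_chord` stands in for the Transformer arranger and `predict_tracks` for the CRF predictor, and both are hypothetical placeholder functions. The key point shown is that the prediction phase reads only chords cached by the arrangement phase, never its own past outputs.

```python
from collections import deque

def arrange_chord(melody_frame):
    # Phase 1 (arrangement): map a melody frame to a chord symbol.
    # Placeholder for the Transformer arranger described in the paper.
    return "C" if melody_frame % 2 == 0 else "G"

def predict_tracks(melody_frame, cached_chords):
    # Phase 2 (prediction): generate multi-track accompaniment from the
    # chords cached in phase 1, not from the model's own past predictions,
    # which is what sidesteps exposure bias.
    chord = cached_chords[-1] if cached_chords else "N.C."  # no-chord fallback
    return {"piano": chord, "bass": chord.lower()}

def run_realtime(melody_stream, cache_size=4):
    chord_cache = deque(maxlen=cache_size)  # chords are cached, not played out
    accompaniment = []
    for frame in melody_stream:
        # The prediction phase fires first, using already-cached chords, so
        # the note being played now never waits on arrangement (zero logical
        # latency in the sense used above).
        accompaniment.append(predict_tracks(frame, chord_cache))
        # The arrangement phase then caches a chord for future predictions.
        chord_cache.append(arrange_chord(frame))
    return accompaniment

out = run_realtime(range(6))
```

On the first frame the cache is empty, so the sketch falls back to a no-chord symbol; from then on each prediction consumes the chord arranged one step earlier.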
