Abstract
Musical source separation deals with extracting the individual musical signals from a mixture. One efficient way to attain this goal is to decompose the mixture over a dictionary of basis functions that inherently describe the instruments. Usually, a unique function, called a note-specific atom, is synthesized for each note of each instrument. In this paper, a sine-harmonic model is used to synthesize note-specific atoms, and the note’s fundamental frequency is used as prior information to determine the model parameters. To calculate these parameters, the training signal spectrum is processed only around the main harmonics of the note. Experimental results demonstrate that the proposed method synthesizes note-specific atoms much faster without decreasing the source separation quality, and that it can also eliminate single-frequency noise from training signals.
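The idea of fitting a harmonic model only around the known harmonics can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the cosine-with-phase form of the model, the fixed search width `search_bins`, and the peak-picking rule are all assumptions made for illustration. The sketch evaluates the DFT only at a few bins near each multiple of the given fundamental frequency, reads an amplitude and phase from the strongest bin, and synthesizes a unit-norm atom from those parameters.

```python
import cmath
import math


def dft_bin(signal, k):
    """Return the single DFT coefficient of `signal` at integer bin k."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(signal))


def estimate_harmonics(signal, fs, f0, n_harmonics, search_bins=2):
    """Estimate (frequency, amplitude, phase) per harmonic of a known f0.

    Only bins within `search_bins` of each expected harmonic are examined
    (an illustrative choice), so energy elsewhere in the spectrum is ignored.
    """
    n = len(signal)
    params = []
    for h in range(1, n_harmonics + 1):
        center = round(h * f0 * n / fs)
        if center + search_bins >= n // 2:
            break  # harmonic too close to the Nyquist frequency
        lo = max(1, center - search_bins)
        hi = center + search_bins
        candidates = [(k, dft_bin(signal, k)) for k in range(lo, hi + 1)]
        k_peak, x_peak = max(candidates, key=lambda kx: abs(kx[1]))
        freq = k_peak * fs / n
        amp = 2.0 * abs(x_peak) / n      # amplitude of a real sinusoid
        phase = cmath.phase(x_peak)      # phase of the cosine component
        params.append((freq, amp, phase))
    return params


def synthesize_atom(params, fs, n):
    """Sum the estimated sinusoids and scale the result to unit L2 norm."""
    atom = [sum(a * math.cos(2 * math.pi * f * i / fs + p)
                for f, a, p in params)
            for i in range(n)]
    norm = math.sqrt(sum(v * v for v in atom))
    return atom if norm == 0 else [v / norm for v in atom]
```

Under these assumptions, a stationary tone in the training signal that lies outside every search window never enters the parameter estimate, so it does not appear in the synthesized atom. Only a handful of bins per harmonic are evaluated rather than the full spectrum.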
Cite this article
Azamian, M., Kabir, E. Synthesizing the note-specific atoms based on their fundamental frequency, used for single-channel musical source separation. Multimed Tools Appl 78, 17929–17948 (2019). https://doi.org/10.1007/s11042-018-7060-8
DOI: https://doi.org/10.1007/s11042-018-7060-8