Bioacoustic analyses of animal sounds produce enormous amounts of digitized acoustic data, and effective automatic processing is needed to extract the information content of the recordings. Our research focuses on the song of the Collared Flycatcher (Ficedula albicollis), and we are interested in the evolution of acoustic signals. Over the last 20 years, we have obtained hundreds of hours of recordings of bird songs collected in natural environments, and there is a permanent need for automated processing of these recordings. In this study, we chose an open-source, deep-learning image detection system to (1) find the species-specific songs of the Collared Flycatcher in the recordings and (2) detect the small, discrete elements, so-called syllables, within the song. For these tasks, we first transformed the acoustic data into spectrogram images, then trained two deep-learning models separately on our manually segmented database. The resulting models detect the songs with an intersection over union (IoU) higher than 0.8 and the syllables with an IoU higher than 0.7. This technique requires an order of magnitude less human effort in acoustic processing than the manual method used before. Thanks to the new technique, we are able to address new biological questions that require large amounts of acoustic data.
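The detection quality above is reported as intersection over union (IoU) between predicted and manually annotated regions. As a minimal illustrative sketch (not the authors' actual evaluation code), the metric for two axis-aligned time–frequency bounding boxes on a spectrogram can be computed as:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max), e.g. the time and
    frequency extents of a detected song or syllable on a
    spectrogram image. Returns a value in [0, 1].
    """
    # Overlap rectangle (empty if the boxes are disjoint).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

An IoU above 0.8 (songs) or 0.7 (syllables) thus means that a predicted box overlaps its ground-truth annotation over most of their combined area; a perfect match yields 1.0 and disjoint boxes yield 0.0.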