Speaker diarization is the process of partitioning an input audio stream into homogeneous segments according to speaker identity. Voice activity detection (VAD), speaker segmentation, and speaker clustering are the main components of a speaker diarization system. Several methods exist for speaker segmentation, but most speaker diarization systems use segmentation based on the Bayesian Information Criterion (BIC). The main goal of this paper is to propose a new speaker segmentation method that is faster than current methods (e.g. BIC) while retaining acceptable accuracy. The proposed method is based on the pitch frequency of the speech. Its accuracy is similar to that of common speaker segmentation methods, but its computational cost is much lower. We show that our method is about 2.4 times faster than the BIC-based speaker segmentation method, while the accuracy of the pitch-based method is 71%, about 1% higher than that of the BIC-based method.
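To make the idea of pitch-based segmentation concrete, the following is a minimal illustrative sketch, not the algorithm of the paper, whose details are not given in this abstract. It assumes a basic autocorrelation pitch estimator and a simple rule that flags a speaker change where the mean pitch of two adjacent windows differs by more than a threshold. The function names, the 60-400 Hz search range, the window length, and the 30 Hz threshold are all hypothetical choices made for illustration.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the pitch (Hz) of one frame from its autocorrelation peak.

    Returns 0.0 when no positive peak is found (treated as unvoiced).
    The fmin/fmax search range is an assumption, not taken from the paper.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


def pitch_change_points(pitches, window=10, threshold_hz=30.0):
    """Return frame indices where the mean voiced pitch changes sharply.

    For each candidate index t, the mean pitch of the `window` frames before t
    is compared with the mean of the `window` frames starting at t. Frames with
    pitch 0.0 are skipped as unvoiced. Both parameters are hypothetical.
    """
    scores = {}
    for t in range(window, len(pitches) - window + 1):
        left = [p for p in pitches[t - window:t] if p > 0]
        right = [p for p in pitches[t:t + window] if p > 0]
        if left and right:
            scores[t] = abs(sum(left) / len(left) - sum(right) / len(right))

    changes = []
    for t, score in scores.items():
        nearby = [scores[u] for u in range(t - window, t + window + 1) if u in scores]
        if score <= threshold_hz or score != max(nearby):
            continue  # keep only local maxima above the threshold
        if changes and t - changes[-1] < window:
            continue  # suppress duplicate detections on a plateau
        changes.append(t)
    return changes


# Usage: a synthetic 100 Hz tone sampled at 8 kHz, then a synthetic pitch track.
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(400)]
print(estimate_pitch(tone, 8000))
print(pitch_change_points([100.0] * 20 + [200.0] * 20))
```

A rule of this kind needs only one scalar per frame, which is one plausible reason a pitch-based approach can be cheaper than fitting and comparing Gaussian models over multidimensional features, as BIC-based segmentation does.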
Abdolali, B., Sameti, H., & Ghezeayagh, M. H. (2012). A Method for Rapid Pitch-based Speaker Segmentation. Journal of Advanced Defense Science & Technology, 3(1), 29-38.