Digital speech watermarking for online speaker recognition systems

Speaker recognition is popular and feasible for online applications such as the telephone or network. However, low recognition performance and various vulnerable slots in online speaker recognition systems are two main problems. Although some of these slots can be secured by digital speech watermark...

Full description

Bibliographic details
Main author: Nematollahi, Mohammad Ali
Format: Thesis
Language: English
Published: 2015
Online access: http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158.pdf
author Nematollahi, Mohammad Ali
description Speaker recognition is popular and feasible for online applications over telephone or network channels. However, online speaker recognition systems suffer from two main problems: low recognition performance and various vulnerable slots. Although some of these slots can be secured by digital speech watermarking, applying a robust watermark can still seriously degrade recognition performance. The main aim of this thesis was to improve the security of the communication channel, the robustness, and the recognition performance of online speaker recognition systems by applying digital speech watermarking. A Multi-Factor Authentication (MFA) method was used that combines a PIN and a voice biometric through the watermarks. To this end, a double digital speech watermarking scheme was developed that embeds semi-fragile and robust watermarks simultaneously in the speech signal to provide tamper detection and proof of ownership, respectively. For the blind semi-fragile watermarking technique, the Discrete Wavelet Packet Transform (DWPT) and Quantization Index Modulation (QIM) were used to embed the watermark in an angle of the wavelet sub-bands where more speaker-specific information was available. For watermarking the encrypted PIN in the voice, a blind and robust digital speech watermarking technique was used that applies DWPT and multiplication: the PIN was embedded by manipulating the amplitude of the wavelet sub-bands where less speaker-specific information was available. A frame selection technique was also applied to weigh the amount of speaker-specific information in each speech frame. In this technique, Linear Predictive Analysis (LPA) was applied to separate the system features (formants) from the source features (residual errors) of the speech frames. A frequency-weighted function was then used to quantify the formants, and high-order correlation and high-order statistics were used to weight the residual errors. Frames with lower weights could be ignored by online speaker recognition systems but used for digital speech watermarking. The TIMIT, MIT, and MOBIO speech corpora were used to evaluate the developed systems. The experimental results showed that the combination of DWPT and multiplication for robust digital speech watermarking was more robust than other robust watermarking techniques, such as Discrete Wavelet Transform (DWT) with Singular Value Decomposition (SVD) and Lifting Wavelet Transform (LWT) with SVD, against attacks such as filtering, additive noise, compression, re-quantization, resampling, and other signal processing attacks. Furthermore, this technique caused less degradation in speaker verification and identification performance, 1.16% and 2.52%, respectively. For the semi-fragile watermark, the degradation in speaker verification and identification was 0.39% and 0.97%, respectively, which is negligible. Twenty percent of the speech frames could be watermarked without serious degradation of recognition performance. The identification rate and Equal Error Rate (EER) were improved to 100% and 0%, respectively, by applying digital speech watermarking. In conclusion, digital speech watermarking can enhance the security of online speaker recognition systems against spoofing and communication attacks while improving recognition performance.
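The abstract names Quantization Index Modulation (QIM) as the embedding step of the semi-fragile watermark. As a rough illustration only, the following is a minimal sketch of plain scalar QIM on an arbitrary array of values. It is not the thesis's angle-based DWPT scheme; the function names `qim_embed` and `qim_extract` and the `step` parameter are hypothetical and chosen for this example.

```python
import numpy as np


def qim_embed(values, bits, step):
    """Embed one bit per value using scalar QIM (illustrative sketch).

    Bit 0 snaps the value to the lattice {k * step};
    bit 1 snaps it to the shifted lattice {k * step + step / 2}.
    """
    values = np.asarray(values, dtype=float)
    bits = np.asarray(bits, dtype=int)
    dither = bits * (step / 2.0)  # 0 for bit 0, step/2 for bit 1
    return np.round((values - dither) / step) * step + dither


def qim_extract(values, step):
    """Blind extraction: pick the lattice whose nearest point is closer."""
    values = np.asarray(values, dtype=float)
    half = step / 2.0
    # Distance to the nearest point of the bit-0 lattice.
    d0 = np.abs(values - np.round(values / step) * step)
    # Distance to the nearest point of the bit-1 (shifted) lattice.
    d1 = np.abs(values - (np.round((values - half) / step) * step + half))
    return (d1 < d0).astype(int)
```

In this sketch the embedding changes each value by at most `step / 2`, and extraction needs neither the original values nor the embedded bits (it is blind). A perturbation smaller than `step / 4` leaves the extracted bits unchanged, while larger changes flip them, which is the general property a semi-fragile watermark relies on for tamper detection.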
format Thesis
id oai:psasir.upm.edu.my:65614
institution Universiti Putra Malaysia
language English
publishDate 2015
record_format eprints
spelling oai:psasir.upm.edu.my:65614 2025-04-17T06:46:11Z http://psasir.upm.edu.my/id/eprint/65614/ Digital speech watermarking for online speaker recognition systems Nematollahi, Mohammad Ali 2015-06 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158.pdf Nematollahi, Mohammad Ali (2015) Digital speech watermarking for online speaker recognition systems. Doctoral thesis, Universiti Putra Malaysia.
title Digital speech watermarking for online speaker recognition systems
url http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158.pdf
url-record http://psasir.upm.edu.my/id/eprint/65614/