Digital speech watermarking for online speaker recognition systems

Speaker recognition is popular and feasible for online applications such as telephone or network services. However, online speaker recognition systems suffer from two main problems: low recognition performance and several vulnerable points. Although some of these points can be secured by digital speech watermark...

Bibliographic Details
Main Author:	Nematollahi, Mohammad Ali
Format:	Thesis
Language:	English
Published:	2015
Online Access:	http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158.pdf

_version_	1846215927479140352
author	Nematollahi, Mohammad Ali
author_facet	Nematollahi, Mohammad Ali
author_sort	Nematollahi, Mohammad Ali
description	Speaker recognition is popular and feasible for online applications such as telephone or network services. However, online speaker recognition systems suffer from two main problems: low recognition performance and several vulnerable points. Although some of these points can be secured by digital speech watermarking, applying a robust watermark can still seriously degrade recognition performance. The main aim of this thesis was to improve the security of the communication channel, the robustness, and the recognition performance of online speaker recognition systems by applying digital speech watermarking. A Multi-Factor Authentication (MFA) method was used that combines a PIN and a voice biometric through the watermarks. To this end, a double digital speech watermarking scheme was developed that embeds semi-fragile and robust watermarks simultaneously in the speech signal to provide tamper detection and proof of ownership, respectively. For the blind semi-fragile technique, the Discrete Wavelet Packet Transform (DWPT) and Quantization Index Modulation (QIM) were used to embed the watermark in an angle of the wavelet sub-bands where more speaker-specific information was available. For watermarking the encrypted PIN in the voice, a blind and robust technique based on DWPT and multiplication was used: the PIN was embedded by manipulating the amplitude of the wavelet sub-bands where less speaker-specific information was available. A frame selection technique was also applied to weigh the amount of speaker-specific information in each speech frame. In this technique, Linear Predictive Analysis (LPA) was applied to separate the system features (formants) from the source features (residual errors) of the speech frames. A frequency-weighted function was then used to quantify the formants, and high-order correlation and high-order statistics were used to weight the residual errors. Frames with lower weights could be ignored by the online speaker recognition system but used for digital speech watermarking. The TIMIT, MIT, and MOBIO speech corpora were used to evaluate the developed systems. The experimental results showed that the combination of DWPT and multiplication was more robust than other robust watermarking techniques, such as the Discrete Wavelet Transform (DWT) with Singular Value Decomposition (SVD) and the Lifting Wavelet Transform (LWT) with SVD, against attacks such as filtering, additive noise, compression, re-quantization, resampling, and other signal processing operations. Furthermore, this technique caused less degradation in speaker verification and identification performance, 1.16% and 2.52% respectively. For the semi-fragile watermark, the degradation in speaker verification and identification was 0.39% and 0.97% respectively, which is negligible. Twenty percent of the speech frames could be watermarked without serious degradation of recognition performance. The identification rate and Equal Error Rate (EER) were improved to 100% and 0% respectively by applying digital speech watermarking. In conclusion, digital speech watermarking can enhance the security of online speaker recognition systems against spoofing and communication attacks while improving recognition performance.
format	Thesis
id	oai:psasir.upm.edu.my:65614
institution	Universiti Putra Malaysia
language	English
publishDate	2015
record_format	eprints
spelling	oai:psasir.upm.edu.my:65614 2025-04-17T06:46:11Z http://psasir.upm.edu.my/id/eprint/65614/ Digital speech watermarking for online speaker recognition systems Nematollahi, Mohammad Ali 2015-06 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158.pdf Nematollahi, Mohammad Ali (2015) Digital speech watermarking for online speaker recognition systems. Doctoral thesis, Universiti Putra Malaysia.
spellingShingle	Nematollahi, Mohammad Ali Digital speech watermarking for online speaker recognition systems
title	Digital speech watermarking for online speaker recognition systems
title_full	Digital speech watermarking for online speaker recognition systems
title_fullStr	Digital speech watermarking for online speaker recognition systems
title_full_unstemmed	Digital speech watermarking for online speaker recognition systems
title_short	Digital speech watermarking for online speaker recognition systems
title_sort	digital speech watermarking for online speaker recognition systems
url	http://psasir.upm.edu.my/id/eprint/65614/1/FK%202015%20158.pdf
url-record	http://psasir.upm.edu.my/id/eprint/65614/
work_keys_str_mv	AT nematollahimohammadali digitalspeechwatermarkingforonlinespeakerrecognitionsystems
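
Illustrative note: the abstract states that the semi-fragile watermark is embedded with QIM in an angle of the wavelet sub-bands. The record gives no further detail, so the following is only a minimal sketch of generic QIM applied to the angle of a coefficient pair. The function names, the pairing of two coefficients into one angle, and the step size are assumptions made for illustration, not the thesis's actual scheme; the DWPT stage is omitted entirely.

```python
import math

def qim_quantize(value: float, bit: int, step: float) -> float:
    """Snap value to the nearest point of the lattice assigned to bit.

    Bit 0 uses multiples of step; bit 1 uses the same lattice shifted by step/2.
    """
    offset = bit * step / 2.0
    return step * round((value - offset) / step) + offset

def embed_bit_in_angle(a: float, b: float, bit: int, step: float):
    """Hypothetical helper: embed one bit in the angle of the pair (a, b).

    The magnitude is kept, only the angle is quantized. The step should divide
    2*pi evenly so that both lattices survive angle wrap-around.
    """
    radius = math.hypot(a, b)
    theta = qim_quantize(math.atan2(b, a), bit, step)
    return radius * math.cos(theta), radius * math.sin(theta)

def extract_bit_from_angle(a: float, b: float, step: float) -> int:
    """Blind extraction: pick the bit whose lattice is closest to the angle."""
    theta = math.atan2(b, a)
    d0 = abs(theta - qim_quantize(theta, 0, step))
    d1 = abs(theta - qim_quantize(theta, 1, step))
    return 0 if d0 <= d1 else 1
```

Extraction needs only the step size, not the original signal, which is what "blind" means in the abstract. A larger step tolerates more distortion before the bit flips but changes the coefficients more; a semi-fragile scheme would pick a step small enough that tampering is detected.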
