Feature extraction from natural language to aid requirements reuse in software product lines engineering / Noor Hasrina Bakar

Software Product Lines Engineering (SPLE) is a systematic approach to realising software reuse. Important software assets to be reused include architectural documents, test cases, source code, and requirements. Requirements Reuse (RR) in SPLE is the process of systematically reusing previ...


Bibliographic details
Main author: Noor Hasrina, Bakar
Format: Thesis
Published: 2016
Subjects:
author Noor Hasrina, Bakar
description Software Product Lines Engineering (SPLE) is a systematic approach to realising software reuse. Important software assets to be reused include architectural documents, test cases, source code, and requirements. Requirements Reuse (RR) in SPLE is the process of systematically reusing requirements previously defined for an earlier software product and applying them to a new, slightly different product within a similar domain. Software requirements specification (SRS) documents are not easily accessible, so many researchers in this area use other forms of requirements, including product brochures, user manuals, and software reviews, when an SRS is not available. Unfortunately, extracting reusable features from natural language requirements is not easy. Done manually, the task can be complicated, expensive, and error-prone. Many research efforts in SPLE have focused on architecture, design, and code reuse, while requirements reuse has received slightly less attention from researchers and practitioners. Results from an exploratory survey of SE practitioners indicate that the main impediments to RR practice include the unavailability of RR tools or models for adoption, the condition of the existing requirements to be reused (incomplete, poorly structured, or not kept up to date), and the lack of awareness of systematic RR among software practitioners. Additionally, a Systematic Literature Review (SLR) of feature extraction approaches for RR in SPLE reveals a mixture of automated and semi-automated approaches drawn from data mining and information retrieval, with only some approaches accompanied by support tools. The SLR also reveals that most of the support tools proposed in the selected studies are not publicly available, which makes adoption by practitioners difficult.
Motivated by these findings, this research proposes a process model for feature extraction from natural language requirements for reuse (FENL). FENL consists of four main phases: Accessing Requirements, Terms Extraction, Feature Identification, and Formation of Feature Model. The proposed model is demonstrated through a lab experiment that uses online software reviews as input. In phase 1, software reviews are fetched from the Internet. In phase 2, these reviews undergo text pre-processing. In phase 3, Latent Semantic Analysis (LSA) and tf-idf term weighting are used to determine document relatedness; linguistic tagging is then applied to extract software features, followed by simple clustering algorithms that form groups of common features. In phase 4, the grouped common features are passed to the feature modelling process, and a manually constructed feature diagram is the final output. The results of the proposed semi-automated extraction are compared with those obtained by manual extraction performed by teachers and software practitioners. Comparisons are made in terms of accuracy metrics (precision, recall, and F-measure) and time efficiency. The proposed approach obtained a recall of up to 85.95% (78.03% average) and a precision of up to 80.16% (58.63% average) when evaluated against the manually created truth data set. Additionally, FENL obtains an F-measure comparable to those of related works.
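The accuracy metrics named in the description can be illustrated with a minimal sketch. It assumes that extracted features and the manually created truth set are compared as sets of exact feature names; the record does not say how the thesis matches features, so this matching rule, the function name precision_recall_f1, and the sample feature names are all hypothetical and serve only as illustration.

```python
def precision_recall_f1(extracted, truth):
    """Score extracted features against a manually built truth set.

    Assumption: a feature counts as correct only on an exact name match.
    """
    extracted, truth = set(extracted), set(truth)
    true_positives = len(extracted & truth)

    # Precision: share of extracted features that appear in the truth set.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of truth-set features that were extracted.
    recall = true_positives / len(truth) if truth else 0.0

    # F-measure: harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical feature names, not taken from the thesis.
extracted = {"upload photo", "share album", "edit profile", "send message"}
truth = {"upload photo", "share album", "edit profile", "create event", "tag friend"}

p, r, f = precision_recall_f1(extracted, truth)
print(f"precision={p:.2f} recall={r:.2f} F-measure={f:.2f}")
```

In this toy case three of the four extracted features are in the truth set, and three of the five truth features are found. Note that an F-measure computed from averaged precision and recall is generally not the same as the average of per-case F-measures, so the averages quoted in the description should not be combined this way to infer the thesis's reported F-measure.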
format Thesis
id oai:studentsrepo.um.edu.my:6674
institution Universiti Malaya
publishDate 2016
record_format eprints
spelling oai:studentsrepo.um.edu.my:6674 2019-03-05T00:32:02Z Feature extraction from natural language to aid requirements reuse in software product lines engineering / Noor Hasrina Bakar Noor Hasrina, Bakar QA76 Computer software T Technology (General) 2016 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/6674/4/Thesis_Aug2016NHB.pdf Noor Hasrina, Bakar (2016) Feature extraction from natural language to aid requirements reuse in software product lines engineering / Noor Hasrina Bakar. PhD thesis, University of Malaya.
http://studentsrepo.um.edu.my/6674/
title Feature extraction from natural language to aid requirements reuse in software product lines engineering / Noor Hasrina Bakar
topic QA76 Computer software
T Technology (General)
url-record http://studentsrepo.um.edu.my/6674/