| Abstract: | Software Product Line Engineering (SPLE) is a systematic approach towards
realising software reuse. Important software assets to be reused include
architectural documents, test cases, source code and requirements. Requirements
Reuse (RR) in SPLE is the process of systematically reusing requirements previously
defined for an earlier software product and applying them to a new, slightly
different product within a similar domain. Software Requirements Specification (SRS)
documents are not easily accessible, so many researchers in this area have opted to use
other forms of requirements, including product brochures, user manuals and software
reviews, when an SRS is not
available. Unfortunately, extracting reusable features from natural language
requirements is not easy. Done manually, this task can be very complicated,
expensive, and error-prone. Many research efforts in SPLE have focused on
issues related to architecture, design and code reuse, while requirements
reuse has received slightly less attention from researchers and practitioners. Results
from an exploratory survey of software engineering (SE) practitioners indicated that the main
impediments to RR practice include the unavailability of RR tools or models for
adoption, the condition of the existing requirements to be reused (incomplete, poorly
structured, or not kept up to date), and the lack of awareness among software practitioners
of systematic RR. Additionally, a Systematic Literature Review (SLR)
of feature extraction approaches for RR in SPLE reveals a
mixture of automated and semi-automated approaches drawn from data mining and
information retrieval, only some of which come with support tools. The SLR
also reveals that most of the support tools proposed in the selected studies are not
publicly available, which makes adoption by practitioners difficult. Motivated by
these findings, this research proposes a process model for feature extraction from
natural language requirements for reuse (FENL). FENL consists of four main phases:
Accessing Requirements, Terms Extraction, Feature Identification and Formation of
Feature Model. The proposed model is demonstrated through a lab experiment that uses
online software reviews as the input. In phase 1, software reviews are fetched from
the Internet. Then, in phase 2, these reviews undergo text pre-processing. In phase
3, Latent Semantic Analysis (LSA) and TF-IDF term weighting are used to
determine document relatedness. Linguistic tagging is then applied to extract software
features, and simple clustering algorithms are applied to form groups of common
features. In phase 4, the grouped common features are passed to the
feature modelling process, and a feature diagram is constructed manually as the final
output. The results of the proposed semi-automated extraction are
compared with those obtained by a manual extraction procedure performed by
teachers and software practitioners. Comparisons are made in terms of accuracy metrics
(precision, recall and F-Measure) and time efficiency. The proposed approach obtained
a recall of up to 85.95% (78.03% average) and a precision of up to 80.16% (58.63%
average) when evaluated against the ground-truth data set created manually. Additionally,
when compared with related work, FENL obtains a comparable F-Measure.
|