Cross-Artifact Traceability Using Lightweight Links

Sukanya Ratanotayanon
Dept. of Informatics, University of California, Irvine

Susan Elliott Sim
Dept. of Informatics, University of California, Irvine

Derek J. Raycraft
Dept. of Informatics, University of California, Irvine

Abstract

Much research in traceability has focused on following requirements and features over the early phases of the software lifecycle. There has been comparatively little work on traceability into later phases and artifacts. In this paper, we tackle the problem of traceability across artifacts, including documents and source code, and maintaining traceability links through successive changes. We have developed Zelda, a prototype for associating arbitrary lines in text-based files with a feature map. This representation can be used to link together sections from many types of artifacts and can also contain annotations and notes. Zelda automatically tracks and presents the locations of these links in subsequent versions of the artifacts. We evaluated Zelda using 25 versions of jEdit (260 KLOC). The overall precision for 419 links across the five features was 0.90 and the recall was 0.73. The average precision and recall per feature were 0.78 and 0.69 respectively.

1. Introduction

Traceability is “the ability to describe and follow the life of a requirement, in both a forwards and backwards direction” [1]. Being able to trace features and tasks throughout a software life cycle makes it possible to track what happened to them and to know where they are manifested. This ability is beneficial for many activities such as validating that a requirement is fulfilled and locating artifacts that need to be modified when maintaining a feature.

Traditionally, research in traceability is primarily concerned with mapping high-level artifacts, such as requirements specifications, architecture documents [2], related media [3, 4], and design models [5]. However, there has been relatively little work that supports traceability in later phases and artifacts, such as source code and test cases. In addition, typical traceability tools aim to provide precise information about the relationship between artifacts, which requires overhead in specifying their formal relationships before links can be made. As a result, the mechanism for capturing links is inflexible and the overhead cost of using these tools is high.

Despite our best efforts, requirements sometimes appear in the later stages of software development, such as implementation, testing, and maintenance. Their manifestations are also not restricted to early artifacts such as requirements documents. They can be found in source code, test cases, change requests, and bug reports. A mechanism for recording traceability links needs to be able to include these artifacts and to capture the links in a spontaneous manner.

Another critical problem in traceability is maintaining the links after they have been created [6, 7]. Documents and source code evolve continuously, and developers are unlikely to add traceability links to artifacts if the information will be out of date with the next change to the software.

In this paper, we present an approach for capturing and managing traceability links of features to multiple artifacts. In addition, this approach aims to automatically maintain the links over successive changes by piggybacking on a revision control system.

We implemented a prototype, Zelda, as an Eclipse plug-in. Because Zelda is integrated with a development workbench, the tool is lightweight and easy to use. To represent a feature, a feature map with hyperlinks into text documents is created. These links are lightweight in that no up-front specification of relationship is required. Information about a feature that is spread across multiple files can be collected in an ad hoc manner. This association mechanism is highly flexible and can work with any text-based software artifact, including source code. When the underlying artifacts are changed, link change propagation is done by extracting information from a Revision Control (RC) system. By analyzing the difference result between subsequent versions of artifacts, we can automatically determine the correct location of links. Zelda can present the links at program element level in a tree view for Java files, and at file and line level for all files. At file and line level, links are presented using file decorations and markers in the Eclipse editor, and a SeeSoft-style [8] visualization.

TEFSE’09, May 18, 2009, Vancouver, Canada. 978-1-4244-3741-2/09/$25.00 © 2009 IEEE. ICSE’09 Workshop.

To evaluate the effectiveness of our approach in maintaining traceability links over successive changes, we performed an empirical study using jEdit, an open source text editor written in Java (260 KLOC). In this study, we traced the evolution of five feature maps over 25 releases of jEdit, which include over 2,000 incremental revisions. The feature maps were created based on changes made in commit transactions. The resulting links in the feature maps joined both source code and other supporting text files, such as documentation and configuration files. The results show that, after 25 releases, we achieved an overall precision of 90% and an overall recall of 73% across all links in the five feature maps.

2. Background

Typically, research in traceability has been concerned with mapping high-level artifacts such as requirements specifications, architecture documents [2], related media [3, 4], and design models [5]. There are special-purpose tools, such as DOORS [9], Requisite Pro™ [10], and TOOR [11], that are designed to provide automated support for requirements traceability in high-level artifacts. Although these tools provide support for traceability across documents, their support does not extend into source code and other artifacts in later stages of the software lifecycle.

While there are a number of tools, such as FEAT [12] and Mylyn [13], that create traceability links to source code, they tend to neglect documentation. In particular, links and annotations are associated with program elements. These approaches could be complemented by tools that trace features into non-source documents to enable cross-artifact traceability.

Maintaining the links after they have been created [6, 7] is another critical problem in successfully implementing traceability throughout the software lifecycle. This problem grows in severity with the number of traceability links, which increases when the granularity of the links is made finer and changes are made frequently.

Some research addresses this issue by aiming to discover traceability links automatically from available artifacts. A variety of approaches have been used, including analyzing the runtime trace of a scenario [14], analyzing comments in source code [15], and employing information retrieval techniques to find similarities between high-level artifacts and parts of source code [16-21]. While there have been improvements, human intervention and feedback are still needed to achieve an acceptable level of accuracy in link discovery. The main drawback of these tools is that their level of granularity is too coarse, because they operate at the file level. However, scattered feature information tends to be interleaved within files as well, so these tools need to extend their reach.

3. Traceability Using Lightweight Links

Cross-artifact traceability is the problem of tracing requirements across heterogeneous software artifacts. This is an important problem because there are so many different kinds of artifacts in modern software projects. For example, a typical web application uses different programming languages, configuration scripts, and data formats. In addition, a feature can be manifested in other forms, such as documents, email, and discussion forums. In this context, traceability tools need to interact with a variety of artifacts.

3.1 Lightweight Traceability Links

Since the manifestations of a feature can be spontaneous and informal, a mechanism for tracing them must have these properties as well. In our approach, a feature map is created to collect links to scattered information of a feature at the line level. The model of a feature can be seen in Figure 1. By associating the link with a specific version of an artifact, we can track the evolution of the artifact and evolve the links along with it.

Figure 1: Feature Map and Lightweight Link

Our links are lightweight in that each one is simply an association showing that elements are related to a feature. As a result, text lines can be associated with each other without defining a relationship in advance. Associations can be added, changed, or removed as work progresses. Information or metadata specific to the feature can be added to the feature map, rather than being scattered across software artifacts. Change events and timestamps could also be recorded to represent the lifespan of a feature.
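The feature map and link model described above can be sketched as a small data structure. This is only an illustration under our own assumptions: Figure 1 is not reproduced here, so the class names (`Link`, `FeatureMap`), the field names, and the file paths in the usage note are hypothetical and are not taken from Zelda's implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    """One lightweight link: a line in a specific version of a text artifact."""
    path: str       # any text-based artifact (source, documentation, configuration)
    revision: int   # version of the artifact in which the link was created
    line: int       # 1-based line number within that version
    note: str = ""  # optional annotation attached to the link


@dataclass
class FeatureMap:
    """Collects links for one feature; no relationship type is declared up front."""
    name: str
    description: str = ""
    links: list[Link] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)

    def add(self, path: str, revision: int, line: int, note: str = "") -> Link:
        """Associate a line with this feature, ignoring exact duplicates."""
        link = Link(path, revision, line, note)
        if link not in self.links:
            self.links.append(link)
        return link

    def remove(self, link: Link) -> None:
        """Drop an association as work progresses."""
        self.links.remove(link)

    def files(self) -> list[str]:
        """Artifacts that participate in this feature."""
        return sorted({link.path for link in self.links})
```

For example, `FeatureMap("Auto indent bug fix").add("src/Indent.java", 10, 42)` records one link to line 42 of revision 10 of a hypothetical file. Storing the revision alongside the line number is what allows the link to be evolved later.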


3.2 Link Tracing Process

Software developers rarely have the time or patience to spend on non-coding activities. Therefore, traceability links must require little effort to create, and must be kept up to date automatically. To this end, we propose to incorporate the association mechanism into an integrated development environment (IDE) and to leverage a revision control system to provide change information.

Building traceability features into an IDE takes advantage of source code, and similar text documents, as central artifacts in software development. In the same manner that development activities center around source code, so too do link creation and maintenance. It should be a simple matter for a developer to activate or de-activate a representation of a feature as code is written. To add links manually, a developer selects particular lines and adds them to the relevant feature. To add links automatically, the user can choose to have links created when changes are made in real-time or when they are committed to the RC system. Both approaches can be used in combination to record links.

4. Evolving Traceability Links

The most serious issue in providing traceability links is maintaining captured links while the underlying artifacts change. This issue is especially serious with the links at this fine granularity, because changes occur constantly during software development. The captured traceability links can deteriorate very fast as development proceeds, unless significant effort is expended to keep the associations up to date.

Therefore, we provide an approach to automatically maintain the links over successive changes by piggybacking on a Revision Control (RC) system. After the links are captured in a specific version of a file, the difference information obtained from the RC system is used as our mechanism for retrieving the correct location of the links in future versions of the file. Because we work in tandem with a revision control system that maintains the artifacts, it is always possible to go back to the original content of a link target or to any intermediate version.

4.1 Determining Link Locations

Our algorithm relies on the ‘diff’ result that summarizes the differences between two files. A tool that produces the diff file is commonly included in the RC system, and is also part of the UNIX operating system. As depicted in Figure 2, we input into the comparison utility the version of the file in which the links were created and the current version of the file, in order to receive the diff result used for updating links.

Figure 2: Determining link location

An example of a diff result in the unified format can be seen in the box at the top right of Figure 2. This file contains information about the files that are compared, and whether each line exists in each version of the file being compared, version x and version y. For a range of lines, this information is grouped together into a “change chunk” starting with a header demarcated by double at signs (@@). Symbols at the beginning of the line indicate whether a line is found in only one file or the other, or both. Using the example diff result in Figure 2, a line starting with ‘-’ exists only in version x and a line starting with ‘+’ exists only in version y of the file. A line that has no starting symbol (a leading space in the unified format) exists in both files and signifies that no changes have been made. We use this information to compute the location of the link in the newer version of the file, in this example, version y.
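To make the unified format concrete, the following sketch produces a diff for two tiny versions of a file and tags every line of each change chunk. It uses Python's standard `difflib` module as a stand-in for the RC system's diff utility, which is our assumption; the function name `classify` and the tag strings are hypothetical.

```python
import difflib


def classify(old_lines, new_lines):
    """Tag each line of a unified diff as chunk header, 'x only', 'y only' or 'both'."""
    tags = {"-": "x only", "+": "y only", " ": "both"}
    result = []
    in_chunk = False
    diff = difflib.unified_diff(old_lines, new_lines,
                                "version x", "version y", lineterm="")
    for line in diff:
        if line.startswith("@@"):
            # A change chunk starts with a header such as "@@ -1,3 +1,3 @@".
            in_chunk = True
            result.append(("chunk header", line))
        elif in_chunk:
            # The first character is the symbol, the rest is the line content.
            result.append((tags[line[0]], line[1:]))
    return result


for tag, text in classify(["a", "b", "c"], ["a", "B", "c"]):
    print(f"{tag:12} {text}")
```

The two file-name lines that precede the first chunk header are skipped, because only the chunk contents matter for locating links.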

Pseudocode for our algorithm to compute link location is presented in Figure 3. This algorithm breaks down into two cases. In Case 1, the link location of interest does not appear in the diff file. Consequently, the new location of the link is calculated by finding the distance to the closest change chunk. In Case 2, the location of interest does appear in the diff file. Two pointers are initialized to walk the two files based on the contents of the diff file until the link location of interest is found in the older file. By looking ahead at the next line, the new link location can be computed.
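The two cases can be sketched in Python as follows. This is a minimal illustration and not Zelda's code: it assumes a unified diff for a single file, uses `difflib` in place of the RC system's diff utility, and simplifies the pointer bookkeeping of Case 2 (a deleted line directly followed by an added line is treated as "modified", any other deleted line as "removed"). The names `parse_chunks`, `locate`, and `diff_versions` are hypothetical.

```python
import difflib
import re

CHUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def diff_versions(old_lines, new_lines):
    """Stand-in for the RC system's diff: unified diff with 3 context lines."""
    return list(difflib.unified_diff(old_lines, new_lines, lineterm=""))


def parse_chunks(diff_lines):
    """Return one (x_start, x_len, y_start, y_len, symbols) tuple per change chunk."""
    chunks = []
    for line in diff_lines:
        m = CHUNK_HEADER.match(line)
        if m:
            x_start, y_start = int(m.group(1)), int(m.group(3))
            x_len = 1 if m.group(2) is None else int(m.group(2))
            y_len = 1 if m.group(4) is None else int(m.group(4))
            # An empty range is reported at the line before it; shift it so
            # [start, start + len) is always the half-open range of the chunk.
            if x_len == 0:
                x_start += 1
            if y_len == 0:
                y_start += 1
            chunks.append((x_start, x_len, y_start, y_len, []))
        elif chunks and line[:1] in ("+", "-", " "):
            chunks[-1][4].append(line[0])
    return chunks


def locate(lx, chunks):
    """Map 1-based line lx of version x to (ly, state) in version y."""
    if not chunks:
        return lx, "unchanged"
    for x_start, x_len, y_start, y_len, symbols in chunks:
        if lx < x_start:
            # Case 1, link before this chunk: keep the distance to its first line.
            return y_start - (x_start - lx), "unchanged"
        if lx < x_start + x_len:
            # Case 2, link inside this chunk: walk both files with two pointers.
            px, py = x_start, y_start
            for i, symbol in enumerate(symbols):
                if symbol == "+":          # line exists only in version y
                    py += 1
                    continue
                if px == lx:
                    if symbol == " ":      # line exists in both versions
                        return py, "unchanged"
                    next_is_added = i + 1 < len(symbols) and symbols[i + 1] == "+"
                    return (py, "modified") if next_is_added else (None, "removed")
                px += 1
                if symbol == " ":
                    py += 1
    # Case 1, link after the last chunk: keep the distance to its last line.
    x_start, x_len, y_start, y_len, _ = chunks[-1]
    return (y_start + y_len) + (lx - (x_start + x_len)), "unchanged"
```

A typical call is `locate(lx, parse_chunks(diff_versions(old_lines, new_lines)))`. Measuring the Case 1 distance from the next chunk rather than the closest one gives the same answer, because all lines between two chunks are unchanged.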


Figure 3: Link Evolution Algorithm

The updating of links can result in three possible states, as shown in Figure 2. An “unchanged” link points to a line that has the same content but may have moved within the file and hence have a different location; link 1 is an example. A “removed” link, such as link 2, can no longer be found and is not shown to the user. A “modified” link is one where the content has changed and the line may also have moved. These links are presented, but with a different appearance from the “unchanged” ones. See link 3 for an example of a modified link.

Links to a feature can be created at any time during the life of a project. Therefore, links can be associated with different versions of an artifact. To deal with this situation, we compute the current location of each link using the diff result between the current version of a file and the version in which the link was created. Then we accumulate all links and present them together.
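The accumulation step can be illustrated with a short sketch. Note the swapped-in technique: instead of the diff-based algorithm of Figure 3, it uses the matching blocks of Python's `difflib.SequenceMatcher`, so it only recovers lines whose content is unchanged and reports everything else as `None`. The function names and the in-memory `versions` dictionary are assumptions made for this example.

```python
from difflib import SequenceMatcher


def map_line(old_lines, new_lines, line):
    """Map a 1-based line of old_lines to new_lines; None if it has no exact match."""
    index = line - 1
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for a, b, size in matcher.get_matching_blocks():
        if a <= index < a + size:
            return b + (index - a) + 1
    return None


def current_locations(links, versions, current):
    """Locate every link in the current version, whatever version it was created in.

    links    -- iterable of (created_in_version, line) pairs
    versions -- dict mapping a version number to that version's list of lines
    current  -- version number in which the links should be presented
    """
    located = {}
    for created_in, line in links:
        # Each link is compared against the version in which it was created.
        located[(created_in, line)] = map_line(
            versions[created_in], versions[current], line)
    return located


versions = {
    1: ["a", "b", "c"],
    2: ["a", "x", "b", "c"],
    3: ["x", "b", "c", "d"],
}
links = [(1, 2), (2, 2), (1, 1)]  # "b" in v1, "x" in v2, "a" in v1
print(current_locations(links, versions, current=3))
```

Each link is diffed directly against the current version rather than through every intermediate version, which mirrors the description above.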

5. Zelda

A prototype platform for recording and maintaining traceability links, called Zelda, has been implemented as an Eclipse plug-in. Back-end components include Subversion, a MySQL database, and multiple feature management systems. Zelda includes features that allow users to create feature maps and to record feature descriptions and metadata. The feature information is stored in MySQL. Feature maps link together lines in different files. It is also possible to create a feature map using features from an external source, such as XPlanner [22], a tool for managing Extreme Programming user stories. An interface has been provided to communicate with these external systems.

Figure 4 shows a screenshot of the Zelda tool. The available feature maps can be seen in the Browser view, labeled A. A feature map that the developer is interested in can be sent to the Virtual Stack view, labeled E, which presents a list of the current working set of feature maps. The user can activate and de-activate a feature map using actions on the view in order to create links. To create or modify a feature map, a developer can activate a map and manually select lines in the files to link them with the active feature, or choose to have changes linked automatically with the active feature map when the files are committed to Subversion.

To present the traceability links of a feature map, Zelda analyzes the stored link information and the diff results obtained from Subversion to determine the current locations of the links, as discussed in Section 4.1. The updated link locations are shown in the following visualizations.

Overview: This visualization is labeled B. We use the Visualiser plug-in from the AJDT group to implement this SeeSoft-style visualization. It presents an overview of a feature map with all links to files and their associated lines. The files are shown as long blocks and the associated lines are shown as stripes within the blocks. This overview provides easy access to the scattered fragments of the features. The developer can access a section of the features by double clicking on either the block or the stripe.

Let Fx represent version x of file F
Let Fy represent version y of file F, where y > x
Let dxy be the diff result between Fx and Fy
Let lx be the location of the link in version x
Let ly be the location of the link in version y
We need to find ly

Case 1: lx does not appear in dxy
    Find the change chunk in dxy that is closest to lx
    If (lx occurs before the change chunk)
        Calculate the distance d between lx and the first line of the change chunk in Fx
        Set ly = first line of the change chunk in Fy - d
    If (lx occurs after the change chunk)
        Calculate the distance d between lx and the last line of the change chunk in Fx
        Set ly = last line of the change chunk in Fy + d

Case 2: lx appears in dxy
    Find the change chunk in dxy containing lx
    Set px to the first line of the change chunk in Fx
    Set py to the first line of the change chunk in Fy
    Set CCcurrent to the current line in the change chunk
    Set CCnext to the line after CCcurrent
    // Walk the two files
    Iterate through the lines in the change chunk in dxy, updating both line pointers px and py, until px == lx
        If CCcurrent occurs in Fx and Fy, increment px and py by one
        If CCcurrent occurs only in Fx, increment px by one
        If CCcurrent occurs only in Fy, increment py by one
    // Calculate the new location
    If (CCcurrent and CCnext occur only in Fx) then ly = null
    If (CCcurrent occurs only in Fx and CCnext occurs in Fx and Fy) then ly = null
    If (CCcurrent occurs only in Fx and CCnext occurs in Fy) then ly = py + 1
    In all other cases, ly = py


File Decorations: The participants of a feature map are presented at file-level granularity using a file decoration in the package explorer view in Eclipse, as shown in the area labeled A.

Markers: Markers are used to show links at the line level, as shown in the area labeled D. A marker is used to present the most recent location of each link.

Program Element Tree: To raise the level of abstraction at which we show links to source code, program elements containing lines that are linked to a feature map are shown in a tree view using their hierarchical structure, as shown in the area labeled F.

6. Evaluation

The success of Zelda will be determined by its ability to capture and accurately retain traceability links over many successive changes. To evaluate our approach, we followed the evolution of five features over 25 releases of an open source software project. In this section, we describe our experimental design, results, and threats to validity.

6.1 Subject System and Selected Features

We used jEdit [23], an open source advanced text editor, as the subject system. We selected this software project because it has been used in previous evaluations, and so our results could be more easily compared with prior research. We worked with the distributions from version 4.1 pre1 (140 KLOC) to 4.3 pre7 (260 KLOC), a total of 25 releases from May 2002 to August 2006, including 2,536 incremental revisions. We included not only the Java source code in our study, but also non-code text files, such as developer notes and documentation.

Since commit transactions contain both a log message stating the purpose of the change and the change set, we selected the features to be traced from these commit transactions. We selected five transactions that had a meaningful log message, affected more than 3 files, and touched different parts of the system. We started our selection process with version 4.1 pre1, because earlier versions had less detail in their revision control logs. We went through each commit transaction in order until we found five features that fit our criteria. Coincidentally, we used one change set from each version up to version 4.1 pre5. We created a feature map linking a feature to the lines affected by the change in the commit. The summary of these five feature maps is shown in Table 1.

The created feature maps varied in both size (number of links) and the kinds of participants. All of them contained both code and non-code files. In addition, all of our features contained links into the top ten most frequently changed files. These two characteristics allowed us to thoroughly exercise the link maintenance algorithm in Zelda.

Figure 4: Screenshot of Zelda User Interface and Link Visualization

Table 1: Summary of Selected Features

                               # of Files
Feature                     Non-code   Code   No. of Links   # Top-Ten Files
F1: Auto indent bug fix     2          5      213            4
F2: Folding bug fix         3          2      11             2
F3: Find&replace bug fix    1          3      64             1
F4: Adding new modes        3          1      10             1
F5: NQC syntax highlight    3          10     121            3

6.2 Procedure

We used the following procedure to evaluate the correctness and completeness of the feature maps after each new release.

Given version vi of a project and a feature map Cn created in that version, let l1i, l2i, l3i, … represent the locations of the links. For each feature map, we performed the following procedure for every subsequent version of the project.

1. Calculate the new locations of the links l1j, l2j, l3j, … in version vj, where j > i.
2. Check the contents of the locations returned by Zelda.
   a. If no location was returned, the link is considered “removed” by the project developers.
   b. If the location returned contains a line that should be included in the feature map, the link is considered “correct.”
   c. If the location returned contains a line that should not be included in the feature map, the link is considered “incorrect.”

3. Use these categorizations to calculate precision and recall for vj, defined as follows.

Precision. The ratio of the number of relevant links retrieved per feature to the total number of links retrieved per feature.

$$\text{Precision} = \frac{\text{correct}}{\text{correct} + \text{incorrect}} \qquad (1)$$

Recall. The ratio of the number of relevant links retrieved to the total number of relevant links.

$$\text{Recall} = \frac{\text{correct}}{\text{correct} + \text{incorrect} + \text{removed}} \qquad (2)$$
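As a worked check of definitions (1) and (2), the sketch below computes both measures from the three link categories. The function names are ours. The counts in the example are those reported in Table 2 for F3, F4, and the overall row.

```python
def precision(correct, incorrect):
    """Equation (1): correct links over all links that were retrieved."""
    retrieved = correct + incorrect
    return correct / retrieved if retrieved else 0.0


def recall(correct, incorrect, removed):
    """Equation (2): correct links over correct, incorrect and removed links."""
    total = correct + incorrect + removed
    return correct / total if total else 0.0


# (correct, removed, incorrect) as reported in Table 2.
rows = {"F3": (59, 0, 1), "F4": (6, 0, 4), "Overall": (302, 79, 33)}
for name, (correct, removed, incorrect) in rows.items():
    p = precision(correct, incorrect)
    r = recall(correct, incorrect, removed)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

For F4, for instance, 6 correct links out of 10 retrieved give a precision of 0.60, and with no removed links the recall is also 0.60.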

We began this procedure for each feature map with the version immediately after it was created and continued for 25 versions. Consequently, we could compare the performance of the evolution algorithm for each feature after a given number of releases. We felt that this comparison was fairer than comparing them across a single release, because each feature would have undergone different numbers of versions.

Table 2: Retrieved Links after 25 Releases

                                       Lost
Feature               Correct   Removed   Incorrect   Precision   Recall
F1                    139       62        12          0.89        0.66
F2                    5         1         5           0.56        0.45
F3                    59        0         1           0.98        0.98
F4                    6         0         4           0.60        0.60
F5                    93        17        11          0.89        0.76
Overall               302       79        33          0.90        0.73
Average Per Feature                                   0.78        0.69
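The last two rows of Table 2 differ because they average in different ways: the "Overall" row pools the link counts of all features, so large features dominate, while "Average Per Feature" gives each feature equal weight. The sketch below reproduces both rows from the values reported in the table; the variable and function names are ours.

```python
from statistics import mean

# (precision, recall) per feature, as reported in Table 2.
per_feature = {
    "F1": (0.89, 0.66),
    "F2": (0.56, 0.45),
    "F3": (0.98, 0.98),
    "F4": (0.60, 0.60),
    "F5": (0.89, 0.76),
}


def average_per_feature(scores):
    """Unweighted mean over features: every feature map counts equally."""
    precisions, recalls = zip(*scores.values())
    return mean(precisions), mean(recalls)


def pooled(correct, removed, incorrect):
    """Precision and recall computed from pooled link counts."""
    return (correct / (correct + incorrect),
            correct / (correct + incorrect + removed))


avg_p, avg_r = average_per_feature(per_feature)
all_p, all_r = pooled(correct=302, removed=79, incorrect=33)  # "Overall" row
print(f"average per feature: {avg_p:.2f} {avg_r:.2f}")
print(f"overall:             {all_p:.2f} {all_r:.2f}")
```

The two small features, F2 and F4, pull the per-feature average down even though together they account for only 21 of the 419 links.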

6.3 Results

Overall, the algorithm performed well. For the entire set of 419 links across the five features, after 25 versions, the precision was 0.90 and the recall was 0.73. The result is encouraging, because a large number of links were preserved during thousands of commit operations. The average performance per feature was lower, with a precision and recall of 0.78 and 0.69. The precision and recall of each feature map after 25 releases are in Table 2. In the remainder of this section, we probe the underlying reasons for these results.

Small vs. Large Features. We had two comparatively small features, F2 and F4, with 11 and 10 links respectively. We also had two comparatively large features, F1 and F5, with 213 and 121 links respectively. We found that small features had lower precision and recall, because the loss of one or two links would cause a 10-20% drop in performance.

This tendency can be seen clearly in Figure 5 and Figure 6. These graphs show the precision and recall of each feature map at N releases after creation. Losing a small number of links in F2 and F4 results in dramatic drops and jumps, while larger maps change more gradually.

Increases in Precision and Recall over Time. We had expected that the precision and recall of the feature maps would gradually decrease as more changes occurred, because the changes would cause the link endpoints to be deleted or changed until Zelda could not recognize them. However, we found that they actually increased on occasion.

Digging into the data, we discovered that this behavior was due to a quirk of the diff utility that we were using. Often, links were lost when a small number of lines before a link participant were deleted or modified, and a large number of lines were added immediately following. This combination of changes tended to produce a diff result that caused the link participant to be lost. Surprisingly, new link locations were reported correctly in subsequent versions of the file. When the added lines were removed later, it was possible for the links to reappear, because we were building a link map between different versions.

Figure 5: Precision after N Releases (line chart; x-axis: releases after creation, 1 to 29; y-axis: precision; one series per feature, F1 to F5)

Figure 6: Recall after N Releases (line chart; x-axis: releases after creation, 1 to 29; y-axis: recall; one series per feature, F1 to F5)

6.4 Threats to Validity

There are some threats to the validity of our study. First, a threat to external validity arises from the characteristics of our subject system and its evolution. The ability to generalize the performance of Zelda in evolving links depends on how well jEdit can serve as a representative of other software systems. jEdit is a well-known open source software project and its activity level is typical of active, popular projects. The size, frequency of changes, and sequence of changes are representative of open source projects. In particular, these changes were made over a long period of time, from 2002 to 2006. However, there are limits on how applicable our findings are to industrial, closed source software projects.

The next threat to validity is how the features were selected. Selecting features containing files that change infrequently could bias the results in favor of our tool, thus affecting internal validity. Prior to this study, we were not familiar with jEdit and did not know which files to preferentially include or exclude. We selected our features by establishing some criteria and examining the revision control log in sequence until we found five suitable candidates. These features seemed to represent a range of sizes, underlying artifacts, and purposes. A beneficial side effect was that all of our feature maps link to files in the top-ten change list in jEdit, which strengthens our results.

7. Discussion and Future Work

In this implementation of Zelda, we lost some links due to the employed diff utility cannot accurately detect changes. To investigate whether this issue that is common among diff utility, we perform link location with diff result from another diff utility, WinMerge [24]. The diff result from this utility is not in a unified format, but provides the same type of information. We found that analyzing this new diff result allow us to locate the lost links correctly. This is a promising result, because it shows that these losses were not due to our approach and could easily be remedied. In fact, multiple diff utilities can be used when tracing links to provide sanity check for each other and improve the correctness of retrieved links. In addition, more effective line tracing algorithms [25, 26] can be employed to improve the performance.

Another drawback of the current link maintenance algorithm is that it does not handle large-scale changes well. For example, cutting and pasting sections of the code, moving lines between files, or refactoring the code causes links to be lost. We hope that the addition of the automatic association mechanism will aid the transfer of link information from one place to another. Another option is to employ existing algorithms from software evolution research to detect splits, merges, and changes of a file so that we can update the links accordingly.

There is also a question of how Zelda scales to a larger system, which could have millions of links. In our system, links are maintained mainly in the database and retrieved only for a specific feature on demand. Therefore, most links stay in the database, and the number of links that Zelda retrieves, analyzes, and holds in memory at any one time should be similar to the number of links in our study.
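The on-demand retrieval described above can be sketched as follows. This is an illustration under stated assumptions, not Zelda's storage layer: the schema, the table and column names, and the use of SQLite are all hypothetical. The point is only that an index on the feature column lets a query load the links of one feature without touching the rest.

```python
import sqlite3


def open_link_store(path=":memory:"):
    """Create a link store with an assumed, simplified schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS links (
               id INTEGER PRIMARY KEY,
               feature TEXT NOT NULL,
               file_path TEXT NOT NULL,
               line_no INTEGER NOT NULL)"""
    )
    # The index keeps per-feature lookups cheap as the table grows.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_links_feature ON links(feature)"
    )
    return conn


def add_link(conn, feature, file_path, line_no):
    conn.execute(
        "INSERT INTO links (feature, file_path, line_no) VALUES (?, ?, ?)",
        (feature, file_path, line_no),
    )


def links_for_feature(conn, feature):
    """Load only the links of the requested feature into memory."""
    cursor = conn.execute(
        "SELECT file_path, line_no FROM links "
        "WHERE feature = ? ORDER BY file_path, line_no",
        (feature,),
    )
    return cursor.fetchall()


conn = open_link_store()
add_link(conn, "search", "Search.java", 42)
add_link(conn, "search", "Search.java", 10)
add_link(conn, "printing", "Print.java", 7)
print(links_for_feature(conn, "search"))
```

Under this design, the working set depends on the size of the feature being viewed rather than on the total number of links stored.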

8. Conclusion

The ability to trace requirements across artifacts and across the software lifecycle is necessary for successful long-lived software systems. With the wide variety of features that need to be linked in so many contexts, a software tool needs to be flexible and robust in order to provide the support that is needed. In addition, to make the links useful in the long term, the captured links must be automatically maintained. To this end, we have introduced the Zelda platform.

Zelda is a prototype tool that associates lines from text files, such as source code files, with a feature. The tool consists of three parts: feature maps for representing features; a mechanism to link lines in files to the feature map; and a mechanism for keeping the links up to date. Since Zelda can be used inside an IDE, it is easy to move from code to requirements.

We evaluated the ability of Zelda to accurately maintain links using releases of jEdit, an open-source text editor written in Java (260 KLOC). The results showed that, after the links were evolved across 25 versions of jEdit, the overall precision for the entire set of 419 links across the five feature maps was 0.90 and the recall was 0.73. The average precision and recall per feature map were 0.78 and 0.69, respectively.
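The overall figures and the per-feature-map averages differ because they aggregate differently: the overall figures pool all links, so large feature maps dominate, while the averages weight every feature map equally. The sketch below illustrates this, assuming the standard information-retrieval definitions of precision and recall. The counts are made up for illustration and are not the study's data, and the function names are hypothetical.

```python
def precision_recall(correct, retrieved, expected):
    """Precision = correct / retrieved; recall = correct / expected."""
    precision = correct / retrieved if retrieved else 0.0
    recall = correct / expected if expected else 0.0
    return precision, recall


def overall_and_average(per_feature):
    """per_feature: list of (correct, retrieved, expected) link counts,
    one tuple per feature map. Returns (overall, average) pairs."""
    total_correct = sum(c for c, _, _ in per_feature)
    total_retrieved = sum(r for _, r, _ in per_feature)
    total_expected = sum(e for _, _, e in per_feature)
    # Overall: pool all links, so large feature maps dominate.
    overall = precision_recall(total_correct, total_retrieved, total_expected)
    # Average: score each feature map, then weight them equally.
    scores = [precision_recall(c, r, e) for c, r, e in per_feature]
    average = (
        sum(p for p, _ in scores) / len(scores),
        sum(r for _, r in scores) / len(scores),
    )
    return overall, average


# Illustrative counts only: one large, well-traced feature map
# and one small, poorly traced one.
counts = [(90, 100, 120), (2, 5, 10)]
overall, average = overall_and_average(counts)
print("overall:", overall)
print("average:", average)
```

In this made-up example the small feature map pulls the averages well below the pooled figures, which is the same direction of gap as the one reported above.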

9. References

[1] O.C.Z. Gotel and A.C.W. Finkelstein, "An Analysis of the Requirements Traceability Problem," in RE '94: Proceedings of the First International Conference on Requirements Engineering, pp. 94-101, 1994.
[2] L. Naslavsky, T.A. Alspaugh, D.J. Richardson and H. Ziv, "Using Scenarios to Support Traceability," in TEFSE '05: Proceedings of the 3rd International Workshop on Traceability in Emerging Forms of Software Engineering, pp. 25-30, 2005.
[3] P. Haumer, K. Pohl, K. Weidenhaupt and M. Jarke, "Improving Reviews by Extended Traceability," in HICSS '99: Proceedings of the 32nd Annual Hawaii International Conference on System Sciences-Volume 3, pp. 3052, 1999.
[4] C. Lee, L. Guadagno and X. Jia, "An Agile Approach to Capturing Requirements and Traceability," in TEFSE '03: Proceedings of the 1st International Workshop on Traceability in Emerging Forms of Software Engineering, 2003.
[5] T. Hughes and C. Martin, "Design Traceability of Complex Systems," in HICS '98: Proceedings of the 4th Symposium on Human Interaction with Complex Systems, pp. 37, 1998.
[6] L.G.P. Murta, A. Van De Hoek and C.M.L. Werner, "ArchTrace: Policy-Based Support for Managing Evolving Architecture-to-Implementation Traceability Links," in ASE '06: Proceedings of the 21st IEEE International Conference on Automated Software Engineering, pp. 135-144, 2006.
[7] J. Cleland-Huang and C.K. Chang, "Event-Based Traceability for Managing Evolutionary Change," IEEE Transactions on Software Engineering, vol. 29, pp. 796, 2003.
[8] S.G. Eick, J.L. Steffen and E.E. Sumner, "Seesoft-A Tool for Visualizing Line Oriented Software Statistics," IEEE Transactions on Software Engineering, vol. 18, pp. 957, 1992.
[9] B. Azelborn, "Building a Better Traceability Matrix with DOORS," in Telelogic INDOORS Europe, 2000.
[10] I. Spence and L. Probasco, "Traceability Strategies for Managing Requirements with use Cases," in Rational Software White Paper, 1999.
[11] F.A.C. Pinheiro and J.A. Goguen, "An Object-Oriented Tool for Tracing Requirements," IEEE Software, vol. 13, pp. 52, 1996.
[12] M.P. Robillard and G.C. Murphy, "Representing Concerns in Source Code," ACM Transactions on Software Engineering and Methodology, vol. 16, pp. 3, 2007.
[13] M. Kersten and G.C. Murphy, "Mylar: A Degree-of-Interest Model for IDEs," in AOSD '05: Proceedings of the 4th International Conference on Aspect-Oriented Software Development, pp. 159-168, 2005.
[14] A. Egyed, "A Scenario-Driven Approach to Traceability," in ICSE '01, pp. 123-132, 2001.
[15] J. Sayyad-Shirabad, T.C. Lethbridge and S. Lyon, "A Little Knowledge can Go a Long Way Towards Program Understanding," in IWPC '97: Proceedings of the 5th International Workshop on Program Comprehension, pp. 111, 1997.
[16] S. Yadla, J.H. Hayes and A. Dekhtyar, "Tracing Requirements to Defect Reports: An Application of Information Retrieval Techniques," Innovations in Systems and Software Engineering: A NASA Journal, vol. 1, pp. 116, 2005.
[17] G. Antoniol, G. Canfora, G. Casazza, A. De Lucia and E. Merlo, "Recovering Traceability Links between Code and Documentation," IEEE Transactions on Software Engineering, vol. 28, pp. 970-983, 2002.
[18] A. De Lucia, F. Fasano, R. Oliveto and G. Tortora, "Enhancing an Artefact Management System with Traceability Recovery Features," in ICSM '04: Proceedings of the 20th IEEE International Conference on Software Maintenance, pp. 306-315, 2004.
[19] A. Marcus and J.I. Maletic, "Recovering Documentation-to-Source-Code Traceability Links using Latent Semantic Indexing," in ICSE '03, pp. 125-135, 2003.
[20] A. De Lucia, R. Oliveto, F. Zurolo and M. Di Penta, "Improving Comprehensibility of Source Code Via Traceability Information: A Controlled Experiment," in ICPC '06: Proceedings of the 14th IEEE International Conference on Program Comprehension, pp. 317-326, 2006.
[21] D. Cubranic and G.C. Murphy, "Hipikat: Recommending Pertinent Software Development Artifacts," in ICSE '03, pp. 408-418, 2003.
[22] XPlanner. http://www.xplanner.org/, 2007.
[23] jEdit. http://www.jedit.org/, 2008.
[24] WinMerge. http://winmerge.org/, 2008.
[25] S.P. Reiss, "Tracking Source Locations," in ICSE '08, pp. 11-20, 2008.
[26] G. Canfora, L. Cerulo and M. Di Penta, "Identifying Changed Source Code Lines from Version Repositories," in MSR '07: Proceedings of the Fourth International Workshop on Mining Software Repositories, pp. 14, 2007.
