Introduction to ProcessPAIR

Data Preparation

Before launching the ProcessPAIR tool, have your data ready for analysis as follows:

  1. If you are using the SEI’s PSP Student Workbook, export your data to an “*.mdb” file (such as “PSP_Assingments_be.mdb”).
  2. If you are using Process Dashboard, export your data to a folder containing a “data.xml” file (together with “defects.xml” and “time.xml”): execute the option “Tools -> Export -> Export Project Metrics”, select both Project and Non-Project, and then unzip the generated file. See also the video “Data Extraction for ProcessPAIR from Process Dashboard”.
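If you prefer to script the unzip step for Process Dashboard, the following is a minimal Python sketch. It only assumes the three file names mentioned above; the function name, and the assumption that the files sit at the top level of the archive, are illustrative and not part of ProcessPAIR or Process Dashboard.

```python
import zipfile
from pathlib import Path

# File names taken from the instructions above.
EXPECTED_FILES = ("data.xml", "defects.xml", "time.xml")


def unzip_and_check(export_zip, target_dir):
    """Extract the exported archive and return the expected files that are missing.

    Assumption: the XML files are stored at the top level of the archive.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(export_zip) as archive:
        archive.extractall(target)
    return [name for name in EXPECTED_FILES if not (target / name).is_file()]
```

An empty list means the folder is ready to be selected in the tool; otherwise the returned names tell you which files to look for.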

Launch and File View

Execute the tool by double-clicking on the ProcessPAIR.jar file. The following window should appear.

Screenshot 2016-02-28 20.33.28

Follow the steps:

  1. In the “Model calibration file” field, press the “Select …” button to select the calibration file ProcessPairCalibration.xml that you can download with the tool. Other calibration options are explained at the end of this page.
  2. In the “Type of input file to analyze” combo-box, choose the type of your input file: “PSP Student Workbook Export File” or “Process Dashboard data.xml file”.
  3. In the “Input file to analyze” field, press the “Select…” button to select the file to analyze. The usual file open dialog should appear. Locate and select the file.
  4. Press the “Analyze file” button, as illustrated below.

Screenshot 2016-02-28 20.36.34

The analysis may take just a few seconds. Once the analysis is concluded, the other tabs become enabled and the tool jumps to the “Report View”.

Report View

The goal of the Report View is to indicate in a simple way, overall (“Summary”) or project by project, the most relevant top-level performance problems (colored red or yellow in the Table View) and their potential root causes (leaf causes in the Cause-Effect View), properly prioritized, as illustrated below.

Screenshot 2016-02-24 21.28.16

To see all the intermediate causes just uncheck the “Show only leaf causes” checkbox, as illustrated below.

Screenshot 2016-02-28 20.37.47

To see information related to a single project, just select the project on the top left combobox, as illustrated below.

Screenshot 2016-02-28 20.40.01

To get detailed information about a performance indicator, just press the link.

Table View

The Table View shows the values of the performance indicators defined in the model for the projects described in the input file, as well as summarized performance information. By default, only the top-level performance indicators are shown, as illustrated in the following picture.

Screenshot 2016-02-24 21.30.11

Clicking on the “Indicator” column fully expands the tree table, as illustrated next.

Screenshot 2016-02-24 21.31.48

Each cell is colored green, yellow or red, depending on whether its value suggests no performance problem, a potential performance problem, or a clear performance problem, respectively. This way, the Table View helps to quickly identify performance problems. The exact ranges considered can be consulted in the “Indicator View”. To skip to a specific indicator in the “Indicator View”, just click on the indicator name in the first column.
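The coloring rule can be pictured with the following minimal Python sketch. It is not the tool's implementation: the function name and the numeric bounds in the examples are made up for illustration, and the real ranges are those shown in the Indicator View.

```python
import math


def classify(value, green, yellow):
    """Return the cell color for a value.

    green and yellow are (low, high) bounds, with the yellow range enclosing
    the green one. Use -math.inf or math.inf for a side that is open, which is
    the case when the optimal value lies at one extreme of the scale.
    """
    green_low, green_high = green
    yellow_low, yellow_high = yellow
    if green_low <= value <= green_high:
        return "green"
    if yellow_low <= value <= yellow_high:
        return "yellow"
    return "red"


# Optimal value in the middle of the scale (hypothetical bounds).
print(classify(-0.3, green=(-0.2, 0.2), yellow=(-0.5, 0.5)))

# Optimal value at the low extreme of the scale (hypothetical bounds).
print(classify(25, green=(-math.inf, 10), yellow=(-math.inf, 20)))
```

The first call falls outside the green range but inside the yellow one; the second falls outside both.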

The “Summary” column shows an overall rating between 1 and 5 stars for each performance indicator, computed from the per project values (with higher importance for the last projects), and colored according to the number of stars.
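The exact formula behind the star rating is not given on this page. Purely to illustrate what “higher importance for the last projects” can mean, here is a hypothetical Python sketch that scores each project's color and averages the scores with linearly increasing weights. The color scores, the weights and the rounding are all assumptions, not the tool's actual computation.

```python
# Hypothetical mapping from cell color to a score on the 1-5 star scale.
COLOR_SCORE = {"red": 1, "yellow": 3, "green": 5}


def summary_stars(colors):
    """Recency-weighted rating: project i (1-based) gets weight i (assumption)."""
    if not colors:
        raise ValueError("at least one project is required")
    weights = range(1, len(colors) + 1)
    total = sum(weight * COLOR_SCORE[color] for weight, color in zip(weights, colors))
    average = total / sum(weights)
    # Keep the result within the 1 to 5 star scale.
    return max(1, min(5, round(average)))


improving = summary_stars(["red", "yellow", "green"])
declining = summary_stars(["green", "yellow", "red"])
print(improving, declining)
```

With these assumed weights, a series that ends well is rated higher than the same colors in the reverse order.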

The performance indicators are organized hierarchically, starting from three top-level indicators (Time Estimation Accuracy, Process Quality Index, and Productivity), and descending to lower-level indicators (child indicators) that affect the higher-level ones according to a formula or statistical evidence (see our publications for further explanations). This way, by drilling down from the top-level indicators to the lower-level ones, focusing on the red (or yellow) colored cells, one can easily identify potential root causes of performance problems. A project-by-project root-cause analysis of performance problems is better done with the help of the Cause-Effect View. To skip to a specific project in the “Cause-Effect View” (or “Report View”), just click (or control-click) on the project name in the column header.

To see only the major issues (red colored cells), just press the “Show only major issues” button. By pressing “Show all performance indicators”, all the indicators are shown again.

Tool tips in the column and line headers explain the navigation options available. The buttons on the top allow exporting the data to HTML and hiding or showing columns.

Indicator View

Notice: This view was greatly improved in v. 1.6, for better supporting interactive performance analysis. Please look at the PSP Final Report Guidelines for more advanced features.

The goal of the Indicator View is to provide summary information (description, units, optimal value, recommended performance ranges, and statistical distribution in the model) about each performance indicator considered in the performance model, as well as its behavior for the series of projects under analysis. When using the default calibration file provided with the tool, the recommended performance ranges and statistical distribution were derived from a large PSP data set from the SEI covering more than 3,000 PSP developers and 30,000 projects (see our publications for further explanations).

Screenshot 2016-02-24 21.35.27

If the optimal value is somewhere in the middle of the scale, the chart shows two light green and two yellow lines that separate the different performance ranges (see figure above). If the optimal value is at one extreme of the scale, only one light green line and one yellow line are drawn to separate the performance ranges, as illustrated below.

Screenshot 2016-02-24 21.36.11

Screenshot 2016-02-24 21.36.58

The bottom left shows the statistical distribution of the values of the performance indicator in the PSP data set used for calibrating the model. The colors correspond to the performance ranges. As can be observed in the three figures above, some performance indicators have a continuous distribution, but others have a hybrid, continuous-discrete distribution.

The actual values in the file under analysis are also shown, marked with the “+” symbol, for benchmarking purposes. The information shown in this chart is also used internally by the tool in the computation of ranking coefficients.

Cause-Effect View

Notice: This is an advanced view that provides essentially the same information as the Report View (with additional details if wanted), but in a diagrammatic way.

The goal of the Cause-Effect View is to help identify and prioritize, project by project or overall, the root causes of performance problems, so that subsequent improvement actions can be properly directed. The child indicators are sorted according to the value of a ranking coefficient. The ranking coefficient represents a cost-benefit estimate that relates the cost of improving the value of the child indicator to the benefit on the value of the parent indicator (see our publications for further explanations).

By default, this view shows a summarized root cause analysis of the major issues identified in the set of projects under analysis, as illustrated below.

Screenshot 2016-02-28 20.41.15

To see intermediate causes, just uncheck the “Show only leaf causes” checkbox, as illustrated below.

Screenshot 2016-02-28 20.42.26

By default, the ranking coefficients are shown by means of T-shirt sizes (XL – extra large, L – large, M – medium, S – small, XS – extra small). The numerical values of the ranking coefficients can be consulted by selecting “Numerical Ranking Labels” on the top right corner, as illustrated below.

Screenshot 2016-02-28 20.42.57
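The sorting of child indicators and the T-shirt labels can be illustrated with the following Python sketch. It is only a sketch under stated assumptions: the cut points, the indicator names, and the choice to list the largest coefficient first are hypothetical, since this page does not specify them.

```python
from bisect import bisect_right

# Labels from smallest to largest, as listed above.
SIZES = ["XS", "S", "M", "L", "XL"]


def tshirt_label(coefficient, cut_points):
    """Map a numerical ranking coefficient to a T-shirt size.

    cut_points holds four ascending thresholds separating the five sizes
    (hypothetical values; the tool's thresholds are not documented here).
    """
    return SIZES[bisect_right(cut_points, coefficient)]


def rank_causes(coefficients, cut_points):
    """Sort child indicators by ranking coefficient, largest first (assumption)."""
    ordered = sorted(coefficients.items(), key=lambda item: item[1], reverse=True)
    return [(name, value, tshirt_label(value, cut_points)) for name, value in ordered]


example = {"child indicator A": 0.5, "child indicator B": 0.9, "child indicator C": 0.1}
for row in rank_causes(example, cut_points=[0.2, 0.4, 0.6, 0.8]):
    print(row)
```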

To view all the performance indicators, just select that option in the top center combobox.

For a more detailed project-by-project analysis, just select the project in the top left combobox, as illustrated below.

Screenshot 2016-02-28 20.44.35

A simpler textual representation of problems and possible root causes is shown in the Report View.

Automatic Calibration

In case you have a large data set of PSP course data (with at least 100 data points or projects), the tool is able to automatically calibrate the model (recommended ranges, statistical distribution of each performance indicator, etc.) based on the PSP course data file.

When the tool is launched with the “-c” option, a new tab becomes available, as illustrated below.

Screenshot 2016-02-28 20.46.36
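Passing the “-c” option requires starting the tool from a command line or script rather than by double-clicking. A minimal Python sketch follows, assuming that a Java runtime is on your PATH and that ProcessPAIR.jar is in the current folder; the function names are illustrative only.

```python
import subprocess


def build_launch_command(jar_path="ProcessPAIR.jar", calibration=False):
    """Build the command line; "-c" is the calibration option described above."""
    command = ["java", "-jar", jar_path]
    if calibration:
        command.append("-c")
    return command


def launch(jar_path="ProcessPAIR.jar", calibration=False):
    """Start the tool and wait until its window is closed."""
    return subprocess.run(build_launch_command(jar_path, calibration), check=True)


print(" ".join(build_launch_command(calibration=True)))
```

Calling launch(calibration=True) runs the printed command; adjust jar_path if the file is stored elsewhere.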

You have to select the input file with the data set to be used for calibration, the output (xml) file for saving the results of the calibration, and press “Generate”, as illustrated below.

Screenshot 2016-02-28 20.48.46


The automatic calibration will usually take a few seconds. If inconsistencies are found in the data set, you may obtain a warning message as illustrated below.

Screenshot 2016-02-24 21.43.50

The resulting calibration file is stored in the indicated “.xml” file, for subsequent use in normal operation of the tool.

Instead of using the full data set for calibration, you also have the option to filter the data points used for calibration, in two different ways. One possibility is to provide a set of filters, as illustrated below (in the main window you have to press the “filters …” button).

Screenshot 2016-01-19 17.35.55

In this case, only the projects from subjects that used the Java programming language and reported a programming experience less than or equal to 5 years will be selected. If the number of resulting data points becomes too small, you’ll get an error or warning message.
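Conceptually, such a filter amounts to the following Python sketch. The record field names are hypothetical, and the default minimum of 100 data points is borrowed from the data set size mentioned at the start of this section; the threshold the tool actually applies after filtering is not stated here.

```python
def filter_projects(projects, language="Java", max_experience_years=5, min_points=100):
    """Keep the projects whose subject matches the filter.

    Each project is a dict with hypothetical keys "language" and
    "experience_years". min_points is an assumed threshold.
    """
    selected = [
        project
        for project in projects
        if project["language"] == language
        and project["experience_years"] <= max_experience_years
    ]
    if len(selected) < min_points:
        raise ValueError(
            f"only {len(selected)} data points left after filtering; "
            f"at least {min_points} are needed"
        )
    return selected
```

Raising an error here mirrors the error or warning message that the tool shows when too few data points remain.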

The other possibility is to indicate your profile, regarding the same parameters, as illustrated below (in the main window you have to press the “my profile …” button).

Screenshot 2016-01-19 17.41.31

In this case, only the projects from the subjects whose profiles are most similar to the one indicated will be selected. See our upcoming research papers for an explanation of the similarity coefficients and criteria used. After the calibration process, you’ll get an information message similar to the one below. In this example, only the 50 most similar subjects were selected (the minimum number required by the tool for statistical significance), with a similarity coefficient greater than 0.889 (the similarity coefficient ranges between 0 and 1).

Screenshot 2016-01-19 17.45.10
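The selection step can be pictured with the following Python sketch. It assumes that a similarity coefficient between 0 and 1 has already been computed for each subject; how the tool computes that coefficient is not covered on this page, and the function name and data layout are illustrative only.

```python
import heapq


def select_most_similar(similarities, k=50):
    """Pick the k subjects with the highest similarity coefficient.

    similarities maps a subject identifier to a precomputed coefficient.
    Returns the chosen subjects and the lowest coefficient among them.
    """
    chosen = heapq.nlargest(k, similarities.items(), key=lambda item: item[1])
    subjects = [subject for subject, _ in chosen]
    cutoff = chosen[-1][1] if chosen else None
    return subjects, cutoff


subjects, cutoff = select_most_similar(
    {"s1": 0.95, "s2": 0.50, "s3": 0.90, "s4": 0.889}, k=3
)
print(subjects, cutoff)
```

The returned cutoff plays the role of the similarity value reported in the information message above.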