Use Case: Checking Data Quality

Applications can only produce good results if input data is correct and complete.

Checking the quality of input data regularly increases the value of calculated results and decreases follow-up costs caused by using bad or missing data.

The following simple example demonstrates how the RapidRep Test Suite identifies missing or unexpected currencies.

As a result of the test case it should be evident, if - for a given reference date - all required currency rates are provided or if data is incomplete or even inconsistent.

Description of Use Case

  • The data which is subject for the quality check-up is stored in an Oracle table.
  • A simple CSV file lists all currency codes for with daily rates are expected.
  • Since RapidRep is able to access data from all relevant data sources, the data sources of the scenario could be e.g. a MySQL table and a view in a MS Access database.

Typical quality related questions:

  • are all currency rates available for a given day?
  • for which currency codes are no rates available?
  • are rates available for currencies which are not expected?
  • are rate values plausible?

Layout design for a test case report using MS-Excel

  • The process using RapidRep begins with the creation of a template workbook based on MS Excel.
  • The first step in the creation of test case is the design of the test case layout using MS Excel. The customer is able to create the layout without knowing about RapidRep.
  • RapidRep does not prescribe the layout, however it is best practice for a template to have an overview sheet (Overview), which contains the overall test case result after test case execution and a detailed sheet (Details), which collects all details and may be used to document test case execution.

 

Implementation using RapidRep Designer (1/5)

  • The RapidRep Designer is a stand alone Java Application used to create so called Reportdefinitions.
  • A Reportdefinition consists of many settings arranged on a couple of tabs.
  • The Properties – tab of a Reportdefinition references the Excel workbook template created in the step before (Excel template).
  • During test case execution Excel workbooks are created and filled based on this Excel template. An appropriate name for these workbooks can be provided using so called Parameters.

Implementation using RapidRep Designer (2/5)

  • At runtime, when reports are created based on a Reportdefinition, the Parameters have an impact on the result. A Reportdefinition may have an arbitrary number of typed Parameters and may have a default value.
  • This example uses only one Parameter of type Date. The user or the batch programme will have to override this value.
  • Calculation of test cases can only be influenced by assigning values to Parameters. Apart from this, the Reportdefinition encapsulates the whole logic of the test case.

 

Implementation using RapidRep Designer (3/5)

  • In our example data involved in this test is stored in two distinct places. The actual data is stored in an Oracle table, while the target list is maintained in a CSV file.
  • RapidRep allows the use of an arbitrary number of connections which are setup in the connections-tab.
  • RapidRep supports all relevant data sources. The definition of data sources is done elsewhere and is not shown here.

Implementation using RapidRep Designer (4/5)

  • At the heart of a Reportdefinition there is a list of so called Data Elements. These may be organized in a tree structure
  • Using a simple text replacement language the calculation part of a data element allows creation of SQL – Instructions whcih are easy to maintain. The evaluation of such a data element happens at runtime and returns a block of data.
  • The optional the text replacement language is simple and powerful. Four commands are sufficient to structure even complex SQL statements: if/else-branching, looping, value assignment and function calls.

Implementation using RapidRep Designer (5/5)

  • Calculation results as defined in the Calulation-tab are written to specific areas in the Excel workbook template. 
  • Many settings are available to control the output in various ways.
The development of a Reportdefinition is now finished and may be used by the RapidRep Runner application or via the RapidRep Batch Interface.

Test case execution using RapidRep Runner (1/2)

  • Once a version of a Reportdefinition has been finalized it can be used by the RapidRep Runner application.
  • The main task of the RapidRep Runner is the creation and execution of test cases.
  • The user selects a Reportdefiniton and creates a new, empty test case
  • All test cases which already exist are listed in the table

Test case execution using RapidRep Runner (2/2)

  • The newly created Report may then be parameterized accordingly. In the Reportdefinition we use there is only one parameter of type date.
  • After parameterization the Report (i.e. test case report) is calculated and rendered into an Excel workbook.
  • During calculation the Data Elements of the Reportdefinition are executed and results are written to specified positions within the new Excel workbook.
  • This result is stored in the RapidRep repository in an audit proof way.

The final result of a test case report

  • The output of a test case execution using RapidRep is a stand alone Excel workbook based on the workbook template as defined in the underlying Reportdefinition.
  • It contains all Data required to verify the result and to quickly identify data errors.
  • The advantage of this approach is he abiltity to pass this self-contained Excel workbook to an optional test management system. The result can be used without requiring RapidRep.

Integration into IBM Rational Quality Manager

The RapidRep Data Testing Adapter makes functions of RapidRep available to users of Rational Quality Manager.

The connection has been evaluated by IBM staff and has been certified as Ready for Rational Software Logo.

Creating a RapidRep Test-Skripts within RQM

  • Inside Rational Quality Manager a Test Script needs to be created choosing „Data Testing with RapidRep“ in the type field.
  • An import function retrieves all available Reportdefinitions. In the example the Currency Test Reportdefinition is selected and is stored together with its parameters after values are changed.

Creating a Test Case within RQM

  • A Test Case is created in Rational Quality Manager and the test script created in the step before is assigned to it.
  • One the test case is started, RapidRep is instructed to execute the test based on information provided in the test script.

Result of a calculated Test Case

  • During test case execution a visual bar indicates the progress. When finished there will be a so called  „Execution Result“
  • In our currency example there is two missing currencies. Thererfore the test case result is reported as “failed”.
  • At the bottom of the page, in the „Result Details“ section the Excel workbook created and populated by RapidRep is attached.
  • Test cases can be started as often as required. The history of these test executions  is maintained by Rational Quality Manager.
  • Reporting on test progress and other types of reports may be created by means of Rational Quality Manager using results provided by RapidRep.

Summary

The example demonstrates how RapidRep can be used to check the quality of data. Is does not matter where data is located as RapidRep supports all relevant data stores. In real life dozens of checks like the currency test are needed to ensure high data quality. RapidRep does not have on limits on the number of checks.

The implementation of a test case is normally a one time effort during the test case design phase, while test cases may be paramterized and executed as often as needed (also in batch mode). The execution of a test case is very simple and can also be triggered from outside like demonstrated using Rational Quality Manager.

Advantages:

  • results  are transparent and traceable
  • the outcome of a test case exectuion is a self contained Excel workbook containing all details.
  • if a test managment system is used, in this example IBM Rational Quality Manager, the planning
    phase can be done in Rational while test case execution and analysis is done by RapidRep.
    The integration is complete and transparent to the user
  • the approach is designed for a high coverage of test cases at low cost