The MultiRun tool – localized regression testing made simple

In previous blog posts, we have discussed the need for regression testing when building model libraries. With no off-the-shelf tools available, Claytex have developed our own tools for this task. One of them, the RegressionTest tool, is designed to be used as an integral element of a library development cycle. The other is the MultiRun tool, which provides a way for the model developer to quickly regression test a smaller batch of experiments in an automated way. This blog post serves as a quick overview of the MultiRun tool.

Once a library is loaded into the MultiRun tool, multiple instances of Dymola are spun up behind the scenes. The selected experiments are then distributed across these Dymola instances and run. Because experiments run in parallel, the tool works through a set of experiments efficiently. When each experiment has completed, the tool compares the results against a user-specified set of reference data. Differences in results and model statistics are then written to a report and displayed in the tool. As a result, localized testing of model changes during development work is straightforward, enabling the developer to quickly interrogate the effect of a specific change across a range of relevant experiments.
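The distribution step described above can be pictured with a small worker pool. This is only an illustrative sketch, not the MultiRun tool's implementation: the function names, the experiment names and the stand-in `simulate` callable are all hypothetical, and no Dymola API is used.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(experiments, simulate, workers):
    """Run each experiment on a fixed pool of workers.

    `simulate` is a placeholder callable; a real runner would hand the
    experiment to a simulator instance here.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the results in the same order as the inputs.
        outcomes = list(pool.map(simulate, experiments))
    return dict(zip(experiments, outcomes))


def fake_simulate(name):
    """Hypothetical stand-in that pretends every experiment completes."""
    return {"experiment": name, "status": "completed"}
```

With two workers and three experiments, two run at once and the third starts as soon as a worker becomes free.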

Setting up the tool

After installing the MultiRun tool, the first job is to configure it to work with your existing Modelica library locations and Dymola installations. Key options can be found in the Tool Settings.

Figure 1: The Settings of the MultiRun tool - App Settings group.

To start, there are two directories to create or identify: the Output Directory and the Working Directory. These directories serve as the primary write locations for the tool. The Output Directory stores the data generated by the MultiRun tool. This includes the raw results, which can then be used as reference data (benchmark results to be compared against in the future), as well as the generated reports. As one might expect, the Working Directory holds the working directory folders used by each instance of Dymola.

Users can also specify the Number of Dymola instances to use, which sets the number of simulations that run simultaneously. The default is equal to the number of physical cores on the PC. Depending on PC age and performance level, it may be better to limit this number to avoid overtaxing the hardware. For maximum throughput on new PCs, we’ve found the best option to be one instance per physical core. However, if the user wants to use the computer for other tasks while experiments are running, it is a good idea to specify one or two fewer instances than there are physical cores. An automatic simulation timeout can also be enforced, so that stalled experiments that cannot make progress do not grind on indefinitely.
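As a rough sketch of the two ideas above (sizing the pool and enforcing a time limit), the following uses only the Python standard library. The function names and the `reserve_for_user` parameter are hypothetical, and the command being run is a generic child process rather than a real Dymola call.

```python
import subprocess


def choose_instance_count(physical_cores, reserve_for_user=0):
    """Pick how many simulations to run at once, never fewer than one."""
    return max(1, physical_cores - reserve_for_user)


def run_with_timeout(command, timeout_s):
    """Run a command and report 'completed', 'failed' or 'timed out'."""
    try:
        # subprocess.run kills the child process if the timeout expires.
        finished = subprocess.run(command, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        return "timed out"
    return "completed" if finished.returncode == 0 else "failed"
```

On an eight-core machine that is also being used for other work, `choose_instance_count(8, reserve_for_user=2)` would leave two cores free.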

The next task is to configure the Dymola settings for the MultiRun tool. In order to function smoothly, the application requires the paths to your library directories and Dymola installations. While the paths to the Dymola installations seldom need to be updated, the locations of the user’s Modelica libraries often vary. The user then needs to select the Dymola version they would like to use. Also note that each instance of Dymola will continue to use your ModelicaPath environment variable to find dependent libraries.

Figure 2: The Settings of the MultiRun tool - Dymola Settings group.

After all this is completed, remember to save your settings by clicking Save Settings!

Generating Reference results

Now that the tool is configured, the first thing to do is generate some reference data. This should be done prior to making any new changes to models to establish a baseline set of results.

To begin, hit the Load button in the top left-hand corner. This gives you the option to either load a Modelica library for testing, or load an existing test setup. Simply change the file type on the explorer window to switch between these options.

Figure 3: Generation of reference results is a simple task!

Now that the desired library has been loaded into the MultiRun tool, it is time to configure the run that will generate the reference data. The first step is to expand the library tree and select the experiments that you would like to run. Once that is complete, a few more items need to be specified.

In the Output Directory there is a subfolder called Reports, where the report for the run and the condensed results are written. Within the Reports directory there is a folder for each Report Identifier that has been run on the machine. The Report Identifier field sets the name of the directory into which your reports will be written. The Name for this version field lets the user label each specific report, so that you can keep track of each iteration of the reference results or regression tests that have been run. After these fields have been set, the reference data run can be started by clicking the Start button.

Running a Regression Test

Once a set of reference results has been generated, it can be used as the reference data against which new results are compared. This allows the user to test new results against previous results, whether the user made model changes or simply changed to a different version of Dymola.

To get started, simply select the Run regression test button in the toolbar, fill in the Report identifier, give the test version a name, and select the Reference version you wish to compare against from the drop-down menu.

Variables to Compare and Tolerances

The next task is to set up the variables to be compared between the new test results and the reference data, along with the tolerances.

The user can adjust two tolerances, which are used to determine whether a particular variable in a test passes or fails.

The first is called the Tolerance setting and is applied to the reference results when comparing; if the new results differ from the reference by more than this tolerance, they will be flagged as having failed the test.

Maximum time offset is the second and operates on a similar principle, although it is applied to time shifts in the results (we find this necessary for events such as gearshifts).
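To make the interplay of the two settings concrete, here is a minimal sketch of one way such a check could work. It is not the MultiRun tool's algorithm: the function name is hypothetical, and the scaling of the tolerance (a fraction of the reference signal's peak magnitude) is an assumption made purely for illustration.

```python
def signal_passes(reference, candidate, tolerance, max_time_offset):
    """Check a candidate signal against a reference signal.

    Both signals are lists of (time, value) pairs. A candidate sample passes
    when some reference sample within max_time_offset of its time differs
    from it by no more than the allowed band.
    """
    # Assumed scaling: the band is a fraction of the reference's peak magnitude.
    peak = max(abs(value) for _, value in reference)
    band = tolerance * peak if peak > 0 else tolerance

    for t_new, v_new in candidate:
        # Reference values that are close enough in time to be compared.
        nearby = [v_ref for t_ref, v_ref in reference
                  if abs(t_ref - t_new) <= max_time_offset]
        if not nearby or min(abs(v_new - v_ref) for v_ref in nearby) > band:
            return False
    return True


# A step at t = 1.0 in the reference arrives one sample later in the candidate,
# much like a gearshift that happens slightly later after a model change.
reference = [(i / 10, 0.0 if i < 10 else 10.0) for i in range(21)]
candidate = [(i / 10, 0.0 if i < 11 else 10.0) for i in range(21)]
```

With no time offset allowed, the late step is flagged because the values at t = 1.0 differ by the full step height. Allowing an offset of 0.15 lets the shifted step match a neighbouring reference sample, so the signal passes.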

Figure 4: The MultiRun tool enables individual experiments, packages or libraries to be regression tested quickly and easily.

In addition to the tolerances, the user must specify which variables should be compared during the test. There are two Boolean fields and one string filter which are used to control this.

The first Boolean (Ignore annotations) controls whether the variables defined within the Claytex variables annotation are included in the comparison. These annotations can exist in the *.mo file and take the form __Claytex(variables={"myvariablename"}). This setup allows users to embed the variables of interest in the annotation of each experiment.

The second Boolean (Ignore state variables) tells the MultiRun tool to include or exclude model state variables in the comparison. At times, excluding these variables can be necessary.

The last setting, Names to compare, is a filter applied to the entire results dataset to narrow it down to only the variables the user wants to compare. (The default setting for this field, “*summary*”, compares only the variables in the summary records of the subcomponents within the experiments.)
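The effect of a wildcard filter such as “*summary*” can be sketched with Python's standard `fnmatch` module. This assumes the filter behaves like a shell-style wildcard pattern, which the tool's documentation would need to confirm; the function name and the variable names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_variables(names, pattern="*summary*"):
    """Keep only the variable names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical variable names from a results file.
names = [
    "engine.summary.torque",
    "engine.summary.speed",
    "gearbox.clutch.phi",
    "vehicle.summary.v",
]
selected = select_variables(names)
```

Here `selected` holds the three names containing `summary`, while `gearbox.clutch.phi` is dropped from the comparison.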

Collectively, these options allow the user fairly granular control over the results included in the test report, as well as what is considered an acceptable difference.

Reviewing a report

Once the test has been run and a report generated, the report is loaded and the user is automatically switched over to the report view. From here the user can review the report and efficiently evaluate the effects of the changes they made to their models. In the report view tree, experiments are primarily identified as either green or red.

Green circles with a white check indicate a ‘pass’. This means the simulation ran successfully and changes (if any) fall within the bounds established for the report and are thus considered acceptable.

Red, on the other hand, indicates a failure. The tool will indicate failure for a number of reasons. Some examples are as follows:

  • The simulation failed to run
  • A signal was missing from the reference data
  • Changes in the results fall outside the previously discussed thresholds

All of these “failures” are things that come up regularly during development.
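The three example failure causes listed above can be expressed as a simple set of checks. The sketch below is illustrative only: the function name, its parameters and the message wording are hypothetical and do not reflect the tool's internal logic or its report text.

```python
def failure_reasons(simulated_ok, reference_names, compared_names, out_of_tolerance):
    """Collect reasons an experiment would be marked red; an empty list means a pass."""
    if not simulated_ok:
        # Without results there is nothing further to compare.
        return ["simulation failed to run"]

    reasons = []
    missing = sorted(set(compared_names) - set(reference_names))
    if missing:
        reasons.append("missing from reference data: " + ", ".join(missing))
    if out_of_tolerance:
        reasons.append("outside thresholds: " + ", ".join(sorted(out_of_tolerance)))
    return reasons
```

An experiment is green only when the returned list is empty.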

Figure 5: Test results are laid out clearly.

If the failure was due to thresholds, the user can use the embedded plotting capability to overlay the new signal on that of the reference data. This enables the user to quickly identify the differences between the versions.

Other messages are commonly displayed in the Summary tab of the report view. For example, the tool will indicate if the model structure has changed (more or fewer linear or nonlinear systems for example).

Icons other than the green check or the red x are sometimes displayed to indicate other ways in which a model doesn’t ‘pass’. One example of this would be if the selected model was not a test (the user may have accidentally selected a template, for example, or an interactive function). Another example would be if a simulation timed out.

The MultiRun tool also captures the translation and simulation logs of each experiment. This is done primarily to help identify structural changes to the models (model statistics). For instance, the translation log statistics are compared automatically to flag changes in the number and structure of linear or nonlinear systems.

In addition to the automated comparison of statistics, the logs can be interrogated manually by viewing either the translation or simulation logs in a helpful side-by-side view.
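Conceptually, the automated statistics comparison amounts to a diff between two sets of counts. The sketch below assumes the statistics have already been extracted from the logs into dictionaries; the function name and the statistic names are hypothetical, and no real Dymola log format is parsed here.

```python
def changed_statistics(reference, new):
    """Return the statistics whose values differ, as (reference, new) pairs."""
    keys = sorted(set(reference) | set(new))
    return {
        key: (reference.get(key), new.get(key))
        for key in keys
        if reference.get(key) != new.get(key)
    }


# Hypothetical counts taken from two translation logs.
reference_stats = {"linear systems": 3, "nonlinear systems": 2}
new_stats = {"linear systems": 3, "nonlinear systems": 4}
changes = changed_statistics(reference_stats, new_stats)
```

In this example only the nonlinear system count is reported, since it changed from 2 to 4.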

Closing remarks

The MultiRun tool provides the user with the ability to test a user-defined batch of experiments. This allows the developer to preview the downstream effects of changes they are making in a sub-model without having to run a full-blown, multi-hour regression test.

If you would like a demo or to learn more, contact sales@claytex.com

Nate Horn – Vice President

© Copyright 2010-2022 Claytex Services Ltd All Rights Reserved
