
Dac-Man Quick Start Guide

This guide serves as a starting point for using Dac-Man. In this guide, we will demonstrate how to use the Dac-Man data change tool to compare two versions of a sample dataset.

Important

This guide assumes you have installed Dac-Man as described in the installation instructions. Alternatively, you can use Binder to try Dac-Man without needing to install it locally. See Running Dac-Man in Binder for instructions.

A step-by-step example running diff

Comparing directories

Dac-Man is able to compare directories of files as well as individual files for changes.

First, activate Dac-Man's environment, then navigate to the examples directory of Dac-Man's source code repository. Next, compare two directories for changes by running the dacman diff command with the directory paths as arguments:

cd dac-man/examples
dacman diff data/simple/v0 data/simple/v1

This comparison will produce output that lists the number of changes between the two folders in five categories. For more information about Dac-Man's output, refer to the Outputs subsection.

Added: 1, Deleted: 1, Modified: 1, Metadata-only: 0, Unchanged: 1
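If the summary needs to feed into a script, the per-category counts can be pulled out of that line. The following is a minimal sketch that assumes the summary is printed on a single line in exactly the format shown above; the helper name count_for is hypothetical and not part of Dac-Man.

```shell
# Summary line copied from the example output above.
summary='Added: 1, Deleted: 1, Modified: 1, Metadata-only: 0, Unchanged: 1'

# count_for CATEGORY: print the count for one category.
# Hypothetical helper; assumes "Name: N" entries separated by commas.
count_for() {
    printf '%s\n' "$summary" | tr ',' '\n' | sed 's/^ *//' |
        awk -F': ' -v name="$1" '$1 == name { print $2 }'
}

modified=$(count_for Modified)
metadata_only=$(count_for Metadata-only)
echo "modified=$modified metadata_only=$metadata_only"
```

In practice the summary variable would be filled from the captured output of dacman diff rather than from a literal string.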

Comparing specific files

To compare the contents of files for changes, run the dacman diff command with the --datachange option. The --script option lets you specify a particular change analysis script, in this case the standard Unix diff tool.

dacman diff data/simple/v0 data/simple/v1 --datachange --script /usr/bin/diff

This comparison will produce output that lists the number of changes between the two folders, followed by the specific changes to data values that the analysis script (in this case diff) found.

Added: 1, Deleted: 1, Modified: 1, Metadata-only: 0, Unchanged: 1
1c1
< foo
\ No newline at end of file
---
> hello
\ No newline at end of file
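To see where the diff portion of this output comes from, the analysis script can be run by hand on two small files. This sketch assumes, based only on the output above, that the modified file contains the single line foo in one version and hello in the other, both without a trailing newline; the file names old.txt and new.txt are made up for illustration.

```shell
# Work in a throwaway directory so nothing in the repository is touched.
tmpdir=$(mktemp -d)

# printf writes the text without a trailing newline.
printf 'foo' > "$tmpdir/old.txt"
printf 'hello' > "$tmpdir/new.txt"

# diff exits with status 1 when the files differ, so capture the status
# instead of letting it abort a script that runs under `set -e`.
diff_output=$(diff "$tmpdir/old.txt" "$tmpdir/new.txt") || diff_status=$?
printf '%s\n' "$diff_output"

rm -rf "$tmpdir"
```

The 1c1 line means that line 1 of the first file was changed into line 1 of the second; lines starting with < come from the first file and lines starting with > from the second.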

Using Dac-Man plug-ins to compare files

Dac-Man plug-ins allow you to analyze changes between file contents in a more specialized way, depending on the file type.

Enabling included plug-ins

Dac-Man comes with included plug-ins for CSV, HDF5, and JSON files.

To enable a particular plug-in, its required additional dependencies must be installed. Follow these steps to install dependencies for all of Dac-Man's included plug-ins.

Once a plug-in has been enabled, it will automatically be used by Dac-Man when comparing files of the supported type.

Using the CSV plug-in

When dependencies are installed, Dac-Man's CSV plug-in will be used automatically when comparing CSV files.

The examples/data/csv directory contains the two example files A.csv and B.csv.

To test the Dac-Man CSV plug-in with these two files, run this command from the examples directory:

dacman diff data/csv/A.csv data/csv/B.csv

Using the HDF5 plug-in

When dependencies are installed, Dac-Man's HDF5 plug-in will be used automatically when comparing HDF5 files.

The examples/data/hdf5 directory contains the two example files A.h5 and B.h5.

To test the Dac-Man HDF5 plug-in with these two files, run this command from the examples directory:

dacman diff data/hdf5/A.h5 data/hdf5/B.h5
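Both single-file examples can be run in one pass. The loop below is a sketch that uses only the file paths named in this guide and skips a pair when the files are not found, which usually means the command was not started from the examples directory; the helper name plugin_example_pairs is hypothetical.

```shell
# Print each example file pair named in this guide, one pair per line.
plugin_example_pairs() {
    printf '%s\n' \
        'data/csv/A.csv data/csv/B.csv' \
        'data/hdf5/A.h5 data/hdf5/B.h5'
}

plugin_example_pairs | while read -r old new; do
    if [ -f "$old" ] && [ -f "$new" ]; then
        # Redirect stdin so the command cannot consume the remaining pairs.
        dacman diff "$old" "$new" < /dev/null
    else
        echo "skipping $old vs $new: files not found" >&2
    fi
done
```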

Using plug-ins when comparing entire directories

Plug-ins are also supported when using Dac-Man to compare entire directories with the --datachange option. When Dac-Man detects a modification in a file of a supported type, it will automatically choose the corresponding plug-in to perform the comparison.

The examples/data/plugin_test directory contains the two sub-directories v0 and v1, containing multiple files of the types supported by the included plug-ins.

To test the included plug-ins, after installing the dependencies, run this command from the examples directory:

dacman diff data/plugin_test/v0 data/plugin_test/v1 --datachange
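Because this command depends on an activated environment and on the current working directory, a small wrapper can check both before running the comparison. This is a sketch, not part of Dac-Man: the function name run_plugin_diff and its exit codes are chosen for illustration only.

```shell
# run_plugin_diff OLD_DIR NEW_DIR
# Hypothetical helper: checks preconditions, then runs the comparison
# exactly as shown in this guide.
run_plugin_diff() {
    old_dir=$1
    new_dir=$2
    if ! command -v dacman >/dev/null 2>&1; then
        echo "dacman not found: activate Dac-Man's environment first" >&2
        return 127
    fi
    for dir in "$old_dir" "$new_dir"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir (run from the examples directory)" >&2
            return 2
        fi
    done
    dacman diff "$old_dir" "$new_dir" --datachange
}

run_plugin_diff data/plugin_test/v0 data/plugin_test/v1 ||
    echo "comparison did not run or reported an error" >&2
```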

Additional information

More detailed information about Dac-Man's features and functionality can be found in the Reference section of the documentation.

For more information on Dac-Man's plug-in framework, refer to these sections of the documentation:

Running Dac-Man in Binder

With Binder you can try Dac-Man in a temporary cloud environment without needing to install on your local machine.

Clicking the Binder badge will create a Dac-Man environment on Binder and open a web terminal.

Note

It might take a few minutes for Binder to set up the environment.

Note

Sample data for trying out Dac-Man's features will be available in the examples directory once the Binder environment is running.

Warning

After a period of inactivity, Binder environments (along with all data added by users during the session) are automatically and irreversibly deactivated and destroyed.

Binder Limitations

Although a valuable tool for distributing test environments, Binder's mode of operation might make it unsuitable for certain workflows and/or datasets. Make sure to read the documentation carefully, in particular the usage guidelines and the user privacy policy.

Outside of a quick evaluation, the recommended way of using Dac-Man is to install it in the user's own computing environment.