Dac-Man: A Framework to Track, Compare and Analyze Large Scientific Data Changes
The Dac-Man (Data Change Management) framework allows users to efficiently identify, track and manage changes in scientific datasets, along with the associated provenance.
Features
Dac-Man's key features include:
- HPC support. Dac-Man provides MPI support for parallel change capture in HPC environments, allowing users to find changes without necessarily moving the datasets to a common location
- Extensible. Users can plug in their own scripts to calculate changes
- Support for remote comparison. Changes in datasets can be identified and compared across systems without moving data to a single location
- Flexible command-line options. Dac-Man provides different options to configure change detection
- Detailed output. Dac-Man output details the types and amounts of change detected
- Customizable logging. Users can customize where and what to log, including detailed steps in the change capture process
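To make the idea of change capture concrete, the following is a minimal sketch in plain Python of comparing two versions of a directory tree. It is not Dac-Man's API or implementation: the function names `index_tree` and `compare_indexes`, the use of SHA-256 digests, and the four change categories are all assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path


def index_tree(root):
    """Map each file path (relative to root) to a SHA-256 digest of its contents.

    Hypothetical helper for illustration; not part of Dac-Man.
    """
    root = Path(root)
    index = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            index[path.relative_to(root).as_posix()] = digest
    return index


def compare_indexes(old, new):
    """Classify paths as added, deleted, modified, or unchanged.

    The category names are assumptions, not Dac-Man's output format.
    """
    old_paths, new_paths = set(old), set(new)
    common = old_paths & new_paths
    return {
        "added": sorted(new_paths - old_paths),
        "deleted": sorted(old_paths - new_paths),
        "modified": sorted(p for p in common if old[p] != new[p]),
        "unchanged": sorted(p for p in common if old[p] == new[p]),
    }
```

Because `compare_indexes` works only on the small path-to-digest mappings, each index could in principle be built on the system where the data lives and only the indexes exchanged, which is one way a comparison can avoid moving the datasets themselves.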
Getting Started
Dac-Man is developed using Python 3, and distributed under the "new" or "revised" BSD license. To get up and running with Dac-Man, install the tool following the installation instructions, then use the Quickstart to try out features.