idaes.dmf package

IDAES Data Management Framework (DMF)

The DMF lets you save, search, and retrieve provenance related to your models.

This package is documented with Sphinx. To build the documentation, change to the ‘docs’ directory and run, e.g., ‘make html’.

Submodules

idaes.dmf.commands module

Perform all command logic, input, and output that is particular to the CLI.

Call functions defined in ‘api’ module to handle logic that is common to the API and CLI.

idaes.dmf.commands.cat_resources(path, objects=(), color=True)[source]
idaes.dmf.commands.init_conf(workspace)[source]

Initialize the workspace.

idaes.dmf.commands.list_resources(path, long_format=None, relations=False)[source]

List resources in a given DMF workspace.

Parameters:
  • path (str) – Path to the workspace
  • long_format (bool) – If True, list in long format
  • relations (bool) – Show relationships, in long format
Returns:

None

idaes.dmf.commands.list_workspaces(root, stream=None)[source]

List workspaces found from a given root path.

Parameters:
  • root – root path
  • stream – Output stream (must have .write() method)
idaes.dmf.commands.workspace_import(path, patterns, exit_on_error)[source]

Import files into workspace.

Parameters:
  • path (str) – Target workspace directory
  • patterns (list) – List of Unix-style glob patterns for files to import. Files are expected to be resource JSON or Jupyter Notebooks.
  • exit_on_error (bool) – If False, continue trying to import resources even if one or more fail.
Returns:

Number of things imported

Return type:

int

Raises:

BadResourceError – if there is a problem

idaes.dmf.commands.workspace_info(dirname)[source]
idaes.dmf.commands.workspace_init(dirname, metadata)[source]

Initialize from root at dirname, set environment variable for other commands, and parse config file.

idaes.dmf.dmf module

Data Management Framework

class idaes.dmf.dmf.DMF(path='', name=None, desc=None, **ws_kwargs)[source]

Bases: idaes.dmf.workspace.Workspace, traitlets.traitlets.HasTraits

Data Management Framework (DMF).

Expected usage is to instantiate this class once, and then use it for storing, searching, and retrieving resources required for the given analysis.

For details on the configuration files used by the DMF, see documentation for DMFConfig (global configuration) and idaes.dmf.workspace.Workspace.

CONF_DATA_DIR = 'datafile_dir'
CONF_DB_FILE = 'db_file'
CONF_HELP_PATH = 'htmldocs'
add(rsrc)[source]

Add a resource and associated files.

Parameters:rsrc (resource.Resource) – The resource
Returns:(str) Resource ID
Raises:DMFError, DuplicateResourceError
count()[source]
datafile_dir

A trait for unicode strings.

db_file

A trait for unicode strings.

fetch_many(rid_list)[source]

Fetch multiple resources, by their identifiers.

Parameters:rid_list (list) – List of integer resource identifiers
Returns:(list of resource.Resource) List of found resources (may be empty)
fetch_one(rid)[source]

Fetch one resource, from its identifier.

Parameters:rid (int) – Resource identifier
Returns:(resource.Resource) The found resource, or None if no match
find(filter_dict=None, id_only=False)[source]

Find and return resources matching the filter.

The filter syntax is a subset of the MongoDB filter syntax. This means that it is represented as a dictionary, where each key is an attribute or nested attribute name, and each value is the value against which to match. There are four possible types of values:

  1. scalar string or number (int, float): Match resources that have this exact value for the given attribute.

  2. date, as datetime.datetime or pendulum.Pendulum instance: Match resources that have this exact date for the given attribute.

  3. list: Match resources that have a list value for this attribute, and for which any of the values in the provided list are in the resource’s corresponding value. If a ‘!’ is appended to the key name, then this will be interpreted as a directive to only match resources for which all values in the provided list are present.

  4. dict: This is an inequality, with one or more key/value pairs. The key is the type of inequality and the value is the numeric value for that range. All keys begin with ‘$’. The possible inequalities are:

    • “$lt”: Less than (<)
    • “$le”: Less than or equal (<=)
    • “$gt”: Greater than (>)
    • “$ge”: Greater than or equal (>=)
    • “$ne”: Not equal to (!=)
Parameters:
  • filter_dict (dict) – Search filter.
  • id_only (bool) – If true, return only the identifier of each resource; otherwise a Resource object is returned.
Returns:

(list of int|Resource) Depending on the value of id_only.
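The filter semantics described above can be illustrated with a small, self-contained matcher. This is a sketch of the documented subset, not the DMF's actual implementation, and the function and record names are hypothetical:

```python
import operator

# Inequality operators for the "$..." keys described above
OPS = {"$lt": operator.lt, "$le": operator.le, "$gt": operator.gt,
       "$ge": operator.ge, "$ne": operator.ne}

def get_nested(record, dotted_key):
    """Follow a dotted key like 'a.b' into nested dicts."""
    for part in dotted_key.split("."):
        record = record[part]
    return record

def matches(record, filter_dict):
    """Return True if `record` satisfies every condition in `filter_dict`."""
    for key, cond in filter_dict.items():
        require_all = key.endswith("!")  # '!' suffix: all listed values must be present
        try:
            value = get_nested(record, key.rstrip("!"))
        except KeyError:
            return False
        if isinstance(cond, dict):        # inequality, e.g. {"$ge": 2}
            if not all(OPS[op](value, bound) for op, bound in cond.items()):
                return False
        elif isinstance(cond, list):      # list membership: any, or all with '!'
            if not (all if require_all else any)(v in value for v in cond):
                return False
        elif value != cond:               # exact scalar match
            return False
    return True
```

For example, matches({"tags": ["x", "y"]}, {"tags": ["x", "z"]}) is true (any of the listed values suffices), while the key "tags!" with the same list would require both values to be present.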

find_related(rsrc, filter_dict, maxdepth, meta, outgoing)[source]

Find related resources.

Parameters:
  • rsrc (resource.Resource) – Resource starting point
  • filter_dict (dict) – See parameter of same name in find().
  • maxdepth (int) – Maximum depth of search (starts at 1)
  • meta (List[str]) – Metadata fields to extract for meta part
  • outgoing (bool) – If True, look at outgoing relations. Otherwise look at incoming relations. e.g. if A ‘uses’ B and if True, would find B starting from A. If False, would find A starting from B.
Returns:

Generates triples (depth, Triple, meta), where the depth is an integer (starting at 1), the Triple is a simple namedtuple wrapping (subject, predicate, object), and meta is a dict of metadata for the endpoint of the relation (the object if outgoing=True, the subject if outgoing=False) for the fields provided in the meta parameter.

Raises:

NoSuchResourceError – if the starting resource is not found

remove(identifier=None, filter_dict=None, update_relations=True)[source]

Remove one or more resources, by identifier or by filter. Unless told otherwise, this method also scans the DB and removes all relations that involve the removed resource.

Parameters:
  • identifier (int) – Identifier in id_ attribute of a resource.
  • filter_dict (dict) – Filter to use instead of identifier
  • update_relations (bool) – If True (the default), scan the DB and remove all relations that involve this identifier.
update(rsrc, sync_relations=False, upsert=False)[source]

Update/insert stored resource.

Parameters:
  • rsrc (resource.Resource) – Resource instance
  • sync_relations (bool) – If True, and if resource exists in the DB, then the “relations” attribute of the provided resource will be changed to the stored value.
  • upsert (bool) – If true, and the resource is not in the DMF, then insert it. If false, and the resource is not in the DMF, then do nothing.
Returns:

True if the resource was updated or added, False if nothing was done.

Return type:

bool

Raises:

errors.DMFError – If the input resource was invalid.

class idaes.dmf.dmf.DMFConfig(defaults=None)[source]

Bases: object

Global DMF configuration.

Every time you create an instance of the DMF, or run a dmf command on the command-line, the library opens the global DMF configuration file to figure out the default workspace (and, eventually, other values).

The default location for this configuration file is “~/.dmf”, i.e. the file named “.dmf” in the user’s home directory. This can be modified programmatically by changing the “filename” attribute of this class.

The contents of .dmf are formatted as YAML, with the following keys defined:

workspace
Path to the default workspace directory.

An example file is shown below:

{workspace: /tmp/newdir}
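Reading that file can be sketched as follows. The real DMF presumably uses a full YAML parser; this stand-in handles only the single flow-mapping form shown above, and the function name is hypothetical:

```python
import re

def read_default_workspace(text):
    """Extract the 'workspace' value from a one-key YAML flow mapping
    like '{workspace: /tmp/newdir}'. A real implementation would use
    a YAML parser rather than a regex."""
    m = re.match(r"\{\s*workspace\s*:\s*(.+?)\s*\}\s*$", text)
    if m is None:
        raise ValueError("unrecognized configuration: %r" % text)
    return m.group(1)
```

Given the example file above, read_default_workspace("{workspace: /tmp/newdir}") yields the path "/tmp/newdir".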
DEFAULTS = {'workspace': '/home/ksb/Projects/IDAES/github/IDAES/idaes/doc'}
WORKSPACE = 'workspace'
filename = '/home/ksb/.dmf'
save()[source]
workspace
idaes.dmf.dmf.get_propertydb_table(rsrc)[source]

idaes.dmf.errors module

Exception classes.

exception idaes.dmf.errors.AlamoDisabledError[source]

Bases: idaes.dmf.errors.AlamoError

exception idaes.dmf.errors.AlamoError(msg)[source]

Bases: idaes.dmf.errors.DmfError

exception idaes.dmf.errors.BadResourceError[source]

Bases: idaes.dmf.errors.ResourceError

exception idaes.dmf.errors.CommandError(command, operation, details)[source]

Bases: Exception

exception idaes.dmf.errors.DMFBadWorkspaceError(path, why)[source]

Bases: idaes.dmf.errors.DMFError

exception idaes.dmf.errors.DMFError(detailed_error)[source]

Bases: Exception

exception idaes.dmf.errors.DMFWorkspaceNotFoundError(path)[source]

Bases: idaes.dmf.errors.DMFError

exception idaes.dmf.errors.DataFormatError(dtype, err)[source]

Bases: idaes.dmf.errors.DmfError

exception idaes.dmf.errors.DmfError[source]

Bases: Exception

exception idaes.dmf.errors.DuplicateResourceError(op, id_)[source]

Bases: idaes.dmf.errors.ResourceError

exception idaes.dmf.errors.FileError[source]

Bases: Exception

exception idaes.dmf.errors.InvalidRelationError(subj, pred, obj)[source]

Bases: idaes.dmf.errors.DmfError

exception idaes.dmf.errors.ModuleFormatError(module_name, type_, what)[source]

Bases: Exception

exception idaes.dmf.errors.NoSuchResourceError(name=None, id_=None)[source]

Bases: idaes.dmf.errors.ResourceError

exception idaes.dmf.errors.ParseError[source]

Bases: Exception

exception idaes.dmf.errors.ResourceError[source]

Bases: Exception

exception idaes.dmf.errors.SearchError(spec, problem)[source]

Bases: Exception

exception idaes.dmf.errors.WorkspaceConfMissingField(path, name, desc)[source]

Bases: idaes.dmf.errors.WorkspaceError

exception idaes.dmf.errors.WorkspaceConfNotFoundError(path)[source]

Bases: idaes.dmf.errors.WorkspaceError

exception idaes.dmf.errors.WorkspaceError[source]

Bases: Exception

exception idaes.dmf.errors.WorkspaceNotFoundError(from_dir)[source]

Bases: idaes.dmf.errors.WorkspaceError

idaes.dmf.experiment module

The ‘experiment’ is a root container for a coherent set of ‘resources’.

class idaes.dmf.experiment.Experiment(dmf, **kwargs)[source]

Bases: idaes.dmf.resource.Resource

An experiment groups resources in a way that makes sense to the user.

It is also a useful unit for passing as an argument to functions, since it has a standard ‘slot’ for the DMF instance that created it.

add(rsrc)[source]

Add a resource to an experiment.

This does two things:

  1. Establishes an “experiment” type of relationship between the new resource and the experiment.
  2. Adds the resource to the DMF
Parameters:rsrc (resource.Resource) – The resource to add.
Returns:Added (input) resource, for chaining calls.
Return type:resource.Resource
copy(**kwargs)[source]

Get a copy of this experiment. The returned object will have been added to the DMF.

Parameters:kwargs – Values to set in new instance after copying.
Returns:A (mostly deep) copy.

Note that the DMF instance is just a reference to the same object as in the original, and they will share state.

Return type:Experiment
dmf

Add and update a relation triple in the DMF.

Returns:

None

remove()[source]

Remove this experiment from the associated DMF instance.

update()[source]

Update experiment to current values.

idaes.dmf.help module

Find documentation for modules and classes in the generated Sphinx documentation and return its location.

idaes.dmf.help.find_html_docs(dmf, obj, **kw)[source]

Get one or more files with HTML documentation for the given object, in paths referred to by the dmf instance.

idaes.dmf.help.get_html_docs(dmf, module_, name, sphinx_version=(1, 5, 5))[source]

idaes.dmf.magics module

Jupyter magics for the DMF.

class idaes.dmf.magics.DmfMagics(shell)[source]

Bases: IPython.core.magic.Magics

NEED_INIT_CMD = {'help': '+', 'info': '*'}
dmf(line)[source]

DMF outer command

dmf_help(*names)[source]

Provide help on IDAES objects and classes.

Invoking with no arguments gives general help. Invoking with one or more arguments looks for help in the docs on the given objects or classes.

dmf_info(*topics)[source]

Provide information about DMF current state for whatever ‘topics’ are provided. With no topic, provide general information about the configuration.

Parameters:topics ((List[str])) – List of topics
Returns:None
dmf_init(path, *extra)[source]

Initialize DMF (do this before most other commands).

Parameters:path (str) – Full path to DMF home
dmf_list()[source]

List resources in the current workspace.

dmf_workspaces(*paths)[source]

List DMF workspaces.

Parameters:paths (List[str]) – Paths to search, use “.” by default
idaes(line)[source]

%idaes magic

idaes_help(*names)[source]

Provide help on IDAES objects and classes.

Invoking with no arguments gives general help. Invoking with one or more arguments looks for help in the docs on the given objects or classes.

magics = {'cell': {}, 'line': {'dmf': 'dmf', 'idaes': 'idaes'}}
registered = True
idaes.dmf.magics.register()[source]

idaes.dmf.propdata module

Property data types.

The ability to import from text files, etc., is part of the methods in each type.

Import property database from text file(s):
  • See PropertyData.from_csv() for the expected format for data.
  • See PropertyMetadata() for the expected format for metadata.

exception idaes.dmf.propdata.AddedCSVColumnError(names, how_bad, column_type='')[source]

Bases: KeyError

Error for :meth:PropertyData.add_csv()

class idaes.dmf.propdata.Fields[source]

Bases: idaes.dmf.tabular.Fields

Constants for fields.

C_PROP = 'property'
C_STATE = 'state'
class idaes.dmf.propdata.PropertyColumn(name, data)[source]

Bases: idaes.dmf.tabular.Column

Data column for a property.

data()[source]
type_name = 'Property'
class idaes.dmf.propdata.PropertyData(data)[source]

Bases: idaes.dmf.tabular.TabularData

Class representing property data that knows how to construct itself from a CSV file.

You can build objects from multiple CSV files as well. See the property database section of the API docs for details, or read the code in add_csv() and the tests in idaes_dmf.propdb.tests.test_mergecsv.

add_csv(file_or_path, strict=False)[source]

Add to existing object from a new CSV file.

Depending on the value of the strict argument (see below), the new file may or may not have the same properties as the object – but it always needs to have the same number of state columns, and in the same order.

Note

Data that is “missing” because of property columns in one CSV and not the other will be filled with float(nan) values.

Parameters:
  • file_or_path (file or str) – Input file. This should be in exactly the same format as expected by :meth:from_csv().
  • strict (bool) – If true, require that the columns in the input CSV match columns in this object. Otherwise, only require that state columns in input CSV match columns in this object. New property columns are added, and matches to existing property columns will append the data.
Raises:

AddedCSVColumnError – If the new CSV column headers are not the same as the ones in this object.

Returns:

(int) Number of added rows

as_arr(states=True)[source]

Export property data as arrays.

Parameters:states (bool) – If False, exclude “state” data, e.g. the ambient temperature, and only include measured property values.
Returns:(values[M,N], errors[M,N]) Two arrays of floats, each with M columns having N values.
Raises:ValueError – if the columns are not all the same length
embedded_units = '(.*)\\((.*)\\)'
errors_dataframe(states=False)[source]

Get errors as a dataframe.

Parameters:states (bool) – If False, exclude state data. This is the default, because states do not normally have associated error information.
Returns:Pandas dataframe for values.
Return type:pd.DataFrame
Raises:ImportError – If pandas or numpy were never successfully imported.
static from_csv(file_or_path, nstates=0)[source]

Import the CSV data.

Expected format of the files is a header plus data rows.

Header: Index-column, Column-name(1), Error-column(1), Column-name(2), Error-column(2), ..

Data: <index>, <val>, <errval>, <val>, <errval>, ..

Column-name is in the format “Name (units)”

Error-column is in the format “<type> Error”, where “<type>” is the error type.

Parameters:
  • file_or_path (file-like or str) – Input file
  • nstates (int) – Number of state columns, appearing first before property columns.
Returns:

New properties instance

Return type:

PropertyData
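A hypothetical file in the format described above (one index column, then alternating value and error columns), with the column names and units pulled out of the headers using the "Name (units)" pattern from the embedded_units attribute shown earlier. This is an illustration of the described layout, not the library's parsing code:

```python
import csv
import io
import re

# Pattern from PropertyData.embedded_units: "Name (units)"
EMBEDDED_UNITS = re.compile(r"(.*)\((.*)\)")

# Hypothetical data: one state column (T) and one property column (Density)
raw = """\
idx,T (K),Absolute Error,Density (kg/m3),Absolute Error
0,298.15,0.01,998.2,0.1
1,313.15,0.01,992.2,0.1
"""

rows = list(csv.reader(io.StringIO(raw)))
header, data = rows[0], rows[1:]

# Value columns alternate with error columns after the index column
names, units = [], []
for col in header[1::2]:
    m = EMBEDDED_UNITS.match(col)
    names.append(m.group(1).strip())
    units.append(m.group(2))
```

With nstates=1, the first value column ("T") would be treated as a state column and the rest as property columns.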

is_property_column(index)[source]

Whether given column is a property. See is_state_column().

is_state_column(index)[source]

Whether given column is state.

Parameters:index (int) – Index of column
Returns:(bool) Whether the column at the given index is a state column.
Raises:IndexError – No column at that index.
names(states=True, properties=True)[source]

Get column names.

Parameters:
  • states (bool) – If False, exclude “state” data, e.g. the ambient temperature, and only include measured property values.
  • properties (bool) – If False, exclude property data
Returns:

List of column names.

Return type:

list[str]

properties
states
values_dataframe(states=True)[source]

Get values as a dataframe.

Parameters:states (bool) – see names().
Returns:(pd.DataFrame) Pandas dataframe for values.
Raises:ImportError – If pandas or numpy were never successfully imported.
class idaes.dmf.propdata.PropertyMetadata(values=None)[source]

Bases: idaes.dmf.tabular.Metadata

Class to import property metadata.

class idaes.dmf.propdata.PropertyTable(data=None, metadata=None)[source]

Bases: idaes.dmf.tabular.Table

Property data and metadata together (at last!)

static load(file_or_path, validate=True)[source]

Create PropertyTable from JSON input.

Parameters:
  • file_or_path (file or str) – Filename or file object from which to read the JSON-formatted data.
  • validate (bool) – If true, apply validation to input JSON data.

Example input:

{
    "meta": [
        {"datatype": "MEA",
         "info": "J. Chem. Eng. Data, 2009, Vol 54, pg. 306-310",
         "notes": "r is MEA weight fraction in aqueous soln.",
         "authors": "Amundsen, T.G., Lars, E.O., Eimer, D.A.",
         "title": "Density and Viscosity of ..."}
    ],
    "data": [
        {"name": "Viscosity Value",
         "units": "mPa-s",
         "values": [2.6, 6.2],
         "error_type": "absolute",
         "errors": [0.06, 0.004],
         "type": "property"},
        {"name": "r",
         "units": "",
         "values": [0.2, 1000],
         "type": "state"}
    ]
}
class idaes.dmf.propdata.StateColumn(name, data)[source]

Bases: idaes.dmf.tabular.Column

Data column for a state.

data()[source]
type_name = 'State'
idaes.dmf.propdata.convert_csv(meta_csv, datatype, data_csv, nstates, output)[source]

idaes.dmf.resource module

Resource representations.

class idaes.dmf.resource.Code(*args, **kwargs)[source]

Bases: idaes.dmf.resource.TraitContainer

Some source code, such as a Python module or C file.

This can also refer to packages or entire Git repositories.

desc

Description of the code

idhash

Git or other unique hash

language

Programming language, e.g. “Python” (the default).

location

File path or URL location for the code

name

Name of the code object, e.g. Python module name

release

Version of the release, default is ‘0.0.0’

type

Type of code resource, must be one of – ‘method’, ‘function’, ‘module’, ‘class’, ‘file’, ‘package’, ‘repository’, or ‘notebook’.

class idaes.dmf.resource.Contact(*args, **kwargs)[source]

Bases: idaes.dmf.resource.TraitContainer

Person who can be contacted.

email

Email of the contact

name

Name of the contact

class idaes.dmf.resource.DateTime(default_value=traitlets.Undefined, allow_none=False, read_only=None, help=None, config=None, **kwargs)[source]

Bases: traitlets.traitlets.TraitType

A trait type for a datetime.

Input can be a string, float, or tuple. Specifically:
  • string, ISO8601: YYYY[-MM-DD[Thh:mm:ss[.uuuuuu]]]
  • float: seconds since Unix epoch (1/1/1970)
  • tuple: format accepted by datetime.datetime()

No matter the input, validation will transform it into a floating point number, since this is the easiest form to store and search.
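That normalization can be sketched with the standard library. This is an illustration, not the trait's actual code: the helper name is hypothetical, and it assumes UTC for naive inputs, which the text above does not specify:

```python
from datetime import datetime, timezone

def to_epoch(value):
    """Normalize the three accepted input forms to seconds since the Unix epoch."""
    if isinstance(value, float):
        return value                      # already epoch seconds
    if isinstance(value, tuple):
        # Tuple form accepted by datetime.datetime()
        return datetime(*value, tzinfo=timezone.utc).timestamp()
    # Partial ISO8601: YYYY[-MM-DD[Thh:mm:ss[.uuuuuu]]]
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d", "%Y"):
        try:
            dt = datetime.strptime(value, fmt)
            return dt.replace(tzinfo=timezone.utc).timestamp()
        except ValueError:
            continue
    raise ValueError("not a datetime: %r" % (value,))
```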

default_value = 0
info_text = 'a datetime'
classmethod isoformat(ts)[source]
validate(obj, value)[source]
class idaes.dmf.resource.FilePath(tempfile=False, copy=True, **kwargs)[source]

Bases: idaes.dmf.resource.TraitContainer

Path to a file, plus optional description and metadata.

So that the DMF does not break when data files are moved or copied, the default is to copy the datafile into the DMF workspace. This behavior can be controlled by the copy and tempfile keywords to the constructor.

For example, if you have a big file you do NOT want to copy when you create the resource:

FilePath(path='/my/big.file', desc='100GB file', copy=False)

On the other hand, if you have a file that you want the DMF to manage entirely:

FilePath(path='/some/file.txt', desc='a file', tempfile=True)
CSV_MIMETYPE = 'text/csv'
desc

Description of the file’s contents

do_copy
fullpath
is_tmp
metadata

Metadata to associate with the file

mimetype

MIME type

open(mode='r')[source]
path

Path to file

read(*args)[source]
root
subdir

Unique subdir

class idaes.dmf.resource.FlowsheetResource(*args, **kwargs)[source]

Bases: idaes.dmf.resource.Resource

Flowsheet resource & factory.

classmethod from_flowsheet(obj, **kw)[source]
class idaes.dmf.resource.Identifier(default_value=traitlets.Undefined, allow_none=False, read_only=None, help=None, config=None, **kwargs)[source]

Bases: traitlets.traitlets.TraitType

Unique identifier.

Will set itself automatically to a 32-character unique hex string. Can only be set to strings.

default_value = None
expr = re.compile('[0-9a-f]{32}')
info_text = 'Unique identifier'
validate(obj, value)[source]
class idaes.dmf.resource.PropertyDataResource(property_table=None, **kwargs)[source]

Bases: idaes.dmf.resource.TabularDataResource

Property data resource & factory.

idaes.dmf.resource.R_DERIVED = 'derived'

Constants for RelationType predicates

class idaes.dmf.resource.RelationType(default_value=traitlets.Undefined, allow_none=False, read_only=None, help=None, config=None, **kwargs)[source]

Bases: traitlets.traitlets.TraitType

Traitlets type for RDF-style triples relating resources to each other.

Predicates = {'version', 'contains', 'derived', 'uses'}
info_text = 'triple of (subject-id, predicate, object-id), all strings, with a predicate in {version, contains, derived, uses}'
validate(obj, value)[source]
class idaes.dmf.resource.Resource(*args, **kwargs)[source]

Bases: idaes.dmf.resource.TraitContainer

A dynamically typed resource.

Resources have metadata (the same for all resources) and a type-specific “data” section (unique to that type of resource).

ID_FIELD = 'id_'
TYPE_FIELD = 'type'
aliases

List of aliases for the resource

codes

List of code objects (including repositories and packages) associated with the resource. Each value is a Code.

collaborators

List of other people involved. Each value is a Contact.

copy(**kwargs)[source]

Get a copy of this Resource.

As a convenience, optionally set some attributes in the copy.
Parameters:kwargs

Attributes to set in new instance after copying.

Returns:
Resource: A deep copy.

The copy will have an empty (zero) identifier and a new unique value for uuid. The relations are not copied.

static create_relation(subj, pred, obj)[source]

Create a relationship between two Resource instances.

Parameters:
Returns:

None

Raises:

TypeError – if subject & object are not Resource instances.

created

Date and time when the resource was created. This defaults to the time when the object was created. Value is a DateTime.

creator

Creator of the resource. Value is a Contact.

data

An instance of a Python dict.

datafiles

List of data files associated with the resource. Each value is a FilePath.

datafiles_dir

Datafiles subdirectory (single directory name)

desc

Description of the resource

help(name)[source]

Return descriptive ‘help’ for the given attribute.

Parameters:name (str) – Name of attribute
Returns:Help string, or error starting with “Error: “
Return type:str
id_

Integer identifier for this Resource. You should not set this yourself. The value will be automatically overwritten with the database’s value when the resource is added to the DMF (with the .add() method).

modified

Date and time the resource was last modified. This defaults to the time when the object was created. Value is a DateTime.

name

Human-readable name for the resource (optional)

property_table

For property data resources, this property builds and returns a PropertyTable object.

Returns:
A representation of metadata and data
in this resource.
Return type:propdata.PropertyTable
Raises:TypeError – if this resource is not of the correct type.
relations

List of relations (RDF-style triples) involving this resource.

sources

Sources from which resource is derived, i.e. its provenance. Each value is a Source.

table

For tabular data resources, this property builds and returns a Table object.

Returns:
A representation of metadata and data in this resource.
Return type:tabular.Table
Raises:TypeError – if this resource is not of the correct type.
tags

List of tags for the resource

type

Type of this Resource. See ResourceTypes for standard values for this attribute.

uuid

Universal identifier for this resource

version

Version of the resource. Value is a SemanticVersion.

class idaes.dmf.resource.ResourceTypes[source]

Bases: object

Standard resource type names.

Use these as opaque constants to indicate standard resource types. For example, when creating a Resource:

rsrc = Resource(type=ResourceTypes.property_data, ...)
data = 'data'

Data (e.g. result data)

experiment = 'experiment'

Experiment

fs = 'flowsheet'

Flowsheet resource.

jupyter = 'notebook'
jupyter_nb = 'notebook'
nb = 'notebook'

Jupyter Notebook

property_data = 'propertydb'

Property data resource, e.g. the contents are created via classes in the idaes.dmf.propdata module.

python = 'python'

Python code

surrmod = 'surrogate_model'

Surrogate model

tabular_data = 'tabular_data'

Tabular data

xp = 'experiment'
class idaes.dmf.resource.SemanticVersion(default_value=traitlets.Undefined, allow_none=False, read_only=None, help=None, config=None, **kwargs)[source]

Bases: traitlets.traitlets.TraitType

Semantic version.

Three numeric identifiers, separated by a dot. Trailing non-numeric characters allowed.

Inputs, string or tuple, may have fewer than three numeric identifiers, but internally the value will be padded with zeros to always be of length four.

A leading dash or underscore in the trailing non-numeric characters is removed.

Some examples:

  • 1 => valid => (1, 0, 0, ‘’)
  • rc3 => invalid: no number
  • 1.1 => valid => (1, 1, 0, ‘’)
  • 1a => valid => (1, 0, 0, ‘a’)
  • 1.a.1 => invalid: non-numeric can only go at end
  • 1.12.1 => valid => (1, 12, 1, ‘’)
  • 1.12.13-1 => valid => (1, 12, 13, ‘1’)
  • 1.12.13.x => invalid: too many parts
default_value = (0, 0, 0, '')
info_text = 'semantic version major, minor, patch, & modifier'
classmethod pretty(values)[source]
validate(obj, value)[source]
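The rules and examples above can be sketched as a small parser. This is an illustration of the stated behavior, not the trait's actual code, and the function name is hypothetical:

```python
import re

# Leading number, up to two more dotted numbers, then anything else
_VER = re.compile(r"^(\d+)(?:\.(\d+))?(?:\.(\d+))?(.*)$")

def parse_semver(text):
    """Parse a version string into (major, minor, patch, modifier),
    zero-padding missing parts and stripping one leading '-' or '_'
    from the modifier, per the rules described above."""
    m = _VER.match(text)
    if m is None:
        raise ValueError("no leading number: %r" % text)
    major, minor, patch, mod = m.groups()
    if mod.startswith("."):               # e.g. "1.12.13.x" or "1.a.1": too many parts
        raise ValueError("bad version: %r" % text)
    if mod[:1] in ("-", "_"):             # remove a single leading dash or underscore
        mod = mod[1:]
    return (int(major), int(minor or 0), int(patch or 0), mod)
```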
class idaes.dmf.resource.Source(*args, **kwargs)[source]

Bases: idaes.dmf.resource.TraitContainer

A work from which the resource is derived.

date

Date associated with resource

doi

Digital object identifier

isbn

ISBN

language

The primary language of the intellectual content of the resource

source

The work, either print or electronic, from which the resource was derived

class idaes.dmf.resource.TabularDataResource(table=None, **kwargs)[source]

Bases: idaes.dmf.resource.Resource

Tabular data resource & factory.

class idaes.dmf.resource.TraitContainer(*args, **kwargs)[source]

Bases: traitlets.traitlets.HasTraits

Base class for Resource, that knows how to serialize and parse its traits.

as_dict()[source]
classmethod from_dict(d)[source]
class idaes.dmf.resource.Triple(subject, predicate, object)

Bases: tuple

Provide attribute access to an RDF subject, predicate, object triple

object

Alias for field number 2

predicate

Alias for field number 1

subject

Alias for field number 0
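The Triple above behaves like any namedtuple; an equivalent stand-alone definition (matching the field aliases listed above) illustrates the attribute/index correspondence:

```python
from collections import namedtuple

# Equivalent stand-alone definition; field order matches the aliases above
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

t = Triple(subject="experiment-1", predicate="contains", object="resource-42")
```

Here t.predicate and t[1] are the same value, and being a tuple, it unpacks directly: subj, pred, obj = t.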

class idaes.dmf.resource.ValidatingList(*args, **kwargs)[source]

Bases: traitlets.traitlets.List

Validate values in a list as belonging to a given TraitType.

This can be used in place of the Traitlets.List class.

validate_elements(obj, value=None)[source]

This is called when the initial value is set.

class idaes.dmf.resource.Version(*args, **kwargs)[source]

Bases: idaes.dmf.resource.TraitContainer

Version of something (code, usually).

created

When this version was created. Default “empty”, which is encoded as the start of Unix epoch (1970/01/01).

name

Name given to version

revision

Revision, e.g. 1.0.0rc3

idaes.dmf.resource.get_resource_structure()[source]

idaes.dmf.resourcedb module

Resource database.

class idaes.dmf.resourcedb.ResourceDB(dbfile=None, connection=None)[source]

Bases: object

A database interface to all the resources within a given DMF workspace.

delete(id_=None, idlist=None, filter_dict=None)[source]

Delete one or more resources with given identifiers.

Parameters:
  • id (int) – If given, delete this id.
  • idlist (list) – If given, delete ids in this list
  • filter_dict (dict) – If given, perform a search and delete ids it finds.
Returns:

(list[str]) Identifiers

find(filter_dict, id_only=False)[source]

Find and return records based on the provided filter.

Parameters:
  • filter_dict (dict) – Search filter. For syntax, see docs in dmf.DMF.find().
  • id_only (bool) – If true, return only the identifier of each resource; otherwise a Resource object is returned.
Returns:

(list of int|Resource) Depending on the value of id_only

find_related(rsrc_id, filter_dict, outgoing, maxdepth, meta)[source]

Find all resources connected to the identified one.

Parameters:
  • rsrc_id
  • filter_dict
  • outgoing
  • maxdepth
  • meta (List[str]) –
Returns:

Generator of (depth, relation, metadata)

Raises:

KeyError – if the resource is not found.

get(identifier)[source]
put(resource)[source]
update(id_, new_dict)[source]

Update the identified resource with new values.

Parameters:
  • id (int) – Identifier of resource to update
  • new_dict (dict) – New dictionary of resource values, e.g. result of Resource.as_dict().
Returns:

The id_ of the resource, or None if it was not updated.

Return type:

int

Raises:

ValueError – If new resource is of wrong type

idaes.dmf.surrmod module

Surrogate modeling helper classes and functions. This is used to run ALAMO on property data.

class idaes.dmf.surrmod.SurrogateModel(experiment, **kwargs)[source]

Bases: object

Run ALAMO to generate surrogate models.

Automatically track the objects in the DMF.

Example:

model = SurrogateModel(dmf, simulator='linsim.py')
rsrc = dmf.fetch_one(1) # get resource ID 1
data = rsrc.property_table.data
model.set_input_data(data, ['temp'], 'density')
results = model.run()
PARAM_DATA_KEY = 'parameters'

Key in resource ‘data’ for params

run(**kwargs)[source]

Run ALAMO.

Parameters:**kwargs – Additional arguments merged with those passed to the class constructor. Any duplicate values will override the earlier ones.
Returns:The dictionary returned from alamopy.doalamo()
Return type:dict
set_input_data(data, x_colnames, z_colname)[source]

Set input from provided dataframe or property data.

Parameters:
  • data (PropertyData|pandas.DataFrame) – Input data
  • x_colnames (List[str]|str) – One or more column names for parameters
  • z_colname (str) – Column for response variable
Returns:

None

Raises:

KeyError – if columns are not found in data

set_input_data_np(x, z, xlabels=None, zlabel='z')[source]

Set input data from numpy arrays.

Parameters:
  • x (arr) – Numpy array with parameters
  • xlabels (List[str]) – List of labels for x
  • zlabel (str) – Label for z
  • z (arr) – Numpy array with response variables
Returns:

None

set_validation_data(data, x_colnames, z_colname)[source]

Set validation data from provided data.

Parameters:
  • data (PropertyData|pandas.DataFrame) – Input data
  • x_colnames (List[str]|str) – One or more column names for parameters
  • z_colname (str) – Column for response variable
Returns:

None

Raises:

KeyError – if columns are not found in data

set_validation_data_np(x, z, xlabels=None, zlabel='z')[source]

Set validation data from numpy arrays.

Parameters:
  • x (arr) – Numpy array with parameters
  • xlabels (List[str]) – List of labels for x
  • zlabel (str) – Label for z
  • z (arr) – Numpy array with response variables
Returns:

None

idaes.dmf.tabular module

Tabular data handling

class idaes.dmf.tabular.Column(name, data)[source]

Bases: object

Generic, abstract column

data()[source]
type_name = 'generic'
class idaes.dmf.tabular.Fields[source]

Bases: object

Constants for field names.

AUTH = 'authors'
COLTYPE = 'type'
DATA = 'data'
DATA_ERRORS = 'errors'
DATA_ERRTYPE = 'error_type'
DATA_NAME = 'name'

Keys for data mapping

DATA_UNITS = 'units'
DATA_VALUES = 'values'
DATE = 'date'
DTYPE = 'datatype'
INFO = 'info'
META = 'meta'
ROWS = 'rows'
TITLE = 'title'
VALS = 'values'
class idaes.dmf.tabular.Metadata(values=None)[source]

Bases: object

Class to import metadata.

as_dict()[source]
author

Publication author(s).

datatype
date

Publication date

static from_csv(file_or_path)[source]

Import metadata from simple text format.

Example input:

Source,Han, J., Jin, J., Eimer, D.A., Melaaen, M.C.,"Density of Water(1) + Monoethanolamine(2) + CO2(3) from (298.15 to 413.15) K and Surface Tension of Water(1) + Monoethanolamine(2) from (303.15 to 333.15) K", J. Chem. Eng. Data, 2012, Vol. 57, pg. 1095-1103"
Retrieval,"J. Morgan, date unknown"
Notes,r is MEA weight fraction in aqueous soln. (CO2-free basis)
Parameters:file_or_path (str or file) – Input file
Returns:(PropertyMetadata) New instance
info

Publication venue, etc.

line_expr = re.compile('\\s*(\\w+)\\s*,\\s*(.*)\\s*')
source

Full publication info.

source_expr = re.compile('\\s*(.*)\\s*,\\s*"(.*)"\\s*,\\s*(.*)\\s*')
title

Publication title.
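The line_expr pattern above splits each metadata line into a field name and the remainder. A quick standalone check of its behavior (plain re here, not the DMF code path):

```python
import re

# Same pattern as Metadata.line_expr
line_expr = re.compile(r'\s*(\w+)\s*,\s*(.*)\s*')

m = line_expr.match('Retrieval,"J. Morgan, date unknown"')
field, value = m.group(1), m.group(2)
# field is the key before the first comma; value is everything after it.
```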

class idaes.dmf.tabular.Table(data=None, metadata=None)[source]

Bases: idaes.dmf.tabular.TabularObject

Tabular data and metadata together (at last!)

add_metadata(m)[source]
as_dict()[source]

Represent as a Python dictionary.

Returns:(dict) Dictionary representation
data
dump(fp, **kwargs)[source]

Dump to file as JSON. Convenience method, equivalent to converting to a dict and calling json.dump().

Parameters:
  • fp (file) – Write output to this file
  • **kwargs – Keywords passed to json.dump()
Returns:

see json.dump()

dumps(**kwargs)[source]

Dump to string as JSON. Convenience method, equivalent to converting to a dict and calling json.dumps().

Parameters:**kwargs – Keywords passed to json.dumps()
Returns:(str) JSON-formatted data
classmethod load(file_or_path, validate=True)[source]

Create from JSON input.

Parameters:
  • file_or_path (file or str) – Filename or file object from which to read the JSON-formatted data.
  • validate (bool) – If true, apply validation to input JSON data.

Example input:

{
    "meta": [{
        "datatype": "MEA",
        "info": "J. Chem. Eng. Data, 2009, Vol 54, pg. 3096-30100",
        "notes": "r is MEA weight fraction in aqueous soln.",
        "authors": "Amundsen, T.G., Lars, E.O., Eimer, D.A.",
        "title": "Density and Viscosity of Monoethanolamine + etc."
    }],
    "data": [
        {
            "name": "Viscosity Value",
            "units": "mPa-s",
            "values": [2.6, 6.2],
            "error_type": "absolute",
            "errors": [0.06, 0.004],
            "type": "property"
        }
    ]
}
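As a sanity check on the shape of the document above, it can be parsed with the plain json module (this is not the DMF loader, and skips validation):

```python
import json

doc = json.loads('''
{
    "meta": [{"datatype": "MEA",
              "authors": "Amundsen, T.G., Lars, E.O., Eimer, D.A.",
              "title": "Density and Viscosity of Monoethanolamine + etc."}],
    "data": [{"name": "Viscosity Value",
              "units": "mPa-s",
              "values": [2.6, 6.2],
              "error_type": "absolute",
              "errors": [0.06, 0.004],
              "type": "property"}]
}
''')
# "meta" and "data" are both lists; each data entry pairs values with errors.
assert doc["data"][0]["units"] == "mPa-s"
assert len(doc["data"][0]["values"]) == len(doc["data"][0]["errors"])
```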
metadata
class idaes.dmf.tabular.TabularData(data, error_column=False)[source]

Bases: object

Class representing tabular data that knows how to construct itself from a CSV file.

You can build objects from multiple CSV files as well. See the property database section of the API docs for details, or read the code in add_csv() and the tests in idaes_dmf.propdb.tests.test_mergecsv.

as_arr()[source]

Export property data as arrays.

Returns:(values[M,N], errors[M,N]) Two arrays of floats, each with M columns having N values.
Raises:ValueError if the columns are not all the same length
as_list()[source]

Export the data as a list.

Output will be in same form as data passed to constructor.

Returns:(list) List of dicts
columns
embedded_units = '(.*)\\((.*)\\)'
errors_dataframe()[source]

Get errors as a dataframe.

Returns:Pandas dataframe for values.
Return type:pd.DataFrame
Raises:ImportError – If pandas or numpy were never successfully imported.
static from_csv(file_or_path, error_column=False)[source]

Import the CSV data.

Expected format of the files is a header plus data rows.

Header: Index-column, Column-name(1), Error-column(1), Column-name(2), Error-column(2), ..
Data: <index>, <val>, <errval>, <val>, <errval>, ..

Column-name is in the format “Name (units)”

Error-column is in the format “<type> Error”, where “<type>” is the error type.

Parameters:
  • file_or_path (file-like or str) – Input file
  • error_column (bool) – If True, look for an error column after each value column. Otherwise, all columns are assumed to be values.
Returns:

New table of data

Return type:

TabularData
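The layout described above can be illustrated with a small standalone sketch (not the DMF implementation; the column names are hypothetical). It uses the same pattern as the embedded_units attribute to split "Name (units)":

```python
import csv, io, re

# Same pattern as TabularData.embedded_units: "Name (units)"
embedded_units = re.compile(r'(.*)\((.*)\)')

def split_header(field):
    """Split 'Name (units)' into (name, units); units may be absent."""
    m = embedded_units.match(field)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    return field.strip(), None

# Hypothetical input in the documented layout: an index column,
# then each value column followed by its error column.
text = ("Num,Temperature (K),absolute Error,Density (kg/m3),absolute Error\n"
        "1,298.15,0.01,997.0,0.5\n"
        "2,313.15,0.01,992.2,0.5\n")
header, *rows = csv.reader(io.StringIO(text))
name, units = split_header(header[1])
# name/units come from "Temperature (K)"; rows hold the raw string values.
```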

get_column(key)[source]

Get an object for the given named column.

Parameters:key (str) – Name of column
Returns:(TabularColumn) Column object.
Raises:KeyError – No column by that name.
get_column_index(key)[source]

Get an index for the given named column.

Parameters:key (str) – Name of column
Returns:(int) Column number.
Raises:KeyError – No column by that name.
names()[source]

Get column names.

Returns:List of column names.
Return type:list[str]
num_columns

Number of columns in this table.

A “column” is defined as data + error. So if there are two columns of data, each with an associated error column, then num_columns is 2 (not 4).

Returns:Number of columns.
Return type:int
num_rows

Number of rows in this table.

obj.num_rows is a synonym for len(obj)

Returns:Number of rows.
Return type:int
values_dataframe()[source]

Get values as a dataframe.

Returns:(pd.DataFrame) Pandas dataframe for values.
Raises:ImportError – If pandas or numpy were never successfully imported.
class idaes.dmf.tabular.TabularObject[source]

Bases: object

Abstract Property data class.

as_dict()[source]

Return Python dict representation.

idaes.dmf.util module

Utility functions.

class idaes.dmf.util.CPrint(color=True)[source]

Bases: object

Colorized terminal printing.

Codes are below. To use:

cprint = CPrint()
cprint('This has no colors')  # just like print()
cprint('This is @b[blue] and @_r[red underlined]')

You can use the same class as a no-op by just passing color=False to the constructor.

COLORS = {'*': '\x1b[1m', '-': '\x1b[2m', '.': '\x1b[0m', '_': '\x1b[4m', 'b': '\x1b[94m', 'c': '\x1b[96m', 'g': '\x1b[92m', 'h': '\x1b[1m\x1b[95m', 'm': '\x1b[95m', 'r': '\x1b[91m', 'w': '\x1b[97m', 'y': '\x1b[93m'}
colorize(s)[source]
println(s)[source]
write(s)[source]
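One way the @codes[text] markup could expand, using the COLORS table above (an illustrative sketch, not the actual colorize() implementation):

```python
import re

# Subset of the CPrint.COLORS table shown above.
COLORS = {'b': '\x1b[94m', 'r': '\x1b[91m', '_': '\x1b[4m', '.': '\x1b[0m'}

def colorize_sketch(s):
    """Replace @codes[text] with the ANSI codes, the text, and a reset."""
    def repl(m):
        codes, text = m.group(1), m.group(2)
        return ''.join(COLORS[c] for c in codes) + text + COLORS['.']
    return re.sub(r'@([a-z_*.-]+)\[([^\]]*)\]', repl, s)
```

For example, colorize_sketch('This is @b[blue]') wraps "blue" in the blue code and a reset, and multiple codes such as @_r[...] stack.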
class idaes.dmf.util.TempDir(*args)[source]

Bases: object

Simple context manager for mkdtemp().

idaes.dmf.util.datetime_timestamp(v)[source]

Get numeric timestamp. This will work under both Python 2 and 3.

idaes.dmf.util.find_process_byname(name, uid=None)[source]

Generate zero or more PIDs where ‘name’ is part of either the first or second token in the command line. Optionally, filter the returned PIDs to only those whose real user ID (UID) equals the provided uid; if uid is None (the default), the current process UID is used. A uid value less than zero skips the filter.

idaes.dmf.util.get_file(file_or_path, mode='r')[source]

Open a file for reading, or simply return the file object.

idaes.dmf.util.get_logger(name='')[source]

Create and return a DMF logger instance.

The name should be lowercase letters like ‘dmf’ or ‘propdb’.

Leaving the name blank, or passing any non-string name, will get the root logger.

idaes.dmf.util.get_module_author(mod)[source]

Find and return the module author.

Parameters:mod (module) – Python module
Returns:(str) Author string or None if not found
Raises:nothing
idaes.dmf.util.get_module_version(mod)[source]

Find and return the module version.

Version must look like a semantic version with <a>.<b>.<c> parts; there can be arbitrary extra stuff after the <c>. For example:

1.0.12
0.3.6
1.2.3-alpha-rel0
Parameters:mod (module) – Python module
Returns:(str) Version string or None if not found
Raises:ValueError if version is found but not valid
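The accepted version shapes can be captured with a simple pattern (an illustration of the rule, not the DMF's own check):

```python
import re

# <a>.<b>.<c>, with arbitrary extra material allowed after <c>.
semver_like = re.compile(r'^\d+\.\d+\.\d+.*$')

for v in ('1.0.12', '0.3.6', '1.2.3-alpha-rel0'):
    assert semver_like.match(v)          # all three documented examples pass
assert semver_like.match('1.2') is None  # too few parts: invalid
```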
idaes.dmf.util.import_module(name)[source]
idaes.dmf.util.is_jupyter_notebook(filename)[source]

See if this is a Jupyter notebook.

idaes.dmf.util.is_python(filename)[source]

See if this is a Python file. Do not import the source code.

idaes.dmf.util.is_resource_json(filename)[source]
idaes.dmf.util.strlist(x, sep=', ')[source]
idaes.dmf.util.terminate_pid(pid, waitfor=1)[source]

idaes.dmf.validate module

class idaes.dmf.validate.InstanceGenerator(schema, params=None)[source]

Bases: object

bplate_div = 'DO NOT MODIFY BEYOND THIS POINT'
create_script(output_file, preserve_old=True, **kwargs)[source]
default_arr_len = 1
get_script(n=1, output_files='/tmp/file{i}.json')[source]

Build code to load and generate n schemas, as a Python string template with the spot for the variables marked as ‘{variables}’.

Returns:
Pair of strings; the first is the user-modifiable part and the second is boilerplate with the template data. This allows separate modification of these two sections.
Return type:(str, str)
get_template()[source]

Generate a new template for the instance.

Returns:JSON of the instance
Return type:str
get_variables(commented=True)[source]
indent = 2
keywords = ('$schema', 'id', 'definitions')
root_var = 'root'
class idaes.dmf.validate.JsonSchemaValidator(modpath='idaes.dmf', directory='schemas', do_not_cache=False)[source]

Bases: object

Validate JSON documents against schemas defined in this package.

The schemas are in the “schemas/” directory of the package. They are first processed as Jinja2 templates, to allow for flexible re-use of common schema elements. The actual resulting schema is stored in a temporary directory that is removed when this class is deleted.

Example usage:

vdr = JsonSchemaValidator()
# Validate document against the "foobar" schema.
ok, msg = vdr.validate({'foo': '1', 'bar': 2}, 'foobar')
if ok:
    print("Success!")
else:
    print("Failed: {}".format(msg))
# Validate input YAML file against the "config" schema
ok, msg = vdr.validate('/path/to/my_config.yaml', 'config', yaml=True)
if ok:
    print("Success!")
else:
    print("Failed: {}".format(msg))
get_schema(schema)[source]
Load the schema and return it as a Python (dict) object.
See validate() for details.
Parameters:

schema (str) – Schema name. Same as schema arg to validate()

Returns:

Parsed schema

Return type:

dict

Raises:
  • IOError if file cannot be opened.
  • ValueError if file cannot be parsed.
instances(schema, param_file)[source]
reset()[source]

Clear cached schemas, so that changes in the base templates are picked up by the validation code.

validate(doc, schema, yaml=False)[source]

Validate a JSON file against a schema.

Parameters:
  • doc (str|file|list|dict) – Input filename or object. May be JSON or YAML. Also may be a list/dict, which is assumed to represent parsed JSON.
  • schema (str) – Name of schema in this package. This will be the name, without the .template suffix, of a file in the ‘schemas/’ directory.
  • yaml (bool) – If true, use the YAML parser instead of the JSON parser on the input file.
Returns:

(bool, str) Pair whose first value is whether it validated and second is set to the error message if it did not.

Raises:
  • IOError if either file cannot be opened.
  • ValueError if either file cannot be parsed.

idaes.dmf.workspace module

Workspace classes and functions.

class idaes.dmf.workspace.Fields[source]

Bases: object

Workspace configuration fields.

DOC_HTML_PATH = 'htmldocs'
LOG_CONF = 'logging'
class idaes.dmf.workspace.Workspace(path, create=False, add_defaults=False)[source]

Bases: object

DMF Workspace.

In essence, a workspace is some information at the root of a directory tree, a database (currently file-based, so also in the directory tree) of Resources, and a set of files associated with these resources.

Workspace Configuration

When the DMF is initialized, the workspace is given as a path to a directory. In that directory is a special file named config.yaml, which contains metadata about the workspace. The very existence of a file by that name is taken by the DMF code as an indication that the containing directory is a DMF workspace:

/path/to/dmf: Root DMF directory
 |
 +- config.yaml: Configuration file
 +- resourcedb.json: Resource metadata "database" (uses TinyDB)
 +- files: Data files for all resources

The configuration file is a YAML-formatted file.

The DMF configuration file defines the following key/value pairs:

_id
Unique identifier for the workspace. This is auto-generated by the library, of course.
name
Short name for the workspace.
description
Possibly longer text describing the workspace.
created
Date at which the workspace was created, as string in the ISO8601 format.
modified
Date at which the workspace was last modified, as string in the ISO8601 format.
htmldocs
Full path to the location of the built (not source) Sphinx HTML documentation for the idaes_dmf package. See DMF Help Configuration for more details.

There are many different possible “styles” of formatting a list of values in YAML, but we prefer the simple block-indented style, where the key is on its own line and the values are each indented with a dash:

_id: fe5372a7e51d498fb377da49704874eb
created: '2018-07-16 11:10:44'
description: A bottomless trashcan
modified: '2018-07-16 11:10:44'
name: Oscar the Grouch's Home
htmldocs:
- '{dmf_root}/doc/build/html/dmf'
- '{dmf_root}/doc/build/html/models'

Any paths in the workspace configuration, e.g., for the “htmldocs”, can use two special variables that will take on values relative to the workspace location. This avoids hardcoded paths and makes the workspace more portable across environments. {ws_root} will be replaced with the path to the workspace directory, and {dmf_root} will be replaced with the path to the (installed) DMF package.
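The substitution can be pictured with plain str.format (a sketch only; both paths are hypothetical and the DMF's actual mechanism may differ):

```python
# Hypothetical locations for the workspace and the installed package.
ws_root = '/home/user/my-workspace'
dmf_root = '/opt/idaes'

path_template = '{dmf_root}/doc/build/html/dmf'
# Extra keyword arguments (here ws_root) are simply ignored by format().
resolved = path_template.format(ws_root=ws_root, dmf_root=dmf_root)
```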

The config.yaml file will allow keys and values it does not know about. These will be accessible, loaded into a Python dictionary, via the meta attribute on the Workspace instance. This may be useful for passing additional user-defined information into the DMF at startup.

CONF_CREATED = 'created'

Configuration field for created date

CONF_DESC = 'description'

Configuration field for description

CONF_MODIFIED = 'modified'

Configuration field for modified date

CONF_NAME = 'name'

Configuration field for name

ID_FIELD = '_id'

Name of ID field

WORKSPACE_CONFIG = 'config.yaml'

Name of configuration file placed in WORKSPACE_DIR

description
get_doc_paths()[source]

Get paths to generated HTML Sphinx docs.

Returns:(list) Paths or empty list if not found.
meta

Get metadata.

This reads and parses the configuration. Therefore, one way to force a config refresh is to simply refer to this property, e.g.:

dmf = DMF(path='my-workspace')
#  ... do stuff that alters the config ...
dmf.meta  # re-read/parse the config
Returns:(dict) Metadata for this workspace.
name
root

Root path for this workspace. This is the path containing the configuration file.

set_meta(values, remove=None)[source]

Update metadata with new values.

Parameters:
  • values (dict) – Values to add or change
  • remove (list) – Keys of values to remove.
wsid

Get workspace identifier (from config file).

Returns:Unique identifier.
Return type:str
class idaes.dmf.workspace.WorkspaceConfig[source]

Bases: object

DEFAULTS = {'array': [], 'boolean': False, 'number': 0, 'string': ''}
get_fields(only_defaults=False)[source]

Get all possible metadata fields for workspace config.

These values come out of the configuration schema. Keys starting with a leading underscore, like ‘_id’, are skipped.

Parameters:only_defaults – Only include fields that have a default value in the schema.
Returns:
Keys are field name, values are (field description, value).
The ‘value’ gives a default value. Its type is either a list, a number, bool, or a string; the list may be empty.
Return type:dict
idaes.dmf.workspace.find_workspaces(root)[source]

Find workspaces at or below ‘root’.

Parameters:root – Path to start at
Returns:List of paths, which are all workspace roots.