idaes.dmf package¶
IDAES Data Management Framework (DMF)
The DMF lets you save, search, and retrieve provenance related to your models.
This package is documented with Sphinx. To build the documentation, change to the ‘docs’ directory and run, e.g., ‘make html’.
Resource representations.
Subpackages¶
Submodules¶
idaes.dmf.codesearch module¶
Search through the code and index static information in the DMF.
-
class
idaes.dmf.codesearch.
ModuleClassWalker
(from_path=None, from_pkg=None, class_expr=None, parent_class=None, suppress_warnings=False, exclude_testdirs=True, exclude_tests=True, exclude_init=True, exclude_setup=True, exclude_dirs=None)[source]¶ Bases:
idaes.dmf.codesearch.Walker
Walk modules from a given root (e.g. ‘idaes’), and visit all classes in those modules whose name matches a given pattern.
Example usage:
walker = ModuleClassWalker(from_pkg=idaes, class_expr='_PropertyParameter.*')
walker.walk(PrintMetadataVisitor())  # see below
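To make the idea of "walking" a package concrete, the following is a minimal standard-library sketch, not the ModuleClassWalker implementation. The function name `find_classes` is hypothetical, and the standard `json` package stands in for `idaes` so that the example runs anywhere.

```python
import importlib
import inspect
import json
import pkgutil
import re


def find_classes(package, class_expr):
    """Return dotted names of classes under `package` whose name matches `class_expr`."""
    pattern = re.compile(class_expr)
    modules = [package]
    for info in pkgutil.walk_packages(package.__path__, package.__name__ + "."):
        # Assumption: skip entry-point modules, much as the walker skips setup files.
        if info.name.endswith("__main__"):
            continue
        try:
            modules.append(importlib.import_module(info.name))
        except Exception:
            continue  # ignore modules that fail to import
    found = set()
    for module in modules:
        for name, obj in inspect.getmembers(module, inspect.isclass):
            # Report a class only in the module that defines it.
            if obj.__module__ == module.__name__ and pattern.match(name):
                found.add(module.__name__ + "." + name)
    return sorted(found)


result = find_classes(json, "JSONDec.*")
print(result)
```

A visitor, in this picture, is simply a callable applied to each class that the walk finds.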
-
class
idaes.dmf.codesearch.
PropertyMetadataVisitor
[source]¶ Bases:
idaes.dmf.codesearch.Visitor
Visit something implementing
HasPropertyClassMetadata
and pass that metadata, as a dict, to the visit_metadata() method, which should be implemented by the subclass.
visit
(obj)[source]¶ Visit one object.
Parameters: obj (idaes.core.property_base.HasPropertyClassMetadata) – The object Returns: None
-
visit_metadata
(obj, meta)[source]¶ Do something with the metadata.
Parameters: - obj (object) – Object from which metadata was pulled, for context.
- meta (idaes.core.property_base.PropertyClassMetadata) – The metadata
Returns: None
-
idaes.dmf.commands module¶
Perform all logic, input, output of commands that is particular to the CLI.
Call functions defined in ‘api’ module to handle logic that is common to the API and CLI.
-
idaes.dmf.commands.
list_resources
(path, long_format=None, relations=False)[source]¶ List resources in a given DMF workspace.
Parameters: Returns: None
-
idaes.dmf.commands.
list_workspaces
(root, stream=None)[source]¶ List workspaces found from a given root path.
Parameters: - root – root path
- stream – Output stream (must have .write() method)
idaes.dmf.dmf module¶
Data Management Framework
-
class
idaes.dmf.dmf.
DMF
(path='', name=None, desc=None, **ws_kwargs)[source]¶ Bases:
idaes.dmf.workspace.Workspace
,traitlets.traitlets.HasTraits
Data Management Framework (DMF).
Expected usage is to instantiate this class once, and then use it for storing, searching, and retrieving the resources that are required for the given analysis.
For details on the configuration files used by the DMF, see documentation for
DMFConfig
(global configuration) and idaes.dmf.workspace.Workspace
.
CONF_DATA_DIR
= 'datafile_dir'¶
-
CONF_DB_FILE
= 'db_file'¶
-
CONF_HELP_PATH
= 'htmldocs'¶
-
add
(rsrc)[source]¶ Add a resource and associated files.
If the resource has ‘datafiles’, there are some special values that cause those files to be copied and possibly the original removed at this point. There are attributes do_copy and is_tmp on the resource, and also potentially keys of the same name in the datafiles themselves. If present, the datafile key/value pairs will override the attributes in the resource. For do_copy, the original file will be copied into the DMF workspace. If do_copy is True, then if is_tmp is also True the original file will be removed (after the copy is made, of course).
Parameters: rsrc (resource.Resource) – The resource Returns: (str) Resource ID Raises: DMFError, DuplicateResourceError
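The copy-and-remove rules above can be sketched with the standard library. This is an illustration under stated assumptions, not the DMF code: the function name `place_datafile` and the "path" key are hypothetical, and a datafile is modeled as a plain dict.

```python
import os
import shutil


def place_datafile(datafile, workspace_dir, do_copy=True, is_tmp=False):
    """Copy one datafile into the workspace, honoring per-file overrides."""
    # Keys in the datafile itself override the resource-level values.
    do_copy = datafile.get("do_copy", do_copy)
    is_tmp = datafile.get("is_tmp", is_tmp)
    source = datafile["path"]  # hypothetical key name
    if not do_copy:
        return source  # the file stays where it is
    target = os.path.join(workspace_dir, os.path.basename(source))
    shutil.copy2(source, target)
    if is_tmp:
        os.remove(source)  # the original is removed only after the copy exists
    return target
```

Note that is_tmp has no effect in this sketch unless do_copy is also true, which matches the description above.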
-
datafile_dir
¶ A trait for unicode strings.
-
db_file
¶ A trait for unicode strings.
-
fetch_many
(rid_list)[source]¶ Fetch multiple resources, by their identifiers.
Parameters: rid_list (list) – List of integer resource identifiers Returns: (list of resource.Resource) List of found resources (may be empty)
-
fetch_one
(rid)[source]¶ Fetch one resource, from its identifier.
Parameters: rid (str) – Resource identifier Returns: (resource.Resource) The found resource, or None if no match
-
find
(filter_dict=None, id_only=False)[source]¶ Find and return resources matching the filter.
The filter syntax is a subset of the MongoDB filter syntax. This means that it is represented as a dictionary, where each key is an attribute or nested attribute name, and each value is the value against which to match. There are four possible types of values:
scalar string or number (int, float): Match resources that have this exact value for the given attribute.
date, as datetime.datetime or pendulum.Pendulum instance: Match resources that have this exact date for the given attribute.
list: Match resources that have a list value for this attribute, and for which any of the values in the provided list are in the resource’s corresponding value. If a ‘!’ is appended to the key name, then this will be interpreted as a directive to only match resources for which all values in the provided list are present.
dict: This is an inequality, with one or more key/value pairs. The key is the type of inequality and the value is the numeric value for that range. All keys begin with ‘$’. The possible inequalities are:
- “$lt”: Less than (<)
- “$le”: Less than or equal (<=)
- “$gt”: Greater than (>)
- “$ge”: Greater than or equal (>=)
- “$ne”: Not equal to (!=)
Parameters: Returns: (list of int|Resource) Depending on the value of id_only.
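The filter semantics described above can be illustrated with a small pure-Python matcher over plain dicts. This is only a sketch of the rules as stated here, not the DMF's query engine; the function name `matches` is hypothetical and nested attribute names are not handled.

```python
import operator

# Inequality keys named in the text, mapped to Python comparisons.
_OPS = {
    "$lt": operator.lt,
    "$le": operator.le,
    "$gt": operator.gt,
    "$ge": operator.ge,
    "$ne": operator.ne,
}


def matches(record, filter_dict):
    """Return True if a plain-dict record satisfies every entry of the filter."""
    for key, expected in filter_dict.items():
        match_all = key.endswith("!")  # 'name!' means: all listed values required
        name = key[:-1] if match_all else key
        if name not in record:
            return False
        actual = record[name]
        if isinstance(expected, dict):
            # Inequality: every {operator: bound} pair must hold.
            if not all(_OPS[op](actual, bound) for op, bound in expected.items()):
                return False
        elif isinstance(expected, list):
            if not isinstance(actual, list):
                return False
            hits = [value in actual for value in expected]
            if not (all(hits) if match_all else any(hits)):
                return False
        elif actual != expected:
            # Scalars and dates: exact match.
            return False
    return True


record = {"type": "data", "tags": ["a", "b", "c"], "version": 3}
print(matches(record, {"tags": ["c", "z"]}), matches(record, {"tags!": ["c", "z"]}))
```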
Find related resources.
Parameters: - rsrc (resource.Resource) – Resource starting point
- filter_dict (dict) – See parameter of same name in
find()
. - maxdepth (int) – Maximum depth of search (starts at 1)
- meta (List[str]) – Metadata fields to extract for meta part
- outgoing (bool) – If True, look at outgoing relations. Otherwise look at incoming relations. e.g. if A ‘uses’ B and if True, would find B starting from A. If False, would find A starting from B.
Returns: Generates triples (depth, Triple, meta), where the depth is an integer (starting at 1), the Triple is a simple namedtuple wrapping (subject, predicate, object), and meta is a dict of metadata for the endpoint of the relation (the object if outgoing=True, the subject if outgoing=False) for the fields provided in the meta parameter.
Raises: NoSuchResourceError
– if the starting resource is not found
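The traversal described above can be pictured as a breadth-first walk over relation triples. The sketch below assumes an in-memory list of triples and omits the meta part; the function name `walk_relations` is hypothetical and this is not the DMF implementation.

```python
from collections import deque, namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])


def walk_relations(triples, start, maxdepth=2, outgoing=True):
    """Yield (depth, Triple) pairs reachable from `start`, nearest first."""
    seen = {start}
    queue = deque([(start, 1)])
    while queue:
        node, depth = queue.popleft()
        if depth > maxdepth:
            continue
        for triple in triples:
            # Outgoing follows subject -> object; incoming follows object -> subject.
            if outgoing:
                here, there = triple.subject, triple.object
            else:
                here, there = triple.object, triple.subject
            if here != node:
                continue
            yield depth, triple
            if there not in seen:
                seen.add(there)
                queue.append((there, depth + 1))


uses = [Triple("A", "uses", "B"), Triple("B", "uses", "C")]
for depth, triple in walk_relations(uses, "A"):
    print(depth, triple.subject, triple.predicate, triple.object)
```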
-
remove
(identifier=None, filter_dict=None, update_relations=True)[source]¶ Remove one or more resources, from its identifier or a filter. Unless told otherwise, this method will scan the DB and remove all relations that involve this resource.
Parameters:
-
update
(rsrc, sync_relations=False, upsert=False)[source]¶ Update/insert stored resource.
Parameters: - rsrc (resource.Resource) – Resource instance
- sync_relations (bool) – If True, and if resource exists in the DB, then the “relations” attribute of the provided resource will be changed to the stored value.
- upsert (bool) – If true, and the resource is not in the DMF, then insert it. If false, and the resource is not in the DMF, then do nothing.
Returns: - True if the resource was updated or added, False if nothing
was done.
Return type: Raises: errors.DMFError
– If the input resource was invalid.
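The upsert behavior and return value can be summarized with a dict standing in for the resource database. This is an assumption-laden sketch; `update_record` is a hypothetical name and relation syncing is left out.

```python
def update_record(store, rid, record, upsert=False):
    """Return True if `record` was updated or inserted, False if nothing was done."""
    if rid in store:
        store[rid] = record  # existing resource: update it
        return True
    if upsert:
        store[rid] = record  # missing resource and upsert requested: insert it
        return True
    return False  # missing resource, no upsert: do nothing


store = {"r1": {"name": "old"}}
print(update_record(store, "r2", {"name": "new"}))
```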
-
-
class
idaes.dmf.dmf.
DMFConfig
(defaults=None)[source]¶ Bases:
object
Global DMF configuration.
Every time you create an instance of the
DMF
, or run a dmf
command on the command-line, the library opens the global DMF configuration file to figure out the default workspace (and, eventually, other values). The default location for this configuration file is “~/.dmf”, i.e. the file named “.dmf” in the user’s home directory. This can be modified programmatically by changing the “filename” attribute of this class.
The contents of .dmf are formatted as YAML, with the following keys defined:
- workspace
- Path to the default workspace directory.
An example file is shown below:
{workspace: /tmp/newdir}
-
DEFAULTS
= {'workspace': '/home/ksb/Projects/IDAES/github/IDAES/idaes_0.6_rel/doc'}¶
-
WORKSPACE
= 'workspace'¶
-
filename
= '/home/ksb/.dmf'¶
-
workspace
¶
idaes.dmf.errors module¶
Exception classes.
-
exception
idaes.dmf.errors.
AlamoDisabledError
[source]¶ Bases:
idaes.dmf.errors.AlamoError
-
exception
idaes.dmf.errors.
AlamoError
(msg)[source]¶ Bases:
idaes.dmf.errors.DmfError
-
exception
idaes.dmf.errors.
DMFBadWorkspaceError
(path, why)[source]¶ Bases:
idaes.dmf.errors.DMFError
-
exception
idaes.dmf.errors.
DMFWorkspaceNotFoundError
(path)[source]¶ Bases:
idaes.dmf.errors.DMFError
-
exception
idaes.dmf.errors.
DataFormatError
(dtype, err)[source]¶ Bases:
idaes.dmf.errors.DmfError
-
exception
idaes.dmf.errors.
InvalidRelationError
(subj, pred, obj)[source]¶ Bases:
idaes.dmf.errors.DmfError
idaes.dmf.experiment module¶
The ‘experiment’ is a root container for a coherent set of ‘resources’.
-
class
idaes.dmf.experiment.
Experiment
(dmf, **kwargs)[source]¶ Bases:
idaes.dmf.resource.Resource
An experiment is a way of grouping resources in a way that makes sense to the user.
It is also a useful unit for passing as an argument to functions, since it has a standard ‘slot’ for the DMF instance that created it.
-
add
(rsrc)[source]¶ Add a resource to an experiment.
This does two things:
- Establishes an “experiment” type of relationship between the new resource and the experiment.
- Adds the resource to the DMF
Parameters: rsrc (resource.Resource) – The resource to add. Returns: Added (input) resource, for chaining calls. Return type: resource.Resource
-
copy
(new_id=True, **kwargs)[source]¶ Get a copy of this experiment. The returned object will have been added to the DMF.
Parameters: - new_id (bool) – If True, generate a new unique ID for the copy.
- kwargs – Values to set in new instance after copying.
Returns: A (mostly deep) copy.
Note that the DMF instance is just a reference to the same object as in the original, and they will share state.
Return type:
-
dmf
¶
-
link
(subj, predicate='contains', obj=None)[source]¶ Add and update relation triple in DMF.
Parameters: - subj (resource.Resource) – Subject
- predicate (str) – Predicate
- obj (resource.Resource) – Object
Returns: None
-
idaes.dmf.help module¶
Find documentation for modules and classes in the generated Sphinx documentation and return its location.
idaes.dmf.magics module¶
Jupyter magics for the DMF.
-
class
idaes.dmf.magics.
DmfMagics
(shell)[source]¶ Bases:
IPython.core.magic.Magics
-
NEED_INIT_CMD
= {'help': '+', 'info': '*'}¶
-
dmf_help
(*names)[source]¶ Provide help on IDAES objects and classes.
Invoking with no arguments gives general help. Invoking with one or more arguments looks for help in the docs on the given objects or classes.
-
dmf_info
(*topics)[source]¶ Provide information about DMF current state for whatever ‘topics’ are provided. With no topic, provide general information about the configuration.
Parameters: topics ((List[str])) – List of topics Returns: None
-
dmf_init
(path, *extra)[source]¶ Initialize DMF (do this before most other commands).
Parameters: path (str) – Full path to DMF home
-
dmf_workspaces
(*paths)[source]¶ List DMF workspaces.
Parameters: paths (List[str]) – Paths to search, use “.” by default
-
idaes_help
(*names)[source]¶ Provide help on IDAES objects and classes.
Invoking with no arguments gives general help. Invoking with one or more arguments looks for help in the docs on the given objects or classes.
-
magics
= {'cell': {}, 'line': {'dmf': 'dmf', 'idaes': 'idaes'}}¶
-
registered
= True¶
-
idaes.dmf.propdata module¶
Property data types.
Ability to import, etc. from text files is part of the methods in the type.
Import property database from textfile(s):
* See PropertyData.from_csv()
, for the expected format for data.
* See PropertyMetadata()
for the expected format for metadata.
-
exception
idaes.dmf.propdata.
AddedCSVColumnError
(names, how_bad, column_type='')[source]¶ Bases:
KeyError
Error for :meth:PropertyData.add_csv()
-
class
idaes.dmf.propdata.
Fields
[source]¶ Bases:
idaes.dmf.tabular.Fields
Constants for fields.
-
C_PROP
= 'property'¶
-
C_STATE
= 'state'¶
-
-
class
idaes.dmf.propdata.
PropertyColumn
(name, data)[source]¶ Bases:
idaes.dmf.tabular.Column
Data column for a property.
-
type_name
= 'Property'¶
-
-
class
idaes.dmf.propdata.
PropertyData
(data)[source]¶ Bases:
idaes.dmf.tabular.TabularData
Class representing property data that knows how to construct itself from a CSV file.
You can build objects from multiple CSV files as well. See the property database section of the API docs for details, or read the code in
add_csv()
and the tests in idaes_dmf.propdb.tests.test_mergecsv
.
add_csv
(file_or_path, strict=False)[source]¶ Add to existing object from a new CSV file.
Depending on the value of the strict argument (see below), the new file may or may not have the same properties as the object – but it always needs to have the same number of state columns, and in the same order.
Note
Data that is “missing” because of property columns in one CSV and not the other will be filled with float(nan) values.
Parameters: - file_or_path (file or str) – Input file. This should be in exactly the same format as expected by :meth:from_csv().
- strict (bool) – If true, require that the columns in the input CSV match columns in this object. Otherwise, only require that state columns in input CSV match columns in this object. New property columns are added, and matches to existing property columns will append the data.
Raises: AddedCSVColumnError
– If the new CSV column headers are not the same as the ones in this object. Returns: (int) Number of added rows
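The effect of strict and the NaN padding can be illustrated on plain dicts that map a property column name to a list of values. This is a sketch of the merge rule only: `merge_property_columns` is a hypothetical name, state columns are ignored, and a plain KeyError stands in for AddedCSVColumnError (which is documented as a KeyError subclass).

```python
import math


def merge_property_columns(old, new, strict=False):
    """Append `new` columns to `old`, padding values missing on either side with NaN."""
    if strict and set(old) != set(new):
        raise KeyError("column names differ: %s" % sorted(set(old) ^ set(new)))
    n_old = max((len(values) for values in old.values()), default=0)
    n_new = max((len(values) for values in new.values()), default=0)
    merged = {}
    for name in list(old) + [n for n in new if n not in old]:
        head = old.get(name, [math.nan] * n_old)  # rows that predate this column
        tail = new.get(name, [math.nan] * n_new)  # rows the new file lacks
        merged[name] = list(head) + list(tail)
    return merged


merged = merge_property_columns(
    {"density": [1.0, 2.0]},
    {"density": [3.0], "viscosity": [0.5]},
)
print(merged)
```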
-
as_arr
(states=True)[source]¶ Export property data as arrays.
Parameters: states (bool) – If False, exclude “state” data, e.g. the ambient temperature, and only include measured property values. Returns: (values[M,N], errors[M,N]) Two arrays of floats, each with M columns having N values. Raises: ValueError if the columns are not all the same length
-
embedded_units
= '(.*)\\((.*)\\)'¶
-
errors_dataframe
(states=False)[source]¶ Get errors as a dataframe.
Parameters: states (bool) – If False, exclude state data. This is the default, because states do not normally have associated error information. Returns: Pandas dataframe for values. Return type: pd.DataFrame Raises: ImportError
– If pandas or numpy were never successfully imported.
-
static
from_csv
(file_or_path, nstates=0)[source]¶ Import the CSV data.
Expected format of the files is a header plus data rows.
Header: Index-column, Column-name(1), Error-column(1), Column-name(2), Error-column(2), ..
Data: <index>, <val>, <errval>, <val>, <errval>, ..
Column-name is in the format “Name (units)”
Error-column is in the format “<type> Error”, where “<type>” is the error type.
Parameters: Returns: New properties instance
Return type:
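As an illustration of the header format, the sketch below splits a header row into (name, units, error type) entries, using the same pattern shown for embedded_units. It assumes every data column is followed by its error column, exactly as in the layout above; the function name `parse_header` and the sample column names are hypothetical.

```python
import csv
import io
import re

# Same pattern as the embedded_units attribute: "Name (units)".
EMBEDDED_UNITS = re.compile(r"(.*)\((.*)\)")


def parse_header(header_row):
    """Return (index column name, list of (name, units, error_type))."""
    index_col, rest = header_row[0], header_row[1:]
    columns = []
    # Cells come in pairs: a value column, then its error column.
    for name_cell, err_cell in zip(rest[0::2], rest[1::2]):
        found = EMBEDDED_UNITS.match(name_cell.strip())
        if found:
            name, units = found.group(1).strip(), found.group(2).strip()
        else:
            name, units = name_cell.strip(), ""
        # "<type> Error" -> "<type>"
        error_type = err_cell.strip().rsplit(" ", 1)[0]
        columns.append((name, units, error_type))
    return index_col, columns


sample = "Num,Density (g/cm^3),absolute Error,Viscosity (mPa-s),relative Error\n"
header = next(csv.reader(io.StringIO(sample)))
index_col, columns = parse_header(header)
print(index_col, columns)
```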
-
is_property_column
(index)[source]¶ Whether given column is a property. See
is_state_column()
.
-
is_state_column
(index)[source]¶ Whether given column is state.
Parameters: index (int) – Index of column Returns: (bool) State or property and the column number. Raises: IndexError
– No column at that index.
-
names
(states=True, properties=True)[source]¶ Get column names.
Parameters: Returns: List of column names.
Return type:
-
properties
¶
-
states
¶
-
values_dataframe
(states=True)[source]¶ Get values as a dataframe.
Parameters: states (bool) – see names()
. Returns: (pd.DataFrame) Pandas dataframe for values. Raises: ImportError
– If pandas or numpy were never successfully imported.
-
-
class
idaes.dmf.propdata.
PropertyMetadata
(values=None)[source]¶ Bases:
idaes.dmf.tabular.Metadata
Class to import property metadata.
-
class
idaes.dmf.propdata.
PropertyTable
(data=None, **kwargs)[source]¶ Bases:
idaes.dmf.tabular.Table
Property data and metadata together (at last!)
-
classmethod
load
(file_or_path, validate=True)[source]¶ Create PropertyTable from JSON input.
Parameters: Example input:
{
  "meta": [
    {"datatype": "MEA",
     "info": "J. Chem. Eng. Data, 2009, Vol 54, pg. 306-310",
     "notes": "r is MEA weight fraction in aqueous soln.",
     "authors": "Amundsen, T.G., Lars, E.O., Eimer, D.A.",
     "title": "Density and Viscosity of ..."}
  ],
  "data": [
    {"name": "Viscosity Value",
     "units": "mPa-s",
     "values": [2.6, 6.2],
     "error_type": "absolute",
     "errors": [0.06, 0.004],
     "type": "property"},
    {"name": "r",
     "units": "",
     "values": [0.2, 1000],
     "type": "state"}
  ]
}
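The structure of this input can be explored with the standard json module before handing it to the DMF. The sketch below uses an abbreviated copy of the example and separates state columns from property columns by their "type" field; `split_columns` is a hypothetical helper, not part of the package.

```python
import json

text = """
{
  "meta": [{"datatype": "MEA", "title": "Density and Viscosity of ..."}],
  "data": [
    {"name": "Viscosity Value", "units": "mPa-s", "values": [2.6, 6.2],
     "error_type": "absolute", "errors": [0.06, 0.004], "type": "property"},
    {"name": "r", "units": "", "values": [0.2, 1000], "type": "state"}
  ]
}
"""


def split_columns(doc):
    """Return (state columns, property columns) from a parsed table document."""
    states = [col for col in doc["data"] if col["type"] == "state"]
    props = [col for col in doc["data"] if col["type"] == "property"]
    return states, props


doc = json.loads(text)
states, props = split_columns(doc)
print([col["name"] for col in states], [col["name"] for col in props])
```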
-
classmethod
-
class
idaes.dmf.propdata.
StateColumn
(name, data)[source]¶ Bases:
idaes.dmf.tabular.Column
Data column for a state.
-
type_name
= 'State'¶
-
idaes.dmf.propindex module¶
Index Property metadata
-
class
idaes.dmf.propindex.
DMFVisitor
(dmf, default_version=None)[source]¶ Bases:
idaes.dmf.codesearch.PropertyMetadataVisitor
-
visit_metadata
(obj, meta)[source]¶ - Called for each property class encountered during the “walk”
- initiated by index_property_metadata().
Parameters: - obj (property_base.PropertyParameterBase) – Property class instance
- meta (property_base.PropertyClassMetadata) – Associated metadata
Returns: None
Raises: AttributeError
– if
-
-
idaes.dmf.propindex.
index_property_metadata
(dmf, pkg=<module 'idaes' from '/home/ksb/anaconda3/envs/idaes/lib/python3.7/site-packages/idaes-0.6.0-py3.7.egg/idaes/__init__.py'>, expr='_PropertyMetadata.*', default_version='0.0.1', **kwargs)[source]¶ Index all the PropertyMetadata classes in this package.
Usually the defaults will be correct, but you can modify the package explored and set of classes indexed.
When you re-index the same class (in the same module), whether or not that is a “duplicate” will depend on the version found in the containing module. If there is no version in the containing module, the default version is used (so it is always the same). If it is a duplicate, nothing is done, this is not considered an error. If a new version is added, it will be explicitly connected to the highest version of the same module/code. So, for example,
Starting with (a.module.ClassName version=0.1.2)
If you then find a new version (a.module.ClassName version=1.2.3) There will be 2 resources, and you will have the relation:
a.module.ClassName/1.2.3 --version---> a.module.ClassName/0.1.2
If you add another version (a.module.ClassName version=1.2.4), you will have two relations:
a.module.ClassName/1.2.3 --version---> a.module.ClassName/0.1.2
a.module.ClassName/1.2.4 --version---> a.module.ClassName/1.2.3
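The versioning rule in this example can be sketched with a set of known versions and a list of relation triples. This only illustrates the rule as described (duplicates are ignored, a new version is linked to the highest known one); `add_version` and `label` are hypothetical names and versions are modeled as tuples of integers.

```python
def label(name, version):
    """Format a name and a version tuple as 'name/x.y.z'."""
    return "%s/%s" % (name, ".".join(str(part) for part in version))


def add_version(versions, relations, name, version):
    """Record a version; return False if it was already known."""
    if version in versions:
        return False  # a duplicate is not an error, nothing is done
    if versions:
        highest = max(versions)
        # Connect the new version to the highest version seen so far.
        relations.append((label(name, version), "version", label(name, highest)))
    versions.add(version)
    return True


versions, relations = set(), []
for found in [(0, 1, 2), (1, 2, 3), (1, 2, 4)]:
    add_version(versions, relations, "a.module.ClassName", found)
print(relations)
```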
Parameters: - dmf (idaes.dmf.DMF) – Data Management Framework instance in which to record the found metadata.
- pkg (module) – Root module (i.e. package root) from which to find the classes containing metadata.
- expr (str) – Regular expression pattern for the names of the classes in which to look for metadata.
- default_version (str) – Default version to use for modules with no explicit version.
- kwargs – Other keyword arguments passed to
codesearch.ModuleClassWalker
.
Returns: None
Raises: - This instantiates a DMFVisitor and calls its walk() method to
- walk/visit each found class, so any exception raised by the constructor
- or DMFVisitor.visit_metadata().
idaes.dmf.resource module¶
Resource representations.
-
class
idaes.dmf.resource.
Dict
(*args, **kwargs)[source]¶ Bases:
dict
Subclass of dict that has a ‘dirty’ bit.
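A "dirty bit" on a dict can be pictured as follows. Which operations set the flag in the actual class is not stated here, so the choice below (item assignment, deletion, and update) is an assumption, and `DirtyDict` is a hypothetical name.

```python
class DirtyDict(dict):
    """A dict that remembers whether it was modified after construction."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.dirty = False  # clean until the first modification

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.dirty = True

    def __delitem__(self, key):
        super().__delitem__(key)
        self.dirty = True

    def update(self, *args, **kwargs):
        super().update(*args, **kwargs)
        self.dirty = True


d = DirtyDict(a=1)
before = d.dirty
d["b"] = 2
print(before, d.dirty)
```

A flag like this lets the owner skip re-serializing or re-validating data that has not changed.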
-
idaes.dmf.resource.
PR_DERIVED
= 'derived'¶ Constants for relation predicates
-
class
idaes.dmf.resource.
Resource
(value=None, type_=None)[source]¶ Bases:
object
Core object for the Data Management Framework.
-
ID_FIELD
= 'id_'¶ Identifier field name constant
-
TYPE_FIELD
= 'type'¶ Resource type field name constant
-
data
¶ Get JSON data for this resource.
-
get_datafiles
(mode='r')[source]¶ Generate readable file objects for ‘datafiles’ in resource.
Parameters: mode (str) – Mode for open() Returns: Generates `file`s. Return type: generator
-
id
¶ Get resource identifier.
-
type
¶ Get resource type.
-
-
idaes.dmf.resource.
TY_EXPERIMENT
= 'experiment'¶ Constants for resource ‘types’
-
class
idaes.dmf.resource.
Triple
(subject, predicate, object)¶ Bases:
tuple
Provide attribute access to an RDF subject, predicate, object triple
-
object
¶ Alias for field number 2
-
predicate
¶ Alias for field number 1
-
subject
¶ Alias for field number 0
-
-
idaes.dmf.resource.
create_relation
(rel)[source]¶ Create a relationship between two Resource instances.
Relations are stored in both the subject and object resources, in the following way:
If R = (subject)S, (predicate)P, and (object)O then store the following:
In S.relations: {predicate: P, identifier:O.id, role:subject}
In O.relations: {predicate: P, identifier:S.id, role:object}
Parameters: rel (Triple) – Relation triple. The ‘subject’ and ‘object’ parts should be Resource
, and the ‘predicate’ should be a simple string. Returns: None Raises: ValueError
– if this relation already exists in the subject or object resource, or the predicate is not in the list of valid ones in RELATION_PREDICATES
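The two-sided storage scheme can be sketched with plain dicts in place of Resource instances. The sketch assumes each resource carries an "id_" field and a "relations" list, and it takes the predicate set from the values listed for the older RelationType class; `relate` is a hypothetical name and this is not the library function.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Assumed set of valid predicates.
PREDICATES = {"contains", "derived", "uses", "version"}


def relate(rel):
    """Store the relation in both the subject and the object resource."""
    if rel.predicate not in PREDICATES:
        raise ValueError("unknown predicate: %r" % rel.predicate)
    subj_entry = {"predicate": rel.predicate,
                  "identifier": rel.object["id_"], "role": "subject"}
    obj_entry = {"predicate": rel.predicate,
                 "identifier": rel.subject["id_"], "role": "object"}
    if subj_entry in rel.subject["relations"] or obj_entry in rel.object["relations"]:
        raise ValueError("relation already exists")
    rel.subject["relations"].append(subj_entry)
    rel.object["relations"].append(obj_entry)


s = {"id_": "s1", "relations": []}
o = {"id_": "o1", "relations": []}
relate(Triple(s, "uses", o))
print(s["relations"], o["relations"])
```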
-
idaes.dmf.resource.
create_relation_args
(*args)[source]¶ Syntactic sugar to take 3 args instead of a Triple.
-
idaes.dmf.resource.
identifier_str
(value=None)[source]¶ Unique identifier.
Parameters: value (str) – If given, validate that it is a 32-byte str If not given or None, set new random value.
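One way to produce and validate such an identifier is sketched below. Using uuid4 as the source of randomness is an assumption, as is the name `make_identifier`; the pattern is the 32-character lowercase hex form shown for the older Identifier trait.

```python
import re
import uuid

_ID_EXPR = re.compile(r"[0-9a-f]{32}")


def make_identifier(value=None):
    """Return `value` if it is a valid identifier, or a new random one if None."""
    if value is None:
        return uuid.uuid4().hex  # 32 lowercase hex characters
    if not isinstance(value, str) or not _ID_EXPR.fullmatch(value):
        raise ValueError("identifier must be 32 lowercase hex characters")
    return value


new_id = make_identifier()
print(len(new_id))
```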
-
idaes.dmf.resource.
triple_from_resource_relations
(id_, rrel)[source]¶ Create a Triple from one entry in resource[‘relations’].
Parameters: Returns: A triple
Return type:
-
idaes.dmf.resource.
version_list
(value)[source]¶ Semantic version.
Three numeric identifiers, separated by a dot. Trailing non-numeric characters allowed.
Inputs, string or tuple, may have fewer than three numeric identifiers, but internally the value will be padded with zeros to always be of length four.
A leading dash or underscore in the trailing non-numeric characters is removed.
Some examples:
- 1 => valid => (1, 0, 0, ‘’)
- rc3 => invalid: no number
- 1.1 => valid => (1, 1, 0, ‘’)
- 1a => valid => (1, 0, 0, ‘a’)
- 1.a.1 => invalid: non-numeric can only go at end
- 1.12.1 => valid => (1, 12, 1, ‘’)
- 1.12.13-1 => valid => (1, 12, 13, ‘1’)
- 1.12.13.x => invalid: too many parts
Returns: [major:int, minor:int, debug:int, release-type:str] Return type: list
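The parsing rules and examples above can be reproduced with a single regular expression. This is a sketch for string input only, under the assumption that the rules are exactly those listed; `parse_version` is a hypothetical name and it returns a tuple rather than the list documented for the real function.

```python
import re

# Up to three dot-separated integers, then optional trailing text that
# does not start with a dot or a digit.
_VERSION = re.compile(r"^(\d+)(?:\.(\d+))?(?:\.(\d+))?([^.\d].*)?$")


def parse_version(text):
    """Return (major, minor, patch, extra) or raise ValueError."""
    found = _VERSION.match(text)
    if found is None:
        raise ValueError("invalid version: %r" % text)
    major, minor, patch, extra = found.groups()
    extra = extra or ""
    if extra[:1] in ("-", "_"):
        extra = extra[1:]  # drop a leading dash or underscore
    # Missing numeric parts are padded with zeros.
    return (int(major), int(minor or 0), int(patch or 0), extra)


for text in ["1", "1.1", "1a", "1.12.1", "1.12.13-1"]:
    print(text, "=>", parse_version(text))
```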
idaes.dmf.resource_old module¶
Resource representations.
-
class
idaes.dmf.resource_old.
Code
(*args, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.TraitContainer
Some source code, such as a Python module or C file.
This can also refer to packages or entire Git repositories.
-
desc
¶ Description of the code
-
idhash
¶ Git or other unique hash
-
language
¶ Programming language, e.g. “Python” (the default).
-
location
¶ File path or URL location for the code
-
name
¶ Name of the code object, e.g. Python module name
-
release
¶ Version of the release, default is ‘0.0.0’
-
type
¶ Type of code resource, must be one of ‘method’, ‘function’, ‘module’, ‘class’, ‘file’, ‘package’, ‘repository’, or ‘notebook’.
-
-
class
idaes.dmf.resource_old.
Contact
(*args, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.TraitContainer
Person who can be contacted.
-
email
¶ Email of the contact
-
name
¶ Name of the contact
-
-
class
idaes.dmf.resource_old.
DateTime
(default_value=traitlets.Undefined, allow_none=False, read_only=None, help=None, config=None, **kwargs)[source]¶ Bases:
traitlets.traitlets.TraitType
A trait type for a datetime.
- Input can be a string, float, or tuple. Specifically:
- string, ISO8601: YYYY[-MM-DD[Thh:mm:ss[.uuuuuu]]]
- float: seconds since Unix epoch (1/1/1970)
- tuple: format accepted by datetime.datetime()
No matter the input, validation will transform it into a floating point number, since this is the easiest form to store and search.
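The three accepted input forms can be normalized to a float as sketched below. Treating values without a timezone as UTC is an assumption made so the result is reproducible, and `to_timestamp` is a hypothetical name rather than the trait's own validation code.

```python
from datetime import datetime, timezone


def to_timestamp(value):
    """Convert a string, number, or tuple to seconds since the Unix epoch."""
    if isinstance(value, (int, float)):
        return float(value)  # already seconds since the epoch
    if isinstance(value, tuple):
        moment = datetime(*value)  # same arguments as datetime.datetime()
    elif isinstance(value, str):
        if len(value) == 4:
            moment = datetime(int(value), 1, 1)  # year-only form "YYYY"
        else:
            moment = datetime.fromisoformat(value)
    else:
        raise TypeError("unsupported datetime input: %r" % (value,))
    # Assumption: naive values are interpreted as UTC.
    return moment.replace(tzinfo=timezone.utc).timestamp()


print(to_timestamp("1970-01-01T00:01:00"), to_timestamp((1970, 1, 1, 0, 0, 1)))
```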
-
default_value
= 0¶
-
info_text
= 'a datetime'¶
-
class
idaes.dmf.resource_old.
FilePath
(tempfile=False, copy=True, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.TraitContainer
Path to a file, plus optional description and metadata.
So that the DMF does not break when data files are moved or copied, the default is to copy the datafile into the DMF workspace. This behavior can be controlled by the copy and tempfile keywords to the constructor.
For example, if you have a big file you do NOT want to copy when you create the resource:
FilePath(path='/my/big.file', desc='100GB file', copy=False)
On the other hand, if you have a file that you want the DMF to manage entirely:
FilePath(path='/some/file.txt', desc='a file', tempfile=True)
-
CSV_MIMETYPE
= 'text/csv'¶
-
desc
¶ Description of the file’s contents
-
do_copy
¶
-
fullpath
¶
-
is_tmp
¶
-
metadata
¶ Metadata to associate with the file
-
mimetype
¶ MIME type
-
path
¶ Path to file
-
root
¶
-
subdir
¶ Unique subdir
-
-
class
idaes.dmf.resource_old.
FlowsheetResource
(*args, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.Resource
Flowsheet resource & factory.
-
class
idaes.dmf.resource_old.
Identifier
(default_value=traitlets.Undefined, allow_none=False, read_only=None, help=None, config=None, **kwargs)[source]¶ Bases:
traitlets.traitlets.TraitType
Unique identifier.
Will set itself automatically to a 32-byte unique hex string. Can only be set to strings.
-
default_value
= None¶
-
expr
= re.compile('[0-9a-f]{32}')¶
-
info_text
= 'Unique identifier'¶
-
-
class
idaes.dmf.resource_old.
PropertyDataResource
(property_table=None, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.TabularDataResource
Property data resource & factory.
-
idaes.dmf.resource_old.
R_DERIVED
= 'derived'¶ Constants for RelationType predicates
-
class
idaes.dmf.resource_old.
RelationType
(default_value=traitlets.Undefined, allow_none=False, read_only=None, help=None, config=None, **kwargs)[source]¶ Bases:
traitlets.traitlets.TraitType
Traitlets type for RDF-style triples relating resources to each other.
-
Predicates
= {'contains', 'derived', 'uses', 'version'}¶
-
info_text
= 'triple of (subject-id, predicate, object-id), all strings, with a predicate in {version, uses, contains, derived}'¶
-
-
class
idaes.dmf.resource_old.
Resource
(*args, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.TraitContainer
A dynamically typed resource.
Resources have metadata (same for all resources) and a type-specific “data” section (unique to that type of resource).
-
ID_FIELD
= 'id_'¶
-
TYPE_FIELD
= 'type'¶
-
aliases
¶ List of aliases for the resource
-
codes
¶ List of code objects (including repositories and packages) associated with the resource. Each value is a
Code
.
-
copy
(**kwargs)[source]¶ Get a copy of this Resource.
As a convenience, optionally set some attributes in the copy. Parameters: kwargs – Attributes to set in new instance after copying.
- Returns:
- Resource: A deep copy.
The copy will have an empty (zero) identifier and a new unique value for uuid. The relations are not copied.
-
static
create_relation
(subj, pred, obj)[source]¶ Create a relationship between two Resource instances.
Parameters: Returns: None
Raises: TypeError
– if subject & object are not Resource instances.
-
created
¶ Date and time when the resource was created. This defaults to the time when the object was created. Value is a
DateTime
.
-
data
¶ An instance of a Python dict.
-
datafiles_dir
¶ Datafiles subdirectory (single directory name)
-
desc
¶ Description of the resource
-
help
(name)[source]¶ Return descriptive ‘help’ for the given attribute.
Parameters: name (str) – Name of attribute Returns: Help string, or error starting with “Error: “ Return type: str
-
id_
¶ Integer identifier for this Resource. You should not set this yourself. The value will be automatically overwritten with the database’s value when the resource is added to the DMF (with the .add() method).
-
modified
¶ Date and time the resource was last modified. This defaults to the time when the object was created. Value is a
DateTime
.
-
name
¶ Human-readable name for the resource (optional)
-
property_table
¶ For property data resources, this property builds and returns a PropertyTable object.
Returns: - A representation of metadata and data
- in this resource.
Return type: propdata.PropertyTable Raises: TypeError
– if this resource is not of the correct type.
-
relations
¶ Validate values in a list as belonging to a given TraitType.
This can be used in place of the Traitlets.List class.
-
table
¶ - For tabular data resources, this property builds and returns
- a Table object.
Returns: - A representation of metadata and data
- in this resource.
Return type: tabular.Table Raises: TypeError
– if this resource is not of the correct type.
List of tags for the resource
-
type
¶ Type of this Resource. See
ResourceTypes
for standard values for this attribute.
-
uuid
¶ Universal identifier for this resource
-
version
¶ Version of the resource. Value is a
SemanticVersion
.
-
-
class
idaes.dmf.resource_old.
ResourceTypes
[source]¶ Bases:
object
Standard resource type names.
Use these as opaque constants to indicate standard resource types. For example, when creating a Resource:
rsrc = Resource(type=ResourceTypes.property_data, ...)
-
data
= 'data'¶ Data (e.g. result data)
-
experiment
= 'experiment'¶ Experiment
-
fs
= 'flowsheet'¶ Flowsheet resource.
-
jupyter
= 'notebook'¶
-
jupyter_nb
= 'notebook'¶
-
nb
= 'notebook'¶ Jupyter Notebook
-
property_data
= 'propertydb'¶ Property data resource, e.g. the contents are created via classes in the
idaes.dmf.propdata
module.
-
python
= 'python'¶ Python code
-
surrmod
= 'surrogate_model'¶ Surrogate model
-
tabular_data
= 'tabular_data'¶ Tabular data
-
xp
= 'experiment'¶
-
-
class
idaes.dmf.resource_old.
SemanticVersion
(default_value=traitlets.Undefined, allow_none=False, read_only=None, help=None, config=None, **kwargs)[source]¶ Bases:
traitlets.traitlets.TraitType
Semantic version.
Three numeric identifiers, separated by a dot. Trailing non-numeric characters allowed.
Inputs, string or tuple, may have fewer than three numeric identifiers, but internally the value will be padded with zeros to always be of length four.
A leading dash or underscore in the trailing non-numeric characters is removed.
Some examples:
- 1 => valid => (1, 0, 0, ‘’)
- rc3 => invalid: no number
- 1.1 => valid => (1, 1, 0, ‘’)
- 1a => valid => (1, 0, 0, ‘a’)
- 1.a.1 => invalid: non-numeric can only go at end
- 1.12.1 => valid => (1, 12, 1, ‘’)
- 1.12.13-1 => valid => (1, 12, 13, ‘1’)
- 1.12.13.x => invalid: too many parts
-
default_value
= (0, 0, 0, '')¶
-
info_text
= 'semantic version major, minor, patch, & modifier'¶
-
class
idaes.dmf.resource_old.
Source
(*args, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.TraitContainer
A work from which the resource is derived.
-
date
¶ Date associated with resource
-
doi
¶ Digital object identifier
-
isbn
¶ ISBN
-
language
¶ The primary language of the intellectual content of the resource
-
source
¶ The work, either print or electronic, from which the resource was derived
-
-
class
idaes.dmf.resource_old.
TabularDataResource
(table=None, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.Resource
Tabular data resource & factory.
-
class
idaes.dmf.resource_old.
TraitContainer
(*args, **kwargs)[source]¶ Bases:
traitlets.traitlets.HasTraits
Base class for Resource, that knows how to serialize and parse its traits.
-
class
idaes.dmf.resource_old.
Triple
(subject, predicate, object)¶ Bases:
tuple
Provide attribute access to an RDF subject, predicate, object triple
-
object
¶ Alias for field number 2
-
predicate
¶ Alias for field number 1
-
subject
¶ Alias for field number 0
-
-
class
idaes.dmf.resource_old.
ValidatingList
(*args, **kwargs)[source]¶ Bases:
traitlets.traitlets.List
Validate values in a list as belonging to a given TraitType.
This can be used in place of the Traitlets.List class.
-
class
idaes.dmf.resource_old.
Version
(*args, **kwargs)[source]¶ Bases:
idaes.dmf.resource_old.TraitContainer
Version of something (code, usually).
-
created
¶ When this version was created. Default “empty”, which is encoded as the start of Unix epoch (1970/01/01).
-
name
¶ Name given to version
-
revision
¶ Revision, e.g. 1.0.0rc3
-
idaes.dmf.resourcedb module¶
Resource database.
-
class
idaes.dmf.resourcedb.
ResourceDB
(dbfile=None, connection=None)[source]¶ Bases:
object
A database interface to all the resources within a given DMF workspace.
-
delete
(id_=None, idlist=None, filter_dict=None)[source]¶ Delete one or more resources with given identifiers.
Parameters: Returns: (list[str]) Identifiers
-
find
(filter_dict, id_only=False)[source]¶ Find and return records based on the provided filter.
Parameters: - filter_dict (dict) – Search filter. For syntax, see docs in dmf.DMF.find().
- id_only (bool) – If true, return only the identifier of each resource; otherwise a Resource object is returned.
Returns: (list of int|Resource) Depending on the value of id_only
Find all resources connected to the identified one.
Parameters: Returns: Generator of (depth, relation, metadata)
Raises: KeyError if the resource is not found.
-
idaes.dmf.surrmod module¶
Surrogate modeling helper classes and functions. This is used to run ALAMO on property data.
-
class
idaes.dmf.surrmod.
SurrogateModel
(experiment, **kwargs)[source]¶ Bases:
object
Run ALAMO to generate surrogate models.
Automatically track the objects in the DMF.
Example:
    model = SurrogateModel(dmf, simulator='linsim.py')
    rsrc = dmf.fetch_one(1)  # get resource ID 1
    data = rsrc.property_table.data
    model.set_input_data(data, ['temp'], 'density')
    results = model.run()
-
PARAM_DATA_KEY
= 'parameters'¶ Key in resource ‘data’ for params
-
run
(**kwargs)[source]¶ Run ALAMO.
Parameters: **kwargs – Additional arguments merged with those passed to the class constructor. Any duplicate values will override the earlier ones. Returns: The dictionary returned from alamopy.doalamo()
Return type: dict
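The merge rule described for run() (later keyword arguments override those given to the constructor) can be sketched with plain dictionaries. This is an illustration of the rule only, not the class's actual code; `merge_options` and the `maxterms` option are hypothetical names:

```python
def merge_options(constructor_kwargs, run_kwargs):
    """Return a new dict; keys in run_kwargs replace those in constructor_kwargs."""
    merged = dict(constructor_kwargs)  # copy so the original is left untouched
    merged.update(run_kwargs)          # duplicates: the later value wins
    return merged


# 'simulator' appears in the example above; 'maxterms' is a made-up option.
ctor = {"simulator": "linsim.py", "maxterms": 5}
call = {"maxterms": 10}
merged = merge_options(ctor, call)
```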
-
set_input_data
(data, x_colnames, z_colname)[source]¶ Set input from provided dataframe or property data.
Parameters: Returns: None
Raises: KeyError
– if columns are not found in data
-
set_input_data_np
(x, z, xlabels=None, zlabel='z')[source]¶ Set input data from numpy arrays.
Parameters: Returns: None
-
idaes.dmf.tabular module¶
Tabular data handling
-
class
idaes.dmf.tabular.
Column
(name, data)[source]¶ Bases:
object
Generic, abstract column
-
type_name
= 'generic'¶
-
-
class
idaes.dmf.tabular.
Fields
[source]¶ Bases:
object
Constants for field names.
-
AUTH
= 'authors'¶
-
COLTYPE
= 'type'¶
-
DATA
= 'data'¶
-
DATA_ERRORS
= 'errors'¶
-
DATA_ERRTYPE
= 'error_type'¶
-
DATA_NAME
= 'name'¶ Keys for data mapping
-
DATA_UNITS
= 'units'¶
-
DATA_VALUES
= 'values'¶
-
DATE
= 'date'¶
-
DTYPE
= 'datatype'¶
-
INFO
= 'info'¶
-
META
= 'meta'¶
-
ROWS
= 'rows'¶
-
TITLE
= 'title'¶
-
VALS
= 'values'¶
-
-
class
idaes.dmf.tabular.
Metadata
(values=None)[source]¶ Bases:
object
Class to import metadata.
Publication author(s).
-
datatype
¶
-
date
¶ Publication date
-
static
from_csv
(file_or_path)[source]¶ Import metadata from simple text format.
Example input:
    Source,Han, J., Jin, J., Eimer, D.A., Melaaen, M.C.,"Density of Water(1) + Monoethanolamine(2) + CO2(3) from (298.15 to 413.15) K and Surface Tension of Water(1) + Monethanolamine(2) from ( 303.15 to 333.15)K", J. Chem. Eng. Data, 2012, Vol. 57, pg. 1095-1103"
    Retrieval,"J. Morgan, date unknown"
    Notes,r is MEA weight fraction in aqueous soln. (CO2-free basis)
Parameters: file_or_path (str or file) – Input file Returns: (PropertyMetadata) New instance
-
info
¶ Publication venue, etc.
-
line_expr
= re.compile('\\s*(\\w+)\\s*,\\s*(.*)\\s*')¶
-
source
¶ Full publication info.
-
source_expr
= re.compile('\\s*(.*)\\s*,\\s*"(.*)"\\s*,\\s*(.*)\\s*')¶
-
title
¶ Publication title.
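The two patterns listed for this class (line_expr and source_expr) can be tried directly with the standard `re` module. The sketch below copies the patterns as documented and applies them to shortened, illustrative input lines; how the class maps the three captured groups onto its author, title, and info attributes is an assumption based on the attribute descriptions:

```python
import re

# Patterns copied from the class attributes documented above.
line_expr = re.compile('\\s*(\\w+)\\s*,\\s*(.*)\\s*')
source_expr = re.compile('\\s*(.*)\\s*,\\s*"(.*)"\\s*,\\s*(.*)\\s*')

# Shortened, illustrative input lines (not the full citation).
notes_line = 'Notes,r is MEA weight fraction in aqueous soln.'
source_line = ('Source,Han, J., Jin, J.,"Density of Water + MEA", '
               'J. Chem. Eng. Data, 2012')

# line_expr splits a line into a keyword and the rest of the line.
key, value = line_expr.match(notes_line).groups()

# For a 'Source' line, the value is split again around the quoted title.
src_key, src_value = line_expr.match(source_line).groups()
authors, title, info = source_expr.match(src_value).groups()
```

Note that the first group of source_expr is greedy, so everything up to the last comma that precedes the opening quote lands in the author part.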
-
class
idaes.dmf.tabular.
Table
(data=None, metadata=None)[source]¶ Bases:
idaes.dmf.tabular.TabularObject
Tabular data and metadata together (at last!)
-
data
¶
-
dump
(fp, **kwargs)[source]¶ Dump to file as JSON. Convenience method, equivalent to converting to a dict and calling json.dump().
Parameters: - fp (file) – Write output to this file
- **kwargs – Keywords passed to json.dump()
Returns: see json.dump()
-
dumps
(**kwargs)[source]¶ Dump to string as JSON. Convenience method, equivalent to converting to a dict and calling json.dumps().
Parameters: **kwargs – Keywords passed to json.dumps() Returns: (str) JSON-formatted data
-
classmethod
load
(file_or_path, validate=True)[source]¶ Create from JSON input.
Parameters: Example input:
    {
      "meta": [{
        "datatype": "MEA",
        "info": "J. Chem. Eng. Data, 2009, Vol 54, pg. 3096-30100",
        "notes": "r is MEA weight fraction in aqueous soln.",
        "authors": "Amundsen, T.G., Lars, E.O., Eimer, D.A.",
        "title": "Density and Viscosity of Monoethanolamine + etc."
      }],
      "data": [
        {
          "name": "Viscosity Value",
          "units": "mPa-s",
          "values": [2.6, 6.2],
          "error_type": "absolute",
          "errors": [0.06, 0.004],
          "type": "property"
        }
      ]
    }
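The JSON layout expected by load() can be explored with the standard `json` module alone. A minimal sketch, using a trimmed copy of the example input; it only shows the shape of the document and does not call the Table class itself:

```python
import json

# Trimmed version of the example input above.
text = """
{
  "meta": [{"datatype": "MEA",
            "title": "Density and Viscosity of Monoethanolamine + etc."}],
  "data": [{"name": "Viscosity Value", "units": "mPa-s",
            "values": [2.6, 6.2], "error_type": "absolute",
            "errors": [0.06, 0.004], "type": "property"}]
}
"""

doc = json.loads(text)

# The keys match the constants in idaes.dmf.tabular.Fields
# (META, DATA, DATA_VALUES, DATA_ERRORS, ...).
column = doc["data"][0]
pairs = list(zip(column["values"], column["errors"]))

# dump()/dumps() are described as json.dump()/json.dumps() on the dict form,
# so a round trip through a string should preserve the structure.
round_trip = json.loads(json.dumps(doc))
```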
-
metadata
¶
-
-
class
idaes.dmf.tabular.
TabularData
(data, error_column=False)[source]¶ Bases:
object
Class representing tabular data that knows how to construct itself from a CSV file.
You can build objects from multiple CSV files as well. See the property database section of the API docs for details, or read the code in add_csv() and the tests in idaes_dmf.propdb.tests.test_mergecsv.
-
as_arr
()[source]¶ Export property data as arrays.
Returns: (values[M,N], errors[M,N]) Two arrays of floats, each with M columns having N values. Raises: ValueError if the columns are not all the same length
-
as_list
()[source]¶ Export the data as a list.
Output will be in same form as data passed to constructor.
Returns: (list) List of dicts
-
columns
¶
-
embedded_units
= '(.*)\\((.*)\\)'¶
-
errors_dataframe
()[source]¶ Get errors as a dataframe.
Returns: Pandas dataframe for errors. Return type: pd.DataFrame Raises: ImportError
– If pandas or numpy were never successfully imported.
-
static
from_csv
(file_or_path, error_column=False)[source]¶ Import the CSV data.
Expected format of the files is a header plus data rows.
    Header: Index-column, Column-name(1), Error-column(1), Column-name(2), Error-column(2), ..
    Data:   <index>, <val>, <errval>, <val>, <errval>, ..
Column-name is in the format “Name (units)”
Error-column is in the format “<type> Error”, where “<type>” is the error type.
Parameters: Returns: New table of data
Return type:
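The header conventions described for from_csv() ("Name (units)" columns, each followed by a "<type> Error" column) can be sketched with the standard `csv` and `re` modules. This is an illustration under stated assumptions, not the library's parser: the sample rows are made up, and the output dictionaries simply reuse the key names shown in the Table JSON example:

```python
import csv
import io
import re

# Pattern copied from TabularData.embedded_units above.
embedded_units = re.compile('(.*)\\((.*)\\)')

# Made-up two-row file in the documented layout.
sample = io.StringIO(
    "Num,T (K),Absolute Error,Density (g/cm3),Absolute Error\n"
    "1,303.15,0.1,1.003,0.002\n"
    "2,313.15,0.1,0.998,0.002\n"
)

rows = list(csv.reader(sample))
header, body = rows[0], rows[1:]

columns = []
# Skip the index column, then walk the (value, error) header pairs.
for i in range(1, len(header), 2):
    name, units = embedded_units.match(header[i]).groups()
    error_type = header[i + 1].replace("Error", "").strip().lower()
    columns.append({
        "name": name.strip(),  # group 1 keeps the space before "("
        "units": units,
        "error_type": error_type,
        "values": [float(r[i]) for r in body],
        "errors": [float(r[i + 1]) for r in body],
    })
```

With this layout, `len(columns)` is 2 even though the file has four data columns, which mirrors the way num_columns counts a value column and its error column as one.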
-
get_column
(key)[source]¶ Get an object for the given named column.
Parameters: key (str) – Name of column Returns: (TabularColumn) Column object. Raises: KeyError
– No column by that name.
-
get_column_index
(key)[source]¶ Get an index for the given named column.
Parameters: key (str) – Name of column Returns: (int) Column number. Raises: KeyError
– No column by that name.
-
num_columns
¶ Number of columns in this table.
A “column” is defined as data + error. So if there are two columns of data, each with an associated error column, then num_columns is 2 (not 4).
Returns: Number of columns. Return type: int
-
num_rows
¶ Number of rows in this table.
obj.num_rows is a synonym for len(obj)
Returns: Number of rows. Return type: int
-
values_dataframe
()[source]¶ Get values as a dataframe.
Returns: (pd.DataFrame) Pandas dataframe for values. Raises: ImportError
– If pandas or numpy were never successfully imported.
-
idaes.dmf.util module¶
Utility functions.
-
class
idaes.dmf.util.
CPrint
(color=True)[source]¶ Bases:
object
Colorized terminal printing.
Codes are below. To use:
    cprint = CPrint()
    cprint('This has no colors')  # just like print()
    cprint('This is @b[blue] and @_r[red underlined]')
You can use the same class as a no-op by just passing color=False to the constructor.
-
COLORS
= {'*': '\x1b[1m', '-': '\x1b[2m', '.': '\x1b[0m', '_': '\x1b[4m', 'b': '\x1b[94m', 'c': '\x1b[96m', 'g': '\x1b[92m', 'h': '\x1b[1m\x1b[95m', 'm': '\x1b[95m', 'r': '\x1b[91m', 'w': '\x1b[97m', 'y': '\x1b[93m'}¶
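To make the `@<codes>[text]` markup concrete, here is a small stand-alone sketch of how such markup could be expanded into the escape sequences from the COLORS table. The regular expression and the `colorize` function are assumptions made for illustration; they are not taken from the CPrint implementation:

```python
import re

# Subset of the COLORS table shown above.
COLORS = {
    '*': '\x1b[1m', '_': '\x1b[4m', '.': '\x1b[0m',
    'b': '\x1b[94m', 'r': '\x1b[91m',
}

# Hypothetical pattern: '@', one or more code characters, then '[text]'.
_MARKUP = re.compile(r'@([*\-._a-z]+)\[([^\]]*)\]')


def colorize(text, color=True):
    """Expand @codes[text] markup; with color=False just strip the markup."""
    def repl(match):
        codes, body = match.group(1), match.group(2)
        if not color:
            return body
        prefix = ''.join(COLORS.get(c, '') for c in codes)
        return prefix + body + COLORS['.']  # '.' resets the terminal style
    return _MARKUP.sub(repl, text)


plain = colorize('This is @b[blue] and @_r[red underlined]', color=False)
fancy = colorize('This is @b[blue] and @_r[red underlined]')
```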
-
-
idaes.dmf.util.
datetime_timestamp
(v)[source]¶ Get numeric timestamp. This will work under both Python 2 and 3.
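The documentation does not say how the timestamp is computed. One portable approach, shown here purely as a sketch, is to subtract the Unix epoch from a naive datetime that is treated as UTC; `timestamp_sketch` is a hypothetical name:

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def timestamp_sketch(v):
    """Seconds since the Unix epoch for a naive datetime treated as UTC."""
    return (v - EPOCH).total_seconds()


one_day = timestamp_sketch(datetime(1970, 1, 2))
```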
-
idaes.dmf.util.
find_process_byname
(name, uid=None)[source]¶ Generate zero or more PIDs where ‘name’ is part of either the first or second token in the command line. Optionally also filter the returned PIDs to only those with a ‘real’ user ID (UID) equal to the provided uid. If uid is None (the default), the current process UID is used. A value less than 0 skips the filter.
-
idaes.dmf.util.
get_file
(file_or_path, mode='r')[source]¶ Open a file for reading, or simply return the file object.
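The "file or path" convention used by get_file() and by several loaders on this page can be sketched as follows. The check for a `read` attribute is an assumption about how a file object is recognized, and `get_file_sketch` is a hypothetical name:

```python
import io


def get_file_sketch(file_or_path, mode="r"):
    """Return file objects unchanged; open anything else as a path."""
    if hasattr(file_or_path, "read"):
        return file_or_path
    return open(file_or_path, mode)


buf = io.StringIO("hello")
same = get_file_sketch(buf) is buf
```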
-
idaes.dmf.util.
get_logger
(name='')[source]¶ Create and return a DMF logger instance.
The name should be lowercase letters like ‘dmf’ or ‘propdb’.
Leaving the name blank will get the root logger. Also, any non-string name will get the root logger.
Find and return the module author.
Parameters: mod (module) – Python module Returns: (str) Author string or None if not found Raises: nothing
-
idaes.dmf.util.
get_module_version
(mod)[source]¶ Find and return the module version.
Version must look like a semantic version with <a>.<b>.<c> parts; there can be arbitrary extra stuff after the <c>. For example:
    1.0.12
    0.3.6
    1.2.3-alpha-rel0
Parameters: mod (module) – Python module Returns: (str) Version string or None if not found Raises: ValueError if version is found but not valid
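The version rule above (three numeric parts, then arbitrary trailing text) can be expressed as a regular expression. The sketch below assumes the version is stored in the module's `__version__` attribute, which the documentation does not state; the function name is hypothetical:

```python
import re
import types

# <a>.<b>.<c> followed by anything at all.
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(.*)$")


def module_version_sketch(mod):
    """Return the version string, None if absent, ValueError if malformed."""
    version = getattr(mod, "__version__", None)  # assumed attribute name
    if version is None:
        return None
    if not _VERSION.match(version):
        raise ValueError("invalid version: {!r}".format(version))
    return version


good = types.ModuleType("good")
good.__version__ = "1.2.3-alpha-rel0"
bad = types.ModuleType("bad")
bad.__version__ = "1.2"
missing = types.ModuleType("missing")

found = module_version_sketch(good)
```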
-
idaes.dmf.util.
is_python
(filename)[source]¶ See if this is a Python file. Do not import the source code.
idaes.dmf.validate module¶
XXX: This module is going away soon -dang 10/26/18
-
class
idaes.dmf.validate.
InstanceGenerator
(schema, params=None)[source]¶ Bases:
object
-
bplate_div
= 'DO NOT MODIFY BEYOND THIS POINT'¶
-
default_arr_len
= 1¶
-
get_script
(n=1, output_files='/tmp/file{i}.json')[source]¶ Code to load & generate n schemas as a Python string template with the spot for the variables as ‘{variables}’.
Returns: - Pair of strings, first is user-modifiable part and
- second is boilerplate with the template data. This allows separate modification of these 2 sections.
Return type: (str, str)
-
get_template
()[source]¶ Generate a new template for the instance.
Returns: JSON of the instance Return type: str
-
indent
= 2¶
-
keywords
= ('$schema', 'id', 'definitions')¶
-
root_var
= 'root'¶
-
-
class
idaes.dmf.validate.
JsonSchemaValidator
(modpath='idaes.dmf', directory='schemas', do_not_cache=False)[source]¶ Bases:
object
Validate JSON documents against schemas defined in this package.
The schemas are in the “schemas/” directory of the package. They are first processed as Jinja2 templates, to allow for flexible re-use of common schema elements. The actual resulting schema is stored in a temporary directory that is removed when this class is deleted.
Example usage:
    vdr = JsonSchemaValidator()

    # Validate document against the "foobar" schema.
    ok, msg = vdr.validate({'foo': '1', 'bar': 2}, 'foobar')
    if ok:
        print("Success!")
    else:
        print("Failed: {}".format(msg))

    # Validate input YAML file against the "config" schema
    ok, msg = vdr.validate('/path/to/my_config.yaml', 'config', yaml=True)
    if ok:
        print("Success!")
    else:
        print("Failed: {}".format(msg))
-
get_schema
(schema)[source]¶ - Load the schema and return it as a Python (dict) object.
- See
validate()
for details.
Parameters: schema (str) – Schema name. Same as schema arg to
validate()
Returns: Parsed schema
Return type: Raises: - IOError if file cannot be opened.
- ValueError if file cannot be parsed.
-
reset
()[source]¶ Clear cached schemas, so that changes in the base templates are picked up by the validation code.
-
validate
(doc, schema, yaml=False)[source]¶ Validate a JSON file against a schema.
Parameters: - doc (str|file|list|dict) – Input filename or object. May be JSON or YAML. Also may be a list/dict, which is assumed to represent parsed JSON.
- schema (str) – Name of schema in this package. This will be the name, without the .template suffix, of a file in the ‘schemas/’ directory.
- yaml (bool) – If true, use the YAML parser instead of the JSON parser on the input file.
Returns: - (bool, str) Pair whose first value is whether it validated and
second is set to the error message if it did not.
Raises: - IOError if either file cannot be opened.
- ValueError if either file cannot be parsed.
-
idaes.dmf.workspace module¶
Workspace classes and functions.
-
class
idaes.dmf.workspace.
Fields
[source]¶ Bases:
object
Workspace configuration fields.
-
DOC_HTML_PATH
= 'htmldocs'¶
-
LOG_CONF
= 'logging'¶
-
-
class
idaes.dmf.workspace.
Workspace
(path, create=False, add_defaults=False)[source]¶ Bases:
object
DMF Workspace.
In essence, a workspace is some information at the root of a directory tree, a database (currently file-based, so also in the directory tree) of Resources, and a set of files associated with these resources.
Workspace Configuration
When the DMF is initialized, the workspace is given as a path to a directory. In that directory is a special file named config.yaml, that contains metadata about the workspace. The very existence of a file by that name is taken by the DMF code as an indication that the containing directory is a DMF workspace:
    /path/to/dmf: Root DMF directory
     |
     +- config.yaml: Configuration file
     +- resourcedb.json: Resource metadata "database" (uses TinyDB)
     +- files: Data files for all resources
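Since the presence of config.yaml is what marks a directory as a workspace, the check can be sketched in a few lines. `looks_like_workspace` is a hypothetical helper written for this illustration, not a DMF function:

```python
import os
import tempfile

WORKSPACE_CONFIG = "config.yaml"  # same value as Workspace.WORKSPACE_CONFIG


def looks_like_workspace(path):
    """True if the directory contains a file named config.yaml."""
    return os.path.isfile(os.path.join(path, WORKSPACE_CONFIG))


with tempfile.TemporaryDirectory() as tmp:
    before = looks_like_workspace(tmp)
    # Creating the configuration file turns the directory into a "workspace".
    with open(os.path.join(tmp, WORKSPACE_CONFIG), "w") as f:
        f.write("name: demo\n")
    after = looks_like_workspace(tmp)
```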
The configuration file is a YAML-formatted file.
The DMF configuration file defines the following key/value pairs:
- _id
- Unique identifier for the workspace. This is auto-generated by the library, of course.
- name
- Short name for the workspace.
- description
- Possibly longer text describing the workspace.
- created
- Date at which the workspace was created, as string in the ISO8601 format.
- modified
- Date at which the workspace was last modified, as string in the ISO8601 format.
- htmldocs
- Full path to the location of the built (not source) Sphinx HTML documentation for the idaes_dmf package. See DMF Help Configuration for more details.
There are many different possible “styles” of formatting a list of values in YAML, but we prefer the simple block-indented style, where the key is on its own line and the values are each indented with a dash:
    _id: fe5372a7e51d498fb377da49704874eb
    created: '2018-07-16 11:10:44'
    description: A bottomless trashcan
    modified: '2018-07-16 11:10:44'
    name: Oscar the Grouch's Home
    htmldocs:
      - '{dmf_root}/doc/build/html/dmf'
      - '{dmf_root}/doc/build/html/models'
Any paths in the workspace configuration, e.g., for the “htmldocs”, can use two special variables that will take on values relative to the workspace location. This avoids hardcoded paths and makes the workspace more portable across environments.
{ws_root} will be replaced with the path to the workspace directory, and {dmf_root} will be replaced with the path to the (installed) DMF package.
The config.yaml file will allow keys and values it does not know about. These will be accessible, loaded into a Python dictionary, via the meta attribute on the Workspace instance. This may be useful for passing additional user-defined information into the DMF at startup.
-
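The effect of the two path variables can be illustrated with ordinary string formatting. Whether the DMF itself uses `str.format` is an assumption; the function name, the second list entry, and both root paths are made up for the example:

```python
def expand_paths(paths, ws_root, dmf_root):
    """Substitute {ws_root} and {dmf_root} in each configured path."""
    return [p.format(ws_root=ws_root, dmf_root=dmf_root) for p in paths]


# First entry is from the sample configuration; the second is hypothetical.
htmldocs = ["{dmf_root}/doc/build/html/dmf", "{ws_root}/docs"]
expanded = expand_paths(
    htmldocs,
    ws_root="/path/to/dmf",     # hypothetical workspace location
    dmf_root="/opt/idaes/dmf",  # hypothetical install location
)
```

Because the variables are resolved when the configuration is read, the same config.yaml keeps working if the workspace directory is moved or the package is installed elsewhere.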
CONF_CREATED
= 'created'¶ Configuration field for created date
-
CONF_DESC
= 'description'¶ Configuration field for description
-
CONF_MODIFIED
= 'modified'¶ Configuration field for modified date
-
CONF_NAME
= 'name'¶ Configuration field for name
-
ID_FIELD
= '_id'¶ Name of ID field
-
WORKSPACE_CONFIG
= 'config.yaml'¶ Name of configuration file placed in WORKSPACE_DIR
-
description
¶
-
get_doc_paths
()[source]¶ Get paths to generated HTML Sphinx docs.
Returns: (list) Paths or empty list if not found.
-
meta
¶ Get metadata.
This reads and parses the configuration. Therefore, one way to force a config refresh is to simply refer to this property, e.g.:
    dmf = DMF(path='my-workspace')
    # ... do stuff that alters the config ...
    dmf.meta  # re-read/parse the config
Returns: (dict) Metadata for this workspace.
-
name
¶
-
root
¶ Root path for this workspace. This is the path containing the configuration file.
-
class
idaes.dmf.workspace.
WorkspaceConfig
[source]¶ Bases:
object
-
DEFAULTS
= {'array': [], 'boolean': False, 'number': 0, 'string': ''}¶
-
get_fields
(only_defaults=False)[source]¶ Get all possible metadata fields for workspace config.
These values come out of the configuration schema. Keys starting with a leading underscore, like ‘_id’, are skipped.
Parameters: only_defaults – Only include fields that have a default value in the schema. Returns: - Keys are field name, values are (field description, value).
- The ‘value’ gives a default value. Its type is either a list, a number, bool, or a string; the list may be empty.
Return type: dict
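The return value described for get_fields() can be pictured with a small stand-alone sketch that maps schema types to the DEFAULTS table. The schema fragment and the `fields_sketch` function are hypothetical, and the real method reads the package's configuration schema rather than a literal dictionary:

```python
DEFAULTS = {"array": [], "boolean": False, "number": 0, "string": ""}

# Hypothetical fragment of a configuration schema's "properties" section.
schema_properties = {
    "_id": {"type": "string", "description": "Unique identifier"},
    "name": {"type": "string", "description": "Short name"},
    "htmldocs": {"type": "array", "description": "Doc paths"},
}


def fields_sketch(properties):
    """Map field name -> (description, default), skipping '_' keys."""
    fields = {}
    for key, spec in properties.items():
        if key.startswith("_"):
            continue  # keys such as '_id' are skipped
        default = DEFAULTS[spec["type"]]
        # Copy list defaults so callers never share one mutable list.
        value = list(default) if isinstance(default, list) else default
        fields[key] = (spec["description"], value)
    return fields


fields = fields_sketch(schema_properties)
```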
-