Brightway2-io

BW2Package

Brightway2 has its own data format for archiving data which is both efficient and compatible across operating systems and programming languages. This is the default backup format for Brightway2 DataStore objects.

Note

Both imports and exports are supported.

class bw2io.package.BW2Package

This is a format for saving objects which implement the DataStore API. Data is stored as a BZip2-compressed file of JSON data. This archive format is compatible across Python versions, and is, at least in theory, programming-language agnostic.

Validation is done with bw2data.validate.bw2package_validator.

The data format is:

{
    'metadata': {},  # Dictionary of metadata to be written to metadata-store.
    'name': basestring,  # Name of object
    'class': {  # Data on the underlying class. A new class is instantiated
                # based on these strings. See _create_class.
        'module': basestring,  # e.g. "bw2data.database"
        'name': basestring  # e.g. "Database"
    },
    'unrolled_dict': bool,  # Flag indicating if dictionary keys needed to
                            # be modified for JSON (as JSON keys can't be tuples)
    'data': object  # Object data, e.g. LCIA method or LCI database
}

Perfect roundtrips between machines are not guaranteed:
  • All lists are converted to tuples (because JSON does not distinguish between lists and tuples).
  • Absolute filepaths in metadata would be specific to a certain computer and user.
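As a rough illustration of the storage idea (not the actual bw2io implementation), the sketch below writes a dictionary in the layout above as BZip2-compressed JSON and reads it back. The helper names save_package and load_package and the sample data are hypothetical. The point is only that a tuple does not survive a JSON roundtrip as a tuple, so a loader has to settle on one sequence type.

```python
import bz2
import json
import os
import tempfile


def save_package(package, filepath):
    """Serialize a package dict to BZip2-compressed JSON (hypothetical helper)."""
    with bz2.open(filepath, "wb") as f:
        f.write(json.dumps(package).encode("utf-8"))


def load_package(filepath):
    """Read a BZip2-compressed JSON file back into Python objects."""
    with bz2.open(filepath, "rb") as f:
        return json.loads(f.read().decode("utf-8"))


# Sample data following the documented layout; values are made up.
package = {
    "metadata": {"unit": "kg"},
    "name": "example",
    "class": {"module": "bw2data.database", "name": "Database"},
    "unrolled_dict": False,
    "data": {"flow": ("air", "urban")},  # a tuple on the way in
}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.bw2package")
    save_package(package, path)
    loaded = load_package(path)

# JSON has a single array type, so the tuple comes back as a list.
print(type(loaded["data"]["flow"]).__name__)
```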

Note

This class does not need to be instantiated, as all its methods are classmethods. For example, call BW2Package.import_file("foo") instead of BW2Package().import_file("foo").

classmethod export_obj(obj, filename=None, folder='export', backwards_compatible=False)

Export an object.

Args:
  • obj (object): Object to export.
  • filename (str, optional): Name of file to create. Default is obj.name.
  • folder (str, optional): Folder to create file in. Default is export.
  • backwards_compatible (bool, optional): Create package compatible with bw2data version 1.
Returns:
Filepath of created file.

classmethod export_objs(objs, filename, folder='export', backwards_compatible=False)

Export a list of objects. Can have heterogeneous types.

Args:
  • objs (list): List of objects to export.
  • filename (str): Name of file to create.
  • folder (str, optional): Folder to create file in. Default is export.
  • backwards_compatible (bool, optional): Create package compatible with bw2data version 1.
Returns:
Filepath of created file.

classmethod import_file(filepath, whitelist=True)

Import bw2package file, and create the loaded objects, including registering, writing, and processing the created objects.

Args:
  • filepath (str): Path of file to import
  • whitelist (bool): Apply whitelist to allowed types. Default is True.
Returns:
Created object or list of created objects.

classmethod load_file(filepath, whitelist=True)

Load a bw2package file with one or more objects. Does not create new objects.

Args:
  • filepath (str): Path of file to import
  • whitelist (bool): Apply whitelist of approved classes to allowed types. Default is True.
Returns the loaded data in the bw2package dict data format, with the following changes:
  • "class" is an actual Python class object (but not instantiated).

Migrations

class bw2io.migrations.Migration(*args, **kwargs)
write(data, description)

Write migration data. Requires a description.

bw2io.migrations.create_core_migrations()

Add pre-defined core migrations data files

Extractors

Ecospold 1

class bw2io.extractors.ecospold1.Ecospold1DataExtractor
classmethod process_exchange(exc, dataset)

Process exchange.

Input groups are:

  1. Materials/fuels
  2. Electricity/Heat
  3. Services
  4. FromNature
  5. FromTechnosphere

Output groups are:

  0. Reference product
  1. Include avoided product system
  2. Allocated byproduct
  3. Waste to treatment
  4. ToNature

A single-output process has one exchange in output group 0; a multi-output (MO) process has multiple exchanges in output group 2. Output groups 1 and 3 are not used in ecoinvent.

class bw2io.extractors.ecospold1_lcia.Ecospold1LCIAExtractor

Extract impact assessment methods and weightings data from ecospold XML format.

Ecospold 2

class bw2io.extractors.ecospold2.Ecospold2DataExtractor
classmethod extract_exchange(exc)

Process exchange.

Input groups are:

  1. Materials/fuels
  2. Electricity/Heat
  3. Services
  4. From environment (elementary exchange only)
  5. FromTechnosphere

Output groups are:

  0. ReferenceProduct
  2. By-product
  3. MaterialForTreatment
  4. To environment (elementary exchange only)
  5. Stock addition

Simapro CSV

class bw2io.extractors.simapro_csv.SimaProCSVExtractor
classmethod parse_biosphere_flow(line, category, pm)

Parse biosphere flow line.

  0. name
  1. subcategory
  2. unit
  3. value or formula
  4. uncertainty type
  5. uncert. param.
  6. uncert. param.
  7. uncert. param.
  8. comment

However, sometimes the value is in index 2, and the unit in index 3. Because why not! We assume default ordering unless we find a number in index 2.
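The ordering check can be sketched as follows, reading the indexes in the sentence above as zero-based positions in the split line. This is a minimal illustration and not the extractor's code; the helper names looks_like_number and unit_and_value and the sample lines are hypothetical.

```python
def looks_like_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
    except (TypeError, ValueError):
        return False
    return True


def unit_and_value(line, unit_index=2):
    """Return (unit, value) from a split line.

    The default ordering puts the unit at unit_index and the value right
    after it. If the field at unit_index parses as a number, the two
    fields are assumed to be swapped.
    """
    first, second = line[unit_index], line[unit_index + 1]
    if looks_like_number(first):
        return second, first
    return first, second


default_order = ["Carbon dioxide", "low. pop.", "kg", "1.5", "0", "", "", "", ""]
swapped_order = ["Carbon dioxide", "low. pop.", "1.5", "kg", "0", "", "", "", ""]

print(unit_and_value(default_order))
print(unit_and_value(swapped_order))
```

The same check would apply to the technosphere and reference product lines by passing unit_index=1.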

classmethod parse_calculated_parameter(line, pm)

Parse line in Calculated parameters section.

  0. name
  1. formula
  2. comment

Can include multiline comment in TSV.

classmethod parse_final_waste_flow(line, pm)

Parse final waste flow line.

  0. name
  1. subcategory?
  2. unit
  3. value or formula
  4. uncertainty type
  5. uncert. param.
  6. uncert. param.
  7. uncert. param.

However, sometimes the value is in index 2, and the unit in index 3. Because why not! We assume default ordering unless we find a number in index 2.

classmethod parse_input_line(line, category, pm)

Parse technosphere input line.

  0. name
  1. unit
  2. value or formula
  3. uncertainty type
  4. uncert. param.
  5. uncert. param.
  6. uncert. param.
  7. comment

However, sometimes the value is in index 1, and the unit in index 2. Because why not! We assume default ordering unless we find a number in index 1.

classmethod parse_input_parameter(line)

Parse line in Input parameters section.

  0. name
  1. value (not formula)
  2. uncertainty type
  3. uncert. param.
  4. uncert. param.
  5. uncert. param.
  6. hidden (“Yes” or “No” - we ignore)
  7. comment

classmethod parse_reference_product(line, pm)

Parse reference product line.

  0. name
  1. unit
  2. value or formula
  3. allocation
  4. waste type
  5. category (separated by )
  6. comment

However, sometimes the value is in index 1, and the unit in index 2. Because why not! We assume default ordering unless we find a number in index 1.

classmethod parse_waste_treatment(line, pm)

Parse waste treatment line.

  0. name
  1. unit
  2. value or formula
  3. waste type
  4. category (separated by )
  5. comment

class bw2io.extractors.simapro_lcia_csv.SimaProLCIACSVExtractor
classmethod parse_cf(line)

Parse line in Substances section.

  0. category
  1. subcategory
  2. flow
  3. CAS number
  4. CF
  5. unit
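As an illustration of the field order listed above, the sketch below pairs each position of a split line with its field name. The function cf_line_to_dict, the dictionary keys, and the sample line are hypothetical; the dictionary the real extractor returns may be shaped differently.

```python
# Field names in the order given above; used here as dictionary keys.
CF_FIELDS = ("category", "subcategory", "flow", "CAS number", "CF", "unit")


def cf_line_to_dict(line):
    """Map a split Substances line onto named fields (hypothetical helper)."""
    record = dict(zip(CF_FIELDS, line))
    # The characterization factor is the only numeric field.
    record["CF"] = float(record["CF"])
    return record


example = ["Air", "(unspecified)", "Methane", "000074-82-8", "25", "kg"]
record = cf_line_to_dict(example)
print(record["flow"], record["CF"], record["unit"])
```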

Importers

Base

class bw2io.importers.base.ImportBase(*args, **kwargs)

Base class for format-specific importers.

Defines workflow for applying strategies.

apply_strategies(strategies=None, verbose=True)

Apply a list of strategies.

Uses the default list self.strategies if strategies is None.

Args:
strategies (list, optional): List of strategies to apply. Defaults to self.strategies.
Returns:
Nothing, but modifies self.data, and adds each strategy to self.applied_strategies.

apply_strategy(strategy, verbose=True)

Apply strategy transform to self.data.

Adds the strategy name to self.applied_strategies. If StrategyError is raised, the error message is printed but the error is not re-raised.

Note

Strategies should not partially modify data before raising StrategyError.

Args:
strategy (callable)
Returns:
Nothing, but modifies self.data, and adds strategy to self.applied_strategies.
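The workflow described for apply_strategies and apply_strategy can be sketched with a small stand-in class. This is an assumption-laden illustration, not the ImportBase source: MiniImporter, the local StrategyError class, and the two example strategies are all hypothetical.

```python
class StrategyError(Exception):
    """Stand-in for the StrategyError mentioned above."""


class MiniImporter:
    """Hypothetical, stripped-down importer holding data and strategies."""

    def __init__(self, data, strategies):
        self.data = data
        self.strategies = strategies
        self.applied_strategies = []

    def apply_strategy(self, strategy, verbose=True):
        name = strategy.__name__
        if verbose:
            print("Applying strategy: {}".format(name))
        try:
            # A strategy takes the data and returns the transformed data.
            self.data = strategy(self.data)
            self.applied_strategies.append(name)
        except StrategyError as err:
            # Report the failure, but keep going with unchanged data.
            print("Could not apply strategy {}: {}".format(name, err))

    def apply_strategies(self, strategies=None, verbose=True):
        for strategy in self.strategies if strategies is None else strategies:
            self.apply_strategy(strategy, verbose)


def strip_names(data):
    """Example strategy: remove surrounding whitespace from names."""
    return [dict(ds, name=ds["name"].strip()) for ds in data]


def require_units(data):
    """Example strategy that fails before touching the data."""
    if any("unit" not in ds for ds in data):
        raise StrategyError("every dataset needs a unit")
    return data


importer = MiniImporter([{"name": " steel "}], [strip_names, require_units])
importer.apply_strategies(verbose=False)
print(importer.applied_strategies)
```

Because require_units raises before modifying anything, the data is left in the state produced by the last successful strategy.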
unlinked

Iterate through unique unlinked exchanges.

Uniqueness is determined by activity_hash.

write_unlinked(name)

Write all data to an UnlinkedData data store (not a Database!)

class bw2io.importers.base_lci.LCIImporter(db_name)

Base class for format-specific importers.

Defines workflow for applying strategies.

Takes a database name (string) as initialization parameter.

add_unlinked_activities()

Add technosphere flows to self.data.

create_new_biosphere(biosphere_name, relink=True)

Create new biosphere database from biosphere flows in self.data.

Links all biosphere flows to new bio database if relink.

match_database(db_name=None, fields=None, ignore_categories=False, relink=False, kind=None)

Match current database against itself or another database.

If db_name is None, match against current data. Otherwise, db_name should be the name of an existing Database.

fields is a list of fields to use for matching. Field values are case-insensitive, but otherwise must match exactly for a link to be valid. If fields is None, use the default fields of ‘name’, ‘categories’, ‘unit’, ‘reference product’, and ‘location’.

If ignore_categories, link based only on name, unit and location. ignore_categories conflicts with fields.

If relink, relink exchanges even if a link is already present.

kind can be a string or a list of strings. Common values are “technosphere”, “biosphere”, “production”, and “substitution”.

Nothing is returned, but self.data is changed.

write_database(data=None, name=None, overwrite=True, backend=None, **kwargs)

Write data to a Database.

All arguments are optional, and are normally not specified.

Args:
  • data (dict, optional): The data to write to the Database. Default is self.data.
  • name (str, optional): The name of the Database to create. Default is self.db_name.
  • overwrite (bool, optional): Overwrite the Database if it currently exists. Default is True.
  • backend (string, optional): Storage backend to use when creating Database. Default is the default backend.
Returns:
Database instance.

write_excel(only_unlinked=False, only_names=False)

Write database information to a spreadsheet.

If only_unlinked, then only write unlinked exchanges.

If only_names, then write only activity names, no exchange data.

Returns the filepath to the spreadsheet file.

class bw2io.importers.base_lcia.LCIAImporter(filepath, biosphere=None)

Ecospold 1

Ecospold version 1 is the data format of ecoinvent versions 1 and 2, and the US LCI. It is an XML data format with reasonable defaults.

Note

Only imports are supported.

class bw2io.importers.ecospold1.SingleOutputEcospold1Importer(filepath, db_name)

Import and process single-output datasets in the ecospold 1 format.

Applies the following strategies:

  1. If only one exchange is a production exchange, that is the reference product
  2. Delete (unreliable) integer codes from extracted data
  3. Drop unspecified subcategories from biosphere flows
  4. Normalize biosphere flow categories to ecoinvent 3.1 standard
  5. Normalize biosphere flow names to ecoinvent 3.1 standard
  6. Remove locations from biosphere exchanges
  7. Create a code from the activity hash of the dataset
  8. Link biosphere exchanges to the default biosphere database
  9. Link internal technosphere exchanges

Args:
  • filepath: Either a file or directory.
  • db_name: Name of database to create.

class bw2io.importers.ecospold1.MultiOutputEcospold1Importer(*args, **kwargs)

Import and process multi-output datasets in the ecospold 1 format.

Works the same as the single-output importer, but first allocates multioutput datasets.

class bw2io.importers.ecospold1_lcia.Ecospold1LCIAImporter(filepath, biosphere=None)

Ecospold 2

Ecospold version 2 is the data format of ecoinvent version 3.

Note

Only imports are supported.

class bw2io.importers.ecospold2.SingleOutputEcospold2Importer(dirpath, db_name)
class bw2io.importers.ecospold2_biosphere.Ecospold2BiosphereImporter(name='biosphere3')

Ecoinvent

class bw2io.importers.ecoinvent_lcia.EcoinventLCIAImporter
separate_methods()

Separate the list of CFs into distinct methods

Simapro

Import a SimaPro text file.

Note

Only imports are supported.

class bw2io.importers.simapro_csv.SimaProCSVImporter(filepath, name=None, delimiter=';', encoding='latin-1', normalize_biosphere=True, biosphere_db=None)
class bw2io.importers.simapro_lcia_csv.SimaProLCIACSVImporter(filepath, biosphere=None, delimiter=';', encoding='latin-1', normalize_biosphere=True)

Excel

Import an inventory in an Excel spreadsheet which follows the generic Excel example.

Note

Both imports and exports are supported.

class bw2io.importers.excel.ExcelImporter(filepath)

Generic Excel importer.

See the generic Excel example spreadsheet.

Excel spreadsheet should follow the following format:

Database, <name of database>
<database field name>, <database field value>
<blank line>
Activity, <name of activity>
<database field name>, <database field value>
Exchanges
<field name>, <field name>, <field name>
<value>, <value>, <value>
<value>, <value>, <value>
<blank line>

Exchanges for each activity are not required.

An activity is marked as finished with a blank line.

In general, data is imported without modification. However, the following transformations are applied:

  • Numbers are translated from text into actual numbers.
  • Tuples, separated in the cell by the :: string, are reconstructed.
  • True and False are transformed to boolean values.
  • Fields with the value (Unknown) are dropped.
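The four transformations listed above can be sketched as a per-cell conversion. This is a minimal sketch assuming every cell arrives as text; the helpers convert_cell and convert_row and the sample row are hypothetical and do not reproduce the importer's own code.

```python
def convert_cell(value):
    """Apply the listed transformations to one cell's text."""
    if value == "True":
        return True
    if value == "False":
        return False
    if "::" in value:
        # Tuples are stored in a single cell, joined by "::".
        return tuple(part.strip() for part in value.split("::"))
    try:
        return float(value)
    except ValueError:
        # Anything else is kept as text, without modification.
        return value


def convert_row(fields):
    """Convert a {field: text} mapping, dropping fields marked (Unknown)."""
    return {
        key: convert_cell(value)
        for key, value in fields.items()
        if value != "(Unknown)"
    }


row = {
    "name": "steel production",
    "amount": "2.5",
    "categories": "air::urban",
    "is_waste": "False",
    "location": "(Unknown)",
}
converted = convert_row(row)
print(converted)
```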

CSV

Import an inventory in a CSV file which follows the generic CSV example.

Note

Both imports and exports are supported.

class bw2io.importers.csv.CSVImporter(filepath)

Generic CSV importer.

CSV should follow the following format:

Database, <name of database>
<database field name>, <database field value>
<blank line>
Activity, <name of activity>
<database field name>, <database field value>
Exchanges
<field name>, <field name>, <field name>
<value>, <value>, <value>
<value>, <value>, <value>
<blank line>

Exchanges for each activity are not required.

An activity is marked as finished with a blank line.

In general, data is imported without modification. However, the following transformations are applied:

  • Numbers are translated from text into actual numbers.
  • Tuples, separated in the CSV by the :: string, are reconstructed.
  • True and False are transformed to boolean values.

Strategies

Migrations

bw2io.strategies.migrations.migrate_datasets(db, migration)
bw2io.strategies.migrations.migrate_exchanges(db, migration)

Generic

Generic function to link objects in unlinked to objects in other, matching on the fields listed in fields.

The database to be linked must have uniqueness for each object for the given fields.

If kind is given, only objects in unlinked of type kind are linked.

If relink, also link objects which already have an input. Otherwise, skip objects that are already linked.

If internal, link unlinked to other objects in unlinked. Each object must have the attributes database and code.
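The linking rules above can be sketched as a lookup table keyed on the chosen fields. This is an illustrative sketch, not the library function: the name link_by_fields, its signature, the exact-match comparison, and the sample data are assumptions.

```python
def link_by_fields(unlinked, other, fields, kind=None, relink=False):
    """Link exchanges of the datasets in `unlinked` to objects in `other`."""

    def key(obj):
        return tuple(obj.get(field) for field in fields)

    # Build the lookup and enforce uniqueness for the given fields.
    candidates = {}
    for obj in other:
        if key(obj) in candidates:
            raise ValueError("Fields do not identify objects uniquely")
        candidates[key(obj)] = (obj["database"], obj["code"])

    for ds in unlinked:
        for exc in ds.get("exchanges", []):
            if kind is not None and exc.get("type") != kind:
                continue  # only link exchanges of the requested type
            if "input" in exc and not relink:
                continue  # already linked, and relinking not requested
            match = candidates.get(key(exc))
            if match is not None:
                exc["input"] = match
    return unlinked


other = [{"database": "bio", "code": "co2", "name": "Carbon dioxide", "unit": "kg"}]
unlinked = [
    {
        "exchanges": [
            {"name": "Carbon dioxide", "unit": "kg", "type": "biosphere"},
            {"name": "Steel", "unit": "kg", "type": "technosphere"},
        ]
    }
]
linked = link_by_fields(unlinked, other, fields=("name", "unit"), kind="biosphere")
print(linked[0]["exchanges"][0]["input"])
```

Internal linking would correspond to passing the datasets themselves as other, which is why each of them then needs database and code.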

bw2io.strategies.generic.assign_only_product_as_production(db)

Assign only product as reference product.

Skips datasets that already have a reference product.

This requires something to extract production exchanges to a new list called products. Usually this happens in the extractors, but it could also be a strategy.

Link technosphere exchanges using activity_hash function.

If external_db_name, link against a different database; otherwise link internally.

If fields, link using only certain fields.

bw2io.strategies.generic.set_code_by_activity_hash(db, overwrite=False)

Use activity_hash to set dataset code.

By default, won’t overwrite existing codes, but will if overwrite is True.
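The overwrite behaviour can be sketched as below. The hash function here is a stand-in: simple_activity_hash, its choice of fields, and the use of SHA-256 are assumptions made for the example and are not taken from the activity_hash implementation.

```python
import hashlib


def simple_activity_hash(ds, fields=("name", "unit", "location")):
    """Stand-in hash over lower-cased field values (an assumption)."""
    text = "".join(str(ds.get(field) or "").lower() for field in fields)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def set_code(db, overwrite=False):
    """Set a code on each dataset; keep existing codes unless overwrite."""
    for ds in db:
        if "code" not in ds or overwrite:
            ds["code"] = simple_activity_hash(ds)
    return db


db = [
    {"name": "Steel", "unit": "kg", "location": "CH", "code": "keep-me"},
    {"name": "Steel", "unit": "kg", "location": "DE"},
]
set_code(db)
print(db[0]["code"])
```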

bw2io.strategies.generic.tupleize_categories(db)
bw2io.strategies.generic.drop_unlinked(db)

This is the nuclear option - use at your own risk!

bw2io.strategies.generic.normalize_units(db)

Normalize units in datasets and their exchanges

Biosphere

bw2io.strategies.biosphere.drop_unspecified_subcategories(db)

Drop subcategories if they are in the following:

  • unspecified
  • (unspecified)
  • '' (empty string)
  • None
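A minimal sketch of this rule, assuming categories are stored as a (category, subcategory) pair; the helper name drop_unspecified and the sample values are hypothetical.

```python
# The values treated as "no real subcategory", as listed above.
UNSPECIFIED = {"unspecified", "(unspecified)", "", None}


def drop_unspecified(categories):
    """Return categories without an unspecified subcategory."""
    if len(categories) == 2 and categories[1] in UNSPECIFIED:
        return (categories[0],)
    return tuple(categories)


print(drop_unspecified(("air", "unspecified")))
print(drop_unspecified(("air", "urban air close to ground")))
```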

bw2io.strategies.biosphere.normalize_biosphere_names(db, lcia=False)

Normalize biosphere flow names to ecoinvent 3.1 standard.

Assumes that each dataset and each exchange have a name. Will change names even if exchange is already linked.

bw2io.strategies.biosphere.normalize_biosphere_categories(db, lcia=False)

Normalize biosphere categories to ecoinvent 3.1 standard

bw2io.strategies.biosphere.strip_biosphere_exc_locations(db)

Biosphere flows don’t have locations - if any are included they can confuse linking

LCIA

bw2io.strategies.lcia.add_activity_hash_code(data)

Add code field to characterization factors using activity_hash, if code not already present.

bw2io.strategies.lcia.drop_unlinked_cfs(data)

Drop CFs which don’t have input attribute

bw2io.strategies.lcia.set_biosphere_type(data)

Set CF types to ‘biosphere’, to keep compatibility with LCI strategies.

This will overwrite existing type values.

bw2io.strategies.lcia.match_subcategories(data, biosphere_db_name, remove=True)

Given a characterization with a top-level category, e.g. ('air',), find all biosphere flows with the same top-level categories, and add CFs for these flows as well. Doesn’t replace CFs for existing flows with multi-level categories. If remove, also delete the top-level CF, but only if it is unlinked.

Ecospold 1

bw2io.strategies.ecospold1_allocation.clean_integer_codes(data)

Convert integer activity codes to strings and delete integer codes from exchanges (they can’t be believed).

bw2io.strategies.ecospold1_allocation.es1_allocate_multioutput(data)

This strategy allocates multioutput datasets to new datasets.

This deletes the multioutput dataset, breaking any existing linking. This shouldn’t be a concern, as you shouldn’t link to a multioutput dataset in any case.

Note that multiple allocations for the same product and input will result in undefined behavior.

bw2io.strategies.ecospold1_allocation.allocate_exchanges(ds)

Take a dataset, which has multiple outputs, and return a list of allocated datasets.

The allocation data structure looks like:

{
    'exchanges': [integer codes for biosphere flows, ...],
    'fraction': out of 100,
    'reference': integer codes
}

We assume that the allocation factor for each coproduct is always 100 percent.
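To make the allocation data structure concrete, the sketch below applies a list of such allocation records to a set of exchanges, scaling each listed exchange by fraction / 100 for the given reference product. It is an illustration under assumptions: the function allocate, the mapping of integer codes to exchange dictionaries, and the numbers are hypothetical, and the real strategy does more (for example, building complete datasets).

```python
import copy


def allocate(exchanges, allocations):
    """Return {reference code: list of scaled exchanges}.

    `exchanges` maps integer codes to exchange dicts. Each allocation
    assigns `fraction` percent of the listed exchanges to `reference`.
    """
    allocated = {}
    for alloc in allocations:
        factor = alloc["fraction"] / 100
        bucket = allocated.setdefault(alloc["reference"], [])
        for code in alloc["exchanges"]:
            exc = copy.deepcopy(exchanges[code])  # leave the input untouched
            exc["amount"] *= factor
            bucket.append(exc)
    return allocated


exchanges = {7: {"name": "Carbon dioxide", "amount": 10.0}}
allocations = [
    {"exchanges": [7], "fraction": 75, "reference": 1},
    {"exchanges": [7], "fraction": 25, "reference": 2},
]
result = allocate(exchanges, allocations)
print(result[1][0]["amount"], result[2][0]["amount"])
```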

Ecospold 2

bw2io.strategies.ecospold2.remove_zero_amount_coproducts(db)

Remove coproducts with zero production amounts from exchanges

bw2io.strategies.ecospold2.remove_zero_amount_inputs_with_no_activity(db)

Remove technosphere exchanges with amount of zero and no uncertainty.

Input exchanges with zero amounts are the result of the ecoinvent linking algorithm, and can be safely discarded.

bw2io.strategies.ecospold2.es2_assign_only_product_with_amount_as_reference_product(db)

If a multioutput process has one product with a non-zero amount, assign that product as reference product.

This is by default called after remove_zero_amount_coproducts, which will delete the zero-amount coproducts in any case. However, we still keep the zero-amount logic in case people want to keep all coproducts.

bw2io.strategies.ecospold2.assign_single_product_as_activity(db)
bw2io.strategies.ecospold2.create_composite_code(db)

Create composite code from activity and flow names

Link internal technosphere inputs by code.

Only links to process datasets actually in the database document.

bw2io.strategies.ecospold2.delete_exchanges_missing_activity(db)

Delete exchanges that weren’t linked correctly by ecoinvent.

These exchanges are missing the “activityLinkId” attribute, and the flow they want to consume is not produced as the reference product of any activity. See the known data issues report.

bw2io.strategies.ecospold2.delete_ghost_exchanges(db)

Delete technosphere exchanges which can’t be linked due to ecoinvent errors.

A ghost exchange is one which links to a combination of activity and flow which aren’t provided in the database.

Simapro

bw2io.strategies.simapro.sp_allocate_products(db)

Create a dataset from each product in a raw SimaPro dataset

Link technosphere exchanges based on name, unit, and location. Can’t use categories because we can’t reliably extract categories from SimaPro exports, only exchanges.

If external_db_name, link against a different database; otherwise link internally.

bw2io.strategies.simapro.split_simapro_name_geo(db)

Split a name like ‘foo/CH U’ into name and geo components.

Sets original name to simapro name.
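One way to picture the split is a regular expression over names of the form "name/geo U". The pattern, the helper split_name_geo, and the dictionary keys "location" and "simapro name" are assumptions for this sketch; the strategy's actual pattern may accept more variants.

```python
import re

# Hypothetical pattern: "<name>/<geo> <U or S>", e.g. "foo/CH U".
NAME_GEO = re.compile(r"^(?P<name>.+)/(?P<geo>[A-Za-z]{2,10}) [SU]$")


def split_name_geo(ds):
    """Split the name in place, keeping the original under 'simapro name'."""
    match = NAME_GEO.match(ds["name"])
    if match:
        ds["simapro name"] = ds["name"]
        ds["name"] = match.group("name")
        ds["location"] = match.group("geo")
    return ds


print(split_name_geo({"name": "foo/CH U"}))
```

Names that do not match the pattern are left unchanged.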

bw2io.strategies.simapro.normalize_simapro_biosphere_categories(db)

Normalize biosphere categories to ecoinvent standard.

bw2io.strategies.simapro.normalize_simapro_biosphere_names(db)

Normalize biosphere flow names to ecoinvent standard

bw2io.strategies.simapro.normalize_simapro_formulae(formula, settings)

Convert SimaPro formulae to Python

Special

bw2io.strategies.special.add_dummy_processes_and_rename_exchanges(db)

Add new processes to link to so-called “dummy” processes in the US LCI database.

Export

bw2io.export.excel.write_lci_excel(database_name, objs=None)

Export database database_name to an Excel spreadsheet.

Not all data can be exported. The following constraints apply:

  • Nested data, e.g. {'foo': {'bar': 'baz'}}, is excluded. Spreadsheets are not a great format for nested data. However, tuples are exported, and the characters :: are used to join elements of the tuple.

  • Only the following fields in exchanges are exported:
    • name
    • amount
    • unit
    • database
    • categories
    • location
    • type
    • uncertainty type
    • loc
    • scale
    • shape
    • minimum
    • maximum
  • The only well-supported data types are strings, numbers, and booleans.

Returns the filepath of the exported file.

bw2io.export.csv.write_lci_csv(database_name)

Export database database_name to a CSV file.

Not all data can be exported. The following constraints apply:

  • Nested data, e.g. {'foo': {'bar': 'baz'}}, is excluded. CSV is not a great format for nested data. However, tuples are exported, and the characters :: are used to join elements of the tuple.

  • Only the following fields in exchanges are exported:
    • name
    • amount
    • unit
    • database
    • categories
    • location
    • type
    • uncertainty type
    • loc
    • scale
    • shape
    • minimum
    • maximum
  • The only well-supported data types are strings, numbers, and booleans.

Returns the filepath of the exported file.

bw2io.export.excel.lci_matrices_to_excel(database_name, include_descendants=True)

Fake docstring

bw2io.export.excel.write_lci_activities(database_name)

Write activity names and metadata to Excel file

bw2io.export.excel.write_lci_matching(db, database_name, only_unlinked=False, only_activity_names=False)

Write matched and unmatched exchanges to Excel file

bw2io.export.excel.write_lcia_matching(db, name)

Write matched and unmatched CFs to Excel file

Gephi is an open-source graph visualization and analysis program.

Note

Only exports are supported.

class bw2io.export.gexf.DatabaseToGEXF(database, include_descendants=False)

Export a Gephi graph for a database.

Call .export() to export the file after class instantiation.

Args:
  • database (str): Database name.
  • include_descendants (bool): Include databases which are linked from database.

Warning

include_descendants is not yet implemented.

export()

Export the Gephi XML file. Returns the filepath of the created file.

get_data(E)

Get Gephi nodes and edges.

bw2io.export.matlab.lci_matrices_to_matlab(database_name)

Backups

bw2io.backup.backup_data_directory()

Backup data directory to a .tar.gz (compressed tar archive).

Backup archive is saved to the user’s home directory.

Restoration is done manually. Returns the filepath of the backup archive.
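For readers unfamiliar with the archive format, the sketch below shows how a directory can be packed into a .tar.gz file with the standard library and how its contents can be inspected before a manual restore. The function backup_directory, the file naming, and the temporary directories are hypothetical and do not mirror the paths bw2io uses.

```python
import datetime
import os
import tarfile
import tempfile


def backup_directory(source_dir, target_dir):
    """Pack source_dir into a timestamped .tar.gz inside target_dir."""
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = os.path.join(target_dir, "backup.{}.tar.gz".format(stamp))
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the folder name.
        tar.add(source_dir, arcname=os.path.basename(source_dir))
    return archive


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "data")
    os.mkdir(source)
    with open(os.path.join(source, "notes.txt"), "w") as f:
        f.write("example")

    archive = backup_directory(source, workdir)

    # List the archive contents, as one would before restoring by hand.
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()

print(names)
```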

bw2io.backup.backup_project_directory(project)

Backup project data directory to a .tar.gz (compressed tar archive).

project is the name of a project.

Backup archive is saved to the user’s home directory.

Restoration is done using restore_project_directory.

Returns the filepath of the backup archive.

Data

bw2io.data.write_json_file(data, name)
bw2io.data.get_ecoinvent_301_31_migration_data()
bw2io.data.get_ecoinvent_2_301_migration_data()
bw2io.data.get_biosphere_2_3_category_migration_data()

Get data for 2 -> 3 migration for biosphere flow categories

bw2io.data.get_biosphere_2_3_name_migration_data()

Get migration data for 2 -> 3 biosphere flow names.

This migration must be applied only after categories have been updated.

Note that the input data Excel sheet is modified from the raw data provided by ecoinvent: some biosphere flows which had no equivalent in ecospold2 were mapped using my best judgment. Name changes from 3.1 were also included. Modified cells are marked in dark orange.

Note that not all rows have names in ecoinvent 3. There are a few energy resources that we don’t update. For water flows, the categories are updated by a different strategy, and the names don’t change, so we just ignore them for now.

bw2io.data.get_us_lci_migration_data()

Fix US LCI database name inconsistencies

bw2io.data.convert_simapro_ecoinvent_elementary_flows()

Write a correspondence list from SimaPro elementary flow names to ecoinvent 3 flow names to a JSON file.

Uses custom SimaPro specific data. Ecoinvent 2 -> 3 conversion is in a separate JSON file.

bw2io.data.get_simapro_ecoinvent_3_migration_data(version)

Write a migrations data file from SimaPro activity names to ecoinvent 3 processes.

Correspondence file is processed from Pré, and has the following fields:

  1. SimaPro name
  2. Ecoinvent flow name
  3. Location
  4. Ecoinvent activity name
  5. System model
  6. SimaPro type

Note that even the official matching data from Pré is incorrect, but works if we cast all strings to lower case.

SimaPro type is either System terminated or Unit process. We always match to unit processes regardless of SimaPro type.

bw2io.data.convert_ecoinvent_2_301()

Write a migrations data file from ecoinvent 2 to 3.1.

This is not simple, unfortunately. We have to deal with at least the following:
  • Unit changes (e.g. cubic meters to MJ)
  • Some datasets are deleted, and replaced by others

bw2io.data.convert_lcia_methods_data()