from brightway2 import *
projects.set_current("US LCI")
bw2setup()
sp = SingleOutputEcospold1Importer(
    "/Users/cmutel/Documents/LCA Documents/US LCI database/2014",
    "US LCI"
)
sp.apply_strategies()
OK, our first error. There are two process datasets that have the same process name - in this case, it looks like one was a first draft, and the other is the final dataset. One file is called "Spring wheat straw, production, average, US, 2022.xml", and the other is called "Spring wheat straw, ground and stored, 2022.xml". We will ignore the average production dataset file.
bad_file = ('/Users/cmutel/Documents/LCA Documents/US LCI database/2014/'
            'Spring wheat straw, production, average, US, 2022.xml')
sp.data = [obj for obj in sp.data if obj.get('filename') != bad_file]
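Rather than waiting for the importer to raise an error, duplicated process names can also be looked for up front. The following is a minimal sketch on toy data, not part of Brightway: the `duplicate_names` helper and the sample names and filenames are hypothetical, and it only assumes that each dataset is a dict with `name` and `filename` keys, as in the filter above.

```python
from collections import defaultdict

# Toy stand-in for sp.data (hypothetical names and filenames).
datasets = [
    {"name": "Wheat straw", "filename": "draft.xml"},
    {"name": "Wheat straw", "filename": "final.xml"},
    {"name": "Corn grain", "filename": "corn.xml"},
]


def duplicate_names(data):
    """Map each name used by more than one dataset to its filenames."""
    by_name = defaultdict(list)
    for ds in data:
        by_name[ds["name"]].append(ds.get("filename"))
    return {name: files for name, files in by_name.items() if len(files) > 1}


duplicates = duplicate_names(datasets)
print(duplicates)

# Drop the unwanted file, using the same filter pattern as above.
cleaned = [ds for ds in datasets if ds.get("filename") != "draft.xml"]
print(duplicate_names(cleaned))
```

Deciding which of the duplicated files to keep still needs a human look at the data.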
The error stopped apply_strategies before it reached the end of the list, so we now apply the last two strategies ourselves.
sp.apply_strategies(sp.strategies[-2:])
The US LCI has "dummy" processes - links to activities which are real inputs, but which aren't modeled in the database. We need to add these dummy processes as real activities (even if they don't have any inputs themselves).
from bw2io.strategies import *
sp.apply_strategy(special.add_dummy_processes_and_rename_exchanges)
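The idea behind this step can be sketched in plain Python on toy data. This is not the bw2io implementation (which also renames exchanges); the `add_dummy_datasets` helper, the field names, and the sample data are assumptions made for illustration.

```python
def add_dummy_datasets(data):
    """Append an empty dataset for each technosphere input that has no provider."""
    known = {ds["name"] for ds in data}
    dummies = []
    for ds in data:
        for exc in ds.get("exchanges", []):
            if exc.get("type") != "technosphere":
                continue  # only technosphere inputs need a providing activity
            if exc["name"] not in known:
                known.add(exc["name"])  # create each dummy only once
                dummies.append(
                    {"name": exc["name"], "unit": exc.get("unit"), "exchanges": []}
                )
    return data + dummies


# Hypothetical toy data: "Transport, unmodeled" is used but never defined.
toy = [
    {
        "name": "Bread",
        "unit": "kg",
        "exchanges": [
            {"name": "Flour", "unit": "kg", "type": "technosphere"},
            {"name": "Transport, unmodeled", "unit": "tkm", "type": "technosphere"},
            {"name": "Carbon dioxide", "unit": "kg", "type": "biosphere"},
        ],
    },
    {"name": "Flour", "unit": "kg", "exchanges": []},
]

with_dummies = add_dummy_datasets(toy)
print([ds["name"] for ds in with_dummies])
```

The dummy activity has no exchanges of its own, so it contributes nothing to the inventory, but it gives the input something to link to.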
Let's see how things look. In an ideal dataset, everything would already be linked, but we know that this is not yet true for the US LCI.
sp.statistics()
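To make "linked" concrete, here is a rough sketch of how unlinked exchanges could be counted by hand. It assumes that a linked exchange carries an `input` key while an unlinked one does not; the `count_unlinked` helper and the toy data are hypothetical and not part of Brightway.

```python
from collections import Counter


def count_unlinked(data):
    """Count exchanges without an "input" key, grouped by exchange type."""
    counts = Counter()
    for ds in data:
        for exc in ds.get("exchanges", []):
            if not exc.get("input"):
                counts[exc.get("type")] += 1
    return counts


# Hypothetical toy data with two linked and two unlinked exchanges.
toy = [
    {
        "name": "Bread",
        "exchanges": [
            {"name": "Bread", "type": "production", "input": ("toy db", "bread")},
            {"name": "Flour", "type": "technosphere"},
            {"name": "Methane", "type": "biosphere", "input": ("bio", "ch4")},
            {"name": "Carbon dioxide", "type": "biosphere"},
        ],
    }
]

unlinked = count_unlinked(toy)
print(dict(unlinked))
```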
We are now ready to start internally linking the database.
First, we migrate some names for biosphere flows.
sp.migrate("biosphere-2-3-names")
sp.migrate("biosphere-2-3-categories")
sp.migrate('default-units')
Then, we try to internally link the database. We call the match_database method with two arguments. The first is None, i.e. we are not linking against another database, but only doing internal linking. Because the US LCI doesn't use categories in exchange definitions consistently, we also pass ignore_categories=True.
sp.match_database(None, ignore_categories=True)
We find another error like the one before - the same process dataset is repeated under two different filenames.
[x['filename'] for x in sp.data if x['name'] == 'Harvesting, fresh fruit bunch, at farm']
The Harvesting... dataset is older; presumably, the Fresh fruit... dataset is the updated version. We can delete the older dataset and continue.
bad_file = '/Users/cmutel/Documents/LCA Documents/US LCI database/2014/Harvesting, fresh fruit bunch, at farm.xml'
sp.data = [obj for obj in sp.data if obj.get('filename') != bad_file]
sp.match_database(None, ignore_categories=True)
We have done all the internal linking that we can - now we need to link the biosphere flows. The code looks complicated, but it is just a fancy way of linking the biosphere flows by their names, units, and categories.
import functools
f = functools.partial(
    link_iterable_by_fields,
    other=Database(config.biosphere),
    kind='biosphere',
)
sp.apply_strategy(f)
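What linking by fields amounts to can be sketched without Brightway at all. The version below is a simplified illustration, not the bw2io code: the `link_by_fields` helper, the flow codes, and the toy data are hypothetical. It builds a lookup keyed on (name, unit, categories) and fills in `input` only on an exact match.

```python
def link_by_fields(exchanges, candidates, fields=("name", "unit", "categories")):
    """Set exc["input"] where all the given fields match a candidate exactly."""

    def key(obj):
        return tuple(obj.get(field) for field in fields)

    lookup = {key(flow): (flow["database"], flow["code"]) for flow in candidates}
    for exc in exchanges:
        if "input" in exc:
            continue  # already linked, leave untouched
        match = lookup.get(key(exc))
        if match is not None:
            exc["input"] = match
    return exchanges


# Hypothetical biosphere flows; the codes are made up for this example.
biosphere = [
    {"name": "Carbon dioxide, fossil", "unit": "kilogram",
     "categories": ("air",), "database": "biosphere3", "code": "co2-fossil"},
    {"name": "Methane, fossil", "unit": "kilogram",
     "categories": ("air",), "database": "biosphere3", "code": "ch4-fossil"},
]

exchanges = [
    {"name": "Carbon dioxide", "unit": "kilogram", "categories": ("air",)},
    {"name": "Methane, fossil", "unit": "kilogram", "categories": ("air",)},
]

link_by_fields(exchanges, biosphere)
print([exc.get("input") for exc in exchanges])
```

Because the match is exact, a flow whose name differs even slightly from the target list stays unlinked.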
Let's see how far we have got:
sp.statistics()
Not great.
Some of these unlinked exchanges are links to ecoinvent 2.2, so we would not expect them to link here.
Let's export lists of what we have so far.
sp.write_excel(only_unlinked=True)
sp.write_excel(only_names=True)
sp.write_excel()
The Excel output files are available for download at https://bitbucket.org/cmutel/brightway2/src/tip/notebooks/files/?at=2.0. Click on "view raw" for each file to download it.
We can search the biosphere database to find out why some biosphere flows weren't linked. For example, Carbon dioxide - that seems strange. Why didn't that work?
db = Database("biosphere3")
db.search("Carbon dioxide")
Oh, we would need to specify if it was fossil or non-fossil, as they are handled differently in GWP calculations.
For every unmatched exchange, there is a reason the computer couldn't match it exactly. The next step is to figure out the problem for each exchange, and then write a migration to fix the input data to match what is expected.
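As a rough illustration of what such a migration does, the sketch below rewrites exchange fields from a lookup table. The layout (a list of fields plus pairs of match key and replacement values) is modeled loosely on Brightway's migration data and should be checked against the bw2io documentation before use; the `apply_migration` helper is hypothetical, and mapping to "fossil" is purely illustrative, since the right choice depends on where each emission comes from.

```python
# Hypothetical migration: which fields to match on, and what to change.
migration = {
    "fields": ["name", "categories"],
    "data": [
        (("Carbon dioxide", ("air",)), {"name": "Carbon dioxide, fossil"}),
    ],
}


def apply_migration(exchanges, migration):
    """Update each exchange whose key fields match an entry in the migration."""
    fields = migration["fields"]
    mapping = dict(migration["data"])  # match key (tuple) -> replacement values
    for exc in exchanges:
        key = tuple(exc.get(field) for field in fields)
        exc.update(mapping.get(key, {}))
    return exchanges


# Toy exchanges: only the first one matches the migration key.
toy_exchanges = [
    {"name": "Carbon dioxide", "categories": ("air",), "unit": "kilogram"},
    {"name": "Carbon dioxide", "categories": ("water",), "unit": "kilogram"},
]

apply_migration(toy_exchanges, migration)
print([exc["name"] for exc in toy_exchanges])
```

After a rewrite like this, the exchanges can be run through the linking step again.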