Describe the bug
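Stetl fails when a configuration option has a dictionary value and an arguments dictionary is passed for substitution (for example with -a): the literal braces of the dictionary value are interpreted as substitution placeholders.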
Example config from: https://github.com/geopython/stetl/blob/master/examples/basics/11_formatconvert/etl.cfg#L46
# The GML must be a simple features collection
[convert_to_geojson]
class = stetl.filters.formatconverter.FormatConverter
input_format = etree_doc
output_format = geojson_collection
converter_args = {
'root_tag': 'FeatureCollection',
'feature_tag': 'featureMember',
'feature_id_attr': 'fid'
}
To Reproduce
$ PYTHONPATH=. python3 bin/stetl -c examples/basics/11_formatconvert/etl.cfg -a foo=bar
2021-11-29 14:49:25,134 util INFO Found lxml.etree, native XML parsing, fabulous!
2021-11-29 14:49:25,188 util INFO Found GDAL/OGR Python bindings, super!!
2021-11-29 14:49:25,190 main INFO Stetl version = 2.1.dev0
2021-11-29 14:49:25,191 ETL INFO INIT - Stetl version is 2.1.dev0
2021-11-29 14:49:25,191 ETL INFO Config/working dir = /home/bas/git/nlextract/nlextract/externals/stetl/examples/basics/11_formatconvert
2021-11-29 14:49:25,191 ETL INFO Reading config_file = examples/basics/11_formatconvert/etl.cfg
2021-11-29 14:49:25,191 ETL INFO Substituting 0 args in config file from args_dict: []
2021-11-29 14:49:25,191 ETL ERROR Error substituting config arguments: err="\n 'root_tag'"
Traceback (most recent call last):
File "/home/bas/git/nlextract/nlextract/externals/stetl/bin/stetl", line 43, in <module>
main()
File "/home/bas/git/nlextract/nlextract/externals/stetl/bin/stetl", line 35, in main
etl = ETL(vars(args), args.config_args)
File "/home/bas/git/nlextract/nlextract/externals/stetl/stetl/etl.py", line 97, in __init__
raise e
File "/home/bas/git/nlextract/nlextract/externals/stetl/stetl/etl.py", line 91, in __init__
config_str = config_str.format(**args_dict)
KeyError: "\n 'root_tag'"
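The failure can be reproduced outside Stetl. The snippet below is a minimal sketch, assuming only what the traceback shows, namely that the config text is passed through `str.format(**args_dict)`. The names `config_str` and `args_dict` are taken from the traceback; the config text is a shortened stand-in for the example file.

```python
# Shortened stand-in for the config text: a dictionary value with literal braces.
config_str = "converter_args = {\n 'root_tag': 'FeatureCollection'\n}\n"
args_dict = {'foo': 'bar'}

try:
    # str.format treats "{" as the start of a replacement field and reads
    # everything up to the first ":" as the field name to look up.
    config_str.format(**args_dict)
    error = None
except KeyError as e:
    error = e

# The missing key is the text between "{" and ":" in the dictionary value.
print(repr(error))
```

The field name that `str.format` looks up is the text between the opening brace and the first colon, which is why the reported key starts with a newline.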
Expected Behavior
The configuration is loaded successfully, including argument substitution.
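One possible way to get this behavior is to replace only placeholders of the form `{name}` and leave every other brace untouched. The following is a sketch under that assumption, not Stetl's implementation; the function name `substitute_args` and the `{host}` placeholder are hypothetical.

```python
import re

# Matches only simple placeholders such as {host}: braces around a bare identifier.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def substitute_args(config_str, args_dict):
    """Replace {name} placeholders with values from args_dict.

    Braces that do not wrap a bare identifier (for example a dictionary
    value spanning several lines) are left as they are. Placeholders with
    no matching argument are also left unchanged in this sketch.
    """
    def replace(match):
        key = match.group(1)
        if key in args_dict:
            return str(args_dict[key])
        return match.group(0)

    return PLACEHOLDER.sub(replace, config_str)


config_str = (
    "host = {host}\n"
    "converter_args = {\n 'root_tag': 'FeatureCollection'\n}\n"
)
result = substitute_args(config_str, {'host': 'localhost'})
print(result)
```

This sketch silently keeps unknown placeholders, which may or may not be desirable. An alternative that keeps `str.format` is to double the literal braces in the config file (`{{` and `}}`), since `str.format` renders those as single braces.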
Context (please complete one or more from the following information):
- OS: Debian unstable
- Python Version: 3.9.9
- Stetl Version: 2.1.dev0
- Stetl Input/Output/Filter Component: stetl/etl.py
- Stetl Config file: examples/basics/11_formatconvert/etl.cfg
Additional context
A string2record converter was implemented:
--- a/stetl/filters/formatconverter.py
+++ b/stetl/filters/formatconverter.py
@@ -338,6 +338,29 @@ class FormatConverter(Filter):
packet.data = etree.fromstring(packet.data)
return packet
+ @staticmethod
+ def string2record(packet, converter_args=None):
+ if(
+ converter_args is not None and
+ 'value_column' in converter_args
+ ):
+ key = converter_args['value_column']
+ else:
+ key = 'value'
+
+ record = dict({key: packet.data})
+
+ if(
+ converter_args is not None and
+ 'column_data' in converter_args
+ ):
+ for key in converter_args['column_data']:
+ record[key] = converter_args['column_data'][key]
+
+ packet.data = record
+
+ return packet
+
@staticmethod
def struct2string(packet):
packet.data = packet.to_string()
@@ -406,6 +429,7 @@ FORMAT_CONVERTERS = {
},
FORMAT.string: {
FORMAT.etree_doc: FormatConverter.string2etree_doc,
+ FORMAT.record: FormatConverter.string2record,
FORMAT.xml_doc_as_string: FormatConverter.no_op
},
FORMAT.struct: {
Which requires configuration like this:
# convert string to record
[convert_string_to_record]
class = stetl.filters.formatconverter.FormatConverter
input_format = string
output_format = record
converter_args = {
'value_column': 'waarde',
'column_data': {
'sleutel': 'levering_xml',
},
}
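To illustrate what this configuration is meant to produce, here is a standalone sketch of the converter logic from the patch above. The `Packet` class is a hypothetical stand-in that models only the `data` attribute, and parsing the option value with `ast.literal_eval` is an assumption made for the example, not necessarily how Stetl reads `converter_args`.

```python
import ast


class Packet:
    """Hypothetical stand-in for a Stetl packet; only `data` is modeled."""

    def __init__(self, data):
        self.data = data


def string2record(packet, converter_args=None):
    # Same logic as the patch: wrap the string in a record under a
    # configurable key, then add any fixed column values.
    args = converter_args or {}
    record = {args.get('value_column', 'value'): packet.data}
    record.update(args.get('column_data', {}))
    packet.data = record
    return packet


# The converter_args value from the config, parsed as a Python literal (assumption).
converter_args = ast.literal_eval("""{
    'value_column': 'waarde',
    'column_data': {
        'sleutel': 'levering_xml',
    },
}""")

packet = string2record(Packet('<xml/>'), converter_args)
print(packet.data)
```

With this configuration the string payload ends up under the key `waarde`, and the fixed `sleutel` column is added to every record.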
Due to this issue, converters that require converter_args cannot be used in the NLExtract BAGv2 configuration, because it sets arguments via options/<hostname>.args.