IndexError list index out of range exception when creating StringSubstitutionFilter #104

@fsteggink

I'm getting an IndexError: list index out of range exception when creating a StringSubstitutionFilter. Stack trace:

  File "/usr/src/app/etl/tasks.py", line 66, in stetl_task
    etl.run()
  File "/usr/local/lib/python3.7/dist-packages/Stetl-2.1.dev0-py3.7.egg/stetl/etl.py", line 154, in run
    chain.assemble()
  File "/usr/local/lib/python3.7/dist-packages/Stetl-2.1.dev0-py3.7.egg/stetl/chain.py", line 87, in assemble
    etl_comp = factory.create_obj(self.config_dict, etl_section_name.strip())
  File "/usr/local/lib/python3.7/dist-packages/Stetl-2.1.dev0-py3.7.egg/stetl/factory.py", line 28, in create_obj
    raise e
  File "/usr/local/lib/python3.7/dist-packages/Stetl-2.1.dev0-py3.7.egg/stetl/factory.py", line 25, in create_obj
    class_obj_inst = self.new_instance(class_obj, configdict, section)
  File "/usr/local/lib/python3.7/dist-packages/Stetl-2.1.dev0-py3.7.egg/stetl/factory.py", line 62, in new_instance
    return class_obj(configdict, section)
  File "/usr/local/lib/python3.7/dist-packages/Stetl-2.1.dev0-py3.7.egg/stetl/filters/stringfilter.py", line 62, in __init__
    self.format_args_dict = Util.string_to_dict(self.format_args, self.separator)
  File "/usr/local/lib/python3.7/dist-packages/Stetl-2.1.dev0-py3.7.egg/stetl/util.py", line 112, in string_to_dict
    x[1] = x[1].replace(space, ' ')
IndexError: list index out of range

This happens when the config file contains placeholders whose values are passed on the command line, and a value contains spaces, which are represented by tildes. Example: stetl -c blah.cfg -a myvalue=contains~space

Previously, as a workaround, I passed those values as environment variables, like export STETL_myvalue=contains~space. With that approach the error doesn't occur.

As you can see, this occurs in Stetl version 2.1-dev, but it also happened in versions before 2.0.

During debugging, I found out that after I create the ETL object (etl = ETL(vars(args_parsed), args_parsed.config_args)) and show the config_dict, the relevant section is shown like this:

[my_filter]
class = stetl.filters.stringfilter.StringSubstitutionFilter
format_args = myvalue:contains space

It is clear that the tilde is replaced by a space earlier in the process, at the creation of the ETL object.

So when this is passed to string_to_dict, dict_arr looks like this: [['myvalue', 'contains'], ['space']]. This causes the IndexError, since the second list contains only one element.
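The failure can be reproduced outside Stetl with a small stand-in. This is only a sketch: the function below is my reconstruction from the traceback line and the dict_arr shown above, not a copy of Util.string_to_dict, and the default values for separator and space are assumptions.

```python
def string_to_dict(s, separator="=", space="~"):
    """Hypothetical stand-in for Util.string_to_dict (reconstructed, not the real code)."""
    # Split on whitespace first, then split each token on the separator.
    dict_arr = [x.split(separator) for x in s.split()]
    for x in dict_arr:
        # A token without a separator yields a one-element list, so x[1] fails.
        x[1] = x[1].replace(space, " ")
    return dict(dict_arr)


# Tilde still in place: the value is a single whitespace-free token.
ok = string_to_dict("myvalue:contains~space", ":")

# Tilde already replaced by a space: the value is split into two tokens.
try:
    string_to_dict("myvalue:contains space", ":")
    failed = False
except IndexError:
    failed = True
```

With the tilde intact the result is {'myvalue': 'contains space'}; with the space already substituted the call raises IndexError, which matches the traceback.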

I haven't yet tracked down where exactly the tilde is replaced. It must happen somewhere in the handling of extra arguments passed through -a, since it does not happen when arguments are passed as environment variables with the 'STETL_' prefix.
