AssertionError: unknown status keyword 'dsgvo_service_control' in marked section #468

@snarfed

Hi! First off, huge thanks for maintaining feedparser. It's legendary! We're all lucky to have it.

I hit a new (to me) AssertionError today when parsing the RSS feed at https://snrk.de/feed/. Here's the relevant RSS snippet:

<content:encoded><![CDATA[
  ...
  <p><strong>If you don&#8217;t like that, don&#8217;t use snrk.de!</strong><![dsgvo_service_control]></p>
  ...
]]></content:encoded>

...and here's the traceback:

>>> feedparser.parse(rss)
Traceback (most recent call last):
  File ".../site-packages/feedparser/api.py", line 263, in parse
    saxparser.parse(source)
  File ".../python3.11/xml/sax/expatreader.py", line 111, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File ".../python3.11/xml/sax/xmlreader.py", line 125, in parse
    self.feed(buffer)
  File ".../python3.11/xml/sax/expatreader.py", line 217, in feed
    self._parser.Parse(data, isFinal)
  File "/private/tmp/pythonA3.11-20240402-4978-3ygh5v/Python-3.11.9/Modules/pyexpat.c", line 477, in EndElement
  File ".../python3.11/xml/sax/expatreader.py", line 395, in end_element_ns
    self._cont_handler.endElementNS(pair, None)
  File ".../site-packages/feedparser/parsers/strict.py", line 124, in endElementNS
    self.unknown_endtag(localname)
  File ".../site-packages/feedparser/mixin.py", line 321, in unknown_endtag
    method()
  File ".../site-packages/feedparser/namespaces/_base.py", line 488, in _end_content
    value = self.pop_content('content')
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../site-packages/feedparser/mixin.py", line 629, in pop_content
    value = self.pop(tag)
            ^^^^^^^^^^^^^
  File ".../site-packages/feedparser/mixin.py", line 548, in pop
    output = _sanitize_html(output, self.encoding, self.contentparams.get('type', 'text/html'))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../site-packages/feedparser/sanitizer.py", line 883, in _sanitize_html
    p.feed(html_source)
  File ".../site-packages/feedparser/html.py", line 156, in feed
    super(_BaseHTMLProcessor, self).feed(data)
  File ".../site-packages/sgmllib.py", line 98, in feed
    self.goahead(0)
  File ".../site-packages/sgmllib.py", line 168, in goahead
    k = self.parse_declaration(i)
        ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../site-packages/feedparser/html.py", line 351, in parse_declaration
    return sgmllib.SGMLParser.parse_declaration(self, i)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../python3.11/_markupbase.py", line 91, in parse_declaration
    return self.parse_marked_section(i)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../python3.11/_markupbase.py", line 154, in parse_marked_section
    raise AssertionError(
AssertionError: unknown status keyword 'dsgvo_service_control' in marked section

Is this expected? Should I catch AssertionError everywhere I use feedparser? Any other thoughts?
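
For reference, the "catch AssertionError everywhere" option could look roughly like the sketch below. It is only an illustration: `parse_feed_safely` is a hypothetical helper, not part of feedparser, and it takes the parse function as an argument so the snippet runs without feedparser installed.

```python
from typing import Any, Callable, Optional


def parse_feed_safely(parse: Callable[[Any], Any], source: Any) -> Optional[Any]:
    """Call parse(source) and return its result, or None on AssertionError.

    Hypothetical helper for illustration; not part of feedparser's API.
    """
    try:
        return parse(source)
    except AssertionError as exc:
        # The traceback above shows _markupbase.parse_marked_section raising
        # AssertionError for the unknown keyword 'dsgvo_service_control'.
        print(f"feed could not be parsed: {exc}")
        return None
```

With feedparser installed, the call would presumably be `parse_feed_safely(feedparser.parse, rss)`. The downside is that a single bad entry discards the whole feed, which is why catching the exception in every caller feels like a workaround rather than a fix.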

Environment: feedparser 6.0.11, Python 3.11.9. This may be related to #378, but it's not exactly the same. Thanks in advance!
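
Another possible stopgap, sketched here as an assumption rather than a confirmed fix, is to strip unrecognized `<![keyword]>` markers from the raw feed text before handing it to the parser. `strip_unknown_markers` is a hypothetical name, and the keyword set is what Python 3.11's `_markupbase.parse_marked_section` appears to accept; other versions may differ.

```python
import re

# Keywords assumed to be accepted by _markupbase.parse_marked_section
# (Python 3.11); anything else triggers the AssertionError shown above.
_KNOWN_KEYWORDS = {"temp", "cdata", "ignore", "include", "rcdata",
                   "if", "else", "endif"}

# Matches a bare marker such as <![dsgvo_service_control]>.
# It does not match <![CDATA[ ... ]]>, because "CDATA" is followed by "[".
_MARKER = re.compile(r"<!\[\s*([A-Za-z][-\w.]*)\s*\]>")


def strip_unknown_markers(text: str) -> str:
    """Remove <![keyword]> markers whose keyword is not in the known set."""
    def _replace(match: re.Match) -> str:
        keyword = match.group(1).lower()
        return match.group(0) if keyword in _KNOWN_KEYWORDS else ""

    return _MARKER.sub(_replace, text)
```

Applied to the snippet above, this would drop `<![dsgvo_service_control]>` and leave the surrounding `<p>` and the CDATA wrapper untouched. It changes the feed content, so it is only suitable when losing such markers is acceptable.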
