XMPP → Telegram quote processing #153

@ForNeVeR

XMPP users are used to quoting the messages they reply to.

Unfortunately, nobody I know seems to use XEP-0201 for that purpose, so the quotes they make aren't easy to recognize properly. But still, we have to do something.

XMPP users are quite flexible in the ways they quote messages. They make two types of quotations:

  1. Full quotation: when an XMPP user quotes a Telegram message completely (prefixing every quoted line with a special mark, such as >> or »), and adds their response after the quoted message. For example:
    <xmpp_user> >> @telegram_user
    >> full text
    >> of a (possibly multiline)
    >> message
    
    I second that!
    
  2. Partial quotation: when an XMPP user quotes parts of another user's message, optionally inserting their reply after or before each part.

Due to the nature of Telegram replies (a reply points at one whole message), I propose we only recognize full quotes and not partial ones. However, we should properly recognize quotes of XMPP users' messages as well as Telegram ones (e.g. when an XMPP user quotes another XMPP user).
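
To make the full/partial distinction concrete, here is a minimal sketch of how a message body could be classified. The function names and the exact set of quote marks are assumptions for illustration (the issue only names >> and »; a single > is added as a guess), and treating a quote with no reply text as "full" is also an assumption.

```python
# Assumed set of quote marks; only ">>" and "»" are named in the proposal.
QUOTE_MARKS = (">>", "»", ">")


def is_quoted(line: str) -> bool:
    """True if the line starts with one of the assumed quote marks."""
    return line.lstrip().startswith(QUOTE_MARKS)


def classify(message: str) -> str:
    """Return "none", "full" or "partial" for an XMPP message body."""
    # Blank lines carry no information about quoting, so skip them.
    lines = [line for line in message.splitlines() if line.strip()]
    flags = [is_quoted(line) for line in lines]
    if not any(flags):
        return "none"
    # Full quotation: a single run of quoted lines at the very start,
    # followed only by unquoted reply lines (or by nothing at all).
    first_unquoted = flags.index(False) if False in flags else len(flags)
    if flags[0] and not any(flags[first_unquoted:]):
        return "full"
    # Anything else (reply first, or quotes interleaved with replies).
    return "partial"
```

With this sketch, the full quotation example above would be classified as "full", while a message that alternates quoted fragments and replies would be "partial" and, under the proposal, passed through unchanged.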

Special care (but not too much, because the case is a rare one) should be taken when an XMPP user quotes a message that already contains a quote: we should avoid (if possible) misattributing it as an answer to the message quoted at the second level. For example:

<xmpp user 1> my text
<xmpp user 2> >> my text
I second that!
<xmpp user 3> >> <xmpp user 1>
>> >> my text
>> I second that!
I triplicate

Here, a message from xmpp user 3 shouldn't be misattributed as a reply to a message from xmpp user 1.
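
One way to get this behavior, sketched under assumptions: remove exactly one level of quote marks from the leading quoted lines, so that the inner `>>` stays part of the text being searched for. The names `split_quote`, `_ONE_LEVEL` and `_ATTRIBUTION` are hypothetical, and the attribution pattern (an `@name` or `<name>` alone on the first quoted line) is a guess based on the examples in this issue.

```python
import re

# One quote mark (">>", "»" or ">") at line start, plus one optional space.
_ONE_LEVEL = re.compile(r"^\s*(?:>>|»|>)\s?")
# Assumed attribution forms: "@telegram_user" or "<xmpp user 1>" on their own line.
_ATTRIBUTION = re.compile(r"^(?:@\S+|<[^>]+>):?\s*$")


def split_quote(message: str):
    """Split a message into (author hint, quoted text, reply text).

    Only ONE quotation level is stripped, so nested quotes stay in the
    quoted text and can only match a message that contained them.
    """
    lines = message.splitlines()
    quoted = []
    i = 0
    while i < len(lines) and _ONE_LEVEL.match(lines[i]):
        quoted.append(_ONE_LEVEL.sub("", lines[i], count=1))
        i += 1
    hint = None
    if quoted and _ATTRIBUTION.match(quoted[0]):
        # The user name is kept only as a hint, never as proof of authorship.
        hint = quoted.pop(0).strip()
    reply = "\n".join(lines[i:]).strip()
    return hint, "\n".join(quoted), reply
```

For the message from xmpp user 3 above, the quoted text comes out as `>> my text` followed by `I second that!`, which is the body of the message from xmpp user 2 and not the bare `my text` from xmpp user 1.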

(I don't think we should support a sequence of "reply → partial reply" here, let the replying folks sort it out themselves)

So, I propose the following.

  1. This will partially obsolete our work on Parse and clean up quotes from XMPP #116, but let's first do #116 since it's easier, and then improve on that work.
  2. Let's keep all the Telegram messages (both the incoming and the outgoing ones, i.e. the bot messages) from the last 48 hours somewhere (either in a database or in memory).
  3. When a new XMPP message arrives, parse it and check whether it starts with a quotation. If it does, strip the user name from the quotation, if present (but keep it as a hint that the quoted author was possibly a Telegram user), and then search our database for the quoted message. The search should mostly ignore whitespace and be lenient enough for practical cases (it may be tuned later if required, so don't put too much brain into that "leniency").
  4. If a message contains a full quotation of any cached message, then mark it as that on the Telegram side, and completely strip the quotation from its actual body.
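
Steps 2 and 3 could be sketched roughly as follows, assuming an in-memory store. `MessageCache`, `CachedMessage`, `normalize` and all field names are hypothetical and not part of the existing code base; the matching here is plain equality after removing whitespace, which is only one possible reading of "mostly ignore the whitespace". Messages are assumed to be added in chronological order.

```python
import re
import time
from collections import deque
from dataclasses import dataclass

RETENTION_SECONDS = 48 * 60 * 60  # "the last 48 hours" from step 2


def normalize(text: str) -> str:
    """Remove all whitespace so that re-wrapped quotes still match."""
    return re.sub(r"\s+", "", text)


@dataclass
class CachedMessage:
    message_id: int   # identifier to reply to on the Telegram side
    author: str
    text: str
    timestamp: float  # seconds since the epoch


class MessageCache:
    def __init__(self) -> None:
        self._messages = deque()  # oldest message on the left

    def add(self, message: CachedMessage) -> None:
        self._messages.append(message)

    def _evict(self, now: float) -> None:
        # Drop everything older than the retention window.
        while self._messages and now - self._messages[0].timestamp > RETENTION_SECONDS:
            self._messages.popleft()

    def find(self, quoted_text: str, now=None):
        """Return the newest cached message matching the quote, or None."""
        now = time.time() if now is None else now
        self._evict(now)
        needle = normalize(quoted_text)
        if not needle:
            return None
        for message in reversed(self._messages):
            if normalize(message.text) == needle:
                return message
        return None
```

If `find` returns a message, step 4 would send the XMPP message to Telegram as a reply to its `message_id` with the quotation removed from the body; otherwise the message would be forwarded unchanged. The author hint from step 3 is not used in this sketch, but it could break ties between identical texts.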
