Updating request queue metadata performs full table scan in SQL storage #1533

@ericvg97

Updating the request queue metadata runs the following subquery (as part of a larger UPDATE query). It performs a full table scan, which makes all my crawlers much slower and causes the tasks to time out:

SELECT count(*) FROM request_queue_records
WHERE request_queue_id = 'AO2Uinca6RHO1Mo6X'
AND is_handled IS true

The same happens with the other two subqueries:

SELECT count(*) FROM request_queue_records
WHERE request_queue_id = 'AO2Uinca6RHO1Mo6X'
AND is_handled IS false

SELECT count(*) FROM request_queue_records
WHERE request_queue_id = 'AO2Uinca6RHO1Mo6X'

This is a bottleneck because all crawler instances try to update this row at the same time, and the query itself is slow.
Besides adding an index, which would make the query much faster, is there an option to not update this metadata? Is it used for anything?
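To illustrate the index suggestion, here is a minimal sketch using Python's built-in `sqlite3` module. Several things in it are assumptions rather than facts from this issue: the table schema is a simplified stand-in with only the two filtered columns, the index name `idx_rq_records_queue_handled` is hypothetical, SQLite stands in for whatever database is actually in use, and the filter is written as `is_handled = ?` instead of `IS true` to keep it portable. It only shows how a composite index on `(request_queue_id, is_handled)` can change the query plan from a scan to an index search.

```python
import sqlite3

# Assumed, simplified schema: only the columns used by the count queries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE request_queue_records ("
    " id INTEGER PRIMARY KEY,"
    " request_queue_id TEXT NOT NULL,"
    " is_handled INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO request_queue_records (request_queue_id, is_handled) VALUES (?, ?)",
    [("AO2Uinca6RHO1Mo6X", 1)] * 3 + [("AO2Uinca6RHO1Mo6X", 0)] * 2 + [("other", 1)],
)

COUNT_HANDLED = (
    "SELECT count(*) FROM request_queue_records "
    "WHERE request_queue_id = ? AND is_handled = ?"
)
PARAMS = ("AO2Uinca6RHO1Mo6X", 1)


def query_plan(connection: sqlite3.Connection) -> str:
    """Return the 'detail' text of the first EXPLAIN QUERY PLAN row."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + COUNT_HANDLED, PARAMS).fetchall()
    return rows[0][3]


# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn)

# Hypothetical composite index covering both filtered columns.
conn.execute(
    "CREATE INDEX idx_rq_records_queue_handled "
    "ON request_queue_records (request_queue_id, is_handled)"
)

# With the index, the planner can search the index instead of scanning.
plan_after = query_plan(conn)
handled_count = conn.execute(COUNT_HANDLED, PARAMS).fetchone()[0]

print(plan_before)
print(plan_after)
print(handled_count)
```

The same index would also serve the `is_handled IS false` count and, through its leading column, the unfiltered per-queue count. Other databases may plan these queries differently, so checking with the database's own `EXPLAIN` output is still advisable.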

Labels: t-tooling (issues with this label are in the ownership of the tooling team)
