
Some improvements for Production deployments #71

@fschoell

Don't log 4xx responses as errors. Either log them at info level or not at all; from our perspective they aren't errors, only client mistakes.
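As a rough illustration of the idea, here is a minimal sketch of choosing the log level from the status code. The function names `log_level_for_status` and `log_response` and the logger name `api` are hypothetical and not taken from this project; the standard `logging` module is assumed only for the sake of the example.

```python
import logging

logger = logging.getLogger("api")  # hypothetical logger name


def log_level_for_status(status: int) -> int:
    """Map an HTTP status code to a logging level."""
    if status >= 500:
        return logging.ERROR  # server-side failure: something we need to fix
    if 400 <= status < 500:
        return logging.INFO   # client error: worth noting, not an error for us
    return logging.DEBUG      # everything else: only useful when debugging


def log_response(method: str, path: str, status: int) -> None:
    """Log one finished request at the level chosen above."""
    logger.log(log_level_for_status(status), "%s %s -> %d", method, path, status)
```

With this mapping, alerting on error-level log lines only fires for problems on our side.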

Don't expose internal errors to the client (but make sure they are still logged properly). They are not helpful to users and might leak internal information. If you want traceability, you could generate a random ID, return that to the user instead, and log it so we can grep the logs for a specific failed request.
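One possible shape for the random ID approach, as a sketch only: the helper `internal_error_response`, the response field names, and the logger name are all assumptions made for illustration, not the project's actual API.

```python
import logging
import uuid

logger = logging.getLogger("api")  # hypothetical logger name


def internal_error_response(exc: Exception) -> tuple[int, dict]:
    """Log the real error with a random ID and return a generic body."""
    error_id = uuid.uuid4().hex  # random, carries no internal information
    # Full details (including the traceback, if any) go to the logs only.
    logger.error("internal error id=%s: %r", error_id, exc, exc_info=exc)
    # The client sees a generic message plus the ID to quote in bug reports.
    return 500, {"error": "Internal server error", "error_id": error_id}
```

A user reporting `error_id` lets us find the matching log line with a plain `grep`.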

Count failed ClickHouse queries as a Prometheus metric, so that we can easily add alerts for database issues (for now this is probably equivalent to counting all 500 errors, but the two might diverge in the future).
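To make the failure counter concrete, here is a dependency-free sketch. In practice a Prometheus client library would most likely provide the counter; the class `FailureCounter`, the wrapper `run_query`, and the metric name `clickhouse_query_failures_total` are hypothetical.

```python
import threading


class FailureCounter:
    """Tiny stand-in for a Prometheus counter (illustration only)."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self.value = 0
        self._lock = threading.Lock()

    def inc(self) -> None:
        with self._lock:  # keep increments safe across threads
            self.value += 1

    def render(self) -> str:
        """Render the counter in the Prometheus text exposition format."""
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {self.value}\n"
        )


# Hypothetical metric name, chosen for this example.
query_failures = FailureCounter(
    "clickhouse_query_failures_total", "Number of failed ClickHouse queries"
)


def run_query(execute, sql: str):
    """Run a query via the given callable, counting failures before re-raising."""
    try:
        return execute(sql)
    except Exception:
        query_failures.inc()
        raise
```

Counting at the query wrapper rather than at the HTTP layer is what keeps this metric distinct from a generic 500 counter.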

Use a Prometheus histogram to bucket query times instead of a counter. That way we can monitor query time percentiles, which is more useful than a global average of query times.
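For reference, this is roughly what histogram bucketing does, written without any dependencies. A real deployment would presumably use the histogram type from a Prometheus client library; the class name, metric name, and bucket bounds below are made up for the example.

```python
import bisect


class QueryTimeHistogram:
    """Minimal cumulative-bucket histogram (illustration only, not thread-safe)."""

    def __init__(self, name: str, bounds: list[float]) -> None:
        self.name = name
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # last slot is the +Inf bucket
        self.total = 0.0

    def observe(self, seconds: float) -> None:
        # bisect_left finds the first bound >= seconds, matching "le" semantics.
        self.counts[bisect.bisect_left(self.bounds, seconds)] += 1
        self.total += seconds

    def cumulative(self) -> list[int]:
        """Bucket counts as Prometheus exposes them: each includes all smaller ones."""
        out, running = [], 0
        for count in self.counts:
            running += count
            out.append(running)
        return out

    def render(self) -> str:
        """Render the histogram in the Prometheus text exposition format."""
        lines = [f"# TYPE {self.name} histogram"]
        labels = [str(b) for b in self.bounds] + ["+Inf"]
        for le, count in zip(labels, self.cumulative()):
            lines.append(f'{self.name}_bucket{{le="{le}"}} {count}')
        lines.append(f"{self.name}_sum {self.total}")
        lines.append(f"{self.name}_count {sum(self.counts)}")
        return "\n".join(lines) + "\n"


# Hypothetical metric name and bucket bounds (in seconds).
query_seconds = QueryTimeHistogram("clickhouse_query_duration_seconds", [0.1, 0.5, 1.0])
```

From the cumulative buckets, Prometheus can estimate percentiles at query time (for example with `histogram_quantile`), which a single counter of total query time cannot provide.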
