Skip to content

rfc: how variable is the length of headers stored in objects / make OBJ_VARATTR more variable in length? #4359

@nigoroll

Description

@nigoroll

This question re-surfaced in the context of #4344 after having written #4358, when I noticed that it is also very much relevant for the trailer support which I hope we will eventually get at some point (see #2477, #4130).

Both the vcl_backend_error {} and trailers case have one thing in common: We would ideally want to be able to first create some headers, then a body, and then potentially some more headers. For example:

# A
sub vcl_backend_error {
    set beresp.http.hdr1 = "headervalue1";
    set beresp.body = "body";
    set beresp.http.hdr2 = "headervalue2";
    set beresp.body += "morebody";
}

# B
sub vcl_backend_fetch {
    set beresp.http.hdr1 = "headervalue1";

    # might add trailers, streaming can start IFF the client supports trailers
    std.fetch(beresp.body);
    set beresp.http.hdr2 = "headervalue2";
}

A) works today, because be basically buffer beresp.body in a VSB, which #4344 is about removing. So for A), we would like to have an object at hand when write the first bytes to the body, but today we only create the object when the sub returns.

B) does not work today.

For both cases, the limiting factor at this point is that we need to "pre-announce" the total OBJ_VARATTR length we are going to need as the last argument of STV_NewObject() named wsl.

With this existing interface in place, I only see the following options to support A) and B):

  1. Add some configurable headroom (e.g. set beresp.headroom = 1K) of "how much additional space do we expect to require", and, if that is taken up, the transaction might fail.

  2. Copy/replace the object to be created once we overrun the allocated space: Once vcl_backend_error or vcl_backend_fetch finish, we are going to have the complete headers on the workspace and potentially an object with a body. If we find out that we can not fit all headers, we create a new object in place of the old one and copy the body.

But both options have problems:

With 1), we will basically almost always over-allocate and waste space, because the headroom needs to be larger than any potential use. Also, this adds a knob to fiddle with, and when it's tuned wrongly, transactions will fail, which always comes as a surprise and is annoying/frustrating.

With 2), I do not see how streaming of objects with trailers is supposed to work cleanly, because we would need to take the trailers from a different object than we started streaming on.

So at this point, the only sensible option I see is to redefine the wsl argument of STV_NewObject() to be interpreted as a hint only and require storage engines to support OBJ_VARATTR of sizes greater than announced, up to an upper limit defined by the storage engine, which should in the order of typical workspace sizes, so at least some value in the order of 128KB.

In practice, storage engines could fall back to the existing OBJ_AUXATTR logic if however much space they reserved for OBJ_VARATTR tuns out to be insufficient.

This sounds very much doable at some performance impact, but storage engines can still implement specific methods to reduce that impact, for example by opportunistic over-allocation where it does not cause harm.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions