From 972a289e5ddaa33a45f2a3a47ece7ea623e30814 Mon Sep 17 00:00:00 2001
From: James Rosewell
Date: Sat, 15 Oct 2022 13:29:34 +0100
Subject: [PATCH] Optimises the non-RTB data model to reduce the number of
 bytes needed when performing cryptographic operations, thus improving
 overall performance and efficiency of the solution. Additional changes
 address future proofing issues or multi-implementation complexity with the
 current design.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The changes here fully address issues #264, #194, #193, #184, #179. These changes relate to #139.

The changes to reduce the bytes needed for signing are:

- Version is a single byte, not a string or double: one byte rather than a minimum of four bytes.
- Identifier’s ID type is an enum of “rid” or “sid”. As null terminated strings these now require four bytes.
- Require base 64 representations of byte arrays to be converted to byte arrays before being used in data for signing. The length of the byte array is written as a 4 byte unsigned integer, followed by the array of bytes. Therefore a 16 byte UUID will consume 20 bytes.
- Removed the Unicode separator used to concatenate data for signing, saving two bytes.
- The source domain is written as a null terminated string, saving one byte.
- The source timestamp used when signing is a four byte unsigned integer representing the number of minutes since the epoch of 1/1/2020. See the comments added to the timestamp entity for the reasons for storing the timestamp in minutes and not seconds. This saves four bytes.
- Removed the relationship between the identifier and the preference data entities. They are signed and changed independently of one another.

Other changes are:

- Source includes a version field to support changes to the cryptographic algorithm.
- Identifier and preferences both contain a persisted field.
- Added the RID, SID (Signed in Identifier, not currently used in the MVP), and preferences to the Seed.
  This ensures the seed contains all the necessary information to support the transmission requests and responses in a single location, reducing complexity.
- Removed the type field from the identity response as we do not need to know the role of the signer in the eco-system, and in any case some signers will have multiple roles.
- Removed the “end” timestamp from the identity response public key as this is not relevant.
- Changed the name of the timestamp in the identity response public key to “created” as this is the information that is needed.
- Changed the “type” field name to “id_type” to avoid name space conflicts in languages where “type” is a reserved word.
- Added the type terms-url together with the rules associated with the HTML response provided, including the addition of a meta element to make the response machine readable and to learn the version of the Model Terms being used by the signer.

Considerations:

- This PR does not result in an operational project as the changes are breaking changes. Once reviewed, the dependent code will need to be refactored by the maintainers. This includes the simple task of renaming or removing fields (created not start), using new value types (byte instead of string), and then modifying the signing definitions logic to use byte arrays rather than strings.
- This commit/pull request does not yet apply the same direction of efficiency improvements to the transmission OpenRTB data entities. The reviewer’s opinion on these changes is sought before a further modification is submitted.

A concrete implementation of these changes can be found on the SWAN go repository - https://github.com/SWAN-community/swan-go/tree/feature/ok-refact which is currently being refactored to support a hybrid addressability framework that would include these efficiency and future proofing improvements ahead of live deployments.
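
For reviewers who want to check the byte counts, the identifier packing described above can be sketched in Python roughly as follows. This is an illustration only, not the SWAN go implementation: the field order follows the identifier example added to security-signatures.md in this patch, while big-endian byte order, UTF-8 string encoding, a UTC epoch, and every function name are assumptions that the patch does not specify.

```python
import base64
import struct
from datetime import datetime, timezone

# Epoch from timestamp.json in this patch (2020/01/01 00:00:00); UTC is an assumption.
EPOCH_2020 = datetime(2020, 1, 1, tzinfo=timezone.utc)


def minutes_since_2020(moment: datetime) -> int:
    """Whole minutes elapsed between the 2020 epoch and `moment`."""
    return int((moment - EPOCH_2020).total_seconds() // 60)


def null_terminated(text: str) -> bytes:
    """Encode a string followed by a single zero byte (UTF-8 is an assumption)."""
    return text.encode("utf-8") + b"\x00"


def identifier_signing_input(
    version: int,
    id_type: str,
    value_b64: str,
    source_version: int,
    domain: str,
    timestamp_minutes: int,
) -> bytes:
    """Hypothetical helper that concatenates the identifier fields for signing."""
    value = base64.b64decode(value_b64)
    return b"".join([
        struct.pack("B", version),             # version, 1 byte
        null_terminated(id_type),              # "rid" or "sid", 4 bytes
        struct.pack(">I", len(value)),         # value length, 4 bytes (big-endian assumed)
        value,                                 # raw identifier bytes
        struct.pack("B", source_version),      # source version, 1 byte
        null_terminated(domain),               # source domain, variable length
        struct.pack(">I", timestamp_minutes),  # minutes since 2020, 4 bytes
    ])


# A 16 byte identifier value, base 64 encoded as it would appear in the JSON.
value_b64 = base64.b64encode(
    bytes.fromhex("7435313ecaee48898ad70acd0114ae3c")
).decode("ascii")
signed_at = minutes_since_2020(datetime(2022, 1, 1, 10, 50, tzinfo=timezone.utc))
data = identifier_signing_input(1, "rid", value_b64, 1, "example.com", signed_at)
print(len(data))  # 42
```

With the hypothetical domain `example.com` the signing input is 42 bytes: 1 + 4 + (4 + 16) + 1 + 12 + 4.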
---
 mvp-spec/json-schemas/audit-log.json          |  4 ---
 .../json-schemas/get-identity-response.json   | 36 +++++++------------
 mvp-spec/json-schemas/identifier.json         | 21 ++++++-----
 mvp-spec/json-schemas/persisted.json          |  9 +++++
 mvp-spec/json-schemas/preferences.json        |  5 ++-
 mvp-spec/json-schemas/seed.json               | 12 ++++++-
 mvp-spec/json-schemas/signature.json          |  2 +-
 mvp-spec/json-schemas/source.json             |  6 +++-
 mvp-spec/json-schemas/terms-url.json          | 11 ++++++
 mvp-spec/json-schemas/timestamp.json          |  2 +-
 mvp-spec/json-schemas/version.json            |  8 ++---
 mvp-spec/security-signatures.md               | 36 +++++++++++--------
 12 files changed, 90 insertions(+), 62 deletions(-)
 create mode 100644 mvp-spec/json-schemas/persisted.json
 create mode 100644 mvp-spec/json-schemas/terms-url.json

diff --git a/mvp-spec/json-schemas/audit-log.json b/mvp-spec/json-schemas/audit-log.json
index 99aa471..d434856 100644
--- a/mvp-spec/json-schemas/audit-log.json
+++ b/mvp-spec/json-schemas/audit-log.json
@@ -9,9 +9,6 @@
     "version": {
       "$ref": "version.json"
     },
-    "data": {
-      "$ref": "ids-and-preferences.json"
-    },
     "seed": {
       "$ref": "seed.json"
     },
@@ -27,7 +24,6 @@
     }
   },
   "required": [
-    "data",
     "seed",
     "transaction_id",
     "transmissions"
diff --git a/mvp-spec/json-schemas/get-identity-response.json b/mvp-spec/json-schemas/get-identity-response.json
index 849d3a2..335f266 100644
--- a/mvp-spec/json-schemas/get-identity-response.json
+++ b/mvp-spec/json-schemas/get-identity-response.json
@@ -7,33 +7,25 @@
   "properties": {
     "name": {
       "type": "string",
-      "description": "The name of the contracting party, since the domain may not reflect the company name.\n",
-      "examples": ["Criteo"]
-    },
-    "type": {
-      "type": "string",
-      "enum": ["vendor", "operator"],
-      "description": "The type of contracting party in the PAF ecosystem"
+      "description": "The name of the signing party, since the domain may not reflect the company name.",
+      "examples": ["Criteo", "Preference Express"]
     },
     "version": {
       "$ref": "version.json",
-      "description": "The type of contracting party in the PAF ecosystem"
+      "description": "The version of the source currently in use by the signer"
     },
     "dpo_email": {
       "type": "string",
       "format": "idn-email",
-      "description": "Email address to contact the contracting party",
-      "examples": ["dpo@criteo.com"]
+      "description": "Email address to contact the signing party",
+      "examples": ["dpo@criteo.com", "dpo@preference.express"]
     },
-    "privacy_policy_url": {
-      "type": "string",
-      "format": "uri-template",
-      "description": "URL of the privacy policy of the contracting party",
-      "examples": ["https://www.criteo.com/privacy/"]
+    "terms_url": {
+      "$ref": "terms-url.json"
     },
     "keys": {
       "type": "array",
-      "description": "List of public keys the contracting party used or is using for signing data and messages",
+      "description": "List of public keys the signing party used or is using for signing data and messages",
       "items": {
         "type": "object",
         "additionalProperties": false,
@@ -45,18 +37,14 @@
               "-----BEGIN PUBLIC KEY-----\nMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEUnarwp0gUZgjb9fsYNLcNrddNKV5\nh4/WfMRMVh3HIqojt3LIsvUQig1rm9ZkcNx+IHZVhDM+hso2sXlGjF9xOQ==\n-----END PUBLIC KEY-----"
             ]
           },
-          "start": {
-            "$ref": "timestamp.json",
-            "description": "Timestamp when the contracting party started using this key for signing"
-          },
-          "end": {
+          "created": {
             "$ref": "timestamp.json",
-            "description": "Timestamp when the contracting party stopped using this key for signing"
+            "description": "Timestamp when the signing party created the key and started using this key for signing"
           }
         },
         "required": [
           "key",
-          "start"
+          "created"
         ]
       }
     }
@@ -66,7 +54,7 @@
     "type",
     "version",
     "dpo_email",
-    "privacy_policy_url",
+    "terms_url",
     "keys"
   ]
 }
diff --git a/mvp-spec/json-schemas/identifier.json b/mvp-spec/json-schemas/identifier.json
index 844e4c4..e1446de 100644
--- a/mvp-spec/json-schemas/identifier.json
+++ b/mvp-spec/json-schemas/identifier.json
@@ -1,7 +1,7 @@
 {
   "$schema": "https://json-schema.org/draft/2020-12/schema",
   "title": "Identifier",
-  "description": "A pseudonymous identifier generated for a web user",
+  "description": "A pseudonymous identifier for the Model Terms Random Id, or Signed in ID.
When signed the following bytes are concatenated.
Version (1 byte), IdType (null/zero terminated string - variable bytes), value length (unsigned 32 bit integer - 4 bytes), value (byte array), source domain (null/zero terminated string - variable bytes), source timestamp (4 bytes).
When compared to signing the JSON data structures the byte packed data is significantly more efficient.",
   "$id": "identifier",
   "type": "object",
   "additionalProperties": false,
@@ -9,22 +9,21 @@
     "version": {
       "$ref": "version.json"
     },
-    "type": {
+    "persisted": {
+      "$ref": "persisted.json"
+    },
+    "id_type": {
       "type": "string",
       "enum": [
-        "paf_browser_id"
+        "rid",
+        "sid"
       ],
-      "description": "The identifier type, identifier of type `paf_browser_id` is mandatory and is \"pivot\""
-    },
-    "persisted": {
-      "type": "boolean",
-      "defaultValue": true,
-      "description": "If set to `false`, means the identifier has not yet been persisted as a cookie.
Otherwise, means this identifier is persisted as a PAF cookie
(default value = `true` meaning if the property is omitted the identifier *is* persisted)"
+      "description": "The identifier type, either Random ID (RID), or Signed in ID (SID)"
     },
     "value": {
       "type": "string",
-      "description": "The identifier value",
-      "examples": ["7435313e-caee-4889-8ad7-0acd0114ae3c"]
+      "description": "The identifier value as a base 64 representation of a byte array",
+      "examples": ["7435313ecaee48898ad70acd0114ae3c"]
     },
     "source": {
       "$ref": "source.json",
diff --git a/mvp-spec/json-schemas/persisted.json b/mvp-spec/json-schemas/persisted.json
new file mode 100644
index 0000000..ad52354
--- /dev/null
+++ b/mvp-spec/json-schemas/persisted.json
@@ -0,0 +1,9 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "title": "persisted",
+  "description": "If set to `false`, means the data has not yet been persisted as a cookie.
Otherwise, means the data is persisted
(default value = `false`)",
+  "$id": "persisted",
+  "type": "boolean",
+  "pattern": "true|false|0|1",
+  "examples": ["1", "0", "true", "false"]
+}
diff --git a/mvp-spec/json-schemas/preferences.json b/mvp-spec/json-schemas/preferences.json
index 2c746d4..a5dd6e9 100644
--- a/mvp-spec/json-schemas/preferences.json
+++ b/mvp-spec/json-schemas/preferences.json
@@ -2,13 +2,16 @@
   "$schema": "https://json-schema.org/draft/2020-12/schema",
   "$id": "preferences",
   "title": "User preferences",
-  "description": "The current preferences of the user",
+  "description": "The current preferences of the user.
When signed the following bytes are concatenated.
Version (1 byte), Personalized flag (boolean, 1 byte), source domain (null/zero terminated string - variable bytes), source timestamp (4 bytes).
When compared to signing the JSON data structures the byte packed data is significantly more efficient.",
   "type": "object",
   "additionalProperties": false,
   "properties": {
     "version": {
       "$ref": "version.json"
     },
+    "persisted": {
+      "$ref": "persisted.json"
+    },
     "data": {
       "$ref": "preferences-data.json"
     },
diff --git a/mvp-spec/json-schemas/seed.json b/mvp-spec/json-schemas/seed.json
index 59bf8e1..98d97ba 100644
--- a/mvp-spec/json-schemas/seed.json
+++ b/mvp-spec/json-schemas/seed.json
@@ -15,6 +15,15 @@
         "$ref": "transaction_id.json"
       }
     },
+    "preferences": {
+      "$ref": "preferences.json"
+    },
+    "rid": {
+      "$ref": "identifier.json"
+    },
+    "sid": {
+      "$ref": "identifier.json"
+    },
     "publisher": {
       "$ref": "domain.json",
       "description": "The domain name of the Publisher that displays the Addressable Content",
@@ -31,6 +40,7 @@
     "version",
     "transaction_ids",
     "publisher",
-    "source"
+    "source",
+    "rid"
   ]
 }
diff --git a/mvp-spec/json-schemas/signature.json b/mvp-spec/json-schemas/signature.json
index d4f9213..ee7fc2d 100644
--- a/mvp-spec/json-schemas/signature.json
+++ b/mvp-spec/json-schemas/signature.json
@@ -1,7 +1,7 @@
 {
   "$schema": "https://json-schema.org/draft/2020-12/schema",
   "title": "Signature",
-  "description": "The base64 representation of a data signature",
+  "description": "The base64 representation of the signature byte array",
   "$id": "signature",
   "type": "string",
   "examples": ["RYGHYsBUEwMgFgOJ9aUQl7ywl4xnqdmwWIgPbaIowbXbmZAFKLa7mcBJQuWh1wEskpu57SHn2mmCF6V5+cESgw=="]
diff --git a/mvp-spec/json-schemas/source.json b/mvp-spec/json-schemas/source.json
index ea412c3..d603441 100644
--- a/mvp-spec/json-schemas/source.json
+++ b/mvp-spec/json-schemas/source.json
@@ -1,11 +1,14 @@
 {
   "$schema": "https://json-schema.org/draft/2020-12/schema",
   "title": "Source",
-  "description": "Source of data representing what contracting party created and signed the data",
+  "description": "Data associated with the entity that created and signed the data",
   "$id": "source",
   "type": "object",
   "additionalProperties": false,
   "properties": {
+    "version": {
+      "$ref": "version.json"
+    },
     "timestamp": {
       "$ref": "timestamp.json",
       "description": "Time when data was signed"
@@ -19,6 +22,7 @@
     }
   },
   "required": [
+    "version",
     "timestamp",
     "domain",
     "signature"
diff --git a/mvp-spec/json-schemas/terms-url.json b/mvp-spec/json-schemas/terms-url.json
new file mode 100644
index 0000000..d5b9321
--- /dev/null
+++ b/mvp-spec/json-schemas/terms-url.json
@@ -0,0 +1,11 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "title": "Terms URL",
+  "description": "URL to the data processing policy of the contracting party. The response may link to the wider privacy policy used by the controller or processor.
The response must be in HTML format and include a meta element with the name `af:model-terms-url` to indicate the URL of the Model Terms used. For example;

Where the URL provided in the `af:model-terms-url` element is not the main official version of the Model Terms then the canonical information must be provided in the response to indicate the main official Model Terms being referenced.
This information is critical to determining if the signer supports the same version of Model Terms as other participants.
Without this requirement it will not be possible to update the Model Terms without the entire eco-system performing the upgrade at the same time. The requirement has a further advantage in that the processing terms become machine readable aiding audit.",
+  "$id": "terms-url",
+  "type": "string",
+  "format": "uri-template",
+  "examples": ["https://the-web-site.com/somePage.html", "https://another.co.uk/news/2022/02/01/?param=value#anchorA"]
+}
+
+
\ No newline at end of file
diff --git a/mvp-spec/json-schemas/timestamp.json b/mvp-spec/json-schemas/timestamp.json
index 5603926..30dac8c 100644
--- a/mvp-spec/json-schemas/timestamp.json
+++ b/mvp-spec/json-schemas/timestamp.json
@@ -1,7 +1,7 @@
 {
   "$schema": "https://json-schema.org/draft/2020-12/schema",
   "title": "Timestamp",
-  "description": "Number of seconds since UNIX Epoch time (1970/01/01 00:00:00)",
+  "description": "Number of minutes since the Epoch time (2020/01/01 00:00:00). Minutes are used instead of seconds because a) clock differences between the diverse set of computers involved makes second level comparisons impractical; and b) a four byte positive integer can be used to represent timestamps until the year 10,185 thus saving four bytes compared to an eight byte positive integer.",
   "$id": "timestamp",
   "type": "integer",
   "minimum": 1,
diff --git a/mvp-spec/json-schemas/version.json b/mvp-spec/json-schemas/version.json
index acded54..a2b5f93 100644
--- a/mvp-spec/json-schemas/version.json
+++ b/mvp-spec/json-schemas/version.json
@@ -1,9 +1,9 @@
 {
   "$schema": "https://json-schema.org/draft/2020-12/schema",
   "title": "Version",
-  "description": "A version number made of a \"major\" and a \"minor\" version numbers.\n\nTo be detailed.",
+  "description": "A version number held in a single byte with a value between 1 and 255.
When more than 255 versions are needed, the value 255 will indicate that the version is held in a two byte positive integer.",
   "$id": "version",
-  "type": "string",
-  "pattern": "^[0-9]+\\.[0-9]+$",
-  "examples": ["0.1", "0.407", "10.0"]
+  "type": "integer",
+  "pattern": "^[0-9]+$",
+  "examples": ["1"]
 }
diff --git a/mvp-spec/security-signatures.md b/mvp-spec/security-signatures.md
index 0861a9a..6d2aa2c 100644
--- a/mvp-spec/security-signatures.md
+++ b/mvp-spec/security-signatures.md
@@ -101,20 +101,29 @@ All "signers" have a pair of **private** and a **public** Elliptic Curve Cryptog
 A "signer" needs to calculate the signature to associate with an object (cookie or message).

 1. the signer computes the _signature input_ for the object to sign
-    1. usually, different properties from the object are "joined together" with the special separator character `\u2063`
-    2. but **each type of object has its own rule to calculate the signature input**. Refer to the [model documentation](./model) for details on these rules.
+    1. usually, different properties from the object are "joined together" to form a single byte array
+    2. usually, the first component of the data structure is the version of the structure to support serialization
+    3. but **each type of object has its own rule to calculate the signature input**. Refer to the [model documentation](./model) for details on these rules.
 Example:

 ```
-transmission_result.source.domain + '\u2063' +
-transmission_result.source.timestamp + '\u2063' +
-seed.source.signature + '\u2063' +
-source.domain + '\u2063' +
-source.timestamp + '\u2063' +
-transmission_response.receiver + '\u2063' +
-transmission_response.status + '\u2063' +
-transmission_response.details
+identifier.version (byte) +
+identifier.id_type (null terminated string) +
+identifier.value (four byte unsigned integer for length, then the bytes) +
+identifier.source.version (one byte) +
+identifier.source.domain (null terminated string) +
+identifier.source.timestamp (4 byte unsigned integer)
+```
+
+or
+
+```
+preferences.version (byte) +
+preferences.data.use_browsing_for_personalization (boolean, one byte) +
+preferences.source.version (one byte) +
+preferences.source.domain (null terminated string) +
+preferences.source.timestamp (4 byte unsigned integer)
 ```

 3. the signer "hashes" this signature input with `RSA-SHA256`
@@ -184,17 +193,16 @@ Host: operator.paf-operation-domain.io
 ```json
 {
   "dpo_email": "contact@crto-poc-1.onekey.network",
-  "privacy_policy_url": "https://crto-poc-1.onekey.network/privacy",
+  "terms_url": "https://crto-poc-1.onekey.network/privacy",
   "name": "Some OneKey operator",
   "keys": [
     {
       "key": "-----BEGIN PUBLIC KEY-----\nMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEEiZIRhGxNdfG4l6LuY2Qfjyf60R0\njmcW7W3x9wvlX4YXqJUQKR2c0lveqVDj4hwO0kTZDuNRUhgxk4irwV3fzw==\n-----END PUBLIC KEY-----\n",
-      "start": 1641034200,
-      "end": 1672488000
+      "created": 1641034200
     }
   ],
   "type": "operator",
-  "version": "0.1"
+  "version": 1
 }
 ```