Filtering outside CPython #45

@TkTech

We used to have a little toy state machine that let us run basic queries against documents, before simdjson gained JSON Pointer support in 0.3.0. Anything that reduces the number of objects crossing the C++ -> CPython barrier will greatly improve real-world performance, so it's time to bring this back.

This is a draft for experimentation and planning on syntax.

Selection:

  • . -> Current element.
  • .[] -> Each item in an array (returns array).
  • .[1] -> A single item in an array, by index (returns scalar).
  • .[1:2] -> A slice of an array (returns array).
  • .<prop> -> A property of the current object (returns scalar).
  • <string> -> String literal, escapable with quotes and backslash.
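
To make the selection rules easier to experiment with, here is a minimal pure-Python sketch of the intended semantics over already-parsed documents. It is not the proposed C++ implementation and not part of simdjson or any existing binding: the `select` function and the tuple encoding of steps are hypothetical, and the rule that steps after `.[]` apply to every item (as in jq) is an assumption the draft does not spell out.

```python
from typing import Any, Sequence


def _apply(value: Any, step: Sequence) -> Any:
    """Apply one non-iterating selection step to a single value."""
    kind = step[0]
    if kind == "index":   # .[1]
        return value[step[1]]
    if kind == "slice":   # .[1:2]
        return value[step[1]:step[2]]
    if kind == "prop":    # .<prop>
        return value[step[1]]
    raise ValueError(f"unknown step: {kind!r}")


def select(doc: Any, steps: Sequence[Sequence]) -> Any:
    """Evaluate a pre-parsed query (hypothetical encoding) against a document.

    An empty step list corresponds to `.` (the current element).
    """
    current, mapped = doc, False
    for step in steps:
        if step[0] == "each":  # .[]
            if mapped:
                # Assumption: a nested `.[]` flattens one level.
                current = [item for sub in current for item in sub]
            else:
                current = list(current)
            mapped = True
        elif mapped:
            # Assumption: after `.[]`, later steps apply to each item.
            current = [_apply(item, step) for item in current]
        else:
            current = _apply(current, step)
    return current


doc = {"items": [{"name": "a"}, {"name": "b"}, {"name": "c"}]}
names = select(doc, [("prop", "items"), ("each",), ("prop", "name")])
```

Here `names` corresponds to the query `.items[].name`. A real implementation would parse the query string and walk the simdjson document directly, so that only the final result crosses into CPython.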

Construction:

  • {name: .name, description: .description} -> Construct an object from current object, ignoring unspecified fields.
  • [.name] -> Construct an array from current object, ignoring unspecified fields.
  • When an object key is not specified, the name of the field it accesses is used as the key: {.price < 2} is the same as {price: .price < 2}.
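
The construction rules above can be sketched in pure Python as follows. This is only an illustration of the intended behaviour under stated assumptions: `construct_object` and `construct_array` are hypothetical names, fields are passed as pre-parsed `(key, prop)` pairs rather than query text, and silently skipping a property that is missing from the current object is an assumption, since the draft does not say what should happen in that case.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple


def construct_object(
    current: Dict[str, Any],
    fields: Sequence[Tuple[Optional[str], str]],
) -> Dict[str, Any]:
    """Build a new object from `current`, keeping only the listed fields.

    Each field is (key, prop). A key of None means "use the accessed
    field name", mirroring the shorthand {.price} == {price: .price}.
    """
    out: Dict[str, Any] = {}
    for key, prop in fields:
        if prop in current:  # assumption: missing properties are skipped
            out[prop if key is None else key] = current[prop]
    return out


def construct_array(current: Dict[str, Any], props: Sequence[str]) -> List[Any]:
    """Build a new array from the listed properties of `current`."""
    return [current[prop] for prop in props if prop in current]


item = {"name": "widget", "description": "a widget", "internal_id": 7}
obj = construct_object(item, [("name", "name"), ("description", "description")])
arr = construct_array(item, ["name"])
```

`obj` corresponds to `{name: .name, description: .description}` and `arr` to `[.name]`; in both cases `internal_id` is dropped because it was never requested.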

Filtering:

  • Given document [0, 1, 2, 3, 4], query .[] <= 2 returns [0, 1, 2]
  • Given document [{"price": 1.00}, {"price": 2.00}, {"price": 3.00}], query .[].price < 2 returns [{"price": 1.00}]
  • Given document {"price": 1.00}, query .price returns 1.00
  • Multiple key filters are implicitly combined with AND. Given document [{"price": 1.00, "available": true, "unwantedfield": false}, {"price": 2.00, "available": false}, {"price": 3.00}], query .[] | {.price < 2, .available == true} returns [{"price": 1.00, "available": true}]
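
The filtering examples above can be reproduced with a small pure-Python reference sketch, which may be handy as a test oracle while the real implementation is being written. Everything here is hypothetical: the function names, the `(prop, op, value)` encoding of a condition, and the choice that an item lacking a filtered property simply fails the filter.

```python
import operator
from typing import Any, Dict, List, Optional, Sequence, Tuple

Condition = Tuple[Optional[str], str, Any]  # (prop or None, operator, value)

OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def _matches(item: Any, cond: Condition) -> bool:
    prop, op, value = cond
    if prop is None:
        target = item  # compare the item itself, as in `.[] <= 2`
    elif isinstance(item, dict) and prop in item:
        target = item[prop]
    else:
        return False  # assumption: a missing property fails the filter
    return OPS[op](target, value)


def filter_items(items: Sequence[Any], conditions: Sequence[Condition]) -> List[Any]:
    """Keep the items that satisfy every condition (implicit AND)."""
    return [item for item in items if all(_matches(item, c) for c in conditions)]


def filter_and_project(
    items: Sequence[Dict[str, Any]], conditions: Sequence[Condition]
) -> List[Dict[str, Any]]:
    """Sketch of `.[] | {.a < x, .b == y}`: filter, then keep only the
    filtered keys. Every condition must name a property here."""
    props = [prop for prop, _, _ in conditions]
    return [
        {prop: item[prop] for prop in props}
        for item in filter_items(items, conditions)
    ]


docs = [
    {"price": 1.00, "available": True, "unwantedfield": False},
    {"price": 2.00, "available": False},
    {"price": 3.00},
]
result = filter_and_project(docs, [("price", "<", 2), ("available", "==", True)])
```

With the documents from the last example, `result` contains only the first item, reduced to its `price` and `available` keys.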

Labels: enhancement
