This repository was archived by the owner on Jun 6, 2025. It is now read-only.

Add a high-level Python query library on top of gafferpy #5

@t92549

gafferpy is great for sending JSON directly to the Gaffer REST API; to do this, it creates Python objects that map one-to-one to that JSON. However, these queries can become very long and require knowledge of Gaffer's verbose JSON query language.
For a better user experience, an additional library should sit on top of gafferpy, allowing users to write more Pythonic, user-friendly queries that are translated to gafferpy queries and sent.

For example, here is a JSON query for a GetElements operation on the road traffic API. It gets the Edges connected to the Entity with vertex M32:1. These are then filtered to those whose count property is more than 1, and the group-by is removed. It also uses the gaffer.federatedstore.operation.graphIds option to ensure that the operation is only executed on the sub-graph graph1:

{
    "class": "uk.gov.gchq.gaffer.operation.impl.get.GetElements",
    "input": [
        {
            "class": "uk.gov.gchq.gaffer.operation.data.EntitySeed",
            "vertex": "M32:1"
        }
    ],
    "view": {
        "edges": {
            "RoadUse": {
                "preAggregationFilterFunctions": [
                    {
                        "selection": [
                            "count"
                        ],
                        "predicate": {
                            "class": "uk.gov.gchq.koryphe.impl.predicate.IsMoreThan",
                            "value": {
                                "java.lang.Long": 1
                            }
                        }
                    }
                ],
                "groupBy": []
            }
        }
    },
    "directedType": "EITHER",
    "options": {
        "gaffer.federatedstore.operation.graphIds": "graph1"
    }
}

This is a very long and verbose mapping to the Java API. The gafferpy code for the same query maps just as verbosely onto this JSON:

from gafferpy import gaffer as g
from gafferpy import gaffer_connector

gc = gaffer_connector.GafferConnector("http://localhost:8080/rest/latest")
op = g.GetElements(
    input=['M32:1'],
    view=g.View(
        edges=[
            g.ElementDefinition(
                group='RoadUse',
                group_by=[],
                pre_aggregation_filter_functions=[
                    g.PredicateContext(
                        selection=['count'],
                        predicate=g.IsMoreThan(
                            value=g.long(1)
                        )
                    )
                ]
            )
        ]
    ),
    directed_type=g.DirectedType.EITHER,
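    # options is a key/value mapping, matching the "options" object in the JSON above
    options={"gaffer.federatedstore.operation.graphIds": "graph1"}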
)
results = gc.execute_operation(op)

A more usable Python query library could look something like this:

from gafferpy import gaffer_query as gq
from gafferpy import gaffer_connector

gc = gaffer_connector.GafferConnector("http://localhost:8080/rest/latest")
results = gq.GetElements(using=gc, graphs="graph1") \
            .input("M32:1") \
            .view(edge="RoadUse", group_by=[], pre_agg_filter="count > 1") \
            .directed("either")

Most of this simplification could be achieved by restructuring operations so that objects like ElementDefinitions don't have to be created in such a verbose way.
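
As a rough illustration of that restructuring, here is a minimal sketch of a fluent wrapper that builds the same GetElements JSON without the caller constructing any nested objects. The class name `GetElementsQuery` and its methods are hypothetical and not part of gafferpy. The sketch emits plain dictionaries rather than gafferpy objects so that it runs on its own, and it leaves out filtering entirely.

```python
import json


class GetElementsQuery:
    """Hypothetical fluent wrapper; not part of gafferpy."""

    def __init__(self, graphs=None):
        self._json = {
            "class": "uk.gov.gchq.gaffer.operation.impl.get.GetElements",
        }
        if graphs is not None:
            # Same federated store option used in the JSON example above.
            self._json["options"] = {
                "gaffer.federatedstore.operation.graphIds": graphs
            }

    def input(self, *vertices):
        # Wrap each bare vertex in an EntitySeed so the caller doesn't have to.
        self._json["input"] = [
            {
                "class": "uk.gov.gchq.gaffer.operation.data.EntitySeed",
                "vertex": vertex,
            }
            for vertex in vertices
        ]
        return self

    def view(self, edge, group_by=None):
        # Build the element definition for a single edge group.
        definition = {}
        if group_by is not None:
            definition["groupBy"] = list(group_by)
        self._json.setdefault("view", {}).setdefault("edges", {})[edge] = definition
        return self

    def directed(self, directed_type):
        self._json["directedType"] = directed_type.upper()
        return self

    def to_json(self):
        return self._json


query = (
    GetElementsQuery(graphs="graph1")
    .input("M32:1")
    .view(edge="RoadUse", group_by=[])
    .directed("either")
)
print(json.dumps(query.to_json(), indent=4))
```

Each method returns `self`, which is what allows the chained style. A real implementation would presumably build gafferpy operations and hand them to a `GafferConnector` instead of returning raw dictionaries.
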
Simplifying the predicate, however, would require a parser that maps a string such as "count > 1" to the relevant Predicate.
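
A very small version of such a parser might look like the sketch below. It is only an illustration under stated assumptions: the function name `parse_filter` is hypothetical, it handles only `property > integer` and `property < integer`, it always types the value as `java.lang.Long` as in the JSON example, and the `IsLessThan` class name is assumed by analogy with `IsMoreThan`. It returns the JSON-shaped dictionary rather than gafferpy objects so that it is self-contained.

```python
import re

# Operator-to-predicate mapping. IsMoreThan appears in the JSON example;
# IsLessThan is an assumed counterpart in the same package.
_PREDICATE_CLASSES = {
    ">": "uk.gov.gchq.koryphe.impl.predicate.IsMoreThan",
    "<": "uk.gov.gchq.koryphe.impl.predicate.IsLessThan",
}

# Matches "<property> <operator> <integer>", e.g. "count > 1".
_FILTER_PATTERN = re.compile(r"^\s*(\w+)\s*(>|<)\s*(-?\d+)\s*$")


def parse_filter(expression):
    """Translate a simple comparison string into a filter function dict."""
    match = _FILTER_PATTERN.match(expression)
    if match is None:
        raise ValueError(f"Unsupported filter expression: {expression!r}")
    prop, operator, value = match.groups()
    return {
        "selection": [prop],
        "predicate": {
            "class": _PREDICATE_CLASSES[operator],
            # Assumption: integer literals are always sent as Java longs.
            "value": {"java.lang.Long": int(value)},
        },
    }


print(parse_filter("count > 1"))
```

A fuller parser would need to cover more operators, other value types, and compound expressions. A regular expression is only enough for the single-comparison case shown here.
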
