
Commit 25c25c3

Schmidt authored and Cole-Greer committed

proposal: eager vs. lazy execution in TP4

1 parent 12aa078 commit 25c25c3

File tree

2 files changed: +194 −3 lines changed

docs/src/dev/future/index.asciidoc

Lines changed: 5 additions & 3 deletions
@@ -160,9 +160,11 @@ story.
 [width="100%",cols="3,10,2,^1",options="header"]
 |=========================================================
 |Proposal |Description |Targets |Resolved
-|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-equality-1.asciidoc[Proposal 1] |Equality, Equivalence, Comparability and Orderability Semantics - Documents existing Gremlin semantics along with clarifications for ambiguous behaviors and recommendations for consistency. |3.6.0 |N
-|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-arrow-flight-2[Proposal 2] |Gremlin Arrow Flight. |4.0.0 |N
-|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-3-remove-closures[Proposal 3] |Removing the Need for Closures/Lambda in Gremlin |3.7.0 |N
+|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-equality-1.asciidoc[Proposal 1] |Equality, Equivalence, Comparability and Orderability Semantics - Documents existing Gremlin semantics along with clarifications for ambiguous behaviors and recommendations for consistency. |3.6.0 |Y
+|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-arrow-flight-2[Proposal 2] |Gremlin Arrow Flight. |Future |N
+|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-3-remove-closures[Proposal 3] |Removing the Need for Closures/Lambda in Gremlin |3.7.0 |Y
+|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-transaction-4[Proposal 4] |TinkerGraph Transaction Support |3.7.0 |Y
+|link:https://github.com/apache/tinkerpop/blob/master/docs/src/dev/future/proposal-scoping-5[Proposal 5] |Lazy vs. Eager Evaluation |3.8.0 |N
 |=========================================================

 = Appendix
Lines changed: 189 additions & 0 deletions
@@ -0,0 +1,189 @@
////
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
////

image::apache-tinkerpop-logo.png[width=500,link="https://tinkerpop.apache.org"]

*x.y.z - Proposal 5*

== Lazy vs. Eager Evaluation in TP4 ==

=== Introduction ===

Gremlin comes with conventions and mechanisms to control the flow strategy for traversal processing: _lazy evaluation_ is conceptually a depth-first evaluation paradigm that follows as a natural result from the pull-based stacked iterator model (as implemented in the Apache TinkerPop OLTP engine), whereas _eager evaluation_ requires a Gremlin step to process all its incoming traversers before passing any results to the subsequent step.

In many cases, switching between a lazy vs. eager flow strategy merely affects the internal order in which the engine processes traversers, and there is no observable difference for end users in the final query result. However, there exist quite a few common use cases where lazy vs. eager evaluation may cause observable differences in the query results. These scenarios include (1) queries with side effects — where side effect variables are written and read, and the order in which these variables are updated and accessed changes the observed values in these variables, (2) cases where queries aim to visit and return results in a given order — particularly queries with `limit()` steps to achieve top-k behavior, and (3) certain classes of update queries where the order in which updates are applied affects the final state of the database.
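
As a sketch of how scenarios (1) and (2) can interact, consider combining a side effect with a `limit()` step over the modern graph (this example is hypothetical and the outputs shown are indicative only; the exact results depend on the flow strategy an engine chooses):

[code]
----
# Under lazy evaluation, limit(2) stops pulling traversers early, so the side effect 'x'
# only reflects the solutions seen so far (possibly two, possibly a few more due to read-ahead):
gremlin> g.V().hasLabel('person').groupCount('x').limit(2).cap('x')
==>[v[1]:1,v[2]:1]

# Under eager evaluation, all four person vertices are counted into 'x' before limit(2)
# takes effect:
gremlin> g.V().hasLabel('person').groupCount('x').limit(2).cap('x')
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
----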

To illustrate the difference between lazy and eager evaluation, consider the following simple query over the modern graph:

[code]
----
gremlin> g.V().hasLabel('person').groupCount('x').select('x')
----

If a lazy flow strategy is used, the observed `x` values are reported incrementally in the output:

[code]
----
==>[v[1]:1]
==>[v[1]:1,v[2]:1]
==>[v[1]:1,v[2]:1,v[4]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
----

In contrast, an eager evaluation strategy would store the complete set of solutions in the side effect variable `x` before proceeding, in which case the output would change to the following:

[code]
----
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
----

While there are select Gremlin steps that provide explicit control over lazy vs. eager flow — for instance, choosing `Scope.global` over `Scope.local` in https://tinkerpop.apache.org/docs/current/reference/#aggregate-step[the side effect version of the aggregate step] allows users to enforce eager evaluation — the https://tinkerpop.apache.org/gremlin.html[Apache TinkerPop documentation] is rather vague when it comes to providing guarantees regarding the flow strategy that is used in the general case (the https://tinkerpop.apache.org/docs/current/dev/provider/#gremlin-semantics[Gremlin Semantics] section currently does not discuss this distinction). On the other hand, the Apache TinkerPop Gremlin OLTP processor, as the de facto reference implementation, leverages a pull-based execution engine that typically (though not always) results in lazy evaluation semantics -- yet it is not clear whether the Gremlin language as such aims to impose a _strong guarantee_ that queries have to be evaluated lazily or whether the observed lazy evaluation in the TinkerPop OLTP processor is just an implementation artifact. From our perspective, it is important for Gremlin users — who often seek to run queries and workloads across different engines and appreciate the freedom to switch implementations — to have a concise answer on the design intent and explicit guarantees regarding what Gremlin implementations do vs. do not have to provide in order to be considered compliant with the language spec.
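
For illustration, the `Scope`-based control mentioned above can be sketched as follows (hypothetical console session over the modern toy graph; the exact outputs, particularly for the lazy variant, are indicative only and may vary with internal batching):

[code]
----
# Global scope: the side effect 'x' is populated eagerly, so all person vertices are
# aggregated before limit(1) takes effect:
gremlin> g.V().hasLabel('person').aggregate(global, 'x').limit(1).cap('x')
==>[v[1],v[2],v[4],v[6]]

# Local scope: 'x' is populated lazily as traversers stream through, so only the traversers
# that were actually pulled before limit(1) cut the stream off end up being aggregated:
gremlin> g.V().hasLabel('person').aggregate(local, 'x').limit(1).cap('x')
==>[v[1],v[2]]
----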

In fact, when looking at the specific question of lazy vs. eager flow guarantees, different Gremlin processors today come with different “degrees of compatibility” with the lazy execution behavior observed in the TinkerPop OLTP processor. The key reason for deviating from a rigorous lazy execution paradigm is usually performance: the problem with lazy evaluation is that it prescribes a serial execution order, which in many cases complicates (or even prevents) common optimization techniques such as bulking, vectored execution, and parallelization. As a matter of fact, even the TinkerPop Gremlin OLAP graph processor breaks with the lazy evaluation paradigm that is implemented in the traditional OLTP processor in order to achieve efficient parallel execution.

=== A unified control mechanism for lazy vs. eager evaluation ===

In this proposal we argue that guarantees for and control over lazy vs. eager evaluation order should be a well-defined aspect of the Gremlin language, one that strikes the right balance between (a) imposing a minimal set of constraints by default, so as to leave implementers the freedom to apply optimizations in the general case (and to account for the variety of approaches that Gremlin engines implement today), while (b) giving Gremlin users the means to specify and constrain flow control whenever they depend on it. With these goals in mind, our proposal is as follows.

===== Proposal 1: By default, the Gremlin semantics shall NOT prescribe lazy vs. eager evaluation order =====
Of course, this does not prevent implementations from opting into a specific evaluation order (the Apache TinkerPop OLTP processor, for instance, would likely continue to implement a lazy evaluation paradigm and hence may provide more specific guarantees than what is prescribed by Gremlin as a query language). Concretely, the required changes for TP4 in this regard would be to update the documentation to be explicit about the fact that Gremlin as a language prescribes neither eager nor lazy evaluation order in the general case, and to review (and, where necessary, relax) existing test cases that are overly constraining in enforcing lazy evaluation.

[code]
----
Example:
========

# In the absence of lazy vs. eager evaluation guarantees as proposed above, the
# sample query from the Introduction may return different results, depending on
# the control flow strategy chosen by a specific Gremlin processor:
gremlin> g.V().hasLabel('person').groupCount('x').select('x')

## Sample result 1:
# An implementation that internally implements a lazy execution approach may
# choose to execute traversers sequentially and return the following result
# (this is the result returned by the TinkerPop OLTP processor today):
==>[v[1]:1]
==>[v[1]:1,v[2]:1]
==>[v[1]:1,v[2]:1,v[4]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]

## Sample result 2:
# An implementation that internally implements an eager execution approach may
# choose to batch process results and would return the following result:
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]

## Sample result 3:
# Implementations are also free to do vectored processing, e.g. implement "partial
# batching" of the results, in which case the following result might be observed:
==>[v[1]:1,v[2]:1]
==>[v[1]:1,v[2]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
----

===== Proposal 2: The recipe to achieve lazy evaluation is to wrap the relevant part of the query into a local() step =====
With the exception of cases where bulk optimization affects the semantics of `local()` evaluation (which we discuss further below in a separate section), this already works today and could be documented as a __general pattern__ to enforce lazy evaluation for certain parts of the query.

[code]
----
Example:
========

# By wrapping the groupCount() and select() into a local() step, users can enforce lazy
# execution behavior:
gremlin> g.V().hasLabel('person').local(groupCount('x').select('x'))

# The observed result will be guaranteed "incremental", i.e. the local() wrapping
# of the subquery groupCount('x').select('x') now provides a guarantee that the subquery
# is evaluated lazily, one solution at a time:
==>[v[1]:1]
==>[v[1]:1,v[2]:1]
==>[v[1]:1,v[2]:1,v[4]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
----

===== Proposal 3: Vice versa, as a generic mechanism to enforce eager evaluation, it is possible to use an explicit barrier() step =====
Again, this already works in Gremlin today and could just be documented as a _general pattern_ to achieve eager evaluation for subqueries.

[code]
----
Example:
========

# When using an explicit barrier step, our sample query will be guaranteed to switch to
# eager evaluation and group-count all the results before proceeding on to result selection:
gremlin> g.V().hasLabel('person').groupCount('x').barrier().select('x')
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[4]:1,v[6]:1]
----


=== On the interaction between local() and bulking ===
As called out in the initial PR feedback, there are situations today in the Apache TinkerPop implementation where bulked traversers can affect the behavior of `local()` execution. More precisely, in the presence of bulked traversers, `local()` execution today results in _per-bulk_ execution (rather than strict _per-traverser_ execution). Hence, `local()` queries may return different results depending on whether bulk traverser optimizations are enabled or disabled, as illustrated by the following example (credit to Cole Greer, copied over from the PR):

[code]
----
With LazyBarrierStrategy disabled (to avoid hidden barrier() steps), the following example works as expected with a lazy evaluation:

gremlin> g.withoutStrategies(LazyBarrierStrategy).V().both().hasLabel('person').local(groupCount('x').select('x'))
==>[v[2]:1]
==>[v[2]:1,v[4]:1]
==>[v[1]:1,v[2]:1,v[4]:1]
==>[v[1]:2,v[2]:1,v[4]:1]
==>[v[1]:2,v[2]:1,v[4]:2]
==>[v[1]:2,v[2]:1,v[4]:2,v[6]:1]
==>[v[1]:3,v[2]:1,v[4]:2,v[6]:1]
==>[v[1]:3,v[2]:1,v[4]:3,v[6]:1]

However, if a barrier is injected prior to the local() step, the result is a mix of lazy and eager evaluation:

gremlin> g.withoutStrategies(LazyBarrierStrategy).V().both().hasLabel('person').barrier().local(groupCount('x').select('x'))
==>[v[2]:1]
==>[v[2]:1,v[4]:3]
==>[v[2]:1,v[4]:3]
==>[v[2]:1,v[4]:3]
==>[v[1]:3,v[2]:1,v[4]:3]
==>[v[1]:3,v[2]:1,v[4]:3]
==>[v[1]:3,v[2]:1,v[4]:3]
==>[v[1]:3,v[2]:1,v[4]:3,v[6]:1]
----

Strategies and optimizations in general, and the bulk traversal optimization specifically, must never affect query execution semantics. In the context of this proposal, which seeks to establish concrete guarantees for the behavior of `local()`, this implies that (from a logical perspective) the `local()` step evaluation must be performed _as if there were no bulk traversers_. From a technical point of view, one possible way to achieve this behavior would be to convert bulk traversers into regular, non-bulked traversers prior to starting the `local()` execution; a more advanced implementation (seeking to retain bulk traversers as an important optimization mechanism where possible) might inspect the steps used inside `local()` and reason about whether execution over bulked vs. non-bulked traversers could possibly differ, so as to apply such a conversion conditionally (i.e., only in cases where the result might differ). A detailed discussion of the technical approach that will be taken to ensure consistency of bulked vs. regular traversers w.r.t. the `local()` semantics defined in this proposal is beyond the scope of this document.
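
As a rough sketch of the simpler option, the following hypothetical provider-side snippet splits each bulked traverser into single-bulk copies before the `local()` child traversal runs. The method name and its integration point are invented for illustration; only `Traverser.Admin#bulk()`, `#setBulk(long)` and `#split()` are existing TinkerPop APIs.

[code]
----
import org.apache.tinkerpop.gremlin.process.traversal.Traverser

// Hypothetical sketch: convert a bulk-k traverser into k bulk-1 traversers so that the
// local() child traversal processes one solution at a time.
def unbulk(Traverser.Admin traverser) {
    def singles = []
    for (long i = 0; i < traverser.bulk(); i++) {
        def single = traverser.split()   // full copy of the traverser state
        single.setBulk(1L)               // reset the copy to a bulk of one
        singles << single
    }
    return singles
}
----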

=== Proposed further simplifications ===

The previous section does not suggest any semantic changes compared to the way the Gremlin language is implemented in TinkerPop today — it only proposes improving the documentation to clarify the guarantees that Gremlin as a language does vs. does not provide (which helps to set boundaries around the “degree of freedom” that implementers have when it comes to flow strategy) and highlights already-existing mechanisms in the language to _explicitly_ control lazy vs. eager control flow. Complementing this, in this section we propose small simplifications to Gremlin as a language, with the goal of eliminating redundant mechanisms for controlling lazy vs. eager evaluation behavior and of streamlining and aligning the behavior of existing TinkerPop steps.

===== Proposal 4: Alignment of side effect steps w.r.t. lazy vs. eager evaluation =====
Today, Gremlin uses the `Scope` keyword with two different “meanings”:

1. For `aggregate('x')`, the `Scope` argument defines https://tinkerpop.apache.org/docs/current/reference/#aggregate-step[lazy vs. eager evaluation semantics], where a global scope enforces eager semantics (no object continues until all previous objects have been fully seen), providing a guarantee that each subsequent inspection of the side effect variable `x` contains the complete list of all values stored, whereas the `Scope.local` variant does not provide such a guarantee.
2. Various steps like `dedup()`, `order()`, `sample()`, and `count()` (as well as string steps such as `toLower()` and `toUpper()`) accept the same `Scope` enum as an argument to control whether the step is applied across traversers or relative to each value in the traverser. As an example, `count(Scope.global)` counts the traversers, whereas `count(Scope.local)` expects a collection-typed input and counts, for each traverser, the number of elements in the collection. Both usages are contrasted in the sketch below.
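
The following hypothetical console session over the modern toy graph contrasts the two usages (outputs are indicative only):

[code]
----
# Usage 1 - Scope as a flow control switch on a side effect step (see also the aggregate()
# example earlier in this proposal); global scope populates 'x' eagerly:
gremlin> g.V().hasLabel('person').aggregate(global, 'x').limit(1).cap('x')
==>[v[1],v[2],v[4],v[6]]

# Usage 2 - Scope as a "per traverser value" vs. "across traversers" switch:
# count(global) counts the incoming traversers, count(local) counts the elements of the
# collection carried by each traverser:
gremlin> g.V().hasLabel('person').values('name').fold().count(global)
==>1
gremlin> g.V().hasLabel('person').values('name').fold().count(local)
==>4
----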

From a conceptual perspective, these are two different use cases: case 1 affects the flow strategy, whereas case 2 specifies that a step applies "per element" rather than "across traversers". Given that `aggregate('x')` is currently the only side effect step that takes an explicit `Scope` argument, and that the previous section identified alternative, already existing mechanisms in the language for flow control, we propose to fix this inconsistency and remove the `Scope` parameter from `aggregate('x')`. This would (a) align the structure and behavior of all side effect steps (none of them would carry an argument to enforce the scope) and (b) leave the `Scope` enum reserved for the “traverser-local” application pattern discussed in case 2, eliminating confusion around the different contexts in which the `Scope` parameter is used today.

The key idea behind this change is that side effect steps in TP4 would *neither* prescribe lazy evaluation (local scope) *nor* prescribe eager evaluation (global scope) — which is in line with the main theme postulated earlier in this proposal: by default, Gremlin semantics shall not prescribe the evaluation order. Whenever flow control is required, Gremlin queries need to be explicit about it, via `local()` or `barrier()` steps, as exemplified in the previous section.
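
To make the intended end state concrete, here is a hypothetical before/after sketch; the TP4 lines are illustrative syntax only and do not propose any new step signatures beyond dropping the `Scope` argument:

[code]
----
# Today (TP3): the flow strategy for the side effect variable 'x' is selected via the Scope
# argument of aggregate():
gremlin> g.V().hasLabel('person').aggregate(global, 'x').limit(1).cap('x')
gremlin> g.V().hasLabel('person').aggregate(local, 'x').limit(1).cap('x')

# Proposed (TP4, illustrative only): aggregate() carries no Scope argument and, by default,
# neither lazy nor eager evaluation is guaranteed. Where the flow matters, it is requested
# explicitly -- with barrier() for eager evaluation:
gremlin> g.V().hasLabel('person').aggregate('x').barrier().limit(1).cap('x')

# ...or by wrapping the relevant steps into local() for lazy evaluation:
gremlin> g.V().hasLabel('person').local(aggregate('x')).limit(1).cap('x')
----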
