Lexing optimizations #15132

bonzini · 2025-10-17T09:30:55Z

This cuts about 2/3 of the lexer's execution time, and a little more elsewhere. Even on a project like QEMU, which spends a lot of time in external scripts, this amounts to a 3-5% saving.

mesonbuild/mparser.py

mesonbuild/interpreterbase/operator.py

Match single-character tokens with a separate dictionary lookup. As pointed out by dcbaker, this is even faster than str.index and gives the syntax error check for free (via KeyError). It also enables splitting the special-case "if" in two parts, one for long tokens and one for short tokens, which provides a further speedup. This shaves about 2/3 of the time spent in lex(). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
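For readers skimming the thread, here is a minimal sketch of the idea described in this commit message. It is not the code from mesonbuild/mparser.py: the table contents, the `LexError` class and the `lex_single_chars` helper are all made up for illustration.

```python
# Hypothetical table mapping one character to a token id (illustrative subset).
SINGLE_CHAR_TOKENS = {
    '(': 'lparen',
    ')': 'rparen',
    '[': 'lbracket',
    ']': 'rbracket',
    ',': 'comma',
    ':': 'colon',
}


class LexError(Exception):
    """Stand-in for the parser's real syntax error type."""


def lex_single_chars(code):
    """Yield (token_id, char) pairs for input made only of short tokens."""
    for pos, ch in enumerate(code):
        if ch.isspace():
            continue
        try:
            # A single hash lookup both classifies the character and
            # detects characters that are not valid tokens.
            tid = SINGLE_CHAR_TOKENS[ch]
        except KeyError:
            raise LexError(f'unexpected character {ch!r} at offset {pos}') from None
        yield tid, ch
```

With this shape the failure path costs nothing extra on valid input, because the `KeyError` raised by the lookup doubles as the syntax error check.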


bonzini force-pushed the parser-opt branch from e2f0a49 to aec15d7 Compare October 17, 2025 09:32

bonzini added perf parser/interpreter labels Oct 17, 2025

bonzini force-pushed the parser-opt branch 4 times, most recently from bb9ca51 to 17e9e95 Compare October 17, 2025 10:38

dnicolodi reviewed Oct 17, 2025

mesonbuild/mparser.py (resolved)

bonzini force-pushed the parser-opt branch from 17e9e95 to b15a74d Compare October 17, 2025 14:01

dcbaker reviewed Oct 17, 2025

mesonbuild/mparser.py (outdated, resolved)

mesonbuild/mparser.py (outdated, resolved)

mesonbuild/mparser.py (outdated, resolved)

bonzini force-pushed the parser-opt branch from b15a74d to a70695c Compare October 17, 2025 15:05

bonzini marked this pull request as ready for review October 17, 2025 16:53

bonzini requested review from jpakkane and mensinda as code owners October 17, 2025 16:53

dcbaker requested changes Oct 17, 2025

mesonbuild/interpreterbase/operator.py (outdated, resolved)

bonzini force-pushed the parser-opt branch from a70695c to f8ff473 Compare October 20, 2025 07:03

bonzini added 5 commits October 20, 2025 09:10

mparser: lexer: check early against common tokens

0723587

Identifiers are more common than strings, check against 'id' first. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
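A rough sketch of what "check against 'id' first" can look like in a chain of token-type tests. The `classify` function and the keyword subset are hypothetical and only illustrate the ordering, not the actual lexer code.

```python
# Illustrative subset of keywords; the real lexer has its own list.
KEYWORDS = frozenset({'if', 'else', 'endif', 'foreach', 'endforeach'})


def classify(tid, text):
    """Post-process a raw (token id, text) pair produced by a regex match."""
    if tid == 'id':
        # Most frequent case, so it is tested first.
        if text in KEYWORDS:
            return text, text
        return 'id', text
    if tid == 'string':
        # Less frequent case: strip the surrounding quotes.
        return 'string', text[1:-1]
    return tid, text
```

The result is the same regardless of the order of the two tests; only the average number of comparisons per token changes.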

make ctype the same as the printed AST

e7574a5

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

mparser: use a literal for arithmetic operators

a60de3d

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

mparser: make comparison_map global uppercase

44d8ac3

bonzini force-pushed the parser-opt branch from f8ff473 to 7547527 Compare October 20, 2025 07:10

bonzini added 3 commits October 20, 2025 09:11

mparser: tweak typing of accept_any, use it for comparisons.

140c300

Tuples are inefficient for membership tests; require a container that supports hash table lookup instead, either a frozenset or a dictionary. This also allows using accept_any with COMPARISON_MAP.
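As an illustration of the typing change described here, the sketch below accepts either a frozenset or a mapping, so the same method works with a dictionary such as COMPARISON_MAP. The `TokenStream` class, its fields and the map contents are assumptions made for the example, not the real parser code.

```python
import typing as T

# Illustrative contents; the real COMPARISON_MAP may differ.
COMPARISON_MAP: T.Mapping[str, str] = {
    'equal': '==',
    'nequal': '!=',
    'lt': '<',
    'le': '<=',
    'gt': '>',
    'ge': '>=',
}


class TokenStream:
    """Hypothetical stand-in for the parser's token cursor."""

    def __init__(self, tids: T.Iterable[str]) -> None:
        self.tids = list(tids)
        self.pos = 0

    def accept_any(self, tids: T.Union[T.FrozenSet[str], T.Mapping[str, T.Any]]) -> str:
        """Consume and return the current token id if it is in tids, else ''."""
        if self.pos < len(self.tids):
            tid = self.tids[self.pos]
            # `in` is a hash lookup for both frozenset and dict keys.
            if tid in tids:
                self.pos += 1
                return tid
        return ''


stream = TokenStream(['lt', 'id'])
op = stream.accept_any(COMPARISON_MAP)  # membership test against the dict keys
symbol = COMPARISON_MAP[op] if op else None
```

Passing the dictionary directly avoids building a separate tuple of its keys just for the membership test.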

mparser: move dictionaries to toplevel

036fa05

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

interpreterbase: make ArithmeticNode and MesonOperator both use opera…

ab26644

…tor names This avoids creating a dictionary every time an arithmetic operator is evaluated. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
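To make the saving concrete, here is a hedged sketch with invented names (`LegacyOp`, `Op`, `resolve_legacy`, `resolve`). It assumes the node's operation string and the enum value are made identical; which side was renamed in the actual change is not shown in this thread.

```python
from enum import Enum


class LegacyOp(Enum):
    """Hypothetical enum whose values differ from the node's operation names."""
    PLUS = '+'
    MINUS = '-'
    TIMES = '*'


def resolve_legacy(node_op):
    # A dict literal inside the function is rebuilt on every call.
    mapping = {
        'add': LegacyOp.PLUS,
        'sub': LegacyOp.MINUS,
        'mul': LegacyOp.TIMES,
    }
    return mapping[node_op]


class Op(Enum):
    """Hypothetical enum whose values match the node's operation names."""
    PLUS = 'add'
    MINUS = 'sub'
    TIMES = 'mul'


def resolve(node_op):
    # Lookup by value: no per-call dictionary is needed.
    return Op(node_op)
```

Once both sides share the same names, the translation table disappears and each evaluation is a single enum lookup by value.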

bonzini force-pushed the parser-opt branch from 7547527 to ab26644 Compare October 20, 2025 07:11

dcbaker approved these changes Oct 20, 2025
