Skip to content

Conversation

@happy-barney
Copy link

By extracting various declarations of cursor variable(s) into dedicated non-terminal
for statement now contains only 3 branches:

  • for (;;)
  • for ()
  • for cursor ()

with LALR(1) parser cursor cannot be empty -> it will produce conflict with for (;;) form.

  • This set of changes does not require a perldelta entry.

@happy-barney happy-barney force-pushed the hpb/cleanup branch 2 times, most recently from 30c084c to 4d18cbb Compare October 23, 2025 07:16
@happy-barney happy-barney changed the title perly - simplity non-terminal bare_statement_for perly - simplify non-terminal bare_statement_for Oct 23, 2025
@happy-barney happy-barney force-pushed the hpb/cleanup branch 3 times, most recently from 43552bd to 7f8a73f Compare October 31, 2025 16:56
Comment on lines -349 to +359
PERLY_PAREN_OPEN
remember
PERLY_PAREN_OPEN

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [parser] bare_statement_for - move remember right after keyword forming sequence KW_FOR remember.

This commit message is still a bit too cryptic for me.

Non-terminal remember is empty non-terminal with action starting
new full lexical scope. Such empty non-terminals, when used in
multiple branches, will generate shift/reduce conflict when not
appearing after distinguishable sequence of terminals.

I simply can't understand most of that..
Reading the Such empty non-terminals, when used in multiple branches, will generate shift/reduce conflict makes me wonder if there is/was an issue with how it was written before and/or if there is any change to how the construct is now treated.

Change formalizes KW_FOR as a keyword starting new lexical scope.

The commit message should also what the consequence of this is. (Does it change anything? If yes then what? If no then explicitly mention it in the commit message).

Also missing a bit of the "why"; What was the real problem (if any) with the old code?

%type <opval> bare_statement_when
%type <opval> bare_statement_while
%type <opval> bare_statement_yadayada
%type <opval> clause_mexpr

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [parser] clause_mexpr - deduplicate shared pattern into non-terminal PERLY_PAREN_OPEN mexpr PERLY_PAREN_CLOSE

For this commit I would use something like:

Add `clause_mexpr` to remove duplicated code

The pattern `PERLY_PAREN_OPEN mexpr `PERLY_PAREN_CLOSE` is repeated in several places
add `clause_mexpr` to avoid this repetition.

This change should not affect any of the behaviour. 

In my opinion having: [parser] clause_mexpr and non-terminal and PERLY_PAREN_OPEN mexpr PERLY_PAREN_CLOSE in the commit title does not add any real value. (The title is too long and reading it in git log --oneline I would really need to think hard what it really means and if it might alter behavior or not)

perly.y Outdated
remember
KW_MY
my_scalar
my_scalar[cursor]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [parser] bare_statement_for - rename variable(s) product to cursor to visualize possible deduplication

To me (but I'm unfamiliar with the parser) variable(s) product is meaningless/confusing.

I also find the word cursor confusing (but again could be because I'm unfamiliar with the code and/or non-native English). I don't know what to expect when I see the word cursor..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It made me think of a cursor in a database sense, which lets you iterate over the results of a database query, just as foreach style loops iterate over a list.

perly.y Outdated
%type <opval> bare_statement_when
%type <opval> bare_statement_while
%type <opval> bare_statement_yadayada
%type <opval> clause_for_cursor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [parser] clause_for_cursor - encapsulate FOR loop cursor variables

This is already an improvement;
What I would still add in the commit message is what branches are now merged.

I.e. something like:

Merge the for-branches of:
- `for $scalar (...)`
- `for my $scalar (...)`
- `for my ($foo, $bar) (...)`
- `for \ $scalar (...)`
- `for my \ $scalar (..)`

into a single branch with the parsing of the variables handled  by `clause_for_cusror`

@happy-barney
Copy link
Author

@bram-perl thanks for your comments on commit messages. I will use it to improve my commit messages.

Based on those I also realized that with end status after this PR it will is reasonable to write dedicated tests focusing of modified part, using commit structure: "first introduce new test(s); second changes".

Branislav Zahradník added 7 commits November 9, 2025 17:49
* Motivation

Introduce single assert function which can:
- use emulated named arguments
- evaluate code to build "got" value
- detect eval success/failure
- detect expected eval success/failure

`assume` function enforces most important part of test case,
assumed behaviour description (test message), first.
be it control variable of `for (;;)` variant or cursor variable(s)
of `for $var () { } continue { }`

Tests test:
- visibility of variable in for loop
- visibility of variable in continue loop
- behaviour after loop
Tests covering this changes:
- t/op/for-control-scope.t
- t/op/for-over-scope.t

* Motivation

Normalize where new lexical context starts so every variant
of for loop will start it at same position.

Such normalization will later allow reduction of code currently
duplicated.

* Change description

For statement already defined its own lexical scope in each its branch
though in different places for each branch (using <*> as a marker):

```
for (<*> $control = 0; ...)
for my <*> $cursor (...)
for my <*> ($cursor_a, $cursor_b) (...)
for $cursor (<*> ...)
for my \ <*> $cursor (...)
for \ my <*> $cursor (...)
for \ $cursor (<*> ... )
for (<*> ...)
```

This change formalizes every branch to start with:
```
for <*> ...
```

* Technical details

Lexical scope is started by non-terminal `remember` - empty with an action.

Bison's LALR(1) grammar evaluates action before looking which terminal token
follows, which means, that when `remember` occurs it must be either:
- preceded by unique sequence of terminal tokens (approach used before)
- be on same position for every possible branch (approach used by this change)

As far as `KW_FOR` (as statement) always starts new lexical scope,
it's possible to formalize it by starting it right after keyword,
when `for` statement is identified.
for_control - C-style with control expression
for_over - iterating over list using cursor variable(s)
Sequence `PERLY_PAREN_OPEN` `mexpr` `PERLY_PAREN_CLOSE` is repeated
in several places, add new non-terminal `clause_mexpr` to remove
this repetition.
Extract multiple ways how to declare cursor variable(s) of for-over
loop into dedicated non-terminal allowing to merge `for-over` branches.

Merged for-over branches of:
- `for $scalar (...)`
- `for my $scalar (...)`
- `for my ($foo, $bar) (...)`
- `for \ $scalar (...)`
- `for my \ $scalar (..)`

Branch `for ()` must still remain as far as LALR(1) grammar will run
into shift/reduce conflict with `for (;;)` - there will be two possible
paths between `KW_FOR` and `(`:
- `KW_FOR remember '('` - from `for (;;)` branch
- `KW_FOR remember clause_for_over_cursor '(` - from `for ()` (using default cursor)
@happy-barney
Copy link
Author

@bram-perl I've updated PR and commit messages, can you please look whether it is better described ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants