Replaced a few mistranslations.
Fixed some basic grammatical errors (mood, plurality, etc.).
Made a bunch of minor changes where the sentence wasn't necessarily ungrammatical, but would be awkward to a native speaker.
README.md: 49 additions & 50 deletions
Original file line number
Diff line number
Diff line change
@@ -2,28 +2,28 @@ LuaJIT Language Toolkit
2
2
===
3
3
4
4
The LuaJIT Language Toolkit is an implementation of the Lua programming language written in Lua itself.
5
-
It works by generating a LuaJIT's bytecode including the debug informations and use the LuaJIT's virtual machine to run the generated bytecode.
5
+
It works by generating LuaJIT bytecode, including debug information, and uses LuaJIT's virtual machine to run the generated bytecode.
6
6
7
-
On itself the language toolkit does not do anything useful since LuaJIT itself does the same things natively.
8
-
The purpose of the language toolkit is to provide a starting point to implement a programming language that target the LuaJIT virtual machine.
7
+
On its own, the language toolkit does not do anything useful, since LuaJIT itself does the same things natively.
8
+
The purpose of the language toolkit is to provide a starting point to implement a programming language that targets the LuaJIT virtual machine.
9
9
10
-
With the LuaJIT Language Toolkit is easy to create a new language or modify the Lua language because the parser is cleanly separated from the bytecode generator and the virtual machine.
10
+
With the LuaJIT Language Toolkit, it is easy to create a new language or modify the Lua language because the parser is cleanly separated from the bytecode generator and the virtual machine.
11
11
12
-
The toolkit implement actually a complete pipeline to parse a Lua program, generate an AST tree and generate the bytecode.
12
+
The toolkit implements a complete pipeline to parse a Lua program, generate an AST, and generate the corresponding bytecode.
13
13
14
14
Lexer
15
15
---
16
16
17
17
Its role is to recognize lexical elements from the program text.
18
-
It does take the text of the program as input and does produce a flow of "tokens".
18
+
It takes the text of the program as input and produces a stream of "tokens" as its output.
19
19
20
-
Using the language toolkit you can run the lexer only to examinate the flow of tokens:
20
+
Using the language toolkit, you can run only the lexer to examine the stream of tokens:
21
21
22
22
```
23
23
luajit run-lexer.lua tests/test-1.lua
24
24
```
25
25
26
-
The command above generate for the following code fragment:
26
+
The command above will lex the following code fragment:
27
27
28
28
```lua
29
29
local x = {}
@@ -32,7 +32,7 @@ for k = 1, 10 do
32
32
end
33
33
```
34
34
35
-
to obtain a list of the tokens:
35
+
...to generate the list of tokens:
36
36
37
37
TK_local
38
38
TK_name x
@@ -58,36 +58,36 @@ to obtain a list of the tokens:
58
58
TK_number 1
59
59
TK_end
60
60
61
-
Each line represent a token where the first element is the kind of token and the second element is its value, if any.
61
+
Each line represents a token where the first element is the kind of token and the second element is its value, if any.
62
62
63
63
The lexer's code is an almost literal translation of LuaJIT's lexer.
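
As a rough illustration of the "kind plus optional value" token shape described above, here is a minimal sketch of a lexer in plain Lua. It is not the toolkit's lexer: the `lex` function, the keyword table, and the use of single characters as token kinds for punctuation are all assumptions made for this example.

```lua
-- Hypothetical mini-lexer, for illustration only (not the toolkit's code).
local keywords = { ["local"] = true, ["for"] = true, ["do"] = true, ["end"] = true }

local function lex(src)
  local tokens, pos = {}, 1
  while true do
    -- Skip whitespace: jump to the next non-space character, if any.
    pos = src:find("%S", pos)
    if not pos then break end
    local word = src:match("^[%a_][%w_]*", pos)
    local num = src:match("^%d+", pos)
    if word then
      if keywords[word] then
        tokens[#tokens + 1] = { "TK_" .. word }    -- keyword: kind only
      else
        tokens[#tokens + 1] = { "TK_name", word }  -- identifier: kind + value
      end
      pos = pos + #word
    elseif num then
      tokens[#tokens + 1] = { "TK_number", tonumber(num) }
      pos = pos + #num
    else
      -- Any other character becomes a single-character token.
      tokens[#tokens + 1] = { src:sub(pos, pos) }
      pos = pos + 1
    end
  end
  return tokens
end

-- Print one token per line: the kind first, then the value, if any.
for _, tok in ipairs(lex("local x = {}")) do
  print(tok[1], tok[2] or "")
end
```

A real lexer also has to handle strings, comments, multi-character operators and error reporting, all of which this sketch leaves out.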
64
64
65
65
Parser
66
66
---
67
67
68
-
The parser takes the flow of tokens as given by the lexer and forms the statements and expressions according to the language's grammar.
69
-
The parser is based on a list of parsing rules that are invoked each time a the input match a given rule.
70
-
When the input match a rule a corresponding function in the AST module is called to build an AST node.
71
-
The generated nodes in turns are passed as arguments to the other parsing rules until the whole program is parsed and a complete AST tree is built for the program text.
68
+
The parser takes the token stream from the lexer and builds statements and expressions according to the language's grammar.
69
+
The parser is based on a list of parsing rules that are invoked each time the input matches a given rule.
70
+
When the input matches a rule, a corresponding function in the AST (abstract syntax tree) module is called to build an AST node.
71
+
The generated nodes in turn are passed as arguments to the other parsing rules until the whole program is parsed and a complete AST is built for the program text.
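
The way a rule consumes tokens and calls into an AST builder can be sketched as follows. Everything here is hypothetical: the `ast` functions, the node field names, and the `token_stream` helper are assumptions for this example and do not reproduce the toolkit's actual modules.

```lua
-- Hypothetical AST builder: each function returns a plain table node.
local ast = {}
function ast.number(value) return { kind = "Number", value = value } end
function ast.return_stmt(exps, line)
  return { kind = "ReturnStatement", arguments = exps, line = line }
end

-- Hypothetical token stream with one token of lookahead.
local function token_stream(tokens)
  local i = 1
  return {
    peek = function() return tokens[i] end,
    next = function() local t = tokens[i]; i = i + 1; return t end,
  }
end

-- Simplified rule: 'return' followed by zero or more comma-separated numbers.
local function parse_return(ast, ls, line)
  ls.next() -- skip the 'return' token
  local exps = {}
  while ls.peek() and ls.peek()[1] == "TK_number" do
    -- The node built for each expression is passed on to the enclosing node.
    exps[#exps + 1] = ast.number(ls.next()[2])
    if ls.peek() and ls.peek()[1] == "," then ls.next() else break end
  end
  return ast.return_stmt(exps, line)
end

local ls = token_stream({ { "TK_return" }, { "TK_number", 1 }, { "," }, { "TK_number", 2 } })
local node = parse_return(ast, ls, 1)
print(node.kind, #node.arguments)
```

Because the rule only talks to the `ast` object, a different builder could be swapped in without touching the rule itself.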
72
72
73
-
The AST tree is very useful since it does abstract the structure of the program and is more easy to manipulate.
73
+
The AST is very useful as an abstraction of the structure of the program, and is easier to manipulate.
74
74
75
-
What distinguish the language toolkit from LuaJIT is that the parser phase does generate an AST tree and the bytecode generation is done in a separate phase only when the AST tree is completely generated.
75
+
What distinguishes the language toolkit from LuaJIT is that the parser phase generates an AST, and the bytecode generation is done in a separate phase only when the AST is complete.
76
76
77
77
LuaJIT itself operates differently.
78
78
During the parsing phase it does not generate any AST; instead, the bytecode is generated directly and loaded into memory to be executed by the VM.
79
-
This means that LuaJIT's C implementation perform the three operations:
79
+
This means that LuaJIT's C implementation performs the three operations:
80
80
81
81
- parse the program text
82
82
- generate the bytecode
83
83
- load the bytecode into memory
84
84
85
85
in one single pass.
86
-
This approach is remarkable on itself and very efficient but it makes difficult to modify or extend the programming language.
86
+
This approach is remarkable and very efficient, but makes it difficult to modify or extend the programming language.
87
87
88
88
### Parsing Rule example ###
89
89
90
-
To illustrate how the parsing work in the language toolkit let us make an example.
90
+
To illustrate how parsing works in the language toolkit, let us look at an example.
91
91
The grammar rule for the "return" statement is:
92
92
93
93
```
@@ -113,55 +113,55 @@ local function parse_return(ast, ls, line)
113
113
end
114
114
```
115
115
116
-
As you can see the AST function are invoked using the `ast` object.
116
+
As you can see, the AST functions are invoked using the `ast` object.
117
117
118
-
In addition the parser provides additional informations about:
118
+
In addition, the parser provides information about:
119
119
120
120
* the function prototype
121
121
* the syntactic scope
122
122
123
-
The first is used to keep trace of some informations about the current function parsed.
123
+
The first is used to keep track of some information about the current function being parsed.
124
124
125
-
The syntactic scope rules tell to the user's rule when a new syntactic block begins or end.
125
+
The syntactic scope rules tell the user's rule when a new syntactic block begins or ends.
126
126
Currently this is not really used by the AST builder but it can be useful for other implementations.
127
127
128
128
The Abstract Syntax Tree (AST)
129
129
---
130
130
131
-
The abstract syntax tree represent the whole Lua program with all the informations.
131
+
The abstract syntax tree represents the whole Lua program, with all the information the parser has gathered about it.
132
132
133
-
One possible approach to implement a new programming language is to generate an AST tree that correspond to the target programming language and to transform the tree in a Lua's AST tree in a separate phase.
133
+
One possible approach to implement a new programming language is to generate an AST that more closely corresponds to the target programming language, and then transform the tree into a Lua AST in a separate phase.
134
134
135
-
Another possible approach is to act from the parser itself and directly generate the appropriate Lua AST nodes.
135
+
Another possible approach is to directly generate the appropriate Lua AST nodes from the parser itself.
136
136
137
-
Currently the language toolkit does not perform any transformation and just pass the AST tree to the bytecode generator module.
137
+
Currently the language toolkit does not perform any additional transformations, and just passes the AST to the bytecode generator module.
138
138
139
139
Bytecode Generator
140
140
---
141
141
142
-
Once the AST tree is generated it can be feeded to the bytecode generator module that will generate the corresponding LuaJIT bytecode.
142
+
Once the AST is generated, it can be fed to the bytecode generator module, which will generate the corresponding LuaJIT bytecode.
143
143
144
144
The bytecode generator is based on the original work of Richard Hundt for the Nyanga programming language.
145
-
It was largely modified by myself to produce optimized code similar to what LuaJIT generate itself.
146
-
A lot of work was also done to ensure the correctness of the bytecode and of the debug informations.
145
+
I modified it extensively to produce optimized code similar to what LuaJIT itself would generate.
146
+
A lot of work was also done to ensure the correctness of the bytecode and of the debug information.
147
147
148
148
Alternative Lua Code generator
149
149
------------------------------
150
150
151
-
Instead of passing the AST tree to the bytecode generator an alternative module can be used to generate Lua code.
151
+
Instead of passing the AST to the bytecode generator, an alternative module can be used to generate Lua code.
152
152
The module is called "luacode-generator" and can be used exactly like the bytecode generator.
153
153
154
-
The Lua code generator has the advantage of being more simple and more safe as the code is parsed directly by LuaJIT ensuring from the beginning complete compatibility of the bytecode.
154
+
The Lua code generator has the advantage of being simpler and safer, as the code is parsed directly by LuaJIT, ensuring complete compatibility of the bytecode from the beginning.
155
155
156
156
Currently the Lua Code Generator backend does not preserve the line numbers of the original source code. This is meant to be fixed in the future.
157
157
158
158
Use this backend instead of the bytecode generator if you prefer a safer backend for converting the Lua AST to code.
159
-
The module can be used also to pretty-printing a Lua AST tree since the code itself is probably the most human readable representation of the AST tree.
159
+
The module can also be used for pretty-printing a Lua AST, since the code itself is probably the most human readable representation of the AST.
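
To make the idea of turning an AST back into Lua source concrete, here is a minimal sketch of a code generator that dispatches on node kind. The node shapes (`kind`, `arguments`, `operator`, and so on) are assumptions made for this example; the real "luacode-generator" module and the toolkit's AST may look different.

```lua
-- Hypothetical node shapes; the toolkit's real AST may differ.
local gen = {}

-- Dispatch on the node kind to the matching emitter function.
local function emit(node) return gen[node.kind](node) end

function gen.Number(node) return tostring(node.value) end
function gen.Identifier(node) return node.name end
function gen.BinaryExpression(node)
  -- Always parenthesize so operator precedence never has to be tracked.
  return "(" .. emit(node.left) .. " " .. node.operator .. " " .. emit(node.right) .. ")"
end
function gen.ReturnStatement(node)
  local parts = {}
  for i, exp in ipairs(node.arguments) do parts[i] = emit(exp) end
  return "return " .. table.concat(parts, ", ")
end

local tree = {
  kind = "ReturnStatement",
  arguments = {
    { kind = "BinaryExpression", operator = "+",
      left = { kind = "Identifier", name = "x" },
      right = { kind = "Number", value = 1 } },
  },
}
local code = emit(tree)
print(code)
```

Since the output is ordinary Lua text, it can be handed to LuaJIT for parsing, which is what makes this kind of backend easy to check by eye.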
160
160
161
161
C API
162
162
---
163
163
164
-
The language toolkit does provide a very simple set of C API to implement a custom language.
164
+
The language toolkit provides a very simple set of C APIs to implement a custom language.
165
165
The functions provided by the C API are:
166
166
167
167
```c
@@ -187,7 +187,7 @@ When the function `language_*` is used, an independent `lua_State` is created be
187
187
Once the bytecode is generated it is loaded into the user's `lua_State` ready to be executed.
188
188
Using a separate Lua state ensures that the process of compiling does not interfere with the user's application.
189
189
190
-
It should be noted that even when an executable is created with the C API the lang/* Lua files need to be available at run time because they are used by the language toolkit's Lua state.
190
+
It should be noted that even when an executable is created with the C API, the lang/* Lua files need to be available at run time because they are used by the language toolkit's Lua state.
The "run.lua" script will just invoke the complete pipeline of the lexer, parser and bytecode generator and it will pass the bytecode to luajit with "loadstring".
202
202
203
-
The language toolkit provide also a customized executable named `luajit-x` that use the language toolkit's toolchain instead of the native one.
204
-
Otherwise the program `luajit-x` works exactly as luajit itself and accept the same options.
205
-
206
-
This means that you can experiment with the language by modifying the Lua implementation of the language and test the changes immediately without recompiling anything by using `luajit-x` as a REPL.
203
+
The language toolkit also provides a customized executable named `luajit-x` that uses the language toolkit's pipeline instead of the native one.
204
+
Otherwise, the program `luajit-x` works exactly the same as `luajit` itself, and accepts the same options.
205
+
This means that you can experiment with the language by modifying the toolkit's implementation, and test the changes immediately without recompiling anything by using `luajit-x` as a REPL.
207
206
208
207
### Generated Bytecode ###
209
208
@@ -216,16 +215,16 @@ For example you can inspect the bytecode using the following command:
216
215
luajit run.lua -bl tests/test-1.lua
217
216
```
218
217
219
-
or in alternative:
218
+
or alternatively:
220
219
221
220
```
222
221
./src/luajit-x -bl tests/test-1.lua
223
222
```
224
223
225
224
where we suppose that you are running `luajit-x` from the language toolkit's root directory.
226
-
This is somewhat *required* since the `luajit-x` programe needs to found the lang/* Lua modules when is executed.
225
+
This is somewhat *required* since the `luajit-x` program needs to be able to find the lang/* Lua modules when it is executed.
227
226
228
-
Either way, when you use one of the two commands above to generate the bytecode you will obtain on the screen:
227
+
Either way, when you use one of the two commands above to generate the bytecode, you will see the following on the screen:
229
228
230
229
```
231
230
-- BYTECODE -- "test-1.lua":0-7
@@ -255,13 +254,13 @@ You can compare it with the bytecode generated natively by LuaJIT using the comm
255
254
luajit -bl tests/test-1.lua
256
255
```
257
256
258
-
In the example above the generated bytecode will be *identical* to those generated by LuaJIT.
259
-
This is not an hazard since the Language Toolkit's bytecode generator is designed to produce the same bytecode that LuaJIT itself would generate.
260
-
Yet in some cases the generated code will differ but this is not considered a problem as long as the generated code is still correct.
257
+
In the example above the generated bytecode will be *identical* to that generated by LuaJIT.
258
+
This is not an accident, since the Language Toolkit's bytecode generator is designed to produce the same bytecode that LuaJIT itself would generate.
259
+
In some cases the generated code will differ, but this is not considered a big problem as long as the generated code is still semantically correct.
261
260
262
261
### Bytecode Annotated Dump ###
263
262
264
-
In addition to the standard LuaJIT bytecode functions the language toolkit support also a special debug mode where the bytecode in printed byte-by-byte in hex format with some annotations on the right side of the screen.
263
+
In addition to the standard LuaJIT bytecode functions, the language toolkit also supports a special debug mode where the bytecode is printed byte-by-byte in hex format with some annotations on the right side of the screen.
265
264
The annotations will explain the meaning of each chunk of bytes and decode them as appropriate.
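
The byte-by-byte view can be approximated in a few lines of plain Lua. This is only a sketch: the `hexdump` helper is a name invented for this example, `string.dump` stands in for the toolkit's bytecode generator, and no annotations are produced.

```lua
-- Hypothetical helper: format the first `count` bytes of a string as hex.
local function hexdump(bytes, count)
  local out = {}
  for i = 1, math.min(count, #bytes) do
    out[i] = string.format("%02x", bytes:byte(i))
  end
  return table.concat(out, " ")
end

-- string.dump returns the bytecode of a Lua function as a string.
local bytecode = string.dump(function() return 1 end)
print(hexdump(bytecode, 4))
```

Under LuaJIT the dump starts with the bytes `1b 4c 4a`, followed by a version byte; other Lua implementations use a different signature after the leading `1b`.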
266
265
267
266
For example:
@@ -270,7 +269,7 @@ For example:
270
269
luajit run.lua -bx tests/test-1.lua
271
270
```
272
271
273
-
will print on the screen something like:
272
+
will display something like:
274
273
275
274
```
276
275
1b 4c 4a 01 | Header LuaJIT 2.0 BC
@@ -320,15 +319,15 @@ will print on the screen something like:
320
319
```
321
320
322
321
This kind of output is especially useful for debugging the language toolkit itself because it accounts for every byte of the bytecode and includes all of its sections.
323
-
For examples you will be able to inspect the `kgc` or `knum` sections where the prototype's constants are stored.
324
-
The output will include also the debug section in decoded form so that it can be easily inspected.
322
+
For example, you will be able to inspect the `kgc` or `knum` sections where the prototype's constants are stored.
323
+
The output will also include the debug section in decoded form so that it can be easily inspected.
325
324
326
325
Current Status
327
326
---
328
327
329
328
Currently the LuaJIT Language Toolkit should be considered beta software.
330
329
331
-
The implementation is now complete in term of features and well tested, even for the most complex cases and a complete test suite is used to verify the correctness of the generated bytecode.
330
+
The implementation is now complete in terms of features and well tested, even for the most complex cases, and a complete test suite is used to verify the correctness of the generated bytecode.
332
331
333
332
The language toolkit is currently capable of executing itself.
334
333
This means that the language toolkit is able to correctly compile and load all of its modules and execute them correctly.