Commit 337c581

Merge pull request #23 from IonoclastBrigham/patch-1
Fix grammar and flow.
2 parents 92cf17f + 942f8eb commit 337c581

1 file changed: +49 -50 lines changed

README.md

Lines changed: 49 additions & 50 deletions
@@ -2,28 +2,28 @@ LuaJIT Language Toolkit
 ===
 
 The LuaJIT Language Toolkit is an implementation of the Lua programming language written in Lua itself.
-It works by generating a LuaJIT's bytecode including the debug informations and use the LuaJIT's virtual machine to run the generated bytecode.
+It works by generating LuaJIT bytecode, including debug information, and uses LuaJIT's virtual machine to run the generated bytecode.
 
-On itself the language toolkit does not do anything useful since LuaJIT itself does the same things natively.
-The purpose of the language toolkit is to provide a starting point to implement a programming language that target the LuaJIT virtual machine.
+On its own, the language toolkit does not do anything useful, since LuaJIT itself does the same things natively.
+The purpose of the language toolkit is to provide a starting point to implement a programming language that targets the LuaJIT virtual machine.
 
-With the LuaJIT Language Toolkit is easy to create a new language or modify the Lua language because the parser is cleanly separated from the bytecode generator and the virtual machine.
+With the LuaJIT Language Toolkit, it is easy to create a new language or modify the Lua language because the parser is cleanly separated from the bytecode generator and the virtual machine.
 
-The toolkit implement actually a complete pipeline to parse a Lua program, generate an AST tree and generate the bytecode.
+The toolkit implements a complete pipeline to parse a Lua program, generate an AST, and generate the corresponding bytecode.
 
 Lexer
 ---
 
 Its role is to recognize lexical elements from the program text.
-It does take the text of the program as input and does produce a flow of "tokens".
+It takes the text of the program as input and produces a stream of "tokens" as its output.
 
-Using the language toolkit you can run the lexer only to examinate the flow of tokens:
+Using the language toolkit you can run the lexer only, to examinate the stream of tokens:
 
 ```
 luajit run-lexer.lua tests/test-1.lua
 ```
 
-The command above generate for the following code fragment:
+The command above will lex the following code fragment:
 
 ```lua
 local x = {}
@@ -32,7 +32,7 @@ for k = 1, 10 do
 end
 ```
 
-to obtain a list of the tokens:
+...to generate the list of tokens:
 
 TK_local
 TK_name x
@@ -58,36 +58,36 @@ to obtain a list of the tokens:
 TK_number 1
 TK_end
 
-Each line represent a token where the first element is the kind of token and the second element is its value, if any.
+Each line represents a token where the first element is the kind of token and the second element is its value, if any.
 
 The Lexer's code is an almost literal translation of the LuaJIT's lexer.
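
As a rough illustration of the token format described above (a token kind, optionally followed by its value), here is a toy lexer in Lua. It is not the toolkit's lexer: the function name `toy_lex`, the keyword table, and the choice to emit punctuation as the bare character are all assumptions made for this sketch. Only the `TK_local`, `TK_name`, `TK_number` and `TK_end` spellings come from the listing in the README.

```lua
-- Toy lexer, for illustration only (hypothetical; not the toolkit's code).
local keywords = { ["local"] = true, ["for"] = true, ["do"] = true, ["end"] = true }

local function toy_lex(text)
  local tokens, pos = {}, 1
  while pos <= #text do
    -- The "^" anchor makes each pattern match only at position `pos`.
    local ws = text:match("^%s+", pos)
    local word = text:match("^[%a_][%w_]*", pos)
    local num = text:match("^%d+", pos)
    if ws then
      pos = pos + #ws                      -- skip whitespace
    elseif word then
      if keywords[word] then
        tokens[#tokens + 1] = "TK_" .. word
      else
        tokens[#tokens + 1] = "TK_name " .. word
      end
      pos = pos + #word
    elseif num then
      tokens[#tokens + 1] = "TK_number " .. num
      pos = pos + #num
    else
      -- Assumption: any other character is emitted as-is.
      tokens[#tokens + 1] = text:sub(pos, pos)
      pos = pos + 1
    end
  end
  return tokens
end

for _, tok in ipairs(toy_lex("local x = 1")) do
  print(tok)
end
```

Running it prints one token per line (`TK_local`, `TK_name x`, `=`, `TK_number 1`), which mirrors the kind-then-value layout of the real lexer's output.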
 
 Parser
 ---
 
-The parser takes the flow of tokens as given by the lexer and forms the statements and expressions according to the language's grammar.
-The parser is based on a list of parsing rules that are invoked each time a the input match a given rule.
-When the input match a rule a corresponding function in the AST module is called to build an AST node.
-The generated nodes in turns are passed as arguments to the other parsing rules until the whole program is parsed and a complete AST tree is built for the program text.
+The parser takes the token stream from the lexer and builds statements and expressions according to the language's grammar.
+The parser is based on a list of parsing rules that are invoked each time the input matches a given rule.
+When the input matches a rule, a corresponding function in the AST (abstract syntax tree) module is called to build an AST node.
+The generated nodes in turns are passed as arguments to the other parsing rules until the whole program is parsed and a complete AST is built for the program text.
 
-The AST tree is very useful since it does abstract the structure of the program and is more easy to manipulate.
+The AST is very useful as an abstraction of the structure of the program, and is easier to manipulate.
 
-What distinguish the language toolkit from LuaJIT is that the parser phase does generate an AST tree and the bytecode generation is done in a separate phase only when the AST tree is completely generated.
+What distinguishes the language toolkit from LuaJIT is that the parser phase generates an AST, and the bytecode generation is done in a separate phase only when the AST is complete.
 
 LuaJIT itself operates differently.
 During the parsing phase it does not generate any AST but instead the bytecode is directly generated and loaded into the memory to be executed by the VM.
-This means that LuaJIT's C implementation perform the three operations:
+This means that LuaJIT's C implementation performs the three operations:
 
 - parse the program text
 - generate the bytecode
 - load the bytecode into memory
 
 in one single pass.
-This approach is remarkable on itself and very efficient but it makes difficult to modify or extend the programming language.
+This approach is remarkable and very efficient, but makes it difficult to modify or extend the programming language.
 
 ### Parsing Rule example ###
 
-To illustrate how the parsing work in the language toolkit let us make an example.
+To illustrate how parsing works in the language toolkit, let us make an example.
 The grammar rule for the "return" statement is:
 
 ```
@@ -113,55 +113,55 @@ local function parse_return(ast, ls, line)
 end
 ```
 
-As you can see the AST function are invoked using the `ast` object.
+As you can see, the AST functions are invoked using the `ast` object.
 
-In addition the parser provides additional informations about:
+In addition, the parser provides information about:
 
 * the function prototype
 * the syntactic scope
 
-The first is used to keep trace of some informations about the current function parsed.
+The first is used to keep track of some information about the current function being parsed.
 
-The syntactic scope rules tell to the user's rule when a new syntactic block begins or end.
+The syntactic scope rules tell the user's rule when a new syntactic block begins or end.
 Currently this is not really used by the AST builder but it can be useful for other implementations.
 
 The Abstract Syntax Tree (AST)
 ---
 
-The abstract syntax tree represent the whole Lua program with all the informations.
+The abstract syntax tree represent the whole Lua program, with all the information the parser has gathered about it.
 
-One possible approach to implement a new programming language is to generate an AST tree that correspond to the target programming language and to transform the tree in a Lua's AST tree in a separate phase.
+One possible approach to implement a new programming language is to generate an AST that more closely corresponds to the target programming language, and then transform the tree into a Lua AST in a separate phase.
 
-Another possible approach is to act from the parser itself and directly generate the appropriate Lua AST nodes.
+Another possible approach is to directly generate the appropriate Lua AST nodes from the parser itself.
 
-Currently the language toolkit does not perform any transformation and just pass the AST tree to the bytecode generator module.
+Currently the language toolkit does not perform any additional transformations, and just passes the AST to the bytecode generator module.
 
 Bytecode Generator
 ---
 
-Once the AST tree is generated it can be feeded to the bytecode generator module that will generate the corresponding LuaJIT bytecode.
+Once the AST is generated, it can be fed to the bytecode generator module, which will generate the corresponding LuaJIT bytecode.
 
 The bytecode generator is based on the original work of Richard Hundt for the Nyanga programming language.
-It was largely modified by myself to produce optimized code similar to what LuaJIT generate itself.
-A lot of work was also done to ensure the correctness of the bytecode and of the debug informations.
+It was largely modified by myself to produce optimized code similar to what LuaJIT would generate, itself.
+A lot of work was also done to ensure the correctness of the bytecode and of the debug information.
 
 Alternative Lua Code generator
 ------------------------------
 
-Instead of passing the AST tree to the bytecode generator an alternative module can be used to generate Lua code.
+Instead of passing the AST to the bytecode generator, an alternative module can be used to generate Lua code.
 The module is called "luacode-generator" and can be used exactly like the bytecode generator.
 
-The Lua code generator has the advantage of being more simple and more safe as the code is parsed directly by LuaJIT ensuring from the beginning complete compatibility of the bytecode.
+The Lua code generator has the advantage of being more simple and more safe as the code is parsed directly by LuaJIT, ensuring from the beginning complete compatibility of the bytecode.
 
 Currently the Lua Code Generator backend does not preserve the line numbers of the original source code. This is meant to be fixed in the future.
 
 Use this backend instead of the bytecode generator if you prefer to have a more safe backend to convert the Lua AST to code.
-The module can be used also to pretty-printing a Lua AST tree since the code itself is probably the most human readable representation of the AST tree.
+The module can also be used for pretty-printing a Lua AST, since the code itself is probably the most human readable representation of the AST.
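
To make the pretty-printing idea concrete, the sketch below walks a tiny hand-built AST and emits Lua source for it. The node shapes (`kind`, `value`, `name`, `op`, `left`, `right`, `args`) and the function `to_lua` are hypothetical; they are not the toolkit's AST format or the `luacode-generator` API, only a minimal picture of "AST in, Lua text out".

```lua
-- Hypothetical AST-to-Lua printer (illustrative; not the toolkit's module).
local function to_lua(node)
  local kind = node.kind
  if kind == "Number" then
    return tostring(node.value)
  elseif kind == "Name" then
    return node.name
  elseif kind == "Binary" then
    -- Always parenthesize so operator precedence never matters.
    return "(" .. to_lua(node.left) .. " " .. node.op .. " " .. to_lua(node.right) .. ")"
  elseif kind == "Return" then
    local parts = {}
    for i, expr in ipairs(node.args) do
      parts[i] = to_lua(expr)
    end
    return "return " .. table.concat(parts, ", ")
  end
  error("unknown node kind: " .. tostring(kind))
end

-- AST for the statement `return x + 1`, written by hand.
local ast = {
  kind = "Return",
  args = {
    { kind = "Binary", op = "+",
      left = { kind = "Name", name = "x" },
      right = { kind = "Number", value = 1 } },
  },
}

print(to_lua(ast))  --> return (x + 1)
```

Because the output is ordinary Lua text, it can be handed straight to the native parser, which is the safety argument the README makes for this backend.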
 
 C API
 ---
 
-The language toolkit does provide a very simple set of C API to implement a custom language.
+The language toolkit provides a very simple set of C APIs to implement a custom language.
 The functions provided by the C API are:
 
 ```c
@@ -187,7 +187,7 @@ When the function `language_*` is used, an independent `lua_State` is created be
 Once the bytecode is generated it is loaded into the user's `lua_State` ready to be executed.
 The approach of using a separate Lua's state ensure that the process of compiling does not interfere with the user's application.
 
-It should be noted that even when an executable is created with the C API the lang/* Lua files need to be available at run time because they are used by the language toolkit's Lua state.
+It should be noted that even when an executable is created with the C API, the lang/* Lua files need to be available at run time because they are used by the language toolkit's Lua state.
 
 Running the Application
 ---
@@ -200,10 +200,9 @@ luajit run.lua [lua-options] <filename>
 
 The "run.lua" script will just invoke the complete pipeline of the lexer, parser and bytecode generator and it will pass the bytecode to luajit with "loadstring".
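
The final step, handing bytecode to `loadstring`, can be tried in isolation. The sketch below does not use the toolkit at all: as an assumption made purely for illustration, it obtains bytecode from the standard `string.dump` rather than from the toolkit's generator, then loads it back as a callable function.

```lua
-- `loadstring` exists in LuaJIT and Lua 5.1; newer Lua versions use `load`.
local load_chunk = loadstring or load

local function add(a, b)
  return a + b
end

-- Serialize the compiled function to a bytecode string...
local bytecode = string.dump(add)

-- ...and load that bytecode back into a function.
local loaded = assert(load_chunk(bytecode))

print(loaded(2, 3))  --> 5
```

The same loader accepts both source text and bytecode, which is why a pipeline that produces a bytecode string needs nothing beyond `loadstring` to run its output.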
 
-The language toolkit provide also a customized executable named `luajit-x` that use the language toolkit's toolchain instead of the native one.
-Otherwise the program `luajit-x` works exactly as luajit itself and accept the same options.
-
-This means that you can experiment with the language by modifying the Lua implementation of the language and test the changes immediately without recompiling anything by using `luajit-x` as a REPL.
+The language toolkit also provides a customized executable named `luajit-x` that uses the language toolkit's pipeline instead of the native one.
+Otherwise, the program `luajit-x` works exactly the same as `luajit` itself, and accepts the same options.
+This means that you can experiment with the language by modifying the toolkit's implementation, and test the changes immediately without recompiling anything by using `luajit-x` as a REPL.
 
 ### Generated Bytecode ###
 
@@ -216,16 +215,16 @@ For example you can inspect the bytecode using the following command:
 luajit run.lua -bl tests/test-1.lua
 ```
 
-or in alternative:
+or alternatively:
 
 ```
 ./src/luajit-x -bl tests/test-1.lua
 ```
 
 where we suppose that you are running `luajit-x` from the language toolkit's root directory.
-This is somewhat *required* since the `luajit-x` programe needs to found the lang/* Lua modules when is executed.
+This is somewhat *required* since the `luajit-x` program needs to be able to find the lang/* Lua modules when is executed.
 
-Either way, when you use one of the two commands above to generate the bytecode you will obtain on the screen:
+Either way, when you use one of the two commands above to generate the bytecode you will the see following on the screen:
 
 ```
 -- BYTECODE -- "test-1.lua":0-7
@@ -255,13 +254,13 @@ You can compare it with the bytecode generated natively by LuaJIT using the comm
 luajit -bl tests/test-1.lua
 ```
 
-In the example above the generated bytecode will be *identical* to those generated by LuaJIT.
-This is not an hazard since the Language Toolkit's bytecode generator is designed to produce the same bytecode that LuaJIT itself would generate.
-Yet in some cases the generated code will differ but this is not considered a problem as long as the generated code is still correct.
+In the example above the generated bytecode will be *identical* to that generated by LuaJIT.
+This is not an accident, since the Language Toolkit's bytecode generator is designed to produce the same bytecode that LuaJIT itself would generate.
+In some cases, the generated code will differ. But, this is not considered a big problem as long as the generated code is still semantically correct.
 
 ### Bytecode Annotated Dump ###
 
-In addition to the standard LuaJIT bytecode functions the language toolkit support also a special debug mode where the bytecode in printed byte-by-byte in hex format with some annotations on the right side of the screen.
+In addition to the standard LuaJIT bytecode functions, the language toolkit also supports a special debug mode where the bytecode is printed byte-by-byte in hex format with some annotations on the right side of the screen.
 The annotations will explain the meaning of each chunk of bytes and decode them as appropriate.
 
 For example:
@@ -270,7 +269,7 @@ For example:
 luajit run.lua -bx tests/test-1.lua
 ```
 
-will print on the screen something like:
+will display something like:
 
 ```
 1b 4c 4a 01 | Header LuaJIT 2.0 BC
@@ -320,15 +319,15 @@ will print on the screen something like:
 ```
 
 This kind of output is especially useful for debugging the language toolkit itself because it does account for every byte of the bytecode and include all the sections of the bytecode.
-For examples you will be able to inspect the `kgc` or `knum` sections where the prototype's constants are stored.
-The output will include also the debug section in decoded form so that it can be easily inspected.
+For example, you will be able to inspect the `kgc` or `knum` sections where the prototype's constants are stored.
+The output will also include the debug section in decoded form so that it can be easily inspected.
 
 Current Status
 ---
 
 Currently LuaJIT Language Toolkit should be considered as beta software.
 
-The implementation is now complete in term of features and well tested, even for the most complex cases and a complete test suite is used to verify the correctness of the generated bytecode.
+The implementation is now complete in term of features and well tested, even for the most complex cases, and a complete test suite is used to verify the correctness of the generated bytecode.
 
 The language toolkit is currently capable of executing itself.
 This means that the language toolkit is able to correctly compile and load all of its module and execute them correctly.
