|
1 | | -# code_graph |
| 1 | +# Code Graph |
| 2 | +------------------------------------------------ |
| 3 | +> Fast program graph generation in Python |
| 4 | +
| 5 | +Many Programming Language Processing (PLP) approaches exploit the fact that programming languages are highly structured. It is therefore straightforward to parse a program
| 6 | +into an abstract syntax tree, analyse its control flow, and track data-flow
| 7 | +relations between variables (the last one is a bit harder :D).
| 8 | + |
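As a rough illustration of the first step, the sketch below uses Python's standard `ast` module (not code.graph itself, which builds on tree-sitter) to parse a small function and list the parent-child edges of its syntax tree. The helper name `ast_edges` is made up for this example.

```python
import ast


def ast_edges(source):
    """Return (parent_type, child_type) pairs for every AST edge in `source`."""
    tree = ast.parse(source)
    edges = []
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges.append((type(parent).__name__, type(child).__name__))
    return edges


edges = ast_edges('def my_func():\n    print("Hello World")\n')
for parent, child in edges:
    print(f"{parent} -> {child}")
```

Control-flow and data-flow edges would be layered on top of such a tree; they are not computed by this sketch.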
| 9 | +**code.graph** provides easy access to graph representations of programs written in Java and Python. The library is mainly designed to replicate the Python graph representation introduced in [Self-Supervised Bug Detection and Repair](https://arxiv.org/abs/2105.12787) by Allamanis et al., published at NeurIPS 2021.
| 10 | +Therefore, the implementation closely follows the one used by the original
| 11 | +[authors](https://github.com/microsoft/neurips21-self-supervised-bug-detection-and-repair).
| 12 | + |
| 13 | +**Note:** This implementation currently does not compute interprocedural relations; its main purpose is to parse the implementation of single functions.
| 14 | + |
| 15 | + |
| 16 | +## Installation |
| 17 | +The package is tested under Python 3. It can be installed by cloning this repository and installing the package via: |
| 18 | +``` |
| 19 | +pip install -e . |
| 20 | +``` |
| 21 | + |
| 22 | +## Usage |
| 23 | +code.graph can be used to transform Java and Python programs into a graph representation with a few lines of code:
| 24 | +```python |
| 25 | +import code_graph as cg |
| 26 | + |
| 27 | +# Python |
| 28 | +cg.codegraph(
| 29 | +    '''
| 30 | +    def my_func():
| 31 | +        print("Hello World")
| 32 | +    ''',
| 33 | +    lang="python")
| 34 | + |
| 35 | +# Output: PythonCodeGraph(19), a graph with 19 nodes |
| 36 | + |
| 37 | +# Java |
| 38 | +cg.codegraph(
| 39 | +    '''
| 40 | +    public static void main(String[] args){
| 41 | +        System.out.println("Hello World");
| 42 | +    }
| 43 | +    ''',
| 44 | +    lang="java",
| 45 | +    syntax_error="ignore")
| 46 | + |
| 47 | +# Output: JavaCodeGraph(32) |
| 48 | + |
| 49 | +``` |
| 50 | +Further, you can easily traverse the code graph, e.g. via depth-first search: |
| 51 | +```python |
| 52 | +graph = cg.codegraph(...) |
| 53 | + |
| 54 | +dfs_stack = [graph.root_node] # Root of the parsed AST |
| 55 | +while len(dfs_stack) > 0:
| 56 | +    node = dfs_stack.pop(-1)
| 57 | +
| 58 | +    for current, edge_type, next_node in node.successors():
| 59 | +        dfs_stack.append(next_node)
| 60 | + |
| 61 | +``` |
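If the graph contains cycles (which control-flow edges for loops would introduce, assuming such edges are part of the graph), a plain stack-based traversal can revisit nodes indefinitely. The following is a minimal sketch of a DFS with a visited set. `ToyNode` is a hypothetical stand-in, not the library's node class; it only mimics the `(current, edge_type, next_node)` triples returned by `successors()` above.

```python
class ToyNode:
    """Hypothetical stand-in for a graph node; not the library's node class."""

    def __init__(self, name):
        self.name = name
        self._edges = []  # list of (edge_type, target) pairs

    def add_edge(self, edge_type, target):
        self._edges.append((edge_type, target))

    def successors(self):
        # Mirrors the (current, edge_type, next_node) triples used above.
        for edge_type, target in self._edges:
            yield self, edge_type, target


def dfs_order(root):
    """Visit every node reachable from `root` exactly once, depth first."""
    visited = set()
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in visited:
            continue  # already handled via another path
        visited.add(id(node))
        order.append(node.name)
        for _, _, next_node in node.successors():
            stack.append(next_node)
    return order


# Tiny graph with a cycle: a -> b -> c -> a, plus a -> c
a, b, c = ToyNode("a"), ToyNode("b"), ToyNode("c")
a.add_edge("child", b)
a.add_edge("child", c)
b.add_edge("next", c)
c.add_edge("next", a)

print(dfs_order(a))
```

Tracking `id(node)` avoids assuming that node objects are hashable; if they are, the nodes themselves can be stored in the set.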
| 62 | +Alternatively, you can export the graph to DOT format with:
| 63 | +```python |
| 64 | +graph.todot("file_name.dot") |
| 65 | +``` |
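For readers unfamiliar with DOT, the sketch below shows what such a file roughly contains by rendering edge triples as DOT text by hand. It is not how `todot` is implemented; the function `triples_to_dot` and the node and edge-type names are invented for illustration.

```python
def triples_to_dot(triples, graph_name="code_graph"):
    """Render (source, edge_type, target) string triples as DOT text."""

    def quote(text):
        # Wrap in double quotes and escape embedded double quotes.
        return '"' + text.replace('"', '\\"') + '"'

    lines = [f"digraph {graph_name} {{"]
    for source, edge_type, target in triples:
        lines.append(
            f"    {quote(source)} -> {quote(target)} [label={quote(edge_type)}];"
        )
    lines.append("}")
    return "\n".join(lines)


# Hypothetical node and edge-type names, chosen only for this example.
dot_text = triples_to_dot([
    ("function_definition", "child", "identifier"),
    ("function_definition", "child", "block"),
    ("identifier", "next_token", "block"),
])
print(dot_text)
```

A `.dot` file can then be rendered with Graphviz, for example `dot -Tpng file_name.dot -o file_name.png`.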
| 66 | + |
| 67 | +## Project Info |
| 68 | +This project is currently developed as a helper library for internal research projects. Therefore, it will only be updated as needed.
| 69 | + |
| 70 | +Feel free to open an issue if anything unexpected |
| 71 | +happens. |
| 72 | + |
| 73 | +Distributed under the MIT license. See ``LICENSE`` for more information. |
| 74 | + |
| 75 | +We thank the developers of the [tree-sitter](https://tree-sitter.github.io/tree-sitter/) library. Without tree-sitter, this project would not be possible.