Commit 2865737

Documentation of the API function
1 parent 739868e commit 2865737

2 files changed: 52 additions, 2 deletions

code_graph/__init__.py

Lines changed: 52 additions & 0 deletions
@@ -10,6 +10,58 @@
 
 
 def codegraph(source_code, lang = "guess", analyses = None, **kwargs):
+    """
+    Transforms source code into an annotated AST.
+
+    Given source code as a string, this function quickly transforms
+    the given code into an annotated AST. The AST is annotated with multiple
+    (configurable) relations like control flow and data flow.
+    The function uses tree-sitter as a backend. Therefore, this
+    function can in theory support most programming languages (see README).
+    However, since control flow and data flow have to be tailored to a specific
+    language, only Java and Python are supported at the moment.
+
+    All transformations are based on the transformations used in
+    'Self-Supervised Bug Detection and Repair' (Allamanis et al., 2021).
+    The original implementation for Python can be found here:
+    https://github.com/microsoft/neurips21-self-supervised-bug-detection-and-repair
+    Note that interprocedural analysis (and relations) are currently not supported.
+
+
+    Parameters
+    ----------
+    source_code : str
+        Source code to be parsed, given as a string. Also
+        supports parsing of incomplete source code
+        snippets (by deactivating the syntax checker; see syntax_error)
+
+    lang : [python, java]
+        String identifier of the programming language
+        to be parsed. The tree-sitter backend can parse most programming languages,
+        including python, java and javascript (see README)
+        Default: guess (language guessing is not supported yet and currently raises an error)
+
+    analyses: list of [ast, cfg, dataflow, subcfg]
+        The analyses that should be applied while parsing the source code and
+        the relations included in the output.
+            ast: Include relations based on the abstract syntax tree (the AST is always computed)
+            cfg: Relations related to the control flow in the program (on a statement level)
+            dataflow: Relations related to the data flow between variables
+            subcfg: Relations related to the control flow (on a subexpression level)
+
+    syntax_error : [raise, warn, ignore]
+        Reaction to syntax error in code snippet.
+            raise: raises a Syntax Error
+            warn: prints a warning to console
+            ignore: Ignores syntax errors. Helpful for parsing code snippets.
+        Default: raise
+
+    Returns
+    -------
+    SourceCodeGraph
+        A labelled multi graph representing the given source code
+    """
+
     root_node, tokens = preprocess_code(source_code, lang, **kwargs)
 
     graph_analyses = load_lang_analyses(tokens[0].config.lang)
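
The docstring above describes the result as a labelled multigraph: an AST whose nodes are additionally connected by relations such as data flow. The sketch below is not the code_graph implementation. It uses Python's standard `ast` module in place of tree-sitter, and the names `LabelledMultiGraph`, `toy_code_graph`, and the relation labels `child` and `last_write` are all hypothetical. It handles straight-line code only.

```python
import ast
from collections import defaultdict


class LabelledMultiGraph:
    """Toy labelled multigraph: one node pair may carry several relation labels."""

    def __init__(self):
        self.nodes = []                # node id -> node type name
        self.edges = defaultdict(set)  # (src id, dst id) -> set of relation labels

    def add_node(self, label):
        self.nodes.append(label)
        return len(self.nodes) - 1

    def add_edge(self, src, dst, relation):
        self.edges[(src, dst)].add(relation)

    def relations(self):
        return {rel for rels in self.edges.values() for rel in rels}


def toy_code_graph(source_code):
    """Annotate a Python AST with 'child' and 'last_write' edges (hypothetical labels)."""
    graph = LabelledMultiGraph()
    last_write = {}  # variable name -> node id of its most recent completed write
    pending = {}     # writes made by the statement that is still being visited

    def visit(node):
        node_id = graph.add_node(type(node).__name__)
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load) and node.id in last_write:
                # Data flow: connect the last write of a variable to this read.
                graph.add_edge(last_write[node.id], node_id, "last_write")
            elif isinstance(node.ctx, ast.Store):
                pending[node.id] = node_id
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.expr_context):
                continue  # Load/Store markers are not interesting as graph nodes
            graph.add_edge(node_id, visit(child), "child")
        if isinstance(node, ast.stmt):
            # A write becomes visible only after its statement has finished,
            # so `x = x + 1` reads the previous value of x.
            last_write.update(pending)
            pending.clear()
        return node_id

    visit(ast.parse(source_code))
    return graph


graph = toy_code_graph("x = 1\ny = x + 1\n")
for (src, dst), rels in graph.edges.items():
    print(graph.nodes[src], "->", graph.nodes[dst], sorted(rels))
```

Storing a set of labels per node pair is what makes this a multigraph: the same two nodes could be linked by both a syntactic and a data-flow relation without one overwriting the other.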

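The three `syntax_error` modes documented above (raise, warn, ignore) can be illustrated with a small stand-in. This is a sketch under assumptions, not the library's code: `check_syntax` is a hypothetical helper, Python's built-in `ast.parse` replaces the tree-sitter syntax check, and the warning is issued through the `warnings` module rather than printed directly.

```python
import ast
import warnings


def check_syntax(source_code, syntax_error="raise"):
    """Return True if the snippet parses; otherwise react according to the mode."""
    if syntax_error not in ("raise", "warn", "ignore"):
        raise ValueError(f"Unknown syntax_error mode: {syntax_error!r}")
    try:
        ast.parse(source_code)
        return True
    except SyntaxError as error:
        if syntax_error == "raise":
            raise  # default mode: propagate the error to the caller
        if syntax_error == "warn":
            warnings.warn(f"Syntax error in code snippet: {error.msg}")
        # "warn" and "ignore" both continue with the incomplete snippet
        return False


print(check_syntax("x = 1"))           # complete statement
print(check_syntax("x = ", "ignore"))  # incomplete snippet, tolerated
```

The `ignore` mode is the one that matters for code snippets: a fragment cut out of a larger file rarely parses cleanly, and a tolerant parser can still produce a partial tree for it.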
code_graph/graph.py

Lines changed: 0 additions & 2 deletions
@@ -1,8 +1,6 @@
 from io import StringIO
 from collections import defaultdict
 
-from itertools import chain
-
 from code_tokenize.tokens import Token
 
 