🚀 AllianceCoder: What to Retrieve for Effective Retrieval-Augmented Code Generation? 🔍 An Empirical Study and Beyond
🖥️ AllianceCoder consists of three key phases:
✅ We utilize large language models (LLMs) to generate natural language descriptions for each API present in the repository.
✅ These descriptions are then encoded into vector representations using pre-trained embedding models.
🔍 We guide LLMs with carefully designed examples to generate descriptions of potentially invoked API functionalities.
🔍 These descriptions are similarly encoded into vector representations.
🤖 Relevant APIs are retrieved based on the cosine similarity between their vector representations.
🤖 The retrieved APIs provide valuable context for enhanced code generation.
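The retrieval step above boils down to ranking API descriptions by cosine similarity between embedding vectors. A minimal sketch with toy hand-written vectors (real embeddings come from a pre-trained embedding model; the API names here are purely illustrative):

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_vec, api_vecs, k=2):
    # api_vecs maps an API name to its description's embedding vector.
    ranked = sorted(api_vecs.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

# Toy 3-dimensional "embeddings" standing in for real model output.
api_vecs = {
    "parse_config": [0.9, 0.1, 0.0],
    "send_request": [0.1, 0.9, 0.1],
    "format_date":  [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # embedding of the predicted API functionality
print(retrieve_top_k(query, api_vecs))  # → ['parse_config', 'send_request']
```

The retrieved names then identify which API descriptions and signatures to place in the generation prompt.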
|-- 📁 repo_funcs_summary # 🛠️ First step: Extract and summarize repository functions
|
|-- 📁 ask_dependencies # 🔗 Second step: Identify function dependencies
|
|-- 📁 similarity_retrieval # 🔍 Third step: Retrieve the most relevant APIs
|-- 📁 function_list_buildup # 📜 Build a function list for further reference
|-- 📄 final_completion.py # 🏗️ Generate complete code using retrieved APIs
|
|-- 🚀 run_pipeline.py # ▶️ Execute the full pipeline
📁 empirical_study # Code for the empirical study reported in our paper
|-- 📁 API # code for API retrieval and context+API retrieval
|-- 📁 LLM # code for context retrieval and LLM-only generation (without retrieval)
|-- 📁 SimCode # code for similarity retrieval and SimCode+context retrieval
|-- 📁 SimCode+API # code for SimCode+API retrieval and context+SimCode+API retrieval
To get started, make sure your `input.jsonl` file and the corresponding input repository are correctly set up within your `input` folder.
If you're looking to use RepoExec, CoderEval, or ExecRepoBench, follow the specific instructions below to prepare your data.
To configure RepoExec, modify `input/handle_input+RepoExec+Context.py`: on Line 6, change `test-apps/httpie` to the name of your project, then execute the Python file:

```python
filtered_data = [item for item in ds if item['project'] == 'your-project-name']
```
For CoderEval, adjust `input/handle_input+CoderEval+Context.py`: on Line 8, change `SoftwareHeritage/swh-lister` to your project's name, then run the Python file:

```python
if record.get("project") != "your-project-name":
```
When using ExecRepoBench, edit `input/handle_input+ExecRepoBench+Context.py`: on Line 17, change `algorithms` to your project's name, then execute the Python file:

```python
if repo_name != "your-project-name":
```
Run the following command to install all required packages:
```shell
pip install contexttimer flask transformers_stream_generator colorama accelerate python-Levenshtein tqdm sentence_transformers flash_attn
```
On Line 55 of `repo_funcs_summary/repo_funcs_extraction.py`, change the `repo_path` string to your repository path:

```python
repo_path = 'input/string_utils'  # change to your repository path
```
```shell
python run_pipeline.py
```
In our paper, we evaluate AllianceCoder using three function-level Python code generation benchmarks: RepoExec, CoderEval, and ExecRepoBench. Each dataset presents unique challenges relevant to real-world repository-level code completion.
**RepoExec**
Focus: Repository-level code completion with complex contextual dependencies.

- ✅ Evaluates the ability to generate functionally correct and executable code while utilizing cross-file contexts.
- 🧩 Each task includes developer-specified code dependencies and comprehensive test cases.
- 📚 Ideal for assessing how well models understand and integrate repository-wide knowledge.
- 📦 Dataset: Hugging Face – RepoExec
- 📄 Paper: arXiv:2406.11927v2
**CoderEval**
Focus: Pragmatic function-level code generation across real-world tasks.

- 🛠️ Contains 230 Python and 230 Java tasks sampled from open-source repositories.
- 🧾 Each task provides:
  - A function signature
  - A natural language description
  - A reference solution
  - Unit tests for functional verification
- 📦 Dataset: GitHub – CoderEval
- 📄 Paper: ACM DL
**ExecRepoBench**
Focus: Code completion benchmark with AST-guided multi-level masking.

- 🧠 Based on 1,200 samples from real Python repositories.
- 🧩 Simulates statement-, expression-, and function-level masking guided by abstract syntax trees (ASTs).
- ⚙️ Originally designed for block completion; we adapt it for function-level generation:
  - Reviewed the 167 test samples
  - Selected and transformed the suitable ones into full function-level generation tasks
- 📦 Codebase: ExecRepoBench
- 📄 Paper: arXiv:2412.11990
- 📁 Modified data file: `input/execrepobench_data.jsonl` (included in this repo)