chore: ai code review #399

basmasking · 2025-03-21T17:10:44Z

Fixes #398

Changes proposed in this pull request:

configured github action for ai review

@MaskingTechnology/comify

basmasking · 2025-03-21T18:47:57Z

/gemini-review

basmasking · 2025-03-21T18:56:58Z

/gemini-review

basmasking · 2025-03-21T18:58:23Z

/gemini-review

github-actions

Arquivo: .github/workflows/code-review.yml

Resumo da Revisão:

A ação do GitHub para revisão de código com Gemini AI é uma boa ideia, mas precisa de alguns ajustes para melhor funcionamento e segurança. O uso de git diff diretamente e a passagem do diff completo para a API Gemini podem causar problemas de performance e segurança, especialmente em PRs grandes.

Comentários:

Melhora na obtenção do diff: Ao invés de usar git diff e salvar em um arquivo, utilize a API do GitHub para obter o diff. Isso é mais eficiente e evita problemas com arquivos muito grandes. Você pode usar a actions/github-script@v6 para fazer isso em Javascript, consumindo a API do GitHub diretamente. Exemplo:

- uses: actions/github-script@v6
  id: get_diff
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    result-encoding: string
    script: |
      const { data: diff } = await github.rest.pulls.get({
        owner: context.repo.owner,
        repo: context.repo.repo,
        pull_number: context.issue.number,
        mediaType: { format: 'diff' }
      });
      return `pull_request_diff<<EOF\n${diff}\nEOF`;

Segurança - secrets.GITHUB_TOKEN com permissões mínimas: A permissão pull-requests: write é muito ampla. Recomendo restringir para pull-requests: write-comments. Isso garante que a action só possa postar comentários na PR, minimizando o impacto de potenciais vulnerabilidades.
Performance - Chunking do Diff (essencial): PRs grandes podem gerar diffs gigantescos, ultrapassando os limites da API do Gemini e causando falhas. Ao invés de passar o diff inteiro, divida-o em chunks menores e chame a API do Gemini para cada chunk. A lógica de divisão deve ser implementada no script Javascript. Assim, você processa a PR em partes, evitando problemas de tamanho. O pull_request_chunk_size já existe, mas não está sendo usado efetivamente porque o diff inteiro é passado de uma vez.
Tratamento de erros: Adicione tratamento de erros para lidar com situações como falha na obtenção do diff ou na comunicação com a API do Gemini. Isso tornará a action mais robusta.
Nome da action: O nome "Code Review by Gemini AI" pode ser simplificado para algo como "Gemini Code Review".
Versão da action rubensflinco/gemini-code-review-action: Considere usar sempre @main ou uma tag específica mais recente, ao invés de fixar em @1.0.5.
Remoção do \ No newline at end of file: Esse trecho indica que o arquivo não termina com uma nova linha. Adicione uma nova linha vazia ao final do arquivo.

Sugestão de implementação do chunking (Javascript dentro do actions/github-script):

// ... (código anterior para obter o diff)

const chunkSize = parseInt('${{ inputs.pull_request_chunk_size }}') || 3500;
const chunks = diff.match(new RegExp(`(.|[\r\n]){1,${chunkSize}}`, 'g'));

let combinedReview = '';

for (const chunk of chunks) {
  // Chamada para a API Gemini com o 'chunk' ao invés do 'diff' completo
  // ... (código para chamar a API com o chunk)
  // Append o resultado da API em 'combinedReview'
  combinedReview += resultadoDaAPI;
}

// Postar 'combinedReview' como comentário na PR

Implementando essas mudanças, a sua action ficará mais robusta, segura e eficiente, principalmente para PRs de grande porte.

github-actions

Okay, I've reviewed the file .github/workflows/code-review.yml. Here's a summary of my observations and suggestions:

Review Summary:

This workflow sets up automated code review using the Gemini AI model on every pull request. It fetches the diff of the pull request, then uses a third-party action (rubensflinco/gemini-code-review-action@1.0.5) to analyze the diff and create a review.

Issue Comments and Suggestions:

File: .github/workflows/code-review.yml
- Suggestion: Consider adding error handling or retries for the git fetch commands. Network issues can cause these commands to fail, halting the workflow. You could use the continue-on-error: true option or implement a retry mechanism.
- Suggestion: Although pull_request_chunk_size is set to 3500, it would be beneficial to add a comment explaining why this value was chosen. Is it based on experimentation with the Gemini API's limitations or other factors? This improves maintainability.
- Suggestion: The current implementation fetches the entire Git history. Depending on the repository's size and history, this could be time-consuming. You might consider using actions/checkout@v3 with fetch-depth: 0 to fetch the entire history only if needed. Otherwise, consider limiting the fetch depth to only the relevant commits. However, this could limit the AI's ability to understand the full context of the changes.
- Suggestion: It's good practice to pin the version of the actions used. While you are using @1.0.5 for the Gemini code review action, it's generally safer to pin to a specific commit hash rather than a tag. This ensures that the action's behavior doesn't change unexpectedly due to updates to the tag. You should check that the action is properly maintained by the owner.
- Suggestion: Consider adding a step to check the health and validity of the Gemini API key. This can prevent workflow failures due to invalid or expired keys.
- Suggestion: It might be useful to add a condition to the workflow to prevent it from running on certain branches (e.g., main or develop). This could avoid unnecessary reviews on merge commits.
- Suggestion: Instead of writing the diff to a file (diff.txt) and then reading it, you can capture the output of git diff directly into the pull_request_diff variable. This eliminates the need for an intermediate file. Example:
```
run: |-
  git fetch origin "${{ env.PULL_REQUEST_HEAD_REF }}"
  git fetch origin "${{ env.PULL_REQUEST_BASE_REF }}"
  git checkout "${{ env.PULL_REQUEST_HEAD_REF }}"
  DIFF=$(git diff "origin/${{ env.PULL_REQUEST_BASE_REF }}")
  echo "pull_request_diff<<EOF" >> $GITHUB_OUTPUT
  echo "$DIFF" >> $GITHUB_OUTPUT
  echo "EOF" >> $GITHUB_OUTPUT
```
- Suggestion: Think about the cost implications of using an AI code review tool. High usage might lead to significant API costs. Implement monitoring and cost control measures, such as limiting the number of reviews per day or per repository.
- Security Consideration: Ensure the GEMINI_API_KEY secret is properly protected and not exposed in the workflow logs. While secrets are masked in logs, it's still crucial to avoid accidental exposure.
- Security Consideration: This workflow relies on a third-party action. It's important to carefully vet the action's code and author to ensure it doesn't introduce any security vulnerabilities. Consider using actions from trusted sources or auditing the action's code.
- Suggestion: Consider adding metrics or logging to track the effectiveness of the AI code review. This data can be used to improve the workflow and the AI's recommendations over time.
- Suggestion: Add a comment explaining why log_level: "DEBUG" is used. Is this for initial setup and debugging, or is it intended for ongoing use? If the latter, consider the volume of logs generated and whether it's necessary for routine operation.

I'm ready for the next file.

basmasking · 2025-03-21T20:53:20Z

/gemini-review

basmasking · 2025-03-21T20:57:24Z

/gemini-review

github-actions · 2025-03-21T21:05:20Z

✅ LGTM! No issues found

Code meets quality standards

sonarqubecloud · 2025-03-21T21:18:21Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions

Okay, I've reviewed the provided diff for the file .github/workflows/code-review.yml. Here's a summary of my findings and suggestions:

Review Summary:

This file introduces a new GitHub Actions workflow named "Gemini Code Review". This workflow automates code reviews on pull requests targeting the main branch using the rubensflinco/gemini-code-review-action. It fetches the diff of the pull request, feeds it to the Gemini AI model (presumably via the action), and then posts the review as a comment on the pull request. The workflow includes error handling, uses secrets for API keys and tokens, and allows for customization of the Gemini model and prompt.

Overall, this looks like a good starting point for automating code reviews. Here are some things to consider improving:

Security: While the API key and GitHub token are stored as secrets, ensure that the rubensflinco/gemini-code-review-action handles these secrets securely and does not inadvertently expose them in logs or elsewhere. Consider researching the reputation and security practices of the action provider.
Maintainability: The workflow is reasonably well-structured. However, consider adding comments to explain the purpose of each step, especially the more complex ones like fetching the diff.
Error Handling: The current workflow doesn't explicitly handle errors. Consider adding error handling to steps like fetching the diff or calling the Gemini API to gracefully handle failures and provide informative error messages.
Customization and Flexibility: Expose more parameters as workflow inputs to allow for greater customization. For example, allow users to specify the target branch, the Gemini model, or even the action to take on review findings (e.g., create a task, add a label).

File: .github/workflows/code-review.yml

Issue Comments:

Clarity and Security: Action Security.
```
       - uses: rubensflinco/gemini-code-review-action@1.0.5
         name: "Code Review by Gemini AI"
```
Suggestion: While using pre-built actions is convenient, it's crucial to vet their security. Before relying heavily on rubensflinco/gemini-code-review-action, consider:
* Examining the action's source code: If possible, review the action's source code to understand how it handles your GEMINI_API_KEY and GITHUB_TOKEN. Look for any potential vulnerabilities such as logging secrets, insecure storage, or transmission of sensitive data.
* Checking the action's reputation: Research the action's author and the community's feedback. Are there any known security concerns or vulnerabilities?
* Pinning to a specific commit: Instead of @1.0.5, pin the action to a specific commit SHA. This prevents unexpected behavior changes if the action is updated with malicious code. For example: rubensflinco/gemini-code-review-action@<commit_sha>. You'll need to monitor the action for updates and manually update the commit SHA when you've reviewed and approved the changes.

Improvement: Add Error Handling

         run: |-
           git fetch origin "${{ env.PULL_REQUEST_HEAD_REF }}"
           git fetch origin "${{ env.PULL_REQUEST_BASE_REF }}"
           git checkout "${{ env.PULL_REQUEST_HEAD_REF }}"
           git diff "origin/${{ env.PULL_REQUEST_BASE_REF }}" > "diff.txt"
           {
             echo "pull_request_diff<<EOF";
             cat "diff.txt";
             echo 'EOF';
           } >> $GITHUB_OUTPUT

Suggestion: The shell commands for fetching the diff lack error handling. If any of these commands fail (e.g., due to network issues or invalid refs), the workflow will continue, potentially with an empty diff. Add error handling using set -e and checking the exit codes of the commands. For example:

run: |-
  set -e
  git fetch origin "${{ env.PULL_REQUEST_HEAD_REF }}" || exit 1
  git fetch origin "${{ env.PULL_REQUEST_BASE_REF }}" || exit 1
  git checkout "${{ env.PULL_REQUEST_HEAD_REF }}" || exit 1
  git diff "origin/${{ env.PULL_REQUEST_BASE_REF }}" > "diff.txt" || exit 1
  {
    echo "pull_request_diff<<EOF";
    cat "diff.txt";
    echo 'EOF';
  } >> $GITHUB_OUTPUT

set -e will cause the script to exit immediately if any command fails. The || exit 1 after each Git command will also explicitly exit with a non-zero status code if the command fails.

Clarity: Consider Adding Comments

         run: |-
           git fetch origin "${{ env.PULL_REQUEST_HEAD_REF }}"
           git fetch origin "${{ env.PULL_REQUEST_BASE_REF }}"
           git checkout "${{ env.PULL_REQUEST_HEAD_REF }}"
           git diff "origin/${{ env.PULL_REQUEST_BASE_REF }}" > "diff.txt"
           {
             echo "pull_request_diff<<EOF";
             cat "diff.txt";
             echo 'EOF';
           } >> $GITHUB_OUTPUT

Suggestion: While the commands are relatively straightforward, add a brief comment explaining the purpose of this section of the workflow (e.g., "Fetch the diff between the pull request head and base branches"). This will make the workflow easier to understand and maintain in the future. Also, explain the purpose of redirecting the diff into a variable.

Security: Review Gemini AI output for Sensitive Information
The Gemini AI might hallucinate or inadvertently include sensitive information in its review comments (e.g., internal URLs, API keys, or code snippets that should not be public). You should have monitoring in place to detect and redact such information. This could involve a post-processing step that scans the Gemini AI's output for potential secrets or sensitive data before posting the comment to the pull request.

I'm ready for the next file diff.

basmasking · 2025-03-21T21:35:46Z

The number of gemini ai review actions are limited and the review seems not too good for now. We will build something ourselves if we see the added benefit (which is not at this point in time)

#398: first attempt

af69302

basmasking requested a review from petermasking March 21, 2025 17:10

basmasking linked an issue Mar 21, 2025 that may be closed by this pull request

AI code review in CI #398

Closed

basmasking added 2 commits March 21, 2025 19:42

test change

a84eb9f

#399: second attempt

2503d1e

#398: restore incorrect code

8891d4f

#398: different action

0d38ea7

github-actions bot reviewed Mar 21, 2025

View reviewed changes

basmasking added 2 commits March 21, 2025 20:12

#398: try again

f4a27b4

#398: attempt 4

fd89686

github-actions bot reviewed Mar 21, 2025

View reviewed changes

#398: manual trigger review

a87712a

#398: rename workflow

3080d85

#398: try again

25610d9

#398: try again

17efaf3

github-actions bot reviewed Mar 21, 2025

View reviewed changes

basmasking closed this Mar 21, 2025

basmasking deleted the 398-ai-code-review-in-ci branch June 13, 2025 11:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chore: ai code review #399

chore: ai code review #399

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

github-actions bot left a comment

Uh oh!

github-actions bot left a comment

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

github-actions bot commented Mar 21, 2025

Uh oh!

sonarqubecloud bot commented Mar 21, 2025

Uh oh!

github-actions bot left a comment

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

chore: ai code review #399

chore: ai code review #399

Uh oh!

Conversation

basmasking commented Mar 21, 2025

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

github-actions bot commented Mar 21, 2025

Uh oh!

sonarqubecloud bot commented Mar 21, 2025

Quality Gate passed

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

basmasking commented Mar 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant