Skip to content

Conversation

basmasking
Copy link
Member

Fixes #398

Changes proposed in this pull request:

  • configured github action for ai review

@MaskingTechnology/comify

@basmasking basmasking requested a review from petermasking March 21, 2025 17:10
@basmasking basmasking linked an issue Mar 21, 2025 that may be closed by this pull request
@basmasking
Copy link
Member Author

/gemini-review

1 similar comment
@basmasking
Copy link
Member Author

/gemini-review

@basmasking
Copy link
Member Author

/gemini-review

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arquivo: .github/workflows/code-review.yml

Resumo da Revisão:

A ação do GitHub para revisão de código com Gemini AI é uma boa ideia, mas precisa de alguns ajustes para melhor funcionamento e segurança. O uso de git diff diretamente e a passagem do diff completo para a API Gemini podem causar problemas de performance e segurança, especialmente em PRs grandes.

Comentários:

  • Melhora na obtenção do diff: Ao invés de usar git diff e salvar em um arquivo, utilize a API do GitHub para obter o diff. Isso é mais eficiente e evita problemas com arquivos muito grandes. Você pode usar a actions/github-script@v6 para fazer isso em Javascript, consumindo a API do GitHub diretamente. Exemplo:
- uses: actions/github-script@v6
  id: get_diff
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    result-encoding: string
    script: |
      const { data: diff } = await github.rest.pulls.get({
        owner: context.repo.owner,
        repo: context.repo.repo,
        pull_number: context.issue.number,
        mediaType: { format: 'diff' }
      });
      return `pull_request_diff<<EOF\n${diff}\nEOF`;
  • Segurança - secrets.GITHUB_TOKEN com permissões mínimas: A permissão pull-requests: write é muito ampla. Recomendo restringir para pull-requests: write-comments. Isso garante que a action só possa postar comentários na PR, minimizando o impacto de potenciais vulnerabilidades.
  • Performance - Chunking do Diff (essencial): PRs grandes podem gerar diffs gigantescos, ultrapassando os limites da API do Gemini e causando falhas. Ao invés de passar o diff inteiro, divida-o em chunks menores e chame a API do Gemini para cada chunk. A lógica de divisão deve ser implementada no script Javascript. Assim, você processa a PR em partes, evitando problemas de tamanho. O pull_request_chunk_size já existe, mas não está sendo usado efetivamente porque o diff inteiro é passado de uma vez.
  • Tratamento de erros: Adicione tratamento de erros para lidar com situações como falha na obtenção do diff ou na comunicação com a API do Gemini. Isso tornará a action mais robusta.
  • Nome da action: O nome "Code Review by Gemini AI" pode ser simplificado para algo como "Gemini Code Review".
  • Versão da action rubensflinco/gemini-code-review-action: Considere usar sempre @main ou uma tag específica mais recente, ao invés de fixar em @1.0.5.
  • Remoção do \ No newline at end of file: Esse trecho indica que o arquivo não termina com uma nova linha. Adicione uma nova linha vazia ao final do arquivo.

Sugestão de implementação do chunking (Javascript dentro do actions/github-script):

// ... (código anterior para obter o diff)

const chunkSize = parseInt('${{ inputs.pull_request_chunk_size }}') || 3500;
const chunks = diff.match(new RegExp(`(.|[\r\n]){1,${chunkSize}}`, 'g'));

let combinedReview = '';

for (const chunk of chunks) {
  // Chamada para a API Gemini com o 'chunk' ao invés do 'diff' completo
  // ... (código para chamar a API com o chunk)
  // Append o resultado da API em 'combinedReview'
  combinedReview += resultadoDaAPI;
}

// Postar 'combinedReview' como comentário na PR

Implementando essas mudanças, a sua action ficará mais robusta, segura e eficiente, principalmente para PRs de grande porte.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've reviewed the file .github/workflows/code-review.yml. Here's a summary of my observations and suggestions:

Review Summary:

This workflow sets up automated code review using the Gemini AI model on every pull request. It fetches the diff of the pull request, then uses a third-party action (rubensflinco/gemini-code-review-action@1.0.5) to analyze the diff and create a review.

Issue Comments and Suggestions:

  • File: .github/workflows/code-review.yml
    • Suggestion: Consider adding error handling or retries for the git fetch commands. Network issues can cause these commands to fail, halting the workflow. You could use the continue-on-error: true option or implement a retry mechanism.

    • Suggestion: Although pull_request_chunk_size is set to 3500, it would be beneficial to add a comment explaining why this value was chosen. Is it based on experimentation with the Gemini API's limitations or other factors? This improves maintainability.

    • Suggestion: The current implementation fetches the entire Git history. Depending on the repository's size and history, this could be time-consuming. You might consider using actions/checkout@v3 with fetch-depth: 0 to fetch the entire history only if needed. Otherwise, consider limiting the fetch depth to only the relevant commits. However, this could limit the AI's ability to understand the full context of the changes.

    • Suggestion: It's good practice to pin the version of the actions used. While you are using @1.0.5 for the Gemini code review action, it's generally safer to pin to a specific commit hash rather than a tag. This ensures that the action's behavior doesn't change unexpectedly due to updates to the tag. You should check that the action is properly maintained by the owner.

    • Suggestion: Consider adding a step to check the health and validity of the Gemini API key. This can prevent workflow failures due to invalid or expired keys.

    • Suggestion: It might be useful to add a condition to the workflow to prevent it from running on certain branches (e.g., main or develop). This could avoid unnecessary reviews on merge commits.

    • Suggestion: Instead of writing the diff to a file (diff.txt) and then reading it, you can capture the output of git diff directly into the pull_request_diff variable. This eliminates the need for an intermediate file. Example:

      run: |-
        git fetch origin "${{ env.PULL_REQUEST_HEAD_REF }}"
        git fetch origin "${{ env.PULL_REQUEST_BASE_REF }}"
        git checkout "${{ env.PULL_REQUEST_HEAD_REF }}"
        DIFF=$(git diff "origin/${{ env.PULL_REQUEST_BASE_REF }}")
        echo "pull_request_diff<<EOF" >> $GITHUB_OUTPUT
        echo "$DIFF" >> $GITHUB_OUTPUT
        echo "EOF" >> $GITHUB_OUTPUT
    • Suggestion: Think about the cost implications of using an AI code review tool. High usage might lead to significant API costs. Implement monitoring and cost control measures, such as limiting the number of reviews per day or per repository.

    • Security Consideration: Ensure the GEMINI_API_KEY secret is properly protected and not exposed in the workflow logs. While secrets are masked in logs, it's still crucial to avoid accidental exposure.

    • Security Consideration: This workflow relies on a third-party action. It's important to carefully vet the action's code and author to ensure it doesn't introduce any security vulnerabilities. Consider using actions from trusted sources or auditing the action's code.

    • Suggestion: Consider adding metrics or logging to track the effectiveness of the AI code review. This data can be used to improve the workflow and the AI's recommendations over time.

    • Suggestion: Add a comment explaining why log_level: "DEBUG" is used. Is this for initial setup and debugging, or is it intended for ongoing use? If the latter, consider the volume of logs generated and whether it's necessary for routine operation.

I'm ready for the next file.

@basmasking
Copy link
Member Author

/gemini-review

@basmasking
Copy link
Member Author

/gemini-review

Copy link

LGTM! No issues found

Code meets quality standards

Copy link

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've reviewed the provided diff for the file .github/workflows/code-review.yml. Here's a summary of my findings and suggestions:

Review Summary:

This file introduces a new GitHub Actions workflow named "Gemini Code Review". This workflow automates code reviews on pull requests targeting the main branch using the rubensflinco/gemini-code-review-action. It fetches the diff of the pull request, feeds it to the Gemini AI model (presumably via the action), and then posts the review as a comment on the pull request. The workflow includes error handling, uses secrets for API keys and tokens, and allows for customization of the Gemini model and prompt.

Overall, this looks like a good starting point for automating code reviews. Here are some things to consider improving:

  • Security: While the API key and GitHub token are stored as secrets, ensure that the rubensflinco/gemini-code-review-action handles these secrets securely and does not inadvertently expose them in logs or elsewhere. Consider researching the reputation and security practices of the action provider.
  • Maintainability: The workflow is reasonably well-structured. However, consider adding comments to explain the purpose of each step, especially the more complex ones like fetching the diff.
  • Error Handling: The current workflow doesn't explicitly handle errors. Consider adding error handling to steps like fetching the diff or calling the Gemini API to gracefully handle failures and provide informative error messages.
  • Customization and Flexibility: Expose more parameters as workflow inputs to allow for greater customization. For example, allow users to specify the target branch, the Gemini model, or even the action to take on review findings (e.g., create a task, add a label).

File: .github/workflows/code-review.yml

Issue Comments:

  1. Clarity and Security: Action Security.

           - uses: rubensflinco/gemini-code-review-action@1.0.5
             name: "Code Review by Gemini AI"

    Suggestion: While using pre-built actions is convenient, it's crucial to vet their security. Before relying heavily on rubensflinco/gemini-code-review-action, consider:
    * Examining the action's source code: If possible, review the action's source code to understand how it handles your GEMINI_API_KEY and GITHUB_TOKEN. Look for any potential vulnerabilities such as logging secrets, insecure storage, or transmission of sensitive data.
    * Checking the action's reputation: Research the action's author and the community's feedback. Are there any known security concerns or vulnerabilities?
    * Pinning to a specific commit: Instead of @1.0.5, pin the action to a specific commit SHA. This prevents unexpected behavior changes if the action is updated with malicious code. For example: rubensflinco/gemini-code-review-action@<commit_sha>. You'll need to monitor the action for updates and manually update the commit SHA when you've reviewed and approved the changes.

  2. Improvement: Add Error Handling

             run: |-
               git fetch origin "${{ env.PULL_REQUEST_HEAD_REF }}"
               git fetch origin "${{ env.PULL_REQUEST_BASE_REF }}"
               git checkout "${{ env.PULL_REQUEST_HEAD_REF }}"
               git diff "origin/${{ env.PULL_REQUEST_BASE_REF }}" > "diff.txt"
               {
                 echo "pull_request_diff<<EOF";
                 cat "diff.txt";
                 echo 'EOF';
               } >> $GITHUB_OUTPUT

    Suggestion: The shell commands for fetching the diff lack error handling. If any of these commands fail (e.g., due to network issues or invalid refs), the workflow will continue, potentially with an empty diff. Add error handling using set -e and checking the exit codes of the commands. For example:

    run: |-
      set -e
      git fetch origin "${{ env.PULL_REQUEST_HEAD_REF }}" || exit 1
      git fetch origin "${{ env.PULL_REQUEST_BASE_REF }}" || exit 1
      git checkout "${{ env.PULL_REQUEST_HEAD_REF }}" || exit 1
      git diff "origin/${{ env.PULL_REQUEST_BASE_REF }}" > "diff.txt" || exit 1
      {
        echo "pull_request_diff<<EOF";
        cat "diff.txt";
        echo 'EOF';
      } >> $GITHUB_OUTPUT

    set -e will cause the script to exit immediately if any command fails. The || exit 1 after each Git command will also explicitly exit with a non-zero status code if the command fails.

  3. Clarity: Consider Adding Comments

             run: |-
               git fetch origin "${{ env.PULL_REQUEST_HEAD_REF }}"
               git fetch origin "${{ env.PULL_REQUEST_BASE_REF }}"
               git checkout "${{ env.PULL_REQUEST_HEAD_REF }}"
               git diff "origin/${{ env.PULL_REQUEST_BASE_REF }}" > "diff.txt"
               {
                 echo "pull_request_diff<<EOF";
                 cat "diff.txt";
                 echo 'EOF';
               } >> $GITHUB_OUTPUT

    Suggestion: While the commands are relatively straightforward, add a brief comment explaining the purpose of this section of the workflow (e.g., "Fetch the diff between the pull request head and base branches"). This will make the workflow easier to understand and maintain in the future. Also, explain the purpose of redirecting the diff into a variable.

  4. Security: Review Gemini AI output for Sensitive Information
    The Gemini AI might hallucinate or inadvertently include sensitive information in its review comments (e.g., internal URLs, API keys, or code snippets that should not be public). You should have monitoring in place to detect and redact such information. This could involve a post-processing step that scans the Gemini AI's output for potential secrets or sensitive data before posting the comment to the pull request.

I'm ready for the next file diff.

@basmasking
Copy link
Member Author

The number of gemini ai review actions are limited and the review seems not too good for now. We will build something ourselves if we see the added benefit (which is not at this point in time)

@basmasking basmasking closed this Mar 21, 2025
@basmasking basmasking deleted the 398-ai-code-review-in-ci branch June 13, 2025 11:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AI code review in CI

1 participant