docs(site): add AI red teaming for first-timers blog post #5017

Merged
merged 7 commits into from
Jul 26, 2025

mldangelo
Member

This PR adds a new blog post about AI red teaming for beginners.

Changes

  • Add new blog post: 'AI Red Teaming for complete first-timers'
  • Add hero image generated with the generate-blog-image.js script
  • Convert HTML article to proper markdown format
  • Include tables from the original markdown source

The blog post covers:

  • Introduction to AI red teaming
  • Comparison with traditional red teaming
  • Evolution stages of red team practice (Levels 0-5)
  • Building a red teaming culture
  • Operational feedback loops
  • Best practices for beginners

Contributor

use-tusk bot commented Jul 22, 2025

⏩ No test execution environment matched (f8180d8)

| Commit | Status | Created (UTC) |
| ------- | ------------------------------------- | -------------------- |
| b331325 | ⏩ No test execution environment matched | Jul 22, 2025 2:10PM |
| 5472eda | ⏩ No test execution environment matched | Jul 22, 2025 2:12PM |
| 3f8bab9 | ⏩ No test execution environment matched | Jul 22, 2025 2:13PM |
| 5809543 | ⏩ No test execution environment matched | Jul 22, 2025 2:24PM |
| df66578 | ⏩ No test execution environment matched | Jul 24, 2025 5:20AM |
| f8180d8 | ⏩ No test execution environment matched | Jul 24, 2025 12:54PM |

Contributor

gru-agent bot commented Jul 22, 2025

TestGru Assignment

Summary

| CommitId | Status | Reason |
| -------- | ---------- | ------ |
| b331325 | 🚫 Skipped | No files need to be tested {"site/blog/ai-red-teaming-for-first-timers.md":"File path does not match include patterns.","site/static/img/blog/ai-red-teaming-hero.png":"File path does not match include patterns."} |

Contributor

coderabbitai bot commented Jul 22, 2025

Walkthrough

A new blog post titled "AI Red Teaming for complete first-timers" has been added as a Markdown file. The article introduces the concept of AI red teaming, differentiates it from traditional red teaming, and discusses regulatory considerations in the EU, China, and the US. It presents a maturity model for AI red teaming practices, describes the operational feedback loop, and emphasizes building a red teaming culture. The post also references the Promptfoo tool and includes metadata such as title, description, authors, tags, keywords, date, and image.

Estimated code review effort

1 (~2 minutes)


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (3)
site/blog/ai-red-teaming-for-first-timers.md (3)

39-39: Drop “in order to” for stronger, shorter copy

-... may be required for the same attack in order to determine the probability spread of results
+... may be required for the same attack to determine the probability distribution of results

87-87: Convert blockquote to Docusaurus admonition

Guidelines prefer admonitions (:::tip, :::info, …) over blockquotes for call-outs.

-> Promptfoo works with enterprise clients to fit tooling into their workflows due to the nature of custom requirements for each project.
+
+:::tip
+
+Promptfoo works with enterprise clients to fit tooling into their workflows due to each project's custom requirements.
+
+:::

93-93: Trim “in an effort to” for concision

-... considering using a tool like Promptfoo in an effort to make your systems more robust, heck yes, go you 🥳!
+... considering using a tool like Promptfoo to make your systems more robust—heck yes, go you! 🥳
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 820330a and b331325.

⛔ Files ignored due to path filters (1)
  • site/static/img/blog/ai-red-teaming-hero.png is excluded by !**/*.png
📒 Files selected for processing (1)
  • site/blog/ai-red-teaming-for-first-timers.md (1 hunks)
📓 Path-based instructions (3)
site/blog/**/*.md

📄 CodeRabbit Inference Engine (.cursor/rules/docusaurus.mdc)

site/blog/**/*.md: Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.
Structure content to reveal information progressively: begin with essential actions and information, then provide deeper context as necessary; organize information from most important to least important.
Use action-oriented language: clearly outline actionable steps users should take, use concise and direct language, prefer active voice over passive voice, and use imperative mood for instructions.
Use 'eval' instead of 'evaluation' in all documentation; when referring to command line usage, use 'npx promptfoo eval' rather than 'npx promptfoo evaluation'; maintain consistency with this terminology across all examples, code blocks, and explanations.
The project name can be written as either 'Promptfoo' (capitalized) or 'promptfoo' (lowercase) depending on context: use 'Promptfoo' at the beginning of sentences or in headings, and 'promptfoo' in code examples, terminal commands, or when referring to the package name; be consistent with the chosen capitalization within each document or section.
Each markdown documentation file must include required front matter fields: 'title' (the page title shown in search results and browser tabs) and 'description' (a concise summary of the page content, ideally 150-160 characters).
Only add a title attribute to code blocks that represent complete, runnable files; do not add titles to code fragments, partial examples, or snippets that aren't meant to be used as standalone files; this applies to all code blocks regardless of language.
Use special comment directives to highlight specific lines in code blocks: 'highlight-next-line' highlights the line immediately after the comment, 'highligh...

Files:

  • site/blog/ai-red-teaming-for-first-timers.md
{site/**,examples/**}

📄 CodeRabbit Inference Engine (.cursor/rules/gh-cli-workflow.mdc)

Any pull request that only touches files in 'site/' or 'examples/' directories must use the 'docs:' prefix in the PR title, not 'feat:' or 'fix:'

Files:

  • site/blog/ai-red-teaming-for-first-timers.md
site/**

📄 CodeRabbit Inference Engine (.cursor/rules/gh-cli-workflow.mdc)

If the change is a feature, update the relevant documentation under 'site/'

Files:

  • site/blog/ai-red-teaming-for-first-timers.md
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.591Z
Learning: Applies to site/blog/**/*.md : Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.
site/blog/ai-red-teaming-for-first-timers.md (2)

Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.591Z
Learning: Applies to site/blog/**/*.md : Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.

Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.591Z
Learning: Applies to site/blog/**/*.mdx : Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.

🪛 LanguageTool
site/blog/ai-red-teaming-for-first-timers.md

[style] ~39-~39: Consider a more concise word here.
Context: ...pts may be required for the same attack in order to determine the probability spread of res...

(IN_ORDER_TO_PREMIUM)


[style] ~93-~93: Consider a more concise word here.
Context: ...considering using a tool like Promptfoo in an effort to make your systems more robust, heck yes...

(IN_ORDER_TO_PREMIUM)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: webui tests
  • GitHub Check: Test on Node 20.x and macOS-latest
  • GitHub Check: Test on Node 20.x and windows-latest
  • GitHub Check: Test on Node 18.x and windows-latest
  • GitHub Check: Build Docs
  • GitHub Check: Test on Node 24.x and ubuntu-latest
  • GitHub Check: Redteam
  • GitHub Check: Redteam Custom Enterprise Server
  • GitHub Check: Test on Node 24.x and macOS-latest
  • GitHub Check: Test on Node 22.x and ubuntu-latest
  • GitHub Check: Test on Node 18.x and macOS-latest
  • GitHub Check: Test on Node 22.x and macOS-latest
  • GitHub Check: Build on Node 24.x
  • GitHub Check: Test on Node 18.x and ubuntu-latest
  • GitHub Check: Test on Node 20.x and ubuntu-latest
  • GitHub Check: Share Test
  • GitHub Check: Build on Node 20.x
  • GitHub Check: Build on Node 22.x
  • GitHub Check: Build on Node 18.x
  • GitHub Check: calibreapp/image-actions
  • GitHub Check: Analyze (javascript-typescript)

---
title: 'AI Red Teaming for complete first-timers'
description: 'A comprehensive guide to AI red teaming for beginners, covering the basics, culture building, and operational feedback loops'
authors: [tabs]
⚠️ Potential issue

Replace placeholder authors value with a real author slug

authors: [tabs] will break the Docusaurus build unless tabs is a defined author in site/src/data/authors.yml. Swap the placeholder for a valid slug or remove the field until one exists.

-authors: [tabs]
+# authors: [mldangelo]   # ← example; update to a real author slug
🤖 Prompt for AI Agents
In site/blog/ai-red-teaming-for-first-timers.md at line 4, the authors field is
set to a placeholder value 'tabs' which is not defined in
site/src/data/authors.yml and will cause the Docusaurus build to fail. Replace
'tabs' with a valid author slug that exists in the authors.yml file or remove
the authors field entirely until a valid author is available.
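
A check like the one described above could be sketched as follows. This is only an illustration, not part of the PR: the `KNOWN_AUTHORS` set is a hypothetical stand-in for the slugs defined in site/src/data/authors.yml, and the parser assumes the inline `authors: [a, b]` form shown in the diff rather than full YAML.

```python
import re

# Hypothetical stand-in for the slugs defined in site/src/data/authors.yml.
KNOWN_AUTHORS = {"example-author"}


def front_matter_authors(markdown: str) -> list[str]:
    """Return slugs from an inline `authors: [a, b]` front matter line."""
    # Front matter is the block between the first two `---` delimiters.
    parts = markdown.split("---", 2)
    if len(parts) < 3:
        return []
    match = re.search(r"^authors:\s*\[(.*?)\]\s*$", parts[1], flags=re.MULTILINE)
    if not match:
        return []
    return [slug.strip() for slug in match.group(1).split(",") if slug.strip()]


def unknown_authors(markdown: str, known: set[str]) -> list[str]:
    """Return the slugs in the front matter that are not in `known`."""
    return [slug for slug in front_matter_authors(markdown) if slug not in known]


post = """---
title: 'AI Red Teaming for complete first-timers'
authors: [tabs]
---
Body text.
"""

# Any slug reported here would need to be defined before the build runs.
print(unknown_authors(post, KNOWN_AUTHORS))
```

If `tabs` is in fact a defined author, the check returns an empty list and the comment above does not apply.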

oi :')


Red teaming is the process of simulating real-world attacks to identify vulnerabilities.

AI red teaming the process of simulating real-world attacks to identify vulnerabilities of artificial intelligence systems. There are two scopes people often use to refer to AI red teaming:
🛠️ Refactor suggestion

Missing verb ― add “is” and tighten wording

The sentence drops the verb “is” and reads awkwardly.

-AI red teaming the process of simulating real-world attacks to identify vulnerabilities of artificial intelligence systems.
+AI red teaming is the process of simulating real-world attacks to identify vulnerabilities in artificial-intelligence systems.
🤖 Prompt for AI Agents
In site/blog/ai-red-teaming-for-first-timers.md at line 20, the sentence is
missing the verb "is," making it awkward. Add the verb "is" after "AI red
teaming" to complete the sentence and improve clarity.

As the name would suggest, the focus of AI red teaming is going to revolve around AI (duh). The implications of this are:
- A shift from deterministic to non-deterministic results; multiple attempts may be required for the same attack in order to determine the probability spread of results
- There's a significant portion of efforts focused around testing models and their data
- Metrics around toxicity, hallucination, and leaks rise in importance; techniques around eliciting these rise in important similarly
⚠️ Potential issue

Fix typo: “important” → “importance”

-... techniques around eliciting these rise in important similarly
+... techniques for eliciting these rise in importance similarly
🤖 Prompt for AI Agents
In site/blog/ai-red-teaming-for-first-timers.md at line 41, fix the typo by
replacing the word "important" with "importance" to correctly read "techniques
around eliciting these rise in importance similarly."
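
The first bullet in the excerpt under review says that the same attack may need several attempts because results are non-deterministic. A minimal sketch of that idea, assuming a simulated attack in place of a real model call (`make_simulated_attack` and its 0.3 success probability are hypothetical and exist only for illustration):

```python
import random
from typing import Callable


def estimate_success_rate(attack: Callable[[], bool], attempts: int) -> float:
    """Run the same attack `attempts` times and return the fraction that succeeded."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    successes = sum(1 for _ in range(attempts) if attack())
    return successes / attempts


def make_simulated_attack(success_probability: float, seed: int) -> Callable[[], bool]:
    """Build a fake attack that succeeds at random with the given probability."""
    rng = random.Random(seed)
    return lambda: rng.random() < success_probability


# A real harness would send a prompt to the target and grade the response here.
attack = make_simulated_attack(0.3, seed=0)
rate = estimate_success_rate(attack, attempts=1000)
print(f"observed success rate: {rate:.2f}")
```

A single attempt would report either 0 or 1 for this attack; repeating it is what exposes the underlying rate.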


| Level | Description | Characteristics | Promptfoo Fit |
| -------------------------------------- | --------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
| **0: No Testing** | No structured evaluation of prompts or outputs | - Risks mostly unobserved<br>- Manual spot-checks | Not in use |
🛠️ Refactor suggestion

Use “eval” instead of “evaluation” per style guide

-| **0: No Testing**                      | No structured evaluation of prompts or outputs                        | - Risks mostly unobserved<br>- Manual spot-checks                                                                                                               | Not in use                                                                |
+| **0: No Testing**                      | No structured eval of prompts or outputs                              | - Risks mostly unobserved<br>- Manual spot-checks                                                                                                              | Not in use                                                                |
🤖 Prompt for AI Agents
In site/blog/ai-red-teaming-for-first-timers.md at line 50, replace the word
"evaluation" with "eval" in the table content to comply with the style guide.
Ensure the change is applied only to the relevant text without altering the rest
of the line or formatting.
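
The terminology rule behind this suggestion could be checked mechanically. The sketch below is an assumption about how such a check might look, not an existing project script; `find_evaluation_usages` is a hypothetical helper that flags lines containing "evaluation" so a reviewer can decide whether "eval" fits.

```python
import re

# Matches "evaluation" or "evaluations" as whole words, but not "eval".
EVALUATION = re.compile(r"\bevaluations?\b", flags=re.IGNORECASE)


def find_evaluation_usages(markdown: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that use 'evaluation'."""
    return [
        (number, line)
        for number, line in enumerate(markdown.splitlines(), start=1)
        if EVALUATION.search(line)
    ]


sample = (
    "| **0: No Testing** | No structured evaluation of prompts or outputs |\n"
    "Run npx promptfoo eval to start.\n"
)

for number, line in find_evaluation_usages(sample):
    print(f"line {number}: {line}")
```

The check only reports matches; it does not rewrite them, since some uses of the longer word may be intentional.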

- Comment out placeholder authors field pending valid slug
- Add missing verb 'is' in AI red teaming definition
- Fix typo: 'important' → 'importance'
- Replace 'evaluation' with 'eval' per style guide
- Remove 'in order to' for conciseness
- Convert blockquote to Docusaurus admonition (:::tip)
- Trim 'in an effort to' for brevity
Contributor

@ladyofcode ladyofcode left a comment

Looks fine, thank you

Would be good to have the date for this one not the same date as the one you wrote - 22nd is fine or some date early next week

@mldangelo mldangelo merged commit edb5f7f into main Jul 26, 2025
37 checks passed
@mldangelo mldangelo deleted the docs/ai-red-teaming-blog-post branch July 26, 2025 03:52