docs(site): add AI red teaming for first-timers blog post #5017
📝 Walkthrough
A new blog post titled "AI Red Teaming for complete first-timers" has been added as a Markdown file. The article introduces the concept of AI red teaming, differentiates it from traditional red teaming, and discusses regulatory considerations in the EU, China, and the US. It presents a maturity model for AI red teaming practices, describes the operational feedback loop, and emphasizes building a red teaming culture. The post also references the Promptfoo tool and includes metadata such as title, description, authors, tags, keywords, date, and image.
Estimated code review effort: 1 (~2 minutes)
Actionable comments posted: 4
🧹 Nitpick comments (3)
site/blog/ai-red-teaming-for-first-timers.md (3)
39-39: Drop “in order to” for stronger, shorter copy
-... may be required for the same attack in order to determine the probability spread of results
+... may be required for the same attack to determine the probability distribution of results
87-87: Convert blockquote to Docusaurus admonition
Guidelines prefer admonitions (:::tip, :::info, …) over blockquotes for call-outs.
-> Promptfoo works with enterprise clients to fit tooling into their workflows due to the nature of custom requirements for each project.
+
+:::tip
+
+Promptfoo works with enterprise clients to fit tooling into their workflows due to each project's custom requirements.
+
+:::
93-93: Trim “in an effort to” for concision
-... considering using a tool like Promptfoo in an effort to make your systems more robust, heck yes, go you 🥳!
+... considering using a tool like Promptfoo to make your systems more robust—heck yes, go you! 🥳
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
site/static/img/blog/ai-red-teaming-hero.png is excluded by !**/*.png
📒 Files selected for processing (1)
site/blog/ai-red-teaming-for-first-timers.md (1 hunks)
📓 Path-based instructions (3)
site/blog/**/*.md
📄 CodeRabbit Inference Engine (.cursor/rules/docusaurus.mdc)
site/blog/**/*.md: Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.
Structure content to reveal information progressively: begin with essential actions and information, then provide deeper context as necessary; organize information from most important to least important.
Use action-oriented language: clearly outline actionable steps users should take, use concise and direct language, prefer active voice over passive voice, and use imperative mood for instructions.
Use 'eval' instead of 'evaluation' in all documentation; when referring to command line usage, use 'npx promptfoo eval' rather than 'npx promptfoo evaluation'; maintain consistency with this terminology across all examples, code blocks, and explanations.
The project name can be written as either 'Promptfoo' (capitalized) or 'promptfoo' (lowercase) depending on context: use 'Promptfoo' at the beginning of sentences or in headings, and 'promptfoo' in code examples, terminal commands, or when referring to the package name; be consistent with the chosen capitalization within each document or section.
Each markdown documentation file must include required front matter fields: 'title' (the page title shown in search results and browser tabs) and 'description' (a concise summary of the page content, ideally 150-160 characters).
Only add a title attribute to code blocks that represent complete, runnable files; do not add titles to code fragments, partial examples, or snippets that aren't meant to be used as standalone files; this applies to all code blocks regardless of language.
Use special comment directives to highlight specific lines in code blocks: 'highlight-next-line' highlights the line immediately after the comment, 'highligh...
Files:
site/blog/ai-red-teaming-for-first-timers.md
{site/**,examples/**}
📄 CodeRabbit Inference Engine (.cursor/rules/gh-cli-workflow.mdc)
Any pull request that only touches files in 'site/' or 'examples/' directories must use the 'docs:' prefix in the PR title, not 'feat:' or 'fix:'
Files:
site/blog/ai-red-teaming-for-first-timers.md
site/**
📄 CodeRabbit Inference Engine (.cursor/rules/gh-cli-workflow.mdc)
If the change is a feature, update the relevant documentation under 'site/'
Files:
site/blog/ai-red-teaming-for-first-timers.md
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.591Z
Learning: Applies to site/blog/**/*.md : Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.
site/blog/ai-red-teaming-for-first-timers.md (2)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.591Z
Learning: Applies to site/blog/**/*.md : Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.591Z
Learning: Applies to site/blog/**/*.mdx : Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.
🪛 LanguageTool
site/blog/ai-red-teaming-for-first-timers.md
[style] ~39-~39: Consider a more concise word here.
Context: ...pts may be required for the same attack in order to determine the probability spread of res...
(IN_ORDER_TO_PREMIUM)
[style] ~93-~93: Consider a more concise word here.
Context: ...considering using a tool like Promptfoo in an effort to make your systems more robust, heck yes...
(IN_ORDER_TO_PREMIUM)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: webui tests
- GitHub Check: Test on Node 20.x and macOS-latest
- GitHub Check: Test on Node 20.x and windows-latest
- GitHub Check: Test on Node 18.x and windows-latest
- GitHub Check: Build Docs
- GitHub Check: Test on Node 24.x and ubuntu-latest
- GitHub Check: Redteam
- GitHub Check: Redteam Custom Enterprise Server
- GitHub Check: Test on Node 24.x and macOS-latest
- GitHub Check: Test on Node 22.x and ubuntu-latest
- GitHub Check: Test on Node 18.x and macOS-latest
- GitHub Check: Test on Node 22.x and macOS-latest
- GitHub Check: Build on Node 24.x
- GitHub Check: Test on Node 18.x and ubuntu-latest
- GitHub Check: Test on Node 20.x and ubuntu-latest
- GitHub Check: Share Test
- GitHub Check: Build on Node 20.x
- GitHub Check: Build on Node 22.x
- GitHub Check: Build on Node 18.x
- GitHub Check: calibreapp/image-actions
- GitHub Check: Analyze (javascript-typescript)
---
title: 'AI Red Teaming for complete first-timers'
description: 'A comprehensive guide to AI red teaming for beginners, covering the basics, culture building, and operational feedback loops'
authors: [tabs]
Replace placeholder authors value with a real author slug
authors: [tabs] will break the Docusaurus build unless tabs is a defined author in site/src/data/authors.yml. Swap the placeholder for a valid slug or remove the field until one exists.
-authors: [tabs]
+# authors: [mldangelo] # ← example; update to a real author slug
🤖 Prompt for AI Agents
In site/blog/ai-red-teaming-for-first-timers.md at line 4, the authors field is
set to a placeholder value 'tabs' which is not defined in
site/src/data/authors.yml and will cause the Docusaurus build to fail. Replace
'tabs' with a valid author slug that exists in the authors.yml file or remove
the authors field entirely until a valid author is available.
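The check described in this comment could be automated before the docs build. The following is a minimal sketch, not part of the PR or of Docusaurus: it assumes the front matter uses the inline list form `authors: [a, b]` and that valid slugs are the unindented top-level keys of the authors file. The function names and the sample file contents are hypothetical.

```python
import re
from typing import List, Set


def front_matter_authors(markdown: str) -> List[str]:
    """Return slugs from an inline `authors: [a, b]` front matter field."""
    match = re.match(r"^---\n(.*?)\n---", markdown, flags=re.DOTALL)
    if not match:
        return []
    for line in match.group(1).splitlines():
        # A commented-out field ("# authors: ...") does not match here.
        field = re.match(r"^authors:\s*\[(.*)\]\s*$", line)
        if field:
            return [s.strip() for s in field.group(1).split(",") if s.strip()]
    return []


def defined_author_slugs(authors_yml: str) -> Set[str]:
    """Collect unindented `slug:` keys from the text of an authors file."""
    return set(re.findall(r"^([A-Za-z0-9_-]+):", authors_yml, flags=re.MULTILINE))


def undefined_authors(markdown: str, authors_yml: str) -> List[str]:
    """List front matter author slugs that the authors file does not define."""
    known = defined_author_slugs(authors_yml)
    return [slug for slug in front_matter_authors(markdown) if slug not in known]


if __name__ == "__main__":
    # Illustrative inputs only; real contents of authors.yml are not shown in this PR.
    post = "---\ntitle: 'AI Red Teaming for complete first-timers'\nauthors: [tabs]\n---\n"
    authors_file = "mldangelo:\n  name: Example Author\n"
    print(undefined_authors(post, authors_file))
```

A regex is used instead of a YAML parser only to keep the sketch dependency-free; a real check would more likely parse both files properly.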
oi :')
Red teaming is the process of simulating real-world attacks to identify vulnerabilities.

AI red teaming the process of simulating real-world attacks to identify vulnerabilities of artificial intelligence systems. There are two scopes people often use to refer to AI red teaming:
🛠️ Refactor suggestion
Missing verb ― add “is” and tighten wording
The sentence drops the verb “is” and reads awkwardly.
-AI red teaming the process of simulating real-world attacks to identify vulnerabilities of artificial intelligence systems.
+AI red teaming is the process of simulating real-world attacks to identify vulnerabilities in artificial-intelligence systems.
🤖 Prompt for AI Agents
In site/blog/ai-red-teaming-for-first-timers.md at line 20, the sentence is
missing the verb "is," making it awkward. Add the verb "is" after "AI red
teaming" to complete the sentence and improve clarity.
As the name would suggest, the focus of AI red teaming is going to revolve around AI (duh). The implications of this are:
- A shift from deterministic to non-deterministic results; multiple attempts may be required for the same attack in order to determine the probability spread of results
- There's a significant portion of efforts focused around testing models and their data
- Metrics around toxicity, hallucination, and leaks rise in importance; techniques around eliciting these rise in important similarly
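The first bullet above (repeating the same attack to see how results are distributed) can be illustrated with a short sketch. This is not from the post or from Promptfoo: `run_attack` is a hypothetical stand-in for one attempt against the system under test, simulated here with a seeded random number generator, and the Wilson score interval is just one reasonable way to express uncertainty in the observed success rate.

```python
import math
import random
from typing import Callable, Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def estimate_success_rate(
    run_attack: Callable[[], bool], trials: int
) -> Tuple[float, Tuple[float, float]]:
    """Run the same attack `trials` times and summarize how often it succeeds."""
    successes = sum(1 for _ in range(trials) if run_attack())
    return successes / trials, wilson_interval(successes, trials)


def make_simulated_attack(true_rate: float, seed: int = 0) -> Callable[[], bool]:
    """Hypothetical target: each attempt succeeds with probability `true_rate`."""
    rng = random.Random(seed)
    return lambda: rng.random() < true_rate


if __name__ == "__main__":
    attack = make_simulated_attack(0.3)
    rate, (low, high) = estimate_success_rate(attack, 200)
    print(f"observed rate {rate:.2f}, 95% interval [{low:.2f}, {high:.2f}]")
```

A single pass or fail says little about a non-deterministic system; reporting a rate with an interval makes it clearer how many attempts are needed before a result can be trusted.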
Fix typo: “important” → “importance”
-... techniques around eliciting these rise in important similarly
+... techniques for eliciting these rise in importance similarly
🤖 Prompt for AI Agents
In site/blog/ai-red-teaming-for-first-timers.md at line 41, fix the typo by
replacing the word "important" with "importance" to correctly read "techniques
around eliciting these rise in importance similarly."
| Level | Description | Characteristics | Promptfoo Fit |
| ----------------- | ---------------------------------------------- | ------------------------------------------------- | ---------- |
| **0: No Testing** | No structured evaluation of prompts or outputs | - Risks mostly unobserved<br>- Manual spot-checks | Not in use |
🛠️ Refactor suggestion
Use “eval” instead of “evaluation” per style guide
-| **0: No Testing** | No structured evaluation of prompts or outputs | - Risks mostly unobserved<br>- Manual spot-checks | Not in use |
+| **0: No Testing** | No structured eval of prompts or outputs | - Risks mostly unobserved<br>- Manual spot-checks | Not in use |
🤖 Prompt for AI Agents
In site/blog/ai-red-teaming-for-first-timers.md at line 50, replace the word
"evaluation" with "eval" in the table content to comply with the style guide.
Ensure the change is applied only to the relevant text without altering the rest
of the line or formatting.
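Several comments in this review apply the same kind of wording rule ('eval' over 'evaluation', dropping "in order to" and "in an effort to"). As a rough sketch of how such rules could be checked mechanically, assuming a hypothetical rule table built only from the suggestions in this review:

```python
import re
from typing import List, Tuple

# Hypothetical rule set: phrase pattern -> suggested replacement.
RULES = {
    r"\bin order to\b": "to",
    r"\bin an effort to\b": "to",
    r"\bevaluation\b": "eval",
}


def lint(text: str) -> List[Tuple[int, str, str]]:
    """Return (line number, matched phrase, suggested replacement) tuples."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern, suggestion in RULES.items():
            for match in re.finditer(pattern, line, flags=re.IGNORECASE):
                findings.append((number, match.group(0), suggestion))
    return findings


if __name__ == "__main__":
    sample = (
        "No structured evaluation of prompts or outputs\n"
        "multiple attempts may be required in order to determine the spread\n"
    )
    for number, phrase, suggestion in lint(sample):
        print(f"line {number}: consider '{suggestion}' instead of '{phrase}'")
```

The output is a list of suggestions rather than automatic rewrites, since a blanket replacement can be wrong in context.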
- Comment out placeholder authors field pending valid slug
- Add missing verb 'is' in AI red teaming definition
- Fix typo: 'important' → 'importance'
- Replace 'evaluation' with 'eval' per style guide
- Remove 'in order to' for conciseness
- Convert blockquote to Docusaurus admonition (:::tip)
- Trim 'in an effort to' for brevity
Looks fine, thank you
Would be good to have the date for this one not the same date as the one you wrote - 22nd is fine or some date early next week
This PR adds a new blog post about AI red teaming for beginners.
Changes
The blog post covers: