robusta-dev
diff --git a/‎Dockerfile‎
Lines changed: 4 additions & 7 deletions b/‎Dockerfile‎
Lines changed: 4 additions & 7 deletions
diff --git a/‎docs/ai-providers/anthropic.md‎
Lines changed: 21 additions & 0 deletions b/‎docs/ai-providers/anthropic.md‎
Lines changed: 21 additions & 0 deletions
diff --git a/‎docs/assets/Holmes-azure-mcp.gif‎
15.4 MB b/‎docs/assets/Holmes-azure-mcp.gif‎
15.4 MB
diff --git a/‎docs/community.md‎
Lines changed: 10 additions & 18 deletions b/‎docs/community.md‎
Lines changed: 10 additions & 18 deletions
diff --git a/‎docs/data-sources/builtin-toolsets/bash.md‎
Lines changed: 91 additions & 0 deletions b/‎docs/data-sources/builtin-toolsets/bash.md‎
Lines changed: 91 additions & 0 deletions
diff --git a/‎docs/data-sources/builtin-toolsets/coralogix-logs.md‎
Lines changed: 78 additions & 0 deletions b/‎docs/data-sources/builtin-toolsets/coralogix-logs.md‎
Lines changed: 78 additions & 0 deletions
diff --git a/‎docs/installation/python-installation.md‎
Lines changed: 0 additions & 3 deletions b/‎docs/installation/python-installation.md‎
Lines changed: 0 additions & 3 deletions
diff --git a/‎docs/overrides/main.html‎
Lines changed: 2 additions & 2 deletions b/‎docs/overrides/main.html‎
Lines changed: 2 additions & 2 deletions
diff --git a/‎docs/walkthrough/investigating-using-aks-mcp-server.md‎
Lines changed: 71 additions & 0 deletions b/‎docs/walkthrough/investigating-using-aks-mcp-server.md‎
Lines changed: 71 additions & 0 deletions
diff --git a/‎helm/holmes/templates/holmes.yaml‎
Lines changed: 3 additions & 1 deletion b/‎helm/holmes/templates/holmes.yaml‎
Lines changed: 3 additions & 1 deletion
@@ -23,6 +23,8 @@ ENV VIRTUAL_ENV=/app/venv
 ENV PATH="$VIRTUAL_ENV/bin:$PATH"
 
 # Needed for kubectl
+ENV VERIFY_CHECKSUM=true \
+    VERIFY_SIGNATURES=true
 RUN curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key -o Release.key
 
 # Set the architecture-specific kube lineage URLs
@@ -58,12 +60,7 @@ RUN chmod 777 argocd
 RUN ./argocd --help
 
 # Install Helm
-RUN curl https://baltocdn.com/helm/signing.asc | gpg --dearmor -o /usr/share/keyrings/helm.gpg \
-    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" \
-    | tee /etc/apt/sources.list.d/helm-stable-debian.list \
-    && apt-get update \
-    && apt-get install -y helm \
-    && rm -rf /var/lib/apt/lists/*
+RUN curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
 
 # Set up poetry
 ARG PRIVATE_PACKAGE_REGISTRY="none"
@@ -135,7 +132,7 @@ COPY --from=builder /app/argocd /usr/local/bin/argocd
 RUN argocd --help
 
 # Set up Helm
-COPY --from=builder /usr/bin/helm /usr/local/bin/helm
+COPY --from=builder /usr/local/bin/helm /usr/local/bin/helm
 RUN chmod 555 /usr/local/bin/helm
 RUN helm version
 
 
@@ -21,6 +21,27 @@ You can also pass the API key directly as a command-line parameter:
 holmes ask "what pods are failing?" --model="anthropic/<your-claude-model>" --api-key="your-api-key"
 ```
 
+## Prompt Caching
+
+HolmesGPT adds Anthropic's prompt caching feature, which can significantly reduce costs and latency for repeated API calls with similar prompts.
+
+HolmesGPT automatically adds cache control to the last message in each API call. This caches everything from the beginning of the conversation up to that point, making subsequent calls with the same prefix much faster and cheaper.
+
+### How It Works
+
+- Anthropic uses prefix-based caching - it caches the exact sequence of messages up to the cache control point
+- The cache has a 5-minute lifetime by default
+- Cached content must be at least 1024 tokens to be effective
+- You're charged for cache writes on the first call, but subsequent cache hits are much cheaper
+
+### Benefits in HolmesGPT
+
+Prompt caching is particularly effective for HolmesGPT because:
+
+- System prompts with tool definitions are large and static - perfect for caching
+- Tool investigation loops reuse the same context multiple times
+- Multi-step investigations benefit from cached conversation history
+
 ## Additional Resources
 
 HolmesGPT uses the LiteLLM API to support Anthropic provider. Refer to [LiteLLM Anthropic docs](https://litellm.vercel.app/docs/providers/anthropic){:target="_blank"} for more details.
@@ -1,31 +1,23 @@
 # Community
 
-Join us for our regular community meetings to discuss the HolmesGPT roadmap and collaborate on the future of AI-powered troubleshooting.
+Join our community to collaborate on the future of AI-powered troubleshooting.
 
-## Community Meeting
+## Community Meetup Recording
 
-📅 **HolmesGPT Community Meetup**
+📹 **Watch our first HolmesGPT Community Meetup**
 
-**🗓️ Date:** Thursday, August 21, 2025
+We held our inaugural community meetup on August 21, 2025. Watch the recording to learn about:
 
-**📍 Where:** [Google Meet](https://meet.google.com/jxc-ujyf-xwy)
+- HolmesGPT roadmap and upcoming features
+- Community Q&A and feedback
+- Ways to get involved with the project
 
-| Local Date & Time | Time Zone |
-|------------------|-----------|
-| Thursday, Aug 21 · 8:00 - 9:00 AM | PT (Pacific Time) |
-| Thursday, Aug 21 · 11:00 AM - 12:00 PM | ET (Eastern Time) |
-| Thursday, Aug 21 · 8:30 - 9:30 PM | IST (India Standard Time) |
+**[▶️ Watch Recording on YouTube](https://youtu.be/slQRc6nlFQU)**
 
-### Agenda
-- [📋 HolmesGPT Roadmap](https://github.com/orgs/robusta-dev/projects/2) - Review and discuss upcoming features
-- Community feedback and Q&A
-- Ways to get involved
+### Resources
 
-**Links:**
-
-- [🔗 Google Meet](https://meet.google.com/jxc-ujyf-xwy)
 - [📝 Meeting Notes](https://docs.google.com/document/d/1sIHCcTivyzrF5XNvos7ZT_UcxEOqgwfawsTbb9wMJe4/edit?tab=t.0)
-- [📋 Roadmap](https://github.com/orgs/robusta-dev/projects/2)
+- [📋 HolmesGPT Roadmap](https://github.com/orgs/robusta-dev/projects/2)
 
 ## Get Involved
 
 
@@ -0,0 +1,91 @@
+# Bash Toolset
+
+The bash toolset provides secure execution of common command-line tools used for troubleshooting and system analysis. It replaces multiple YAML-based toolsets with a single, comprehensive toolset that includes safety validation and command parsing.
+
+**⚠️ Security Note**: This toolset executes commands on the system where Holmes is running. Only validated, safe commands are allowed, and the toolset is disabled by default for security reasons.
+
+## Supported Commands
+
+The bash toolset supports the following categories of commands:
+
+### Cloud Providers
+
+**AWS CLI (`aws`)**
+
+- Supports various AWS services and operations
+- Commands are validated for safety before execution
+
+**Azure CLI (`az`)**
+
+- Supports Azure operations including AKS management
+- Network and account operations
+
+### Kubernetes Tools
+
+**kubectl**
+
+- Standard Kubernetes operations: get, describe, logs, events
+- Resource management and cluster inspection
+- Live metrics via `kubectl top`
+
+**Helm**
+
+- Helm chart operations
+- Repository management
+- Release inspection
+
+**ArgoCD**
+
+- Application management
+- Deployment status checking
+
+### Container Tools
+
+**Docker**
+
+- Container inspection and management
+- Image operations
+- Basic Docker commands
+
+### Text Processing Utilities
+
+**Data Processing**
+
+- `grep` - Text searching and pattern matching
+- `jq` - JSON processing and querying
+- `sed` - Stream editing and text transformation
+- `awk` - Pattern scanning and text processing
+
+**File Utilities**
+
+- `cut` - Column extraction
+- `sort` - Data sorting
+- `uniq` - Duplicate removal
+- `head` - Show first lines
+- `tail` - Show last lines
+- `wc` - Word, line, and character counting
+
+**Text Transformation**
+
+- `tr` - Character translation and deletion
+- `base64` - Base64 encoding/decoding
+
+### Special Tools
+
+**kubectl_run_image**
+
+Creates temporary debug pods in Kubernetes clusters for diagnostic commands:
+
+- Runs commands in specified container images
+- Automatically cleans up temporary pods
+- Supports custom namespaces and timeouts
+- Useful for network debugging, DNS resolution, and environment inspection
+
+## Command Validation
+
+All commands undergo security validation before execution:
+
+- Only whitelisted commands and options are allowed
+- Dangerous operations are blocked (file writes, system calls, etc.)
+- Commands are parsed and validated for safety
+- Pipe operations between supported commands are allowed
@@ -29,6 +29,84 @@ toolsets:
     enabled: false  # Disable default Kubernetes logging
 ```
 
+## Custom Labels Configuration (Optional)
+
+By default, the Coralogix toolset expects logs to use standard Kubernetes field names. If your Coralogix deployment uses different field names for Kubernetes metadata, you can customize the label mappings.
+
+This is useful when:
+
+- Your log ingestion pipeline uses custom field names
+- You have a non-standard Coralogix setup with different metadata fields
+- Your Kubernetes logs are structured differently in Coralogix
+
+To find the correct field names, examine your logs in the Coralogix UI and identify how pod names, namespaces, log messages, and timestamps are labeled.
+
+### Example with Custom Labels
+
+```yaml-toolset-config
+toolsets:
+  coralogix/logs:
+    enabled: true
+    config:
+      api_key: "<your Coralogix API key>"
+      domain: "eu2.coralogix.com"
+      team_hostname: "your-company-name"
+      labels:
+        namespace: "resource.attributes.k8s.pod.name" # Default
+        pod: "resource.attributes.k8s.namespace.name" # Default
+        log_message: "logRecord.body"                 # Default
+        timestamp: "logRecord.attributes.time"        # Default
+
+  kubernetes/logs:
+    enabled: false  # Disable default Kubernetes logging
+```
+
+**Label Configuration Fields:**
+
+- `namespace`: Field path for Kubernetes namespace name
+- `pod`: Field path for Kubernetes pod name
+- `log_message`: Field path for the actual log message content
+- `timestamp`: Field path for log timestamp
+
+All label fields are optional and will use the defaults shown above if not specified.
+
+## Logs Retrieval Strategy (Optional)
+
+Coralogix stores logs in two tiers with different performance characteristics:
+
+- **Frequent Search**: Fast queries with limited retention
+- **Archive**: Slower queries but longer retention period
+
+You can configure how HolmesGPT retrieves logs using the `logs_retrieval_methodology` setting:
+
+### Available Strategies
+
+- `ARCHIVE_FALLBACK` (default): Try Frequent Search first, fallback to Archive if no results
+- `FREQUENT_SEARCH_ONLY`: Only search Frequent Search tier
+- `ARCHIVE_ONLY`: Only search Archive tier
+- `BOTH_FREQUENT_SEARCH_AND_ARCHIVE`: Search both tiers and merge results
+- `FREQUENT_SEARCH_FALLBACK`: Try Archive first, fallback to Frequent Search if no results
+
+### Example Configuration
+
+```yaml-toolset-config
+toolsets:
+  coralogix/logs:
+    enabled: true
+    config:
+      api_key: "<your Coralogix API key>"
+      domain: "eu2.coralogix.com"
+      team_hostname: "your-company-name"
+      logs_retrieval_methodology: "ARCHIVE_FALLBACK"  # Default
+```
+
+**Recommendations:**
+
+- Use `ARCHIVE_FALLBACK` for most cases (balances speed and coverage)
+- Use `FREQUENT_SEARCH_ONLY` when you know Holmes does not need to access the log archive
+- Use `ARCHIVE_ONLY` if the frequent search logs are always empty
+- Use `BOTH_FREQUENT_SEARCH_AND_ARCHIVE` for comprehensive log coverage (slower)
+
 ## Capabilities
 
 | Tool Name | Description |
 
@@ -48,7 +48,6 @@ messages = build_initial_ask_messages(
     initial_user_prompt=question,
     file_paths=None,
     tool_executor=ai.tool_executor,
-    investigation_id=ai.investigation_id,
     runbooks=config.get_runbook_catalog(),
     system_prompt_additions=None
 )
@@ -130,7 +129,6 @@ def main():
                 initial_user_prompt=question,
                 file_paths=None,
                 tool_executor=ai.tool_executor,
-                investigation_id=ai.investigation_id,
                 runbooks=config.get_runbook_catalog(),
                 system_prompt_additions=None
             )
@@ -224,7 +222,6 @@ def main():
         initial_user_prompt=first_question,
         file_paths=None,
         tool_executor=ai.tool_executor,
-        investigation_id=ai.investigation_id,
         runbooks=config.get_runbook_catalog(),
         system_prompt_additions=None
     )
 
@@ -3,8 +3,8 @@
 {% block announce %}
   <div class="md-banner">
     <div class="md-banner__inner">
-      🎉 Join us on our first HolmesGPT community meeting - August 21, 8AM PT
-      <a href="/community/">Learn more</a>
+      📹 Watch the recording of our first HolmesGPT community meetup
+      <a href="https://youtu.be/slQRc6nlFQU" target="_blank">Watch on YouTube</a>
     </div>
   </div>
 {% endblock %}
@@ -0,0 +1,71 @@
+# Investigating using AKS MCP Server
+
+You can investigate Azure Kubernetes Service issues using HolmesGPT with the AKS MCP (Model Context Protocol) server.
+
+![AKS MCP Integration](../assets/Holmes-azure-mcp.gif)
+
+## Prerequisites
+
+- HolmesGPT CLI installed ([installation guide](../installation/cli-installation.md))
+- An AI provider API key configured ([setup guide](../ai-providers/index.md))
+- Azure CLI installed and authenticated
+- Access to Azure Kubernetes Service clusters
+- [Azure Kubernetes Service](https://marketplace.visualstudio.com/items?itemName=ms-kubernetes-tools.vscode-aks-tools) VS Code extension installed
+
+## Setting Up AKS MCP Server
+
+### Step 1: Setup the MCP Server
+
+- Open VS Code Command Palette (`Ctrl+Shift+P` or `Cmd+Shift+P`)
+- Run: **"AKS: Setup AKS MCP Server"**
+- Follow the setup wizard to configure your Azure credentials and cluster access
+
+### Step 2: Update Configuration for SSE
+   After installation, update your VS Code MCP configuration (`.vscode/mcp.json`) to use SSE transport and start the server
+   ```json
+   {
+     "servers": {
+       "AKS MCP": {
+         "command": "/Users/yourname/.vs-kubernetes/tools/aks-mcp/v0.0.3/aks-mcp",
+         "args": [
+           "--transport",
+           "sse"
+         ]
+       }
+     }
+   }
+   ```
+   **Note:** Change `"stdio"` to `"sse"` in the transport argument.
+
+### Step 3: Configure HolmesGPT
+
+Add this configuration to your HolmesGPT config file (`~/.holmes/config.yaml`):
+
+```yaml
+mcp_servers:
+  aks-mcp:
+    description: "Azure Kubernetes Service(AKS) Model Context Protocol(MCP) server"
+    url: "http://localhost:8000/sse"
+    llm_instructions: "MCP server to get AKS cluster information, retrieve cluster resources and workloads, analyze network policies and VNet configurations, query control plane logs, fetch cluster metrics and health status. Investigate networking issues with NSGs and load balancers, perform kubectl operations, real-time monitoring of DNS, services across Azure Kubernetes environments"
+
+```
+
+## Investigation Examples
+
+Once configured, you can investigate AKS issues using natural language queries:
+
+### Cluster Health Issues
+```bash
+holmes ask "What issues do I have in my AKS cluster?"
+```
+
+### Network Connectivity Problems
+```bash
+holmes ask "My payment deployment can't reach external services investigate why"
+```
+
+## What's Next?
+
+- **[Add more data sources](../data-sources/index.md)** - Combine AKS MCP with other observability tools
+- **[Set up additional MCP servers](../data-sources/remote-mcp-servers.md)** - Integrate multiple specialized MCP servers
+- **[Configure custom toolsets](../data-sources/custom-toolsets.md)** - Create specialized investigation workflows
@@ -6,7 +6,9 @@ metadata:
   labels:
     app: holmes
 spec:
-  replicas: 1
+  {{- if (not .Values.autoscaling.enabled) }}
+  replicas: {{ .Values.replicas }}
+  {{- end }}
   revisionHistoryLimit: {{ .Values.revisionHistoryLimit }}
   selector:
     matchLabels: