
Commit 58e1c44

Author: Bob Strahan
Merge branch 'develop' v0.3.10
2 parents 245fa3f + 95bfd06 · commit 58e1c44

File tree: 95 files changed (+9654 / −36 lines)


CHANGELOG.md

Lines changed: 21 additions & 0 deletions
```diff
@@ -5,6 +5,27 @@ SPDX-License-Identifier: MIT-0
 
 ## [Unreleased]
 
+## [0.3.10]
+
+### Added
+
+- **Agent Analysis Feature for Natural Language Document Analytics**
+  - Added integrated AI-powered analytics agent that enables natural language querying of processed document data
+  - **Key Capabilities**: Convert natural language questions to SQL queries, generate interactive visualizations and tables, explore database schema automatically
+  - **Secure Architecture**: All Python code execution happens in isolated AWS Bedrock AgentCore sandboxes, not in Lambda functions
+  - **Multi-Tool Agent System**: Database discovery tool for schema exploration, Athena query tool for SQL execution, secure code sandbox for data transfer, Python visualization tool for charts and tables
+  - **Example Use Cases**: Query document processing volumes and trends, analyze confidence scores and extraction accuracy, explore document classifications and content patterns, generate custom charts and data tables
+  - **Sample W2 Test Data**: Includes 20 synthetic W2 tax documents for testing analytics capabilities
+  - **Configurable Models**: Supports multiple AI models including Claude 3.7 Sonnet (default), Claude 3.5 Sonnet, Nova Pro/Lite, and Haiku
+  - **Web UI Integration**: Accessible through "Document Analytics" section with real-time progress display and query history
+
+- **Automatic Glue Table Creation for Document Sections**
+  - Added automatic creation of AWS Glue tables for each document section type (classification) during processing
+  - Tables are created dynamically when new section types are encountered, eliminating manual table creation
+  - Consistent lowercase naming convention for tables ensures compatibility with case-sensitive S3 paths
+  - Tables are configured with partition projection for efficient date-based queries without manual partition management
+  - Automatic schema evolution - tables update when new fields are detected in extraction results
+
 ## [0.3.9]
 
 ### Added
```

README.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -122,6 +122,7 @@ For detailed deployment and testing instructions, see the [Deployment Guide](./d
 - [Architecture](./docs/architecture.md) - Detailed component architecture and data flow
 - [Deployment](./docs/deployment.md) - Build, publish, deploy, and test instructions
 - [Web UI](./docs/web-ui.md) - Web interface features and usage
+- [Agent Analysis](./docs/agent-analysis.md) - Natural language analytics and data visualization feature
 - [Configuration](./docs/configuration.md) - Configuration and customization options
 - [Classification](./docs/classification.md) - Customizing document classification
 - [Extraction](./docs/extraction.md) - Customizing information extraction
```

VERSION

Lines changed: 1 addition & 1 deletion
```diff
@@ -1 +1 @@
-0.3.9
+0.3.10
```

docs/README.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -10,6 +10,7 @@ This folder contains detailed documentation on various aspects of the GenAI Inte
 - [Architecture](./architecture.md) - Detailed component architecture and data flow
 - [Deployment](./deployment.md) - Build, publish, deploy, and test instructions
 - [Web UI](./web-ui.md) - Web interface features and usage
+- [Agent Analysis](./agent-analysis.md) - Natural language analytics and data visualization feature
 - [Knowledge Base](./knowledge-base.md) - Document knowledge base query feature
 - [Evaluation Framework](./evaluation.md) - Accuracy assessment system
 - [Assessment Feature](./assessment.md) - Extraction confidence evaluation using LLMs
```

docs/agent-analysis.md

Lines changed: 301 additions & 0 deletions
@@ -0,0 +1,301 @@
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: MIT-0

# Agent Analysis Feature

The GenAIIDP solution includes an integrated Agent Analysis feature that enables you to interactively query and analyze your processed document data using natural language. This feature leverages AI agents to convert natural language questions into SQL queries, execute them against your document analytics database, and generate visualizations or tables to answer your questions.

## Overview

The Agent Analysis feature provides intelligent data exploration capabilities that allow users to:

- **Natural Language Querying**: Ask questions about your document data in plain English
- **Automated SQL Generation**: AI agents convert your questions into optimized SQL queries
- **Interactive Visualizations**: Generate charts, graphs, and tables from query results
- **Real-time Analysis**: Get insights from your processed documents without manual data analysis
- **Secure Code Execution**: Python visualization code runs in isolated AWS Bedrock AgentCore sandboxes

https://github.com/user-attachments/assets/e2dea2c5-5eb1-42f6-9af5-469afd2135a7

## Key Features

- **Multi-Modal AI Agent**: Uses advanced language models (Claude 3.7 Sonnet by default) for intelligent query understanding
- **Secure Architecture**: All code execution happens in AWS Bedrock AgentCore sandboxes, not in Lambda functions
- **Database Schema Discovery**: Agents automatically explore and understand your database structure
- **Flexible Visualization**: Supports multiple chart types including bar charts, line charts, pie charts, and data tables
- **Query History**: Track and manage previous analytics queries through the web interface
- **Real-time Progress**: Live display of agent thought processes and SQL query execution
- **Error Handling**: Intelligent retry logic for failed queries with automatic corrections

## Architecture

### Agent Workflow

1. **Question Processing**: User submits a natural language question through the web UI
2. **Database Discovery**: Agent explores the database schema using the `get_database_info` tool
3. **SQL Generation**: Agent converts the question into optimized SQL queries with proper column quoting
4. **Query Execution**: SQL queries are executed against Amazon Athena with results stored in S3
5. **Data Processing**: Query results are securely transferred to the AWS Bedrock AgentCore sandbox
6. **Visualization Generation**: Python code generates charts or tables from the data
7. **Result Display**: Final visualizations are displayed in the web interface
### Security Architecture

The Agent Analysis feature implements a security-first design:

- **Sandboxed Execution**: All Python code runs in AWS Bedrock AgentCore, completely isolated from the rest of the AWS environment and the internet
- **Secure Data Transfer**: Query results are transferred via S3 and AgentCore APIs, never through direct file system access
- **Session Management**: Code interpreter sessions are properly managed and cleaned up after use
- **Minimal Permissions**: Each component requests only the necessary AWS permissions
- **Audit Trail**: Comprehensive logging and monitoring for security reviews

### Data Flow

```
User Question → Analytics Request Handler → Analytics Processor → Agent Tools:
                                                ├── Database Info Tool
                                                ├── Athena Query Tool
                                                ├── Code Sandbox Tool
                                                └── Python Execution Tool

Results ← Web UI ← AppSync Subscription ← DynamoDB ← Agent Response
```
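The DynamoDB table in this flow is what links the asynchronous agent run back to the browser: the processor writes status updates and the final result to the table, and the web UI receives them through an AppSync subscription. A minimal sketch of that status write is shown below; the key names and attribute layout are illustrative assumptions, not the solution's actual schema.

```python
import json
import os
import time
from typing import Optional

import boto3

# Table name comes from the ANALYTICS_TABLE setting documented under Configuration;
# "analytics-jobs" is only an illustrative fallback.
table = boto3.resource("dynamodb").Table(os.environ.get("ANALYTICS_TABLE", "analytics-jobs"))

def record_job_update(job_id: str, status: str, result: Optional[dict] = None) -> None:
    """Write a job-status item that the AppSync subscription can relay to the web UI."""
    item = {
        "jobId": job_id,               # partition key (assumed)
        "updatedAt": int(time.time()),
        "status": status,              # e.g. PROCESSING, COMPLETED, FAILED
    }
    if result is not None:
        item["result"] = json.dumps(result)  # final chart/table payload as JSON text
    table.put_item(Item=item)

record_job_update("job-123", "PROCESSING")
```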
## Available Tools

The analytics agent has access to four specialized tools:

### 1. Database Information Tool
- **Purpose**: Discovers database schema and table structures
- **Usage**: Automatically called to understand available tables and columns
- **Output**: Table names, column definitions, and data types
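This kind of schema discovery maps naturally onto the AWS Glue Data Catalog, where the reporting tables are registered. The sketch below shows roughly what such a lookup looks like with boto3; it is not the tool's actual implementation, and the fallback database name `idp_reporting` is an assumption (the real name comes from the `ATHENA_DATABASE` setting described under Configuration).

```python
import os

import boto3

glue = boto3.client("glue")

def describe_database(database: str) -> dict:
    """Return {table_name: [(column, type), ...]} for every table in a Glue database."""
    schema = {}
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page["TableList"]:
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            schema[table["Name"]] = [(c["Name"], c["Type"]) for c in columns]
    return schema

print(describe_database(os.environ.get("ATHENA_DATABASE", "idp_reporting")))
```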
### 2. Athena Query Tool
- **Purpose**: Executes SQL queries against the analytics database
- **Features**:
  - Automatic column name quoting for Athena compatibility
  - Query result storage in S3
  - Error handling and retry logic
  - Support for both exploratory and final queries
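The Athena side can be pictured with the standard boto3 calls: start the query, poll until it reaches a terminal state, then read the result set (which Athena also writes as a CSV to the S3 output location). A simplified sketch follows, assuming the `ATHENA_DATABASE` and `ATHENA_OUTPUT_LOCATION` environment variables described under Configuration; the actual tool layers column quoting and retry logic on top of this.

```python
import os
import time

import boto3

athena = boto3.client("athena")

def run_query(sql: str) -> list:
    """Execute a SQL statement in Athena and return the rows as dictionaries."""
    execution = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": os.environ["ATHENA_DATABASE"]},
        ResultConfiguration={"OutputLocation": os.environ["ATHENA_OUTPUT_LOCATION"]},
    )
    query_id = execution["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # The first row of the result set is the header row.
    results = athena.get_query_results(QueryExecutionId=query_id)
    rows = [[col.get("VarCharValue") for col in row["Data"]] for row in results["ResultSet"]["Rows"]]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, values)) for values in data]

# Athena requires double quotes around mixed-case or reserved column names,
# e.g. SELECT "documentType", COUNT(*) ... -- the tool adds this quoting automatically.
```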
### 3. Code Sandbox Tool
- **Purpose**: Securely transfers query results to AgentCore sandbox
- **Security**: Isolated environment with no Lambda file system access
- **Data Format**: CSV files containing query results
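Conceptually the transfer has two halves: download the Athena result file from S3, then push the bytes into the AgentCore code-interpreter session. The S3 half is sketched below with standard boto3; `write_file_to_sandbox` is a deliberately unimplemented placeholder (the exact AgentCore session calls live in the solution's tool code), and the bucket and key shown are made-up examples.

```python
import boto3

s3 = boto3.client("s3")

def fetch_query_results(bucket: str, key: str) -> bytes:
    """Download the CSV that Athena wrote for a completed query."""
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()

def write_file_to_sandbox(filename: str, content: bytes) -> None:
    """Placeholder: the real tool hands the file to the isolated AWS Bedrock
    AgentCore code-interpreter session, never to the Lambda file system."""
    raise NotImplementedError("illustrative stub only")

csv_bytes = fetch_query_results("example-athena-results-bucket", "analytics/query-123.csv")
# write_file_to_sandbox("results.csv", csv_bytes)
```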
### 4. Python Execution Tool
- **Purpose**: Generates visualizations and tables from query data
- **Libraries**: Pandas, Matplotlib, and other standard Python libraries
- **Output**: JSON-formatted charts and tables for web display
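Inside the sandbox the generated code is ordinary pandas/Matplotlib. The toy example below shows the kind of code the agent might emit, turning a results CSV into a base64-encoded chart plus tabular rows; the column names and the exact JSON structure the web UI expects are illustrative assumptions.

```python
import base64
import io
import json

import matplotlib
matplotlib.use("Agg")  # headless rendering inside the sandbox
import matplotlib.pyplot as plt
import pandas as pd

# Load the query results that were transferred into the sandbox.
df = pd.read_csv("results.csv")  # assumed columns: document_type, document_count

# Render a simple bar chart.
fig, ax = plt.subplots()
ax.bar(df["document_type"], df["document_count"])
ax.set_xlabel("Document type")
ax.set_ylabel("Documents processed")
ax.set_title("Processed documents by type")

# Encode the figure so it can be returned as JSON to the web UI.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
chart_payload = {
    "type": "chart",
    "title": "Processed documents by type",
    "image_png_base64": base64.b64encode(buffer.getvalue()).decode("ascii"),
    "table": df.to_dict(orient="records"),  # also include the raw rows as a table
}
print(json.dumps(chart_payload)[:200])  # truncated preview of the JSON output
```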
## Using Agent Analysis

### Accessing the Feature

1. Log in to the GenAIIDP Web UI
2. Navigate to the "Document Analytics" section in the main navigation
3. You'll see a chat-like interface for querying your document data

### Asking Questions

The agent can answer various types of questions about your processed documents:

**Document Volume Questions:**
- "How many documents were processed last month?"
- "What's the trend in document processing over time?"
- "Which document types are most common?"

**Processing Performance Questions:**
- "What's the average processing time by document type?"
- "Which documents failed processing and why?"
- "Show me processing success rates by day"

**Content Analysis Questions:**
- "What are the most common vendor names in invoices?"
- "Show me the distribution of invoice amounts"
- "Which documents have the highest confidence scores?"

**Comparative Analysis Questions:**
- "How do confidence scores vary by document type?"
- "What's the relationship between document size and processing time?"

### Sample Queries

Here are some example questions you can ask:

```
"Show me a chart of document processing volume by day for the last 30 days"

"What are the top 10 most common document classifications?"

"Create a table showing average confidence scores by document type"

"Plot the relationship between document page count and processing time"

"Which extraction fields have the lowest average confidence scores?"
```

### Understanding Results

The agent can return three types of results:

1. **Charts/Plots**: Visual representations of data trends and patterns
2. **Tables**: Structured data displays for detailed information
3. **Text Responses**: Direct answers to simple questions

Each result includes:
- The original question
- SQL queries that were executed
- The final visualization or answer
- Agent reasoning and thought process

## Testing with Sample Data

The solution includes sample W2 tax documents for testing the analytics feature:

### Sample Documents Location
- **Path**: `/samples/w2/`
- **Files**: 20 sample W2 documents (W2_XL_input_clean_1000.pdf through W2_XL_input_clean_1019.pdf)
- **Purpose**: Realistic test data for exploring analytics capabilities
- **Source**: Sample W2 documents are from [this kaggle dataset](https://www.kaggle.com/datasets/mcvishnu1/fake-w2-us-tax-form-dataset) and are 100% synthetic with a [CC0 1.0 public domain license](https://creativecommons.org/publicdomain/zero/1.0/).

### Testing Steps

1. **Upload Sample Documents**:
   - Use the Web UI to upload documents from the `/samples/w2/` folder
   - Or copy them directly to the S3 input bucket

2. **Wait for Processing**:
   - Monitor document processing through the Web UI dashboard
   - Ensure all documents complete successfully

3. **Try Sample Queries**:
   ```
   "How many W2 documents have been processed?"

   "Make a bar chart histogram of total earnings in all W2s with bins $25000 wide"

   "What employee from the state of California paid the most tax?"

   "What is the ratio of state tax paid to federal tax paid for the following states: Vermont, Nevada, Indiana, and Oregon?"
   ```

## Configuration

The Agent Analysis feature is configured through CloudFormation parameters:

### Model Selection
```yaml
DocumentAnalysisAgentModelId:
  Type: String
  Default: "us.anthropic.claude-3-7-sonnet-20250219-v1:0"
  Description: Model to use for Document Analysis Agent (analytics queries)
```

**Supported Models:**
- `us.anthropic.claude-3-7-sonnet-20250219-v1:0` (Default - Recommended)
- `us.anthropic.claude-3-5-sonnet-20241022-v2:0`
- `us.anthropic.claude-3-haiku-20240307-v1:0`
- `us.amazon.nova-pro-v1:0`
- `us.amazon.nova-lite-v1:0`
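Whichever model you select must be enabled for your account in Amazon Bedrock (see Troubleshooting below). A quick way to sanity-check access to the configured model is a one-off call to the Bedrock Converse API; a minimal sketch, assuming your AWS credentials and region are already configured:

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

response = bedrock.converse(
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # the stack's default
    messages=[{"role": "user", "content": [{"text": "Reply with the single word: ok"}]}],
    inferenceConfig={"maxTokens": 10},
)
print(response["output"]["message"]["content"][0]["text"])
```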
### Infrastructure Components

The feature automatically creates:
- **DynamoDB Table**: Tracks analytics job status and results
- **Lambda Functions**: Request handler and processor functions
- **AppSync Resolvers**: GraphQL API endpoints for web UI integration
- **IAM Roles**: Minimal permissions for secure operation

### Environment Variables

Key configuration settings:
- `ANALYTICS_TABLE`: DynamoDB table for job tracking
- `ATHENA_DATABASE`: Database containing processed document data
- `ATHENA_OUTPUT_LOCATION`: S3 location for query results
- `DOCUMENT_ANALYSIS_AGENT_MODEL_ID`: AI model for agent processing
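These are ordinary Lambda environment variables, so the functions read them with `os.environ`. A minimal sketch (variable names as documented above; the fallback model ID is assumed to match the stack default):

```python
import os

# Required settings -- fail fast with a KeyError if the function is misconfigured.
ANALYTICS_TABLE = os.environ["ANALYTICS_TABLE"]
ATHENA_DATABASE = os.environ["ATHENA_DATABASE"]
ATHENA_OUTPUT_LOCATION = os.environ["ATHENA_OUTPUT_LOCATION"]

# Optional override, falling back to the stack's default model ID (assumed).
MODEL_ID = os.environ.get(
    "DOCUMENT_ANALYSIS_AGENT_MODEL_ID",
    "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
)
```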
## Best Practices

### Query Optimization

1. **Start Broad**: Begin with general questions before diving into specifics
2. **Be Specific**: Clearly state what information you're looking for
3. **Use Follow-ups**: Build on what you learned in previous questions to explore topics in depth (note: each question is independent; there is no actual conversation history)
4. **Check Results**: Verify visualizations make sense for your data

### Security Best Practices

1. **Data Access**: Only authenticated users can access analytics features
2. **Query Isolation**: Each user's queries are isolated and tracked separately
3. **Audit Logging**: All queries and results are logged for security reviews
4. **Sandbox Security**: Python code execution is completely isolated from system resources

## Troubleshooting

### Common Issues

**Agent Not Responding:**
- Check CloudWatch logs for the Analytics Processor Lambda function
- Verify Bedrock model access is enabled for your selected model
- Ensure sufficient Lambda timeout (15 minutes) for complex queries

**SQL Query Errors:**
- The agent automatically retries failed queries up to 5 times
- Check that column names are properly quoted in generated SQL
- Verify database permissions for Athena access

**Visualization Errors:**
- Check that query results contain expected data types
- Verify Python code generation in the AgentCore sandbox
- Review agent messages for detailed error information

**Performance Issues:**
- Consider using simpler queries for large datasets
- Try breaking complex questions into smaller parts
- Monitor Athena query performance and optimize if needed

### Monitoring and Logging

- **CloudWatch Logs**: Detailed logs for both Lambda functions
- **DynamoDB Console**: View job status and results directly
- **Athena Console**: Monitor SQL query execution and performance
- **Agent Messages**: Real-time display of agent reasoning in the web UI

## Cost Considerations

The Agent Analysis feature uses several AWS services that incur costs:

- **Amazon Bedrock**: Model inference costs for agent processing
- **AWS Bedrock AgentCore**: Code interpreter session costs
- **Amazon Athena**: Query execution costs based on data scanned
- **Amazon S3**: Storage costs for query results
- **AWS Lambda**: Function execution costs
- **Amazon DynamoDB**: Storage and request costs for job tracking

To optimize costs:
- Choose appropriate Bedrock models based on accuracy vs. cost requirements
- Monitor usage through AWS Cost Explorer

## Integration with Other Features

The Agent Analysis feature has access to _all_ tables that GenAIIDP stores in Athena, so it integrates seamlessly with other GenAIIDP capabilities:

### Evaluation Framework Integration
- Query evaluation metrics and accuracy scores
- Analyze patterns in document processing quality
- Compare performance across different processing patterns

### Assessment Feature Integration
- Explore confidence scores across document types
- Identify low-confidence extractions requiring review
- Analyze relationships between confidence and accuracy

## Future Enhancements

Planned improvements for the Agent Analysis feature include:

- **Dashboard Creation**: Save and share custom analytics dashboards
- **Possible KB Unification**: A single chat box in the UI capable of answering questions from either the knowledge base (with semantic abilities) or the Athena tables

docs/reporting-database.md

Lines changed: 12 additions & 2 deletions
```diff
@@ -107,11 +107,21 @@ The metering table is particularly valuable for:
 
 ## Document Sections Tables
 
-The document sections tables store the actual extracted data from document sections in a structured format suitable for analytics. These tables are automatically discovered by AWS Glue Crawler and are organized by section type (classification).
+The document sections tables store the actual extracted data from document sections in a structured format suitable for analytics. These tables are automatically created when new section types are encountered during document processing, eliminating the need for manual table creation.
+
+### Automatic Table Creation
+
+When a document is processed and a new section type (classification) is detected, the system automatically:
+1. Creates a new Glue table for that section type (e.g., `document_sections_invoice`, `document_sections_receipt`, `document_sections_w2`)
+2. Configures the table with appropriate schema based on the extracted data
+3. Sets up partition projection for efficient date-based queries
+4. Updates the table schema if new fields are detected in subsequent documents
+
+**Important:** Section type names are normalized to lowercase for consistency with case-sensitive S3 paths. For example, a section classified as "W2" will create a table named `document_sections_w2` with data stored in `document_sections/w2/`.
 
 ### Dynamic Section Tables
 
-Document sections are stored in dynamically created tables based on the section classification. Each section type gets its own table (e.g., `document_sections_invoice`, `document_sections_receipt`, `document_sections_bank_statement`, etc.) with the following characteristics:
+Document sections are stored in dynamically created tables based on the section classification. Each section type gets its own table with the following characteristics:
 
 **Common Metadata Columns:**
 | Column | Type | Description |
```
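Conceptually, each table created by the Automatic Table Creation steps above is just a Glue Data Catalog entry with partition projection enabled in its table properties. The sketch below shows what such a definition can look like with boto3; the database name, bucket, column list, SerDe, and `date` partition key are illustrative assumptions rather than the exact schema the solution generates.

```python
import boto3

glue = boto3.client("glue")

section_type = "w2"                    # already normalized to lowercase
bucket = "example-reporting-bucket"    # illustrative bucket name

glue.create_table(
    DatabaseName="idp_reporting",      # illustrative database name
    TableInput={
        "Name": f"document_sections_{section_type}",
        "TableType": "EXTERNAL_TABLE",
        "PartitionKeys": [{"Name": "date", "Type": "string"}],
        "StorageDescriptor": {
            "Columns": [
                {"Name": "document_id", "Type": "string"},
                {"Name": "employee_name", "Type": "string"},
                {"Name": "wages", "Type": "double"},
            ],
            "Location": f"s3://{bucket}/document_sections/{section_type}/",
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {"SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"},
        },
        # Partition projection: Athena computes partitions from these properties,
        # so no crawler or MSCK REPAIR is needed as new dates arrive.
        "Parameters": {
            "projection.enabled": "true",
            "projection.date.type": "date",
            "projection.date.format": "yyyy-MM-dd",
            "projection.date.range": "2024-01-01,NOW",
            "projection.date.interval": "1",
            "projection.date.interval.unit": "DAYS",
            "storage.location.template": f"s3://{bucket}/document_sections/{section_type}/date=${{date}}/",
        },
    },
)
```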
