Commit a508f50

author
Feng Xu
committed
update readme
1 parent 989b74c commit a508f50

2 files changed

+260
-5
lines changed

README.md

Lines changed: 129 additions & 5 deletions
@@ -1,11 +1,135 @@
1-
## My Project
1+
## Intelligent BI Demo
22

3-
TODO: Fill this README out!
3+
[Chinese documentation](README_CN.md)
44

5-
Be sure to:
5+
## Deployment Guide
66

7-
* Change the title in this README
8-
* Edit your repository description on GitHub
7+
### 1. Prepare EC2 Instance
8+
Create an EC2 instance with the following configuration:
9+
10+
- Software Image (AMI): Amazon Linux 2023
11+
- Virtual server type (instance type): t3.large or higher
12+
- Firewall (security group): Allow inbound ports 22 and 80
13+
- Storage (volumes): 1 GP3 volume(s) - 30 GiB
14+
15+
### 2. Configure Permissions
16+
Attach an IAM role to your EC2 instance.
17+
Then attach an inline policy with the following permissions to this IAM role:
18+
```json
19+
{
20+
"Version": "2012-10-17",
21+
"Statement": [
22+
{
23+
"Sid": "VisualEditor0",
24+
"Effect": "Allow",
25+
"Action": [
26+
"bedrock:*",
27+
"secretsmanager:GetSecretValue",
28+
"dynamodb:*"
29+
],
30+
"Resource": "*"
31+
}
32+
]
33+
}
34+
```
35+
36+
Make sure you have enabled model access for the Anthropic Claude model and the Amazon Titan embedding model in the AWS Console, in the us-west-2 (US West (Oregon)) region.
37+
38+
### 3. Install Docker and Docker Compose
39+
40+
Log in to the EC2 instance as `ec2-user`, either over SSH or with the EC2 Instance Connect feature in the EC2 console, and run the commands below in that session. If you are logged in as a different user, switch to `ec2-user` first:
41+
42+
Note: Execute each command one line at a time.
43+
44+
```bash
45+
sudo su - ec2-user
46+
```
47+
48+
```bash
49+
# Install components
50+
sudo dnf install docker python3-pip git -y && pip3 install docker-compose
51+
52+
# Pin the Docker SDK for Python to 6.1.3 to work around the SSL version issue in 7.0
53+
pip3 install docker==6.1.3
54+
55+
# Configure components
56+
sudo systemctl enable docker && sudo systemctl start docker && sudo usermod -aG docker $USER
57+
58+
# Exit the terminal so the new docker group membership applies on the next login
59+
exit
60+
```
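Before moving on, it can help to confirm that the tools installed above can actually be found on the `PATH`. The following is a minimal sketch, not part of the original guide: `missing_commands` is a hypothetical helper name, and it only checks that each command exists, not that the Docker daemon is running.

```bash
# Print the name of every given command that is not found on PATH.
# Returns non-zero if at least one command is missing.
missing_commands() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run after logging back in):
#   missing_commands docker docker-compose git || echo "Some tools are missing" >&2
```

An empty output and a zero exit status mean every listed tool was found.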
61+
62+
### 4. Install the Demo Application
63+
64+
Reopen a terminal session and continue executing the following commands:
65+
66+
Note: Execute each command one line at a time.
67+
68+
```bash
69+
# Log in as user ec2-user
70+
71+
# Configure OpenSearch server parameters
72+
sudo sh -c "echo 'vm.max_map_count=262144' >> /etc/sysctl.conf" && sudo sysctl -p
73+
74+
# Clone the code
75+
git clone https://github.com/aws-samples/generative-bi-using-rag.git
76+
77+
# Build docker images locally
78+
cd generative-bi-using-rag/application && cp .env.template .env && docker-compose build
79+
80+
# Start all services
81+
docker-compose up -d
82+
83+
# Wait 3 minutes for MySQL and OpenSearch to initialize
84+
sleep 180
85+
```
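A fixed `sleep 180` can be too short on a slow instance and needlessly long on a fast one. As a hedged alternative, the sketch below polls until a readiness command succeeds or a timeout expires. `wait_for` is a hypothetical helper, not part of the project. The example checks in the comments assume that OpenSearch answers on `https://localhost:9200` with the `admin:admin` credentials used elsewhere in this guide, and that `mysqladmin` is available inside the `nlq-mysql` container.

```bash
# Retry a command until it succeeds or the timeout (in seconds) expires.
# Usage: wait_for <timeout_seconds> <interval_seconds> <command> [args...]
wait_for() {
    local timeout=$1 interval=$2
    shift 2
    local deadline=$(( $(date +%s) + timeout ))
    until "$@" >/dev/null 2>&1; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1    # gave up: the command never succeeded in time
        fi
        sleep "$interval"
    done
    return 0
}

# Example readiness checks (assumptions, see the note above):
#   wait_for 300 5 curl -fsk -u admin:admin https://localhost:9200
#   wait_for 300 5 docker exec nlq-mysql mysqladmin ping -u root -ppassword
```

A non-zero return value means the service did not become ready within the timeout, which is a better signal than continuing blindly after a fixed delay.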
86+
87+
88+
### 5. Initialize MySQL
89+
90+
In the terminal, continue executing the following commands:
91+
92+
```bash
93+
cd initial_data && unzip init_mysql_db.sql.zip && cd ..
94+
docker exec nlq-mysql sh -c "mysql -u root -ppassword -D llm < /opt/data/init_mysql_db.sql"
95+
```
96+
97+
### 6. Initialize OpenSearch
98+
99+
6.1 Initialize the index for the sample data by creating a new index:
100+
101+
```bash
102+
docker exec nlq-webserver python opensearch_deploy.py
103+
```
104+
105+
If the script fails with any error, delete the index with the following command, then rerun the previous command:
106+
107+
```bash
108+
curl -XDELETE -k -u admin:admin "https://localhost:9200/uba"
109+
```
110+
111+
6.2 (Optional) Bulk import custom QA data by appending to an existing index:
112+
113+
```bash
114+
docker exec nlq-webserver python opensearch_deploy.py custom false
115+
```
116+
117+
### 7. Access the Streamlit Web UI
118+
119+
Open in your browser: `http://<your-ec2-public-ip>`
120+
121+
Note: Use HTTP instead of HTTPS.
122+
123+
## How to use custom data sources with the demo app
124+
1. First, create the corresponding Data Profile on the Data Connection Management and Data Profile Management pages.
125+
2. After selecting the Data Profile, start asking questions. For simple questions, the LLM can generate the correct SQL directly. If the generated SQL is incorrect, try adding more annotations to the Schema.
126+
3. On the Schema Management page, select the Data Profile and add comments to the tables and fields. These comments are included in the prompt sent to the LLM.
127+
(1) For some fields, add the possible values to the Annotation attribute, e.g. "Values: Y|N", "Values: Shanghai|Jiangsu".
128+
(2) For table comments, add domain knowledge that helps answer business questions.
129+
4. Ask the question again. If the LLM still cannot generate the correct SQL, add Sample QA pairs to OpenSearch.
130+
(1) On the Index Management page, select the Data Profile; you can then add, view, and delete QA pairs.
131+
132+
5. Ask again. In theory, the RAG approach (prompt engineering with few-shot examples) should now be able to generate the correct SQL.
9133

10134
## Security
11135

README_CN.md

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
1+
## Intelligent BI Demo
2+
3+
## Deployment Guide
4+
5+
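### 1. Prepare EC2 Instance
6+
Create an EC2 instance with the following configuration:
7+
8+
- Software Image (AMI): Amazon Linux 2023
9+
- Virtual server type (instance type): t3.large or higher
10+
- Firewall (security group): Allow inbound ports 22 and 80
11+
- Storage (volumes): 1 GP3 volume(s) - 30 GiB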
12+
13+
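### 2. Configure Permissions
14+
Attach an IAM role to your EC2 instance. For details, see [EC2 documentation: Using IAM roles](https://docs.aws.amazon.com/zh_cn/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#working-with-iam-roles).
15+
Then attach an inline policy with the following permissions to this IAM role: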
16+
```json
17+
{
18+
"Version": "2012-10-17",
19+
"Statement": [
20+
{
21+
"Sid": "VisualEditor0",
22+
"Effect": "Allow",
23+
"Action": [
24+
"bedrock:*",
25+
"secretsmanager:GetSecretValue",
26+
"dynamodb:*"
27+
],
28+
"Resource": "*"
29+
}
30+
]
31+
}
32+
```
33+
34+
Make sure you have enabled model access for the Anthropic Claude model and the Amazon Titan embedding model in the AWS Console, in the us-west-2 (US West (Oregon)) region.
35+
36+
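### 3. Install Docker and Docker Compose
37+
Log in to the EC2 instance as `ec2-user`, either over SSH or with the EC2 Instance Connect feature in the EC2 console, and run the commands below in that session. If you are logged in as a different user, switch to `ec2-user` first:
38+
39+
Note: Execute each command one line at a time.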
40+
41+
```bash
42+
sudo su - ec2-user
43+
```
44+
45+
```bash
46+
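# Install components
47+
sudo dnf install docker python3-pip git -y && pip3 install docker-compose
48+
49+
# Pin the Docker SDK for Python to 6.1.3 to work around the SSL version issue in 7.0
50+
pip3 install docker==6.1.3
51+
52+
# Configure components
53+
sudo systemctl enable docker && sudo systemctl start docker && sudo usermod -aG docker $USER
54+
55+
# Exit the terminal so the new docker group membership applies on the next login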
56+
exit
57+
```
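The `exit` at the end of this step matters because `usermod -aG` only affects new login sessions. The following is a minimal sketch for checking, after logging back in, whether the current session already carries the `docker` group. `user_in_group` is a hypothetical helper name, not part of the project.

```bash
# Return success if the current session's group list includes the named group.
user_in_group() {
    id -nG | tr ' ' '\n' | grep -Fqx -- "$1"
}

# Example (after logging back in):
#   user_in_group docker || echo "Log out and back in before running docker" >&2
```

If the check fails in a fresh session, running `docker` without `sudo` will typically fail with a permission error on the Docker socket.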
58+
59+
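### 4. Install the Demo Application
60+
61+
Reopen a terminal session and continue executing the following commands:
62+
63+
Note: Execute each command one line at a time.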
64+
65+
```bash
66+
# Log in as user ec2-user
67+
68+
# Configure OpenSearch server parameters
69+
sudo sh -c "echo 'vm.max_map_count=262144' >> /etc/sysctl.conf" && sudo sysctl -p
70+
71+
# Clone the code
72+
git clone https://github.com/aws-samples/generative-bi-using-rag.git
73+
74+
# Build docker images locally
75+
cd generative-bi-using-rag/application && cp .env.template .env && docker-compose build
76+
77+
# Start all services
78+
docker-compose up -d
79+
80+
# Wait 3 minutes for MySQL and OpenSearch to initialize
81+
sleep 180
82+
```
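Re-running the `vm.max_map_count` command above can leave `/etc/sysctl.conf` in an unintended state: `>` replaces the whole file, while `>>` appends a duplicate line on every run. A minimal sketch of an idempotent alternative follows. `set_sysctl_entry` is a hypothetical helper, not part of the project, and it needs to run as root to modify files under `/etc`.

```bash
# Set key=value in a sysctl-style file without clobbering other entries.
# Usage: set_sysctl_entry <file> <key> <value>
# Assumes key and value contain no '|' or '&' characters.
set_sysctl_entry() {
    local file=$1 key=$2 value=$3 tmp
    touch "$file"
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Key already present: rewrite that line, keep everything else.
        tmp=$(mktemp)
        sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key}=${value}|" "$file" > "$tmp"
        cat "$tmp" > "$file"    # preserves the original file's owner and mode
        rm -f "$tmp"
    else
        # Key absent: append a new line.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

From a root shell, this could be used as `set_sysctl_entry /etc/sysctl.conf vm.max_map_count 262144`, followed by `sysctl -p` to apply the setting.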
83+
84+
### 5. Initialize MySQL
85+
In the terminal, continue executing the following commands:
86+
```bash
87+
cd initial_data && unzip init_mysql_db.sql.zip && cd ..
88+
docker exec nlq-mysql sh -c "mysql -u root -ppassword -D llm < /opt/data/init_mysql_db.sql"
89+
```
90+
91+
### 6. Initialize OpenSearch
92+
93+
6.1 Initialize the index for the sample data by creating a new index:
94+
```bash
95+
docker exec nlq-webserver python opensearch_deploy.py
96+
```
97+
98+
If the script fails with any error, delete the index with the following command, then rerun the previous command:
99+
```bash
100+
curl -XDELETE -k -u admin:admin "https://localhost:9200/uba"
101+
```
102+
103+
6.2 (Optional) Bulk import custom QA data by appending to an existing index:
104+
```bash
105+
docker exec nlq-webserver python opensearch_deploy.py custom false
106+
```
107+
108+
### 7. Access the Streamlit Web UI
109+
110+
Open this URL in your browser: `http://<your-ec2-public-ip>`
111+
112+
Note: Use HTTP instead of HTTPS.
113+
114+
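## How to use custom data sources with the demo app
115+
1. First, create the corresponding Data Profile on the Data Connection Management and Data Profile Management pages.
116+
2. After selecting the Data Profile, start asking questions. For simple questions, the LLM can generate the correct SQL directly. If the generated SQL is incorrect, try adding more descriptions to the Schema.
117+
3. On the Schema Management page, select the Data Profile and add comments to the tables and fields. These comments are included in the prompt sent to the LLM.
118+
(1) For some fields, add the possible values to the Annotation attribute, e.g. "Values: Y|N", "Values: Shanghai|Jiangsu".
119+
(2) For table comments, add domain knowledge that helps answer business questions.
120+
4. Ask the question again. If the LLM still cannot generate the correct SQL, add Sample QA pairs to OpenSearch.
121+
(1) On the Index Management page, select the Data Profile; you can then add, view, and delete QA pairs.
122+
123+
5. Ask again. In theory, the RAG approach (prompt engineering with few-shot examples) should now be able to generate the correct SQL.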
124+
125+
## Security
126+
127+
See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
128+
129+
## License
130+
131+
This library is licensed under the MIT-0 License. See the LICENSE file.
