Commit c035342

Merge pull request #24 from wangle201210/feat/chunk-edit
Add a prefix to API endpoint paths & require MySQL charset configuration & fix the database unique index
2 parents 3233bfb + f2a71d9 commit c035342

11 files changed: +52 -39 lines changed

README.md

Lines changed: 12 additions & 9 deletions
@@ -5,11 +5,14 @@
 2. Select the knowledge base to use and upload documents
 ![](./server/static/kb-select.png)
 ![](./server/static/indexer.png)
-3. Retrieval
+3. Document list & chunk editing
+![](./server/static/doc-list.png)
+![](./server/static/chunk-edit.png)
+4. Document retrieval
 ![](./server/static/retriever.png)
-4. Chat
+5. Chat
 ![](./server/static/chat.png)
-5. mcp (using integration with deepchat as an example)
+6. mcp (using integration with deepchat as an example)
 ![](./server/static/mcp-cfg.png)
 ![](./server/static/mcp-use.png)
 
@@ -24,13 +27,10 @@
 - [x] Web page parsing
 - [x] Document retrieval
 - [x] Automatic splitting of long documents (chunk)
-- [x] Provide an HTTP API [rag-api](./server/README.md)
-- [x] Provide a front-end UI for index, retrieve, and chat
 - [x] Multiple knowledge base support
-
-
-## Future plans
-- [ ] Use mysql to store the chunk-to-document mapping; it currently lives in the es ext field
+- [x] chunk editing
+- [x] Automatic QA pair generation
+- [x] Multi-path recall
 
 ## Usage
 ### Clone the project
@@ -65,6 +65,9 @@ make run
 docker run -d --name elasticsearch \
 -e "discovery.type=single-node" \
 -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
+-e "cluster.routing.allocation.disk.watermark.low=1gb" \
+-e "cluster.routing.allocation.disk.watermark.high=1gb" \
+-e "cluster.routing.allocation.disk.watermark.flood_stage=1gb" \
 -e "xpack.security.enabled=false" \
 -p 9200:9200 \
 -p 9300:9300 \

docker-compose.yml

Lines changed: 2 additions & 1 deletion
@@ -20,7 +20,7 @@ services:
 - MYSQL_DATABASE=go-rag
 volumes:
 # If config or data needs to be persisted, add the corresponding volume mounts; currently they are copied in when the image is built
-- ./server/manifest/config/config-docker.yaml:/app/manifest/config/config.yaml
+- ./server/manifest/config/config.yaml:/app/manifest/config/config.yaml
 depends_on:
 mysql:
 condition: service_healthy
@@ -53,6 +53,7 @@ services:
 - MYSQL_ROOT_PASSWORD=123456
 - MYSQL_DATABASE=go-rag
 - MYSQL_ROOT_HOST=% # allow root to connect from any host
+- MYSQL_CHARSET=utf8mb4 # set the database character set to utf8mb4
 ports:
 - "3306:3306"
 volumes:

fe/src/views/Chat.vue

Lines changed: 1 addition & 2 deletions
@@ -174,7 +174,6 @@
 import { ref, reactive, onMounted, nextTick } from 'vue'
 import { ElMessage, ElNotification } from 'element-plus'
 import { User, Service, ChatDotRound, ChatRound, Plus, Position, Setting, Document, CopyDocument } from '@element-plus/icons-vue'
-import axios from 'axios'
 import { marked } from 'marked'
 import hljs from 'highlight.js'
 import 'highlight.js/styles/github.css'
@@ -262,7 +261,7 @@ const sendMessage = async () => {
 try {
 // Use the fetch API for the streaming request
 references.value = [];
-const response = await fetch('/v1/chat/stream', {
+const response = await fetch('/api/v1/chat/stream', {
 method: 'POST',
 headers: {
 'Content-Type': 'application/json',
fe/src/views/Retriever.vue

Lines changed: 3 additions & 2 deletions
@@ -100,6 +100,7 @@ import DOMPurify from 'dompurify'
 import hljs from 'highlight.js'
 import 'highlight.js/styles/github.css'
 import KnowledgeSelector from '../components/KnowledgeSelector.vue'
+import request from "../utils/request";
 
 // Configure Marked and code highlighting
 marked.setOptions({
@@ -146,13 +147,13 @@ const handleSearch = async () => {
 searched.value = true
 
 try {
-const response = await axios.post('/v1/retriever', {
+const response = await request.post('/v1/retriever', {
 question: searchForm.question,
 top_k: searchForm.top_k,
 score: searchForm.score,
 knowledge_name: knowledgeSelectorRef.value?.getSelectedKnowledgeId() || ''
 })
-searchResults.value = response.data.data.document || []
+searchResults.value = response.data.document || []
 
 if (searchResults.value.length === 0) {
 ElMessage.info('No relevant documents found')
fe/vite.config.js

Lines changed: 22 additions & 22 deletions
@@ -20,7 +20,7 @@ export default defineConfig(({ command, mode }) => {
 '/api': {
 target: env.VITE_DEV_PROXY_TARGET || 'http://localhost:8000',
 changeOrigin: true,
-rewrite: (path) => path.replace(/^\/api/, '')
+// rewrite: (path) => path.replace(/^\/api/, '')
 }
 }
 },
@@ -29,30 +29,30 @@ export default defineConfig(({ command, mode }) => {
 outDir: 'dist',
 assetsDir: 'assets',
 // Assets smaller than this threshold are inlined as base64
-assetsInlineLimit: 4096,
+// assetsInlineLimit: 4096,
 // Enable/disable CSS code splitting
-cssCodeSplit: true,
+// cssCodeSplit: true,
 // Whether to generate source map files after the build
-sourcemap: false,
-// Customize the underlying Rollup bundling options
-rollupOptions: {
-output: {
-// Controls how chunks are split
-manualChunks: {
-'element-plus': ['element-plus'],
-'vue-vendor': ['vue', 'vue-router', 'pinia']
-}
-}
-},
+// sourcemap: false,
+// // Customize the underlying Rollup bundling options
+// rollupOptions: {
+// output: {
+// // Controls how chunks are split
+// manualChunks: {
+// 'element-plus': ['element-plus'],
+// 'vue-vendor': ['vue', 'vue-router', 'pinia']
+// }
+// }
+// }
 // Configure minification and obfuscation
-minify: 'terser',
-terserOptions: {
-compress: {
-// Remove console calls in production
-drop_console: true,
-drop_debugger: true
-}
-}
+// minify: 'terser',
+// terserOptions: {
+// compress: {
+// // Remove console calls in production
+// drop_console: true,
+// drop_debugger: true
+// }
+// }
 }
 }
 })
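
With the `rewrite` rule commented out, the dev proxy should forward `/api/...` paths unchanged, which lines up with the `/api` route group added in server/internal/cmd/cmd.go in this same commit. The sketch below only models the path handling, not the proxy itself. `stripApiPrefix` reproduces the removed rule, and `forwardedPath` is a hypothetical helper introduced for illustration.

```javascript
// The rewrite rule that this commit comments out in fe/vite.config.js.
const stripApiPrefix = (path) => path.replace(/^\/api/, '')

// Hypothetical helper: the path sent upstream, with or without a rewrite rule.
function forwardedPath(path, rewrite) {
  return rewrite ? rewrite(path) : path
}

// Old behaviour: the prefix is stripped before the request reaches the backend.
console.log(forwardedPath('/api/v1/chat/stream', stripApiPrefix)) // prints /v1/chat/stream

// New behaviour: no rewrite, so the backend sees the /api prefix.
console.log(forwardedPath('/api/v1/chat/stream')) // prints /api/v1/chat/stream
```

Keeping the prefix end to end means the same URL works both through the dev proxy and when the Go server serves the built front end directly.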

roadmap.md

Lines changed: 8 additions & 2 deletions
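@@ -8,12 +8,12 @@
 
 ### 0.1.0 - Core feature completion
 #### Asynchronous processing
-- [ ] Run QA & embedding asynchronously
+- [x] Run QA & embedding asynchronously
 - Current state: synchronous; split → QA generation → embedding runs immediately when a document is uploaded
 - Improvement: make the processing that follows split asynchronous
 
 #### Data management
-- [ ] chunk management
+- [x] chunk management
 - Problem: there is no document-to-chunk mapping, so a single chunk cannot be edited
 - Plan:
 1. When ES stores a chunk, also record the mapping in MySQL
@@ -44,6 +44,8 @@
 - Improvements:
 1. Bring in third-party APIs to improve parsing quality (e.g. mineru)
 2. Add support for formats such as ppt/docx
+3. Image parsing
+4. User-defined document parsing logic
 
 #### User system
 - [ ] Add a user system
@@ -57,3 +59,7 @@
 1. Custom model providers
 2. Personal API_KEY management
 
+#### Multiple vector store support
+- [ ] Multiple database support
+- Current state: only es is supported
+- Improvement: support vector databases such as postgre and milvus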

server/internal/cmd/cmd.go

Lines changed: 2 additions & 0 deletions
@@ -23,6 +23,8 @@ var (
 s.Group("/", func(group *ghttp.RouterGroup) {
 s.AddStaticPath("", "./static/fe/")
 s.SetIndexFiles([]string{"index.html"})
+})
+s.Group("/api", func(group *ghttp.RouterGroup) {
 group.Middleware(MiddlewareHandlerResponse, ghttp.MiddlewareCORS)
 group.Bind(
 rag.NewV1(),

server/internal/model/gorm/knowledge_documents.go

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ import (
 // KnowledgeDocuments GORM model definition
 type KnowledgeDocuments struct {
 ID int64 `gorm:"primaryKey;column:id;autoIncrement"`
-KnowledgeBaseName string `gorm:"column:knowledge_base_name;type:varchar(255);not null;uniqueIndex"`
+KnowledgeBaseName string `gorm:"column:knowledge_base_name;type:varchar(255);not null"`
 FileName string `gorm:"column:file_name;type:varchar(255)"`
 Status int8 `gorm:"column:status;type:tinyint;not null;default:0"`
 CreateTime time.Time `gorm:"column:created_at;type:timestamp;default:CURRENT_TIMESTAMP"`

server/manifest/config/config_demo.yaml

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ database:
 pass: "123456" # password
 name: "go-rag" # database name
 type: "mysql" # database type
+charset: "utf8mb4" # database charset; must be set, because documents often contain special characters
 
 es:
 address: "http://elasticsearch:9200"
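
The comment on `charset: "utf8mb4"` says documents often contain special characters. One likely reason, given here as background rather than taken from the commit: MySQL's legacy `utf8` (utf8mb3) charset stores at most 3 bytes per character, while characters such as emoji need 4 bytes in UTF-8. The sketch below measures this for a piece of text. `maxUtf8BytesPerChar` is a hypothetical helper, not something from this repository.

```javascript
// Hypothetical helper: the largest UTF-8 byte length of any single code
// point in the text. A result of 4 means a 3-byte charset cannot store it.
function maxUtf8BytesPerChar(text) {
  const encoder = new TextEncoder()
  let max = 0
  // for...of iterates by code point, so surrogate pairs stay together.
  for (const ch of text) {
    max = Math.max(max, encoder.encode(ch).length)
  }
  return max
}

console.log(maxUtf8BytesPerChar('chunk'))  // prints 1
console.log(maxUtf8BytesPerChar('知识库'))  // prints 3
console.log(maxUtf8BytesPerChar('ok 😀'))  // prints 4
```

Text whose result is 4 is the kind of content that needs `utf8mb4` on the connection and the tables.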

server/static/chunk-edit.png

187 KB
