[Next week]✨ Knowledge Base Summary Development #1364
Open: Mermaid97 wants to merge 10 commits into develop from hs/1010_dev
+2,837
−961
Commits (10), all by Mermaid97:
0149bcf ✨ Knowledge Base Summary Development
03d76af Merge remote-tracking branch 'origin/develop' into hs/1010_dev
c0062e8 Merge branch 'develop' into hs/1010_dev
dd1af1c Update pyproject.toml
91e08b6 ✨ resolve test case
0db9350 Merge remote-tracking branch 'origin/hs/1010_dev' into hs/1010_dev
3592a02 ✨ resolve test case again
98c06b6 ✨ resolve test case elasticsearch
cd5fb96 ✨ resolve test case and suggestions
a82e295 ✨ resolve requirements conflict
@@ -0,0 +1,115 @@
# Async Knowledge Summary Prompt Templates (Chinese)

# Summary Generation Prompt
SUMMARY_GENERATION_PROMPT: |-
  ### 你是【知识总结专家】,负责生成简洁准确的知识总结。

  请为以下内容生成简洁的知识总结(不超过{{ max_length }}个中文字符):

  内容:
  {{ text }}

  ### 要求:
  1. 提取核心观点和关键信息
  2. 使用简洁清晰的语言
  3. 保持客观准确
  4. 突出重点内容
  5. 不要使用markdown格式符号(如#、*、-等)
  6. 直接输出总结内容,无需额外说明

  知识总结:

# Keyword Extraction Prompt
KEYWORD_EXTRACTION_PROMPT: |-
  ### 你是【关键词提取专家】,负责从文本中提取核心关键词。

  请从以下文本中提取{{ max_keywords }}个最重要的关键词:

  {{ text }}

  ### 要求:
  1. 关键词应准确反映文本主题
  2. 优先提取专有名词和核心概念
  3. 每个关键词用逗号分隔
  4. 只输出关键词,不要其他内容
  5. 使用中文输出

  关键词:

# Knowledge Card Generation Prompt
KNOWLEDGE_CARD_GENERATION_PROMPT: |-
  ### 你是【知识卡片生成专家】,负责将文本内容提炼成结构化的知识卡片。

  请为以下内容生成一个知识卡片,包含摘要和关键词:

  内容:
  {{ text }}

  ### 要求:
  1. 摘要部分:
  - 不超过200个中文字符
  - 提炼核心内容和关键信息
  - 语言简洁清晰,逻辑连贯
  - 不使用markdown格式符号

  2. 关键词部分:
  - 提取5-10个核心关键词
  - 用逗号分隔
  - 反映内容主题

  3. 输出格式:
  - 第一行:摘要内容
  - 第二行:关键词(用"关键词:"前缀)

  请直接输出,无需额外说明。

# Cluster Integration Prompt
CLUSTER_INTEGRATION_PROMPT: |-
  ### 你是【知识整合专家】,负责将多个知识卡片整合成连贯的集群总结。

  请将以下知识卡片整合成一个连贯完整的集群总结:

  {{ summaries_text }}

  ### 要求:
  1. 将所有卡片的核心信息整合成统一主题
  2. 根据内容重要性和相关性调整权重,重要内容详细描述,次要内容简要提及
  3. 保持清晰逻辑和完整结构,确保所有信息都得到体现
  4. 字数控制在200字以内
  5. 使用简洁清晰的语言
  6. 不要遗漏任何信息,只调整描述权重
  7. 不要使用markdown格式符号
  8. 直接输出纯文本内容

  集群整合总结:

# Global Integration Prompt
GLOBAL_INTEGRATION_PROMPT: |-
  ### 你是【知识库总结专家】,负责生成清晰明确的知识库整体总结。

  请将以下{{ cluster_count }}个集群总结整合成一个清晰明确的知识库内容总结:

  {{ summaries_text }}

  ### 要求:

  #### 1. 内容整合要求:
  - 分析{{ cluster_count }}个集群总结的内容相似性和关联性
  - 将相似或关联的内容合并到同一个要点中
  - 最终要点数量不能超过{{ cluster_count }}个(即≤{{ cluster_count }}个要点)
  - 如果内容差异很大,可以保持{{ cluster_count }}个独立要点

  #### 2. 内容要求:
  - 总结要清晰、完整、不遗漏关键信息
  - 每个要点突出核心观点和关键数据
  - 语言简洁明确,便于大模型识别查询意图
  - 保持逻辑连贯性和主题关联性

  #### 3. 输出要求:
  - 使用纯文本格式,不使用Markdown标记
  - 分点使用"一、"、"二、"等序号
  - 每个要点之间用空行分隔
  - 直接输出内容,无需额外说明

  知识库内容总结:
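The templates in this file mark their inputs with `{{ name }}` placeholders (`max_length`, `max_keywords`, `text`, `summaries_text`, `cluster_count`). The diff does not show how the project fills them in, so the following is only a minimal sketch of one way to do it with the Python standard library. The function name `render_prompt` and the shortened sample template are hypothetical and are not taken from the PR; a template engine such as Jinja2 would be an equally plausible choice.

```python
import re

# Matches placeholders written as {{ name }}, with optional inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, **values: object) -> str:
    """Substitute every {{ name }} placeholder; fail loudly on a missing value.

    Hypothetical helper for illustration, not the PR's implementation.
    """

    def _substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for placeholder '{name}'")
        return str(values[name])

    return _PLACEHOLDER.sub(_substitute, template)


# Shortened stand-in for SUMMARY_GENERATION_PROMPT, not the full template.
template = (
    "Please generate a concise knowledge summary "
    "(no more than {{ max_length }} characters) for the following content:\n\n"
    "Content:\n{{ text }}\n\nKnowledge Summary:"
)

prompt = render_prompt(
    template,
    max_length=150,
    text="Elasticsearch stores documents in indices.",
)
print(prompt)
```

Raising on a missing value, rather than leaving the placeholder in place, keeps a literal `{{ text }}` from being sent to the model unnoticed.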
@@ -0,0 +1,114 @@
# Async Knowledge Summary Prompt Templates (English)

# Summary Generation Prompt
SUMMARY_GENERATION_PROMPT: |-
  ### You are a [Knowledge Summary Expert] responsible for generating concise and accurate knowledge summaries.

  Please generate a concise knowledge summary (no more than {{ max_length }} characters) for the following content:

  Content:
  {{ text }}

  ### Requirements:
  1. Extract core viewpoints and key information
  2. Use concise and clear language
  3. Maintain objectivity and accuracy
  4. Highlight important content
  5. Do not use markdown format symbols (such as #, *, -, etc.)
  6. Output the summary directly without additional explanation

  Knowledge Summary:

# Keyword Extraction Prompt
KEYWORD_EXTRACTION_PROMPT: |-
  ### You are a [Keyword Extraction Expert] responsible for extracting core keywords from text.

  Please extract the {{ max_keywords }} most important keywords from the following text:

  {{ text }}

  ### Requirements:
  1. Keywords should accurately reflect the text theme
  2. Prioritize proper nouns and core concepts
  3. Separate each keyword with a comma
  4. Output only keywords, no other content

  Keywords:

# Knowledge Card Generation Prompt
KNOWLEDGE_CARD_GENERATION_PROMPT: |-
  ### You are a [Knowledge Card Generation Expert] responsible for refining text content into structured knowledge cards.

  Please generate a knowledge card for the following content, including a summary and keywords:

  Content:
  {{ text }}

  ### Requirements:
  1. Summary section:
  - No more than 200 characters
  - Distill core content and key information
  - Use concise and clear language with coherent logic
  - Do not use markdown format symbols

  2. Keywords section:
  - Extract 5-10 core keywords
  - Separate with commas
  - Reflect the content theme

  3. Output format:
  - First line: Summary content
  - Second line: Keywords (with "Keywords:" prefix)

  Please output directly without additional explanation.

# Cluster Integration Prompt
CLUSTER_INTEGRATION_PROMPT: |-
  ### You are a [Knowledge Integration Expert] responsible for integrating multiple knowledge cards into coherent cluster summaries.

  Please integrate the following knowledge cards into a coherent and complete cluster summary:

  {{ summaries_text }}

  ### Requirements:
  1. Integrate core information from all cards into a unified theme
  2. Adjust weight based on content importance and relevance: describe important content in detail and mention secondary content briefly
  3. Maintain clear logic and complete structure; ensure all information is represented
  4. Keep the summary within 200 words
  5. Use concise and clear language
  6. Do not omit any information; only adjust description weight
  7. Do not use markdown format symbols
  8. Output plain text content directly

  Cluster Integration Summary:

# Global Integration Prompt
GLOBAL_INTEGRATION_PROMPT: |-
  ### You are a [Knowledge Base Summary Expert] responsible for generating clear and explicit overall knowledge base summaries.

  Please integrate the following {{ cluster_count }} cluster summaries into a clear and explicit knowledge base content summary:

  {{ summaries_text }}

  ### Requirements:

  #### 1. Content Integration Requirements:
  - Analyze the content similarity and relevance of the {{ cluster_count }} cluster summaries
  - Merge similar or related content into the same point
  - Final number of points must not exceed {{ cluster_count }} (i.e., ≤{{ cluster_count }} points)
  - If content is very different, keep {{ cluster_count }} independent points

  #### 2. Content Requirements:
  - The summary should be clear and complete, without missing key information
  - Each point should highlight core viewpoints and key data
  - Keep the language concise and clear so that large models can easily identify query intent
  - Maintain logical coherence and thematic relevance

  #### 3. Output Requirements:
  - Use plain text format, do not use Markdown markup
  - Use numbered points like "1.", "2.", etc.
  - Separate each point with blank lines
  - Output content directly without additional explanation

  Knowledge Base Content Summary:
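The knowledge card prompt fixes a two-line output format (summary first, then a line prefixed with "Keywords:" or "关键词:"), and the global prompt caps the number of numbered points at `cluster_count`. The diff does not show how the model output is consumed, so the sketch below is an assumption about what a post-processing step might look like. `KnowledgeCard`, `parse_knowledge_card`, and `count_points` are hypothetical names, not code from the PR, and `count_points` only recognizes the "1.", "2." numbering of the English template.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class KnowledgeCard:
    summary: str
    keywords: List[str]


def parse_knowledge_card(raw: str) -> KnowledgeCard:
    """Parse a summary line plus a 'Keywords: a, b, c' line into a card."""
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    summary_parts: List[str] = []
    keywords: List[str] = []
    for line in lines:
        # Accept the English and Chinese prefixes with either colon style.
        match = re.match(r"^(?:Keywords|关键词)\s*[::]\s*(.*)$", line, re.IGNORECASE)
        if match:
            # Split on ASCII, full-width, and ideographic commas.
            keywords = [k.strip() for k in re.split(r"[,,、]", match.group(1)) if k.strip()]
        else:
            summary_parts.append(line)
    return KnowledgeCard(summary=" ".join(summary_parts), keywords=keywords)


def count_points(summary: str) -> int:
    """Count blank-line-separated blocks that start with '1.', '2.', and so on."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", summary) if b.strip()]
    return sum(1 for b in blocks if re.match(r"^\d+\.", b))


card = parse_knowledge_card(
    "Elasticsearch is a distributed search engine.\n"
    "Keywords: Elasticsearch, search, index"
)
print(card.summary)
print(card.keywords)

global_summary = "1. Search infrastructure\n\n2. Deployment notes"
cluster_count = 3
# The prompt requires at most cluster_count points in the final summary.
print(count_points(global_summary) <= cluster_count)
```

Tolerating both colon and comma styles is a defensive choice: a model answering the Chinese template may well emit full-width punctuation even though the template itself shows ASCII characters.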
Could the prompts here avoid mentioning async?