
DataFlow v1.0.5 Release Notes

@haolpku haolpku released this 23 Jul 12:02
d220ab5

DataFlow v1.0.5 Key Feature Updates

  • Add General Reasoning Pipeline : Add a new pipeline that supports general reasoning data and DIY prompts, fix some bugs, and rework some reasoning operators by @scuuy in #137
  • Add Batch Wrapper : Upload batch_wrapper for batching an operator in a pipeline by @SunnyHaze in #157
  • Pandas Operator Release : Release GeneralFilter for pandas by @wongzhenhao in #170
  • Add Multiturn Function Call Operators : Add example data for FuncCallPipeline and rename MultiTurnDialogueGenerator by @MOLYHECI in #136
  • Add Math Problem Extractor : Add VQAServing, Add mathbook_promblem_extractor to KBC Pipeline by @HeRunming in #152
  • Refine General Text Operators : Customizable prompt for sft generators by @zzy1127 in #139
  • Fix Local Serving Bug : Fix Local Model Serving, apply chat_template to sys & user prompt by @haolpku in #158
  • Speed Up Text2SQL Pipeline : Reconstruct the database manager to improve the efficiency of the text2sql pipeline by @TechNomad-ds in #174

Notable Changes

  • Add Dataflow WebUI : Add Gradio WebUI for all operators by @HeRunming in #169
  • Add Dataflow-Agent WebUI : Add agent Gradio UI by @DeepMindLiuZhou in #175
  • Add MinerU for KBCPipeline : Add MinerU2.0 by @Niujunbo2002 in #132, and support for fetching arXiv PDF links by @ZhaoyangHan04 in #171
  • Add Sglang Support : Add tensor_parallel and data_parallel to LocalLLMServing_sglang by @SunnyHaze in #147

What's Changed

  • add get_desc for all general text operators by @zzy1127 in #133
  • [Feature] GeneralFilter for GeneralText release! by @wongzhenhao in #135
  • fix problem by @YqjMartin in #138
  • add example data for FuncCallPipeline & rename MultiTurnDialogueGenerator by @MOLYHECI in #136
  • add examples to get_desc by @ZhaoyangHan04 in #134
  • SFT generators with customizable prompts by @zzy1127 in #139
  • (new) add new pipeline to support general reasoning data and diy prompt, and fix some bugs, reform some reasoning ops by @scuuy in #137
  • Support MinerU2 for KnowledgeCleaning by @Niujunbo2002 in #132
  • [serving] set default vllm_seed param for LocalModelLLMServing_vllm to None to avoid warning by @SunnyHaze in #143
  • Fix GPU reasoning pipeline bug by @scuuy in #145
  • refine the get_desc func for each operator for text2sql pipeline by @TechNomad-ds in #142
  • [Serving] Add tensor_parallel and data_parallel to LocalLLMServing_sglang by @SunnyHaze in #147
  • Add get_desc function to all text operators by @scuuy in #146
  • Fix storage column parsing error and expand the data field into the dataframe; adjust version; fix bug in the AnswerNgramFilter operator by @leaderwolfpipi in #115
  • add medical pipeline, generated by agent by @DeepMindLiuZhou in #148
  • [serving] add sglang for all scripts for option by @SunnyHaze in #150
  • implement kbc batch process operators and pipeline by @ZhaoyangHan04 in #151
  • [Serving, KBC]Add VQAServing, Add mathbook_promblem_extractor to KBC Pipeline. by @HeRunming in #152
  • fix bug for RemoveEmojiRefiner by @zzy1127 in #153
  • fix bugs in batch_kbc by @ZhaoyangHan04 in #156
  • [batch_wrapper] upload batch_wrapper for batching an operator in a pipeline. by @SunnyHaze in #157
  • [Serving] Fix Local Model Serving, apply chat_template to sys & user prompt by @haolpku in #158
  • add API-based languagefilter & customized MetaScorer by @MOLYHECI in #161
  • fix quickstart bug by @haolpku in #162
  • Unify embedding attribute names; adjust the SQLVariationGenerator operator's fill logic to append results to the original data by @leaderwolfpipi in #160
  • add publications by @Qmeiyi in #163
  • Fix forward-compatibility issues of other operators in the reasoning pipeline by @leaderwolfpipi in #165
  • add desc for func call & add statics for meta score by @MOLYHECI in #168
  • [webui] Add Gradio WebUI for experiencing all operators. by @HeRunming in #169
  • [Feature] PandasOperator release! [Update] GeneralFilter updated by @wongzhenhao in #170
  • Support for fetching arxiv pdf links by @ZhaoyangHan04 in #171
  • add new reasoning operator "answer_model_judge", to check reference answer via llm by @scuuy in #172
  • [WebUI] Add API Pipeline UI by @HeRunming in #173
  • Add agent gradio UI by @DeepMindLiuZhou in #175
  • reconstruct the database manager to improve the efficiency of the text2sql pipeline by @TechNomad-ds in #174
  • fix bug by @TechNomad-ds in #176
  • Update Gradio and Bug Fix by @DeepMindLiuZhou in #177
  • add operator in readme by @Qmeiyi in #178
  • change kbc script in playground & manage kbc pipelines by @ZhaoyangHan04 in #179
  • Unify backend and frontend by @DeepMindLiuZhou in #180
  • add mathbook extract to playground by @HeRunming in #181
  • add gradio in readme by @Qmeiyi in #182
  • add safety checks in fetching pdf by @ZhaoyangHan04 in #184
  • Add a fix for multi-turn dialogues where some generated user turns are missing an assistant reply by @Arunshmily in #185

Full Changelog: v1.0.4...v1.0.5