
Commit 8e0dee7

improve
1 parent 7ecd825 commit 8e0dee7

2 files changed, +71 -28 lines changed


docs/source/tasks/extractive_qa.mdx

Lines changed: 26 additions & 15 deletions
@@ -1,36 +1,45 @@
-# Extractive Question Answering
+---
+title: "AutoTrain Extractive Question Answering - Train QA Models Easily"
+description: "Learn how to train extractive question answering models using AutoTrain. Simple guide for both local and cloud training with popular models like BERT."
+---

-Extractive Question Answering is a task in which a model is trained to extract the answer to a question from a given context.
-The model is trained to predict the start and end positions of the answer span within the context.
-This task is commonly used in question-answering systems to extract relevant information from a large corpus of text.
+# Extractive Question Answering with AutoTrain

+Extractive Question Answering (QA) enables AI models to find and extract precise answers from text passages. This guide shows you how to train custom QA models using AutoTrain, supporting popular architectures like BERT, RoBERTa, and DeBERTa.

-## Preparing your data
+## What is Extractive Question Answering?

-To train an Extractive Question Answering model, you need a dataset that contains the following columns:
+Extractive QA models learn to:
+- Locate exact answer spans within longer text passages
+- Understand questions and match them to relevant context
+- Extract precise answers rather than generating them
+- Handle both simple and complex queries about the text

-- `text`: The context or passage from which the answer is to be extracted.
-- `question`: The question for which the answer is to be extracted.
-- `answer`: The start position of the answer span in the context.
+## Preparing your Data

-Here is an example of how your dataset should look:
+Your dataset needs these essential columns:
+- `text`: The passage containing potential answers (also called context)
+- `question`: The query you want to answer
+- `answer`: Answer span information including text and position

+Here is an example of how your dataset should look:

 ```
 {"context":"Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend \"Venite Ad Me Omnes\". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.","question":"To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?","answers":{"text":["Saint Bernadette Soubirous"],"answer_start":[515]}}
 {"context":"Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend \"Venite Ad Me Omnes\". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.","question":"What is in front of the Notre Dame Main Building?","answers":{"text":["a copper statue of Christ"],"answer_start":[188]}}
 {"context":"Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend \"Venite Ad Me Omnes\". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.","question":"The Basilica of the Sacred heart at Notre Dame is beside to which structure?","answers":{"text":["the Main Building"],"answer_start":[279]}}
 ```

-
 Note: the preferred format for question answering is JSONL; if you want to use CSV, the `answer` column should be stringified JSON with the keys `text` and `answer_start`.

 Example dataset from Hugging Face Hub: [lhoestq/squad](https://huggingface.co/datasets/lhoestq/squad)

-
 P.S. You can use both squad and squad v2 data format with correct column mappings.

-## Training Locally
+## Training Options
+
+### Local Training
+Train models on your own hardware with full control over the process.

 To train an Extractive QA model locally, you need a config file:

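The note above says a CSV dataset must store the `answer` column as stringified JSON with the keys `text` and `answer_start`. Below is a minimal Python sketch of that conversion; the file names (`train.jsonl`, `train.csv`) are illustrative assumptions, and the record keys follow the example rows shown earlier:

```python
import csv
import json

# Convert a JSONL QA dataset (as in the example above) to the CSV layout
# described in the note: the answer column holds stringified JSON with
# the keys `text` and `answer_start`.
with open("train.jsonl") as src, open("train.csv", "w", newline="") as dst:
    writer = csv.DictWriter(dst, fieldnames=["context", "question", "answer"])
    writer.writeheader()
    for line in src:
        record = json.loads(line)
        writer.writerow(
            {
                "context": record["context"],
                "question": record["question"],
                # e.g. {"text": ["Saint Bernadette Soubirous"], "answer_start": [515]}
                "answer": json.dumps(record["answers"]),
            }
        )
```

As with the JSONL layout, the resulting columns can then be pointed at `text`, `question`, and `answer` through column mapping.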
@@ -75,12 +84,14 @@ $ autotrain --config config.yaml

 Here, we are training a BERT model on the SQuAD dataset using the Extractive QA task. The model is trained for 3 epochs with a batch size of 4 and a learning rate of 2e-5. The training process is logged using TensorBoard. The model is trained locally and pushed to the Hugging Face Hub after training.

-## Training on the Hugging Face Spaces
+### Cloud Training on Hugging Face
+Train models using Hugging Face's cloud infrastructure for better scalability.

 ![AutoTrain Extractive Question Answering on Hugging Face Spaces](https://raw.githubusercontent.com/huggingface/autotrain-advanced/main/static/ext_qa.png)

 As always, pay special attention to column mapping.

-## Parameters
+
+## Parameter Reference

 [[autodoc]] trainers.extractive_question_answering.params.ExtractiveQuestionAnsweringParams
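Since the trained model is pushed to the Hugging Face Hub, a quick way to sanity-check that it extracts answer spans as described is the standard `question-answering` pipeline from `transformers`. A minimal sketch; the model id is a placeholder, not something produced by this commit:

```python
from transformers import pipeline

# Placeholder model id: substitute the repo your AutoTrain run pushed to the Hub.
qa = pipeline("question-answering", model="your-username/your-extractive-qa-model")

result = qa(
    question="To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?",
    context=(
        "It is a replica of the grotto at Lourdes, France where the Virgin Mary "
        "reputedly appeared to Saint Bernadette Soubirous in 1858."
    ),
)
print(result)  # e.g. {'answer': 'Saint Bernadette Soubirous', 'start': ..., 'end': ..., 'score': ...}
```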

docs/source/tasks/llm_finetuning.mdx

Lines changed: 45 additions & 13 deletions
@@ -1,16 +1,27 @@
-# LLM Finetuning
-
-With AutoTrain, you can easily finetune large language models (LLMs) on your own data.
-You can use AutoTrain to finetune LLMs for a variety of tasks, such as text generation, text classification,
-and text summarization. You can also use AutoTrain to finetune LLMs for specific use cases, such as chatbots,
-question-answering systems, and code generation and even basic fine-tuning tasks like classic text generation.
-
-Config file task names:
-- `llm`: generic trainer
-- `llm-sft`: SFT trainer
-- `llm-reward`: Reward trainer
-- `llm-dpo`: DPO trainer
-- `llm-orpo`: ORPO trainer
+---
+title: "LLM Finetuning with AutoTrain Advanced"
+description: "Complete guide to fine-tuning Large Language Models (LLMs) using AutoTrain Advanced. Learn how to prepare data, train models, and deploy them for text generation, chatbots, and more."
+keywords: "llm finetuning, language model training, autotrain, hugging face, nlp, machine learning"
+---
+
+# LLM Finetuning with AutoTrain Advanced
+
+AutoTrain Advanced makes it easy to fine-tune large language models (LLMs) for your specific use cases. This guide covers everything you need to know about LLM fine-tuning.
+
+## Key Features
+- Simple data preparation with CSV and JSONL formats
+- Support for multiple training approaches (SFT, DPO, ORPO)
+- Built-in chat templates
+- Local and cloud training options
+- Optimized training parameters
+
+## Supported Training Methods
+AutoTrain supports multiple specialized trainers:
+- `llm`: Generic LLM trainer
+- `llm-sft`: Supervised Fine-Tuning trainer
+- `llm-reward`: Reward modeling trainer
+- `llm-dpo`: Direct Preference Optimization trainer
+- `llm-orpo`: ORPO (Odds Ratio Preference Optimization) trainer

 ## Data Preparation

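The built-in chat templates mentioned above follow the standard Hugging Face tokenizer chat-template mechanism. A small sketch of what a template renders a conversation into before it becomes training text; the model id is only an example, and the exact templating AutoTrain applies is not shown in this diff:

```python
from transformers import AutoTokenizer

# Example model id with a chat template; any chat model's tokenizer works similarly.
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

conversation = [
    {"role": "user", "content": "What does the SFT trainer do?"},
    {"role": "assistant", "content": "It fine-tunes the model on supervised prompt-response pairs."},
]

# Render the conversation into the single string the model is trained on.
print(tokenizer.apply_chat_template(conversation, tokenize=False))
```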
@@ -145,6 +156,27 @@ Chat models can be trained using the following trainers:

 The only difference between the data format for reward trainer and DPO/ORPO trainer is that the reward trainer requires only `text` and `rejected_text` columns, while the DPO/ORPO trainer requires an additional `prompt` column.

+## Best Practices for LLM Fine-tuning
+
+### Memory Optimization
+- Use appropriate `block_size` and `model_max_length` for your hardware
+- Enable mixed precision training when possible
+- Utilize PEFT techniques for large models
+
+### Data Quality
+- Clean and validate your training data
+- Ensure balanced conversation samples
+- Use appropriate chat templates
+
+### Training Tips
+- Start with small learning rates
+- Monitor training metrics using TensorBoard
+- Validate model outputs during training
+
+### Related Resources
+- [AutoTrain Documentation](https://huggingface.co/docs/autotrain)
+- [Example Fine-tuned Models](https://huggingface.co/models?pipeline_tag=text-generation&sort=downloads)
+- [Training Datasets](https://huggingface.co/datasets?task_categories=task_categories:text-generation)

 ## Training

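The column difference described above for the preference trainers can be illustrated with a couple of JSONL rows written from Python; the row contents are invented purely for illustration, and `text` holds the preferred response by the usual convention:

```python
import json

# Reward trainer rows: only `text` (preferred) and `rejected_text` are required.
reward_row = {
    "text": "Q: What is AutoTrain? A: A tool for training models without writing code.",
    "rejected_text": "Q: What is AutoTrain? A: I don't know.",
}

# DPO/ORPO trainer rows: the same two columns plus an additional `prompt` column.
dpo_row = {
    "prompt": "What is AutoTrain?",
    "text": "A tool for training models without writing code.",
    "rejected_text": "I don't know.",
}

with open("reward_train.jsonl", "w") as f:
    f.write(json.dumps(reward_row) + "\n")

with open("dpo_train.jsonl", "w") as f:
    f.write(json.dumps(dpo_row) + "\n")
```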