SFT

Supervised fine-tuning (SFT) is a widely used method for aligning large language models (LLMs) with human preferences. It involves curating a dataset of high-quality example outputs and then fine-tuning the model on this data with a next-token prediction objective.
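The objective can be made concrete with a toy sketch. The snippet below is illustrative only: a tiny bigram logit table stands in for an LLM, and gradient descent minimizes the average next-token cross-entropy over a small curated dataset, which is exactly the SFT loss in miniature. The vocabulary, dataset, and learning rate are invented for the example.

```python
import math

# Toy SFT sketch (assumption: a bigram logit table stands in for an LLM).
# Fine-tuning minimizes next-token cross-entropy on curated sequences.

VOCAB = ["<s>", "good", "answer", "</s>"]
V = len(VOCAB)
IDX = {t: i for i, t in enumerate(VOCAB)}

# Curated "high-quality outputs" (the SFT dataset), as token sequences.
dataset = [["<s>", "good", "answer", "</s>"]] * 4

logits = [[0.0] * V for _ in range(V)]  # logits[prev][next]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    z = sum(exps)
    return [e / z for e in exps]

def nll(seq):
    """Average next-token cross-entropy over one sequence."""
    total = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        probs = softmax(logits[IDX[prev]])
        total -= math.log(probs[IDX[nxt]])
    return total / (len(seq) - 1)

def sft_step(lr=0.5):
    """One full-batch gradient step on the next-token prediction loss."""
    grad = [[0.0] * V for _ in range(V)]
    n = 0
    for seq in dataset:
        for prev, nxt in zip(seq, seq[1:]):
            probs = softmax(logits[IDX[prev]])
            # d(loss)/d(logit_j) = p_j - 1[j == target] for softmax + NLL
            for j in range(V):
                grad[IDX[prev]][j] += probs[j] - (1.0 if j == IDX[nxt] else 0.0)
            n += 1
    for i in range(V):
        for j in range(V):
            logits[i][j] -= lr * grad[i][j] / n

before = sum(nll(s) for s in dataset) / len(dataset)
for _ in range(200):
    sft_step()
after = sum(nll(s) for s in dataset) / len(dataset)
print(f"loss before: {before:.3f}, after: {after:.3f}")  # loss should drop
```

In a real setting the logit table is replaced by the LLM's forward pass and the update by an optimizer such as AdamW, but the loss being minimized is the same.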

Areas of application

  • SFT is a simple and cost-effective method for aligning LLMs with human preferences.
  • It is based on the idea of learning from examples of good model output.
  • It can be used to improve a variety of LLM behaviors, such as instruction following, helpfulness, and safety.
  • It is a common component of the three-stage LLM training pipeline, alongside pretraining and reinforcement learning from human feedback (RLHF).

  • Simple and cost-effective: SFT does not require the preference annotations or reward-model training that RLHF does, which makes it cheaper and easier to run.
  • Versatile: SFT can be used to improve a wide range of LLM behaviors.
  • Effective: SFT has been shown to improve LLM performance on a variety of downstream tasks.

Examples

  • Summarization: fine-tune on examples of concise, relevant, and informative summaries to improve the quality of LLM-generated summaries.
  • Helpfulness: fine-tune on examples of comprehensive, accurate answers so the model responds more usefully to user queries.
  • Safety: fine-tune on examples free of offensive, biased, or misleading content to discourage harmful or unsafe outputs.
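For use cases like those above, each SFT example typically pairs a prompt with a target response, and it is common practice to compute the next-token loss only on the response tokens. The sketch below shows one way to prepare such an example; the whitespace tokenizer and the `User:`/`Assistant:` template are illustrative assumptions, since real pipelines use the model's own tokenizer and chat template.

```python
# Sketch of SFT data preparation (assumptions: whitespace tokenization and
# an invented prompt/response template, for illustration only).

def build_example(prompt, response):
    """Return (tokens, loss_mask).

    The mask is 1 only on response tokens, so the next-token loss teaches
    the model to produce the response rather than to reproduce the prompt.
    """
    prompt_tokens = f"User: {prompt}\nAssistant:".split()
    response_tokens = (response + " </s>").split()
    tokens = prompt_tokens + response_tokens
    mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, mask

tokens, mask = build_example(
    "Summarize the memo.", "The memo announces a new deadline."
)
for t, m in zip(tokens, mask):
    print(m, t)
```

Masking the prompt is a design choice rather than a requirement: some setups train on the full sequence, but response-only loss keeps the model from spending capacity on modeling user text.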

Supervised fine-tuning (SFT) is a powerful tool for aligning LLMs to human preferences and making them more useful and reliable.