# ExecuTorch Finetuning Example

In this tutorial, we show how to fine-tune an LLM using ExecuTorch.

## Pre-requisites

You will need a model checkpoint in the Hugging Face format. For example:

```
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
```

You will need to install [torchtune](https://github.com/pytorch/torchtune) following [its installation instructions](https://github.com/pytorch/torchtune?tab=readme-ov-file#installation).
## Config Files

As mentioned in the previous section, we use `torchtune` APIs internally, and thus our config files follow `torchtune`'s structure. In the following sections we walk through a working example, which can be found in the `phi3_config.yaml` config file.

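To make that structure concrete, here is a minimal sketch of loading this config and inspecting a few of its nodes. It assumes `omegaconf` (a `torchtune` dependency) is available; the printed values simply mirror the YAML shown in the sections below.

```
# Minimal sketch: load the torchtune-style YAML config used in this example.
# Assumes omegaconf is installed (it ships as a torchtune dependency).
from omegaconf import OmegaConf

cfg = OmegaConf.load("phi3_config.yaml")

# Each `_component_` entry names the torchtune builder/class to instantiate.
print(cfg.tokenizer["_component_"])  # torchtune.models.phi3.phi3_mini_tokenizer
print(cfg.model["_component_"])      # torchtune.models.phi3.lora_phi3_mini
print(cfg.batch_size, cfg.device, cfg.dtype)
```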
### Tokenizer

We need to define the tokenizer. Let's suppose we would like to use the [Phi-3 Mini Instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) model from Microsoft. We need to define the tokenizer component:

```
tokenizer:
  _component_: torchtune.models.phi3.phi3_mini_tokenizer
  path: /tmp/Phi-3-mini-4k-instruct/tokenizer.model
  max_seq_len: 1024
```

This loads the tokenizer and sets the maximum sequence length to 1024. The class that will be instantiated is [`Phi3MiniTokenizer`](https://github.com/pytorch/torchtune/blob/ee343e61804f9942b2bd48243552bf17b5d0d553/torchtune/models/phi3/_tokenizer.py#L30).

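For reference, the component can be instantiated directly from this config node with `torchtune`'s config utilities. A minimal sketch, assuming the checkpoint from the pre-requisites was cloned to `/tmp/Phi-3-mini-4k-instruct` so that the `path` above resolves:

```
from omegaconf import OmegaConf
from torchtune import config

cfg = OmegaConf.load("phi3_config.yaml")

# Builds a Phi3MiniTokenizer from the `_component_` entry and its kwargs.
tokenizer = config.instantiate(cfg.tokenizer)

# Token IDs for a short prompt.
print(tokenizer.encode("Hello, world!"))
```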
### Dataset

In this example we use the [Alpaca-Cleaned dataset](https://huggingface.co/datasets/yahma/alpaca-cleaned). We need to define the following parameters:

```
dataset:
  _component_: torchtune.datasets.alpaca_cleaned_dataset
seed: null
shuffle: True
batch_size: 1
```

Torchtune supports datasets that use Hugging Face dataloaders, so custom datasets can also be defined. For examples of defining your own datasets, review the [torchtune docs](https://pytorch.org/torchtune/stable/tutorials/datasets.html#hugging-face-datasets).

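As a rough illustration of what this component produces, the sketch below instantiates the dataset with the tokenizer from the previous section and looks at one tokenized sample. The `tokens`/`labels` keys follow torchtune's SFT dataset convention; treat the exact keys as an assumption.

```
from omegaconf import OmegaConf
from torchtune import config

cfg = OmegaConf.load("phi3_config.yaml")
tokenizer = config.instantiate(cfg.tokenizer)

# The dataset builder needs the tokenizer to turn prompt/response text into token IDs.
ds = config.instantiate(cfg.dataset, tokenizer=tokenizer)

sample = ds[0]
print(len(sample["tokens"]), len(sample["labels"]))
```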
### Loss

For the loss function, we use PyTorch losses. In this example we use the `CrossEntropyLoss`:

```
loss:
  _component_: torch.nn.CrossEntropyLoss
```
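The config only names the loss class; during training it is applied in the usual causal-LM way, comparing the logits at each position against the next token. A minimal, self-contained sketch with made-up shapes (the label shift and the `-100` ignore index follow the common convention, not anything specific to this recipe):

```
import torch
import torch.nn as nn

# -100 is the conventional "ignore" label for prompt/padding positions.
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

batch_size, seq_len, vocab_size = 1, 16, 32064  # illustrative shapes only
logits = torch.randn(batch_size, seq_len, vocab_size)
labels = torch.randint(0, vocab_size, (batch_size, seq_len))

# Next-token prediction: logits at position t are scored against the label at t+1.
loss = loss_fn(
    logits[:, :-1, :].reshape(-1, vocab_size),
    labels[:, 1:].reshape(-1),
)
print(loss.item())
```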
### Model

Model parameters can be set here; in this example we replicate the configuration for the Phi-3 mini instruct benchmarks:

```
model:
  _component_: torchtune.models.phi3.lora_phi3_mini
  lora_attn_modules: ['q_proj', 'v_proj']
  apply_lora_to_mlp: False
  apply_lora_to_output: False
  lora_rank: 8
  lora_alpha: 16
```
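This builds Phi-3 mini with LoRA adapters of rank 8 (alpha 16) on the `q_proj` and `v_proj` attention projections, so only a small fraction of the parameters needs to be trained. Below is a sketch of instantiating it and freezing everything except the adapters, using torchtune's PEFT helpers; treat the exact import path as an assumption, as it may differ between torchtune versions.

```
from omegaconf import OmegaConf
from torchtune import config
from torchtune.modules.peft import get_adapter_params, set_trainable_params

cfg = OmegaConf.load("phi3_config.yaml")

# Phi-3 mini with LoRA adapters on q_proj/v_proj (rank 8, alpha 16).
model = config.instantiate(cfg.model)

# Freeze the base weights; leave only the LoRA A/B matrices trainable.
set_trainable_params(model, get_adapter_params(model))

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} / {total:,}")
```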
### Checkpointer

Depending on how your model is defined, you will need to instantiate different components. In this example we use checkpoints in the Hugging Face format, so we need to instantiate a `FullModelHFCheckpointer` object. We need to pass the checkpoint directory, the files with the tensors, the output directory for training, and the model type:

```
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/Phi-3-mini-4k-instruct
  checkpoint_files: [
    model-00001-of-00002.safetensors,
    model-00002-of-00002.safetensors
  ]
  recipe_checkpoint: null
  output_dir: /tmp/Phi-3-mini-4k-instruct/
  model_type: PHI3_MINI
```
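For reference, this is roughly how the checkpointer populates the model before training. It is a sketch assuming torchtune's usual `"model"` key in the returned state dict, with `strict=False` because the freshly added LoRA adapter weights are not present in the base checkpoint:

```
from omegaconf import OmegaConf
from torchtune import config

cfg = OmegaConf.load("phi3_config.yaml")

# Reads the safetensors shards and converts them to torchtune's state-dict format.
checkpointer = config.instantiate(cfg.checkpointer)
ckpt = checkpointer.load_checkpoint()

model = config.instantiate(cfg.model)
# strict=False: the LoRA adapter weights are new and not part of the base checkpoint.
missing, unexpected = model.load_state_dict(ckpt["model"], strict=False)
print(f"{len(missing)} missing keys (expected: the LoRA adapters)")
```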
### Device

Torchtune supports `cuda` devices and `bf16` tensors. However, for ExecuTorch training we only support `cpu` and `fp32`:

```
device: cpu
dtype: fp32
```

## Running the example

### Step 1: Generate the ExecuTorch PTE (checkpoint)

The `model_exporter.py` script exports the LLM checkpoint into an ExecuTorch `.pte` file. It takes two parameters:

* `cfg`: Configuration file
* `output_file`: The `.pte` output path

```
python model_exporter.py --cfg=phi3_config.yaml --output_file=phi3_mini_lora.pte
```
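For context, the sketch below shows the generic ExecuTorch export flow (`torch.export` → Edge dialect → `.pte`) on a toy module. This is only an illustration of the serialization path, not the actual logic in `model_exporter.py`, which also has to capture the training-specific graph; `Toy` and `toy.pte` are made up for the example.

```
import torch
from executorch.exir import to_edge

class Toy(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x)

# Capture the graph, convert it to the Edge dialect, then serialize to a .pte buffer.
exported = torch.export.export(Toy(), (torch.randn(1, 8),))
et_program = to_edge(exported).to_executorch()

with open("toy.pte", "wb") as f:
    f.write(et_program.buffer)
```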
### Step 2: Run the fine-tuning job

To run the fine-tuning job:

```
python runner.py --cfg=phi3_config.yaml --model_file=phi3_mini_lora.pte
```

You need to use **the same** config file as in the previous step. The `model_file` argument is the `.pte` model generated in Step 1.

Example output:

```
Evaluating the model before training...
100%|██████████████████████████████████████████████████████████████████████████████████████| 3/3 [31:23<00:00, 627.98s/it]
Eval loss:  tensor(2.3778)
100%|██████████████████████████████████████████████████████████████████████████████████████| 5/5 [52:29<00:00, 629.84s/it]
Losses:  [2.7152762413024902, 0.7890686988830566, 2.249271869659424, 1.4777560234069824, 0.8378427624702454]
100%|██████████████████████████████████████████████████████████████████████████████████████| 3/3 [30:35<00:00, 611.90s/it]
Eval loss:  tensor(0.8464)
```