data preparation
For LLaMA-Factory fine-tuning, here are the instructions for preparing a custom dataset.
dataset classification
alpaca
The stanford_alpaca dataset is a famous example used to fine-tune LLaMA-2 into the Alpaca model; its structure is as follows.
[
  {
    "instruction": "user instruction (required)",
    "input": "user input (optional)",
    "output": "model response (required)",
    "history": [
      ["user instruction in the first round (optional)", "model response in the first round (optional)"],
      ["user instruction in the second round (optional)", "model response in the second round (optional)"]
    ]
  }
]
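The structure above can be sketched in code. A minimal, hypothetical example that builds one Alpaca-format record and checks the required fields (the field names come from the snippet above; the validator function is an illustration, not part of LLaMA-Factory):

```python
import json

# One Alpaca-format record matching the structure above.
# "input" and "history" are optional; "instruction" and "output" are required.
record = {
    "instruction": "Translate the sentence to French.",
    "input": "Hello, world!",
    "output": "Bonjour, le monde !",
    "history": [
        ["What language should I learn first?", "French is a good starting point."],
    ],
}

def is_valid_alpaca(example: dict) -> bool:
    """Check that the required Alpaca fields are present and non-empty."""
    return bool(example.get("instruction")) and bool(example.get("output"))

# A dataset file is a JSON list of such records.
dataset = [record]
print(is_valid_alpaca(record))  # True
print(json.dumps(dataset, ensure_ascii=False, indent=2))
```

Saving this list with `json.dump` to a file under LLaMA-Factory's data directory gives a dataset in the format shown above.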
In short, the Alpaca model is obtained by fine-tuning LLaMA-2 on the stanford_alpaca dataset.
sharegpt
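The sharegpt format stores a multi-turn dialogue as a list of role/value messages rather than the flat instruction/output fields of Alpaca. As a hedged sketch (the "conversations", "from", and "value" keys and the "human"/"gpt" role tags follow common sharegpt convention and are assumptions here; LLaMA-Factory lets you remap such column names in its dataset registry), one record and a conversion to the Alpaca structure shown earlier could look like:

```python
# A sharegpt-style record: a multi-turn dialogue as role/value messages.
# Key names and role tags are assumptions based on common sharegpt convention.
sharegpt_record = {
    "conversations": [
        {"from": "human", "value": "user instruction in the first round"},
        {"from": "gpt", "value": "model response in the first round"},
        {"from": "human", "value": "user instruction in the second round"},
        {"from": "gpt", "value": "model response in the second round"},
    ]
}

def sharegpt_to_alpaca(record: dict) -> dict:
    """Illustrative conversion to the Alpaca structure shown earlier:
    earlier turns become "history"; the last human/gpt pair becomes
    "instruction"/"output"."""
    msgs = record["conversations"]
    pairs = [
        (msgs[i]["value"], msgs[i + 1]["value"])
        for i in range(0, len(msgs) - 1, 2)
    ]
    *history, (instruction, output) = pairs
    return {
        "instruction": instruction,
        "input": "",
        "output": output,
        "history": [list(p) for p in history],
    }

print(sharegpt_to_alpaca(sharegpt_record))
```

This shows the key difference between the two formats: sharegpt keeps every turn in one list, while Alpaca separates the final turn from the history.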