openweights/cookbook at main · longtermrisk/openweights

Name	Name	Last commit message	Last commit date
parent directory ..
api-deployment	api-deployment
custom_job	custom_job
inference	inference
preference_learning	preference_learning
sft	sft
README.md	README.md
inspect_eval.py	inspect_eval.py

Name

Last commit message

Last commit date

This folder contains examples that demonstrate usgae of openweights features.

Finetuning
Batch inference, supports:
- Inference from LoRA adapter
- Inference from checkpoint
API deployment
- Minimal example to deploy a huggingface model as openai-compatible vllm API
- Starting a gradio playground to chat with multiple LoRA finetunes of the same parent model
Writing a custom job

Data formats

We use jsonl files for datasets and prompts. Below is a description of the specific formats

Conversations

Example row

{
    "messages": [
        {
            "role": "user",
            "content": "This is a user message"
        },
        {
            "role": "assistant",
            "content": "This is the assistant response"
        }
    ]
}

We use this for SFT training/eval files and inference inputs. When an inference file ends with an assistant message, the assistant message is interpreted as prefix and the completion will continue the last assistant message.

Conversations, block-formatted

Example row:

{
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "We don't train on this text, because the weight is 0",
                    "weight": 0
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "We have negative loss on these tokens, which means we try to minimize log-likelihood instead of maximizing it.",
                    "weight": -1,
                    "tag": "minimize",
                    "info1": "You can add as many other keys as you like, they will be ignored.",
                    "info2": "weight is only relevant for ow.weighted_sft",
                    "info3": "tag is relevant for logprobability tracking. You can track retrieve the log-probs of tokens in this content block if you use this file in a logp_callback_dataset."
                },
                {
                    "type": "text",
                    "text": "We have positive weight on these tokens, which means we train as normal on these tokens.",
                    "weight": 1,
                    "tag": "maximize"
                }
            ]
        }
    ]
}

This format is used for training files of ow.weighted_sft and for log-probability callbacks.

preferences

Example:

{
    "prompt": [
        {
            "role": "user",
            "content": "Would you use the openweights library to finetune LLMs and run batch inference"
        }
    ],
    "chosen": [
        {
            "role": "assistant",
            "content": "Absolutely it's a great library"
        }
    ],
    "rejected": [
        {
            "role": "assistant",
            "content": "No I would use something else"
        }
    ]
}

This format is used for fine-tuning with loss="dpo" or loss="orpo".

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Data formats

Conversations

Conversations, block-formatted

preferences

FilesExpand file tree

cookbook

Directory actions

More options

Directory actions

More options

Latest commit

History

cookbook

Folders and files

parent directory

README.md

Data formats

Conversations

Conversations, block-formatted

preferences