hugging-face-model-trainer

Author: patchy631


Installation

Pick a client and clone the repository into its skills directory.


Quick info

Category: Security

About this skill

Use this skill when users want to train or fine-tune language models with TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. It covers the SFT, DPO, GRPO, and reward modeling training methods, plus GGUF conversion for local deployment. It includes guidance on the TRL Jobs package, UV scripts in PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Invoke it for tasks involving cloud GPU training or GGUF conversion, or when users mention training on Hugging Face Jobs without a local GPU setup.

How to use

Install the skill in your agent or in Claude by adding a reference to hugging-face-model-trainer from the ai-engineering-hub repository.
Prepare your dataset in a format supported by TRL (for example, instructions for SFT or preferences for DPO). Verify the data structure and make sure it contains the required fields (text, instruction, response, or preferences).
Choose the training method that fits your needs: SFT for standard instruction fine-tuning, DPO for aligning the model on preference data, GRPO for online RL training, or Reward Modeling for training reward models for RLHF.
Configure the training script with the TRL Jobs package, using the UV script format with PEP 723 metadata. Specify the base model, training parameters, GPU hardware type, and estimated cost budget.
Authenticate with the Hugging Face Hub and set up monitoring with Trackio to track training progress in real time.
When training finishes, the model is automatically saved to the Hugging Face Hub. To use the model locally, convert it to GGUF format for Ollama, LM Studio, or llama.cpp.
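
The dataset check in the second step can be sketched in plain Python before any GPU time is spent. The field sets below are a simplified assumption based on the dataset layouts TRL documents (TRL may accept further variants, so confirm against the version you install), and `REQUIRED_FIELDS` and `validate_rows` are hypothetical names introduced only for this illustration.

```python
# Accepted column sets per training method. This mapping is an assumption
# for illustration; confirm it against the TRL documentation you target.
REQUIRED_FIELDS = {
    "sft": [{"text"}, {"messages"}, {"prompt", "completion"}],
    "dpo": [{"prompt", "chosen", "rejected"}],
    "grpo": [{"prompt"}],
    "reward": [{"chosen", "rejected"}],
}


def validate_rows(rows, method):
    """Return a list of problems found; an empty list means the rows pass."""
    accepted = REQUIRED_FIELDS[method]
    problems = []
    for index, row in enumerate(rows):
        # Treat missing, None, and empty values alike as "not present".
        present = {
            key for key, value in row.items()
            if value not in (None, "", [])
        }
        # A row passes if it fully contains at least one accepted column set.
        if not any(required <= present for required in accepted):
            options = " or ".join(str(sorted(r)) for r in accepted)
            problems.append(
                f"row {index}: expected {options}, found {sorted(present)}"
            )
    return problems


sample = [
    {"prompt": "Hi", "chosen": "Hello!", "rejected": "Go away"},
    {"prompt": "Hi", "chosen": "Hello!"},  # missing "rejected"
]
for problem in validate_rows(sample, "dpo"):
    print(problem)
```

Running a check like this on a small sample of rows catches a missing or empty column locally, which is cheaper than discovering it after a cloud job has already started.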
