torchforge-rl-training

Name: torchforge-rl-training
Author: davila7

Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library that separates infrastructure from algorithms. Use when you want clean RL abstractions, easy algorithm experimentation, or scalable training with Monarch and TorchTitan.

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Category: Security

How to use

1. Install the dependencies: make sure PyTorch ≥2.9.0, TorchTitan ≥0.2.0, vLLM, and Monarch are available in your environment.
2. Define your loss function and reward model. torchforge provides built-in implementations of GRPO, DAPO, CISPO, GSPO, and SAPO that you can use directly or adapt.
3. Write the algorithm code in the application layer (Your Code). torchforge handles the infrastructure, so you focus on the RL logic. An algorithm can be implemented in about 100 lines of code.
4. Configure scaling: if you train on a single GPU, run directly; for multiple GPUs, use Monarch for automatic actor management and TorchTitan for model parallelism.
5. Monitor training. torchforge automatically synchronizes weights between nodes via TorchStore, and vLLM handles inference. You do not need to manage inter-process communication manually.
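To make the loss and reward step more concrete, here is a minimal sketch of the group-relative advantage computation that GRPO-style methods are built around. It is written in plain Python and does not use torchforge's own API; the function names `exact_match_reward` and `group_relative_advantages`, the `eps` constant, and the choice of population standard deviation are illustrative assumptions, not the library's implementation.

```python
from statistics import fmean, pstdev


def exact_match_reward(response: str, target: str) -> float:
    """Hypothetical rule-based reward: 1.0 if the answer matches, else 0.0."""
    return 1.0 if response.strip() == target.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group of responses to the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    Population std is an assumption here; implementations differ.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# One prompt ("What is 2 + 2?") with four sampled responses.
responses = ["4", "5", " 4 ", "22"]
rewards = [exact_match_reward(r, "4") for r in responses]
advantages = group_relative_advantages(rewards)

for response, reward, adv in zip(responses, rewards, advantages):
    print(f"{response!r:>6}  reward={reward:.1f}  advantage={adv:+.3f}")
```

Responses that beat the group average get a positive advantage and the rest get a negative one, so no separate value network is needed. In a real run these advantages would weight the policy-gradient loss over the response tokens.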

Related skills

feishu-docs

by openclaw

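Feishu Docs (Docx) API skill. Use it to create, read, update, and delete Feishu documents. Supports Markdown/HTML content conversion and document permission management.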

Security

1574

llama-cpp

by zechenzhangAGI

Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization (1.5-8 bit) for reduced memory and 4-10× speedup vs PyTorch on CPU.

Security

11252

software-security

by project-codeguard

A software security skill that integrates with Project CodeGuard to help AI coding agents write secure code and prevent common vulnerabilities. Use this skill when writing, reviewing, or modifying code to ensure secure-by-default practices are followed.

Security

1678

academic-researcher

by Shubhamsaboo

Academic research assistant for literature reviews, paper analysis, and scholarly writing. Use when: reviewing academic papers, conducting literature reviews, writing research summaries, analyzing methodologies, formatting citations, or when user mentions academic research,

Security

1260

google-analytics

by davila7

Analyze Google Analytics data, review website performance metrics, identify traffic patterns, and suggest data-driven improvements. Use when the user asks about analytics, website metrics, traffic analysis, conversion rates, user behavior, or performance optimization.

Security

1260

1password

by openclaw

Set up and use 1Password CLI (op). Use when installing the CLI, enabling desktop app integration, signing in (single or multi-account), or reading/injecting/running secrets via op.

Security

1174