train-with-environments

Name: train-with-environments
Author: PrimeIntellect-ai

Train models with verifiers environments using hosted RL or prime-rl. Use when asked to configure RL runs, tune key hyperparameters, diagnose instability, set up difficulty filtering and oversampling, or create practical train and eval loops for new environments.

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Author: PrimeIntellect-ai
Category: Backend

About this skill

How to use

1. Install the environment you want to train on with prime env install [environment-name].
2. Before starting a long training run, run a canonical evaluation to verify the environment's behavior: prime eval run [environment-name] -m gpt-4.1-mini -n 20 -r 3 -s. Check that rewards show diversity at baseline.
3. Choose a training path: most users should start with Hosted Training (prime lab setup); advanced users with GPU access can consider prime-rl (prime lab setup --prime-rl).
4. Configure endpoint aliases in configs/endpoints.toml. For behavior tests, choose instruct models (gpt-4.1 series, qwen3 instruct); for tasks that require deep reasoning, choose reasoning models (gpt-5 series, qwen3 thinking).
5. Start training with a conservative run length and inspect samples early to diagnose any instability or hyperparameter problems.
6. Before launching long training runs, publish the environment to make sure it is production-ready.
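The baseline reward-diversity check in step 2 can be made concrete with a small script. The following is a minimal sketch, not part of the prime CLI: it assumes you have already collected the per-rollout rewards from the evaluation into a plain dictionary keyed by example id (a hypothetical format, as is the function name reward_diversity). It reports the fraction of examples whose repeated rollouts received different rewards. In setups that compare rollouts of the same example against each other, examples where every rollout scores the same tend to contribute little or no learning signal, so a very low fraction is worth investigating before a long run.

```python
from statistics import pstdev


def reward_diversity(rewards_by_example, min_std=1e-6):
    """Summarize baseline reward spread.

    rewards_by_example: dict mapping an example id to the list of rewards
    from its repeated rollouts (hypothetical format, assumed for this sketch).
    """
    all_rewards = [r for rs in rewards_by_example.values() for r in rs]
    if not all_rewards:
        raise ValueError("no rewards to summarize")

    # An example is "mixed" when its rollouts did not all get the same reward.
    mixed = [
        ex for ex, rs in rewards_by_example.items()
        if len(rs) > 1 and pstdev(rs) > min_std
    ]
    return {
        "num_examples": len(rewards_by_example),
        "mixed_fraction": len(mixed) / len(rewards_by_example),
        "mean_reward": sum(all_rewards) / len(all_rewards),
        "distinct_rewards": len(set(all_rewards)),
    }


# Illustrative data only: four examples with three rollouts each.
baseline = {
    "ex0": [0.0, 0.0, 0.0],  # always fails
    "ex1": [1.0, 0.0, 1.0],  # mixed outcomes
    "ex2": [1.0, 1.0, 1.0],  # always succeeds
    "ex3": [0.5, 1.0, 0.0],  # mixed outcomes
}
summary = reward_diversity(baseline)
print(summary)
```

With the illustrative data above, half of the examples have mixed outcomes. What counts as "enough" diversity depends on the environment and the training setup, so treat any threshold as a judgment call rather than a fixed rule.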
