stable-baselines3

Name: stable-baselines3
Author: K-Dense-AI

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Category: Security
Views: 9

About this skill

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with a scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.

How to use

Install the Stable Baselines3 library together with its dependencies (PyTorch, Gymnasium). Make sure you have Python 3.7+; recent releases require a newer minimum Python version, so check the requirements of the release you install.
Create a training environment with Gymnasium. You can use a predefined environment (e.g. CartPole-v1) or build your own by implementing the required interface.
Initialize the agent model by choosing a suitable algorithm (PPO for general tasks, SAC/TD3 for continuous control, DQN for discrete actions). Pass the environment and the policy type (e.g. MlpPolicy).
Train the agent with the learn() method, specifying total_timesteps. Note that actual training may run past this value, because experience is collected in whole batches.
Save the trained model with save(). The replay buffer is not included in the saved file, which keeps the file small.
Load the model for evaluation or further training with the class-level load() method, passing the environment. You can then test the agent on new tasks or continue training.

Related skills

brand-voice

by anthropics

Apply and enforce brand voice, style guide, and messaging pillars across content. Use when reviewing content for brand consistency, documenting a brand voice, adapting tone for different audiences, or checking terminology and style guide compliance.

Security

48158

youtube-watcher

by openclaw

Fetch and read transcripts from YouTube videos. Use when you need to summarize a video, answer questions about its content, or extract information from it.

Security

2231

solidity-security

by wshobson

Master smart contract security best practices to prevent common vulnerabilities and implement secure Solidity patterns. Use when writing smart contracts, auditing existing contracts, or implementing security measures for blockchain applications.

Security

10105

gmail-manager

by jeffvincent

Manage Gmail - send, read, search emails, manage labels and drafts. Use when user wants to interact with their Gmail account for email operations.

Security

17128

windows-ui-automation

by martinholovsky

Security

10115

google-analytics

by davila7

Analyze Google Analytics data, review website performance metrics, identify traffic patterns, and suggest data-driven improvements. Use when the user asks about analytics, website metrics, traffic analysis, conversion rates, user behavior, or performance optimization.

Security

1260