tensorrt-llm

by davila7

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Author
davila7
Category
Data Science

About this skill

Optimizes LLM inference with NVIDIA TensorRT for maximum throughput and lowest latency. Use for production deployment on NVIDIA GPUs (A100/H100), when you need 10-100x faster inference than PyTorch, or for serving models with quantization (FP8/INT4), in-flight batching, and multi-GPU scaling.

How to use

  1. Install TensorRT-LLM, most easily via Docker (nvidia/tensorrt_llm:latest) or pip (pip install tensorrt_llm==1.2.0rc3). Requirements: CUDA 13.0.0, TensorRT 10.13.2, Python 3.10–3.12.
  2. Import the library and initialize the model: from tensorrt_llm import LLM, SamplingParams, then llm = LLM(model="meta-llama/Meta-Llama-3-8B").
  3. Configure the sampling parameters (sampling_params): set max_tokens, temperature, and top_p to suit your application.
  4. Prepare a list of prompts and call llm.generate(prompts, sampling_params) to get the model's responses.
  5. Process the results: each output carries the generated text in a .text attribute (in the LLM API, on the completion object, e.g. output.outputs[0].text).
  6. For production deployment, use trtllm-serve to run an inference server that handles concurrent requests and scales across multiple GPUs.
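
Steps 2 to 5 can be put together roughly as follows. This is a minimal sketch, not the skill's own code: the helper names sampling_kwargs and generate_texts and the default values are hypothetical, chosen only for illustration. The tensorrt_llm import is deferred into the function so the file can still be loaded on a machine without the library or a GPU, and the generated text is read from output.outputs[0].text, which is how the library's quickstart accesses it.

```python
# Hypothetical helpers illustrating steps 2-5; names and defaults are assumptions.

def sampling_kwargs(max_tokens=64, temperature=0.8, top_p=0.95):
    """Validate and collect the three sampling settings named in step 3."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    return {"max_tokens": max_tokens, "temperature": temperature, "top_p": top_p}


def generate_texts(prompts, model="meta-llama/Meta-Llama-3-8B", **overrides):
    """Load the model, run generation, and return one string per prompt."""
    # Imported lazily: this needs TensorRT-LLM installed and an NVIDIA GPU.
    from tensorrt_llm import LLM, SamplingParams

    llm = LLM(model=model)  # step 2: initialize the model
    sampling_params = SamplingParams(**sampling_kwargs(**overrides))  # step 3
    results = llm.generate(prompts, sampling_params)  # step 4
    # Step 5: each result holds its completions in .outputs; take the first.
    return [result.outputs[0].text for result in results]
```

On a machine with TensorRT-LLM installed, calling generate_texts(["Hello, my name is"], max_tokens=32) should return a list with one generated string. Keeping validation separate from the GPU call means bad settings fail before the model is loaded.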

Related skills

skill-installer

by openai

Install Codex skills into $CODEX_HOME/skills from a curated list or a GitHub repo path. Use when a user asks to list installable skills, install a curated skill, or install a skill from another repo (including private repos).

Data Science
23118

a-stock-analysis

by openclaw

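Real-time A-share quotes and intraday volume analysis. Fetches real-time prices, price changes, and trading volume for Shanghai and Shenzhen stocks, and analyzes intraday volume distribution (heavy volume at the open or close), major-player activity (accumulation and distribution signals), and limit-up order queues. Supports position management and profit/loss analysis. Use when: (1) checking real-time A-share quotes, (2) analyzing major capital flows, (3) viewing intraday volume distribution, (4) managing stock positions, (5) analyzing position profit and loss.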

Data Science
48153

moon-dev-trading-agents

by moondevonyt

Master Moon Dev's AI Agents GitHub with 48+ specialized agents, multi-exchange support, LLM abstraction, and autonomous trading capabilities across crypto markets.

Data Science
102232

xlsx

by anthropics

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2)

Data Science
40128

prompt-optimizer

by solatis

Optimize system prompts for Claude Code agents using proven prompt engineering patterns. Use when users request prompt improvement, optimization, or refinement for agent workflows, tool instructions, or system behaviors.

Data Science
15109

excalidraw

by ryanquinn3

Data Science
124204