ray-data

Name: ray-data
Author: davila7

Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Category: Security

How to use

Install Ray Data with pip install -U 'ray[data]', along with the required dependencies (pyarrow, pandas).
Load data from storage: use ray.data.read_parquet() to read Parquet files from local disk or S3, or pick another format (CSV, JSON) depending on your data.
Define data transformations with map_batches(): pass a function that processes batches of data, for example lowercasing text or normalizing images. Ray executes transformations lazily, without loading the entire dataset into memory.
Iterate over the processed data with iter_batches() and a chosen batch size; each iteration returns a ready-to-use batch.
To scale across multiple machines, combine Ray Data with Ray Train: create a dataset, configure a ScalingConfig with the number of workers and GPU usage, then pass the dataset to TorchTrainer or another Ray Train trainer.
Monitor processing: Ray automatically manages the distribution of work across available resources (CPU, GPU) and cluster nodes.
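
The steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's own code: the "text" column, the helper names (lowercase_text, run_pipeline, build_trainer), the batch sizes, and the worker count are all assumptions, and an in-memory dataset stands in for ray.data.read_parquet() so no files are needed. It assumes a recent Ray 2.x release; build_trainer additionally assumes PyTorch is installed.

```python
import numpy as np


def lowercase_text(batch: dict) -> dict:
    """Lowercase the hypothetical "text" column of one batch.

    With batch_format="numpy", Ray Data passes each batch as a dict
    mapping column names to NumPy arrays.
    """
    batch["text"] = np.array([str(s).lower() for s in batch["text"]])
    return batch


def run_pipeline() -> list:
    """Load, transform, and iterate over a tiny dataset."""
    # Imported here so lowercase_text can be tested without Ray installed.
    import ray

    # In-memory stand-in for ray.data.read_parquet(<your path>).
    ds = ray.data.from_items([{"text": "Hello"}, {"text": "RAY Data"}])

    # Lazy: this only records the transformation in the plan.
    ds = ds.map_batches(lowercase_text, batch_format="numpy")

    # Execution starts when batches are consumed.
    rows = []
    for batch in ds.iter_batches(batch_size=2, batch_format="numpy"):
        rows.extend(batch["text"].tolist())
    return rows


def build_trainer(train_ds):
    """Attach a dataset to a Ray Train TorchTrainer (illustrative settings)."""
    from ray import train
    from ray.train import ScalingConfig
    from ray.train.torch import TorchTrainer

    def train_loop_per_worker():
        # Each worker reads its own shard of the "train" dataset.
        shard = train.get_dataset_shard("train")
        for _batch in shard.iter_batches(batch_size=32):
            pass  # placeholder: a real loop would run a training step here

    return TorchTrainer(
        train_loop_per_worker,
        # Assumed values: two CPU workers; set use_gpu=True on GPU clusters.
        scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
        datasets={"train": train_ds},
    )
```

Calling run_pipeline() starts a local Ray instance if none is running and returns the lowercased strings. build_trainer(ds).fit() would launch the distributed loop, which here only iterates over batches.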
