data-engineering-data-pipeline

Name: data-engineering-data-pipeline
Author: sickn33

You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Category: DevOps
Views: 14

GitHub repo

How to use

Load the skill in your agent or any system that supports skills. It activates automatically when you work on data pipeline architecture.
Describe your data sources, volumes, latency requirements, and target systems. Based on that, you will receive a recommended architectural pattern (ETL for transformation before loading, ELT for transformation after loading, Lambda for hybrid batch + stream solutions, Kappa for streaming-only pipelines, or Lakehouse for a unified approach).
Ask for a detailed flow design: sources → ingestion → processing → storage → data serving. The skill adds observability points and indicates where to monitor the pipeline.
For batch ingestion, you will get patterns for incremental loading with row watermarks, retry logic, schema validation, and dead-letter queues for bad records. For streaming ingestion, you will get Kafka consumers with exactly-once semantics, offset commits inside transactions, and windowing for time-based aggregations.
Use the guidance on data transformation: dbt for modeling, Spark for large volumes, Delta Lake or Iceberg for ACID transactions and versioning. The skill shows how to partition data and optimize costs.
Implement data quality frameworks (Great Expectations, dbt tests) and monitoring (CloudWatch, Prometheus, Grafana). The skill provides checklists and best practices for every stage.
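
To make the batch ingestion step more concrete, here is a minimal sketch of incremental loading with a watermark, retry logic, schema validation, and a dead-letter list. It is not taken from the skill itself: the SCHEMA mapping, the updated_at watermark field, and the function names (validate, with_retries, ingest_increment) are all hypothetical, and the source is any callable that returns rows newer than a given watermark.

```python
import time
from dataclasses import dataclass, field

# Hypothetical schema (an assumption for illustration): field name -> expected type.
SCHEMA = {"id": int, "updated_at": int, "amount": float}


@dataclass
class BatchResult:
    loaded: list = field(default_factory=list)        # rows that passed validation
    dead_letters: list = field(default_factory=list)  # rows rejected by validation
    watermark: int = 0                                # highest updated_at loaded so far


def validate(row: dict) -> bool:
    """Return True when the row has every schema field with the expected type."""
    return all(isinstance(row.get(name), typ) for name, typ in SCHEMA.items())


def with_retries(fetch, attempts: int = 3, base_delay: float = 0.0):
    """Call fetch(); retry on ConnectionError with exponential backoff.

    base_delay defaults to 0.0 so the sketch runs instantly; a real pipeline
    would likely use a non-zero delay.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)


def ingest_increment(fetch_since, last_watermark: int) -> BatchResult:
    """Load only rows newer than last_watermark; set aside invalid rows."""
    rows = with_retries(lambda: fetch_since(last_watermark))
    result = BatchResult(watermark=last_watermark)
    for row in rows:
        if not validate(row):
            result.dead_letters.append(row)
            continue
        result.loaded.append(row)
        result.watermark = max(result.watermark, row["updated_at"])
    return result
```

In this sketch the watermark advances only for rows that load successfully, so a rejected row is fetched again on the next run. Whether that is desirable, or whether rejected rows should also move the watermark forward, depends on how the dead-letter queue is reprocessed.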

Related skills

planning-with-files

by davila7

Implements Manus-style file-based planning for complex tasks. Creates task_plan.md, findings.md, and progress.md. Use when starting complex multi-step tasks, research projects, or any task requiring >5 tool calls.

DevOps

2365

macos-cleaner

by daymade

Analyze and reclaim macOS disk space through intelligent cleanup recommendations. This skill should be used when users report disk space issues, need to clean up their Mac, or want to understand what's consuming storage. Focus on safe, interactive analysis with user confirmation

DevOps

1331

miniprogram-development

by TencentCloudBase

WeChat Mini Program development rules. Use this skill when developing WeChat mini programs, integrating CloudBase capabilities, and deploying mini program projects.

DevOps

1955

grafana-dashboards

by wshobson

Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.

DevOps

92262

aws-solution-architect

by alirezarezvani

Design AWS architectures for startups using serverless patterns and IaC templates. Use when asked to design serverless architecture, create CloudFormation templates, optimize AWS costs, set up CI/CD pipelines, or migrate to AWS. Covers Lambda, API Gateway, DynamoDB, ECS, Aurora,

DevOps

1231

resolve-conflicts

by antinomyhq

Use this skill immediately when the user mentions merge conflicts that need to be resolved. Do not attempt to resolve conflicts directly - invoke this skill first. This skill specializes in providing a structured framework for merging imports, tests, lock files (regeneration),

DevOps

48163