crawl4ai

Name: crawl4ai
Author: basher83

This skill should be used when users need to scrape websites, extract structured data, handle JavaScript-heavy pages, crawl multiple URLs, or build automated web data pipelines. Includes optimized extraction patterns with schema generation for efficient, LLM-free extraction.

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Category: DevOps

About this skill

How to use

Verify the Crawl4AI installation by running crawl4ai-doctor in a terminal. If it reports errors, run crawl4ai-setup to complete the configuration.
To fetch a single page, use the ready-made script: python scripts/basic_crawler.py https://twoja-strona.com. The script extracts the page content as markdown.
To process multiple URLs, prepare a text file listing the links (one URL per line) and run python scripts/batch_crawler.py urls.txt.
To extract structured data (for example, products from an online store), use the pipeline with automatic schema generation: python scripts/extraction_pipeline.py --generate-schema https://sklep.com "extract products".
In Python code, import AsyncWebCrawler, configure the browser behavior (headless mode, window size, timeout), and run the crawl by calling arun() with the URL.
Adjust crawl settings through CrawlerRunConfig: you can enable screenshots, remove overlay elements (popups), or change the page timeout.
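
The last steps above can be sketched in Python roughly as follows. This is a minimal sketch and not the skill's own scripts: load_urls and crawl are hypothetical helper names, skipping blank and duplicate lines is an assumption about how a URL list might be cleaned, and the BrowserConfig and CrawlerRunConfig parameter names (headless, viewport_width, viewport_height, screenshot, remove_overlay_elements, page_timeout) follow recent Crawl4AI releases as I understand them, so check them against the version you have installed.

```python
import asyncio
from pathlib import Path


def load_urls(path):
    """Read one URL per line, skipping blank lines and duplicates (an assumption)."""
    seen, urls = set(), []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        urls.append(url)
    return urls


async def crawl(urls):
    """Crawl each URL in turn and return {url: markdown or None on failure}."""
    # Imported lazily so load_urls stays usable where crawl4ai is not installed.
    from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig

    # Browser behavior: headless mode and window size.
    browser_config = BrowserConfig(
        headless=True,
        viewport_width=1280,
        viewport_height=720,
    )
    # Per-crawl settings: screenshot, popup removal, page timeout.
    run_config = CrawlerRunConfig(
        screenshot=True,
        remove_overlay_elements=True,
        page_timeout=30000,  # milliseconds
    )

    pages = {}
    async with AsyncWebCrawler(config=browser_config) as crawler:
        for url in urls:
            result = await crawler.arun(url=url, config=run_config)
            pages[url] = str(result.markdown) if result.success else None
    return pages
```

With Crawl4AI installed, something like asyncio.run(crawl(load_urls("urls.txt"))) would run the whole list. The URLs are crawled one after another here to keep the sketch short; the skill's batch script may well handle concurrency differently.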
