llava

Name: llava
Author: zechenzhangAGI

LLaVA (Large Language and Vision Assistant) enables visual instruction tuning and image-based conversations. It combines a CLIP vision encoder with Vicuna/LLaMA language models and supports multi-turn image chat, visual question answering, and instruction following. Use for vision-language

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Category: Security
Views: 112

About this skill

How to use

Clone the LLaVA repository from GitHub with git clone and change into the project directory.
Install the package together with its dependencies (transformers, torch, pillow) by running pip install -e . in the repository root.
Load a pretrained model, for example llava-v1.5-7b, with the load_pretrained_model function from the llava.model.builder module, passing the path to the model as the model_path parameter.
Prepare the image for analysis: load it with the PIL library (Image.open) and preprocess it with the process_images function from llava.mm_utils.
Write a question or instruction about the image, then pass the image and the text to the model. The model returns a response that describes the image content or answers your question.
You can hold a multi-turn conversation by asking follow-up questions about the same image. The model retains the context of its earlier answers.
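
The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the repository's reference script. build_prompt, load and ask are hypothetical helper names introduced here. The "<image>" placeholder and the USER/ASSISTANT layout are a simplified stand-in for the templates in llava.conversation (the system prompt is omitted), and the generate arguments follow the call pattern of the repository's example scripts as recalled here, which may differ between LLaVA releases. The LLaVA imports are deferred into the functions so the prompt helper can be tried without the package installed.

```python
from typing import List, Tuple

# Assumption: LLaVA marks the image position in the prompt with this
# placeholder (DEFAULT_IMAGE_TOKEN in llava.constants).
IMAGE_TOKEN = "<image>"


def build_prompt(history: List[Tuple[str, str]], question: str) -> str:
    """Hypothetical helper: join earlier (question, answer) turns and the new
    question into one Vicuna-style prompt. Simplified; prefer the templates
    in llava.conversation for real use."""
    turns = []
    for i, (user, assistant) in enumerate(history):
        # Only the first user turn carries the image placeholder.
        prefix = IMAGE_TOKEN + "\n" if i == 0 else ""
        turns.append(f"USER: {prefix}{user} ASSISTANT: {assistant}</s>")
    prefix = IMAGE_TOKEN + "\n" if not history else ""
    turns.append(f"USER: {prefix}{question} ASSISTANT:")
    return "".join(turns)


def load(model_path: str):
    """Load tokenizer, model and image processor once and reuse them."""
    from llava.mm_utils import get_model_name_from_path
    from llava.model.builder import load_pretrained_model

    tokenizer, model, image_processor, _context_len = load_pretrained_model(
        model_path=model_path,
        model_base=None,
        model_name=get_model_name_from_path(model_path),
    )
    return tokenizer, model, image_processor


def ask(tokenizer, model, image_processor, image_path: str,
        history: List[Tuple[str, str]], question: str,
        max_new_tokens: int = 256) -> str:
    """Answer one question about the image, given the earlier turns."""
    import torch
    from PIL import Image
    from llava.constants import IMAGE_TOKEN_INDEX
    from llava.mm_utils import process_images, tokenizer_image_token

    image = Image.open(image_path).convert("RGB")
    image_tensor = process_images([image], image_processor, model.config)
    image_tensor = image_tensor.to(model.device, dtype=torch.float16)

    input_ids = tokenizer_image_token(
        build_prompt(history, question), tokenizer,
        IMAGE_TOKEN_INDEX, return_tensors="pt",
    ).unsqueeze(0).to(model.device)

    with torch.inference_mode():
        # Assumption: image_sizes is accepted by recent LLaVA releases;
        # older releases may need this argument removed.
        output_ids = model.generate(
            input_ids,
            images=image_tensor,
            image_sizes=[image.size],
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
```

For a multi-turn chat, call load once, then append each (question, answer) pair to history before the next call to ask, so that follow-up questions about the same image see the earlier exchange. The model path (for example a local checkout of llava-v1.5-7b) and the image path are left to the caller.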
