transformer-lens-interpretability

Name: transformer-lens-interpretability
Author: davila7

Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when reverse-engineering model algorithms, studying attention patterns, or performing activation patching.

Installation

Pick a client and clone the repository into its skills directory.

Quick info

Category: Security

How to use

Install TransformerLens with pip install transformer-lens. To work with the latest version from source, use pip install git+https://github.com/TransformerLensOrg/TransformerLens.
Import HookedTransformer, the main class that wraps transformer models and exposes HookPoints on every activation. It is the entry point for all interpretability research.
Choose your research goal: to reverse-engineer algorithms learned during training, use activation patching and causal tracing. If you are interested in attention patterns and information flow, focus on analyzing attention patterns.
Use HookPoints to inspect the model's intermediate activations. You can cache activations and manipulate them to understand which parts of the network are responsible for specific behaviors.
Run circuit analysis experiments: analyze circuits such as induction heads or the IOI circuit to discover how the model processes information at a mechanistic level.
If you work with architectures other than transformers, consider alternatives: nnsight or pyvene for more general approaches, SAELens for Sparse Autoencoders, or nnsight with NDIF for remote execution on large models.
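
The caching, attention-pattern, and activation-patching steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's own code: the model name "gpt2", the two IOI-style prompts, the layer and position defaults, and the helper names resid_pre_name, patch_position, and patching_sketch are all assumptions chosen for the example. The two prompts are assumed to tokenize to the same length, which patching by position requires.

```python
from functools import partial


def resid_pre_name(layer: int) -> str:
    """HookPoint name for the residual stream entering block `layer`."""
    return f"blocks.{layer}.hook_resid_pre"


def patch_position(activation, hook, clean_cache, pos):
    """Forward hook: overwrite one token position with the clean run's activation."""
    activation[:, pos, :] = clean_cache[hook.name][:, pos, :]
    return activation


def patching_sketch(model_name="gpt2", layer=6, pos=-1):
    """Hypothetical end-to-end sketch: cache a clean run, then patch a corrupted run."""
    # Imported lazily so the helpers above can be used without loading the library.
    from transformer_lens import HookedTransformer

    model = HookedTransformer.from_pretrained(model_name)

    # Illustrative prompts (an assumption of this sketch); they differ in one name.
    clean = model.to_tokens("When John and Mary went to the store, John gave a drink to")
    corrupted = model.to_tokens("When John and Mary went to the store, Mary gave a drink to")

    # Run the clean prompt and cache every intermediate activation.
    _, clean_cache = model.run_with_cache(clean)

    # Attention pattern of layer 0, shaped [batch, head, query_pos, key_pos].
    pattern = clean_cache["pattern", 0]

    # Re-run the corrupted prompt, patching in the clean residual stream
    # at a single layer and token position.
    hook_fn = partial(patch_position, clean_cache=clean_cache, pos=pos)
    patched_logits = model.run_with_hooks(
        corrupted,
        fwd_hooks=[(resid_pre_name(layer), hook_fn)],
    )
    return pattern, patched_logits
```

Calling patching_sketch() downloads the model weights on first use. Comparing patched_logits against the logits of an unpatched corrupted run, for example on the difference between the " Mary" and " John" logits at the final position, gives one rough measure of how much that layer and position matter for the behavior.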
