Reverse Engineering Tools for Large Language Models

Julian Wergieluk

RevLLM is a Python library and a Streamlit web app for exploring the internal workings of Large Language Models (LLMs).

Sponsored by the Federal Ministry of Education and Research

(funding reference: 01IS23S42)

RevLLM builds on top of the nanoGPT implementation of the GPT-2 model family developed by Andrej Karpathy and adheres to its spirit of simplicity and transparency. We restrict the dependencies to a bare minimum and strive for clean and simple code that can be easily understood and reused.

The RevLLM library implements various methods for analyzing the internal data flow of decoder-type transformer models. The individual analysis methods are described in the sections below.

To facilitate ease of use and provide a hassle-free experimentation experience, we accompany the library with an interactive Streamlit app that exposes the library functionality through a web interface. The app automatically downloads and instantiates a chosen model from the GPT-2 family using the Hugging Face model repository and makes the RevLLM library methods available through a convenient interface.

Logit Lens
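The logit-lens technique decodes the intermediate residual-stream activations of each layer as if they were the final hidden state, by passing them through the model's final layer norm and unembedding matrix. The following is a minimal numpy sketch of the idea with random toy weights; all shapes and names are illustrative and do not reflect RevLLM's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 8, 20   # toy dimensions; GPT-2 uses 768 and 50257
n_layers = 3

# Toy residual-stream states after each layer, and a toy unembedding matrix.
hidden_states = [rng.normal(size=d_model) for _ in range(n_layers)]
W_U = rng.normal(size=(d_model, vocab))

def layer_norm(x: np.ndarray) -> np.ndarray:
    # Simplified final layer norm (no learned scale/bias).
    return (x - x.mean()) / np.sqrt(x.var() + 1e-5)

# Logit lens: decode every intermediate state with the final unembedding.
for i, h in enumerate(hidden_states):
    logits = layer_norm(h) @ W_U
    print(f"layer {i}: top token id = {logits.argmax()}")
```

Tracking how the top-ranked token changes from layer to layer shows where in the network the model "settles" on its prediction.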

Prompt Importance Analysis

Self-Attention Analysis
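Self-attention analysis inspects the attention-weight matrices that a decoder computes at every layer and head: softmax(QK^T / sqrt(d)) with a causal mask, so that each position only attends to earlier positions. A self-contained toy sketch of a single head (random inputs, illustrative sizes only):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_head = 5, 4   # toy sizes; illustrative only

Q = rng.normal(size=(seq_len, d_head))
K = rng.normal(size=(seq_len, d_head))
V = rng.normal(size=(seq_len, d_head))

# Scaled dot-product scores with the causal mask used by decoder models:
# position i may only attend to positions j <= i.
scores = Q @ K.T / np.sqrt(d_head)
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores[mask] = -np.inf

# Row-wise softmax yields the attention weights one would visualize.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
output = weights @ V

print(weights.round(2))  # each row sums to 1; upper triangle is 0
```

Heatmaps of such weight matrices reveal which prompt tokens each position draws information from.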

Tokenizer Analysis

GPT-2 maintains a fixed vocabulary of 50,257 tokens. The model uses the Byte Pair Encoding (BPE) algorithm to split any given input text into a token sequence. This token sequence is mapped to a sequence of integers that is consumed by the model.
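The core of BPE is a list of learned merge rules that are applied greedily to an initial character (for GPT-2, byte) sequence. A toy sketch of the merge step; the merge rules and example text here are made up for illustration, whereas GPT-2 ships roughly 50k merges learned from its training corpus:

```python
def bpe_tokenize(text: str, merges: list[tuple[str, str]]) -> list[str]:
    """Apply a fixed list of BPE merge rules to a character sequence.
    (Toy illustration -- GPT-2 operates on bytes, not characters.)"""
    tokens = list(text)
    for a, b in merges:
        merged, i = [], 0
        while i < len(tokens):
            # Replace each adjacent pair (a, b) with the merged token a+b.
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

# Hypothetical merge rules, normally learned from pair frequencies.
merges = [("l", "o"), ("lo", "w")]
print(bpe_tokenize("lower", merges))  # ['low', 'e', 'r']
```

Each resulting token string is then looked up in the vocabulary to produce the integer ids the model consumes.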

Embedding Matrix Statistics and Visualization

Generation with Top-k Sampling and Temperature
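Top-k sampling with temperature works in two steps: the logits are divided by a temperature (values below 1 sharpen the distribution, values above 1 flatten it), and the next token is then sampled from the k highest-scoring candidates only. A minimal numpy sketch of the procedure; the function name and signature are illustrative, not RevLLM's actual API:

```python
import numpy as np

def sample_top_k(logits: np.ndarray, k: int, temperature: float,
                 rng: np.random.Generator) -> int:
    """Sample a token id from the k most likely logits after
    temperature scaling (toy sketch)."""
    scaled = logits / temperature            # <1 sharpens, >1 flattens
    top = np.argsort(scaled)[-k:]            # indices of the k largest logits
    probs = np.exp(scaled[top] - scaled[top].max())
    probs /= probs.sum()                     # softmax over the top-k only
    return int(rng.choice(top, p=probs))

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, -1.0])
token = sample_top_k(logits, k=2, temperature=0.8, rng=rng)
print(token)  # always 0 or 1: only the two largest logits survive the cutoff
```

During generation this step is repeated autoregressively: the sampled token is appended to the prompt and fed back into the model.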

Copyright © 2024 Julian Wergieluk