
Turns Codebase into Easy Tutorial with AI

🔧 Team Fork Notice: This is our team's fork adapted for API security analysis workflows. See PROJECT_README.md for team-specific usage instructions and modifications.

License: MIT

Ever stared at a new codebase written by others feeling completely lost? This tutorial shows you how to build an AI agent that analyzes GitHub repositories and creates beginner-friendly tutorials explaining exactly how the code works.

This is a tutorial project of Pocket Flow, a 100-line LLM framework. It crawls GitHub repositories and builds a knowledge base from the code. It analyzes entire codebases to identify core abstractions and how they interact, and transforms complex code into beginner-friendly tutorials with clear visualizations.

  🔸 🎉 Reached Hacker News Front Page (April 2025) with >900 upvotes: Discussion »

  🔸 🎊 Online Service Now Live! (May 2025) Try our new online version at https://code2tutorial.com/ – just paste a GitHub link, no installation needed!

⭐ Example Results for Popular GitHub Repositories!

🤯 All these tutorials are generated entirely by AI by crawling the GitHub repo!

  • AutoGen Core - Build AI teams that talk, think, and solve problems together like coworkers!

  • Browser Use - Let AI surf the web for you, clicking buttons and filling forms like a digital assistant!

  • Celery - Supercharge your app with background tasks that run while you sleep!

  • Click - Turn Python functions into slick command-line tools with just a decorator!

  • Codex - Turn plain English into working code with this AI terminal wizard!

  • Crawl4AI - Train your AI to extract exactly what matters from any website!

  • CrewAI - Assemble a dream team of AI specialists to tackle impossible problems!

  • DSPy - Build LLM apps like Lego blocks that optimize themselves!

  • FastAPI - Create APIs at lightning speed with automatic docs that clients will love!

  • Flask - Craft web apps with minimal code that scales from prototype to production!

  • Google A2A - The universal language that lets AI agents collaborate across borders!

  • LangGraph - Design AI agents as flowcharts where each step remembers what happened before!

  • LevelDB - Store data at warp speed with Google's engine that powers blockchains!

  • MCP Python SDK - Build powerful apps that communicate through an elegant protocol without sweating the details!

  • NumPy Core - Master the engine behind data science that makes Python as fast as C!

  • OpenManus - Build AI agents with digital brains that think, learn, and use tools just like humans do!

  • PocketFlow - 100-line LLM framework. Let Agents build Agents!

  • Pydantic Core - Validate data at rocket speed with just Python type hints!

  • Requests - Talk to the internet in Python with code so simple it feels like cheating!

  • SmolaAgents - Build tiny AI agents that punch way above their weight class!

  • Showcase Your AI-Generated Tutorials in Discussions!

🚀 Getting Started

  1. Clone this repository

    git clone https://github.com/The-Pocket/PocketFlow-Tutorial-Codebase-Knowledge
  2. Install dependencies:

    pip install -r requirements.txt
  3. Set up LLM in utils/call_llm.py by providing credentials. The project as configured uses:

    Option A: Anthropic Claude (Recommended for best results)

    # Use Anthropic Claude Sonnet 4 with extended thinking
    def call_llm(prompt, use_cache: bool = True):
        import os
        from anthropic import Anthropic
        client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY", "your-api-key"))
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=20000,
            temperature=1,
            system="succinct, professional, friendly, solution-oriented tone",
            messages=[{"role": "user", "content": prompt}],
            thinking={"type": "enabled", "budget_tokens": 18000}
        )
        # content[0] is the thinking block; content[1] holds the text reply
        return response.content[1].text

    Set your API key: export ANTHROPIC_API_KEY="your-anthropic-api-key"

    Option B: Google Gemini (Default)

    import os
    from google import genai  # pip install google-genai

    client = genai.Client(
        api_key=os.getenv("GEMINI_API_KEY", "your-api-key"),
    )

    Get your AI Studio key and set: export GEMINI_API_KEY="your-gemini-api-key"

    Option C: OpenAI

    # Use OpenAI o1 (uncomment the OpenAI section in utils/call_llm.py)

    Set: export OPENAI_API_KEY="your-openai-api-key"

We highly recommend the latest models with thinking capabilities (e.g., Claude Sonnet 4 with extended thinking, or OpenAI o1).

You can verify your setup by running:

python utils/call_llm.py
  4. Generate a complete codebase tutorial by running the main script:

    # Analyze a GitHub repository
    python main.py --repo https://github.com/username/repo --include "*.py" "*.js" --exclude "tests/*" --max-size 50000
    
    # Or, analyze a local directory
    python main.py --dir /path/to/your/codebase --include "*.py" --exclude "*test*"
    
    # Or, generate a tutorial in Chinese
    python main.py --repo https://github.com/username/repo --language "Chinese"
    • --repo or --dir - Specify either a GitHub repo URL or a local directory path (required, mutually exclusive)
    • -n, --name - Project name (optional, derived from URL/directory if omitted)
    • -t, --token - GitHub token (or set GITHUB_TOKEN environment variable)
    • -o, --output - Output directory (default: ./output)
    • -i, --include - Files to include (e.g., "*.py" "*.js")
    • -e, --exclude - Files to exclude (e.g., "tests/*" "docs/*")
    • -s, --max-size - Maximum file size in bytes (default: 100KB)
    • --language - Language for the generated tutorial (default: "english")
    • --max-abstractions - Maximum number of abstractions to identify (default: 10)
    • --no-cache - Disable LLM response caching (default: caching enabled)
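As a rough illustration of how these flags fit together, here is a hedged sketch of an equivalent argparse declaration. The flag names mirror the list above, but the help strings and exact defaults are assumptions for illustration, not the project's actual code:

```python
# Illustrative sketch of main.py's command-line interface (assumed, not verbatim).
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate a codebase tutorial")
    # --repo and --dir are mutually exclusive, and one of them is required
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--repo", help="GitHub repository URL")
    source.add_argument("--dir", help="Local directory path")
    parser.add_argument("-n", "--name", help="Project name (derived from URL/dir if omitted)")
    parser.add_argument("-t", "--token", help="GitHub token (or set GITHUB_TOKEN)")
    parser.add_argument("-o", "--output", default="./output", help="Output directory")
    parser.add_argument("-i", "--include", nargs="+", help='Patterns to include, e.g. "*.py"')
    parser.add_argument("-e", "--exclude", nargs="+", help='Patterns to exclude, e.g. "tests/*"')
    parser.add_argument("-s", "--max-size", type=int, default=100_000,
                        help="Maximum file size in bytes")
    parser.add_argument("--language", default="english", help="Tutorial language")
    parser.add_argument("--max-abstractions", type=int, default=10,
                        help="Maximum number of abstractions to identify")
    parser.add_argument("--no-cache", action="store_true",
                        help="Disable LLM response caching")
    return parser

args = build_parser().parse_args(
    ["--repo", "https://github.com/username/repo", "--include", "*.py", "*.js"]
)
print(args.output)
```

Passing both `--repo` and `--dir` in this sketch raises a usage error, which matches the "mutually exclusive" note above.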

The application will crawl the repository, analyze the codebase structure, generate tutorial content in the specified language, and save the output in the specified directory (default: ./output).

📖 Viewing Generated Tutorials

After generating tutorials, you have several options to view them:

Option 1: Simple HTTP Server (Quick & Easy)

Navigate to your tutorials directory and start a local server:

cd ./tutorials/YourProjectName/
python -m http.server 8543

Then open http://localhost:8543 in your browser. Note that Mermaid diagrams will not render with this method.

Option 2: Docsify (Professional Documentation Site)

For a more polished documentation experience:

  1. Install Docsify (requires Node.js):

    npm install docsify-cli --save-dev
  2. Initialize Docsify in your tutorials directory:

    npx docsify-cli init ./tutorials
  3. Serve the documentation:

    npx docsify-cli serve ./tutorials

    This creates a beautiful, searchable documentation site at http://localhost:3000

Option 3: Static File Viewing

You can also directly open the index.md file in any Markdown viewer or IDE that supports Markdown preview.

💡 Pro Tip: Development Integration

The generated tutorials provide an invaluable current-state blueprint of your codebase that's perfect for:

  • Onboarding new developers to understand the project structure
  • Planning new features by understanding existing abstractions
  • Code reviews by having a high-level architectural overview
  • Documentation that stays current with your codebase evolution

🎯 Understanding the 100-Line LLM Framework

PocketFlow uses a simple but powerful abstraction: Graph + Shared Store

  • Nodes: Individual tasks (LLM calls, web search, data processing)
  • Flows: Connect nodes with labeled transitions ("Actions")
  • Shared Store: Dictionary for data communication between nodes
  • Advanced Features: Batch processing, async operations, parallel execution

This minimalist approach (just 100 lines!) provides maximum flexibility without vendor lock-in, making it perfect for building complex AI workflows that are easy to understand and modify.
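To make the Graph + Shared Store idea concrete, here is a self-contained toy sketch of the pattern. This is not PocketFlow's actual API (see the real framework's repository for that); it only illustrates nodes communicating through a shared dict and transitioning via labeled actions:

```python
# Toy sketch of the Graph + Shared Store pattern (illustrative, not PocketFlow's API).
class Node:
    def run(self, shared: dict) -> str:
        """Do one task, read/write the shared store, return an action label."""
        raise NotImplementedError

class Flow:
    def __init__(self, start: Node):
        self.start = start
        self.edges: dict[tuple[Node, str], Node] = {}  # (node, action) -> next node

    def connect(self, src: Node, action: str, dst: Node) -> None:
        self.edges[(src, action)] = dst

    def run(self, shared: dict) -> None:
        node = self.start
        while node is not None:
            action = node.run(shared)
            # Stop when no transition is registered for the returned action
            node = self.edges.get((node, action))

class Fetch(Node):
    def run(self, shared):
        shared["code"] = "def add(a, b): return a + b"  # pretend we crawled a repo
        return "analyze"

class Analyze(Node):
    def run(self, shared):
        shared["summary"] = f"{len(shared['code'])} chars of code"
        return "done"  # no edge for "done", so the flow ends here

fetch, analyze = Fetch(), Analyze()
flow = Flow(start=fetch)
flow.connect(fetch, "analyze", analyze)

shared = {}          # the shared store: a plain dict passed between nodes
flow.run(shared)
print(shared["summary"])
```

The design choice to route everything through one dict keeps nodes decoupled: any node can be swapped or re-wired by changing edges, without touching the others.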

🔧 Advanced Configuration

Increasing Analysis Depth

You can analyze more core abstractions by increasing the limit:

--max-abstractions 20  # Default is 10, you can go higher

Comprehensive File Analysis

For thorough codebase analysis, include multiple file types:

--include "*.py" "*.js" "*.ts" "*.md" "*.yml" "*.yaml" "*.json" "*.toml" "*.rs" "*.go"

Excluding Unnecessary Files

Optimize processing by excluding build artifacts and test files:

--exclude "*test*" "*tests/*" "*htmlcov/*" "*scripts/*" "*design/*" "*roadmap/*" "*.venv/*" "*__pycache__*" "*.pyc" "*dist/*" "*build/*" ".github/*" "*logs/*" "*.git/*"
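For intuition, glob patterns like these can be evaluated with Python's standard fnmatch module. This is an illustrative sketch of the include/exclude semantics; the project's actual filtering logic may differ:

```python
# Sketch of include/exclude glob filtering (illustrative; fnmatch's '*' also
# matches '/' so patterns like "*test*" cover nested paths).
from fnmatch import fnmatch

def keep(path: str, include: list[str], exclude: list[str]) -> bool:
    # With no include patterns, everything is included by default
    included = any(fnmatch(path, pat) for pat in include) if include else True
    excluded = any(fnmatch(path, pat) for pat in exclude)
    return included and not excluded

files = ["src/main.py", "tests/test_main.py", "build/lib.py", "README.md"]
kept = [f for f in files if keep(f, ["*.py", "*.md"], ["*test*", "*build/*"])]
print(kept)  # ['src/main.py', 'README.md']
```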
🐳 Running with Docker

To run this project in a Docker container, you'll need to pass your API keys as environment variables.

  1. Build the Docker image

    docker build -t pocketflow-app .
  2. Run the container

    You'll need to provide your API key for the LLM to function. If you're analyzing private GitHub repositories or want to avoid rate limits, also provide your GITHUB_TOKEN.

    Mount a local directory to /app/output inside the container to access the generated tutorials on your host machine.

    Example for analyzing a public GitHub repository:

    docker run -it --rm \
      -e ANTHROPIC_API_KEY="YOUR_ANTHROPIC_API_KEY_HERE" \
      -v "$(pwd)/output_tutorials":/app/output \
      pocketflow-app --repo https://github.com/username/repo

    Example for analyzing a local directory:

    docker run -it --rm \
      -e ANTHROPIC_API_KEY="YOUR_ANTHROPIC_API_KEY_HERE" \
      -v "/path/to/your/local_codebase":/app/code_to_analyze \
      -v "$(pwd)/output_tutorials":/app/output \
      pocketflow-app --dir /app/code_to_analyze

💡 Development Tutorial

  • I built this using Agentic Coding, the fastest development paradigm, where humans design and agents code.

  • The secret weapon is Pocket Flow, a 100-line LLM framework that lets agents (e.g., Cursor AI) build for you.

  • Check out the step-by-step YouTube development tutorial.



About

Pocket Flow: Codebase to Knowledge
