AgentLens
Interpretable safety framework for steering coding agent behavior via LLM internal representations
About
AgentLens is a research framework for fine-grained safety control of multi-turn coding agents. It analyzes and steers an LLM's internal representations through mechanistic subspaces. Unlike traditional external guardrails, it operates on the model's internal layers, providing interpretable behavioral control during agent execution. It is designed for researchers and developers working on AI safety and agent alignment in coding contexts.
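As a rough illustration of what steering a representation along a subspace can mean, here is a minimal sketch of one common, generic approach: measure how strongly a hidden-state vector points along a single direction, then shift the vector so that component takes a chosen value. This is not AgentLens's API. The function names, the toy vectors, and the one-dimensional "unsafe" direction are all hypothetical and chosen only for illustration.

```python
import math


def project_onto_direction(hidden, direction):
    """Return (coefficient, unit_direction) for `hidden` along `direction`.

    Hypothetical helper: `direction` stands in for one basis vector of a
    behavior-related subspace; it is normalized to unit length first.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    coeff = sum(h * u for h, u in zip(hidden, unit))
    return coeff, unit


def steer(hidden, direction, target_coeff=0.0):
    """Shift `hidden` so its component along `direction` equals `target_coeff`.

    With target_coeff=0.0 this removes the component entirely; all other
    components of the vector are left unchanged.
    """
    coeff, unit = project_onto_direction(hidden, direction)
    return [h + (target_coeff - coeff) * u for h, u in zip(hidden, unit)]


# Toy 3-dimensional "hidden state" and an assumed direction of interest.
hidden_state = [3.0, 4.0, 0.0]
unsafe_direction = [0.0, 2.0, 0.0]

coeff, _ = project_onto_direction(hidden_state, unsafe_direction)
steered = steer(hidden_state, unsafe_direction)
```

In this toy case the measured coefficient doubles as an interpretable readout (how far the state lies along the direction), and the steering step changes only that one component.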
Quick Info
- Organization: Research Team
- Pricing: open-source
- Free Tier: Yes
- Updated: Jun 23, 2026
Also in Dev Tools
Crawl4AI
Open-source web crawler optimized for LLMs and AI agents — 62K+ stars
Firecrawl
Web scraping API built for LLMs — turn any website into LLM-ready data — 89K+ stars
Headroom Context Optimization
Reduce LLM API costs by 50-90% through advanced context compression