Architecture Autopilot
CLI • NPM PACKAGE • FEB 2026

CODELENS — NEURAL AGENT.

Transforming complex repositories into navigable neural indices. CodeLens serves as a cognitive layer that enables developers to transcend traditional search and engage in meaningful dialogue with their codebase.

npm i @muhammadusmangm/codelens

Distributed as an enterprise-grade CLI tool

Polyglot: 40+ Languages Supported
Adoption: 200+ Active Live Users
Distribution: NPM/NPX (npm i @muhammadusmangm/codelens)
Memory: 768-dim Neural Vector Context

The Cognitive Bottleneck

In modern engineering, the bottleneck is no longer writing code but comprehending it. Traditional IDE search relies on keyword matching, which falls short for architectural discovery. I built CodeLens as a "neural bridge": it uses retrieval-augmented generation (RAG) to ground the AI's answers in the actual contents of a codebase, reducing hallucinations and manual grepping.

The Intelligence Pipeline

Layer 1: Structural Synthesis

CodeLens uses an AST-aware engine to parse 40+ languages. Instead of naive text splitting, it identifies functional boundaries (classes, methods, modules) to ensure each neural chunk retains logical integrity for the LLM.
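
As a minimal sketch of boundary-aware chunking (not CodeLens's actual engine), the Python snippet below splits a Python source string at top-level `def` and `class` lines so that each chunk holds one whole unit. The function name `chunk_by_boundaries` and the regex are assumptions made for illustration.

```python
import re

# Hypothetical pattern: a top-level Python definition starts at column 0.
BOUNDARY = re.compile(r"^(?:async\s+def|def|class)\s+(\w+)")


def chunk_by_boundaries(source: str) -> list[dict]:
    """Split source into chunks that each begin at a top-level def/class."""
    chunks: list[dict] = []
    current = {"name": "<module>", "lines": []}
    for line in source.splitlines():
        match = BOUNDARY.match(line)
        if match:
            # Close the previous chunk (if it has content) before opening a new one.
            if any(ln.strip() for ln in current["lines"]):
                chunks.append(current)
            current = {"name": match.group(1), "lines": []}
        current["lines"].append(line)
    if any(ln.strip() for ln in current["lines"]):
        chunks.append(current)
    return [{"name": c["name"], "text": "\n".join(c["lines"])} for c in chunks]


SAMPLE = '''import os

def load(path):
    return open(path).read()

class Index:
    def add(self, item):
        pass
'''

chunks = chunk_by_boundaries(SAMPLE)
print([c["name"] for c in chunks])
```

A naive fixed-size splitter could cut `Index.add` away from its class header; splitting at definition lines keeps the method and its enclosing class in the same chunk.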

Layer 2: Neural Mapping

Codebase data is transformed into 768-dimensional vectors. Using Qdrant's high-performance vector DB, CodeLens maps the semantic 'intent' of the code, enabling discovery through conceptual queries rather than keyword matches.
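
To illustrate what similarity search over 768-dimensional vectors looks like, here is a small in-memory stand-in written in Python. It is not the Qdrant client and not the real embedding step: `VectorIndex`, `axis`, and the file paths are hypothetical, and the toy vectors replace what an embedding model would produce.

```python
import math

DIM = 768  # vector size stated in the text


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VectorIndex:
    """Tiny in-memory stand-in for a vector database collection."""

    def __init__(self, dim: int = DIM) -> None:
        self.dim = dim
        self.points: dict[str, tuple[list[float], dict]] = {}

    def upsert(self, point_id: str, vector: list[float], payload: dict) -> None:
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self.points[point_id] = (vector, payload)

    def search(self, query: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        """Return up to top_k point ids ranked by cosine similarity."""
        scored = [(pid, cosine(query, vec)) for pid, (vec, _) in self.points.items()]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_k]


def axis(i: int, weight: float = 1.0) -> list[float]:
    """Toy vector: zero everywhere except position i (stands in for an embedding)."""
    v = [0.0] * DIM
    v[i] = weight
    return v


index = VectorIndex()
index.upsert("auth.login", axis(0), {"file": "src/auth.py"})
index.upsert("db.connect", axis(1), {"file": "src/db.py"})

# A query that points mostly along axis 0 should rank auth.login first.
query = [a + b for a, b in zip(axis(0, 0.9), axis(1, 0.1))]
print(index.search(query, top_k=1))
```

With real embeddings, the query vector would come from embedding a conceptual question, and the nearest stored vectors would be the chunks closest to it in meaning.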

Layer 3: Cognitive Retrieval

A hybrid RAG pipeline dynamically switches modes: 'Full-Context' for repositories under 80k tokens for maximum accuracy, and 'Vector-Retrieval' with top-K re-ranking for enterprise-scale architectural discovery.
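
The mode switch can be sketched in a few lines of Python. The 80k threshold comes from the description above; the four-characters-per-token estimate and the names `select_mode` and `build_context` are assumptions, and a real tokenizer would give different counts.

```python
FULL_CONTEXT_LIMIT = 80_000  # token threshold stated in the text
CHARS_PER_TOKEN = 4  # rough heuristic (assumption); real tokenizers vary


def estimate_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN


def select_mode(total_tokens: int) -> str:
    """Pick the retrieval mode from the repository's total token count."""
    return "full-context" if total_tokens < FULL_CONTEXT_LIMIT else "vector-retrieval"


def build_context(
    files: dict[str, str], ranked_chunks: list[str], top_k: int = 5
) -> tuple[str, list[str]]:
    """Return the chosen mode and the text that would be sent to the model."""
    total = sum(estimate_tokens(body) for body in files.values())
    mode = select_mode(total)
    if mode == "full-context":
        # Small repository: send every file.
        return mode, list(files.values())
    # Large repository: send only the best-ranked chunks.
    return mode, ranked_chunks[:top_k]


small_repo = {"main.py": "print('hello')\n" * 10}
print(build_context(small_repo, ranked_chunks=[])[0])
```

Here `ranked_chunks` is assumed to arrive already sorted by the vector search and re-ranking stage.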

Layer 4: Architectural Dialogue

Powered by Gemini, the final layer provides a conversational interface to the codebase. It grounds every response in the objective truths of the repository, providing direct source file anchors for every insight generated.
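
One way to get source anchors into every answer is to label each retrieved chunk with its file and line range before it reaches the model. The Python sketch below only assembles such a prompt; it does not call Gemini, and the `Chunk` type, the prompt wording, and the example path are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    path: str
    start_line: int
    end_line: int
    text: str

    @property
    def anchor(self) -> str:
        return f"{self.path}:{self.start_line}-{self.end_line}"


def build_grounded_prompt(question: str, chunks: list[Chunk]) -> str:
    """Assemble a prompt that asks the model to cite an anchor for each claim."""
    sections = [f"[{c.anchor}]\n{c.text}" for c in chunks]
    return (
        "Answer using only the sources below. "
        "Cite the [path:lines] anchor for every claim, "
        "and say so if the sources do not contain the answer.\n\n"
        + "\n\n".join(sections)
        + f"\n\nQuestion: {question}"
    )


prompt = build_grounded_prompt(
    "How does login work?",
    [Chunk("src/auth.py", 10, 12, "def login(user):\n    return check(user)")],
)
print(prompt)
```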

Technical Decisions

Zero-Dependency AST Awareness

Engineered a regex-based depth-tracking system for 40+ languages to avoid the heavy binary dependency of tree-sitter, ensuring a portable and lightweight `npx` experience.
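
A rough Python sketch of regex-based depth tracking for brace-delimited languages follows. It illustrates the general idea only, not the CodeLens implementation: the `NOISE` pattern and `split_top_level_blocks` are hypothetical, and block comments and template strings are deliberately ignored.

```python
import re

# Strip string literals and // comments so braces inside them are not counted.
NOISE = re.compile(r"""("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|//.*)""")


def split_top_level_blocks(source: str) -> list[str]:
    """Group lines into chunks that end when brace depth returns to zero."""
    blocks: list[str] = []
    current: list[str] = []
    depth = 0
    for line in source.splitlines():
        code = NOISE.sub("", line)
        # True if we are inside a block or this line opens one.
        opened = depth > 0 or "{" in code
        depth += code.count("{") - code.count("}")
        if line.strip() or current:
            current.append(line)
        if opened and depth == 0:
            blocks.append("\n".join(current))
            current = []
    # Keep any trailing top-level statements as a final chunk.
    if any(ln.strip() for ln in current):
        blocks.append("\n".join(current))
    return blocks


SAMPLE = '''function add(a, b) {
  return a + b;
}

class Stack {
  push(x) { this.items.push("}"); }
}
'''

for block in split_top_level_blocks(SAMPLE):
    print(block.splitlines()[0])
```

The `"}"` inside the string literal is removed before counting, so it does not close the `Stack` class early. A real parser such as tree-sitter handles many more cases; the trade-off named above is portability over completeness.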

Incremental Indexing

Implemented SHA-256 file-hash comparison. The system only re-embeds modified or new files, drastically reducing API latency and cost for enterprise repositories.
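
The hash comparison can be sketched with Python's standard `hashlib`. The manifest format (a path-to-digest mapping) and the `plan_reindex` name are assumptions, and dropping deleted files from the index is an extra step added here for completeness rather than something stated above.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def plan_reindex(
    files: dict[str, bytes], manifest: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare current file hashes with the stored manifest.

    Returns (paths to embed, paths to drop from the index, updated manifest).
    """
    current = {path: sha256_hex(body) for path, body in files.items()}
    # New files are missing from the manifest; modified files have a different digest.
    to_embed = sorted(p for p, digest in current.items() if manifest.get(p) != digest)
    to_drop = sorted(p for p in manifest if p not in current)
    return to_embed, to_drop, current


# First run: nothing is indexed yet, so every file is embedded.
files = {"a.py": b"print(1)", "b.py": b"print(2)"}
to_embed, to_drop, manifest = plan_reindex(files, {})
print(to_embed)

# Second run: only the changed file needs a new embedding.
files["b.py"] = b"print(3)"
to_embed, to_drop, manifest = plan_reindex(files, manifest)
print(to_embed)
```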

Lessons Learned

"The hardest part of RAG for code isn't the retrieval, it's the context management. Handling the crossover where a repository is too large for full-context but too complex for naive vector search required a sophisticated multi-stage re-ranking strategy that accounts for both semantic intent and structural hierarchy."
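
To make the idea of blending semantic intent with structural hierarchy concrete, here is a deliberately simple Python sketch of a second-stage re-ranker. The scoring formula, the weights, and the `Candidate` fields are all hypothetical; the actual strategy is not specified above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    path: str
    semantic_score: float  # e.g. cosine similarity from the vector search
    depth: int  # nesting depth of the chunk: 0 = module, 1 = class, 2 = method


def rerank(
    candidates: list[Candidate],
    focus_dir: str,
    top_k: int = 3,
    structure_weight: float = 0.2,
) -> list[Candidate]:
    """Second stage: blend the semantic score with a simple structural bonus."""

    def blended(c: Candidate) -> float:
        same_area = 1.0 if c.path.startswith(focus_dir) else 0.0
        shallow = 1.0 / (1 + c.depth)  # favour higher-level units
        structure = 0.5 * same_area + 0.5 * shallow
        return (1 - structure_weight) * c.semantic_score + structure_weight * structure

    return sorted(candidates, key=blended, reverse=True)[:top_k]


candidates = [
    Candidate("docs/notes.py", semantic_score=0.82, depth=2),
    Candidate("src/auth/login.py", semantic_score=0.80, depth=1),
]
print([c.path for c in rerank(candidates, focus_dir="src/auth/")])
```

In this example the chunk under `src/auth/` overtakes a slightly closer semantic match elsewhere, which is the kind of adjustment a structure-aware stage is meant to make.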
