CODELENS — NEURAL AGENT.
An AI-powered search engine that instantly maps massive codebases. CodeLens slashes developer onboarding time by 80% and eliminates the hours wasted searching for undocumented architecture.
npm i @muhammadusmangm/codelens
Distributed as an enterprise-grade CLI tool
The Onboarding Bottleneck
The Discovery Pipeline
Layer 1: Structural Synthesis
CodeLens uses an AST-aware engine to parse 40+ languages. Instead of naive text splitting, it identifies functional boundaries (classes, methods, modules) to ensure each neural chunk retains logical integrity for the LLM.
Layer 2: Neural Mapping
Codebase data is transformed into 768-dimensional vectors. Using Qdrant's high-performance vector DB, CodeLens maps the semantic 'intent' of the code, enabling discovery through conceptual queries rather than keyword matches.
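To make "conceptual queries" concrete, here is a minimal sketch of the comparison a vector database performs at scale: ranking stored chunk vectors by cosine similarity to a query vector. The `IndexedChunk` shape and the `topK` helper are hypothetical names used for illustration, not CodeLens internals, and the vectors can be any length (the text above uses 768 dimensions).

```typescript
// Hypothetical shape for an indexed chunk; not CodeLens's actual schema.
interface IndexedChunk {
  file: string;
  vector: number[];
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose vectors are most similar to the query vector.
function topK(query: number[], chunks: IndexedChunk[], k: number): IndexedChunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k);
}
```

Because the ranking depends on vector direction rather than shared keywords, a query about "login" can surface a chunk in an authentication module even if the word never appears in it, provided the embedding model places the two close together.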
Layer 3: Cognitive Retrieval
A hybrid RAG pipeline switches modes dynamically: 'Full-Context' mode for repositories under 80k tokens, where the whole codebase fits in the prompt and accuracy is highest, and 'Vector-Retrieval' mode with top-K re-ranking for enterprise-scale architectural discovery.
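The mode switch can be sketched as a simple threshold check. This is an illustration under stated assumptions: the 80k limit comes from the text above, but the token estimate (roughly four characters per token) and the names `estimateTokens` and `chooseMode` are hypothetical; a real implementation would likely use the model's own tokenizer.

```typescript
type RetrievalMode = "full-context" | "vector-retrieval";

// Threshold taken from the description above.
const FULL_CONTEXT_TOKEN_LIMIT = 80000;

// Assumption: a rough heuristic of about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Pick full-context mode only when the whole repository fits under the limit.
function chooseMode(fileContents: string[]): RetrievalMode {
  const totalTokens = fileContents.reduce((sum, content) => sum + estimateTokens(content), 0);
  return totalTokens < FULL_CONTEXT_TOKEN_LIMIT ? "full-context" : "vector-retrieval";
}
```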
Layer 4: Architectural Dialogue
Powered by Gemini, the final layer provides a conversational interface to the codebase. It grounds every response in the objective truths of the repository, providing direct source file anchors for every insight generated.
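One common way to ground answers in source files is to number each retrieved chunk, label it with its file path and line range, and instruct the model to cite those labels. The sketch below only assembles the prompt text; it makes no model call, and the `RetrievedChunk` shape and `buildGroundedPrompt` function are hypothetical names, not the CodeLens implementation.

```typescript
// Hypothetical shape for a retrieved chunk with its location in the repository.
interface RetrievedChunk {
  file: string;
  startLine: number;
  endLine: number;
  text: string;
}

// Build a prompt in which every source is numbered and anchored to a file location.
function buildGroundedPrompt(question: string, chunks: RetrievedChunk[]): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.file}:${c.startLine}-${c.endLine}\n${c.text}`)
    .join("\n\n");
  return [
    "Answer using only the numbered sources below.",
    "Cite sources as [n]. If the sources do not contain the answer, say so.",
    "",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

The numbered labels let the interface map each citation in the answer back to a concrete file and line range.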
Technical Implementation
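// AST-Aware Semantic Chunking Engine
import fastGlob from 'fast-glob';
import { promises as fs } from 'fs';
import { v4 as uuid } from 'uuid';
// Chunker, model, qdrant, and COLLECTION_NAME are project-level objects defined elsewhere.

export async function ingest(repoPath: string) {
  // fast-glob returns files only by default, so directories are skipped
  const files = await fastGlob(`${repoPath}/**/*`);
  for (const file of files) {
    const code = await fs.readFile(file, 'utf-8');
    const chunks = Chunker.synthesize(code, {
      strategy: "ast-boundary", // Preserves function/class integrity
      maxItems: 800
    });
    if (chunks.length === 0) continue; // Skip files that produce no chunks

    // Neural Vectorization
    const embeddings = await model.embed(chunks);
    await qdrant.upsert(COLLECTION_NAME, {
      points: chunks.map((c, i) => ({
        id: uuid(), vector: embeddings[i], payload: c
      }))
    });
  }
}
Technical Decisions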
Zero-Dependency AST Awareness
Engineered a regex-based depth-tracking system for 40+ languages to avoid the heavy binary dependency of tree-sitter, ensuring a portable and lightweight `npx` experience.
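As a rough illustration of regex-based depth tracking, the sketch below splits source text into top-level blocks by counting curly braces line by line. It is a simplified stand-in, not the CodeLens chunker: the function name `chunkByBraceDepth` is hypothetical, it only handles brace-delimited languages, and it also counts braces that appear inside strings and comments, which a production version would need to skip.

```typescript
// Splits source text into top-level blocks by tracking curly-brace depth.
// Simplification: braces inside strings and comments are counted too.
function chunkByBraceDepth(source: string): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let depth = 0;

  for (const line of source.split("\n")) {
    current.push(line);
    const opens = (line.match(/\{/g) || []).length;
    const closes = (line.match(/\}/g) || []).length;
    const depthBefore = depth;
    depth = Math.max(0, depth + opens - closes);

    // A block ends on the line where the depth returns to zero.
    const closedBlock = depth === 0 && (depthBefore > 0 || opens > 0);
    if (closedBlock) {
      const text = current.join("\n").trim();
      if (text) chunks.push(text);
      current = [];
    }
  }

  // Keep any trailing lines that never opened a block (for example, loose statements).
  const rest = current.join("\n").trim();
  if (rest) chunks.push(rest);
  return chunks;
}
```

Lines that precede a block without opening one, such as imports or decorators, stay attached to the block that follows them, which keeps each chunk readable on its own.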
Incremental Indexing
Implemented SHA-256 file-hash comparison. The system only re-embeds modified or new files, drastically reducing API latency and cost for enterprise repositories.
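The hash comparison can be sketched with Node's built-in `crypto` module. This is a minimal example under assumptions: the `HashIndex` type and the `diffFiles` helper are hypothetical, and file contents are passed in as strings so the logic stays independent of where the previous hashes are stored.

```typescript
import { createHash } from "node:crypto";

// Maps a file path to the hex SHA-256 of the content that was last indexed.
type HashIndex = Record<string, string>;

function sha256(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Returns the paths that need re-embedding, plus the updated hash index to persist.
function diffFiles(
  files: Record<string, string>,
  previous: HashIndex
): { changed: string[]; next: HashIndex } {
  const changed: string[] = [];
  const next: HashIndex = {};
  for (const [path, content] of Object.entries(files)) {
    const hash = sha256(content);
    next[path] = hash;
    // New files have no previous hash; modified files have a different one.
    if (previous[path] !== hash) changed.push(path);
  }
  return { changed, next };
}
```

Only the paths in `changed` would be sent to the embedding model. Files missing from `next` but present in `previous` have been deleted, so their vectors could be removed from the index in the same pass.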
Lessons Learned
"Building tools for enterprise engineers requires zero friction. The real challenge wasn't AI retrieval; it was building a system fast enough and accurate enough that senior engineers actually wanted to use it instead of grep. Performance and accuracy are the only metrics that matter."