Embedding index for semantic search #14

Open
opened 2026-04-07 20:23:21 +00:00 by austin · 0 comments
Owner

Implement semantic code search using fastembed.

  • Embed code chunks (functions, classes) and their docstrings/comments at index time
  • Embed the user's task description at query time
  • Retrieve the top-K most semantically relevant code chunks for a given query
  • Complement the AST-based symbol lookup: AST finds by name/structure, embeddings find by meaning
  • Store embeddings on disk alongside the AST index (under .localcode/)
  • Incremental updates: re-embed only changed symbols
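The query-time retrieval described above could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `cosine`, `top_k`, and `toy_embed` are hypothetical names, and `toy_embed` is a word-count stand-in for a real embedding model so the example runs without any dependencies.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Vector, index: Dict[str, Vector], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k (symbol, score) pairs most similar to the query, best first."""
    scored = [(symbol, cosine(query_vec, vec)) for symbol, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real model: counts a few fixed words."""
    vocab = ["parse", "file", "token", "http", "request"]
    words = text.lower().split()
    return [float(words.count(word)) for word in vocab]


# Index time: embed each chunk (here, a docstring-like summary per symbol).
index = {
    "parse_file": toy_embed("parse file token"),
    "send_request": toy_embed("http request"),
}

# Query time: embed the task description and rank the stored chunks.
results = top_k(toy_embed("parse the file"), index, k=1)
```

A brute-force scan like this is likely sufficient for a single repository; an approximate nearest-neighbour structure would only matter at much larger chunk counts.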

Use nomic-embed-text or an equivalent small embedding model via fastembed. The model runs on CPU and should be fast enough for interactive use.
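Since the model call is the expensive step, the incremental-update requirement might be handled by hashing each symbol's source and re-embedding only on a hash mismatch. The sketch below makes several assumptions: `update_index` and `content_hash` are hypothetical names, the JSON layout and the `embeddings.json` file name under `.localcode/` are illustrative only, and `embed_fn` stands in for whatever wraps the embedding model (for example, a thin wrapper around fastembed).

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, List, Sequence


def content_hash(source: str) -> str:
    """Stable fingerprint of a symbol's source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def update_index(
    symbols: Dict[str, str],
    index_path: Path,
    embed_fn: Callable[[str], Sequence[float]],
) -> List[str]:
    """Re-embed only new or changed symbols; return the names that were re-embedded."""
    # Load the previous index if one exists (assumed JSON layout, not a fixed format).
    index = json.loads(index_path.read_text()) if index_path.exists() else {}

    changed = []
    for name, source in symbols.items():
        digest = content_hash(source)
        entry = index.get(name)
        if entry is None or entry["hash"] != digest:
            index[name] = {"hash": digest, "vector": list(embed_fn(source))}
            changed.append(name)

    # Drop entries for symbols that no longer exist in the codebase.
    for name in set(index) - set(symbols):
        del index[name]

    index_path.parent.mkdir(parents=True, exist_ok=True)
    index_path.write_text(json.dumps(index))
    return changed
```

Calling `update_index(symbols, Path(".localcode") / "embeddings.json", embed_fn)` after each AST re-index would then touch the model only for symbols whose source text changed.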

austin added this to the Context Assembly milestone 2026-04-07 20:29:38 +00:00
Reference
austin/localcode#14