TECHNIQUE
Retrieval & Grounding
Across the pool, embedding engineering is used less as standalone search and more as a vector/similarity stage inside hybrid retrieval, ranking, filtering, and serving pipelines.
Encode domain objects (work artifacts, jobs, files, images, structured entities, and journey keywords) into embeddings or dense vectors for retrieval, matching, Q&A, recommendations, or clustering.
6 of 7 operators
Back embeddings with vector databases, vector indexes, FAISS, large-scale indexing, or KV-backed serving paths for fast retrieval and downstream ranking.
4 of 7 operators
Do not leave embedding or semantic-similarity output as the final decision: combine it with filters, rankers, LLM reranking/answering, classifiers, clustering, diversifiers, or human review.
7 of 7 operators
Engineer the input context before embedding or retrieval: convert files to raw text and chunks, extract keywords with metadata, or classify intent and fetch external user/profile data instead of embedding only the raw query.
3 of 7 operators
Select, evaluate, or adapt embedding models for domain quality, latency, and cost rather than treating the embedding model as fixed.
4 of 7 operators
Add production mechanics around embeddings where freshness, reuse, or serving latency matters: real-time vector database updates, cached plugin states, nearline embedding publication to key-value stores, and daily incremental inference.
4 of 7 operators
Every observed operator uses embedding or semantic-similarity work as one component in a larger pipeline, with downstream retrieval, ranking, filtering, LLM, UI, serving, or review stages visible in the teardown.
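The patterns above describe a staged pipeline rather than a single similarity lookup. The sketch below is one minimal way to picture those stages in plain Python: chunk the input, apply a strict metadata filter, shortlist by vector similarity, then rerank. Everything in it is an assumption made for illustration: `toy_embed` is a bag-of-words stand-in for a real embedding model, `rerank_score` is a hypothetical second-stage ranker, and the `source` field is an invented metadata key. None of it is taken from a specific operator.

```python
import math
from collections import Counter

def chunk_text(text, max_words=8):
    """Split raw text into fixed-size word chunks (a deliberately naive chunker)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text):
    """Stand-in for a real embedding model: a sparse bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rerank_score(query, chunk, similarity):
    """Hypothetical reranker: similarity plus a bonus for an exact phrase match."""
    bonus = 0.5 if query.lower() in chunk.lower() else 0.0
    return similarity + bonus

def retrieve(query, docs, allowed_source, shortlist_size=3):
    query_vec = toy_embed(query)
    # Stage 1: strict metadata filter, then chunk the surviving documents.
    chunks = [c for d in docs if d["source"] == allowed_source for c in chunk_text(d["text"])]
    # Stage 2: vector similarity produces a shortlist.
    scored = [(cosine(query_vec, toy_embed(c)), c) for c in chunks]
    shortlist = sorted(scored, key=lambda pair: pair[0], reverse=True)[:shortlist_size]
    # Stage 3: a downstream ranker makes the final ordering decision.
    reranked = [(rerank_score(query, c, sim), c) for sim, c in shortlist]
    return sorted(reranked, key=lambda pair: pair[0], reverse=True)

docs = [
    {"source": "wiki", "text": "reset your password from the account settings page"},
    {"source": "wiki", "text": "billing invoices are emailed every month"},
    {"source": "chat", "text": "password reset is broken again lol"},
]
results = retrieve("reset password", docs, allowed_source="wiki")
```

In a real system the similarity stage would typically be served from a vector index instead of a linear scan, and the reranker might be a learned model or an LLM; the sketch only shows that the similarity score is an intermediate signal, not the final decision.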
What gets embedded or compared differs by product domain and granularity.
APPROACH 01
Text/work/search artifacts, jobs, files, security context, or structured entity descriptions are embedded for semantic retrieval, Q&A, or recommendations.
APPROACH 02
Images are represented as high-dimensional image embeddings for reverse image search.
APPROACH 03
Keywords extracted from user activity sources are embedded and hierarchically clustered into journey candidates.
APPROACH 04
Generated review comments are compared with a semantic similarity filter to merge overlapping suggestions.
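As an illustration of the last approach, a semantic similarity filter over generated comments can be sketched as a greedy pass that keeps a comment only when it is not too close to one already kept. This is a sketch under stated assumptions: `toy_embed` is a bag-of-words stand-in for a real embedding model, the `0.6` threshold is arbitrary, and the source does not describe how the operator actually merges suggestions.

```python
import math
from collections import Counter

def toy_embed(text):
    """Stand-in for a real embedding model: a sparse bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def merge_overlapping(comments, threshold=0.6):
    """Greedy filter: drop a comment if it is too similar to one already kept."""
    kept = []
    for comment in comments:
        vec = toy_embed(comment)
        if all(cosine(vec, toy_embed(k)) < threshold for k in kept):
            kept.append(comment)
    return kept

comments = [
    "consider renaming this variable for clarity",
    "consider renaming this variable for readability",
    "missing null check before dereference",
]
unique = merge_overlapping(comments)  # the second comment is dropped as a near-duplicate
```

The greedy pass depends on input order, so a production version would probably sort comments by a quality score first so that the stronger of two overlapping suggestions survives.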
The post-embedding relevance layer is not standardized.
APPROACH 01
Hybrid retrieval combines traditional fields or strict filters with neural semantic signals, then ranking layers refine results.
APPROACH 02
Vector similarity produces a shortlist and an LLM reranks or answers from the retrieved chunks.
APPROACH 03
Embedding search suggestions are shown to designers and can proceed through human review before republishing.
APPROACH 04
Keyword embedding clusters become journey candidates, then ranking and diversification determine what is surfaced.
APPROACH 05
Semantic similarity is part of post-processing for generated code-review comments, alongside quality scoring and category suppression.
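To make the "ranking and diversification" step in Approach 04 concrete, the sketch below uses maximal marginal relevance (MMR), a common diversification heuristic. The source does not say which diversifier any operator uses, so MMR is swapped in purely for illustration; the candidate names, the two-dimensional vectors, and the `trade_off` value are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr(query_vec, candidates, k=2, trade_off=0.5):
    """Pick k items, trading relevance to the query against redundancy with earlier picks."""
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(name):
            relevance = cosine(query_vec, remaining[name])
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max((cosine(remaining[name], candidates[s]) for s in selected), default=0.0)
            return trade_off * relevance - (1 - trade_off) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected

# Hypothetical journey candidates with toy 2-d embeddings.
candidates = {
    "trip-to-paris": [1.0, 0.1],
    "paris-hotels": [0.9, 0.2],
    "home-renovation": [0.3, 0.9],
}
picked = mmr([1.0, 0.3], candidates, k=2)
```

With these toy vectors the second slot goes to `home-renovation` instead of the near-duplicate `trip-to-paris`, which is the behavior a diversifier is meant to add on top of plain similarity ranking.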
Model strategy ranges from pretrained/off-the-shelf embedding models to domain adaptation and fine-tuning.
APPROACH 01
Use or select pretrained/off-the-shelf embedding models for the task.
APPROACH 02
Fine-tune or domain-adapt representation models for work-shaped or proprietary data.
APPROACH 03
Report strong results without customizing the original embedding approach.
Cost, latency, and compute pressure materially shape embedding architectures and model choices: operators cite high computational costs, quality/latency/cost balance, time-and-cost reasons for vector database choices, cost efficiency, and added LLM-query latency.
Raw embeddings or raw vector similarity can be insufficient for nuanced or domain-specific retrieval; operators add semantic query understanding, hybrid retrieval, or LLM-assisted ranking, and Grab reports dependence on data quality, complexity, use case, and query patterns.
Evaluation remains a practical constraint: operators use manual nearest-neighbor inspection, LLM-based evaluation or labels, and Grab reports experiments limited to small synthetic datasets with limited queries.
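A small harness can support both kinds of evaluation mentioned here: printing nearest neighbors for manual inspection, and scoring retrieval against a handful of labels. The sketch below assumes embeddings are already computed; the item names, vectors, and labels are invented toy data, and recall@k is used as one common metric, not one the source attributes to any operator.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest_neighbors(query_vec, index, k=2):
    """Return the names of the k most similar items, best first, for manual inspection."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]), reverse=True)
    return ranked[:k]

def recall_at_k(labelled_queries, index, k=2):
    """Fraction of labelled queries whose expected item appears in the top k."""
    hits = sum(1 for vec, expected in labelled_queries
               if expected in nearest_neighbors(vec, index, k))
    return hits / len(labelled_queries)

# Hypothetical index of job embeddings and a tiny labelled query set.
index = {
    "data-engineer": [0.9, 0.1, 0.0],
    "ml-engineer": [0.7, 0.6, 0.1],
    "barista": [0.0, 0.1, 0.9],
}
labelled_queries = [
    ([0.8, 0.3, 0.0], "data-engineer"),
    ([0.1, 0.0, 1.0], "barista"),
    ([0.6, 0.7, 0.0], "data-engineer"),  # expected item is only ranked second
]
recall_1 = recall_at_k(labelled_queries, index, k=1)
recall_2 = recall_at_k(labelled_queries, index, k=2)
```

As the section notes, numbers from a small synthetic set like this say little about production query patterns, so they are best read alongside manual inspection of the neighbor lists.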
For simpler queries, raw similarity search may still be the efficient option; Grab explicitly warns not to assume the LLM-assisted vector approach is always preferable when computational efficiency matters.
| Name | Kind | When to use | Maturity |
|---|---|---|---|
| text-embedding-3-large | service | managed general-purpose embeddings with dimension truncation | commodity |
| bge-m3 | library | self-hosted multilingual or hybrid dense/sparse embeddings | established |
| Embedding fine-tuning on domain pairs | pattern | retrieval quality plateaus on domain vocabulary general models miss | emerging |
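The "dimension truncation" noted in the first row amounts to keeping only the leading components of a vector to cut storage and search cost. When a vector is shortened client-side, it is usually L2-renormalized afterwards so that cosine and dot-product comparisons stay meaningful. A minimal sketch, assuming the embedding is already available as a list of floats (no API call is shown, and the 4-dimensional vector is a toy example):

```python
import math

def truncate_and_renormalize(vec, dims):
    """Keep the first `dims` components, then rescale to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head

full = [0.6, 0.0, 0.8, 0.0]  # hypothetical unit-length embedding
short = truncate_and_renormalize(full, 2)
```

Truncation of this kind only preserves quality for models trained to front-load information in the leading dimensions, so it is worth checking retrieval quality at each candidate size before committing to one.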