
From Filters to Features: Aspect Database vs Vector Databases

A comparison of how vector databases handle contextual constraints and how multi-aspect embeddings enable unified similarity search.

TL;DR

Most vector databases perform similarity search over embeddings derived from text or other unstructured data. Contextual constraints such as timestamps, document types, or ownership are typically handled as external filters applied before or after the vector search step. This architecture works well for many applications, but it becomes limiting when contextual signals should influence ranking rather than simply restrict the search space. This article examines how traditional vector database architectures handle filtering and retrieval, and introduces an alternative design implemented by Aspected in the Aspect Database: a context-aware retrieval engine for AI systems in which contextual attributes are encoded, via multi-aspect embeddings, directly into the representation used for similarity computation.

The Challenge of Context in Vector Search

Vector databases have become a foundational component of modern AI systems. They enable applications such as semantic search, retrieval-augmented generation (RAG), recommendation systems, and similarity detection. In many of these systems, documents are represented as vector embeddings, and retrieval is performed using approximate nearest-neighbor (ANN) search over these embeddings.

However, real-world retrieval rarely depends only on semantic similarity. Retrieval systems often need to support queries that include contextual constraints such as:

  • Similar files from a specific department

  • Documents about a concept within a particular timeframe

  • Documents with a specific confidentiality or classification level

Traditional search engines solved these problems through query languages that combine structured conditions and textual relevance. In vector search systems, integrating contextual constraints with similarity search is less straightforward.

From Filters to Features in Vector Retrieval

Most vector databases treat contextual properties as filters rather than components of similarity. In this architecture, similarity search determines semantic proximity, while filters restrict the candidate set based on metadata. Consequently, vector search and contextual constraints are handled separately. An alternative approach is to treat those signals as features of the similarity space itself. Instead of restricting results after vector search, contextual signals contribute directly to the vector distance calculation used in ANN search.

This is the core concept behind the Aspect Database, the retrieval engine developed by Aspected. In this model, documents are represented through multiple vectorized aspects (contextual properties). Unlike a single text-derived embedding, this allows similarity search to operate across both content and contextual attributes. Before exploring this model in more detail, however, it is useful to examine how current vector database architectures handle filtering today.

How Traditional Vector Databases Handle Filters 

Popular vector search systems such as FAISS, Pinecone, and Redis generally follow a common retrieval pattern [1]. A typical query pipeline consists of generating a query embedding, performing nearest-neighbor search on the vector index, and applying contextual constraints through filters.

In practice, filtering can be applied at different stages of the retrieval pipeline. Some systems restrict the search space before similarity search is executed (pre-filtering), removing documents that do not satisfy the filter conditions. While this reduces the search space, it may also exclude relevant results when contextual information is incomplete or noisy. Other systems apply filters after similarity search (post-filtering), removing invalid results after scoring. This avoids prematurely discarding candidates, but often requires retrieving larger candidate sets to ensure enough valid results remain.
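The two strategies can be sketched with a brute-force similarity search over a toy corpus. The embeddings, the `years` metadata field, and the function names below are illustrative assumptions, not any particular database's API:

```python
import numpy as np

# Toy corpus: document embeddings plus one metadata field per document.
embeddings = np.array([
    [1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.7, 0.3],
])
years = np.array([2020, 2024, 2023, 2024, 2021])
query = np.array([1.0, 0.0])

def pre_filter_search(query, min_year, k):
    # Pre-filtering: restrict the candidate set first, then rank by similarity.
    idx = np.where(years >= min_year)[0]
    sims = embeddings[idx] @ query
    return idx[np.argsort(-sims)][:k].tolist()

def post_filter_search(query, min_year, k, overfetch=4):
    # Post-filtering: rank everything, then drop results that fail the filter.
    # A larger candidate set (overfetch) is needed so enough valid hits remain.
    sims = embeddings @ query
    order = np.argsort(-sims)[:k * overfetch]
    kept = [int(i) for i in order if years[i] >= min_year]
    return kept[:k]
```

On this toy corpus both strategies return the same results; the trade-off shows up at scale, where pre-filtering shrinks the index traversal and post-filtering inflates the candidate set.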

Many production systems combine these strategies into a hybrid approach with additional retrieval stages, reranking models, or application-level logic. These designs work well in many situations, but they add complexity when contextual properties should influence retrieval rather than simply restrict results: the architectural separation between similarity computation and contextual filtering leads to multi-stage pipelines that require careful tuning and orchestration.

Engineering Limitations of Filter-Based Architectures

When contextual attributes are treated purely as filters, they cannot influence the similarity score itself. This distinction becomes important when contextual signals represent degrees of relevance rather than strict eligibility conditions.

Consider the following query:

Find internal compliance reports related to financial risk that are also recent.

In most vector database architectures, this query typically requires multiple steps:

  1. Perform vector similarity search on document embeddings to retrieve reports related to financial risk.
  2. Apply metadata filters or time ranges to restrict results to recent documents.
  3. Optionally rerank results to prioritize more recent reports.
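The three steps above might look like the following sketch, assuming a brute-force similarity search and a hand-rolled recency rerank. All embeddings, years, weights, and helper names are illustrative:

```python
import numpy as np

# Step 1: similarity search over document embeddings (toy, brute force).
embeddings = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8]])
published = np.array([2019, 2024, 2024])   # publication year per report
query = np.array([1.0, 0.0])               # "financial risk" query embedding

sims = embeddings @ query
candidates = np.argsort(-sims)             # ranked by semantic similarity only

# Step 2: metadata filter restricting results to recent documents.
recent = [int(i) for i in candidates if published[i] >= 2023]

# Step 3: application-level rerank blending similarity with recency.
def rerank_score(i, alpha=0.7):
    recency = (published[i] - 2019) / 5.0  # crude normalization, illustrative
    return alpha * sims[i] + (1 - alpha) * recency

reranked = sorted(recent, key=rerank_score, reverse=True)
```

Note that three distinct mechanisms (index search, boolean filter, custom scoring) must be kept consistent at the application level, which is exactly the orchestration burden described above.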

In practice, developers often compensate for these limitations by introducing additional logic such as multiple queries, custom scoring heuristics, or reranking stages. While these approaches can be effective, they increase architectural complexity and require careful orchestration at the application level.

At a deeper level, the limitation is structural: contextual signals are handled outside the similarity space rather than contributing directly to ranking during search. This motivates an alternative approach: incorporating contextual attributes directly into the similarity model itself.

Aspected: Multi-Aspect Retrieval

Aspected addresses these limitations in the Aspect Database by representing each document through multiple vectorized (i.e., embedded) aspects rather than a single text-derived embedding. Traditional vector retrieval captures semantic distance effectively, but it often fails to incorporate the contextual signals that influence relevance in real-world retrieval tasks. Each aspect captures a different property of the document; examples include:

  • Semantic content
  • Temporal attributes
  • Media content

Instead of storing contextual attributes separately and applying them as filters, Aspected encodes these attributes as vectors that become part of the document representation itself. The aspects are combined into a unified representation used during search, so each one contributes directly to the ranking computation.
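One way such a composite representation could be built is to embed each aspect separately, scale it by a per-aspect weight, and concatenate, so that inner-product similarity decomposes into a weighted sum of per-aspect similarities. A minimal numpy sketch; the aspect names and weights are assumptions for illustration, not Aspected's actual scheme:

```python
import numpy as np

def compose(aspects, weights):
    # Scale each aspect embedding and concatenate into one vector, so that
    # the dot product of two composites is a weighted sum per aspect
    # (each weight enters the score squared, once per side).
    return np.concatenate([weights[name] * vec for name, vec in aspects.items()])

weights = {"content": 1.0, "time": 0.5}
doc = compose({"content": np.array([0.9, 0.1]), "time": np.array([0.0, 1.0])}, weights)
qry = compose({"content": np.array([1.0, 0.0]), "time": np.array([0.0, 1.0])}, weights)

score = doc @ qry  # content similarity + 0.25 * time similarity
```

Because the composite lives in a single vector space, a standard ANN index can serve it unchanged; only the meaning of the distance changes.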

From the perspective of the vector database, the system still performs similarity search. What changes is not the search operation itself, but what similarity represents. Rather than reflecting only semantic proximity, similarity becomes a composite measure that captures multiple aspects of a document simultaneously. In the Aspect Database, this enables retrieval systems to find documents that are not only semantically similar, but also contextually aligned, all within a single search operation.

[Figure: Traditional RAG vs Aspected. Comparison between filter-based retrieval and multi-aspect retrieval.]

Practical Example: Content and Time

Continuing with the previously described scenario, consider a system storing large collections of compliance and regulatory reports. An analyst may want to retrieve:

Reports discussing financial risk that were produced recently.

In traditional vector retrieval architectures, this type of query typically requires combining semantic similarity with additional filtering or ranking logic to account for temporal constraints.

In Aspected, temporal attributes are encoded alongside semantic content as part of the document representation. The retrieval process becomes:

  1. Generate a query representation across the relevant aspects (content and time).
  2. Perform similarity search over the multi-aspect representation, using a search strategy that can selectively include or ignore specific aspects during distance computation, which allows more flexible retrieval than a fixed distance over all dimensions.
  3. Retrieve results ranked by the combined similarity across aspects.
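The selective inclusion of aspects in the distance computation can be illustrated with a small sketch in which each aspect owns a known slice of the composite vector and a set of active aspects decides which slices contribute. The layout and names are hypothetical:

```python
import numpy as np

# Hypothetical layout: each aspect occupies a fixed slice of the composite vector.
slices = {"content": slice(0, 2), "time": slice(2, 3)}

def aspect_distance(a, b, active):
    # Squared L2 distance summed over the active aspects only; inactive
    # aspects are simply skipped during the distance computation.
    return sum(np.sum((a[s] - b[s]) ** 2)
               for name, s in slices.items() if name in active)

doc = np.array([0.9, 0.1, 0.2])   # content dims + encoded time dim
qry = np.array([1.0, 0.0, 0.8])

content_only = aspect_distance(doc, qry, {"content"})   # time is ignored
both = aspect_distance(doc, qry, {"content", "time"})   # content + time
```

The same stored vectors serve both queries; whether temporal proximity matters is decided per query, not by rebuilding the index or adding a filter stage.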

Because temporal attributes are integrated directly into the similarity computation, both topical relevance and temporal proximity are evaluated within the same search operation. This removes the need for separate filtering or reranking stages while producing results that better reflect the intended notion of relevance.

When Filters Still Make Sense

Filter-based architectures remain effective when contextual attributes represent strict constraints, such as:

  • Security boundaries
  • Tenant isolation
  • Access permissions
  • Document type restrictions

In these cases, filters define eligibility rather than relevance. However, many enterprise retrieval scenarios involve contextual signals that influence how relevant a result is, not whether it should be considered at all. Examples include domain proximity, organizational ownership, or regulatory context.

In these cases, treating contextual attributes as part of the similarity computation can provide more accurate and efficient retrieval behavior. The Aspected approach is particularly useful in enterprise and government environments where relevance depends on multiple contextual dimensions rather than semantic similarity alone.
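A sketch of how the two roles can coexist: hard constraints such as tenant isolation remain boolean filters that define eligibility, while soft contextual signals enter the similarity score. All field names, values, and weights are illustrative:

```python
import numpy as np

docs = [
    {"tenant": "acme",  "emb": np.array([0.9, 0.1]), "recency": 1.0},
    {"tenant": "acme",  "emb": np.array([0.6, 0.4]), "recency": 0.2},
    {"tenant": "other", "emb": np.array([1.0, 0.0]), "recency": 1.0},
]
query = np.array([1.0, 0.0])

def search(tenant, alpha=0.8):
    # Eligibility: a hard boolean filter that never affects the score.
    eligible = [d for d in docs if d["tenant"] == tenant]
    # Relevance: semantic similarity blended with a soft contextual signal.
    return sorted(eligible,
                  key=lambda d: alpha * float(d["emb"] @ query)
                              + (1 - alpha) * d["recency"],
                  reverse=True)

results = search("acme")
```

The out-of-tenant document never appears regardless of its similarity, while recency shifts the ranking among eligible documents rather than excluding any of them.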

Both approaches can coexist depending on system requirements, but multi-aspect retrieval provides a more expressive foundation for complex knowledge systems.

Toward More Expressive Vector Retrieval

As AI applications increasingly rely on vector search, retrieval architectures are evolving beyond purely semantic distance. Contextual attributes, structural signals, and domain-specific properties all influence what makes a document relevant in real-world retrieval systems. The challenge is not only extracting these signals but integrating them into the similarity model itself.

The Aspect Database represents a concrete implementation of this approach, building vector search systems where both content and context contribute directly to relevance computation.

If you are exploring new approaches to vector retrieval architectures, you can read our previous articles on Aspected, or visit http://aspected.com to learn more.

Team @ Aspected

References

[1] Amanbayev, A., Tsan, B., Dang, T., & Rusu, F. (2026). Filtered Approximate Nearest Neighbor Search in Vector Databases: System Design and Performance Analysis.