Array / aggregate column filters on landing pages¶
Date: 2026-06-03
Problem / motivation¶
findLandingPages(sourceType: ITEMS, filterGroups: [...]) silently ignored filters
on tagIds. Tags on an item are exposed as array_agg(tags.id) /
array_agg(tags.tag_name) aggregate columns (see
ItemRepository.paginated_stmt),
and the generic filter builder
(apply_filters)
had no concept of array columns:
- Filtering on
tagId(singular) was dropped because the type-hint lookup found no such field onItemLandingPageResponse. - Filtering on
tagIds(plural) blew up insafe_parse_value—issubclass(list[uuid.UUID], bool)raises, so the filter was skipped with a warning. - Even if parsed, the builder emitted
array_agg(...) IN (...), which is invalid SQL: an aggregate can only appear inHAVING, and an array must be matched with array operators, not scalarIN.
A second, pre-existing bug was uncovered while tracing this: total_rows runs
SELECT count(*) ... GROUP BY ... on grouped landing statements and returns the
first group's row count (via scalar()) instead of the number of groups, so the
items landing page total is already wrong today.
In scope¶
- Detect array-typed columns in the generic filter builder and translate membership
operators (
IN/EQ/NOT_IN/NE) into PostgreSQL array-overlap (&&) conditions with the value flattened (no[[...]]double-wrap) and cast to the right element type. - Route conditions on aggregate columns (e.g.
array_agg) toHAVING; route plain array table-columns toWHERE. - Support OR-combining an aggregate (
HAVING) condition with a row-level (WHERE) condition inside a singlefilterGroupby promoting the row-level legs intoHAVINGviabool_or(...). - Make
total_rowscorrect for grouped statements (subquery count), so the filteredtotalis meaningful. - Unit tests for the new behaviour and the
total_rowsfix.
Out of scope¶
- New GraphQL types/fields. This is a runtime filter change only — no schema delta.
- Ordering (
orderBy) on array/aggregate columns.
Data model changes¶
None. No alembic migration (no schema change).
GraphQL surface¶
No new queries, mutations, types, inputs, or enums. findLandingPages /
findLandingPageKpis behave the same; the only change is that filterGroups /
filters targeting array columns (tagIds, tags) now produce SQL instead of being
silently dropped.
Example that now works:
query {
findLandingPages(
sourceType: ITEMS
filterGroups: [{
operator: OR
filters: [
{ columnName: "tagIds", operator: IN, value: "5736b624-...-49ee1" }
{ columnName: "tagIds", operator: IN, value: "d4ebe55c-...-dc79c" }
]
}]
) {
total
records { ... on ItemLandingPage { id tags tagIds } }
}
}
emits roughly:
... GROUP BY ...
HAVING (array_agg(tags.id) FILTER (WHERE tags.id IS NOT NULL) && ARRAY[:a]::uuid[]
OR array_agg(tags.id) FILTER (WHERE tags.id IS NOT NULL) && ARRAY[:b]::uuid[])
RBAC¶
No new Path / Resource. No backfill.
Background tasks / cron¶
None.
Frontend contract¶
- Filtering items (and any future array-backed landing column) by tag is now done with
columnName: "tagIds"(UUIDs) orcolumnName: "tags"(tag names), operatorIN. - Each value is matched as "the row has any of these" (array overlap). To match several
tags as OR, send one filter per value inside an
ORgroup (as in the example), or pass a JSON array of values to a single filter. NOT_IN/NEmean "the row has none of these" and include rows with no tags.- You can mix a
tagIdsfilter and a row-level column filter withoperator: ORin the same group (e.g.number = 'X' OR tagIds IN [...]) — the row-level leg is promoted intoHAVINGviabool_or(...)so it behaves as a true OR. totalon grouped landing pages (items) is now the real count of matching rows.
What's being implemented¶
- filters.py:
- New
FilterTargetenum +BuiltConditiondataclass;build_filter_conditionnow returnsBuiltCondition | Nonecarrying the WHERE/HAVING target. _contains_aggregate,_array_element_type,_build_array_condition,_coerce_array_valueshelpers.apply_filterssplits conditions into WHERE/HAVING buckets, combines per group, and applies each to the right clause. An OR group mixing both kinds promotes the WHERE legs into HAVING viabool_or(...).total_rowswraps grouped statements in a subquery before counting.- tests/graphql/common/test_filters.py:
existing tests updated for the new return type; new tests for array overlap, HAVING
routing, value flattening, mixed-group degradation, and the grouped
total_rowsfix.
Future additions¶
orderByon array/aggregate columns.apply_order_bystill rejects them. Deferred — sorting by tag set has no defined product meaning yet.
Open questions¶
None.