pub enum XapianOp {
OpAnd,
OpOr,
OpAndNot,
OpXor,
OpAndMaybe,
OpFilter,
OpNear,
OpPhrase,
OpValueRange,
OpScaleWeight,
OpEliteSet,
OpValueGe,
OpValueLe,
OpSynonym,
}
Enum of possible query operations, declared with #[repr(i32)].
Variants
OpAnd
Return iff both subqueries are satisfied
OpOr
Return if either subquery is satisfied
OpAndNot
Return if the left subquery is satisfied but the right is not
OpXor
Return if exactly one of the two subqueries is satisfied
OpAndMaybe
Return iff the left subquery is satisfied, but use weights from both
OpFilter
As AND, but use only weights from left subquery
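The six boolean-style variants above differ in two ways: which documents they let through, and whose weights they keep. The sketch below models that for a single document. It is illustrative only: `BoolOp` and `combine` are hypothetical names that are not part of this crate, and it assumes that weights from matching subqueries simply add, which is a simplification of what the matcher does.

```rust
/// Local stand-in for the six boolean-style variants (hypothetical, not `XapianOp`).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BoolOp {
    And,
    Or,
    AndNot,
    Xor,
    AndMaybe,
    Filter,
}

/// `left` / `right` are `Some(weight)` when that subquery matches the
/// document and `None` when it does not. Returns `Some(combined_weight)`
/// when the combined query matches. Summing weights is an assumption.
pub fn combine(op: BoolOp, left: Option<f64>, right: Option<f64>) -> Option<f64> {
    match (op, left, right) {
        // Both sides must match; both contribute weight.
        (BoolOp::And, Some(l), Some(r)) => Some(l + r),
        // Either side may match; every matching side contributes weight.
        (BoolOp::Or, Some(l), Some(r)) => Some(l + r),
        (BoolOp::Or, Some(w), None) | (BoolOp::Or, None, Some(w)) => Some(w),
        // Left must match and right must not.
        (BoolOp::AndNot, Some(l), None) => Some(l),
        // Exactly one side matches.
        (BoolOp::Xor, Some(w), None) | (BoolOp::Xor, None, Some(w)) => Some(w),
        // Left decides matching; right only adds weight when it also matches.
        (BoolOp::AndMaybe, Some(l), r) => Some(l + r.unwrap_or(0.0)),
        // Both sides must match; only the left weight is kept.
        (BoolOp::Filter, Some(l), Some(_)) => Some(l),
        _ => None,
    }
}
```

Comparing `combine(BoolOp::And, ..)` with `combine(BoolOp::Filter, ..)` on the same inputs shows the only difference between the two: the right-hand weight is dropped.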
OpNear
Find occurrences of a list of terms with all the terms occurring within a specified window of positions.
Each occurrence of a term must be at a different position, but the order they appear in is irrelevant.
The window parameter should be specified for this operation, but will default to the number of terms in the list.
OpPhrase
Find occurrences of a list of terms with all the terms occurring within a specified window of positions, and all the terms appearing in the order specified.
Each occurrence of a term must be at a different position.
The window parameter should be specified for this operation, but will default to the number of terms in the list.
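The difference between OpNear and OpPhrase is easiest to see with two terms. The sketch below is a simplified model, not the matcher's implementation: `near2` and `phrase2` are hypothetical helpers, each slice holds the positions of one term within a single document, and "within a window of w positions" is assumed to mean that the two positions differ by at most w - 1.

```rust
/// NEAR for two terms: some pair of distinct positions (one per term)
/// fits inside `window`, in either order.
pub fn near2(a: &[u32], b: &[u32], window: u32) -> bool {
    a.iter()
        .any(|&pa| b.iter().any(|&pb| pa != pb && pa.abs_diff(pb) < window))
}

/// PHRASE for two terms: as NEAR, but the second term must come
/// after the first.
pub fn phrase2(a: &[u32], b: &[u32], window: u32) -> bool {
    a.iter()
        .any(|&pa| b.iter().any(|&pb| pb > pa && pb - pa < window))
}
```

With the default window of 2 (the number of terms), both helpers require the terms to be adjacent; only `phrase2` also requires them to appear in the given order.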
OpValueRange
Filter by a range test on a document value.
OpScaleWeight
Scale the weight of a subquery by the specified factor.
A factor of 0 means this subquery will contribute no weight to the query - it will act as a purely boolean subquery.
If the factor is negative, Xapian::InvalidArgumentError will be thrown.
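As a rough illustration of the rule above, the following sketch scales a weight and rejects negative factors. `scale_weight` and `InvalidFactor` are hypothetical names used only here; the error type stands in for Xapian::InvalidArgumentError and is not how this crate reports errors.

```rust
/// Hypothetical error standing in for Xapian::InvalidArgumentError.
#[derive(Debug, PartialEq)]
pub struct InvalidFactor(pub f64);

/// Multiply a subquery weight by `factor`, rejecting negative factors.
/// A factor of 0 yields a weight of 0, i.e. a purely boolean subquery.
pub fn scale_weight(weight: f64, factor: f64) -> Result<f64, InvalidFactor> {
    if factor < 0.0 {
        return Err(InvalidFactor(factor));
    }
    Ok(weight * factor)
}
```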
OpEliteSet
Pick the best N subqueries and combine with OP_OR.
If you want to implement a feature which finds documents similar to a piece of text, an obvious approach is to build an “OR” query from all the terms in the text, and run this query against a database containing the documents. However, such a query can contain a lot of terms and be quite slow to perform, yet many of these terms don’t contribute usefully to the results.
The OP_ELITE_SET operator can be used instead of OP_OR in this situation. OP_ELITE_SET selects the most important N terms and then acts as an OP_OR query with just these, ignoring any other terms. This will usually return results just as good as the full OP_OR query, but much faster.
In general, the OP_ELITE_SET operator can be used when you have a large OR query, but it doesn’t matter if the search completely ignores some of the less important terms in the query.
The subqueries don’t have to be terms, but if they aren’t then OP_ELITE_SET will look at the estimated frequencies of the subqueries and so could pick a subset which doesn’t actually match any documents even if the full OR would match some.
You can specify a parameter to the query constructor which controls the number of terms which OP_ELITE_SET will pick. If not specified, this defaults to 10 (or ceil(sqrt(number_of_subqueries)) if there are more than 100 subqueries, but this rather arbitrary special case will be dropped in 1.3.0). For example, this will pick the best 7 terms:
Xapian::Query query(Xapian::Query::OP_ELITE_SET, subqs.begin(), subqs.end(), 7);
If the number of subqueries is less than this threshold, OP_ELITE_SET behaves identically to OP_OR.
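The default described above, and the "keep the best N" step, can be sketched as follows. Both functions are hypothetical helpers for illustration: `default_elite_set_size` only restates the documented default (including the pre-1.3.0 special case), and `pick_elite` assumes each subquery already carries a numeric importance score, which is not how the scores are actually obtained.

```rust
/// Documented default for N: 10, or ceil(sqrt(n)) when there are
/// more than 100 subqueries (the special case slated for removal in 1.3.0).
pub fn default_elite_set_size(n_subqueries: usize) -> usize {
    if n_subqueries > 100 {
        (n_subqueries as f64).sqrt().ceil() as usize
    } else {
        10
    }
}

/// Keep the labels of the `n` highest-scoring subqueries.
/// The `(label, score)` pairs are an assumption made for this sketch.
/// With fewer than `n` entries everything is kept, mirroring the
/// "behaves identically to OP_OR" case.
pub fn pick_elite<'a>(mut scored: Vec<(&'a str, f64)>, n: usize) -> Vec<&'a str> {
    // Sort by descending score.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(n).map(|(label, _)| label).collect()
}
```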
OpValueGe
Filter by a greater-than-or-equal test on a document value.
OpValueLe
Filter by a less-than-or-equal test on a document value.
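The three value tests (OpValueRange, OpValueGe, OpValueLe) can be pictured as simple comparisons on the stored value. The sketch below assumes values are compared as raw byte strings and that the range test is inclusive at both ends; the function names are hypothetical and not part of this crate.

```rust
/// Greater-than-or-equal test on a stored value (bytewise comparison assumed).
pub fn value_ge(value: &[u8], lower: &[u8]) -> bool {
    value >= lower
}

/// Less-than-or-equal test on a stored value (bytewise comparison assumed).
pub fn value_le(value: &[u8], upper: &[u8]) -> bool {
    value <= upper
}

/// Range test, assumed inclusive at both ends.
pub fn value_range(value: &[u8], lower: &[u8], upper: &[u8]) -> bool {
    value_ge(value, lower) && value_le(value, upper)
}
```

Under this assumption `value_ge(b"9", b"10")` is true, because the comparison is on bytes rather than numbers; numeric values therefore need an encoding whose byte order matches numeric order.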
OpSynonym
Treat a set of queries as synonyms.
This returns all results which match at least one of the queries, but weights them as if all the sub-queries are instances of the same term: so multiple matching terms for a document increase the wdf value used, and the term frequency is based on the number of documents which would match an OR of all the subqueries.
The term frequency used will usually be an approximation, because calculating the precise combined term frequency would be overly expensive.
Identical to OP_OR, except for the weightings returned.
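One way to picture the "same term" weighting is to pool the wdf of every subquery that matches a document. The helper below is a hypothetical sketch of that idea only; it assumes the pooled wdf is a plain sum and says nothing about how the approximate term frequency is computed.

```rust
/// Pooled wdf for one document, given the wdf of each matching subquery.
/// Returns `None` when no subquery matches (the document is not returned).
pub fn synonym_wdf(matching_wdfs: &[u32]) -> Option<u32> {
    if matching_wdfs.is_empty() {
        None
    } else {
        Some(matching_wdfs.iter().sum())
    }
}
```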