Enum xapian::constants::XapianOp

pub enum XapianOp {
    OpAnd,
    OpOr,
    OpAndNot,
    OpXor,
    OpAndMaybe,
    OpFilter,
    OpNear,
    OpPhrase,
    OpValueRange,
    OpScaleWeight,
    OpEliteSet,
    OpValueGe,
    OpValueLe,
    OpSynonym,
}

Enum of possible query operations. The enum is declared #[repr(i32)].
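
As a rough illustration of what #[repr(i32)] implies, the sketch below uses a stand-in enum. DemoOp, its variant names, its discriminant values and the as_raw helper are all hypothetical; the discriminants actually used by XapianOp are not shown on this page.

```rust
// Stand-in enum for illustration only. The variant names and the explicit
// discriminants are hypothetical, not taken from the xapian crate.
#[repr(i32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DemoOp {
    First = 0,
    Second = 1,
    Third = 2,
}

/// Convert a variant to the raw 32-bit integer that a C or C++ API would see.
pub fn as_raw(op: DemoOp) -> i32 {
    op as i32
}
```

With this representation the enum occupies exactly four bytes and each variant casts to an i32 without loss.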

Variants

OpAnd

Return iff both subqueries are satisfied

OpOr

Return if either subquery is satisfied

OpAndNot

Return if the left subquery is satisfied but the right subquery is not

OpXor

Return if exactly one of the subqueries is satisfied, but not both
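
The matching behaviour of the four operators above can be modelled with plain set operations on document ids. This is only a sketch of the semantics, not how Xapian evaluates queries; DocSet and the match_* functions are hypothetical names introduced for illustration, and weights are ignored.

```rust
use std::collections::BTreeSet;

/// Document ids matched by a subquery (a deliberately simplified model).
pub type DocSet = BTreeSet<u32>;

/// OpAnd: documents matched by both subqueries.
pub fn match_and(left: &DocSet, right: &DocSet) -> DocSet {
    left.intersection(right).copied().collect()
}

/// OpOr: documents matched by either subquery.
pub fn match_or(left: &DocSet, right: &DocSet) -> DocSet {
    left.union(right).copied().collect()
}

/// OpAndNot: documents matched by the left subquery but not the right.
pub fn match_and_not(left: &DocSet, right: &DocSet) -> DocSet {
    left.difference(right).copied().collect()
}

/// OpXor: documents matched by exactly one of the subqueries.
pub fn match_xor(left: &DocSet, right: &DocSet) -> DocSet {
    left.symmetric_difference(right).copied().collect()
}
```

For example, with a left set of {1, 2, 3} and a right set of {3, 4}, the XOR model yields {1, 2, 4}.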

OpAndMaybe

Return iff the left subquery is satisfied, but use weights from both subqueries

OpFilter

As AND, but use only the weights from the left subquery
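
OpAnd, OpAndMaybe and OpFilter differ in which side decides the match and which side contributes weight. The sketch below models that for a single document, assuming that weights from the two sides are simply added; Weight and the weight_* functions are hypothetical names, and None stands for "this subquery does not match the document".

```rust
/// Weight of one document under a subquery; `None` means no match.
pub type Weight = Option<f64>;

/// OpAnd: both sides must match; both contribute weight (assumed additive).
pub fn weight_and(left: Weight, right: Weight) -> Weight {
    match (left, right) {
        (Some(l), Some(r)) => Some(l + r),
        _ => None,
    }
}

/// OpAndMaybe: only the left side decides the match; the right side adds
/// its weight when it happens to match as well.
pub fn weight_and_maybe(left: Weight, right: Weight) -> Weight {
    left.map(|l| l + right.unwrap_or(0.0))
}

/// OpFilter: both sides must match, but only the left weight is kept.
pub fn weight_filter(left: Weight, right: Weight) -> Weight {
    match (left, right) {
        (Some(l), Some(_)) => Some(l),
        _ => None,
    }
}
```

In this model a document matched only by the left subquery survives OpAndMaybe with its left weight unchanged, but is rejected by both OpAnd and OpFilter.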

OpNear

Find occurrences of a list of terms with all the terms occurring within a specified window of positions.

Each occurrence of a term must be at a different position, but the order they appear in is irrelevant.

The window parameter should be specified for this operation, but will default to the number of terms in the list.
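
The window test described above can be sketched as a brute-force search over term positions within one document. This is an illustrative model, not Xapian's algorithm: near_matches is a hypothetical function, and it assumes that a window of w means the highest chosen position minus the lowest is less than w.

```rust
/// Returns true if one position can be chosen for every term so that all
/// chosen positions are distinct and fit inside `window` positions.
/// `positions[i]` lists where term `i` occurs in the document.
pub fn near_matches(positions: &[Vec<u32>], window: u32) -> bool {
    fn search(positions: &[Vec<u32>], chosen: &mut Vec<u32>, window: u32) -> bool {
        if chosen.len() == positions.len() {
            let min = *chosen.iter().min().unwrap();
            let max = *chosen.iter().max().unwrap();
            return max - min < window;
        }
        for &p in &positions[chosen.len()] {
            // Each term occurrence must sit at a different position.
            if chosen.contains(&p) {
                continue;
            }
            chosen.push(p);
            if search(positions, chosen, window) {
                return true;
            }
            chosen.pop();
        }
        false
    }

    if positions.is_empty() {
        return false;
    }
    search(positions, &mut Vec::new(), window)
}
```

Because order is irrelevant, a term at position 5 and another at position 4 satisfy a window of 2 in this model.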

OpPhrase

Find occurrences of a list of terms with all the terms occurring within a specified window of positions, and all the terms appearing in the order specified.

Each occurrence of a term must be at a different position.

The window parameter should be specified for this operation, but will default to the number of terms in the list.
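
The ordered variant of the window test can be sketched in the same brute-force style. Again this is a model rather than Xapian's implementation: phrase_matches is a hypothetical function, and it assumes that a window of w means the last chosen position minus the first is less than w, so the default window (the number of terms) requires the terms to be adjacent.

```rust
/// Returns true if one position can be chosen for every term so that the
/// positions are strictly increasing in term order and fit inside `window`.
/// `positions[i]` lists where term `i` occurs in the document.
pub fn phrase_matches(positions: &[Vec<u32>], window: u32) -> bool {
    fn search(positions: &[Vec<u32>], idx: usize, first: u32, prev: u32, window: u32) -> bool {
        if idx == positions.len() {
            return true;
        }
        positions[idx].iter().any(|&p| {
            // `p > prev` enforces term order and distinct positions.
            p > prev && p - first < window && search(positions, idx + 1, first, p, window)
        })
    }

    match positions.first() {
        None => false,
        Some(firsts) => firsts
            .iter()
            .any(|&f| window > 0 && search(positions, 1, f, f, window)),
    }
}
```

Under these assumptions two terms at positions 1 and 2 match with a window of 2, while the same terms in reverse order do not.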

OpValueRange

Filter by a range test on a document value.
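
A range test of this kind can be sketched as a comparison of raw value bytes. The function name value_in_range is hypothetical, and the sketch assumes that both ends of the range are inclusive and that values are compared bytewise.

```rust
/// Inclusive range test on a raw document value, compared byte by byte.
pub fn value_in_range(value: &[u8], begin: &[u8], end: &[u8]) -> bool {
    begin <= value && value <= end
}
```

Bytewise comparison means that "10" falls between "1" and "9", so numeric values need an encoding whose byte order matches their numeric order.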

OpScaleWeight

Scale the weight of a subquery by the specified factor.

A factor of 0 means this subquery will contribute no weight to the query; it will act as a purely boolean subquery.

If the factor is negative, Xapian::InvalidArgumentError will be thrown.
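
The scaling rule can be sketched as a small function on a single weight. Here scale_weight is a hypothetical helper, and the String error merely stands in for the exception mentioned above; it is not the error type used by the crate.

```rust
/// Scale a subquery weight by `factor`, rejecting negative factors.
pub fn scale_weight(weight: f64, factor: f64) -> Result<f64, String> {
    if factor < 0.0 {
        // Stand-in for the invalid-argument error described in the text.
        return Err(format!("scale factor must not be negative, got {factor}"));
    }
    Ok(weight * factor)
}
```

A factor of 0 maps every weight to 0, which is the purely boolean case.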

OpEliteSet

Pick the best N subqueries and combine with OP_OR.

If you want to implement a feature which finds documents similar to a piece of text, an obvious approach is to build an “OR” query from all the terms in the text, and run this query against a database containing the documents. However, such a query can contain a lot of terms and be quite slow to perform, yet many of these terms don’t contribute usefully to the results.

The OP_ELITE_SET operator can be used instead of OP_OR in this situation. OP_ELITE_SET selects the most important N terms and then acts as an OP_OR query with just these, ignoring any other terms. This will usually return results just as good as the full OP_OR query, but much faster.

In general, the OP_ELITE_SET operator can be used when you have a large OR query, but it doesn’t matter if the search completely ignores some of the less important terms in the query.

The subqueries don’t have to be terms, but if they aren’t then OP_ELITE_SET will look at the estimated frequencies of the subqueries and so could pick a subset which doesn’t actually match any documents even if the full OR would match some.

You can specify a parameter to the query constructor which controls the number of terms which OP_ELITE_SET will pick. If not specified, this defaults to 10 (or ceil(sqrt(number_of_subqueries)) if there are more than 100 subqueries, but this rather arbitrary special case will be dropped in 1.3.0). For example, this will pick the best 7 terms:

 Xapian::Query query(Xapian::Query::OP_ELITE_SET, subqs.begin(), subqs.end(), 7);

If the number of subqueries is less than this threshold, OP_ELITE_SET behaves identically to OP_OR.
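
The selection step and the default size rule can be sketched in a few lines. Both functions are hypothetical, and the f64 score attached to each subquery is a stand-in for whatever importance estimate the library computes; the sketch only shows the "keep the best N, then OR them" idea.

```rust
/// Default number of subqueries kept when no parameter is given, following
/// the rule quoted above (including the special case for more than 100).
pub fn default_elite_size(subquery_count: usize) -> usize {
    if subquery_count > 100 {
        (subquery_count as f64).sqrt().ceil() as usize
    } else {
        10
    }
}

/// Keep the `n` highest-scoring subqueries. Each entry pairs a term with a
/// hypothetical importance score; the survivors would then be ORed together.
pub fn pick_elite(mut subqueries: Vec<(String, f64)>, n: usize) -> Vec<String> {
    // Sort by descending score; the sort is stable, so ties keep input order.
    subqueries.sort_by(|a, b| b.1.total_cmp(&a.1));
    subqueries
        .into_iter()
        .take(n)
        .map(|(term, _)| term)
        .collect()
}
```

When fewer than n subqueries are supplied, pick_elite returns all of them, which mirrors the fallback to plain OR behaviour.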

OpValueGe

Filter by a greater-than-or-equal test on a document value.

OpValueLe

Filter by a less-than-or-equal test on a document value.
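
The two one-sided tests can be sketched as bytewise comparisons on a raw document value. The names value_ge and value_le are hypothetical, and bytewise comparison of values is an assumption of this sketch.

```rust
/// OpValueGe model: keep documents whose value is at least `limit`.
pub fn value_ge(value: &[u8], limit: &[u8]) -> bool {
    value >= limit
}

/// OpValueLe model: keep documents whose value is at most `limit`.
pub fn value_le(value: &[u8], limit: &[u8]) -> bool {
    value <= limit
}
```

Applying both tests to the same value gives an inclusive range check between the two limits.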

OpSynonym

Treat a set of queries as synonyms.

This returns all results which match at least one of the queries, but weighting as if all the sub-queries are instances of the same term: so multiple matching terms for a document increase the wdf value used, and the term frequency is based on the number of documents which would match an OR of all the subqueries.

The term frequency used will usually be an approximation, because calculating the precise combined term frequency would be overly expensive.

Identical to OP_OR, except for the weightings returned.
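
The weighting idea can be sketched by merging posting lists as if they belonged to one term. Postings and both functions are hypothetical, and summing the per-term wdf values is an assumption about how matching terms "increase the wdf value used". The sketch also computes the combined term frequency exactly, whereas the text above notes that the library usually approximates it.

```rust
use std::collections::BTreeMap;

/// Posting list for one term: document id mapped to wdf.
pub type Postings = BTreeMap<u32, u32>;

/// Merge several posting lists as if they were a single term: a document
/// matches if any list contains it, and its wdf values are added up.
pub fn synonym_postings(lists: &[Postings]) -> Postings {
    let mut merged = Postings::new();
    for list in lists {
        for (&doc, &wdf) in list {
            *merged.entry(doc).or_insert(0) += wdf;
        }
    }
    merged
}

/// Term frequency of the merged pseudo-term: the number of documents that
/// an OR of all the lists would match.
pub fn synonym_termfreq(lists: &[Postings]) -> usize {
    synonym_postings(lists).len()
}
```

The set of matching documents is the same as for an OR of the lists; only the statistics fed into weighting differ.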