The Solr parameters defaultOperator, q.op, and mm configure how optional clauses are handled in search queries. The interactions between these parameters have changed over time, sometimes producing unexpected search results. Users have documented these issues in blog posts and in multi-year Jira issues. Further complications may arise from legacy configuration tweaks that were appropriate in earlier releases but now cause unexpected behavior. I intend to detail the relevant history of these parameters and their associated functionality, so that users may better configure Solr to suit their search requirements.

All historical assumptions about Solr functionality in this document use 1.1.0 as the baseline release. For example, it is assumed that the Standard and DisMax query parsers have always been available because both were available in the 1.1.0 release, even though it’s more likely that the query parsers were not created simultaneously.

Relevant Solr History

Solr was started by Yonik Seeley as a closed-source project in 2004. In 2006, the project was transitioned to the Apache Incubator, at which point it was open-sourced. The first Apache release was 1.1.0 on December 22, 2006. Solr releases were semantically versioned through 1.4.1, released June 25, 2010. Solr then merged with the Lucene project, and subsequent versioning followed the Lucene versioning. The first release to use Lucene’s versioning was 3.1.0, released March 31, 2011. As of this document’s creation, the latest Solr release is 7.3.0, released April 4, 2018.

Solr has closely tracked changes in its release notes and has included the relevant issue keys when appropriate. These issue keys are numerical and prefixed with SOLR-; e.g., SOLR-1234. The Solr community currently uses Jira for bug and issue tracking. Where appropriate, historical references in this document will link to the relevant issue.

As of today, Solr supports three main query parsers: Standard, DisMax, and Extended DisMax (eDisMax).

The Standard and DisMax query parsers have always been available in Solr. The Standard query parser is oriented toward Lucene query syntax (wildcards, ranges, boolean operators, etc.). The DisMax query parser is oriented toward simple phrases, similar to a Google search. The Solr documentation describes DisMax syntax as such:

The DisMax query parser supports an extremely simplified subset of the Lucene QueryParser syntax. As in Lucene, quotes can be used to group phrases, and +/- can be used to denote mandatory and optional clauses. All other Lucene query parser special characters (except AND and OR) are escaped to simplify the user experience. The DisMax query parser takes responsibility for building a good query from the user’s input using Boolean clauses containing DisMax queries across fields and boosts specified by the user.

The eDisMax query parser, an improved version of the DisMax query parser, was added in 3.1.0. eDisMax supports all functionality available in DisMax as well as a greater subset of the Lucene query syntax. It is safe to think of eDisMax as a combination of the Standard and DisMax parsers. Unless otherwise noted, all further references to DisMax apply equally to eDisMax.

defaultOperator and q.op

The defaultOperator parameter sets the query parser’s default operator. It was defined in the schema.xml file as an attribute in the solrQueryParser element. The valid options are AND or OR. The default value is OR.

The q.op parameter implements the defaultOperator functionality as a client-facing query parameter. If q.op is specified, it will override the defaultOperator value.
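As a minimal sketch of passing q.op per request, the following builds a select URL with Python's standard library. The function name, host, port, and collection name are placeholders for illustration and do not come from any particular Solr installation.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, q_op=None):
    """Build a Solr /select URL, adding q.op only when the caller sets it."""
    params = {"q": query}
    if q_op is not None:
        if q_op not in ("AND", "OR"):
            raise ValueError("q.op must be AND or OR")
        params["q.op"] = q_op
    return base_url + "/select?" + urlencode(params)


# Hypothetical host and collection, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr/heritage", "Ancient Art Sculpture", "AND"
)
print(url)
```

Leaving q_op unset omits the parameter, so the server-side default applies.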

defaultOperator and q.op were originally intended for use with the Standard query parser. Both have been available since the initial Solr release. defaultOperator was deprecated in 3.6.0 in favor of the q.op parameter, and was completely removed in 7.0.0. q.op continues to be supported.

mm

The mm parameter is a DisMax parameter that makes it possible to require a certain minimum number of optional clauses to match. These clauses are further explained in the mm documentation:

When processing queries, Lucene/Solr recognizes three types of clauses: mandatory, prohibited, and “optional” (also known as “should” clauses). By default, all words or phrases specified in the q parameter are treated as “optional” clauses unless they are preceded by a “+” or a “-“.

The value of mm may be expressed as a number, a percentage, or a combination of a number and a percentage. The default value is 100%, meaning that all clauses must match.
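To make the arithmetic concrete, here is a rough sketch of how a simple mm value translates into a required clause count. It assumes the round-down behavior described in the Solr documentation, covers only plain integers and plain percentages (not the combined conditional forms), and uses a hypothetical function name; it is not Solr's actual implementation.

```python
def min_should_match(mm, optional_clauses):
    """Return how many optional clauses must match for a simple mm value.

    Handles only plain integers ("2", "-1") and plain percentages
    ("50%", "-25%"); conditional forms such as "3<90%" are not covered.
    """
    spec = mm.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Round the computed share down before using it.
        share = (optional_clauses * abs(percent)) // 100
        # A negative percentage means "this share may be missing".
        required = share if percent >= 0 else optional_clauses - share
    else:
        number = int(spec)
        # A negative number means "this many clauses may be missing".
        required = number if number >= 0 else optional_clauses + number
    # Clamp to the valid range of 0..optional_clauses.
    return max(0, min(required, optional_clauses))


print(min_should_match("50%", 3))  # 1
print(min_should_match("50%", 4))  # 2
```

With mm=50%, three optional clauses require one match and four require two.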

mm only applies to top-level clauses; sub-groupings of clauses in parentheses are not governed by mm. For example, the query A B (C D) with an mm of 100% requires only three clauses, with C and D together counting as the third. This applies to DisMax and eDisMax. This behavior is not described in the Solr documentation; it was found in a comment on SOLR-2649.
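The top-level counting can be illustrated with a small sketch that splits a query on whitespace while keeping parenthesized groups and quoted phrases whole. This is a simplification for illustration only: it is not how Solr parses queries, it does not treat AND, OR, or NOT specially, and the function name is hypothetical.

```python
def top_level_clauses(query):
    """Split on whitespace, keeping (...) groups and "..." phrases whole."""
    clauses, current, depth, in_quotes = [], [], 0, False
    for ch in query:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")":
            depth = max(0, depth - 1)
        # Whitespace ends a clause only outside groups and phrases.
        if ch.isspace() and depth == 0 and not in_quotes:
            if current:
                clauses.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        clauses.append("".join(current))
    return clauses


print(top_level_clauses("A B (C D)"))  # ['A', 'B', '(C D)']
```

The query A B (C D) yields three top-level clauses, which is the count that mm is applied to.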

There are no references to the initial implementation of the mm parameter in the Solr release notes, and it is therefore assumed that mm has always been part of the DisMax query parser.

Interactions between the parameters

DisMax mm configurable by defaultOperator and q.op

Initially, the defaultOperator and q.op parameters were ignored by DisMax query parsers; DisMax would only use the mm parameter when determining optional clause behavior. In release 4.0.0, this behavior was changed so that defaultOperator and q.op could influence optional clauses:

The default logic for the ‘mm’ param of the ‘dismax’ QParser has been changed. If no ‘mm’ param is specified (either in the query, or as a default in solrconfig.xml) then the effective value of the ‘q.op’ param (either in the query or as a default in solrconfig.xml or from the ‘defaultOperator’ option in schema.xml) is used to influence the behavior. If q.op is effectively “AND” then mm=100%. If q.op is effectively “OR” then mm=0%. Users who wish to force the legacy behavior should set a default value for the ‘mm’ param in their solrconfig.xml file.
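The rule in the release note can be restated as a short sketch. The function and argument names are hypothetical; the logic only mirrors the quoted text (explicit mm wins, otherwise q.op or defaultOperator decides) and is not taken from Solr's source.

```python
def effective_dismax_mm(mm=None, q_op=None, default_operator=None):
    """Resolve the mm value as described in the 4.0.0 release note."""
    if mm is not None:
        # An explicit mm (query or solrconfig.xml default) always wins.
        return mm
    # q.op overrides the schema-level defaultOperator; OR is the default.
    operator = (q_op or default_operator or "OR").upper()
    return "100%" if operator == "AND" else "0%"


print(effective_dismax_mm(q_op="AND"))            # 100%
print(effective_dismax_mm(default_operator="OR"))  # 0%
print(effective_dismax_mm(mm="50%", q_op="AND"))   # 50%
```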

SOLR-1889 justifies the change as a logical default behavior, especially considering the atypical location of defaultOperator in the schema.xml file, which implies a global behavior rather than one specific to a query parser. However, defaultOperator was deprecated by SOLR-2724 in 3.6.0, before the DisMax behavior was changed. If the issue key numbering is sequential, the defaultOperator deprecation issue (SOLR-2724) was opened and closed while the DisMax behavior change (SOLR-1889) was still active. This may explain why both the global-oriented DisMax behavior and the parser-oriented deprecation were implemented.

eDisMax boolean operators cause mm to be ignored

The original implementation of eDisMax would ignore mm, leaving every clause optional, if the query used any boolean operator other than AND (+, -, OR, or NOT); DisMax was not affected by this behavior. Ignoring mm could be inappropriate for the query; consider the following queries against a cultural heritage Solr index:

| Type | Terms | q.op/mm | Row Count | Notes |
|---|---|---|---|---|
| Lucene | Ancient Art Sculpture | OR | 2,612,261 | |
| Lucene | Ancient Art Sculpture Marble | OR | 2,621,552 | Additional optional term increases row count |
| Lucene | Ancient Art Sculpture Marble -Greek | OR | 2,613,535 | Negated term reduces row count |
| Lucene | Ancient Art Sculpture | AND | 3,324 | |
| Lucene | Ancient Art Sculpture Marble | AND | 142 | Additional required term reduces row count |
| Lucene | Ancient Art Sculpture Marble -Greek | AND | 93 | Negated term reduces row count |
| eDisMax | Ancient Art Sculpture | 50% | 2,612,261 | mm evaluates to 1 required term; matches Lucene q.op=OR query |
| eDisMax | Ancient Art Sculpture Marble | 50% | 296,640 | Additional term increases minimum match from 1 to 2; reduces row count appropriately |
| eDisMax | Ancient Art Sculpture Marble -Greek | 50% | 2,613,535 | Unexpected results: the negated term causes mm to be ignored; matches Lucene q.op=OR query |

This led to the multi-year issue SOLR-2649, which was opened on July 12, 2011. A patch was included in Solr 5.5.0 that significantly changed eDisMax handling, and the issue was marked resolved on December 15, 2015. The issues SOLR-8812 and SOLR-9174 were opened to address unset mm parameters, and the subsequent patch was included in Solr 5.5.3. These changes are summarized as follows:
| Original Behavior | New Behavior |
|---|---|
| The default operator (q.op) is hardcoded to OR. | The q.op and defaultOperator parameters affect how boolean operators are evaluated. |
| The mm parameter is ignored if any boolean operators except AND are present. | If the mm parameter value is set, it is always used, regardless of the presence of boolean operators. If the mm parameter is not set and the query has boolean operators, a default mm value of 0% is used. |
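The mm portion of this summary can be contrasted in a short sketch. The function name, the has_operators flag, and the fallback_mm argument (standing in for whatever default applies when the query contains no operators) are all hypothetical, and "mm is ignored" is modeled here as 0%, since every clause is then optional.

```python
def edismax_applied_mm(mm, has_operators, fallback_mm, post_5_5=True):
    """Return the mm value eDisMax effectively applies, per the summary above.

    has_operators: whether the query contains boolean operators that
    trigger the special handling (an input to this sketch, not detected).
    """
    if post_5_5:
        if mm is not None:
            # New behavior: an explicit mm is always honored.
            return mm
        # New behavior: unset mm plus operators defaults to 0%.
        return "0%" if has_operators else fallback_mm
    if has_operators:
        # Original behavior: operators switch mm off entirely.
        return "0%"
    return mm if mm is not None else fallback_mm


# The query "Ancient Art Sculpture Marble -Greek" with mm=50%:
print(edismax_applied_mm("50%", True, "100%", post_5_5=False))  # 0%
print(edismax_applied_mm("50%", True, "100%", post_5_5=True))   # 50%
```

The first call corresponds to the unexpected row count in the earlier table; the second shows the same request honoring mm after the change.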

Jason Hellman also presents a thorough summary with examples in his article Edismax Queries in a post-Solr 5.5 World: The AND, the OR, and the Ugly.

Practical Considerations

  • The mm parameter will affect all eDisMax queries. mm can be considered as having been moved from a ‘back-end’ parameter to a ‘front-end’ one, i.e., a default mm value that had been set in solrconfig.xml may now be impractical.
  • Boolean operators in eDisMax queries are now subject to mm values, which may significantly affect queries. The query A OR B parses as two optional terms under both q.op=OR and q.op=AND, but an mm=100% that was previously ignored will now require both terms to be present. A thorough understanding of how Solr parses boolean operators in queries is required; read Chris “Hoss” Hostetter’s Boolean Operators for Solr Users for an excellent overview.
  • A defaultOperator or q.op parameter set to AND may significantly affect eDisMax queries, especially those with boolean operators. The query A OR B C with q.op=AND will mark A and B as optional and C as required, but an mm=100% value will require all of the optional terms as well. As mm only applies to top-level clauses, (A OR B) C will reproduce the original behavior. Chris “Hoss” Hostetter, an active Solr contributor, advises users not to change the default operator in a January 3, 2012 comment on the blog post Boolean Operators for Solr Users.
  • Solr instances that have upgraded through multiple versions should be checked for the deprecated defaultOperator parameter.
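As one way to perform the last check, the sketch below looks for a defaultOperator attribute on a solrQueryParser element in schema XML, using only Python's standard library. The function name and the sample schema string are made up for illustration; a real check would read the instance's own schema file.

```python
import xml.etree.ElementTree as ET


def find_default_operator(schema_xml):
    """Return the defaultOperator value from schema XML text, or None."""
    root = ET.fromstring(schema_xml)
    for element in root.iter("solrQueryParser"):
        value = element.get("defaultOperator")
        if value is not None:
            return value
    return None


# Hypothetical schema fragment, for illustration only.
sample = (
    '<schema name="example" version="1.5">'
    '<solrQueryParser defaultOperator="AND"/>'
    "</schema>"
)
print(find_default_operator(sample))  # AND
```

A non-None result indicates that the deprecated setting is still present and may be influencing queries on releases before 7.0.0.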