QueryStringUtil (Sitevision API)

```
@Requireable(value="QueryStringUtil")
public interface QueryStringUtil
```
Query string utility interface.
An instance of the Sitevision class implementing this interface can be obtained via SearchFactory.getQueryStringUtil(). See SearchFactory for how to obtain an instance of the SearchFactory interface.

Since:

Sitevision 3.6

Author:

Magnus Lövgren

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`MATCH_ALL_QUERY` The "match all" query string.

Method Summary

Modifier and Type	Method and Description
`String`	`getDateAsString(Date aDate)` Returns a date formatted according to the Solr date string representation.
`String`	`getFieldQuery(String aFieldName, String aValueExpression)` Returns a field query that is properly grouped.
`String`	`removeQuerySyntaxChars(String aQueryString)` Removes all query syntax characters from a query string and trims the result.
`String`	`removeQuerySyntaxChars(String aQueryString, boolean aLenientRemove)` Removes query syntax characters from a query string and trims the result.
`String`	`smartWildcard(String aQueryString)` Gets a prefix/wildcard query that potentially will be scored.
`String`	`splitCollectionToQueryParts(Collection<String> aStringsToSplit, String aSplitExpression)` Transforms multiple strings with delimiters to a string that could be used in a field-grouped query expression.
`String`	`splitToQueryParts(String aStringToSplit, String aSplitExpression)` Transforms a string with delimiters to a string that could be used in a field-grouped query expression.
`String`	`stripLocalParams(String aQueryString)` Strips Local params for a query string.
`String`	`stripTrailingAnyChars(String aQueryString)` Strips all trailing "any" chars.

Field Detail
- MATCH_ALL_QUERY
```
static final String MATCH_ALL_QUERY
```
  The "match all" query string.
  This is the special query syntax ("*:*") to use when querying "everything".
  
  A common misunderstanding is that a single wildcard (i.e. "*") would also query "everything". That is a false assumption. A single wildcard is less efficient and will only match docs that have data in the default query fields of the parser (i.e. a single wildcard will potentially not include "everything").
  
  Since:
  
  Sitevision 8.2
  
  See Also:
  
  Constant Field Values
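
A minimal sketch of one typical use of the constant: falling back to "match all" when no query was entered. The class `MatchAllFallback` and the method `queryOrMatchAll` are hypothetical names, and the local constant only mirrors the documented value; Sitevision code would reference `QueryStringUtil.MATCH_ALL_QUERY` instead.

```java
// Hypothetical helper for illustration; not part of the Sitevision API.
final class MatchAllFallback {
    // Mirrors the documented value of QueryStringUtil.MATCH_ALL_QUERY.
    static final String MATCH_ALL_QUERY = "*:*";

    // Returns the trimmed user query, or the "match all" query when nothing was entered.
    static String queryOrMatchAll(String userQuery) {
        if (userQuery == null || userQuery.trim().isEmpty()) {
            return MATCH_ALL_QUERY;
        }
        return userQuery.trim();
    }
}
```

Note that the fallback is `"*:*"` and not a single `"*"`, for the reasons given above.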

Method Detail

stripTrailingAnyChars

String stripTrailingAnyChars(String aQueryString)

Strips all trailing "any" chars.

The question mark character is a query syntax char (the "any" char) and can potentially screw up querying (i.e. the parser fails to parse the query or returns an unexpected result). This method removes all trailing "any" chars (i.e. removes all trailing question marks).

Some examples
aQueryString	Returned
`"when is halloween"`	`"when is halloween"`
`"when is halloween?"`	`"when is halloween"`
`"when is halloween??"`	`"when is halloween"`

Parameters:: aQueryString - the query string
Returns:: aQueryString without trailing any chars or null if aQueryString is null
Since:: Sitevision 8.2
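
The documented behaviour can be sketched as a standalone method. This is an illustrative re-implementation based only on the examples above, not Sitevision's actual code; the class name `TrailingAnyChars` is hypothetical.

```java
// Illustrative re-implementation of the documented behaviour; not Sitevision's code.
final class TrailingAnyChars {
    static String stripTrailingAnyChars(String aQueryString) {
        if (aQueryString == null) {
            return null;
        }
        int end = aQueryString.length();
        // Walk backwards past every trailing question mark.
        while (end > 0 && aQueryString.charAt(end - 1) == '?') {
            end--;
        }
        return aQueryString.substring(0, end);
    }
}
```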

stripLocalParams

String stripLocalParams(String aQueryString)

Strips Local params for a query string.

Local params are a query string prefix that starts with "{!" and ends with "}". The Local Params can override/sidestep or affect desired search behaviour. This method strips Local params to prohibit that. Leading whitespace of Local params will also be stripped.

Some examples
aQueryString	Returned
`null`	`null`
`""`	`""`
`"hello query"`	`"hello query"`
`"{!}"`	`""`
`"{!}hey"`	`"hey"`
`"{!whatever}foo"`	`"foo"`
`"{!whatever} bar"`	`" bar"`
`" {! whatever }baz"`	`"baz"`
`"{!whatever"`	`"{!whatever"`

Parameters:: aQueryString - the query string
Returns:: aQueryString without any Local params
Since:: Sitevision 10
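
A minimal sketch that reproduces the example table above with a regular expression. It is an assumption about how such stripping could be done, not Sitevision's implementation; the class name `LocalParamsSketch` is hypothetical, and inputs not covered by the table (for instance several consecutive Local params blocks) may behave differently in the real method.

```java
import java.util.regex.Pattern;

// Illustrative re-implementation based on the documented examples only.
final class LocalParamsSketch {
    // Optional leading whitespace, then "{!", anything up to the first "}", then "}".
    private static final Pattern LOCAL_PARAMS = Pattern.compile("^\\s*\\{![^}]*\\}");

    static String stripLocalParams(String aQueryString) {
        if (aQueryString == null) {
            return null;
        }
        // An unterminated prefix such as "{!whatever" does not match and is left untouched.
        return LOCAL_PARAMS.matcher(aQueryString).replaceFirst("");
    }
}
```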

removeQuerySyntaxChars
```
String removeQuerySyntaxChars(String aQueryString)
```
Removes all query syntax characters from a query string and trims the result.
Current query syntax characters are:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \

Note! This is a legacy shortcut for (strict/non-lenient) removeQuerySyntaxChars(aQueryString, false).

Parameters:

aQueryString - a non-null query expression

Returns:

aQueryString without syntax chars or null if aQueryString is null

See Also:

removeQuerySyntaxChars(String, boolean)

removeQuerySyntaxChars

String removeQuerySyntaxChars(String aQueryString,
                              boolean aLenientRemove)

Removes query syntax characters from a query string and trims the result.

Current query syntax characters are:
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \

Processing:

The "any" char will be removed, e.g. "ma?nus" -> "manus"
The "double" chars will be replaced with a "single" char, e.g. "ma&&nus" -> "ma&nus" and "ma||nus" -> "ma|nus"
The "not" chars will be removed, unless aLenientRemove is true. Lenient behaviour will try to keep all dashes that can be interpreted as "word separators" ("bindestreck" in Swedish).
Other chars will be replaced with a space. Subsequent syntax chars will only result in one space, e.g. "This is *so* funny!" -> "This is so funny"

Some examples
aQueryString	aLenientRemove	Returned
`"(Site?vision: *Enterprise) !?"`	`true / false`	`"Sitevision Enterprise"`
`"Anna-Karin?"`	`true`	`"Anna-Karin"`
`"Anna-Karin?"`	`false`	`"Anna Karin"`

Parameters:: aQueryString - a non-null query expression; aLenientRemove - whether or not to handle syntax chars in a lenient manner
Returns:: aQueryString without query syntax characters or null if aQueryString is null
Since:: Sitevision 8.2

smartWildcard

String smartWildcard(String aQueryString)

Gets a prefix/wildcard query that potentially will be scored.

The general purpose/advantage of a raw wildcard query (i.e. a prefix query) is that it will also result in hits for a partial word. This is typically a good thing for all "live-search/type-ahead" solutions. The downside is that the search result of such a query can be a real mess since all wildcard hits are scored exactly the same ("constant scoring"). In practice, this means that the hits of such a search result can show up in random order.

This method returns a "smart" wildcard query that combines the prefix-matching advantage of a raw wildcard query with potential scoring capabilities. This is achieved by expanding the word to multiple terms, adding the wildcard to one of them and using an implicit OR. In other words: "build a query that matches the exact word or the wildcarded word".

The query string "Car" transformed into a smart wildcard query "+(Car car*)" could conceptually result in a search result like this:

"Car sales" (exact clause match + wildcard clause match: score 1.23)
"Carpets" (wildcard clause match: constant scoring 1)
"Careless" (wildcard clause match: constant scoring 1)

The word that is wildcarded will also be lowercased for better matching (typically the query parser is primarily set up to use/query analyzed fields, i.e. typically lowercased). A word with a dash is potentially further duplicated for increased matching (dash is the "not" syntax char but is handled leniently). A word that ends with a syntax character will typically not be wildcarded at all. A word that contains a syntax character will typically get a raw wildcard as-is.

Some examples
aQueryString	Returned
`null`	`null`
`" "`	`null`
`"Car"`	`"+(Car car*)"`
`"Car*"`	`"+(Car car*)"`
`"Car?"`	`"Car?"`
`"title:Car"`	`"title:Car*"`
`"Anna-Carin"`	`"+(Anna-Carin AnnaCarin (+Anna +carin) anna-carin annacarin*)"`
`"019-173030"`	`"+(019-173030 019173030 (+019 +173030) 019-173030 019173030*)"`

The smart wildcard query downside/caveat is that the actual query is more complex. This increased complexity will typically distort the pattern matching for the Solr Elevation component, i.e. "elevated/sponsored" hits will typically never work for smart wildcard queries.

Parameters:: aQueryString - the query string
Returns:: aQueryString as a "smart" wildcard query or null if aQueryString is null or blank
Since:: Sitevision 8.2
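
The "exact word or wildcarded word" idea can be sketched for the simplest case, a single plain word. This is an assumption-laden illustration, not Sitevision's implementation: the class name `SmartWildcardSketch` is hypothetical, and anything that is not a plain word (dashes, field prefixes such as "title:Car", multiple words) is returned unchanged here, whereas the real method handles those cases as shown in the table above.

```java
import java.util.Locale;

// Illustrative sketch covering plain single words only; not Sitevision's code.
final class SmartWildcardSketch {
    static String smartWildcard(String aQueryString) {
        if (aQueryString == null || aQueryString.trim().isEmpty()) {
            return null;
        }
        String word = aQueryString.trim();
        // A trailing wildcard is implied, so drop it before building the query.
        if (word.endsWith("*")) {
            word = word.substring(0, word.length() - 1);
        }
        // Simplification: only words made of letters and digits are expanded.
        if (word.isEmpty() || !word.chars().allMatch(Character::isLetterOrDigit)) {
            return aQueryString;
        }
        // Exact term (can be scored) OR lowercased prefix term (constant score).
        return "+(" + word + " " + word.toLowerCase(Locale.ROOT) + "*)";
    }
}
```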

splitToQueryParts

String splitToQueryParts(String aStringToSplit,
                         String aSplitExpression)

Transforms a string with delimiters to a string that could be used in a field-grouped query expression.

This is a convenience method when you want to query something based on items in a string that are delimited by some token. A typical example is a "keyword" metadata that contains multiple keywords delimited by a comma char.

This method splits aStringToSplit with aSplitExpression. Each part is then trimmed and appended to the resulting string, separated with a space. Parts that contain a space char are quoted.

Some examples
aStringToSplit	aSplitExpression	Returned
`"one"`	`","`	`one`
`"one,two"`	`","`	`one two`
`"one, two"`	`","`	`one two`
`"one, two, three four"`	`","`	`one two "three four"`

`"one"`	`"aNonMatchingExpression"`	`one`
`"one,two"`	`"aNonMatchingExpression"`	`one,two`
`"one, two"`	`"aNonMatchingExpression"`	`"one, two"`
`"one, two, three four"`	`"aNonMatchingExpression"`	`"one, two, three four"`

`null`	`","`	`null`
`null`	`null`	`null`

`"one"`	`null`	`one`
`"one,two"`	`null`	`one,two`
`"one, two"`	`null`	`one, two`
`"one, two, three four"`	`null`	`one, two, three four`

Parameters:: aStringToSplit - the string that should be transformed; aSplitExpression - the regular expression to split up aStringToSplit in parts
Returns:: the result of the operation. If aStringToSplit is null, null will always be returned. If aSplitExpression is null, aStringToSplit will always be returned. If aSplitExpression is a non-matching expression, a trimmed aStringToSplit will always be returned, and it will be quoted if aStringToSplit contains a space char.
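
The split, trim and quote steps can be sketched as follows. This is an illustrative re-implementation that reproduces the example table above, not Sitevision's code; the class name `SplitSketch` is hypothetical, and skipping empty parts (e.g. for "one,,two") is an assumption since the documentation does not cover that case.

```java
// Illustrative re-implementation based on the documented examples only.
final class SplitSketch {
    static String splitToQueryParts(String aStringToSplit, String aSplitExpression) {
        if (aStringToSplit == null) {
            return null;
        }
        if (aSplitExpression == null) {
            return aStringToSplit; // returned as-is: not trimmed, not quoted
        }
        StringBuilder result = new StringBuilder();
        for (String part : aStringToSplit.split(aSplitExpression)) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: empty parts are skipped
            }
            if (result.length() > 0) {
                result.append(' ');
            }
            if (trimmed.indexOf(' ') >= 0) {
                // Quote parts containing a space so they stay one phrase in the query.
                result.append('"').append(trimmed).append('"');
            } else {
                result.append(trimmed);
            }
        }
        return result.toString();
    }
}
```

The returned value is intended as the value part of a field-grouped query, for example the aValueExpression argument of getFieldQuery.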

splitCollectionToQueryParts
```
String splitCollectionToQueryParts(Collection<String> aStringsToSplit,
                                   String aSplitExpression)
```
Transforms multiple strings with delimiters to a string that could be used in a field-grouped query expression.
This is a convenience method that executes splitToQueryParts(String, String) for a collection of strings and appends each returned value to a combined result, separated with a space. Whitespace only or null values will be ignored.

See splitToQueryParts(String, String) for how each string of the collection will be transformed.

Parameters:

aStringsToSplit - a collection of strings

aSplitExpression - the regular expression to split up the strings in the aStringsToSplit collection in parts

Returns:

the result of the splitToQueryParts(String, String) operation for all strings in aStringsToSplit. If aStringsToSplit is null or empty, null will always be returned.

See Also:

splitToQueryParts(String, String)
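
A minimal sketch of the collection variant, under stated assumptions. The class name `SplitCollectionSketch` is hypothetical and the nested `splitToQueryParts` is a compact stand-in written from the documented examples, not Sitevision's code. What is returned when every value in a non-empty collection is ignored is not documented; this sketch returns an empty string in that case.

```java
import java.util.Collection;
import java.util.StringJoiner;

// Illustrative re-implementation; not Sitevision's code.
final class SplitCollectionSketch {
    static String splitCollectionToQueryParts(Collection<String> aStringsToSplit,
                                              String aSplitExpression) {
        if (aStringsToSplit == null || aStringsToSplit.isEmpty()) {
            return null;
        }
        StringJoiner combined = new StringJoiner(" ");
        for (String value : aStringsToSplit) {
            if (value == null || value.trim().isEmpty()) {
                continue; // null and whitespace-only values are ignored
            }
            String parts = splitToQueryParts(value, aSplitExpression);
            if (!parts.isEmpty()) {
                combined.add(parts);
            }
        }
        return combined.toString();
    }

    // Compact stand-in for the single-string method (value is never null here).
    static String splitToQueryParts(String value, String splitExpression) {
        if (splitExpression == null) {
            return value;
        }
        StringJoiner joined = new StringJoiner(" ");
        for (String part : value.split(splitExpression)) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                joined.add(trimmed.indexOf(' ') >= 0 ? "\"" + trimmed + "\"" : trimmed);
            }
        }
        return joined.toString();
    }
}
```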

getFieldQuery

String getFieldQuery(String aFieldName,
                     String aValueExpression)

Returns a field query that is properly grouped.

This method trims aValueExpression and analyzes its space-separated parts, quoted and unquoted. The result will be a grouped field query if there are multiple parts in aValueExpression and a non-grouped field query if there is only one part in aValueExpression.

Note that this is a convenience method only. Neither field nor value will be syntactically checked in any way. The caller of this method is responsible for passing values that the query parser used later on will accept.

Some examples
aFieldName	aValueExpression	Returned
`content.analyzed`	`sitevision`	`content.analyzed:sitevision`
`+content.analyzed`	`sitevision*`	`+content.analyzed:sitevision*`
`-content.analyzed`	`enterprise`	`-content.analyzed:enterprise`
`content.analyzed`	`"sitevision enterprise"`	`content.analyzed:"sitevision enterprise"`

`content.analyzed`	`sitevision enterprise`	`content.analyzed:(sitevision enterprise)`
`content.analyzed`	`portal "sitevision enterprise"`	`content.analyzed:(portal "sitevision enterprise")`

Parameters:: aFieldName - the field expression; aValueExpression - the value expression
Returns:: a properly grouped field query. Note that null will be returned if aFieldName or aValueExpression is null or whitespace only.
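
The grouping rule can be sketched as follows: count the space-separated parts, treating a quoted phrase as one part, and add parentheses only when there is more than one. This is an illustrative re-implementation that reproduces the example table above, not Sitevision's code; the class name `FieldQuerySketch` and its helpers are hypothetical.

```java
// Illustrative re-implementation based on the documented examples only.
final class FieldQuerySketch {
    static String getFieldQuery(String aFieldName, String aValueExpression) {
        if (isBlank(aFieldName) || isBlank(aValueExpression)) {
            return null;
        }
        String value = aValueExpression.trim();
        // Group with parentheses only when the value has more than one part.
        String grouped = countParts(value) > 1 ? "(" + value + ")" : value;
        return aFieldName + ":" + grouped;
    }

    // Counts whitespace-separated parts; a quoted phrase counts as a single part.
    private static int countParts(String value) {
        int parts = 0;
        boolean inQuotes = false;
        boolean inPart = false;
        for (char c : value.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
            }
            if (Character.isWhitespace(c) && !inQuotes) {
                inPart = false;
            } else if (!inPart) {
                inPart = true;
                parts++;
            }
        }
        return parts;
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }
}
```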

getDateAsString
```
String getDateAsString(Date aDate)
```
Returns a date formatted according to the Solr date string representation.
All dates in Solr (Lucene) are stored using UTC (zulu time 'Z'). When a date is converted to a string that should be sent to Solr (for example as a part of a query) the timezone must be taken into consideration since no adjustments will be performed by the query parser.

Parameters:

aDate - the date

Returns:

aDate formatted according to Solr's date representation. Returns null if aDate is null.

Since:

Sitevision 4.2
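
A minimal sketch of a UTC date string in the ISO-8601 form Solr uses (for example `1970-01-01T00:00:00Z`), written with the standard `java.time` API. The class name `SolrDateSketch` is hypothetical, and whether Sitevision's own output includes fractional seconds is not stated here, so treat the exact precision as an assumption.

```java
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Illustrative sketch; not Sitevision's code.
final class SolrDateSketch {
    static String getDateAsString(Date aDate) {
        if (aDate == null) {
            return null;
        }
        // ISO_INSTANT always renders the instant in UTC with a trailing 'Z',
        // regardless of the JVM's default time zone.
        return DateTimeFormatter.ISO_INSTANT.format(aDate.toInstant());
    }
}
```

Because the conversion goes through an instant, a local time such as 01:00 in a UTC+1 zone is rendered as 00:00Z, which is the adjustment the query parser will not make for you.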
