Before tackling the API, it is important to review the order in which a query expression is parsed.
How a query is parsed depends on the precedence of its logical operators, that is, on which operators are evaluated before others. The
concept is familiar from ordinary arithmetic, where, for example,
multiplication takes precedence over addition. It can also be said that
multiplication has a higher order than addition. An example of this
is the following:
a = 3 + 4 * 5 = 23
The result of a is calculated by adding 3 to the product of 4 and 5. The result in this case is 23 because
the multiplication is executed first.
A common way of making the order of computation explicit is to use parentheses. In the math
example, the formula would then be written as follows:
a = 3 + (4 * 5) = 23
Using this form of representation makes it easier to understand how a query is executed in the search engine.
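The arithmetic example can be checked directly. The following is a minimal Python sketch (Python is used here only for illustration and is not part of the SharePoint API); the variable names are hypothetical.

```python
# Multiplication binds tighter than addition, so the first two forms agree.
implicit = 3 + 4 * 5
explicit = 3 + (4 * 5)

# Parentheses override the default order and change the result.
regrouped = (3 + 4) * 5

print(implicit, explicit, regrouped)
```

The same idea carries over to search queries: parentheses make the default grouping visible, and moving them changes what is evaluated first.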
Operator Order
Table 1 shows the order of the query operators in SharePoint 2010. This is relevant
when constructing queries that contain multiple terms or property
restrictions. As always, it is possible to override this priority by using
parentheses to group certain sub-expressions.
Table 1. Logical operators in SharePoint 2010 search
| Order | Operator | Description |
| 1 | AND | The logical AND statement dictates that terms on both sides of this operator have to match the result for it to be returned. It can also be used to group two separate blocks of conditions using the (condition set 1) AND (condition set 2). |
|  | + | This is the default operator if no other is specified. A term preceded by this operator must match a result for it to be returned. |
| 2 | NOT | This operator must succeed another term or condition group. The term or condition group following this operator must not be matched in the result if it is to be returned. This operator is similar to using the “-” operator. |
|  | - | A term preceded by this operator must not match a result for the result to be returned. |
| 3 | OR | The logical OR statement works similarly to the AND operator, except only one of the terms before or after the OR has to match the result for it to be returned. It can also be used to group two separate blocks of conditions using the (condition set 1) OR (condition set 2). |
| 3 | WORDS | This is not a logical operator but rather a function operator. It should be followed by a comma-separated list of terms surrounded by parentheses. The WORDS operator acts as if there were an implicit OR statement between each item in the list, with the exception of how they rank. When using the WORDS operator, the terms are treated as synonyms, not individual terms. Ranking of the terms is therefore equal to the total amount of occurrences of any term in the list. If the list contains two terms, and the first is found two times and the second is found three times, the synonym group would be ranked as if five occurrences of the same term were found. |
| 4 | NEAR | Sometimes the relevancy of a search is dependent on not only the terms but also how they appear in the text relative to one another. The NEAR operator allows for queries where the terms have to be close to each other. For logical reasons, this works with free text expressions only. Therefore it cannot be used in a keyword query. NEAR can be considered a more restrictive version of the AND operator. |
| 5 | * | The wildcard operator or asterisk character (“*”) is used to enable prefix matching. It can process any part of the beginning of a word. Even only one letter followed by the wildcard operator works. It acts as [0…n] random characters, which means it can also be put after the entire term and still match that term. |
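The counting rule described for the WORDS operator can be illustrated with a short Python sketch. It only mimics the "occurrences are pooled across synonyms" idea from the table; the function name, the sample text, and the simple word tokenization are assumptions for illustration, and the actual SharePoint ranking model involves more than a raw count.

```python
import re

def synonym_group_count(text, terms):
    """Count occurrences of any term in `terms` as if they were one term."""
    words = re.findall(r"\w+", text.lower())
    wanted = {t.lower() for t in terms}
    return sum(1 for w in words if w in wanted)

# Hypothetical document text: "TV" occurs twice, "television" three times.
text = "TV stand, TV mount, television cable, television remote, television"

# The synonym group is treated as five occurrences of a single term.
print(synonym_group_count(text, ["TV", "television"]))
```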
For example, take the following user-submitted query:
sharepo* search or search near office
This would be evaluated as follows:
('sharepo*' AND 'search') OR ('search' NEAR 'office')
In this example, it is assumed that keyword
inclusion is set to All Keywords, which means that there is an implicit
AND between keywords.
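To make the grouping in this example concrete, the following Python sketch parses a flat query into nested groups. It is a toy under stated assumptions, not SharePoint's parser: it assumes NEAR groups most tightly, then AND (explicit or implicit between adjacent keywords), then OR, which is the grouping the evaluation above shows. It does not handle parentheses, NOT, "+", "-", or WORDS, and the function names are hypothetical.

```python
def parse(query):
    """Group a flat keyword query into nested (operator, left, right) tuples."""
    tokens = query.split()
    pos = 0

    def peek():
        # Operators are matched case-insensitively; None marks end of input.
        return tokens[pos].upper() if pos < len(tokens) else None

    def term():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def near_expr():
        # Assumption: NEAR binds most tightly.
        nonlocal pos
        node = term()
        while peek() == "NEAR":
            pos += 1
            node = ("NEAR", node, term())
        return node

    def and_expr():
        # Adjacent keywords are joined by an implicit AND (All Keywords).
        nonlocal pos
        node = near_expr()
        while peek() not in (None, "OR"):
            if peek() == "AND":
                pos += 1
            node = ("AND", node, near_expr())
        return node

    def or_expr():
        # OR is applied last, joining the groups built above.
        nonlocal pos
        node = and_expr()
        while peek() == "OR":
            pos += 1
            node = ("OR", node, and_expr())
        return node

    return or_expr()

print(parse("sharepo* search or search near office"))
```

For the example query, the outermost operator of the resulting structure is OR, with an AND group on the left and a NEAR group on the right.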
Using a Tree Structure to Understand the Query
A good way to understand how a query behaves is to draw a query tree, as shown in Figure 1.
The tree makes it easier to see the individual components of
the query and to evaluate the impact of the operator order. It is
especially helpful when parentheses are added to the query to change the ordering. The following is an example of a
query and the corresponding query tree.
sharepo* search or search near office
Figure 1. Creating a query tree structure for understanding the search
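A query tree can also be written out as text. The Python sketch below renders the example query as an indented tree; the tree is entered by hand as nested tuples that follow the grouping given earlier for this query, and the `render` helper is a hypothetical name used only for illustration.

```python
def render(node, depth=0):
    """Return the tree as a list of lines, one node per line, indented by depth."""
    indent = "  " * depth
    if isinstance(node, str):
        return [indent + node]  # leaf: a single search term
    op, left, right = node
    return [indent + op] + render(left, depth + 1) + render(right, depth + 1)

# Hand-built tree for: sharepo* search or search near office
tree = ("OR", ("AND", "sharepo*", "search"), ("NEAR", "search", "office"))

print("\n".join(render(tree)))
```

Each operator appears one level above the terms or groups it combines, so the operator that is applied last sits at the root.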
If this query is to be submitted as a FullTextSqlQuery, the SQL would be as follows:
SELECT WorkId, Rank, Title, Author, ModifiedBy, Size, Path, Description, Created, Write, Filename,
    SiteName, SiteTitle, CollapsingStatus, HitHighlightedSummary, HitHighlightedProperties,
    ContentClass, IsDocument, ContentType, objectid, PictureURL, WorkEmail, CreatedBy, ContentSource,
    FileExtension
FROM SCOPE()
WHERE (FREETEXT(defaultproperties, 'software') AND CONTAINS(' "sharepo*" AND search '))
    OR CONTAINS('search NEAR office')
ORDER BY Rank DESC
Manipulate the Query with Parentheses
You can force a specific evaluation order when the query is parsed
by grouping different parts of a keyword
query with parentheses. It is important that the parentheses are balanced,
so that every opening “(” has a matching closing “)”.
Whitespace next to a parenthesis does not influence the search
result, because it is stripped out by the search expression parser.
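Because unbalanced parentheses are an easy mistake when queries are assembled in code, it can help to check a query string before submitting it. The following Python sketch shows one simple way to do that; the function name is hypothetical, and the check ignores the possibility of parentheses inside quoted phrases.

```python
def parentheses_balanced(query):
    """Return True if every "(" has a matching ")" in the right order."""
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ")" appeared before its "("
                return False
    return depth == 0

print(parentheses_balanced("(sharepo* search) or (search near office)"))
print(parentheses_balanced("(sharepo* search or search near office"))
```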
Understanding how query
parsing takes place and how the order can be modified is important when
working with the search API. In the following sections, various ways of
using the search API are presented. When creating full text queries or
querying through the search web service, the query is passed as clear
text. When creating SQL queries, the query is created from a number of
aggregated statements. Although this is different from providing one
complete query string, the same principle applies.