Tuesday, July 12, 2011

Small tip regarding optimizing FTSearches

Lately I've been working on SQL (MS). At work today, I had a talk about optimizing queries with a colleague more seasoned in the art of writing queries. We got into a talk about if the order of the filter statements (WHEN ..=..) and performance. Apparently, MS have optimized their engine so that the order of the filtering statements don't have much influence on the performance of the query.

This got me thinking about FTSearch. A year or so ago, I thought about doing some testing on how you could structure an FT query to get the best performance, but never got around to do it.

I did a little test today, and it seems like the order of the filters doesn't influence the result much. One thing that seems to heavily influence the result is if one of the query items alone results in a lot of documents.

If you search for "Tom", and the value "Tom" is in a field in a lot of documents, this will drag down the result, no matter if another query item in the query would result if only one document being returned from the query.

Example from test:
Searching for 'abigail AND abbott' - 2-5ms to get result.
Searching for '[Form=Person] AND abigail AND abbott' - 15-20ms to get result.

Conclusion from my test. Query items in a FTSearch that alone results in a lot of documents drags down the performance of the entire query. Order of query items doesn't seem to influence the performance of the query.

If you're building a search engine for databases with a lot of documents, try to avoid having general filters (Form/etc.) if possible.

3 comments:

Karsten Lehmann said...

What's the performance of
'abigail AND abbott AND [Form=Person]' ?
Does reordering of search terms make a difference in this case?

Tommy Valand said...

From my tests, minimal at best. The "median" result didn't change when I shuffled the terms around.

Sodibba said...

If you are searching within a view then the FTI will discard based on NIF results. So you can filter forms that way.

Using the following notes ini setting will give you more details on the search you conducted.

Debug_FTV_Search=1