Some tips for using full-text search in SQL Server 2016
Use a full-text query instead of the LIKE Transact-SQL predicate if you need to
query formatted binary data or query a large amount of unstructured text data.
A full-text query against millions of rows of text data can take only seconds; whereas
a LIKE query against the same data can take minutes to return.
Reduce the full-text unique key size.
To create a full-text index, the table to be indexed must have a unique index. Try to
select a numeric column as the full-text unique key to increase the speed of full-text
population. If the table to be indexed does not have numeric unique index, consider
creating numeric unique index.
Set the virtual memory size to at least 3 times the physical memory installed in
the computer, and set the SQL Server ‘max server memory’ server configuration
option to half the virtual memory size setting (1.5 times the physical memory).
Because working with full-text search is very resource expensive, you should have
enough physical and virtual memory.
Update the statistics on the clustered index or the full-text key for a full population.
Using so, you can help a multi-range population to generate good partitions on the table.
Set the ‘max full-text crawl range’ option to the number of CPUs on the server box.
Setting this option to the number of CPUs on the server box allows optimize CPU
utilization, which improves crawl performance during a full-text index crawl.
Because the ‘max full-text crawl range’ configuration option is an advanced option, you
should set the ‘show advanced option’ option to 1 to make the ‘max full-text crawl
range’ available.
Note. Setting the ‘max full-text crawl range’ option takes effect immediately without
a server restart.
Consider using the full-text property searching.
SQL Server 2016 supports property searching. Now, you can configure a full-text
index to support property-scoped searching on properties, which are emitted by IFilters.
Use an integer data type for the first column of the clustered index of the base table.
Using an integer data type for the first column of the clustered index of the base table
produces the highest full-text index population speed.
Build a secondary index on a timestamp column.
By using so, you can improve the performance of incremental population.
If you have several physical disks, place the database files separately from the
full-text catalog files.
In this case, you can improve the speed of full-text queries, because multiple disks
can process input/output requests concurrently.
Consider using search across multiple columns.
In SQL Server 2016, you can specify an arbitrary number of columns in a full-text
predicate via a column list.
If you have several physical disks, create several Pagefile.sys files, so that each
Pagefile.sys file will be placed on its own physical disk.
Spreading paging files across multiple disk drives and controllers improves
performance on most disk systems because multiple disks can process input/output
requests concurrently.
Use the top_n_by_rank parameter with CONTAINSTABLE or FREETEXTTABLE.
It can be used to restrict the number of rows returned. The top_n_by_rank parameter
specifies that only the n-highest ranked matches, in descending order, will be returned.
Try to use the CONTAINS or FREETEXT predicates instead of the CONTAINSTABLE
or FREETEXTTABLE functions to simplify the query’s text.
Because qualifying rows returned by the CONTAINSTABLE or FREETEXTTABLE
rowset functions must be explicitly joined with the rows in the original SQL Server table,
the queries that use the CONTAINSTABLE and FREETEXTTABLE functions are more
complex than those that use the CONTAINS and FREETEXT predicates.
Consider limit the size of the buffer pool before you perform a full population.
If the sqlservr.exe process tries to grab all available memory for the buffer pool,
out-of-memory conditions and failure to allocate shared memory can occur for the
fdhost.exe process. You can limit the size of the buffer pool by setting the
‘max server memory’ server option to leave enough memory for the fdhost.exe
process and operating system use.
Use full-text data definition language (DDL) statements to create, modify, and
drop full-text catalogs and indexes.
You can use CREATE FULLTEXT CATALOG, ALTER FULLTEXT CATALOG,
DROP FULLTEXT CATALOG, CREATE FULLTEXT INDEX, ALTER FULLTEXT
INDEX and DROP FULLTEXT INDEX statements to create, modify, and drop
full-text catalogs and indexes. Because using the full-text data definition language
(DDL) statements is more efficient than using the stored procedures and the stored
procedures, which used to work with full-text catalogs and indexes, will be removed
in a future version of SQL Server, you should use full-text data definition language
(DDL) statements to create, modify, and drop full-text catalogs and indexes.
Consider using the NEAR option of the CONTAINS predicate or CONTAINSTABLE
function.
SQL Server 2016 supports the NEAR option of the CONTAINS predicate or
CONTAINSTABLE function. By using the custom NEAR option you can do the following:
– specify the maximum number of non-search terms that separate the first and last
search terms in a match
– specify that words and phrases are matched only if they occur in the order in which
you specify them.
Set the ‘awe enabled’ server configuration option to 1 if you have more than
4 gigabytes (GB) of physical memory.
Beginning in SQL Server 2008, the full-text engine can use AWE memory because
the full-text engine is part of the sqlservr.exe. Because the ‘awe enabled’ configuration
option is an advanced option, you should set the ‘show advanced option’ option to 1
to make the ‘awe enabled’ available.
Note. You must restart the SQL Server 2016 to apply changes to the ‘awe enabled’ option.
Make full-text index population during periods of low database access.
Because full-text index population takes some time, these updates should be
scheduled during CPU idle time and slow production periods.
Assign a very large table (a table that has millions of rows) to its own full-text catalog.
This can improve performance, and can be used to simplify administering and monitoring.