$smwgFulltextSearchMinTokenSize

From semantic-mediawiki.org


Configuration parameter details:
Name $smwgFulltextSearchMinTokenSize
Description Sets the minimum word/token length used to decide whether the MATCH or LIKE operator is used for a condition statement
Default setting 3
Software Semantic MediaWiki
Since version 2.5.0
Until version still available
Configuration Full-text search · Experimental
Keyword full-text search · data store · relational database · sql store · sql database · experimental


$smwgFulltextSearchMinTokenSize is a configuration parameter that sets the minimum word/token length used to decide whether the MATCH or LIKE operator is used for a condition statement of the full-text search. This allows falling back to LIKE for search terms that do not meet the minimum length. The configuration parameter is specific to the relational data stores MySQL and MariaDB. The higher the value, the smaller the index and the faster queries tend to be. However, the default setting appears to be the most pragmatic choice in most cases when querying for meaningful content. The configuration parameter was introduced in Semantic MediaWiki 2.5.0 (released on 14 March 2017 and compatible with MW 1.23.0 - 1.29.x).[1]
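
The decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Semantic MediaWiki's actual implementation: the function name chooseSearchOperator is hypothetical, and the sketch only compares the character length of a single search term against the configured minimum.

```php
<?php
// Hypothetical helper for illustration only; it is not part of
// Semantic MediaWiki's API.
function chooseSearchOperator( string $term, int $minTokenSize = 3 ): string {
	// Count characters rather than bytes so that multi-byte text
	// (for example CJK) is measured correctly.
	$length = mb_strlen( trim( $term ), 'UTF-8' );

	// Terms that reach the minimum length can use the FULLTEXT index (MATCH);
	// shorter terms fall back to a LIKE condition.
	return $length >= $minTokenSize ? 'MATCH' : 'LIKE';
}

echo chooseSearchOperator( 'wiki' ), "\n"; // MATCH
echo chooseSearchOperator( 'db' ), "\n";   // LIKE
```

With the default of 3, a four-character term such as "wiki" would go through MATCH, while a two-character term such as "db" would fall back to LIKE.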

This configuration parameter only takes effect if full-text search was enabled via the configuration parameter $smwgEnabledFulltextSearch (sets whether full-text search support for properties may be used).
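
A minimal sketch of how the two parameters might appear together in "LocalSettings.php", assuming full-text search is switched on with a plain boolean; consult the documentation of $smwgEnabledFulltextSearch for the values it accepts in your version.

```php
// LocalSettings.php, after the enableSemantics() call.

// Assumption: a boolean true enables full-text search support.
$smwgEnabledFulltextSearch = true;

// Only evaluated when full-text search is enabled; 3 is the default.
$smwgFulltextSearchMinTokenSize = 3;
```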

Default setting

$smwgFulltextSearchMinTokenSize = 3;

This means that the minimum length of words stored in the InnoDB FULLTEXT index is "3".

Changing the default setting

If this configuration parameter is changed, it must be set to an integer between "1" and "16" that corresponds to the MySQL/MariaDB server setting in use: either innodb_ft_min_token_size (InnoDB tables, integer between "0" and "16") or ft_min_word_len (MyISAM tables, integer of "1" or higher).
The maintenance script "rebuildFulltextSearchTable.php", which rebuilds the full-text search data table, must be run after changing the setting of this configuration parameter.

To modify the setting of this configuration parameter, add one of the following lines to your "LocalSettings.php" file after the enableSemantics() call:

Increase the minimum length of words stored
$smwgFulltextSearchMinTokenSize = 5;

This means that the minimum length of words stored in the InnoDB FULLTEXT index is "5". This reduces the size of the index, thus speeding up queries, by omitting common words that are unlikely to be significant in a search context, such as the English words "a", "to" and "and".

Reduce the minimum length of words stored
$smwgFulltextSearchMinTokenSize = 1;

This means that the minimum length of words stored in the InnoDB FULLTEXT index is "1". Only recommended for CJK languages (Chinese, Japanese, Korean).

See also

General information
Related configuration parameters


References

  1. Semantic MediaWiki: GitHub pull request gh:smw:1481