Search++ should be ready-configured on installation for most sites. Nevertheless, if you encounter any difficulties, here are some tips to help you along and fine tune the search experience.
If some content or pages never show in the search results
This is the behaviour you would expect from the core concrete5 search. It can happen for two reasons:
Search++ has a scoring mechanism that can be tuned to bias results in favour of particular aspects of the matching algorithm. The default settings in the Search Score dashboard page are a good starting point for most sites, so before adjusting them, have a look at the Search Results dashboard page.
Searches are scored and the top scoring result has a relevance of 100%. All subsequent results are assigned a relevance percentage with respect to the top scoring result and the Minimum Relevance percentage is the value below which results are not considered relevant.
If you are getting too many irrelevant results, try raising this value a little. If you are getting too few relevant results, try lowering it. Generally, somewhere between 20% and 50% appears to work reasonably well, but you may need to adjust further if you have changed the Search Score weightings.
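The relevance arithmetic described above can be sketched in a few lines of Python. This is an illustration only: it assumes relevance is each raw score divided by the top score, and the function name `relevant_results`, the sample scores and the page paths are hypothetical rather than taken from Search++.

```python
def relevant_results(scored_pages, minimum_relevance=30.0):
    """Return (page, relevance %) pairs at or above the minimum relevance.

    scored_pages: list of (page, raw_score) tuples in any order.
    minimum_relevance: percentage threshold (hypothetical default).
    """
    if not scored_pages:
        return []
    top_score = max(score for _, score in scored_pages)
    if top_score <= 0:
        return []
    results = []
    # Highest score first, so the first result is always 100%.
    for page, score in sorted(scored_pages, key=lambda item: item[1], reverse=True):
        relevance = 100.0 * score / top_score
        if relevance >= minimum_relevance:
            results.append((page, round(relevance, 1)))
    return results


# Hypothetical raw scores for three pages.
pages = [("/docs/tabs", 8.0), ("/docs/install", 4.0), ("/blog/news", 1.0)]
print(relevant_results(pages, minimum_relevance=30.0))
```

With a threshold of 30% the third page (12.5%) is dropped. Lowering the threshold to 10% would bring it back, which is the trade-off between too many and too few results.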
To help you fine tune this, the score and percentage score can be shown in search results by configuring the respective settings on the Diagnostics dashboard page. You can limit the users who see the scores to just Administrators, so you can see the scores but guests do not.
This is the opposite problem to too many results. The simplest solution is to lower the Minimum Relevance. But you may need to do a little more than that. To help diagnose, configure the Diagnostics in the dashboard to log failed searches and/or to show search scores with the results.
To help you fine tune this, the score and percentage score can be shown in search results by configuring the respective settings on the Diagnostics dashboard page. You can limit the users who see the scores to just Administrators, so you can see the scores but guests do not.
Search++ implements the concrete5 permissions system, so when configured properly it will only list pages a user is permitted to see.
If you find users are seeing search results listing pages they are not permitted to see, the cause will be cache configuration.
Search++ facilitates caching at 3 scopes:
So if you find users are seeing search results listing pages they are not permitted to see, check to see if All users share cache is enabled, then change it to one of the less permissive cache settings.
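To see why a cache shared by all users can list pages a user is not permitted to see, consider this minimal Python sketch. It is not the Search++ implementation: the `SearchCache` class, the two scopes shown (shared and per-user) and the permission table are all hypothetical, chosen only to illustrate the effect of the cache key.

```python
class SearchCache:
    """Toy result cache keyed by query, and optionally also by user."""

    def __init__(self, shared):
        self.shared = shared  # True: one cache entry per query for everyone
        self.store = {}

    def key(self, query, user):
        return (query,) if self.shared else (query, user)

    def search(self, query, user, run_search):
        k = self.key(query, user)
        if k not in self.store:
            self.store[k] = run_search(query, user)
        return self.store[k]


# Hypothetical permissions: which pages each user may view.
VISIBLE = {"admin": ["/public", "/private"], "guest": ["/public"]}


def run_search(query, user):
    # Stand-in for a permission-aware search.
    return list(VISIBLE[user])


shared = SearchCache(shared=True)
shared.search("report", "admin", run_search)           # admin populates the cache
leaked = shared.search("report", "guest", run_search)  # guest receives admin's results

per_user = SearchCache(shared=False)
per_user.search("report", "admin", run_search)
safe = per_user.search("report", "guest", run_search)

print(leaked)  # includes /private
print(safe)    # only /public
```

The search itself respects permissions in both cases. The leak comes purely from reusing a result cached for a more privileged user, which is why a less permissive cache setting resolves it.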
What we mean here is that text extracted from blocks appears in a different sequence to the blocks on the page. This is unfortunately a consequence of a concrete5 core bug, where the actual PageSearchIndex content is populated with blocks in the wrong sequence. Most of the time the index is correct, but sometimes the core just gets the blocks out of sequence when it extracts the searchable content.
With the PageSearchIndex content out of sequence, Search++ then extracts a synopsis and the rest follows. The causality where you may encounter this is:
The same happens with the concrete5 core search because it is the data saved in the PageSearchIndex table that can be faulty. If you suspect this core bug, the faulty PageSearchIndex table can easily be confirmed using the Search++ Index Inspector.
There are some aspects of Search++ that can make it more likely to be noticed. Synopsis 'From top of content', 'At first match' and 'Bias to top' could be fed with incorrect data and can consequently pick a different synopsis to what would be shown with correct data. 'At densest match' is unaffected unless the synopsis length spans a boundary. If a synopsis is shorter, the issue with out-of-sequence PageSearchIndex content is less likely to be noticed.
From v9 the below is in the Concrete CMS core. Before v9 a partial resolution can be made by modifying the query the core uses to index searches. In file concrete/src/Page/Search/IndexedSearch.php, modify the query on line 130 to:
SELECT `bID`, `arHandle` FROM `CollectionVersionBlocks` WHERE `cID` = ? AND `cvID` = ? ORDER BY `arHandle` ASC, `cbDisplayOrder` ASC
Rather than modifying the code directly, you can apply the same change from an approved pull request on the concrete5 GitHub.
This modification ensures that, from v9 onwards or where the pull request is applied, blocks within areas are indexed in their on-page order. It does not resolve the issue for area sequence, or for layouts or other accumulations of blocks within areas. Hence it is an improvement, but not a complete solution to a correctly sequenced search index.
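The effect of ordering by `arHandle` and then `cbDisplayOrder` can be illustrated with a short Python sketch. The rows below are hypothetical sample data, and the sort simply mimics the ORDER BY clause; it is not the core indexing code.

```python
# Hypothetical rows from CollectionVersionBlocks for one page version.
rows = [
    {"bID": 21, "arHandle": "Main", "cbDisplayOrder": 1},
    {"bID": 22, "arHandle": "Page Header", "cbDisplayOrder": 0},
    {"bID": 23, "arHandle": "Main", "cbDisplayOrder": 0},
    {"bID": 24, "arHandle": "Page Footer", "cbDisplayOrder": 0},
]

# Equivalent of ORDER BY arHandle ASC, cbDisplayOrder ASC.
indexed = sorted(rows, key=lambda r: (r["arHandle"], r["cbDisplayOrder"]))

for r in indexed:
    print(r["arHandle"], r["cbDisplayOrder"], r["bID"])
```

Within the Main area the blocks now come out in display order (23 before 21). The areas themselves are ordered alphabetically by handle, so in this sample Main is indexed before Page Header and Page Footer before Page Header, which is the remaining area sequence limitation.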
Stopwords are the words used to link sentences together that, in most cases, have no bearing on the subject being searched for.
Words like and, is, of, at, about, from, that, where and when are all in the standard list of MySQL Stopwords. However, stopwords vary between installations and between languages. To view a list of the stopwords configured for your site, visit the Search++ dashboard Diagnostics page and enable List all stopwords in-page. This will dump the stopwords onto the search results page when you run a search.
Stopwords in MySQL are configured in the MySQL server settings and cannot be adjusted from within your concrete5 dashboard.
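As a rough illustration of what stopword removal does to a query, here is a minimal Python sketch. The stopword set contains only the examples listed above, and the function name `significant_terms` is hypothetical; the real list is whatever your MySQL server is configured with.

```python
# A small sample list; the real list comes from the MySQL server settings.
STOPWORDS = {"and", "is", "of", "at", "about", "from", "that", "where", "when"}


def significant_terms(query, stopwords=STOPWORDS):
    """Return the query words that are not stopwords, lower-cased."""
    return [w for w in query.lower().split() if w not in stopwords]


print(significant_terms("About stopwords and stemming"))
```

Only the words that survive this filtering contribute to the full text match.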
Stemming is a way of trimming off trailing variations of a word to find the root or stem of the word. For example:
However, there is a lot more involved in stemming than just truncating a trailing 's' or 'ing'. Search++ uses an algorithm called Snowball and tries to configure Snowball for the currently active language.
If the stemming library does not have an option for the currently active language, a default language set in the Full Text Options dashboard page will be applied when words are stemmed. On install, the default is English.
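To show why simple truncation is not enough, here is a deliberately naive Python stemmer. It is not Snowball and not what Search++ uses; the function name `naive_stem` and the length limits are invented for this illustration.

```python
def naive_stem(word):
    """Strip a trailing 'ing' or 's'. Deliberately simplistic, not Snowball."""
    word = word.lower()
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


for w in ["tabs", "searching", "running", "business"]:
    print(w, "->", naive_stem(w))
```

The first two words reduce sensibly to "tab" and "search", but "running" becomes "runn", which would never match "run", and "business" becomes "busines". Handling doubled consonants, irregular endings and other languages is the extra work a real stemming algorithm takes on.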
Characters outside the configured character set are used to detect word boundaries and calculate word lengths when detecting three character words. Hence a word containing, for example, a German Ä may not be searched as you would like.
If your language includes characters outside the regular ASCII A-Z, you may want to add them in the Character set extension section of the Full Text Search dashboard page.
The underlying MySQL full text search used by Search++ ignores all words shorter than 4 characters. Anything of 3 characters or fewer is treated by MySQL as a stop word.
However, especially in technical applications, we often have 3 character words or abbreviations. For example, an unmodified MySQL search would completely ignore "URL" and "CSS".
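The following Python sketch shows which words of a query would fall below the 4 character limit described above. It only illustrates the length rule: the function name `split_by_length` and the simple ASCII word pattern are assumptions for this example, not the MySQL tokenizer.

```python
import re

MIN_WORD_LENGTH = 4  # the minimum word length described above


def split_by_length(query, min_length=MIN_WORD_LENGTH):
    """Split query words into (indexed, ignored) by the minimum length."""
    words = re.findall(r"[A-Za-z0-9]+", query)
    indexed = [w for w in words if len(w) >= min_length]
    ignored = [w for w in words if len(w) < min_length]
    return indexed, ignored


print(split_by_length("rewrite URL in CSS file"))
```

Here "URL" and "CSS" land in the ignored list alongside "in", even though they are the most meaningful words in the query.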
To get round this, Search++ provides a pair of mechanisms:
Search++ can be configured on the Logging dashboard page to log all successful searches to the concrete5 log. The addon has its own log channel at Search Plus Plus, so you can easily filter the log to just show search information. Successful searches are logged as information with the query terms and score.
For autocomplete they are logged as notice.
Search++ can be configured on the Logging dashboard page to log all failed searches to the concrete5 log. The addon has its own log channel at Search Plus Plus, so you can easily filter the log to just show search information. Failed searches are logged as warnings.
If you find visitors are often misspelling words or using alternative words, you can add them as synonyms in the Synonyms and Abbreviations dashboard page.
Search++ development was originally funded by customers with big documentation sites having thousands of pages. It performed shockingly well on those sites, despite the complexity of the queries that Search++ can assemble.
The trick is the same as with any web site. You need a server sized to match your site size and load.
Since v0.2.6, Search++ also provides caching of searches. If the same search is executed multiple times, results can be pulled from the cache.
This support site is hosted on a typical shared hosting platform, so has considerably fewer server resources than a large business site would usually have.
If you are concerned over search speed, Search++ has options to show millisecond search timing with query results. See the Diagnostics dashboard page.
You can also install the concrete5 Speed Analyzer addon to examine how Search++ timings contribute to overall page rendering speed.
Styling of results is relatively agnostic within a Bootstrap theme. Within most Bootstrap themes Search++ query and result blocks will simply fit in straight away or be easily configurable through minor adjustments to a block template's view.css.
For other theme frameworks, Search++ blocks use the same classes as the core page list and search blocks, so again should adapt relatively easily to styles your theme already provides.
Search++ doesn't currently support filtering by tags or other attribute values (It is on the roadmap if you would like to sponsor development).
An easy alternative that helps in most situations is to make sure the tag terms are in the searchable content of the pages you need to be found. This can be in titles, description or content. You can also add attribute display or tag list blocks to the page. These block types will need a getSearchableContent() method as described in "A block type is not being indexed" below.
You can then create links to the 'tagged' searches, for example:
Here these are styled as links so you can see what is going on, but they could just as easily be styled as buttons or even as a tag cloud.
The 'tagged' pages won't always come out top of the search results, but will usually be high on the list.
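Links to 'tagged' searches are just links to your search results page with the tag as the query. A minimal Python sketch of building such links follows. The page path `/search` and the parameter name `query` are assumptions for illustration; substitute whatever your own search results page expects.

```python
from urllib.parse import urlencode

# Hypothetical search page path and parameter name; adjust to your site.
SEARCH_PAGE = "/search"
QUERY_PARAM = "query"


def tagged_search_link(tag):
    """Build a link that runs a search for the given tag term."""
    return SEARCH_PAGE + "?" + urlencode({QUERY_PARAM: tag})


for tag in ["Magic Tabs", "stemming"]:
    print(tagged_search_link(tag))
```

Encoding the tag matters for multi-word tags, where the space must not appear raw in the URL.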
What we mean here is an internal code exception in an addon or the core when you click the install button from the Add Functionality dashboard page or when you visit a page after installing an addon package.
If you experience such an issue, here are a few things you can check that may resolve the problem.
If you experience a code error on or immediately after install and need assistance, please use the Get Help link from the addon marketplace page to report the problem. On the Whoops report, click the [copy] button immediately below the error message. That will provide a stack trace you can paste into the help request and save time having to request that report later.
Search indexing has nothing to do with Search++; it's all handled by the core search index jobs. Nevertheless, should you find the Index Search Engine - All job is running into PHP execution limits and failing, you can reduce the batch size in the file concrete/jobs/index_search_all.php.
Unfortunately the batch size is currently hard-coded into the source. In a better world it would be a configuration parameter. Hence you need to override the job file or edit the original source. This is a case where editing the original source is not that big a deal. Just change
public $jQueueBatchSize = 50; // whatever smaller value works reliably.
If you suspect a block type is not being indexed with the page, first off, double-check the simple things:
With the obvious double-checked, all block types are indexed by calling a getSearchableContent() method in the controller class. Find the block controller class and check for
public function getSearchableContent()
{
    // method returns $text derived from the block content
}
Without that method, they have no searchable content and hence will not contribute any text to the PageSearchIndex table.
The solution is conceptually easy: you need to implement that method in the offending block controller or in an override of the block controller.
From version 0.3.3, Search++ provides a generic helper class for block search indexing. In the block controller you can implement:
public function getSearchableContent()
{
    $sc = new \JtF\SearchPlusPlus\SearchableContentHelper($this);
    return $sc->getSearchableContent();
}
The SearchableContentHelper class is a bit of an overkill approach because it actually renders the block and strips out the text from that. For many block types, there will be a more efficient way of capturing searchable content. Contact me if you need further assistance with Searchable Content.
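The general idea of rendering a block and stripping the text out of the result can be sketched in Python as below. This is not the SearchableContentHelper code, only an illustration of the technique; the `TextExtractor` class, the `searchable_text` function and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script and style contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style elements.
        if not self.skip_depth and data.strip():
            self.parts.append(data.strip())


def searchable_text(rendered_html):
    """Return the plain text of a rendered block as a single string."""
    parser = TextExtractor()
    parser.feed(rendered_html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical rendered output of a block.
html = "<div class='tabs'><h2>Magic Tabs</h2><style>.x{}</style><p>Easy <b>tabbed</b> content.</p></div>"
print(searchable_text(html))
```

Rendering a whole block just to recover its text is the overkill mentioned above. Where a block stores its text in a known field, returning that field directly is usually cheaper.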
As a final check, from version 0.3.5 use Dashboard > System & Settings > Search++ > Index Inspector to investigate the search index table for specific pages.
Alternatively, you can look at the PageSearchIndex table using phpMyAdmin or the MySQL command line with the query:
SELECT `cID`, `cPath`, `content` FROM `PageSearchIndex` WHERE `cPath` NOT LIKE '%dashboard%' AND `cPath` NOT LIKE '%!%'
Perhaps you are unable to connect your site directly to the Concrete CMS marketplace to install an addon or theme. This manual process works for all addons and themes, not just mine.
The process is exactly the same for addons and themes, except themes have an extra step of activating the theme after installing.
Sometimes step 3 above can run out of PHP execution time. This is most likely when installing an addon or theme that installs a large amount of sample content. You should not run into such an issue with any of my addons or themes.
If you do run into such issues, you can run the install manually from the shell command line.
$ concrete/bin/concrete5 c5:package-install my_package_handle
or
$ concrete/bin/concrete5 c5:package-update my_package_handle
When updating, be sure to replace the previously installed package directory rather than adding to it. If not, you could end up accumulating obsolete debris from a previous version of the package.
If you find yourself needing to install or update many addons or themes manually, consider my Package Magic addon.
Once Package Magic Starter is installed, through the marketplace or manually, all further installs can be handled from the site dashboard using Package Magic.
Searching on this site uses Search++, a highly enhanced search system for concrete5.
For example, the core search would find "Magic Tabs", but would return no results for any of:
Search++ handles full phrases and individual words, stemming, synonyms, abbreviations and acronyms, building a ranked search to list the best matches.
We also have a page showing a more direct comparison at Search Comparison.
If you would like to add similar enhanced searching to your concrete5 site, please contact me and we can discuss your search requirements.