Unexpected tag search behaviour

A client reported to me that full text search was not working with profile tag fields.

I checked the code and sure enough, the where clause contains:

msn.string IN ($tags_in)

rather than a substring or full-text operator such as LIKE or MATCH ... AGAINST.

This means that if, for example, a user fills in a profile tag field (such as interests) with "american football" and someone searches for "football", this tag will not appear in the search results, because IN matches only whole tag strings exactly.
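To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Elgg's actual MySQL schema. The one-column metastrings table and the two helper functions are stand-ins I made up for illustration; only the IN versus LIKE contrast is taken from the code quoted above.

```python
import sqlite3

# Toy stand-in for the metastrings table (not the real Elgg schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metastrings (id INTEGER PRIMARY KEY, string TEXT)")
conn.executemany(
    "INSERT INTO metastrings (string) VALUES (?)",
    [("american football",), ("football",), ("chess",)],
)


def search_exact(term):
    """Exact matching, like `msn.string IN ($tags_in)`."""
    rows = conn.execute(
        "SELECT string FROM metastrings WHERE string IN (?) ORDER BY id",
        (term,),
    )
    return [row[0] for row in rows]


def search_substring(term):
    """Substring matching with LIKE; wildcards in the term are escaped."""
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT string FROM metastrings "
        "WHERE string LIKE ? ESCAPE '\\' ORDER BY id",
        (f"%{escaped}%",),
    )
    return [row[0] for row in rows]


print(search_exact("football"))      # only the tag that is exactly "football"
print(search_substring("football"))  # also finds "american football"
```

The exact query misses "american football" while the LIKE query finds it, which is the behaviour the client noticed.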

This behaviour appears to be deliberate. However, I find it surprising and I wonder if there is an explanation as to why it was done that way.

If I wanted to replace the search_tags_hook function with one that did full text search, would I need to modify the search plugin or is there a way to do this in my own plugin?

  • The metastrings table doesn't have a fulltext index, and some quick profiling showed that adding one seriously hurt performance. I haven't had a chance to revisit this.

  • Ah, OK.

    I am adding configurable full text search to the form plugin.

    For all other entities I am just copying all searchable fields to one __searchable_data field and joining to that with a LIKE in the where clause.

    Perhaps not super fast, but gets the job done.

    I can't do that for user profiles because user metadata has independent access.

    Makes me wish (not for the first time) that metadata did not have its own access levels and that user profiles were object containers.

    Water under the bridge, however.
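The access problem with the single-field approach in the reply above can be sketched roughly as follows. This is plain Python rather than the form plugin's code; the profile layout, the "public"/"private" labels, and the function names are all hypothetical.

```python
def build_searchable_data(fields):
    """Concatenate every field value into one searchable string."""
    return " ".join(value for value, _access in fields.values())


def matches_combined(fields, term):
    """Like a LIKE '%term%' on the combined field: ignores per-field access."""
    return term.lower() in build_searchable_data(fields).lower()


def matches_per_field(fields, term, viewer_can_see):
    """Check each field separately, skipping fields the viewer may not see."""
    return any(
        term.lower() in value.lower()
        for value, access in fields.values()
        if viewer_can_see(access)
    )


def anonymous_can_see(access):
    # Hypothetical rule: a logged-out visitor sees only public fields.
    return access == "public"


# A user profile where each field carries its own access level.
profile = {
    "interests": ("american football", "public"),
    "location": ("Springfield", "private"),
}

# The combined field matches on the private location, leaking its existence.
print(matches_combined(profile, "springfield"))
# The per-field check respects access and does not match.
print(matches_per_field(profile, "springfield", anonymous_can_see))
```

With one access level per entity the combined field is safe; with one per metadata item, a match on the combined field can reveal a value the searcher is not allowed to see.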

  • I am playing around with writing a toy Elgg clone in Haskell.

    This is really an excuse to learn Haskell but gives me the chance to try some blue sky ideas.

    For that I have a separate searchableData table with entityGuid and data fields and an API to determine what goes there.  This seems to work well if metadata does not have its own access controls.
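As a rough illustration of the separate-table idea in the reply above, here is a sketch in Python with sqlite3 (not the Haskell code being described). The table and column names follow the post; the functions open_store, set_searchable_data, and search are invented for this example, and it assumes one access level per entity so that no per-field filtering is needed.

```python
import sqlite3


def open_store():
    """Create an in-memory store with one searchable row per entity."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE searchableData ("
        "entityGuid INTEGER PRIMARY KEY, data TEXT NOT NULL)"
    )
    return conn


def set_searchable_data(conn, entity_guid, values):
    """The 'API' part: the caller decides which values become searchable."""
    conn.execute(
        "INSERT OR REPLACE INTO searchableData (entityGuid, data) "
        "VALUES (?, ?)",
        (entity_guid, " ".join(values)),
    )


def search(conn, term):
    """Return guids whose searchable data contains the term (ASCII case-insensitive)."""
    rows = conn.execute(
        "SELECT entityGuid FROM searchableData "
        "WHERE instr(lower(data), lower(?)) > 0 ORDER BY entityGuid",
        (term,),
    )
    return [row[0] for row in rows]


conn = open_store()
set_searchable_data(conn, 1, ["american football", "rugby"])
set_searchable_data(conn, 2, ["chess"])
print(search(conn, "football"))
```

Calling set_searchable_data again for the same guid replaces its row, so the searchable text can be kept in step with the entity whenever it is saved.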

  • Only parts of our initial search rewrite concept made it into core. The whole "separate table of searchable data with an API to manage it" approach was exactly what we'd started to program, and you can still see the vestiges of that in the initial code we pushed to GitHub last year. Curverider rejected the concept, though, citing overhead and complexity concerns. At least the search API and UI improvements we worked on were mostly accepted. :)