Results 1 to 8 of 8
  1. #1
    ABW Ambassador MoneyBusiness's Avatar
    Join Date
    March 14th, 2006
    Posts
    2,051
    Red face Do you know how to write a better DB search feature?
    Here's the problem: I set up a safe and straightforward search feature for one of my datafeed sites (php/mysql). It works relatively well when searching on various columns - as long as the search query isn't horribly mispelled. If someone's search query is mispelled, then it's likely that the query I use (LIKE operator) won't work.

    For instance: if my tables store product data with a name like "Blue Widgets", and someone enters a search string of "Blue Wgets", then my query fails. (SELECT * from product_table WHERE ProductName like '%blue wgets%').

    I'd like to show them some alternative results based on their query, if possible.

    I'm asking, b/c it seems that I'm getting a few of these problems per day (not a ton, but more than I care to have) - and I hate the fact that most of them leave immediately afterwards, if they cannot find anything.

    Have any of you incorporated some kind of database search functionality that resolves this kind of problem? Or do you know of a good source that delves into this? (i've searched G high and low with no luck so far).

  2. #2
    ABW Ambassador PatrickAllmond's Avatar
    Join Date
    September 20th, 2005
    Location
    OKC
    Posts
    1,219
    A couple of options.

    1. Why not start keep track of these and adding a table that cross references them? Sure the first time somebody does it it will fail. But look at what they mistyped and associate it with something that is in your database. Like building a list of synonyms. In the future when somebody searchs you will look for what they types and all synonyms of that word.

    2. In addition or instead of #1 why not expand on your WHERE class to search for the requested phrase and several variations.

    If I am search for "Fuddruckers" build the where clause that also searches for (at the same time):

    uddruckers
    fuddrucker
    fudruckers (remove the extra letter)
    fddrckrs (remove all of the vowels)

    This can all be done with some good string manipulation in PHP.

    Something else you might look into. See if dictionary.com has a spelling API you can tap into. I know if you go there and mispell something they are pretty good about suggesting an alternative. Maybe you can pay to tap into that technology. A web service perhaps.

    P
    ---
    This response was masterly crafted via the fingers of Patrick Allmond who believe you should StopDoingNothing starting today.
    ---
    Focus Consulting is where I roll | Follow @patrickallmond on Twitter
    Search Engine Marketing | Search Engine Optimization | Social Media | Online Video

  3. #3
    ABW Ambassador Snib's Avatar
    Join Date
    January 18th, 2005
    Location
    Virginia
    Posts
    5,303
    You can use a fulltext search which will cause the misspelled word to be ignored and the search will only be based on the correctly spelled words. Obviously this doesn't solve the problem, but you can avoid producing zero results.

    A much more complicated solution is to compile a user defined function into mySQL called the Levenshtein distance. This function will tell you how many letters it takes to change one word to another. This can help you with spelling problems and should be a better solution than the mySQL soundex() function. I suggest researching both of these functions.

    The problem with creating a user defined function is you need to compile it into mySQL, so without a dedicated server you're somewhat out of luck. I've heard of creating a stored function for this, but I'm not sure how well it will work. Stored functions can be added without needing to re-compile mySQL.

    Now using this in a practical situation is quite cut and dry either. If you can get the levenshtein function added to mySQL you're limited to a word to word comparison. So you can't compare a search query to product names directly. You'll need to split the search query into individual words and compare those to a keyword table (possibly based on past successful searches) to find the keyword they most likely meant.

    - Scott
    Hatred stirs up strife, But love covers all transgressions.

  4. #4
    ABW Ambassador PatrickAllmond's Avatar
    Join Date
    September 20th, 2005
    Location
    OKC
    Posts
    1,219
    You can do stored functions in mysql without any recompiling. You used to have to use UDFs, but now you can use functions and stored procedures that are strictly defined via SQL. About time they caught uo with SQL Server in that aspect

    It looks likes Yahoo! has a spelling suggestion API:

    http://developer.yahoo.com/search/we...uggestion.html

    You can pass it a word and it will pass back a suggestion.
    ---
    This response was masterly crafted via the fingers of Patrick Allmond who believe you should StopDoingNothing starting today.
    ---
    Focus Consulting is where I roll | Follow @patrickallmond on Twitter
    Search Engine Marketing | Search Engine Optimization | Social Media | Online Video

  5. #5
    ABW Ambassador MoneyBusiness's Avatar
    Join Date
    March 14th, 2006
    Posts
    2,051
    Thanks for the help guys. As for tracking mispelled words on a case by case basis, it'll be kind of hard - the product db is 25k+ records.

    If I am search for "Fuddruckers" build the where clause that also searches for (at the same time):

    uddruckers
    fuddrucker
    fudruckers (remove the extra letter)
    fddrckrs (remove all of the vowels)
    That's about as close as I got last night, with this exception:

    1) Try a search with the full query (using LIKE)
    2) If that returns 0 results, break the query into pieces (or split by spaces)
    3) Remove any stop words
    4) Then retry the query a couple of times with the new values.

    For example: "blu and widget" would return 0 results on a normal search. Split it, and you're left with searching for "blu" and "widget" each of which should return some results. (I know..this could return crazy results, especially if the query string is small one).

    A much more complicated solution is to compile a user defined function into mySQL called the Levenshtein distance.

    Snib, read a bunch of info on these functions (including similar_text()). Last night was the first time I've heard of these - and to tell you the truth, it sounds perfect for the job. But I'm curious as to how it will perform when it has to loop through a full keyword table to make comparisons (that's if I'm thinking about this the right way).

    I'm also looking through some info right now about implementing PHPDig (for creating the keyword table) - supposedly works really well, and it's free..

    Ok, thanks again guys. Let me go mull this info over, and decide what to do next.

  6. #6
    ABW Ambassador PatrickAllmond's Avatar
    Join Date
    September 20th, 2005
    Location
    OKC
    Posts
    1,219
    I am not sure I entirely understand the hard part. Couldn't you just add a record to a badsearches table every time someone search for something and it could not be found? Then look at at that table every week or so.
    ---
    This response was masterly crafted via the fingers of Patrick Allmond who believe you should StopDoingNothing starting today.
    ---
    Focus Consulting is where I roll | Follow @patrickallmond on Twitter
    Search Engine Marketing | Search Engine Optimization | Social Media | Online Video

  7. #7
    Newbie DataFeedFile.com's Avatar
    Join Date
    May 26th, 2006
    Posts
    329
    This is an advance SQL topic. I can not tell you exactly how to do it because I use Postgresql not MySql.

    Snib, is on the right track. Solving it with technology or plug-in already available from MySQL is the best way. Full Text search, Fuzzy search, those things tweaked properly with some creativity can make a big difference.

  8. #8
    ABW Ambassador MoneyBusiness's Avatar
    Join Date
    March 14th, 2006
    Posts
    2,051
    Quote Originally Posted by patrick24601
    I am not sure I entirely understand the hard part. Couldn't you just add a record to a badsearches table every time someone search for something and it could not be found? Then look at at that table every week or so.

    I understand what you're saying - keep track of all mispellings and enter them by hand every week. So if I have 50 mispellings, search each one through the db to see what the user actually had in mind. Then enter the appropriate mispelling related back to the main product table.

    For instance: someone searches for "bloo wiget" (store it later, with the assumption that someone will spell it the same way - or at least close).

    How long do you suppose this would take to be effective, if we're talking about 25k+ products, and multiple mispellings a week? (keeping in mind the time spent entering the data).

    Wasn't trying to be dense...merely looking for a way to do this in as much of a "hands-off" approach as possible. Especially since this isn't my only site.

  9. Newsletter Signup

+ Reply to Thread

Similar Threads

  1. Search Feature
    By windkd in forum PopShops
    Replies: 0
    Last Post: February 6th, 2009, 09:19 AM
  2. New search box feature
    By porkchop in forum PopShops
    Replies: 10
    Last Post: January 11th, 2008, 01:35 PM
  3. Google's Search History feature
    By Merchant Consultant Team in forum Search Engine Optimization
    Replies: 0
    Last Post: April 20th, 2007, 10:09 AM
  4. write a search engine script
    By Heyder in forum Programming / Datafeeds / Tools
    Replies: 1
    Last Post: March 15th, 2002, 09:59 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •