combining Levenshtein distance and double metaphone

Submitted by merlin on Wed, 2011-03-16 10:58. Troubleshooting

Hi all,
I'm using the latest dbsight release, 4.17, and this is my 7th day using dbsight, so here is my noob question.

I'm trying to match a person's name, such as 'picasso'. When I search for 'picaso', 'Bachs' along with a bunch of other names shows up with the same score as 'picasso'. What I want is for 'picasso' to show up at the top of the list. So I'm thinking that if I can put Levenshtein distance in there and assign it a higher score, I might be able to solve this problem (configure search --> searchable columns).
But I don't know where I can incorporate this. I tried picaso~0.5 as the search term and it returned no results. I tried the Snowball analyzers, but none of them worked. Maybe I'm not looking in the right places?

Any thoughts?

Thank you.

Submitted by will on Wed, 2011-03-16 11:06.

Do you have the name as a separate field?

If so, you can adjust the weighting for the name field on the "searchable" page.

btw: you can always fall back to Lucene's query parser by appending this to the URL:

 &lq=fieldName:"picaso"^4
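
The quotes and the caret in that parameter are not URL-safe, so it may help to percent-encode the value before appending it. Here is a minimal Python sketch of that step. The host and path are placeholders I made up for illustration, not dbsight's real search endpoint, and append_lucene_query is a hypothetical helper name.

```python
from urllib.parse import quote


def append_lucene_query(base_url: str, field: str, term: str, boost: float) -> str:
    """Append an lq parameter holding a boosted Lucene phrase query."""
    # Build the raw Lucene syntax, e.g. fieldName:"picaso"^4
    lucene_query = f'{field}:"{term}"^{boost:g}'
    # Use '&' when the URL already has a query string, '?' otherwise.
    separator = "&" if "?" in base_url else "?"
    # safe='' percent-encodes every reserved character, including ':' and '"'.
    return f"{base_url}{separator}lq={quote(lucene_query, safe='')}"


# Placeholder URL for illustration only (an assumption, not a real endpoint).
url = append_lucene_query("http://localhost/search?q=picaso", "fieldName", "picaso", 4)
print(url)
```

Most browsers encode these characters for you when you type the URL by hand, so this mainly matters when the URL is built in code.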
Submitted by merlin on Wed, 2011-03-16 19:55.

I'm using lq=lastNameS:picaso~0.5
Now I have a combination of exact match, word stemming and phonetics for the last name search. Thank you, Will!
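
For anyone wondering what the fuzzy (~) operator is measuring, here is a rough Python sketch of Levenshtein distance, the number of single-character insertions, deletions and substitutions needed to turn one word into another. The similarity function is only my own illustrative normalization (an assumption); I have not checked it against the exact formula Lucene uses for the 0.5 threshold.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Illustrative 0..1 similarity; an assumption, not Lucene's own formula."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(levenshtein("picaso", "picasso"))  # one missing 's'
print(levenshtein("picaso", "bachs"))
```

The point is that 'picasso' is a single edit away from 'picaso', while 'bachs' is several edits away even though the two sound alike, so an edit-distance match can separate them where a phonetic match cannot.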

Submitted by merlin on Wed, 2011-03-16 12:01.

My setup looks like this.
lastNameT, lastNameS and lastNameP all hold the same value; they differ only in analyzer and weight.

fieldName   type    FieldType  Analyzer                 weight

lastNameT   String  text       numberOrLowerCase        3.0
lastNameS   String  text       snowball-English Lovins  2.0
lastNameP   String  text       Double Metaphone         1.0

lastNameT is for text search, so when I type in 'picasso', the correct spelling, the correct one shows up at the top. If I only use the phonetic field, names with similar sounds will show up with the same score.

lastNameS is what I intend to use for stemming or Levenshtein distance, so that words that are spelled similarly weigh more than the phonetic matches.

lastNameP is the phonetic one.

Currently, lastNameT and lastNameP work great, and I'm trying to get lastNameS to work for the "similar spelling" match.
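
To show why the three weights give the ordering I'm after, here is a very simplified Python sketch. It just adds up the weight of every field a record matched on. Real Lucene scoring involves more factors than this, so treat the numbers as illustrative only. The match pattern in hits is my assumption for the query 'picaso': 'picasso' matches on similar spelling and on sound, while 'Bachs' matches on sound only.

```python
# Field weights copied from the setup table above.
FIELD_WEIGHTS = {"lastNameT": 3.0, "lastNameS": 2.0, "lastNameP": 1.0}


def combined_score(matched_fields):
    """Sum the weights of every field on which a record matched."""
    return sum(FIELD_WEIGHTS[field] for field in matched_fields)


# Assumed match pattern for the query 'picaso' (illustrative, not real output).
hits = {
    "Bachs": ["lastNameP"],                 # sounds alike only
    "picasso": ["lastNameS", "lastNameP"],  # similar spelling and sound
}

# Highest combined score first.
ranked = sorted(hits, key=lambda name: combined_score(hits[name]), reverse=True)
print(ranked)
```

With only the phonetic field, both names would get the same score. Once the similar-spelling field contributes its weight, 'picasso' moves ahead of 'Bachs'.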

Submitted by will on Thu, 2011-03-17 11:00.

These are very nice details and will be very useful to other users. If you put them on your blog, we will link to your page from our wiki.

Thanks!

Submitted by merlin on Fri, 2011-03-18 12:00.

http://merlin1109.blogspot.com/2011/03/dbsight-combining-levenshtin-distance.html

Thank you, Will. I'm glad and happy to share with the dbsight community.