
Sound Index - Soundex

Mary Rice
Frequent Advisor



Hello everyone.

As part of a project to insert voter data into a program written in Informix ESQL/C, I have been asked to add a 'Sound Index' to the scrolling index. The idea is that names like 'Tomas' and 'Thomas' will appear near each other. I've never heard of a 'Sound Index'. Is there such a thing?

TIA, Mary

Christopher Caldwell
Honored Contributor

Re: Sound Index - Soundex

Yup. It's built into Oracle 8.1.x and later. Your mileage may vary for Informix. Look at the Informix docs for contextual indexing; the different types of indices should be defined there.
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: Sound Index - Soundex

Hi Mary,

I found the function and added a very small main to test the program.

Compile it like this:

cc -Aa soundex.c -o soundex

Then execute it like this:

soundex cain cane kane
CAIN --> C500
CANE --> C500
KANE --> K500

Notice that there is a problem inherent in soundex: 'KANE' sounds the same as the other two but generates a different soundex code, because the first letter is kept as is. This is one of the weaknesses of soundex. It is a very old algorithm, and more modern phonetic matching schemes (Metaphone, for example) are available.

I suggest that you do not create an index on this column alone, because you will get very high duplicate counts, which can cause unusually long delays, especially in deletion. Instead, create a composite index on something like
SOUNDEX, LAST NAME, FIRST NAME, ...

The name fields can be shortened versions of the actual columns to save space.

This should get you started, Clay
If it ain't broke, I can fix that.
James R. Ferguson
Acclaimed Contributor

Re: Sound Index - Soundex

Hi Mary:

Absolutely! We've used one for years, although it's written in Algol for a Unisys server.

The general idea is to reduce alphabetic strings into simple similar keys that can then be used in searching and matching.

For instance, my name, "Ferguson", has several common spelling variations which sound alike. On the telephone you might spell it "Ferguson", "Furguson", "Furgason", "Furgasen", etc.

The algorithm we use retains the first letter; drops all vowels (a,e,i,o,u); and maps the remaining letters to a single digit.

All of the above variations of "Ferguson" would return a "soundex" key of F625.
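For anyone following along, the steps above can be sketched in C roughly as follows. This is only a minimal sketch, not the routine described in this post or the soundex.c mentioned earlier in the thread: the function name soundex_key, the digit table (the usual Soundex letter groups), the skipping of H, W and Y along with the vowels, and the collapsing of repeated digits are all assumptions, chosen so that the spellings above come out as F625.

```c
#include <ctype.h>
#include <stddef.h>

/* Digit for each letter A..Z. '0' marks letters that get no code:
   the vowels, plus H, W and Y (skipping those three is an assumption). */
static const char SOUNDEX_DIGIT[] = "01230120022455012623010202";

/* Hypothetical helper: writes a 4-character key plus NUL into out.
   Assumes an ASCII-style character set where A..Z are contiguous. */
void soundex_key(const char *name, char out[5])
{
    size_t n = 0;
    char prev = '0';

    for (; *name != '\0'; name++) {
        int c = toupper((unsigned char)*name);
        char digit;

        if (c < 'A' || c > 'Z')
            continue;                 /* ignore spaces, hyphens, digits */
        digit = SOUNDEX_DIGIT[c - 'A'];
        if (n == 0) {
            out[n++] = (char)c;       /* first letter is kept as is */
            prev = digit;
        } else if (digit != '0' && digit != prev) {
            if (n < 4)
                out[n++] = digit;     /* keep at most three digits */
            prev = digit;             /* uncoded letters do not reset prev */
        }
    }
    while (n < 4)
        out[n++] = '0';               /* pad short keys with zeros */
    out[4] = '\0';
}
```

Called with a 5-byte buffer, soundex_key("Ferguson", key) fills key with F625, and 'Tomas' and 'Thomas' both give T520. Variants that code a letter again after an intervening vowel would produce different keys for some of these names, so check which rule your own routine follows before mixing keys from two sources.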

Does this help?

Regards!

...JRF...
Mary Rice
Frequent Advisor

Re: Sound Index - Soundex

Hello everyone and thank you!

Christopher, good idea but I don't think my version of Informix has this feature.

James, yes, that is what I am looking for. I compiled and ran Clay's program and got the same code you listed. You two guys must be on the same page.

I do have a problem and that is where to put this on the screen. This is an old curses-based application and screen space is very tight especially on the scrolling index. I'm glad it's only four characters.
A. Clay Stephenson
Acclaimed Contributor

Re: Sound Index - Soundex

Hi Mary,

Silly me but I don't think that is the right question. I think the right question is: 'Do I need the soundex code displayed on the screen?'

Really, there is no reason to display it; you simply need to prompt for the starting last name, and possibly the first name, and then compute the soundex code.

I would then do a select where soundex >= starting_soundex order by soundex, last_name, first_name.

The retrievals will be much faster if you order by the same index that has the soundex value as the first part of the index.
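To make the ordering concrete, here is a small in-memory sketch in C of what that select does: sort by soundex, last name, first name, then start reading at the first row whose soundex is at or after the starting code. It is an illustration only, not ESQL/C and not how the database engine works internally; the struct layout, field sizes and function names are all hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical row layout; soundex holds a precomputed 4-character key. */
struct voter {
    char soundex[5];
    char last_name[21];
    char first_name[21];
};

/* Compare in the same column order as the composite index. */
int voter_cmp(const void *a, const void *b)
{
    const struct voter *x = a;
    const struct voter *y = b;
    int r = strcmp(x->soundex, y->soundex);

    if (r == 0)
        r = strcmp(x->last_name, y->last_name);
    if (r == 0)
        r = strcmp(x->first_name, y->first_name);
    return r;
}

/* Sorts rows in index order and returns the position of the first row
   with soundex >= start, or count if there is no such row. */
size_t first_at_or_after(struct voter *rows, size_t count, const char *start)
{
    size_t i;

    qsort(rows, count, sizeof rows[0], voter_cmp);
    for (i = 0; i < count; i++)
        if (strcmp(rows[i].soundex, start) >= 0)
            break;
    return i;
}
```

The scrolling list would then be filled from the returned position onward, so names sharing the starting code appear together, sorted by last and first name.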

Food for thought, Clay
If it ain't broke, I can fix that.