SQL SERVER – Simple Explanation and Puzzle with SOUNDEX Function and DIFFERENCE Function
- by pinaldave
Earlier this week I asked a question where I asked how to Swap Values of the column without using CASE Statement. Read here: A Puzzle – Swap Value of Column Without Case Statement,there were more than 50 solutions proposed in the comment. There were many creative solutions. I have mentioned my personal favorite (different ones) here: Solution of Puzzle – Swap Value of Column Without Case Statement.
However, I received lots of questions regarding one of the Solution by SIJIN KUMAR V P. He has used the function SOUNDEX in his solution. The request was to explain how SOUNDEX and DIFFERENCE works. Well, there are pretty decent documentations provided over here SOUNDEX function and DIFFERENCE over on MSDN and if I attempt to explain this function I will end up writing the same details which are available on MSDN. Instead of writing theory, we will try to learn this function by using a couple of simple puzzles. You try to solve the puzzles using the MSDN and see if you can learn something very quickly.
In simple words - SOUNDEX converts an alphanumeric string to a four-character code to find similar-sounding words or names. The first character of the code is the first character of character_expression and the second through fourth characters of the code are numbers that represent the letters in the expression. Vowels incharacter_expression are ignored unless they are the first letter of the string. DIFFERENCE function returns an integer value. The  integer returned is the number of characters in the SOUNDEX values that are the same. The return value ranges from 0 through 4: 0 indicates weak or no similarity, and 4 indicates strong similarity or the same values.
Learning Puzzle 1:
Now let us run following four queries and observe its output.
SELECT SOUNDEX('SQLAuthority') SdxValue
SELECT SOUNDEX('SLTR') SdxValue
SELECT SOUNDEX('SaLaTaRa') SdxValue
SELECT SOUNDEX('SaLaTaRaM') SdxValue
When you look at the result set all the four values are same.
The reason for all the values to be same is as for SQL Server SOUNDEX function all the four strings are similarly sounding string.
Learning Puzzle 2:
Now let us run following five queries and observe its output.
SELECT DIFFERENCE (SOUNDEX('SLTR'),SOUNDEX('SQLAuthority'))
SELECT DIFFERENCE (SOUNDEX('TH'),SOUNDEX('SQLAuthority'))
SELECT DIFFERENCE ('SQLAuthority',SOUNDEX('SQLAuthority'))
SELECT DIFFERENCE ('SLTR',SOUNDEX('SQLAuthority'))
SELECT DIFFERENCE ('SLTR','SQLAuthority')
When you look at the result set you will get the result in the ranges from 1 to 4.
Here is how it works if your result is 0 which means absolutely not relevant to each other and if your result is 1 which means the results are relevant to each other.
Have you ever used above two functions in your business need or on production server? If yes, would you please leave a comment with use cases. I believe it will be beneficial to everyone.
Reference: Pinal Dave (http://blog.SQLAuthority.com)
Filed under: PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology