How much information can you mine out of a name?

Posted by Finglas Fjorn on Programmers
Published on 2011-01-02T17:12:10Z Indexed on 2011/01/03 1:58 UTC

While this is not directly related to programming, I figured that the programmers here would be just as curious as I am about this question. Feel free to close it if it does not meet the guidelines.

By a name, I mean a first name, possibly a middle name, and a surname.

I'm curious about how much information you can mine out of a name using publicly available datasets. I know that US census data can give you the following, with a probability that ranges from low to high depending on the input: 1) Gender. 2) Race.
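As a rough sketch of what I mean, assuming a table of per-name counts by sex is available, the probability could be estimated as a simple ratio. The counts below are made-up placeholders, not real census figures, and `gender_probability` is a hypothetical helper name:

```python
from typing import Optional

# Toy counts, invented for illustration only. A real table would be built
# from a public dataset of first-name frequencies broken down by sex.
NAME_COUNTS = {
    "james": {"M": 990, "F": 10},
    "mary": {"M": 5, "F": 995},
    "jordan": {"M": 600, "F": 400},
}


def gender_probability(first_name: str) -> Optional[dict]:
    """Return P(sex | name) from the toy counts, or None if the name is unseen."""
    counts = NAME_COUNTS.get(first_name.strip().lower())
    if counts is None:
        return None
    total = sum(counts.values())
    return {sex: n / total for sex, n in counts.items()}


print(gender_probability("Jordan"))
print(gender_probability("Xyzzy"))  # unseen name: no estimate is possible
```

A name like "Jordan" lands near the middle, while an unseen name yields no estimate at all, which is the "low to high, depending on the input" part.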

Facebook, for instance, used exactly that to estimate, with a decent level of accuracy, the racial distribution of the users of their site (https://www.facebook.com/note.php?note_id=205925658858).
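The aggregate version of this idea can be sketched as follows: instead of labeling each person, average the per-surname probabilities to get an expected distribution for the whole group. This only illustrates the general approach and is not a reconstruction of Facebook's actual method. The category names, the probabilities, and the `expected_distribution` helper are all invented placeholders:

```python
from collections import defaultdict

# Invented placeholder probabilities P(category | surname). A real table
# would be derived from a public surname dataset.
SURNAME_PROBS = {
    "smith": {"group_a": 0.7, "group_b": 0.3},
    "lee": {"group_a": 0.4, "group_b": 0.6},
}


def expected_distribution(surnames):
    """Average the per-surname probabilities over surnames found in the table."""
    totals = defaultdict(float)
    matched = 0
    for surname in surnames:
        probs = SURNAME_PROBS.get(surname.strip().lower())
        if probs is None:
            continue  # unseen surnames are skipped, which can bias the result
        matched += 1
        for category, p in probs.items():
            totals[category] += p
    if matched == 0:
        return {}
    return {category: total / matched for category, total in totals.items()}


print(expected_distribution(["Smith", "Smith", "Lee", "Unknown"]))
```

The per-person guess may be poor, but the errors tend to average out over a large group, provided the lookup table is representative of that group.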

What else can be mined? I'm not looking for anything specific; this is a very open-ended question to satisfy my curiosity.

My examples are US-specific, so assume that the name belongs to someone located in the US. That said, if anyone knows of publicly available datasets for other countries, I'm more than open to them too.

I hope this is an interesting question!
