Essential skills of a Data Scientist

Posted by harshsinghal on Stack Overflow
Published on 2010-05-18T19:13:35Z Indexed on 2010/05/18 19:30 UTC

I would like to know more about the skills a Data Scientist needs and, with new technologies arriving every day, how one picks out the essentials.

A few ideas germane to this discussion:

  • Knowing SQL and a relational database such as MySQL or PostgreSQL was great until the advent of NoSQL and non-relational databases. MongoDB, CouchDB, etc. are becoming popular for working with web-scale data.
  • Knowing a stats tool like R is enough for analysis, but building applications may require adding Java, Python, and similar languages to the list.
  • Data now comes as text, URLs, and multimedia, to name a few, and each form has its own paradigms for manipulation.
  • What about cluster computing, parallel computing, the cloud, Amazon EC2, and Hadoop?
  • OLS regression now has artificial neural networks, random forests, and other relatively exotic machine learning/data mining algorithms for company.

Thoughts?

