Kick off a MapReduce job from my Java/MySQL webapp

Posted by Brian on Stack Overflow
Published on 2011-01-08T22:11:41Z Indexed on 2011/01/09 5:54 UTC

Hi guys,

I need a bit of architecture advice. I have a Java-based webapp with a JPA-based ORM layer on top of a MySQL relational database. As part of the application, I have a batch job that compares thousands of database records with each other. This job has become too time-consuming and needs to be parallelized. I'm looking at using MapReduce and Hadoop to do this, but I'm not sure how to integrate it into my current architecture. I think the easiest initial solution is to find a way to push data from MySQL into Hadoop jobs. I have done some initial research on this and found the following relevant information and possibilities:

1) https://issues.apache.org/jira/browse/HADOOP-2536 gives an interesting overview of some built-in JDBC support.

2) This article, http://architects.dzone.com/articles/tools-moving-sql-database, describes some third-party tools for moving data from MySQL to Hadoop.
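For anyone trying to picture how a compare-every-record-with-every-other job breaks into a map step and a reduce step, here is a minimal sketch in plain Java with no Hadoop dependency. Everything in it is an assumption made for illustration: the `Rec` class stands in for a JPA entity, `similar` is a placeholder for the real comparison rule, and `mapRow` and `countMatches` are hypothetical names. A parallel stream stands in for the cluster, so this only shows the shape of the work, not how Hadoop would actually be wired to MySQL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PairwiseCompareSketch {

    // Hypothetical stand-in for a JPA entity loaded from MySQL.
    static final class Rec {
        final long id;
        final String value;

        Rec(long id, String value) {
            this.id = id;
            this.value = value;
        }
    }

    // Placeholder comparison rule (an assumption): values equal, ignoring case.
    static boolean similar(Rec a, Rec b) {
        return a.value.equalsIgnoreCase(b.value);
    }

    // "Map" step: compare record i with every later record and emit the ids
    // of both records for each matching pair. Each call is independent of the
    // others, which is what makes the job parallelizable.
    static List<Long> mapRow(List<Rec> records, int i) {
        List<Long> emitted = new ArrayList<>();
        Rec a = records.get(i);
        for (int j = i + 1; j < records.size(); j++) {
            Rec b = records.get(j);
            if (similar(a, b)) {
                emitted.add(a.id);
                emitted.add(b.id);
            }
        }
        return emitted;
    }

    // "Reduce" step: group the emitted ids and count matches per record id.
    static Map<Long, Long> countMatches(List<Rec> records) {
        return IntStream.range(0, records.size())
                .parallel()
                .boxed()
                .flatMap(i -> mapRow(records, i).stream())
                .collect(Collectors.groupingBy(id -> id, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Rec> records = Arrays.asList(
                new Rec(1, "foo"), new Rec(2, "FOO"), new Rec(3, "bar"), new Rec(4, "foo"));
        System.out.println(countMatches(records)); // prints {1=2, 2=2, 4=2}
    }
}
```

In a Hadoop job, the body of `mapRow` would roughly correspond to a mapper and the grouping in `countMatches` to a reducer. Note that each map call here still needs to see the whole record list, so how the data is partitioned and shipped to the workers remains the open question.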

To be honest, I'm just starting to learn about HBase and Hadoop, and I really don't know how to integrate them into my webapp.

Any advice is greatly appreciated. Cheers, Brian

© Stack Overflow or respective owner
