Combining two sets of input in Hadoop

Posted by aeolist on Stack Overflow
Published on 2010-04-27T23:01:43Z Indexed on 2010/04/27 23:03 UTC


I have a rather simple Hadoop question, which I'll try to present with an example.

Say you have a list of strings and a large file, and you want each mapper to process one piece of the file together with one of the strings, as in a grep-like program.

How are you supposed to do that? I am under the impression that the number of mappers is determined by the InputSplits produced. I could run a separate job for each string, one after another, but that seems kind of... messy?
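To make the scenario concrete, here is a minimal sketch in plain Python (no Hadoop API involved) of one arrangement that avoids running a job per string. It is a different setup from the one described above: instead of pairing each mapper with a single string, every mapper receives the whole list of strings and emits each match keyed by the string that matched, so the number of mappers still depends only on the splits. The names `map_split` and `run_job` and the sample data are hypothetical. In a real job the list would have to reach the mappers some other way, for example through the job configuration or the distributed cache; that part is not shown.

```python
from collections import defaultdict


def map_split(split_lines, patterns):
    """Simulated mapper: scan one input split against every search string."""
    for line in split_lines:
        for pattern in patterns:
            if pattern in line:
                # Key the output by the string that matched, so results
                # for each string can be grouped after the map phase.
                yield pattern, line


def run_job(splits, patterns):
    """Simulated job: one mapper call per split, output grouped by string."""
    grouped = defaultdict(list)
    for split_lines in splits:
        for pattern, line in map_split(split_lines, patterns):
            grouped[pattern].append(line)
    return dict(grouped)


# Hypothetical data: two splits of a "large file" and two search strings.
splits = [["hadoop map", "plain text"], ["reduce step", "hadoop reduce"]]
patterns = ["hadoop", "reduce"]
matches = run_job(splits, patterns)
```

The trade-off in this sketch is that each mapper does more work per line (one check per string), in exchange for reading the large file only once.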

