How to map a set of text as a whole to a node?

Posted by JIpeng Tan on Stack Overflow
Published on 2011-01-13T19:47:54Z Indexed on 2011/01/13 19:53 UTC

Suppose I have a plain text file with the following data:

DataSetOne
content
content
content


DataSetTwo
content
content
content
content

...and so on...

What I want to do is count how many content lines there are in each data set. For example, the result should be:

<DataSetOne, 3>, <DataSetTwo, 4>
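
To make the expected result concrete, here is a minimal single-machine sketch in plain Python (not Hadoop) of the counting I have in mind. It assumes that data sets are separated by blank lines and that the first non-blank line of each block is the data set name; the function name `count_contents` is only illustrative.

```python
def count_contents(text):
    """Count content lines per data set.

    Assumption: data sets are separated by one or more blank lines,
    and the first non-blank line of each block is the data set name.
    """
    counts = {}
    current = None  # name of the data set currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current data set.
            current = None
            continue
        if current is None:
            # First non-blank line of a block: the data set name.
            current = line
            counts[current] = 0
        else:
            # Any other line in the block counts as content.
            counts[current] += 1
    return counts


sample = """DataSetOne
content
content
content


DataSetTwo
content
content
content
content
"""

print(count_contents(sample))  # {'DataSetOne': 3, 'DataSetTwo': 4}
```

This only works when one process sees a whole data set at once, which is exactly the property I am asking how to get in Hadoop.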

I am a beginner with Hadoop. I wonder if there is a way to map a chunk of data as a whole to a node, for example, to send all of DataSetOne to node 1 and all of DataSetTwo to node 2.

Can anyone give me an idea of how to achieve this?

© Stack Overflow or respective owner