Hadoop Map Reduce job never finishes

Posted by rohanbk on Stack Overflow
Published on 2010-06-11T03:57:18Z Indexed on 2010/06/11 4:02 UTC

I am running a Hadoop MapReduce job with Hadoop Streaming, using Python scripts for the mapper and the reducer. Both the map and reduce phases reach 100%, but the job never finishes. I know that Hadoop terminates a job when something fails, but in this case both phases reach 100% and the job simply hangs. Has anyone else encountered anything similar?

Also, how do I debug my program to figure out where things are going wrong? With a smaller input file, running the pipeline locally works perfectly fine:

    $> cat input_file | mapper.py | sort | reduce.py >> output_file

However, when I run the same scripts through Hadoop, the job does not complete.
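One possible starting point for the debugging question, offered as a sketch rather than a diagnosis: reproduce the local pipeline in a single Python process and have it write progress lines to stderr. Hadoop Streaming treats stderr lines of the form `reporter:status:<message>` as task status updates, so the same helper could be called from the real scripts to show how far a task got. The word-count logic and the names `report_status`, `map_lines`, `reduce_lines` and `run_local` are hypothetical stand-ins, since the original mapper and reducer are not shown.

```python
import sys
from itertools import groupby


def report_status(message, stream=sys.stderr):
    """Write a status line in the form Hadoop Streaming reads from stderr."""
    stream.write("reporter:status:%s\n" % message)
    stream.flush()


def map_lines(lines):
    """Hypothetical word-count mapper: emit 'word<TAB>1' for every token."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(sorted_lines):
    """Hypothetical reducer: sum the counts for each run of identical keys."""
    pairs = (line.split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(value) for _, value in group)
        yield "%s\t%d" % (key, total)


def run_local(lines):
    """Mimic `cat input_file | mapper | sort | reducer` in one process."""
    mapped = sorted(map_lines(lines))
    report_status("map phase produced %d records" % len(mapped))
    return list(reduce_lines(mapped))
```

Calling `run_local(open("input_file"))` returns the reduced records as a list of tab-separated strings, which can be compared against the output of the shell pipeline. If the real reducer buffers its input or waits on something other than stdin, placing `report_status` calls around those points is one way to see where it stalls.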

© Stack Overflow or respective owner
