Hadoop and Object Reuse, Why?

Posted by Andrew White on Programmers
Published on 2014-02-11T15:37:35Z Indexed on 2014/06/11 21:39 UTC

Filed under: java | Performance | hadoop

In Hadoop, the key and value objects passed to reducers are reused between iterations. This is extremely surprising and hard to track down if you're not expecting it. Furthermore, the original tracker issue for this "feature" doesn't offer any evidence that the change actually improved performance (unless I missed it). It says only:

"It would speed up the system substantially if we reused the keys and values [...] but I think it is worth doing."

This seems completely counter to this very popular answer, which holds that object creation in Java is cheap. Is there some credence to the Hadoop developer's claim? Is there something "special" about Hadoop that would invalidate the notion of object creation being cheap?
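For readers who have not hit this, here is a minimal sketch of the pitfall. It is a plain-Java simulation with no Hadoop dependency: `MutableValue`, `reusingValues`, `collectByReference` and `collectByCopy` are hypothetical names invented for illustration, and the iterator only imitates the reuse behaviour described above (one instance, overwritten on every call to `next()`). It is not Hadoop's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ReuseDemo {

    /** Hypothetical mutable holder, standing in for a Hadoop value type. */
    static final class MutableValue {
        private String value;

        void set(String v) { value = v; }

        String get() { return value; }

        /** Returns an independent snapshot of the current contents. */
        MutableValue copy() {
            MutableValue c = new MutableValue();
            c.set(value);
            return c;
        }
    }

    /** Imitates the reuse: every next() overwrites and returns the SAME instance. */
    static Iterable<MutableValue> reusingValues(List<String> raw) {
        MutableValue shared = new MutableValue();
        return () -> new Iterator<MutableValue>() {
            private final Iterator<String> it = raw.iterator();

            @Override
            public boolean hasNext() { return it.hasNext(); }

            @Override
            public MutableValue next() {
                shared.set(it.next()); // overwrite in place, no new allocation
                return shared;
            }
        };
    }

    /** Buggy pattern: keeps references, so every element aliases one object. */
    static List<String> collectByReference(Iterable<MutableValue> values) {
        List<MutableValue> kept = new ArrayList<>();
        for (MutableValue v : values) {
            kept.add(v);
        }
        List<String> out = new ArrayList<>();
        for (MutableValue v : kept) {
            out.add(v.get());
        }
        return out;
    }

    /** Safe pattern: snapshot each value before the iterator advances. */
    static List<String> collectByCopy(Iterable<MutableValue> values) {
        List<MutableValue> kept = new ArrayList<>();
        for (MutableValue v : values) {
            kept.add(v.copy());
        }
        List<String> out = new ArrayList<>();
        for (MutableValue v : kept) {
            out.add(v.get());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        System.out.println(collectByReference(reusingValues(raw))); // [c, c, c]
        System.out.println(collectByCopy(reusingValues(raw)));      // [a, b, c]
    }
}
```

In a real reducer the same defensive copy would be taken before buffering a value, for example with a copy constructor such as `new Text(value)` when the value type is `Text`. Whether the saved allocations outweigh that cost is exactly what the question asks, and the sketch takes no position on it.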

© Programmers or respective owner
