Search Results

Search found 3 results on 1 page for 'cervo'.


  • Handling Indirection and keeping layers of method calls, objects, and even xml files straight

    - by Cervo
    How do you keep everything straight as you trace deeply into a piece of software through multiple method calls, object constructors, object factories, and even Spring wiring? I find that 4 or 5 method calls are easy to keep in my head, but once I am 8 or 9 calls deep it gets hard to keep track of everything. Are there strategies for keeping everything straight? In particular, I might be looking for how to do task X, but as I trace down (or up) I lose track of that goal. Or I find that multiple layers need changes, but I lose track of which changes as I trace all the way down. Or I have tentative plans that turn out to be invalid, but during the tracing I forget that the plan is invalid and consider the same plan all over again, killing time. Is there software that might help? grep and even Eclipse can help me do the actual tracing from a call to its definition, but I'm more worried about keeping track of everything, including the de facto plan for what has to change (which might vary as you go down or up and realize the prior plan was poor). In the past I have dealt with a few big methods that you can trace and pretty much figure out within a few calls. But now there are dozens of really tiny methods, many just a single call to another method or constructor, and it is hard to keep track of them all.


  • How do you go from a so so programmer to a great one? [closed]

    - by Cervo
    How do you go from being an okay programmer to being able to write clean, maintainable code? For example, David Hansson created Rails in the process of writing Basecamp, as part of writing it in a clean, maintainable way. But how do you know when there is value in a side project like that? I have a bachelor's in computer science and am about to get a master's, and I will say that colleges teach you to write code to solve problems, not to write it neatly. Basically you think of a problem, come up with a solution, and write it down, not necessarily in the most maintainable way in the world. Also, my first job was in a startup, and now my third is in a small team in a large company where the attitude was/is get it done yesterday (most of my jobs have been mainly database development with SQL, with a few ASP.NET web pages and .NET apps on the side). So of course cut/paste is favored over doing things cleanly. And they would rather have something yesterday, even if you have to rewrite it next month, than have something in a week that lasts for a year. Also, spaghetti code turns up all over the place, and it takes very smart people to write, understand, and maintain spaghetti code. It would be better to do things so simply and cleanly that even a caveman/woman could do maintenance. I also get very bored and unmotivated having to modify the same things cut/pasted in several locations. Is this the type of skill that you need to learn by working with a serious software organization that has an emphasis on maintenance, and maybe even an architect who designs a system architecture and reviews code? Could you really learn it by volunteering on an open source project (it seems to me that a full-time programming job is way more practice than a few hours a week on an open source project)? Is there some course where you can learn this? I can attest that graduate school and undergraduate school do not really emphasize clean software at all.
They just teach the structures and algorithms and then send you off into the world to solve problems. Overall, I think the first step toward becoming a good programmer is learning to write clean, maintainable code within the bounds of the project. The next step, toward becoming a great programmer, is learning when you need a side project (like a framework) to make things more maintainable while you still deliver for the deadline. For example, you are making an SQL report and someone gives you 100 calculations for individual columns. At what point does it make sense to construct a domain-specific language to encode the rules simply and then generate all the SQL, as opposed to cut/pasting the query from the table a bunch of times and then adjusting each copy to do the appropriate calculations? This is the type of thing I would say a great programmer would know. He/she would maybe even know ways to avoid the domain-specific language and still do all the calculations without creating an unmaintainable mess or a ton of repetitive code to cut/paste everywhere.
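
To make the SQL report example concrete, here is a minimal sketch of the "encode the rules once, generate the SQL" idea in plain Java. It is not a full domain-specific language, only a table of alias-to-expression rules rendered into one query. The class name `ReportSqlBuilder`, the table `orders`, and the column expressions are all hypothetical and chosen purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: renders a table of column rules as one SELECT statement. */
final class ReportSqlBuilder {
    private final String table;
    // LinkedHashMap keeps the columns in the order the rules were registered.
    private final Map<String, String> rules = new LinkedHashMap<>();

    ReportSqlBuilder(String table) {
        this.table = table;
    }

    /** Registers one calculated column: alias -> SQL expression. */
    ReportSqlBuilder column(String alias, String expression) {
        rules.put(alias, expression);
        return this;
    }

    /** Builds the query text; each rule appears exactly once. */
    String toSql() {
        if (rules.isEmpty()) {
            throw new IllegalStateException("no columns registered");
        }
        StringJoiner columns = new StringJoiner(", ");
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            columns.add(rule.getValue() + " AS " + rule.getKey());
        }
        return "SELECT " + columns + " FROM " + table;
    }
}
```

Calling `new ReportSqlBuilder("orders").column("net_total", "price * quantity").toSql()` yields a single query, so each of the 100 calculations would live in one `column(...)` line rather than in a pasted copy of the whole query. Whether this pays off depends on how much the calculations share; for a handful of columns, plain SQL is probably still simpler.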


  • How can I load a file into a DataBag from within a Yahoo PigLatin UDF?

    - by Cervo
    I have a Pig program where I am trying to compute the minimum center between two bags. In order for it to work, I found I need to COGROUP the bags into a single dataset. The entire operation takes a long time. I want to either open one of the bags from disk within the UDF, or be able to pass another relation into the UDF without needing to COGROUP.

    Code:

        -- **** Load files for iteration ****
        register myudfs.jar;
        wordcounts = LOAD 'input/wordcounts.txt' USING PigStorage('\t') AS (PatentNumber:chararray, word:chararray, frequency:double);
        centerassignments = LOAD 'input/centerassignments/part-*' USING PigStorage('\t') AS (PatentNumber:chararray, oldCenter:chararray, newCenter:chararray);
        kcenters = LOAD 'input/kcenters/part-*' USING PigStorage('\t') AS (CenterID:chararray, word:chararray, frequency:double);
        kcentersa1 = CROSS centerassignments, kcenters;
        kcentersa = FOREACH kcentersa1 GENERATE centerassignments::PatentNumber as PatentNumber, kcenters::CenterID as CenterID, kcenters::word as word, kcenters::frequency as frequency;

        -- ***** Assign to nearest k-mean *******
        assignpre1 = COGROUP wordcounts by PatentNumber, kcentersa by PatentNumber;
        assignwork2 = FOREACH assignpre1 GENERATE group as PatentNumber, myudfs.kmeans(wordcounts, kcentersa) as CenterID;

    Basically, my issue is that for each patent I need to pass the sub-relations (wordcounts, kcenters). In order to do this, I do a CROSS and then a COGROUP by PatentNumber to get the set PatentNumber, {wordcounts}, {kcenters}. If I could figure out a way to pass a relation or open up the centers from within the UDF, then I could just GROUP wordcounts by PatentNumber and run myudfs.kmeans(wordcount), which is hopefully much faster without the CROSS/COGROUP. This is an expensive operation. Currently it takes about 20 minutes and appears to tax the CPU/RAM. I was thinking it might be more efficient without the CROSS. I'm not sure it will be faster, so I'd like to experiment.
Anyway, it looks like calling the loading functions from within Pig needs a PigContext object, which I don't get from an EvalFunc. And to use the Hadoop file system, I need some initial objects as well, which I don't see how to get. So my question is: how can I open a file from the Hadoop file system from within a Pig UDF? I also run the UDF via main for debugging, so I need to load from the normal filesystem when in debug mode. Another, better idea would be a way to pass a relation into a UDF without needing to CROSS/COGROUP. This would be ideal, particularly if the relation resides in memory, i.e., being able to do myudfs.kmeans(wordcounts, kcenters) without needing the CROSS/COGROUP with kcenters. But the basic idea is to trade IO for RAM/CPU cycles. Anyway, any help will be much appreciated; the Pig UDFs aren't super well documented beyond the most simple ones, even in the UDF manual.
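
The "load the centers once and keep them in memory" idea from the question can be sketched in plain Java, independent of Pig. This is not a Pig EvalFunc and does not show how to obtain a Hadoop file handle inside a UDF, which remains the open question. The class name `NearestCenter` is hypothetical, the input is assumed to be tab-separated `CenterID, word, frequency` lines as in the `kcenters` schema, and squared Euclidean distance is an assumed metric, since the post does not say which one `myudfs.kmeans` uses.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical plain-Java sketch; not a Pig EvalFunc. */
final class NearestCenter {
    // CenterID -> (word -> frequency), read once and kept in memory.
    private final Map<String, Map<String, Double>> centers = new HashMap<>();

    /** Reads tab-separated lines of the assumed form: CenterID, word, frequency. */
    NearestCenter(Reader source) {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split("\t");
                centers.computeIfAbsent(fields[0], id -> new HashMap<>())
                       .put(fields[1], Double.parseDouble(fields[2]));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Squared Euclidean distance between two sparse word vectors (assumed metric). */
    static double distance(Map<String, Double> a, Map<String, Double> b) {
        Set<String> words = new HashSet<>(a.keySet());
        words.addAll(b.keySet());
        double sum = 0.0;
        for (String word : words) {
            double diff = a.getOrDefault(word, 0.0) - b.getOrDefault(word, 0.0);
            sum += diff * diff;
        }
        return sum;
    }

    /** Returns the closest CenterID, or null if no centers were loaded. */
    String assign(Map<String, Double> wordcounts) {
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Map<String, Double>> center : centers.entrySet()) {
            double d = distance(wordcounts, center.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = center.getKey();
            }
        }
        return best;
    }
}
```

Taking a `Reader` keeps the file source pluggable: in debug mode it could be a `java.io.FileReader` on the normal filesystem, while on a cluster the same constructor could wrap a stream opened through the Hadoop file system API, which this sketch does not attempt.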

