Search Results

Search found 8384 results on 336 pages for 'lines'.

Page 116/336 | < Previous Page | 112 113 114 115 116 117 118 119 120 121 122 123  | Next Page >

  • Using Hadoop, are my reducers guaranteed to get all the records with the same key?

    - by samg
    I'm running a hadoop job (using hive actually) which is supposed to uniq lines in a lot of text file. More specifically it chooses the most recently timestamped record for each key in the reduce step. Does hadoop guarantee that every record with the same key, output by the map step, will go to a single reducer, even if there are many reducers running across a cluster? I'm worried that the mapper output might be split after the shuffle happens, in the middle of a set of records with the same key.

    Read the article

  • Generate a list of file names based on month and year arithmetic

    - by MacUsers
    How can I list the numbers 01 to 12 (one for each of the 12 months) in such a way so that the current month always comes last where the oldest one is first. In other words, if the number is grater than the current month, it's from the previous year. e.g. 02 is Feb, 2011 (the current month right now), 03 is March, 2010 and 09 is Sep, 2010 but 01 is Jan, 2011. In this case, I'd like to have [09, 03, 01, 02]. This is what I'm doing to determine the year: for inFile in os.listdir('.'): if inFile.isdigit(): month = months[int(inFile)] if int(inFile) <= int(strftime("%m")): year = strftime("%Y") else: year = int(strftime("%Y"))-1 mnYear = month + ", " + str(year) I don't have a clue what to do next. What should I do here? Update: I think, I better upload the entire script for better understanding. #!/usr/bin/env python import os, sys from time import strftime from calendar import month_abbr vGroup = {} vo = "group_lhcb" SI00_fig = float(2.478) months = tuple(month_abbr) print "\n%-12s\t%10s\t%8s\t%10s" % ('VOs','CPU-time','CPU-time','kSI2K-hrs') print "%-12s\t%10s\t%8s\t%10s" % ('','(in Sec)','(in Hrs)','(*2.478)') print "=" * 58 for inFile in os.listdir('.'): if inFile.isdigit(): readFile = open(inFile, 'r') lines = readFile.readlines() readFile.close() month = months[int(inFile)] if int(inFile) <= int(strftime("%m")): year = strftime("%Y") else: year = int(strftime("%Y"))-1 mnYear = month + ", " + str(year) for line in lines[2:]: if line.find(vo)==0: g, i = line.split() s = vGroup.get(g, 0) vGroup[g] = s + int(i) sumHrs = ((vGroup[g]/60)/60) sumSi2k = sumHrs*SI00_fig print "%-12s\t%10s\t%8s\t%10.2f" % (mnYear,vGroup[g],sumHrs,sumSi2k) del vGroup[g] When I run the script, I get this: [root@serv07 usage]# ./test.py VOs CPU-time CPU-time kSI2K-hrs (in Sec) (in Hrs) (*2.478) ================================================== Jan, 2011 211201372 58667 145376.83 Dec, 2010 5064337 1406 3484.07 Feb, 2011 17506049 4862 12048.04 Sep, 2010 210874275 58576 145151.33 As I said in the original post, I like the result to be in this order instead: Sep, 2010 210874275 58576 145151.33 Dec, 2010 5064337 1406 3484.07 Jan, 2011 211201372 58667 145376.83 Feb, 2011 17506049 4862 12048.04 The files in the source directory reads like this: [root@serv07 usage]# ls -l total 3632 -rw-r--r-- 1 root root 1144972 Feb 9 19:23 01 -rw-r--r-- 1 root root 556630 Feb 13 09:11 02 -rw-r--r-- 1 root root 443782 Feb 11 17:23 02.bak -rw-r--r-- 1 root root 1144556 Feb 14 09:30 09 -rw-r--r-- 1 root root 370822 Feb 9 19:24 12 Did I give a better picture now? Sorry for not being very clear in the first place. Cheers!! Update @Mark Ransom This is the result from Mark's suggestion: [root@serv07 usage]# ./test.py VOs CPU-time CPU-time kSI2K-hrs (in Sec) (in Hrs) (*2.478) ========================================================== Dec, 2010 5064337 1406 3484.07 Sep, 2010 210874275 58576 145151.33 Feb, 2011 17506049 4862 12048.04 Jan, 2011 211201372 58667 145376.83 As I said before, I'm looking for the result to b printed in this order: Sep, 2010 - Dec, 2010 - Jan, 2011 - Feb, 2011 Cheers!!

    Read the article

  • Draw a parallel line

    - by VOX
    I have x1,y1 and x2,y2 which forms a line segment. How can I get another line x3,y3 - x4,y4 which is parallel to the first line as in the picture. I can simply add n to x1 and x2 to get a parallel line but it is not what i wanted. I want the lines to be as parallel in the picture.

    Read the article

  • Regular Expression for CSV with numbers

    - by Bernie Perez
    I'm looking for some regular expression to help parse my CSV file. The file has lines of number,number number,number Comment I want to skip number,number number,number Ex: 319,5446 564425,87 Text to skip 27,765564 I read each line into a string and I wanted to use some regular express to make sure the line matches the pattern of (number,number). If not then don't use the line.

    Read the article

  • IP address numbers in MySQL subquery

    - by Iain Collins
    I have a problem with a subquery involving IPV4 addresses stored in MySQL (MySQL 5.0). The IP addresses are stored in two tables, both in network number format - e.g. the format output by MySQL's INET_ATON(). The first table ('events') contains lots of rows with IP addresses associated with them, the second table ('network_providers') contains a list of provider information for given netblocks. events table (~4,000,000 rows): event_id (int) event_name (varchar) ip_address (unsigned 4 byte int) network_providers table (~60,000 rows): ip_start (unsigned 4 byte int) ip_end (unsigned 4 byte int) provider_name (varchar) Simplified for the purposes of the problem I'm having, the goal is to create an export along the lines of: event_id,event_name,ip_address,provider_name If do a query along the lines of either of the following, I get the result I expect: SELECT provider_name FROM network_providers WHERE INET_ATON('192.168.0.1') >= network_providers.ip_start ORDER BY network_providers.ip_start DESC LIMIT 1 SELECT provider_name FROM network_providers WHERE 3232235521 >= network_providers.ip_start ORDER BY network_providers.ip_start DESC LIMIT 1 That is to say, it returns the correct provider_name for whatever IP I look up (of course I'm not really using 192.168.0.1 in my queries). However, when performing this same query as a subquery, in the following manner, it doesn't yield the result I would expect: SELECT event.id, event.event_name, (SELECT provider_name FROM network_providers WHERE event.ip_address >= network_providers.ip_start ORDER BY network_providers.ip_start DESC LIMIT 1) as provider FROM events Instead the a different (incorrect) value for network_provider is returned - over 90% (but curiously not all) values returned in the provider column contain the wrong provider information for that IP. Using event.ip_address in a subquery just to echo out the value confirms it contains the value I'd expect and that the subquery can parse it. Replacing event.ip_address with an actual network number also works, just using it dynamically in the subquery in this manner that doesn't work for me. I suspect the problem is there is something fundamental and important about subqueries in MySQL that I don't get. I've worked with IP addresses like this in MySQL quite a bit before, but haven't previously done lookups for them using a subquery. The question: I'd really appreciate an example of how I could get the output I want, and if someone here knows, some enlightenment as to why what I'm doing doesn't work so I can avoid making this mistake again. Notes: The actual real-world usage I'm trying to do is considerably more complicated (involving joining two or three tables). This is a simplified version, to avoid overly complicating the question. Additionally, I know I'm not using a between on ip_start & ip_end - that's intentional (the DB's can be out of date, and such cases the owner in the DB is almost always in the next specified range and 'best guess' is fine in this context) however I'm grateful for any suggestions for improvement that relate to the question. Efficiency is always nice, but in this case absolutely not essential - any help appreciated.

    Read the article

  • Scala traits and implicit conversion confusion

    - by pr1001
    The following lines work when I enter them by hand on the Scala REPL (2.7.7): trait myTrait { override def toString = "something" } implicit def myTraitToString(input: myTrait): String = input.toString object myObject extends myTrait val s: String = myObject However, if I try to compile file with it I get the following error: [error] myTrait.scala:37: expected start of definition [error] implicit def myTraitToString(input: myTrait): String = input.toString [error] ^ Why? Thanks!

    Read the article

  • reformat text in perl

    - by paul44
    I have a file of 1000 lines, each line in the format filename dd/mm/yyyy hh:mm:ss I want to convert it to read filename mmddhhmm.ss been attempting to do this in perl and awk - no success - would appreciate any help thanks

    Read the article

  • Resource.h in Windows API simple app

    - by nXqd
    there'are these lines in the sample Win32 app created default by VS. Can you explain why they're just numbers, and it's meaning :) //{{NO_DEPENDENCIES}} // Microsoft Visual C++ generated include file. // Used by Testing Project.rc // #define IDS_APP_TITLE 103 #define IDR_MAINFRAME 128 #define IDD_TESTINGPROJECT_DIALOG 102 #define IDD_ABOUTBOX 103 #define IDM_ABOUT 104 #define IDM_EXIT 105 #define IDI_TESTINGPROJECT 107 #define IDI_SMALL 108 #define IDC_TESTINGPROJECT 109 #define IDC_MYICON 2 #ifndef IDC_STATIC #define IDC_STATIC -1 #endif // Next default values for new objects // #ifdef APSTUDIO_INVOKED #ifndef APSTUDIO_READONLY_SYMBOLS #define _APS_NO_MFC 130 #define _APS_NEXT_RESOURCE_VALUE 129 #define _APS_NEXT_COMMAND_VALUE 32771 #define _APS_NEXT_CONTROL_VALUE 1000 #define _APS_NEXT_SYMED_VALUE 110 #endif #endif

    Read the article

  • Understanding Haskell's filter

    - by dmindreader
    I understand that Haskell's filter is a high order function (meaning a function that takes another function as a parameter) that goes through a list checking which element fulfills certain boolean condition. I don't quite understand its definition: filter:: (a->Bool)->[a]->[a] filter p [] = [] filter p (x:y) | p x = x:filter p y | otherwise = filter p y I understand that if I pass an empty list to the function, it would just return an empty list, but how do I read the last two lines?

    Read the article

  • How to get a Clean String in Javascript?

    - by streetparade
    i have a long String. With some German characters and lots of new lines tabs ect.. In a Selectbox user can select a text, on change i do document.getElementById('text').value=this.value; But this fails. I just get a "unterminated string literal" as error in JavaScript. I think i should clean the string. How can i do it in JavaScript?

    Read the article

  • Is it safe to change the 'Security.salt' line to a more lengthy string {64 hex key}

    - by Gaurav Sharma
    Hi everyone, I have changed the Configure::write('Security.salt', '############'); value in the file config/core.php file to a '256-bit hex key'. Is it safe or a good practice to change these lines for every different installation of cakephp application or shall I revert back to the original ? I also changed the Configure::write('Security.cipherSeed','7927237598237592759727'); to a different one of more length. Please throw some light on this. Thanks

    Read the article

  • java.sql.SQLException: database locked

    - by rajkumari
    Hello We are using sqllite056 jar in our code. While inserting into database in batch we are getting exception on line when we going to commit. Lines of Code <object of Connection> .commit(); <object of Connection>.setAutoCommit(true); Exception java.sql.SQLException: database locked

    Read the article

  • Read next word in java

    - by ArtWorkAD
    Hi, I have a text file that has following content: ac und accipio annehmen ad zu adeo hinzugehen ... I read the text file and iterate through the lines: Scanner sc = new Scanner(new File("translate.txt")); while(sc.hasNext()){ String line = sc.nextLine(); } Each line has two words. Is there any method in java to get the next word or do I have to split the line string to get the words?

    Read the article

  • Code golf - hex to (raw) binary conversion

    - by Alnitak
    In response to this question asking about hex to (raw) binary conversion, a comment suggested that it could be solved in "5-10 lines of C, or any other language." I'm sure that for (some) scripting languages that could be achieved, and would like to see how. Can we prove that comment true, for C, too? NB: this doesn't mean hex to ASCII binary - specifically the output should be a raw octet stream corresponding to the input ASCII hex. Also, the input parser should skip/ignore white space. edit (by Brian Campbell) May I propose the following rules, for consistency? Feel free to edit or delete these if you don't think these are helpful, but I think that since there has been some discussion of how certain cases should work, some clarification would be helpful. The program must read from stdin and write to stdout (we could also allow reading from and writing to files passed in on the command line, but I can't imagine that would be shorter in any language than stdin and stdout) The program must use only packages included with your base, standard language distribution. In the case of C/C++, this means their respective standard libraries, and not POSIX. The program must compile or run without any special options passed to the compiler or interpreter (so, 'gcc myprog.c' or 'python myprog.py' or 'ruby myprog.rb' are OK, while 'ruby -rscanf myprog.rb' is not allowed; requiring/importing modules counts against your character count). The program should read integer bytes represented by pairs of adjacent hexadecimal digits (upper, lower, or mixed case), optionally separated by whitespace, and write the corresponding bytes to output. Each pair of hexadecimal digits is written with most significant nibble first. The behavior of the program on invalid input (characters besides [a-fA-F \t\r\n], spaces separating the two characters in an individual byte, an odd number of hex digits in the input) is undefined; any behavior (other than actively damaging the user's computer or something) on bad input is acceptable (throwing an error, stopping output, ignoring bad characters, treating a single character as the value of one byte, are all OK) The program may write no additional bytes to output. Code is scored by fewest total bytes in the source file. (Or, if we wanted to be more true to the original challenge, the score would be based on lowest number of lines of code; I would impose an 80 character limit per line in that case, since otherwise you'd get a bunch of ties for 1 line).

    Read the article

  • Dealing with development and large javascript files?

    - by maxp
    When dealing with websites with large amount of javascript, i see that these are still usually served to the client as one large javascript file. In the development phase, are the javascript files usually split up (say there are 300 lines of js) to make things abit more manageable, and then merged when the website is 'put live'? Or do the developers just put up with working in one long large file?

    Read the article

  • How should I begin to help projects in Github?

    - by Erik Escobedo
    I'm new to github and I like to help other people with his projects that I find interesting. I know there's a lot of guides in the github place, but I think it could be nice to gather a bunch of real people's experiences. So, I invite you to post about your first experiences in github. Whether you are a not-so-newbie or you are a heavy rock in github comunnity, I think your lines could encourage real newbies like me about entering this great open source community.

    Read the article

  • Value object or not for 3d points ?

    - by Stefano Borini
    I need to develop a geometry library in python, describing points, lines and planes in 3d space, and various geometry operations. Related to my previous question. The main issue in the design is if these entities should have identity or not. I was wondering if there's a similar library out there (developed in another language) to take inspiration from, what is the chosen design, and in particular the reason for one choice vs. the other.

    Read the article

  • xerces-c: Xml parsing multiple files

    - by user459811
    I'm atempting to learn xerces-c and was following this tutorial online. http://www.yolinux.com/TUTORIALS/XML-Xerces-C.html I was able to get the tutorial to compile and run through a memory checker (valgrind) with no problems however when I made alterations to the program slightly, the memory checker returned some potential leak bytes. I only added a few extra lines to main to allow the program to read two files instead of one. int main() { string configFile="sample.xml"; // stat file. Get ambigious segfault otherwise. GetConfig appConfig; appConfig.readConfigFile(configFile); cout << "Application option A=" << appConfig.getOptionA() << endl; cout << "Application option B=" << appConfig.getOptionB() << endl; // Added code configFile = "sample1.xml"; appConfig.readConfigFile(configFile); cout << "Application option A=" << appConfig.getOptionA() << endl; cout << "Application option B=" << appConfig.getOptionB() << endl; return 0; } I was wondering why is it when I added the extra lines of code to read in another xml file, it would result in the following output? ==776== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info ==776== Command: ./a.out ==776== Application option A=10 Application option B=24 Application option A=30 Application option B=40 ==776== ==776== HEAP SUMMARY: ==776== in use at exit: 6 bytes in 2 blocks ==776== total heap usage: 4,031 allocs, 4,029 frees, 1,092,045 bytes allocated ==776== ==776== 3 bytes in 1 blocks are definitely lost in loss record 1 of 2 ==776== at 0x4C28B8C: operator new(unsigned long) (vg_replace_malloc.c:261) ==776== by 0x5225E9B: xercesc_3_1::MemoryManagerImpl::allocate(unsigned long) (MemoryManagerImpl.cpp:40) ==776== by 0x53006C8: xercesc_3_1::IconvGNULCPTranscoder::transcode(unsigned short const*, xercesc_3_1::MemoryManager*) (IconvGNUTransService.cpp:751) ==776== by 0x4038E7: GetConfig::readConfigFile(std::string&) (in /home/bonniehan/workspace/test/a.out) ==776== by 0x403B13: main (in /home/bonniehan/workspace/test/a.out) ==776== ==776== 3 bytes in 1 blocks are definitely lost in loss record 2 of 2 ==776== at 0x4C28B8C: operator new(unsigned long) (vg_replace_malloc.c:261) ==776== by 0x5225E9B: xercesc_3_1::MemoryManagerImpl::allocate(unsigned long) (MemoryManagerImpl.cpp:40) ==776== by 0x53006C8: xercesc_3_1::IconvGNULCPTranscoder::transcode(unsigned short const*, xercesc_3_1::MemoryManager*) (IconvGNUTransService.cpp:751) ==776== by 0x40393F: GetConfig::readConfigFile(std::string&) (in /home/bonniehan/workspace/test/a.out) ==776== by 0x403B13: main (in /home/bonniehan/workspace/test/a.out) ==776== ==776== LEAK SUMMARY: ==776== definitely lost: 6 bytes in 2 blocks ==776== indirectly lost: 0 bytes in 0 blocks ==776== possibly lost: 0 bytes in 0 blocks ==776== still reachable: 0 bytes in 0 blocks ==776== suppressed: 0 bytes in 0 blocks ==776== ==776== For counts of detected and suppressed errors, rerun with: -v ==776== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 2 from 2)

    Read the article

< Previous Page | 112 113 114 115 116 117 118 119 120 121 122 123  | Next Page >