Search Results

Search found 19554 results on 783 pages for 'xml pull parser'.


  • Solving embarrassingly parallel problems using Python multiprocessing

    - by gotgenes
    How does one use multiprocessing to tackle embarrassingly parallel problems?

    Embarrassingly parallel problems typically consist of three basic parts:

    1. Read input data (from a file, database, TCP connection, etc.).
    2. Run calculations on the input data, where each calculation is independent of any other calculation.
    3. Write results of calculations (to a file, database, TCP connection, etc.).

    We can parallelize the program in two dimensions:

    - Part 2 can run on multiple cores, since each calculation is independent; order of processing doesn't matter.
    - Each part can run independently. Part 1 can place data on an input queue, part 2 can pull data off the input queue and put results onto an output queue, and part 3 can pull results off the output queue and write them out.

    This seems a most basic pattern in concurrent programming, but I am still lost in trying to solve it, so let's write a canonical example to illustrate how this is done using multiprocessing.

    Here is the example problem: Given a CSV file with rows of integers as input, compute their sums. Separate the problem into three parts, which can all run in parallel:

    1. Process the input file into raw data (lists/iterables of integers)
    2. Calculate the sums of the data, in parallel
    3. Output the sums

    Below is a traditional, single-process-bound Python program which solves these three tasks:

        #!/usr/bin/env python
        # -*- coding: UTF-8 -*-
        # basicsums.py
        """A program that reads integer values from a CSV file and writes out their
        sums to another CSV file.
        """

        import csv
        import optparse
        import sys

        def make_cli_parser():
            """Make the command line interface parser."""
            usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV",
                    __doc__,
                    """
        ARGUMENTS:
            INPUT_CSV: an input CSV file with rows of numbers
            OUTPUT_CSV: an output file that will contain the sums\
        """])
            cli_parser = optparse.OptionParser(usage)
            return cli_parser


        def parse_input_csv(csvfile):
            """Parses the input CSV and yields tuples with the index of the row
            as the first element, and the integers of the row as the second
            element.

            The index is zero-index based.

            :Parameters:
            - `csvfile`: a `csv.reader` instance

            """
            for i, row in enumerate(csvfile):
                row = [int(entry) for entry in row]
                yield i, row


        def sum_rows(rows):
            """Yields a tuple with the index of each input list of integers
            as the first element, and the sum of the list of integers as the
            second element.

            The index is zero-index based.

            :Parameters:
            - `rows`: an iterable of tuples, with the index of the original row
              as the first element, and a list of integers as the second element

            """
            for i, row in rows:
                yield i, sum(row)


        def write_results(csvfile, results):
            """Writes a series of results to an outfile, where the first column
            is the index of the original row of data, and the second column is
            the result of the calculation.

            The index is zero-index based.

            :Parameters:
            - `csvfile`: a `csv.writer` instance to which to write results
            - `results`: an iterable of tuples, with the index (zero-based) of
              the original row as the first element, and the calculated result
              from that row as the second element

            """
            for result_row in results:
                csvfile.writerow(result_row)


        def main(argv):
            cli_parser = make_cli_parser()
            opts, args = cli_parser.parse_args(argv)
            if len(args) != 2:
                cli_parser.error("Please provide an input file and output file.")
            infile = open(args[0])
            in_csvfile = csv.reader(infile)
            outfile = open(args[1], 'w')
            out_csvfile = csv.writer(outfile)
            # gets an iterable of rows that's not yet evaluated
            input_rows = parse_input_csv(in_csvfile)
            # sends the rows iterable to sum_rows() for results iterable, but
            # still not evaluated
            result_rows = sum_rows(input_rows)
            # finally evaluation takes place as a chain in write_results()
            write_results(out_csvfile, result_rows)
            infile.close()
            outfile.close()


        if __name__ == '__main__':
            main(sys.argv[1:])

    Let's take this program and rewrite it to use multiprocessing to parallelize the three parts outlined above. Below is a skeleton of this new, parallelized program, which needs to be fleshed out to address the parts in the comments:

        #!/usr/bin/env python
        # -*- coding: UTF-8 -*-
        # multiproc_sums.py
        """A program that reads integer values from a CSV file and writes out their
        sums to another CSV file, using multiple processes if desired.
        """

        import csv
        import multiprocessing
        import optparse
        import sys

        NUM_PROCS = multiprocessing.cpu_count()

        def make_cli_parser():
            """Make the command line interface parser."""
            usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV",
                    __doc__,
                    """
        ARGUMENTS:
            INPUT_CSV: an input CSV file with rows of numbers
            OUTPUT_CSV: an output file that will contain the sums\
        """])
            cli_parser = optparse.OptionParser(usage)
            cli_parser.add_option('-n', '--numprocs', type='int',
                    default=NUM_PROCS,
                    help="Number of processes to launch [DEFAULT: %default]")
            return cli_parser


        def main(argv):
            cli_parser = make_cli_parser()
            opts, args = cli_parser.parse_args(argv)
            if len(args) != 2:
                cli_parser.error("Please provide an input file and output file.")
            infile = open(args[0])
            in_csvfile = csv.reader(infile)
            outfile = open(args[1], 'w')
            out_csvfile = csv.writer(outfile)

            # Parse the input file and add the parsed data to a queue for
            # processing, possibly chunking to decrease communication between
            # processes.

            # Process the parsed data as soon as any (chunks) appear on the
            # queue, using as many processes as allotted by the user
            # (opts.numprocs); place results on a queue for output.
            #
            # Terminate processes when the parser stops putting data in the
            # input queue.

            # Write the results to disk as soon as they appear on the output
            # queue.

            # Ensure all child processes have terminated.

            # Clean up files.
            infile.close()
            outfile.close()


        if __name__ == '__main__':
            main(sys.argv[1:])

    These pieces of code, as well as another piece of code that can generate example CSV files for testing purposes, can be found on github.

    I would appreciate any insight here as to how you concurrency gurus would approach this problem. Here are some questions I had when thinking about this problem. Bonus points for addressing any/all:

    - Should I have child processes for reading in the data and placing it into the queue, or can the main process do this without blocking until all input is read?
    - Likewise, should I have a child process for writing the results out from the processed queue, or can the main process do this without having to wait for all the results?
    - Should I use a process pool for the sum operations? If yes, what method do I call on the pool to get it to start processing the results coming into the input queue, without blocking the input and output processes, too? apply_async()? map_async()? imap()? imap_unordered()?
    - Suppose we didn't need to siphon off the input and output queues as data entered them, but could wait until all input was parsed and all results were calculated (e.g., because we know all the input and output will fit in system memory). Should we change the algorithm in any way (e.g., not run any processes concurrently with I/O)?
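One possible way to flesh out the skeleton is sketched below. It is not the poster's solution, only a minimal illustration under stated assumptions: the main process does both the reading and the writing, and a `multiprocessing.Pool` handles the sums through `imap()`, which returns results in input order. The names `sum_indexed_row`, `parse_rows`, `parallel_sums`, and the `pool_factory` parameter are hypothetical and do not appear in the original program.

```python
import csv
import io
import multiprocessing


def sum_indexed_row(indexed_row):
    """Worker function: take (index, [ints]) and return (index, sum).

    It lives at module level so that worker processes can import it.
    """
    index, row = indexed_row
    return index, sum(row)


def parse_rows(reader):
    """Lazily yield (index, [ints]) for each row of a csv.reader."""
    for index, row in enumerate(reader):
        yield index, [int(entry) for entry in row]


def parallel_sums(infile, outfile, numprocs=None, chunksize=64,
                  pool_factory=multiprocessing.Pool):
    """Read rows from infile, sum them in a pool, write results to outfile.

    pool_factory is a hypothetical hook (an assumption of this sketch) that
    lets a caller swap in another pool class with the same interface.
    """
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    with pool_factory(numprocs) as pool:
        # imap() hands rows to the workers in chunks and yields the results
        # in the same order as the input rows.
        results = pool.imap(sum_indexed_row, parse_rows(reader), chunksize)
        for result in results:
            writer.writerow(result)
```

In a script this would be called from under an `if __name__ == '__main__':` guard, for example `parallel_sums(infile, outfile, opts.numprocs)` with the two files opened as in the skeleton. If the order of the output rows does not matter, `imap_unordered()` is a drop-in alternative that yields each result as soon as it is ready.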

    Read the article

  • Digital Certificate Parsing Library in C++?

    - by Sherwood Hu
    I used Crypto++ for my application. However, it lacks a digital certificate parser. I know that OpenSSL has one, but I would have to learn a whole new library. Is there an existing parsing library for C++? All I want is to read the certificate and extract some fields, including the public key.

    Read the article

  • "Push" linq vs reactive framework

    - by Benjol
    (Once again exposing the depths of my ignorance here by combining two concepts which I haven't grokked) I read here about the Reactive framework being a 'Push' model compared to Linq's 'Pull' model. This reminded me of reading an article about 'Push' Linq. Is there really any similarity between these two 'frameworks'? UPDATE Since I asked this question, Jon Skeet has asked it too, here are his first and second impressions.

    Read the article

  • Regex not operator

    - by Erik Goens
    I need to use a regex to pull a value out of a URL domain that will exclude everything but the host (ex: wordpress) and domain type (ex: .com). The URLs are dynamic and contain 2-3 values for each result (www.example.com or example.org). I am trying to use this expression, but I am only getting back the first letter of every item I am attempting to exclude:

        Expression: (?!wordpress|com|www)(\w+|\d+)
        String: example.wordpress.com
        Results: example ordpress om
        Desired result: example

    Any assistance would be greatly appreciated.
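The behavior described above can be reproduced and, under one assumption, avoided. The lookahead only rejects a match that starts at the "w" of "wordpress"; the engine then moves one character forward, where the lookahead passes, and matches "ordpress". Anchoring both the match and the excluded words to word boundaries prevents that. The sketch below uses Python's `re` module rather than whatever engine the poster used, and `wanted_labels` is a hypothetical helper name.

```python
import re

# The poster's pattern: the lookahead is checked at every position,
# so a match can begin one character into an excluded word.
ORIGINAL = re.compile(r"(?!wordpress|com|www)(\w+|\d+)")

# Sketch of a fix: require a word boundary before the match and after
# each excluded word, so only whole labels are tested and skipped.
ANCHORED = re.compile(r"\b(?!(?:wordpress|com|www)\b)\w+")


def wanted_labels(domain):
    """Return the labels of `domain` that are not in the excluded set."""
    return ANCHORED.findall(domain)
```

For the string in the question, `wanted_labels("example.wordpress.com")` returns only `"example"`. The excluded words are still hard-coded, so a domain such as example.org would also return "org" unless it is added to the list.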

    Read the article

  • How to get Facebook share behavior with Facebook Connect on iPhone

    - by Benoit
    With the standard share at http://www.facebook.com/sharer.php one only needs to specify a URL. Title and thumbnail are automatically pulled from the web page (perhaps with the help of meta tags). With Facebook Connect (I am using the iPhone SDK), I need to supply everything explicitly (URL, title, caption, description, images, etc.). Is there a way to emulate the "share" behavior with Facebook Connect, i.e. let Facebook pull the missing elements from the page being shared itself?

    Read the article

  • Regex to parse youtube yid

    - by novaurora
    Example URLs:

        http://www.youtube.com/user/Scobleizer#p/u/1/1p3vcRhsYGo
        http://www.youtube.com/watch?v=cKZDdG9FTKY&feature=channel
        http://www.youtube.com/watch?v=yZ-K7nCVnBI&playnext_from=TL&videos=osPknwzXEas&feature=sub

    Any regex that will pull the correct YID from all 3 of these use cases? The first case is especially odd. Thank you.
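A single pattern can cover the three example URLs if one assumption holds: that a video ID is always 11 characters drawn from letters, digits, "-" and "_". That assumption comes from the examples, not from any documented guarantee. The sketch below uses Python's `re` module, and `extract_yid` is a hypothetical helper name.

```python
import re

# Assumption: a YID is exactly 11 characters from [A-Za-z0-9_-].
# It must follow either a "v=" query parameter or a "/", and be followed
# by "&", "#", "?" or the end of the string.
YID_RE = re.compile(r"(?:[?&]v=|/)([A-Za-z0-9_-]{11})(?=[&#?]|$)")


def extract_yid(url):
    """Return the first 11-character ID found in `url`, or None."""
    match = YID_RE.search(url)
    return match.group(1) if match else None
```

The first URL is matched through the "/" branch (the ID is the last segment of the fragment), the other two through the "v=" branch. A path segment that happens to be exactly 11 valid characters long would also match, so this is a heuristic rather than a parser.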

    Read the article

  • Postgresql GROUP_CONCAT equivalent?

    - by KnockKnockWhosThere
    I have a table and I'd like to pull one row per id with field values concatenated... In my table, for example, I have this:

        TM67 | 4  | 32556
        TM67 | 9  | 98200
        TM67 | 72 | 22300
        TM99 | 2  | 23009
        TM99 | 3  | 11200

    And I'd like to output:

        TM67 | 4,9,72 | 32556,98200,22300
        TM99 | 2,3    | 23009,11200

    In MySQL, I was able to use GROUP_CONCAT, but that doesn't seem to work here... Is there an equivalent or another way to accomplish this?

    Read the article

  • How do those bitmasks actually work?

    - by mystify
    For example, this method from NSCalendar takes a bitmask: - (NSDate *)dateByAddingComponents:(NSDateComponents *)comps toDate:(NSDate *)date options:(NSUInteger)opts So options can be like: NSUInteger options = kCFCalendarUnitYear; or like: NSUInteger options = kCFCalendarUnitYear | kCFCalendarUnitMonth | kCFCalendarUnitDay; What I don't get is, how is this actually done? I mean: How can they pull out those values which are merged into options? If I wanted to program something like this, that can take a bitmask, how would that look?
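The mechanism can be sketched in a few lines. Each option is a number with exactly one bit set, `|` merges them into one integer, and `&` tests whether a given bit is present. The example below is in Python rather than Objective-C, and the constant names and bit positions are illustrative assumptions, not the real Core Foundation values.

```python
# Hypothetical flags for illustration: each one occupies its own bit.
UNIT_YEAR = 1 << 0   # binary 001
UNIT_MONTH = 1 << 1  # binary 010
UNIT_DAY = 1 << 2    # binary 100

FLAG_NAMES = {"year": UNIT_YEAR, "month": UNIT_MONTH, "day": UNIT_DAY}


def units_in(options):
    """Return the names of the flags whose bit is set in `options`."""
    # `options & bit` is non-zero only if that particular bit is set.
    return [name for name, bit in FLAG_NAMES.items() if options & bit]


def describe(options):
    """A function that 'takes a bitmask' and reacts to each flag."""
    return ", ".join(units_in(options)) or "no units"
```

Calling `describe(UNIT_YEAR | UNIT_DAY)` reports the year and day flags, because the merged value is binary 101 and only those two bits survive the `&` test. The same pattern applies in C or Objective-C with `if (options & kSomeFlag) { ... }`.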

    Read the article

  • using IE credentials to log on with c#

    - by james
    Hi, I am writing an HTML parser to help with some job duties. I can open the site in Internet Explorer, but from C# code I get an error. I have tried using client.Credentials = CredentialCache.DefaultNetworkCredentials; client.Proxy.Credentials = CredentialCache.DefaultCredentials; but I don't get the requested page, only an error page. If I can view the page in Internet Explorer, there must be a way to retrieve its HTML in C#. (Note that the same page in other browsers requires authentication, but not in IE.) I appreciate the help.

    Read the article

  • Configure (or mimic) svn:externals to include code from Github in a svn-hosted project

    - by Dylan Beattie
    We use Subversion locally, and we're working on a project that uses a fork of Fluent NHibernate, which is hosted on Github. I'd like it set up so that a single svn checkout will retrieve everything necessary to build the project, but maintain the ability to fetch HEAD updates from github. Is there any way I can pull code from the Git repository as though it was an svn:external dependency? Can I just check the .git folder into our Subversion repository and just run git fetch when I need to, then svn commit the results?

    Read the article

  • Silverlight 4 Default Button Service

    - by Mark Cooper
    For a few months I have been successfully using David Justice's Default Button example in my SL 3 app. This approach is based on an attached property. After upgrading to SL4, the approach no longer works, and I get a XAML exception: "Unknown parser error: Scanner 2148474880". Has anyone successfully used this (or any other) default button attached behaviours in SL4? Is there any other way to achieve default button behaviour in SL4 with the new classes that are available? Thanks, Mark

    Read the article

  • Benefits of implementing OAuth

    - by zfranciscus
    From a web service provider's point of view, what is the benefit of asking users to create an account or log in to your site through a third-party web service provider (e.g., Twitter or Facebook)? Wouldn't it be easier to ask the user to provide their Twitter or Facebook login and use that to pull the user's Twitter or Facebook data? It is safer to use OAuth than to give someone on the internet our Twitter or Facebook login credentials. But I can't figure out the benefit from the web service's point of view.

    Read the article

  • Duplicate System.Web.UI.AsyncPostBackTrigger Controls keep getting inserted automatically, causing P

    - by Albert
    I have an update panel with a number of [asp:AsyncPostBackTrigger...] controls, and everything was working fine. But now, something keeps inserting duplicate AsyncTriggers, and instead of simply being [asp:AsyncPostBackTrigger...] controls they're [System.Web.UI.AsyncPostBackTrigger...] controls, and I get parser errors as a result. So I delete the duplicate triggers, and they get re-inserted within a few minutes, seemingly at random. Does anyone know what's going on here?

    Read the article

  • Rookie PHP question

    - by Thomas
    I am hacking together a theme for WordPress and I am using the following code to pull out data from a custom field with several values: <?php $mykey_values = get_post_custom_values('services'); foreach ( $mykey_values as $key => $value ) { echo "<span>$value, </span>"; } ?> I use a comma to separate the results, but I don't want a comma after the last result. How do I get around this?

    Read the article

  • Turing-Complete language possibilities?

    - by I can't tell you my name.
    In every Turing-complete language, is it possible to create a working:

    - Compiler for itself, which first runs on an interpreter written in some other language and then compiles its own source code? (Bootstrapping)
    - Standards-compliant C++ compiler which outputs binaries for, e.g., Windows?
    - Regex parser and evaluator?
    - World of Warcraft clone? (Assuming the language gets the necessary API bindings, for example OpenGL, and the WoW source code is available)

    (Everything here is theoretical.) Let's take Brainf*ck as an example language.

    Read the article

  • How is parsing phase in a compiler different from a rule engine ?

    - by abhinav
    Hi, I have a rough understanding of how compilers work (I mean languages, grammars, lexical analysis, parsing, etc.). Rule engines have various rules and associated actions, just as you have rules in grammars and can associate actions with them in parser-generator tools like ANTLR. So I am a bit confused about how to differentiate between these two. Could anyone give a clearer, more formal explanation of the differences? Thanks, Abhinav.

    Read the article
