Search Results

Search found 2708 results on 109 pages for 'michael altman'.

Page 77/109 | < Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >

  • Improving HTML scrapper efficiency with pcntl_fork()

    - by Michael Pasqualone
    With the help from two previous questions, I now have a working HTML scrapper that feeds product information into a database. What I am now trying to do is improve efficiently by wrapping my brain around with getting my scrapper working with pcntl_fork. If I split my php5-cli script into 10 separate chunks, I improve total runtime by a large factor so I know I am not i/o or cpu bound but just limited by the linear nature of my scraping functions. Using code I've cobbled together from multiple sources, I have this working test: <?php libxml_use_internal_errors(true); ini_set('max_execution_time', 0); ini_set('max_input_time', 0); set_time_limit(0); $hrefArray = array("http://slashdot.org", "http://slashdot.org", "http://slashdot.org", "http://slashdot.org"); function doDomStuff($singleHref,$childPid) { $html = new DOMDocument(); $html->loadHtmlFile($singleHref); $xPath = new DOMXPath($html); $domQuery = '//div[@id="slogan"]/h2'; $domReturn = $xPath->query($domQuery); foreach($domReturn as $return) { $slogan = $return->nodeValue; echo "Child PID #" . $childPid . " says: " . $slogan . "\n"; } } $pids = array(); foreach ($hrefArray as $singleHref) { $pid = pcntl_fork(); if ($pid == -1) { die("Couldn't fork, error!"); } elseif ($pid > 0) { // We are the parent $pids[] = $pid; } else { // We are the child $childPid = posix_getpid(); doDomStuff($singleHref,$childPid); exit(0); } } foreach ($pids as $pid) { pcntl_waitpid($pid, $status); } // Clear the libxml buffer so it doesn't fill up libxml_clear_errors(); Which raises the following questions: 1) Given my hrefArray contains 4 urls - if the array was to contain say 1,000 product urls this code would spawn 1,000 child processes? If so, what is the best way to limit the amount of processes to say 10, and again 1,000 urls as an example split the child work load to 100 products per child (10 x 100). 2) I've learn that pcntl_fork creates a copy of the process and all variables, classes, etc. What I would like to do is replace my hrefArray variable with a DOMDocument query that builds the list of products to scrape, and then feeds them off to child processes to do the processing - so spreading the load across 10 child workers. My brain is telling I need to do something like the following (obviously this doesn't work, so don't run it): <?php libxml_use_internal_errors(true); ini_set('max_execution_time', 0); ini_set('max_input_time', 0); set_time_limit(0); $maxChildWorkers = 10; $html = new DOMDocument(); $html->loadHtmlFile('http://xxxx'); $xPath = new DOMXPath($html); $domQuery = '//div[@id=productDetail]/a'; $domReturn = $xPath->query($domQuery); $hrefsArray[] = $domReturn->getAttribute('href'); function doDomStuff($singleHref) { // Do stuff here with each product } // To figure out: Split href array into $maxChilderWorks # of workArray1, workArray2 ... workArray10. $pids = array(); foreach ($workArray(1,2,3 ... 10) as $singleHref) { $pid = pcntl_fork(); if ($pid == -1) { die("Couldn't fork, error!"); } elseif ($pid > 0) { // We are the parent $pids[] = $pid; } else { // We are the child $childPid = posix_getpid(); doDomStuff($singleHref); exit(0); } } foreach ($pids as $pid) { pcntl_waitpid($pid, $status); } // Clear the libxml buffer so it doesn't fill up libxml_clear_errors(); But what I can't figure out is how to build my hrefsArray[] in the master/parent process only and feed it off to the child process. Currently everything I've tried causes loops in the child processes. I.e. my hrefsArray gets built in the master, and in each subsequent child process. I am sure I am going about this all totally wrong, so would greatly appreciate just general nudge in the right direction.

    Read the article

  • Need help in using Eclipse JEE version to develop a servlet project

    - by michael
    Hi, I have downloaded eclipse jee version (3.5) and I would like to use it to develop a servlet project on tomcat. So I * install tomcat and add it as my server in my eclipse environment. * create a Dynamic Web Project called 'TestServlet' * create a new servlet called 'MainServlet' and then I deploy my project to the tomcat server via eclipse and 'run the server in debug' mode. But when I use the browser to hit 'http://localhost:8080/TestServlet/MainServlet' I see no resource found (that page is generated by Tomcat, so I know my Tomcat is running). Can you please tell me what am I missing? Or how can I trouble shoot my problem? I think it must be some path /name is not set correctly.

    Read the article

  • How can I move all my modification to a branch

    - by michael
    Hi, I create a working repository in HG. And I have modified some files. How can i move my all my modification to a branch (a branch that I have not created)? (kind of 'git stash' and the move the stash away change to a branch. Actually, I am not sure how I can do that in git either. If you know, I appreciate if you can tell me in git as well) Thank you.

    Read the article

  • Lambda expressions - set the value of one property in a collection of objects based on the value of

    - by Michael Rut
    I'm new to lambda expressions and looking to leverage the syntax to set the value of one property in a collection based on another value in a collection Typically I would do a loop: class item{ public string name {get;set;} public string value {get;set;} } class business { item item1 = new item(name="name1"); item item2 = new item(name="name2"); item item3 = new item(name="name3"); Collection<item> items = new Collection() {item1,item2,item3}; //This is what I want to simplify for( int i = 0; i < items.count; i++) { if(items[i].item.name == "name2") { //set the value items[i].item.value = "value2"; } } }

    Read the article

  • Jboss Seam Booking Example Extract Shared Libs From Ear

    - by michael lucas
    Example Booking Application, which JBoss Seam is shipped with, build into EAR file of about 7 MB. That's pretty much if you consider deploying this package to a remote Jboss server and possibly redeploying it package many times during your regular work. Lib files like richfaces and jsf-facelet make the lion's share of that EAR size. Why can't we just extract lib files into jboss-web.deployer directory on JBoss 4.2.0 GA server?

    Read the article

  • Animation of UINavigationController's UIToolbar

    - by Michael Waterfall
    When presenting a view controller that has toolbar items, is it possible for the toolbar to slide in with the view controller (i.e. slide in from the right) as opposed to it sliding from the bottom? In the view controller that is being presented, I've got the toolbar being shown within the -viewWillAppear: method, but the toolbar is being slid up from the bottom of the screen as opposed to it looking like it belongs to the view controller. - (void)viewWillAppear:(BOOL)animated { [self.navigationController setToolbarHidden:NO animated:YES]; ... }

    Read the article

  • Hibernate without primary keys generated by db?

    - by Michael Jones
    I'm building a data warehouse and want to use InfiniDB as the storage engine. However, it doesn't allow primary keys or foreign key constraints (or any constraints for that matter). Hibernate complains "The database returned no natively generated identity value" when I perform an insert. Each table is relational, and contains a unique integer column that was previously used as the primary key - I want to keep that, but just not have the constraint in the db that the column is the primary key. I'm assuming the problem is that Hibernate expects the db to return a generated key. Is it possible to override this behaviour so I can set the primary key field's value myself and keep Hibernate happy? -- edit -- Two of the mappings are as follows: <?xml version="1.0"?> <!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd"> <!-- Generated Jun 1, 2010 2:49:51 PM by Hibernate Tools 3.2.1.GA --> <hibernate-mapping> <class name="com.example.project.Visitor" table="visitor" catalog="orwell"> <id name="id" type="java.lang.Long"> <column name="id" /> <generator class="identity" /> </id> <property name="firstSeen" type="timestamp"> <column name="first_seen" length="19" /> </property> <property name="lastSeen" type="timestamp"> <column name="last_seen" length="19" /> </property> <property name="sessionId" type="string"> <column name="session_id" length="26" unique="true" /> </property> <property name="userId" type="java.lang.Long"> <column name="user_id" /> </property> <set name="visits" inverse="true"> <key> <column name="visitor_id" /> </key> <one-to-many class="com.example.project.Visit" /> </set> </class> </hibernate-mapping> and: <?xml version="1.0"?> <!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd"> <!-- Generated Jun 1, 2010 2:49:51 PM by Hibernate Tools 3.2.1.GA --> <hibernate-mapping> <class name="com.example.project.Visit" table="visit" catalog="orwell"> <id name="id" type="java.lang.Long"> <column name="id" /> <generator class="identity" /> </id> <many-to-one name="visitor" class="com.example.project.Visitor" fetch="join" cascade="all"> <column name="visitor_id" /> </many-to-one> <property name="visitId" type="string"> <column name="visit_id" length="20" unique="true" /> </property> <property name="startTime" type="timestamp"> <column name="start_time" length="19" /> </property> <property name="endTime" type="timestamp"> <column name="end_time" length="19" /> </property> <property name="userAgent" type="string"> <column name="user_agent" length="65535" /> </property> <set name="pageViews" inverse="true"> <key> <column name="visit_id" /> </key> <one-to-many class="com.example.project.PageView" /> </set> </class> </hibernate-mapping>

    Read the article

  • Passenger, Apache and avoiding page caching

    - by Michael Guterl
    I'm hosting a rack application with passenger and apache. The application is setup to cache the content of each request to the public directory after each request. This allows apache to serve the content directly as a static page for future requests. I would like to tell Apache, presumably through some rewrite rules that any requests with query parameters should not be cached, but instead passed down to the rack application. With a mongrel setup I would just redirect it to the balancer if it meets my rewrite conditions. How do you do the same with passenger?

    Read the article

  • SEO for Computer Software Engineering Topics

    - by Michael Aaron Safyan
    I'm currently trying to SEO my development and coding search custom search engine as well my website that has a variety of coding and development resources. I would like to increase the number of links to my website, but I don't want to simply generate spam. What are some places that I should submit my website, where the content would be considered relevant rather than just spam? Thanks.

    Read the article

  • How to get datarow array refering to datatable 's field value with linq

    - by Michael
    I want to use linq to get datarow array from a datatable which its string type ColumnA is not null or depending on its length 0 , so I can get the row index with Indexof() method to deal with something else. ColumnA ColumnB ColumnC A0 B0 C0 Null B1 C1 A2 B2 C2 Null B3 C3 My Linq Statment: DataRow[] rows = myDataTable.Select("ColumnA is not null").Where(row=>row.Field<string>("ColumnA").Length>0); somebody who can help?

    Read the article

  • Changing <object> height and width works in Chrome but not Firefox or IE. Why?

    - by Michael Hopkins
    I am making a site with two Youtube videos. These videos use the raw embed code from Youtube. The site's design doesn't work with any of the default Youtube sizes, so I am writing code to automatically resize the video. Here is my code. There will never be more than these two tags on the page, otherwise I'd do a better job selecting the videos. <script language='JavaScript' type='text/javascript'> var x=document.getElementsByTagName('object'); x.[0].width='350'; x.[0].height='350'; x.[1].width='350'; x.[1].height='350'; </script> For reference, here's a sample default Youtube embed that the code might alter: <object width="480" height="385"> <param name="movie" value="http://www.youtube-nocookie.com/v/zSgiXGELjbc&hl=en_US&fs=1&rel=0"></param> <param name="allowFullScreen" value="true"></param> <param name="allowscriptaccess" value="always"></param> <embed src="http://www.youtube-nocookie.com/v/zSgiXGELjbc&hl=en_US&fs=1&rel=0" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed> </object> In Chrome, the video players sit perfectly in a 350x350 box. In IE and FF (latest versions), the videos are the unchanged, normal size. I cannot find anything in Google that explans why this won't work. I have tried using setattribute, for loops, adjusting both and , single-quotes and double-quotes, etc. Any ideas what is going wrong?

    Read the article

  • best way to export data from pdfs

    - by michael
    Hi i work at a news paper and we are lookin a way to make archieve material available. Atm our pages come in pdf format so we need a way to export text and images from the pdf so that they can be added to a database. We've had a look at the News studio plugin for Adobe Acrobat from Iceni Technology, but just wondering if anyone else knows other options for exporting pdf data. thanks

    Read the article

  • MonoDevelop seems to hang (not unresponsive) when building csprojs

    - by Michael Shimmins
    Building a solution from Visual Studio in mono develop seems to have some issues. I'm hoping someone else has experienced this and has some suggestions. The actual dcms process goes pretty quickly, but in between projects it hangs after printing: Building: XXX.YYY.ZZZ (Debug) After a few minutes (been 10 so far on this current run), it jumps to: Performing main compilation... /Library/Frameworks/Mono.framework/Versions/2.10.1/bin/dmcs /noconfig "/... Build complete -- 0 errors, 0 warnings Building: XXX.YYY.ZZZ (Debug) Then hangs again for another few minutes. This is a sln file with 29 csproj projects in it that was originally created in Visual Studio 2010. I'm wondering if there is a better way to set this up - potentially a native MD file format?

    Read the article

  • How can I restore the "auto" values with for list-style-type in nested unordered lists with CSS?

    - by Michael
    By default, an unstyled set of nested <ul> lists looks like this (in Chrome, Firefox, and IE at least): The top level has a list-style-type of disc, the next level is circle, and subsequent levels are square. If I include a stylesheet that changes the list-style-type to none, is there a simple way to revert back to the "automatic bullet types" later in the document? (e.g., override with a subsequent CSS definition or JavaScript style change) Basically, I'm looking for something like list-style-type: auto; (which is apparently not valid and has no effect): <style type="text/css"> ul { list-style-type: none; } ul { list-style-type: auto; } /* Does not work */ </style> Setting the list-style-type back to disc changes every bullet in the list and I no longer see different bullets at different levels, so that doesn't work either. Is the only way to accomplish this by explicitly defining styles for every level? e.g.: <style type="text/css"> ul { list-style-type: disc; } ul ul { list-style-type: circle; } ul ul ul { list-style-type: square; } </style>

    Read the article

  • Detecting if MSBuild/.net 4 is installed from C# code running on 3.5?

    - by Michael Stum
    I have an application that is running on .net 3.5 SP1 and that is supposed to check if .net 4 is installed. Actually, I'm more interested if MSBuild v4 is installed, which would boil down to a simple File.Exists(@"C:\Windows\Microsoft.NET\Framework\v4.0.30319\msbuild.exe"); However, apart from the fragility of the 4.0.30319 Version (and the Windir, but that's easy to solve), I wonder if there is a more appropriate way, like an API?

    Read the article

  • How do you manage tasks within your work?

    - by Michael
    Just wondering how you all manage your workload effectively when there's a lot of your plate? What do you do to break it down into bite-size chunks and how do you track progress of each task? Do you find TDD helps to focus your attention of getting areas of functionality complete before moving onto the next one? I quite often find myself getting a bit overwhelmed when I have an involving task on the go (even if it can be broken down into lots of small chunks), even though I know I'm more than capable of doing the work. We have a kind of agile approach Interested to hear how everyone manages things effectively.

    Read the article

  • How long until programming becomes a professionally certified (and respected as such) skill such as

    - by Michael Campbell
    Given Toyota's recent issues and the ENORMOUS amount of safety that is being relegated to computers, how long do you think it will be before programming, or perhaps programming certain things (embedded transportation software, (air) traffic control, electrical grid, hospital equipment, nuclear plant security, planes, etc.) becomes regulated? And/or, how long before there will be certain regulated certifications before you can call yourself a "software developer", like Architects and Engineers have now? (NB: I'm from the US, so I don't know how it works in other countries; please forgive my ignornace.)

    Read the article

  • I'm trying to run a command using WMI.

    - by MIchael Burns
    This is my code: a button is clicked and the text in a textbox is taken for the remotePC. I can run it locally but when I try to run it remotely it will not work, I think it has something to do with using WMI to run a shared file? public void IPXFER(string RemotePC) { object[] theProcessToRun = { @"\\network-share\ipxfer\ipxfer.exe -s corp-trend -p 1234 -m 1 -c 12345" }; ConnectionOptions theConnection = new ConnectionOptions(); theConnection.Impersonation = ImpersonationLevel.Impersonate; theConnection.EnablePrivileges = true; ManagementScope theScope = new ManagementScope("\\\\" + RemotePC + "\\root\\cimv2", theConnection); ManagementClass theClass = new ManagementClass(theScope, new ManagementPath("Win32_Process"), new ObjectGetOptions()); theClass.InvokeMethod("Create", theProcessToRun); }

    Read the article

  • Importing Python modules without installing - Sybase ASE

    - by Michael
    I need to use the Sybase Python module but our SA's won't install because it's not in the repo's. I've downloaded it and placed it on the box and would just like to 'import' or 'include' the module without installing it first. - Is this possible? From the looks of it (Sybase ASE) it needs some type of compilation before use. Is it possible for this type of work around?

    Read the article

  • php DOM, get values from xml document, php xml

    - by Michael
    I'm trying to get some information (itemID, title, price and mileage) for multiple listings from ebay website using their api . So far I got this link up http://open.api.ebay.com/shopping?callname=GetMultipleItems&responseencoding=XML&appid=Morcovar-c74b-47c0-954f-463afb69a4b3&siteid=0&version=525&IncludeSelector=ItemSpecifics&ItemID=220617293997,250645537939,230485306218 I've saved the document as .xml file using php curl and now I need to get/extract the values(itemID, title, price and mileage) into arrays and store them in database. Unfortunately I never worked with php dom and I can't figure it out how to extract the values . I tried to follow the tutorial found on IBM website http://www.ibm.com/developerworks/library/os-xmldomphp/ but I had no success. Some help would be highly appreciated.

    Read the article

< Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >