Daily Archives

Articles indexed Friday June 11 2010

Page 77/114 | < Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >

  • Asynchronous Webcrawling F#, something wrong ?

    - by jlezard
    Not quite sure if it is ok to do this but, my question is: Is there something wrong with my code ? It doesn't go as fast as I would like, and since I am using lots of async workflows maybe I am doing something wrong. The goal here is to build something that can crawl 20 000 pages in less than an hour. open System open System.Text open System.Net open System.IO open System.Text.RegularExpressions open System.Collections.Generic open System.ComponentModel open Microsoft.FSharp open System.Threading //This is the Parallel.Fs file type ComparableUri ( uri: string ) = inherit System.Uri( uri ) let elts (uri:System.Uri) = uri.Scheme, uri.Host, uri.Port, uri.Segments interface System.IComparable with member this.CompareTo( uri2 ) = compare (elts this) (elts(uri2 :?> ComparableUri)) override this.Equals(uri2) = compare this (uri2 :?> ComparableUri ) = 0 override this.GetHashCode() = 0 ///////////////////////////////////////////////Funtions to retreive html string////////////////////////////// let mutable error = Set.empty<ComparableUri> let mutable visited = Set.empty<ComparableUri> let getHtmlPrimitiveAsyncDelay (delay:int) (uri : ComparableUri) = async{ try let req = (WebRequest.Create(uri)) :?> HttpWebRequest // 'use' is equivalent to ‘using’ in C# for an IDisposable req.UserAgent<-"Mozilla" //Console.WriteLine("Waiting") do! Async.Sleep(delay * 250) let! resp = (req.AsyncGetResponse()) Console.WriteLine(uri.AbsoluteUri+" got response after delay "+string delay) use stream = resp.GetResponseStream() use reader = new StreamReader(stream) let html = reader.ReadToEnd() return html with | _ as ex -> Console.WriteLine( ex.ToString() ) lock error (fun () -> error<- error.Add uri ) lock visited (fun () -> visited<-visited.Add uri ) return "BadUri" } ///////////////////////////////////////////////Active Pattern Matching to retreive href////////////////////////////// let (|Matches|_|) (pat:string) (inp:string) = let m = Regex.Matches(inp, pat) // Note the List.tl, since the first group is always the entirety of the matched string. if m.Count > 0 then Some (List.tail [ for g in m -> g.Value ]) else None let (|Match|_|) (pat:string) (inp:string) = let m = Regex.Match(inp, pat) // Note the List.tl, since the first group is always the entirety of the matched string. if m.Success then Some (List.tail [ for g in m.Groups -> g.Value ]) else None ///////////////////////////////////////////////Find Bad href////////////////////////////// let isEmail (link:string) = link.Contains("@") let isMailto (link:string) = if Seq.length link >=6 then link.[0..5] = "mailto" else false let isJavascript (link:string) = if Seq.length link >=10 then link.[0..9] = "javascript" else false let isBadUri (link:string) = link="BadUri" let isEmptyHttp (link:string) = link="http://" let isFile (link:string)= if Seq.length link >=6 then link.[0..5] = "file:/" else false let containsPipe (link:string) = link.Contains("|") let isAdLink (link:string) = if Seq.length link >=6 then link.[0..5] = "adlink" elif Seq.length link >=9 then link.[0..8] = "http://adLink" else false ///////////////////////////////////////////////Find Bad href////////////////////////////// let getHref (htmlString:string) = let urlPat = "href=\"([^\"]+)" match htmlString with | Matches urlPat urls -> urls |> List.map( fun href -> match href with | Match (urlPat) (link::[]) -> link | _ -> failwith "The href was not in correct format, there was more than one match" ) | _ -> Console.WriteLine( "No links for this page" );[] |> List.filter( fun link -> not(isEmail link) ) |> List.filter( fun link -> not(isMailto link) ) |> List.filter( fun link -> not(isJavascript link) ) |> List.filter( fun link -> not(isBadUri link) ) |> List.filter( fun link -> not(isEmptyHttp link) ) |> List.filter( fun link -> not(isFile link) ) |> List.filter( fun link -> not(containsPipe link) ) |> List.filter( fun link -> not(isAdLink link) ) let treatAjax (href:System.Uri) = let link = href.ToString() let firstPart = (link.Split([|"#"|],System.StringSplitOptions.None)).[0] new Uri(firstPart) //only follow pages with certain extnsion or ones with no exensions let followHref (href:System.Uri) = let valid2 = set[".py"] let valid3 = set[".php";".htm";".asp"] let valid4 = set[".php3";".php4";".php5";".html";".aspx"] let arrLength = href.Segments |> Array.length let lastExtension = (href.Segments).[arrLength-1] let lengthLastExtension = Seq.length lastExtension if (lengthLastExtension <= 3) then not( lastExtension.Contains(".") ) else //test for the 2 case let last4 = lastExtension.[(lengthLastExtension-1)-3..(lengthLastExtension-1)] let isValid2 = valid2|>Seq.exists(fun validEnd -> last4.EndsWith( validEnd) ) if isValid2 then true else if lengthLastExtension <= 4 then not( last4.Contains(".") ) else let last5 = lastExtension.[(lengthLastExtension-1)-4..(lengthLastExtension-1)] let isValid3 = valid3|>Seq.exists(fun validEnd -> last5.EndsWith( validEnd) ) if isValid3 then true else if lengthLastExtension <= 5 then not( last5.Contains(".") ) else let last6 = lastExtension.[(lengthLastExtension-1)-5..(lengthLastExtension-1)] let isValid4 = valid4|>Seq.exists(fun validEnd -> last6.EndsWith( validEnd) ) if isValid4 then true else not( last6.Contains(".") ) && not(lastExtension.[0..5] = "mailto") //Create the correct links / -> add the homepage , make them a comparabel Uri let hrefLinksToUri ( uri:ComparableUri ) (hrefLinks:string list) = hrefLinks |> List.map( fun link -> try if Seq.length link <4 then Some(new Uri( uri, link )) else if link.[0..3] = "http" then Some(new Uri(link)) else Some(new Uri( uri, link )) with | _ as ex -> Console.WriteLine(link); lock error (fun () ->error<-error.Add uri) None ) |> List.filter( fun link -> link.IsSome ) |> List.map( fun o -> o.Value) |> List.map( fun uri -> new ComparableUri( string uri ) ) //Treat uri , removing ajax last part , and only following links specified b Benoit let linksToFollow (hrefUris:ComparableUri list) = hrefUris |>List.map( treatAjax ) |>List.filter( fun link -> followHref link ) |>List.map( fun uri -> new ComparableUri( string uri ) ) |>Set.ofList let needToVisit uri = ( lock visited (fun () -> not( visited.Contains uri) ) ) && (lock error (fun () -> not( error.Contains uri) )) let getLinksToFollowAsyncDelay (delay:int) ( uri: ComparableUri ) = async{ let! links = getHtmlPrimitiveAsyncDelay delay uri lock visited (fun () ->visited<-visited.Add uri) let linksToFollow = getHref links |> hrefLinksToUri uri |> linksToFollow |> Set.filter( needToVisit ) |> Set.map( fun link -> if uri.Authority=link.Authority then link else link ) return linksToFollow } //Add delays if visitng same authority let getDelay(uri:ComparableUri) (authorityDelay:Dictionary<string,int>) = let uriAuthority = uri.Authority let hasAuthority,delay = authorityDelay.TryGetValue(uriAuthority) if hasAuthority then authorityDelay.[uriAuthority] <-delay+1 delay else authorityDelay.Add(uriAuthority,1) 0 let rec getLinksToFollowFromSetAsync maxIteration ( uris: seq<ComparableUri> ) = let authorityDelay = Dictionary<string,int>() if maxIteration = 100 then Console.WriteLine("Finished") else //Unite by authority add delay for those we same authority others ignore let stopwatch= System.Diagnostics.Stopwatch() stopwatch.Start() let newLinks = uris |> Seq.map( fun uri -> let delay = lock authorityDelay (fun () -> getDelay uri authorityDelay ) getLinksToFollowAsyncDelay delay uri ) |> Async.Parallel |> Async.RunSynchronously |> Seq.concat stopwatch.Stop() Console.WriteLine("\n\n\n\n\n\n\nTimeElapse : "+string stopwatch.Elapsed+"\n\n\n\n\n\n\n\n\n") getLinksToFollowFromSetAsync (maxIteration+1) newLinks getLinksToFollowFromSetAsync 0 (seq[ComparableUri( "http://twitter.com/" )]) Console.WriteLine("Finished") Some feedBack would be great ! Thank you (note this is just something I am doing for fun)

    Read the article

  • ANTLR, Dynamic variables

    - by JTS
    Hello, I have an ANTLR grammar that can parse and evaluate simple expressions like 1+2*4, etc. What I would like to do is to evaluate expressions like 2+$a-$b/4 where the $ variables are dynamic variables, that come from an external source and are continuously updated. Is there any design pattern on how to do this using ANTLR, best practices, etc? Shall I "substring" the $a with the updated value ($a - 4.34) A nicer way to do this? Thx

    Read the article

  • How to merge or copy anonymous session data into user data when user logs in?

    - by benhoyt
    This is a general question, or perhaps a request for pointers to other open source projects to look at: I'm wondering how people merge an anonymous user's session data into the authenticated user data when a user logs in. For example, someone is browsing around your websites saving various items as favourites. He's not logged in, so they're saved to an anonymous user's data. Then he logs in, and we need to merge all that data into their (possibly existing) user data. Is this done different ways in an ad-hoc fashion for different applications? Or are there some best practices or other projects people can direct me to?

    Read the article

  • Database cache that I'm not aware of?

    - by Martin
    I'm using asp.net mvc, linq2sql, iis7 and sqlserver express 2008. I get these intermittent server errors, primary key conflicts on insertion. I'm using a different setup on my development computer so I can't debug. After a while they go away. Restarting iis helps. I'm getting the feeling there is cache somewhere that I'm not aware of. Can somebody help me sort out these errors? Cannot insert duplicate key row in object 'dbo.EnquiryType' with unique index 'IX_EnquiryType'. Edits regarding Venemos answer Is it possible that another application is also accessing the same database simultaneously? Yes there is, but not this particular table and no inserts or updates. There is one other table with which I experience the same problem but it has to do with a different part of the model. How often an in what context do you create a new DataContext instance? Only once, using the singleton pattern. Are the primary keys generated by the database or by the application? Database. Which version of ASP.NET MVC and which version of .NET are you using? RC2 and 3.5.

    Read the article

  • How to create JPA EntityMananger in Spring Context?

    - by HDave
    I have a JPA/Spring application that uses Hibernate as the JPA provider. In one portion of the code, I have to manually create a DAO in my application with the new operator rather than use Spring DI. When I do this, the @PersistenceContext annotation is not processed by Spring. In my code where I create the DAO I have an EntityManagerFactory which I used to set the entityManager as follows: @PersistenceUnit private EntityManagerFactory entityManagerFactory; MyDAO dao = new MyDAOImpl(); dao.setEntityManager(entityManagerFactory.createEntityManager()); The problem is that when I do this, I get a Hibernate error: Could not find UserTransaction in JNDI [java:comp/UserTransaction] It's as if the HibernateEntityManager never received all the settings I've configured in Spring: <bean id="myAppTestLocalEmf" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean"> <property name="persistenceUnitName" value="myapp-core" /> <property name="persistenceUnitPostProcessors"> <bean class="com.myapp.core.persist.util.JtaPersistenceUnitPostProcessor"> <property name="jtaDataSource" ref="myappPersistTestJdbcDataSource" /> </bean> </property> <property name="jpaProperties"> <props> <prop key="hibernate.transaction.factory_class">org.hibernate.transaction.JTATransactionFactory</prop> <prop key="hibernate.transaction.manager_lookup_class">com.atomikos.icatch.jta.hibernate3.TransactionManagerLookup</prop> </props> </property> <property name="jpaVendorAdapter"> <bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter"> <property name="showSql" value="true" /> <!-- The following use the PropertyPlaceholderConfigurer but it doesn't work in Eclipse --> <property name="database" value="$DS{hibernate.database}" /> <property name="databasePlatform" value="$DS{hibernate.dialect}" /> I am not sure, but I think the issue might be that I am not creating the entity manager correctly with the plain vanilla createEntityManager() call rather than passing in a map of properties. I say this because when Spring's LocalContainerEntityManagerFactoryBean proxy's the call to Hibernate's createEntityManager() all of the Spring configured options are missing. That is, there is no Map argument to the createEntityManager() call. Perhaps it is another problem entirely however....not sure!

    Read the article

  • C# infragistics UltraDockManager

    - by RunnerNWizard
    I have created a side tab that opens just fine(basically turned a UltraGrid into a tab, not a form w/ an UltraGrid ), but when pinning the side tab it gets sized in back of the main form. How do I set the pinned tab layout to open up based on my sizing?

    Read the article

  • Consume restful webservice through web proxy

    - by Chico
    Hello, I'm trying to consume a restful webservice in java using the Apache Wink framework through my school web proxy requiring authentification ClientConfig clientConfig = new ClientConfig(); clientConfig.proxyHost("proxy.school.com"); clientConfig.proxyPort(3128); //nothing to set username and password :( RestClient client = new RestClient(clientConfig); Resource resource = client.resource("http://vimeo.com/api/v2/artist/videos.xml"); String response = resource.accept("text/plain").get(String.class); I've also tried to use the BasicAuthSecurityHandler but it seems to be used to authenticate directly to a web server, not the web proxy BasicAuthSecurityHandler basicAuthHandler = new BasicAuthSecurityHandler(); basicAuthHandler.setUserName("username"); basicAuthHandler.setPassword("password"); config.handlers(basicAuthHandler); It still fail with a HTTP 407 error code : Proxy Authentication Required. I've googled the best I could, nothing came up better to consume a webservice from a Java client through a web proxy, if someone has another idea, feel free to respond

    Read the article

  • Software firewall used in network

    - by user45019
    Hi, I have a medium sized organization with users between 300-500 users. I am looking for software firewall for this type orgnization. Which type of software do you guys prefer, am not looking for hardware firewall...Can u suggest me some names of software firewall for this kind of organization. thanks, Gary

    Read the article

  • Is FreeBSD more suitable than CentOS for firing 40k concurrent connections (for Jmeter)?

    - by blacklotus
    Hi, I am trying to run Jmeter to simulate 40k concurrent users and stress test a particular system. Putting aside the possibility that Jmeter may not be able to push such a high number (although I have read that it is at least possible to handle 10k concurrent threads on a very powerful machine), is FreeBSD a more suitable OS as compare to CentOS to be used for my Jmeter machine for handling 40k (or as high as possible) of concurrent outbound connections? Reason for asking this is that, I have found articles on FreeBSD for tuning and optimizing for maximum outbound connections, but seem to have little luck with CentOS. It makes me wonder if for some specific reasons, people don't use CentOS for such high number of outbound connections. Personally however, I am more familiar with CentOS and would like to stick with it if possible. Any input is greatly appreciated!

    Read the article

  • Storage server 2003 shadow copy backups deleted

    - by Aceth
    Hi there We have a 1TB storage server I've just gone to transfer a 100Gb file across to it. And it has deleted the shadow copy. From Googling I understand that this probably occurred: http://support.microsoft.com/kb/826936 Is there any way of recovering those shadow copies back? Thank you very much for having a read anyhow and any help would be greatly appreciated.

    Read the article

  • Making one of the folders default in Apache

    - by OmerO
    Hello, The file & directory structure of my website is as follows: /Library/WebServer/mysite/joomla .. /Library/WebServer/mysite/wiki .. /Library/WebServer/mysite/forum .. /Library/WebServer/mysite/index.php As you see, there are various applications each residing in separate folders. Now, in order to define this structure, I have made this entry in Apache http-vhosts.config file: ServerName mysite.com DocumentRoot "/Library/WebServer/mysite" ** And I already have the DirectoryIndex defined: DirectoryIndex index.html index.php, and so on. So far so good but I want this specific functionality: When someone visits mysite, he/she should automatically directed to: /Library/WebServer/mysite/joomla (and therefore /Library/WebServer/mysite/joomla/index.php) I don't want to achieve that functionality by putting a redirection code inside /Library/WebServer/mysite/index.php or /Library/WebServer/mysite/index.htm because that causes time delays (because of the redirection, of course) But in this case, the only proper way of achieving it seems to set DocumentRoot this way: DocumentRoot "/Library/WebServer/mysite/joomla" But when I set it that way, then the other folders (/wiki, /forum, etc.) are simply not served by Apache. To work around it, I put directives like: Alias /wiki /Library/WebServer/mysite/wiki .. Alias /forum /library/WebServer/mysite/forum and it did work actually the way I wanted. But... I still cannot use it that way because in this case I just couldn't manage to make the wiki use Short URLs (as described in link text) So, I have to set the DocumentRoot back to /Library/WebServer/mysite and shoud be able to assign /Library/WebServer/mysite/joomla as the "default directory" (my own terminology :) Can I do it in Apache? Is there any other way you might suggest? Thanks.

    Read the article

  • Are there videocamera which geotag individual frames?

    - by Grzegorz Adam Hankiewicz
    I'm looking for a way to record live video with the specific requirement of having each frame georeferenced with GPS. Right now I'm using a normal video camera with a PDA+GPS that records the position, but it's difficult to sync both of these plus sometimes I've forgotten to turn the PDA+GPS or it has failed for some reason and all my video has been useless. Using google I found that about two years ago a company named Seero produced such video cameras and software, but apparently the domain doesn't exist any more and I only find references of other pages mentioning it. Does somebody know of any other product? I need to record this video in HD and have some way to export to Google Maps or other GIS software the positions of the frames in a way that I can click on the map and see what was being recorded in the video at that point. The precission of the GPS tracking is good enough as one position per second, intermediate frames of the video stream can be interpolated.

    Read the article

  • What exactly is the Nullify delete rule doing?

    - by dontWatchMyProfile
    Does that mean that if I delete an managed object which has references (relationship) to some others, the relationships are removed to those others? Example: objectA references objectB and objectC. objectA gets deleted, it's relationship to objectB and objectC is set to the Nullify rule. What happens in detail?

    Read the article

  • Regex pattern failing

    - by Scott Chamberlain
    I am trying to strip out all things that are in a string that are not a letter number or space so I created the regex private static Regex _NonAlphaChars = new Regex("[^[A-Za-z0-9 ]]", RegexOptions.Compiled); however When I call _NonAlphaChars.Replace("Scott,", ""); it returns "Scott," What am I doing wrong that it is not matching the ,?

    Read the article

  • Fluent Nhibernate mapping related items

    - by Josh
    I am trying to relate 2 items. I have a table that is simply an Id field, and then 2 columns for the Item Id's to relate. I want it to be a 2 way relationship - that is, if the items appear twice in the table, I only want one relationship connection back. So, here's my item: public class Item { public virtual Guid ItemId {get; set;} public virtual string Name {get; set;} public virtual IList<Item> RelatedItems {get; set;} } The table for relating the items looks like this: CREATE TABLE RelatedItems ( RelatedItemId uniqueidentifier NOT NULL, ItemId uniqueidentifier NOT NULL, RelatedId uniqueidentifier NOT NULL, CONSTRAINT PK_RelatedItems PRIMARY KEY CLUSTERED (RelatedItemId) ) What is the best way to map this connection?

    Read the article

  • Modifying The Published Name Of A Program?

    - by Soo
    I have a C# program that I poorly named when I first started it and want it changed now. I've changed the solution name, but that doesn't appear to change what the program is named when it publishes. My question is how to change the publish name.

    Read the article

  • Date in textboxes changing format

    - by AWinters
    I have an application (asp.net 3.5) that support 4 different languages. Along with other cultural changes, the date formats must match the current culture on out reporting pages. We set the date formats of each of the textboxes like: string date = DateTime.Today.ToString("d"); //returns the date portion only textbox1.Text = date; textbox2.Text = date; etc... When the user selects Spanish or British English the format should be dd/mm/yyyy. However, then I navigate to the page it displays in mm/dd/yyyy. After a postback it then displays dd/mm/yyyy. After another postback it switches to the mm/dd/yyyy format and on and on. I have debugged through this and I see that the culture is correct for the application and the date formats are returned to me correctly, yet when it displays, it displays incorrectly. Has anyone ever seen this or know what is happening?

    Read the article

  • .NET Framework Version Dependency Bootstrapper

    - by Chris
    About 30 minutes of troubleshooting and it turns out the person who's trying to use my application doesn't have the newest version of the .NET Framework. Is there a way or some KB article which describes a best practice for informing users who are not updated or do not have the framework, etc. Thanks!

    Read the article

  • What's does "cardinality of an relationship" mean in Core Data?

    - by dontWatchMyProfile
    From the docs: If all of a managed object's relationship delete rules are Nullify, then for that object at least there is no additional work to do (you may have to consider other objects that were at the destination of the relationship—if the inverse relationship was either mandatory or had a lower limit on cardinality, then the destination object or objects might be in an invalid state). Does someone have an example of this cardinality thing? What's this good for and what's important to know about this? (sounds very important...)

    Read the article

  • QT, LGPL, Commercial closed-source application

    - by user364730
    We have a commercial windows application making use of QT. I'll be very simplistic in my description as I must have a clear answer. At compile time we use QT *.LIB files We have a result of our compilation is an *.EXE file, we wrap into an installer and ship to clients. This *.EXE files depends on *.DLL files in QT. at runtime the *.DLL files of QT are used My questions are: 1) is can I legally bundle the QT *.dll files in my installer? 2) can I legally bundle my final *.EXE files even if it's compilation/linkage depends on QT *.LIB files Thank you

    Read the article

< Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >