Search Results

Search found 16682 results on 668 pages for 'search engines'.


  • Interconnect nodes in a Java distributed infrastructure for tweet processing

    - by David Moreno García
    I'm working on a new version of an old project that I used to download and process user statuses from Twitter. The main problem of that project was its infrastructure. I used multiple instances of a Java application (trackers) to download from Twitter given a specific task (basically terms to search for), connected to a central node (a web application) that had to process all tweets once per day and generate a new task for each tracker every 15 minutes. The central node also had to monitor all trackers and enable/disable them at the user's request. This, as I said, was too slow because I had multiple bottlenecks, so in this new version I want to improve the infrastructure and isolate each piece of functionality in its own node. I also need a good notification system to receive notifications for any node.

    In the next diagram I show the components that I'll need in this new version. As you can see, there are more nodes. Here are some notes about them:

      Dashboard: Controls tracker statuses and sends a single task to each of them (at the user's request). A tracker will use this task until it is replaced with a new one (if that ever happens, not every 15 minutes like before).
      Search engine: I need to store all the tweets. They are first stored in a local database for each tracker, but after that I'm thinking of using something like Elasticsearch to be able to do fast searches.
      Tweet processor: Just an isolated component with its own database (maybe something like the search engine, to have fast access to the info generated by the module). In the future more could be added.
      Application UI: A web application that shares a database with the Dashboard (mainly to store user information and preferences). Indeed, both could be merged into a single web application. The main difference from the previous version of the project is that now they will be isolated and will only show information and send requests. I will not do any heavy task in them (like processing tweets, as I did before).

    So, having these components, my main headache is how to structure everything so that I don't have to rewrite a lot of code every time I need to access new data. Another headache is how to interconnect the nodes. I could use sockets but that is a pain in the ass. Maybe a REST layer? And finally, if all the nodes are isolated, how can I generate notifications for each user when the user info is only in the database used by the Application UI? I'm programming this using Java and Spring (at least I used them in the last version) but I have no problem with changing the language if I can take advantage of a tool/library/engine that makes my life easier and gives me a better platform. Any comment will be appreciated.


  • Google Cache showing wrong URL

    - by Sathiya Kumar
    I looked up the cache details of the URL http://property.sulekha.com/pune-properties, but Google Cache shows the details for property.sulekha.com instead. I don't know why. It happens not only for http://property.sulekha.com/pune-properties but also for all the other Indian city URLs, such as http://property.sulekha.com/chennai-properties , http://property.sulekha.com/mumbai-properties , http://property.sulekha.com/kolkata-properties etc. I don't even find these URLs in the Google search results. If I search for "Chennai properties" in Google, I find property.sulekha.com and not http://property.sulekha.com/chennai-properties . Why is this happening? Please let me know.


  • Is my robots.txt working as it should?

    - by TigerBlood
    I want crawlers to have access to http://www.example.com but not http://www.example.com/ My robots.txt is as follows:

        User-agent: *
        Allow: /$
        Disallow: /

    My site is in Google search results, but I am not coming up in Bing, Yahoo, etc. I have had the same robots.txt since last year. I initially requested inclusion about a year ago and have resubmitted the URL to those other search engines several times since. Is my robots.txt blocking those other crawlers? And if so, why not Google as well? Thanks in advance!
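
The behaviour asked about above depends on how each crawler interprets `Allow: /$`. The following is a minimal sketch, not any search engine's actual parser. It assumes two interpretations: a wildcard-aware one (`*` matches any characters, a trailing `$` anchors the end of the path, the longest matching rule wins and Allow wins ties) and a prefix-only one that treats every rule as a literal prefix. The function names are hypothetical, and which interpretation a given crawler uses should be checked in that crawler's documentation.

```python
import re

# The two rules from the question, in file order.
RULES = [("allow", "/$"), ("disallow", "/")]

def _to_regex(pattern):
    # '*' matches any run of characters; a trailing '$' anchors the end of the path.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def allowed_with_wildcards(path, rules=RULES):
    # Assumed policy: the longest matching pattern wins, Allow wins a tie.
    best = None
    for kind, pattern in rules:
        if _to_regex(pattern).match(path):
            key = (len(pattern), kind == "allow")
            if best is None or key > best[0]:
                best = (key, kind)
    return best is None or best[1] == "allow"

def allowed_prefix_only(path, rules=RULES):
    # Assumed policy: the first rule whose literal text prefixes the path decides.
    for kind, pattern in rules:
        if path.startswith(pattern):
            return kind == "allow"
    return True

for path in ["/", "/page"]:
    print(path, allowed_with_wildcards(path), allowed_prefix_only(path))
```

Under these assumptions, a wildcard-aware crawler may fetch only the home page, while a prefix-only crawler never matches the literal `/$` and is blocked from the whole site by `Disallow: /`.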


  • Asynchronously returning hierarchical data using .NET TPL... what should my return object "look" like?

    - by makerofthings7
    I want to use the .NET TPL to asynchronously do a DIR /S, search each subdirectory on a hard drive, and look for a word in each file... what should my API look like? In this scenario I know that each subdirectory will have 0..10000 files or 0..10000 directories. I know the tree is unbalanced and want to return data (in relation to its position in the hierarchy) as soon as it's available. I am interested in getting data as quickly as possible, but also want to update that result if "better" data is found (better means closer to the root of c:). I may also be interested in finding all matches in relation to their position in the hierarchy (akin to a report).

    Question: How should I return data to my caller? My first guess is that I need a shared object that will maintain the current "status" of the traversal (started | notstarted | complete), and I might base it on System.Collections.Concurrent. Another idea that I'm considering is the consumer/producer pattern (which ConcurrentCollections can handle), however I'm not sure what the objects "look" like.

    Optional Logical Constraint: The API doesn't have to address this, but in my "real world" design, if a directory has files, then only one file will ever contain the word I'm looking for. If someone were to literally do a DIR /S as described above, they would need to account for more than one matching file per subdirectory.

    More information: I'm using Azure Tables to store a hierarchy of data using these TPL extension methods. A "node" is a table. Not only does each node in the hierarchy have a relation to any number of nodes, but it's possible for each node to have a reciprocal link back to any other node. This may cause issues with recursion, but I'm addressing that with a shared object in my recursion loop. Note that each "node" also has the ability to store local data unique to that node. It is this information that I'm searching for.

    In other words, I'm searching for a specific fixed RowKey in a hierarchy of nodes. When I search for the fixed RowKey in the hierarchy I'm interested in getting the results FAST (first node found) but prefer data that is "closer" to the starting point of the hierarchy. Since many nodes may have the particular RowKey I'm interested in, sometimes I may want to get a report of ALL the nodes that contain this RowKey.
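
One possible shape for the return object described above is a stream of `(depth, node)` results plus a completion marker. The sketch below uses Python's asyncio as a stand-in for the TPL, so it illustrates the pattern rather than a .NET API. The `TREE` dictionary, the `walk` and `search` names, and the sample nodes are all hypothetical, and the reciprocal links mentioned in the question (cycles) are assumed absent here, so no visited set is kept.

```python
import asyncio

# Hypothetical in-memory hierarchy: node -> (contains_row_key, children).
TREE = {
    "root": (False, ["a", "b"]),
    "a": (False, ["a1"]),
    "a1": (True, []),
    "b": (True, []),
}

async def walk(node, depth, results):
    # Producer: report a match as soon as it is seen, then descend concurrently.
    has_key, children = TREE[node]
    if has_key:
        await results.put((depth, node))
    await asyncio.gather(*(walk(child, depth + 1, results) for child in children))

async def search(root):
    results = asyncio.Queue()

    async def produce():
        await walk(root, 0, results)
        await results.put(None)  # sentinel: traversal complete

    producer = asyncio.create_task(produce())
    found, best = [], None
    # Consumer: each result is available immediately; a shallower one replaces best.
    while (item := await results.get()) is not None:
        found.append(item)
        if best is None or item[0] < best[0]:
            best = item
    await producer
    return best, sorted(found)

best, report = asyncio.run(search("root"))
print(best, report)
```

The consumer sees every match as it arrives (the "fast" answer), keeps the shallowest one (the "better" answer), and ends with the full list (the report), so one stream can serve all three needs.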


  • Why are we being twitter spammed?

    - by Tom Gullen
    This is a search relating to us: https://twitter.com/#!/search/realtime/scirra We're getting a lot of new accounts tweeting: The Layers Bar - Scirra.com Firstly, this is not us doing it, as we're quite proud of doing everything completely whitehat. Also, this tweet doesn't make any sense; "The Layers Bar" seems to refer to a manual entry of ours. They all seem to be new accounts with no followers and no prior tweets, coming in like clockwork every hour. Does anyone know why this could be happening? Could this harm us? Is it possible to find out the source of this? I should mention I'm hesitant to report them all as spam because it could look like we are the culprits.


  • SEO and Spelling mistakes in keyword

    - by Sushil
    I am about to register a domain name, say someone.com (with the proper spelling), with the keyword "SOMEONE" in mind. But then I discovered with the Google keyword research tool that the typo "SOME1" seems to be more popular, and people search for it significantly more often than the proper keyword. Luckily, someone.com and some1.com are both available. I understand that I can register both domains, but I don't know which one should host my website and which one should redirect. Should I make the typo "some1.com" my base site? But that's a typo. P.S. My site has fully relevant content and is not just a worthless keyword-targeted site. What do you guys suggest? I am confused. How would that affect my SEO ranking? EDIT: Because the competition for the keyword I am targeting is fairly low, I think that whichever domain I choose will appear on the first page of search results.


  • VPN disconnected: resolv.conf not refreshed

    - by cwall
    I connect to VPN using vpnc. When the VPN disconnects, either via timeout or because the session limit is reached, the VPN is terminated, but resolv.conf continues to contain references to my VPN network.

    resolv.conf before VPN is connected:

        nameserver 127.0.0.1
        search mylocalnetwork

    resolv.conf after VPN is connected, which remains once the VPN is lost:

        nameserver X.X.X.X
        nameserver X.X.X.Z
        nameserver 127.0.0.1
        search internal.mycompany.com mylocalnetwork

    In 10.04, when the VPN was lost, I'd run this script to refresh resolv.conf:

        $ cat bin/refreshResolvconf.sh
        #!/bin/bash
        #if [ -e /etc/resolvconf/run/interface/tun0 -a "`pidof vpnc`" == "" ]; then /sbin/resolvconf -d tun0; fi
        if [ -e /etc/resolvconf/run/interface/tun0 -a "`pidof vpnc`" == "" ]
        then
            /sbin/resolvconf -d tun0; echo "Refreshed resolv.conf"
        fi

    But resolvconf changed in 12.04, so this script is no longer applicable. To resolve it, I manually edit resolv.conf or turn my connection off and on via "gnome-control-center network". Does anyone else have the same problem? How can resolv.conf be updated after a VPN disconnect?


  • Restricting crawler activity to certain directories with robots.txt

    - by neimad
    I would like to use robots.txt to prevent indexing of some parts of my website. I want search engines to index only the / directory and not search inside my controllers. In my robots.txt, I have this:

        User-Agent: *
        Disallow: /compagnies/
        Disallow: /floors/
        Disallow: /spaces/
        Disallow: /buildings/
        Disallow: /users/
        Disallow: /

    I put this file in /mysite/public. I tested the file with a robots.txt validator and got no errors. However, Google always returns the result of my site. For testing, I added Disallow: /, but again, Google indexed all pages. floors, spaces, buildings, etc. are not physical directories. Is this a bug? How can I work around it?
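
The rules quoted above can be checked locally before waiting for a crawler to revisit. This is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the file without the final test line `Disallow: /`, and `example.com` is a placeholder host, since only the path takes part in the matching. Note that robots.txt controls crawling, so pages a search engine already knows about may stay listed for some time.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question, minus the temporary "Disallow: /" test line.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /compagnies/
Disallow: /floors/
Disallow: /spaces/
Disallow: /buildings/
Disallow: /users/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def crawlable(path):
    # example.com is a placeholder; the parser only compares the path.
    return parser.can_fetch("*", "http://example.com" + path)

for path in ["/", "/about", "/floors/3", "/users/42/edit"]:
    print(path, crawlable(path))
```

Because the matching is by path prefix, it makes no difference that floors, spaces, and the others are controller routes and not physical directories.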


  • Blogging & SEO - They Go Hand in Hand

    You write a blog loyally every day or so. You provide informative, fascinating substance for your faithful readers. You've even got a number of member links in there, too. But is this enough to produce great search engine results for your hard work? Probably not. Sure, you'll get listed with the search engines effortlessly. But without a top-twenty listing at one of the majors (Google, Yahoo! or MSN), you will not have traffic, literally, banging down your door....


  • Copy only folders not files?

    - by Shannon
    Is there a way to copy an entire directory, but only the folders? I have a corrupt file somewhere in my directory which is causing my hard disks to fail. So instead of copying the corrupt file to another hard disk, I want to copy just the folders, because I have scripts that search for hundreds of folders, and I don't want to have to create them all manually. I did search the cp manual, but couldn't see anything (I may have missed it). Say I have this structure on my failed HDD: dir1 files dir2 files files dir4 dir3 files All I want is the directory structure, not any files at all. So on the new HDD I'd end up with: dir1 dir2 dir4 dir3 Hoping someone knows some tricks!
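
One way to recreate only the folders, as asked above, is to walk the source tree and create each directory under the destination without touching any file. This is a minimal sketch using Python's standard library; the function name `copy_tree_structure` and the example paths are hypothetical.

```python
import os

def copy_tree_structure(src, dst):
    # Recreate every directory found under src beneath dst; no file is ever opened.
    for dirpath, _dirnames, _filenames in os.walk(src):
        relative = os.path.relpath(dirpath, src)
        target = os.path.normpath(os.path.join(dst, relative))
        os.makedirs(target, exist_ok=True)

# Example call with placeholder mount points:
# copy_tree_structure("/mnt/failed_hdd/data", "/mnt/new_hdd/data")
```

Since file contents are never read, a corrupt file on the source disk should not be copied along, although a directory entry that is itself unreadable could still make the walk skip that subtree.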


  • Webapps don't open correctly when using Chromium

    - by Alex
    I have just installed Ubuntu 12.10 completely fresh; the old version of Ubuntu was discarded or overwritten (or whatever you call it). I want to use the Ubuntu webapps with Chromium but I've had several problems. The first problem is that Chromium won't ask me if I want to install a webapp when I go to a supported site (and I don't already have the webapp installed). The second problem is that when I install the webapp by visiting the site in Firefox and then try to open it in Chromium, Ubuntu opens a completely new Chromium icon and window in the Launcher, and the icon is labeled "Untitled"; also, there is no search bar in the new window, only the tab at the top. I've tried using several webapps with Firefox set as the default browser and they work as expected: once the webapp icon is clicked, a Firefox window is opened on the Firefox launcher icon, and the window has a 'new tab' button and a search bar.


  • JavaOne Content Catalog Live!

    - by programmarketingOTN
    The JavaOne Content Catalog, the central repository for information on sessions, demos, labs, user groups, exhibitors, and more for San Francisco 2012, is live! In the Content Catalog you can search on tracks, session types, session categories, keywords, and tags. Or, you can search for your favorite speakers to see what they’re presenting this year. And, directly from the catalog, you can share sessions you’re interested in with friends and colleagues through a broad array of social media channels. Start checking out JavaOne content now to plan your week at the conference. Then you’ll be ready to sign up for all of your sessions in mid-July when the scheduling tool goes live. Happy browsing!


  • Extension Manager in Visual Studio 2010

    One of the powerful aspects of Visual Studio is its ability to be extended, and many people do that. You can find numerous extensions at the Visual Studio Gallery. The VSX team links to a 4-part blog series on how to create and share templates. You can also find extension examples on the vsx code gallery. With Visual Studio 2010, you can search for items and install them directly from within Visual Studio's new Extension Manager. You launch it from the Tools menu. When the dialog comes up, be sure to explore the various actionable areas on the left and also note the search on the right. For example, I typed "MP" and it quickly filtered the list to show me the MPI Project Template. Others have written about this before me, just bing Extension Manager (and note that Beta2 introduced changes, some of which you can witness in the screenshot above). Comments about this post welcome at the original blog.


  • Installing Cairo to get FastRWeb working for R gWidgetsWWW2 -pkg

    - by hhh
    I want to install FastRWeb for R but it requires Cairo. How can I install Cairo?

        compilation terminated.
        make: *** [xlib-backend.o] Error 1
        ERROR: compilation failed for package ‘Cairo’
        * removing ‘/home/xfz/R/i686-pc-linux-gnu-library/2.13/Cairo’
        ERROR: dependency ‘Cairo’ is not available for package ‘FastRWeb’
        * removing ‘/home/xfz/R/i686-pc-linux-gnu-library/2.13/FastRWeb’
        The downloaded packages are in ‘/tmp/Rtmpno8hhF/downloaded_packages’
        Warning messages:
        1: In install.packages("FastRWeb", , "http://rforge.net/", type = "source") :
          installation of package 'Cairo' had non-zero exit status
        2: In install.packages("FastRWeb", , "http://rforge.net/", type = "source") :
          installation of package 'FastRWeb' had non-zero exit status

    I cannot work out which Cairo is meant here; the search below returns 16 entries for this term. It is apparently some library.

        $ apt-cache search libcairo|wc
        16 132 996

    Perhaps related:

        http://stackoverflow.com/questions/9826128/r-making-r-rook-program-into-rscript-program-r
        http://stackoverflow.com/questions/9812547/r-gui-vizualiser-with-command-line-access-browser-based-letting-users-to-s

    Some related packages: FastRWeb and RServe for the gWidgetsWWW2 -pkg.


  • Is this Anti-Scraping technique viable with Crawl-Delay?

    - by skibulk
    I want to prevent web scrapers from abusing 1,000,000 pages on my website. I'd like to do this by returning a "503 Service Unavailable" error code for users that access an abnormal number of pages per minute. I don't want search engine spiders to ever receive the error. My inclination is to set a robots.txt crawl-delay which will ensure spiders access a number of pages per minute under my 503 threshold. Is this an appropriate solution? Do all major search engines support the directive? Could it negatively affect SEO? Are there any other solutions or recommendations?
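
The threshold idea above can be sketched as a sliding-window counter per client, together with the arithmetic that links a crawl-delay to a pages-per-minute ceiling. This is an illustration under assumptions, not a production rate limiter: the class name, the default limits, and the use of an IP string as the client key are all hypothetical. Crawl-delay is also a non-standard directive, and Google documents that Googlebot does not honor it, so the arithmetic applies only to crawlers that do.

```python
from collections import defaultdict, deque

class PageRateLimiter:
    """Sliding window: once max_pages hits fall inside `window` seconds, answer 503."""

    def __init__(self, max_pages=60, window=60.0):
        self.max_pages = max_pages
        self.window = window
        self.hits = defaultdict(deque)  # client id -> timestamps of recent hits

    def status_for(self, client, now):
        recent = self.hits[client]
        # Drop hits that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_pages:
            return 503
        recent.append(now)
        return 200

def max_pages_per_minute(crawl_delay_seconds):
    # A crawler that waits this long between requests fetches at most this many pages.
    return int(60 // crawl_delay_seconds)

limiter = PageRateLimiter(max_pages=3, window=60.0)
statuses = [limiter.status_for("203.0.113.7", t) for t in (0, 1, 2, 3)]
print(statuses, max_pages_per_minute(10))
```

With a crawl-delay of 10 seconds, a compliant crawler stays at or below 6 pages per minute, so any 503 threshold comfortably above that would not affect it.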


  • My First robots.txt

    - by Whitechapel
    I'm creating my first robots.txt and wanted to get a second opinion on it. Basically I have an FTP setup on my board for some special users to transfer files between each other, and I do NOT want that included in searches by the bots. I also want to point to my sitemap, which gets auto-generated by a PHP page. Here is what I have. What else should I include, and do I need to fix anything? Also, it's linking to xmlsitemap.php because that generates the sitemap when called. My goal is to allow any search bot to crawl the forums to grab metadata.

        User-agent: *
        Disallow: /admin/
        Disallow: /ali/
        Disallow: /benny/
        Disallow: /cgi-bin/
        Disallow: /ders/
        Disallow: /empire/
        Disallow: /komodo_117/
        Disallow: /xanxan/
        Disallow: /zeroordie/
        Disallow: /tmp/
        Sitemap: http://www.vivalanation.com/forums/xmlsitemap.php

    Edit: I'm not sure how to handle all the users' folders under /public_html/, since the robots.txt will be going in /public_html.
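
A quick local check of the file above can confirm that the Sitemap line is picked up and that the forums stay crawlable while the user folders do not. This is a small sketch with Python's standard `urllib.robotparser` (`site_maps()` requires Python 3.8 or later). The rule list is abridged to three of the Disallow lines, and the test paths such as `/forums/index.php` and `/ali/file.zip` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Abridged copy of the file from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /ali/
Disallow: /tmp/
Sitemap: http://www.vivalanation.com/forums/xmlsitemap.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "http://www.vivalanation.com"
sitemaps = parser.site_maps()
forums_ok = parser.can_fetch("*", base + "/forums/index.php")  # hypothetical path
ftp_ok = parser.can_fetch("*", base + "/ali/file.zip")         # hypothetical path
print(sitemaps, forums_ok, ftp_ok)
```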


  • Remove third/nth level domains from Google index

    - by drakythe
    Somehow Google has indexed some third (and fourth!) level domains that I had attached to my server temporarily, e.g. my.domain.root.com. I now have these redirected properly to where I would like them to go; however, with a carefully crafted search one can still find them, and I'd rather they not be exposed. My Google-fu has failed me in finding an answer, so I come to you wonderful folks: how do I remove sub-level domains from Google search results? I have the site added and verified in Google Webmaster Tools, but all the URL removal requests I can perform append the URL to the base URL as a path instead of prefixing it as a subdomain. And finally, how can I prevent this in the future?


  • Cleaning a dataset of song data - what sort of problem is this?

    - by Rob Lourens
    I have a set of data about songs. Each entry is a line of text which includes the artist name, song title, and some extra text. Some entries are only "extra text". My goal is to resolve as many of these as possible to songs on Spotify using their web API. My strategy so far has been to search for the entry via the API - if there are no results, apply a transformation such as "remove all text between ( )" and search again. I have a list of heuristics and I've had reasonable success with this but as the code gets more and more convoluted I keep thinking there must be a more generic and consistent way. I don't know where to look - any suggestions for what to try, topics to study, buzzwords to google?
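
The strategy described above (search, and on failure transform the entry and search again) can be kept manageable by holding the heuristics in an ordered list and applying them cumulatively. This is a minimal sketch under assumptions: `search` is a hypothetical callable standing in for the Spotify web API lookup, the transforms are examples and not a recommended set, and the catalog in the usage lines is made up for illustration.

```python
import re

# Ordered from least to most destructive; each entry is an illustrative heuristic.
TRANSFORMS = [
    lambda s: s,                              # try the raw entry first
    lambda s: re.sub(r"\([^)]*\)", " ", s),   # remove all text between ( )
    lambda s: re.sub(r"\[[^\]]*\]", " ", s),  # remove all text between [ ]
]

def normalize(text):
    # Collapse the whitespace left behind by removals.
    return " ".join(text.split())

def resolve(entry, search):
    """Apply the transforms cumulatively until `search` returns a hit."""
    query = entry
    for transform in TRANSFORMS:
        query = normalize(transform(query))
        if query and (hits := search(query)):
            return query, hits[0]
    return None

# Usage with a made-up catalog in place of the real API.
catalog = {"some artist - some song": "track-1"}

def fake_search(query):
    hit = catalog.get(query.lower())
    return [hit] if hit else []

result = resolve("Some Artist - Some Song (Radio Edit) [HQ]", fake_search)
print(result)
```

Adding a heuristic then means adding one line to the list, and the order documents which clean-ups are tried first. Terms that may help when searching for more general approaches include record linkage, entity resolution, and fuzzy string matching.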


  • How to prevent Google from finding my admin index page?

    - by krish
    I run a website, but a few days ago I took it down and put up an under-construction page because the index of the admin directory is visible to the outside world through Google search. A friend told me that my website's index is visible and that it is one step away from exposing the password file, and he showed me how easily it can be found with a Google search. How can I prevent this? My site is hosted by a hosting company. I reported this to them, but they simply replied that it is still secure and that I don't need to worry. Do I really not need to worry, and can I bring my site back with the admin index page still visible?


  • Flowchart for solving programming problems

    - by nurne
    I noticed that every developer implements a somewhat different flowchart for solving programming problems. By flowchart I mean a defined system of techniques that the developer goes through in a certain sequence, trying to solve the problem at hand. Some examples of techniques:

      - Google "how to..." or "... tutorial".
      - Search the java/msdn/apple/etc API doc for the specific class or method.
      - Search Stack Overflow for the exact problem with some tags like [iphone]/[java] etc.
      - Take a nap and let the subconscious work.
      - Debug.
      - Draw the algorithm or system.
      - Google the logged error message.
      - Ask a colleague or manager.
      - Ask a new question on Stack Overflow.

    From your experience, what is the best flowchart for solving a programming problem?


  • Installer File Size Reduced

    The DevExpress DXperience installation file will shrink to about half of its current size of 300+ megabytes! Starting with the v2010.1 installer, you'll have less to download. How? The documentation files have been removed from the DXperience installer. However, a separate installer with just the documentation will be available for download. Search.DevExpress.Com FTW! Personally, I prefer to use the online documentation available at http://search.devexpress.com. It's fast and up to date! Look for...


  • Google ranking - Modal views - google analytics events [duplicate]

    - by minchiya
    This question already has an answer here: How to diagnose a search engine ranking drop? I modified a site recently: I added many Google Analytics events to better understand user behaviour, and I also added two buttons to almost all the pages of the site. Those buttons show modal views (I am using Bootstrap) with questions asking for the user's opinion. After this modification, the site's ranking in Google search dropped from second place to the second page :( Is it the collected events or the added modal views? If the modal views are the reason, what is a better way to run similar surveys? Have you had a similar experience, or do you have an explanation for this? Perhaps it is the effect of the Panda 4 update. In that case, what can I look at to improve the site? How do I debug the problem and its causes?

