robots - Page 12 - Developer IT

How to prevent majestic 12 from indexing a site

- by matnagel

We are seeing a lot of traffic and server load on a web server. All I can find is majestic12 accessing pages all the time. How can I prevent majestic12 from indexing the site? Do they respect robots.txt entries, and how do I write such an entry?
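A minimal sketch of the kind of robots.txt entry the question asks about. It assumes the crawler identifies itself with the user-agent token `MJ12bot`, which is an assumption here (the excerpt only says "majestic12"). Python's standard `urllib.robotparser` is used only to show how a compliant parser would read such rules; whether a given crawler honours them is a separate matter.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The "MJ12bot" token is an assumption,
# not something stated in the question.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("MJ12bot", "http://example.com/page.html"))
    print(is_allowed("SomeOtherBot", "http://example.com/page.html"))
```

The empty `Disallow:` line in the second group leaves every other crawler unrestricted, so only the named agent is shut out.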

Read the article

How to limit the number of concurrent CGI script invocations in Apache 2.2?

- by hsivonen

How can I limit the number of concurrent CGI invocations in Apache 2.2.x? More specifically, my problem is this: I have Apache hosting a Bugzilla instance and other stuff on one server. There's very little legitimate concurrent use of Bugzilla. However, it's trivial to mount a Denial of Service attack on the whole server by ignoring robots.txt and simply fetching a lot of bug pages that fork a process and hit a database.
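The underlying idea, capping how many expensive handlers run at once and refusing the rest, can be sketched outside Apache. The following is Python, not Apache configuration, and the class name `ConcurrencyGate` and the 503-style refusal are hypothetical choices made for illustration.

```python
import threading


class ConcurrencyGate:
    """Admit at most `limit` callers at once; refuse the rest immediately."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)

    def run(self, handler):
        """Run handler() if a slot is free, else return a 503-style refusal."""
        if not self._slots.acquire(blocking=False):
            # No free slot: shed the request instead of forking more work.
            return 503, "busy, try again later"
        try:
            return 200, handler()
        finally:
            self._slots.release()
```

Refusing immediately, rather than queueing, keeps a flood of requests from piling up behind the slow handlers, at the cost of occasionally turning away a legitimate user.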

Read the article

Response code for Chinese spiders? [closed]

- by pt2ph8

My server is being "attacked" by Chinese spiders that don't respect the rules in my robots.txt. They are being very aggressive and using a lot of resources, so I'm going to set up some rules in nginx to block them by user agent. Question: which response code should I return, 403, 444 (empty response in nginx) or something else? I'm wondering how the spiders will react to different status codes. What's the best practice?

Read the article

NightHacking with James Gosling

- by Yolande Poirier

Java Evangelist Stephen Chin is back on the road for a new NightHacking Tour. He is meeting with James Gosling at Kona, Hawaii, the launch base of the Wave Glider. The Glider is an aquatic robot which communicates real-time data from the surface of the ocean. It runs on an ARM chip using Java SE Embedded. "During this broadcast we will show some of the footage of his aquatic robots, talk through the technologies he is hacking on daily, and do Q&A with folks on the live chat," explains Stephen Chin. Sign up for the live stream on Wednesday, October 23rd at 8AM Hawaii Time, 11AM PST, 2PM EST, 20:00 CET. Follow @nighthackingtv for the next NightHacking events.

Read the article

How does Google handle site traffic in Google Analytics?

- by Hamidreza

I have a site at www.example.com and have added the Google Analytics JavaScript snippet to it. I have also built an app for the site (written in C#, using the .NET web browser control), and every time a user opens the app I want it to visit the site in the app's built-in browser. The app will load www.example.com/appvisit, a page that contains only the Google Analytics script and nothing else. I also want to disallow /appvisit in my robots.txt file. Is there any problem with doing this? Will Google crawl the /appvisit directory? Does Google frown on this practice? And will Google treat this traffic as genuine, normal traffic? Thanks.

Read the article

Exclamation mark for sitemaps in Webmaster Tools when resubmitted

- by Jayapal Chandran

Hi, I have three sitemaps submitted to Webmaster Tools. One of them has very few links and was accepted; it showed a green tick. The other two have around 150 links each. I think they have been accepted, yet Webmaster Tools displays an exclamation mark. I think I have seen this before, but what confuses me is that my hosting provider was recently blocking frequent bots with a firewall and has only just added Google's IP range to its whitelist. After that, my robots.txt and sitemaps were read by Webmaster Tools, but two sitemaps still show the exclamation mark. I hope it has nothing to do with the above problem. What are the possible reasons, and where can I see the reason for the exclamation mark? Here is the screenshot.

Read the article

32 Stunning Movie Tributes in LEGO

- by Jason Fitzpatrick

These Sci-Fi LEGO tributes are an impressive combination of time, money, and a whole lot of LEGO bricks. Read on to see everything from Death Star hangars to adorable robots. Over at Dvice, a SyFy channel blog, they’ve rounded up 32 impressive movie tributes crafted entirely in LEGO bricks. The model seen above, for example, is composed of 30,000 bricks and is over six feet on a side. Planning on building your own? You’d better have $2,300 to blow on bricks and six months of spare time to invest. Hit up the link below for more LEGO tributes. 32 Fan-Built LEGO Tributes to Science Fiction [Dvice]

Read the article

Thousands of 404 errors in Google Webmaster Tools

- by atticae

Because of a former error in our ASP.NET application, created by my predecessor and undiscovered for a long time, thousands of wrong URLs were created dynamically. Normal users did not notice it, but Google followed these links and crawled its way through the incorrect URLs, creating more and more wrong links. To make it clearer: the URL example.com/folder should have created the link example.com/folder/subfolder but was creating example.com/subfolder instead. Because of bad URL rewriting, this was accepted and by default showed the index page for any unknown URL, creating more and more links like example.com/subfolder/subfolder/.... The problem has since been resolved, but now I have thousands of 404 errors listed in Google Webmaster Tools, which were discovered 1 or 2 years ago, and more keep coming up. Unfortunately the links do not follow a common pattern that I could deny for crawling in robots.txt. Is there anything I can do to stop Google from trying those very old links and to remove the already listed 404s from Webmaster Tools?

Read the article

How do I block a user-agent from Apache

- by rubo77

How do I implement a user-agent block by regular expression in the config files of my Apache web server? For example, I would like to block from Apache on my Debian server all bots whose user agent matches the regular expression /\b\w+[Bb]ot\b/ or /Spider/. Those bots should not be able to see any page on my server, and they should appear in neither the access logs nor the error logs. http://global-security.blogspot.de/2009/06/how-to-block-robots-before-they-hit.html suggests using mod_security for that, but isn't there a simple directive for httpd.conf?
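Before wiring such patterns into a server, it can help to check what they actually match. The sketch below combines the two expressions quoted in the question and tries them against a few sample user-agent strings; the sample strings and the function name `is_blocked` are illustrative assumptions, and Python's `re` module stands in for whatever regex engine the server uses.

```python
import re

# The two patterns quoted in the question, joined with "or".
BOT_PATTERN = re.compile(r"\b\w+[Bb]ot\b|Spider")


def is_blocked(user_agent: str) -> bool:
    """Return True if the user-agent string matches either pattern."""
    return BOT_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "ExampleSpider/1.0",
        "Baiduspider/2.0",
        "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0",
    ]
    for ua in samples:
        print(is_blocked(ua), ua)
```

Note that `Spider` is case-sensitive as written, so an agent that spells it with a lowercase "s" would not be caught unless the match is made case-insensitive.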

Read the article

Would using AJAX only "Add to Cart" buttons be wise?

- by Alex Erwin

I want to AJAX-enable all of my Add to Cart buttons because search engine bots are indexing them and not paying attention to my robots file or sitemap. I just don't want to lose potential customers. I have seen a number of top sites, including Amazon, relying heavily on JavaScript to serve content; is it OK to follow the trend? The rest of my site degrades gracefully, but I would really like to implement this because of the benefits to the customer (instant satisfaction), my infrastructure (constant page rebuilds), and allowing me to use SEO tools to optimize without the tool picking up thousands of "Add to Cart" widgets in my catalog. Thanks

Read the article

Tears of Steel [Short Movie]

- by Asian Angel

In the future a young couple reach a parting of the ways because the young man can not handle the fact that she has a robotic arm. The bitterness of the break-up and bad treatment from her fellow humans lead to a dark future 40 years later where robots are relentlessly hunting and killing humans. Can the man who started her down this dark path redeem himself and save her or will it all end in ruin? TEARS OF STEEL – DOWNLOAD & WATCH [Original Blog Post & Download Links] Tears of Steel – Blender Foundation’s fourth short Open Movie [via I Love Ubuntu]

Read the article

Can robots.txt fix all soft 404s?

- by olo

I have many soft 404s reported in Google Webmaster Tools, and those pages no longer exist, so I cannot insert <meta name="robots" content="noindex, nofollow"> into them. I have been searching for a while without finding any useful clues. About 100 URLs are reported as soft 404s, and redirecting them one by one would take too much time. If I just add those URLs to robots.txt like this:

User-agent: *
Disallow: /mysite.asp
Disallow: /mysite-more.html

will that reliably fix all the soft 404s? Or is there a way to turn all the soft 404s into hard 404s? Please give me some suggestions. Many thanks.
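The "hard 404" the question asks about generally means answering with a real 404 status rather than a success status on a page that merely says the content is gone. A minimal sketch of that behaviour follows, written as a Python WSGI app rather than for the ASP stack the site appears to use; the set `REMOVED_PATHS` reuses the two paths from the question, and the helper `status_for` is a hypothetical name used only to exercise the app.

```python
from wsgiref.util import setup_testing_defaults

# Paths taken from the question; treat the set as illustrative.
REMOVED_PATHS = {"/mysite.asp", "/mysite-more.html"}


def app(environ, start_response):
    """Answer removed paths with a real 404 status instead of a 200 page."""
    path = environ.get("PATH_INFO", "/")
    if path in REMOVED_PATHS:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"OK\n"]


def status_for(path: str) -> str:
    """Call the app once for `path` and return the status line it sent."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = []
    app(environ, lambda status, headers: captured.append(status))
    return captured[0]
```

A single lookup table like this avoids writing one redirect per URL, which is the cost the question is trying to avoid.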

Read the article

Do backlinks to blocked content add value?

- by David Fisher

We've been debating the following SEO question at our office: If you block bot access to a page either via robots.txt or on-page noindex metadata, does that negate the value of any backlinks to that page? We have a client who wants to block some event booking form pages from being indexed as each booking form page has a unique URL parameter and the pages are "clogging up" the Google index; however lots of websites link to those booking form pages and we wouldn't want to lose the value of those links. Any opinions welcomed.

Read the article

Recovering a website

- by Jessica

I found my website in the Wayback Machine a few months ago, but today I've tried again and now it tells me it can't find robots.txt. My old webhost stopped paying for their servers back in August without any notice. I was going to do a backup the day it happened. Is there a way just to find the text? I have the old IP, images, but nothing else. None of the big search engines have caches anymore, and I already looked in the cache of three of my Macs with nothing to be found.

Read the article

Why are the tags on my site using wordpress being indexed instead of the page?

- by Bernard

I can't figure out why my tags are being indexed by Google and not my actual posts. In Google, my posts are showing up as mysite.com/tags/post, and of course I want them to look like mysite.com/category/actualpost. Any ideas what could be wrong? My domain is 3 years old and I just started a new focus for an existing site. I can't figure this out! There is no duplicate content, I have a sitemap submitted to Webmaster Tools and a robots.txt... I have everything I need. This is the first time something like this has happened to me. Let me know if anyone has any ideas.

Read the article

When will Google restore page rank after I cleared a network unreachable error?

- by Jayapal Chandran

For the past month I was getting a network unreachable error. I contacted my web host, and they said that Google's bots were being blocked when they caused too much traffic. They then whitelisted Google's bots. The errors no longer appear, but my ranking dropped: my search results fell back by more than 6 pages or do not appear at all. As of yesterday, Google is able to read my robots.txt and sitemap again. When will my search results and page rank return to where they were a month ago? Most of my links do not appear in Google search results.

Read the article

Removing existing filtered pages from Google's index: noindex / 301 / canonical to non-filtered page?

- by Noam

I've decided to remove some of my site's pages from the Google index to focus more of the indexed pages on higher quality pages. The pages I'm going to remove are already in the index. These are filtered pages which will continue to exist; I just don't want them in the Google index because they add little quality over the same page without any filter selected. In Webmaster Tools I've specified "narrows" for the parameters that set these filters, but this doesn't seem to change anything in how Google handles these pages. So I'm considering three options:

1. Adding <meta name="robots" content="noindex" /> to the HTML header of these filtered pages.
2. A 301 redirect to the non-filtered page that contains the most similar information and will remain in the index.
3. A canonical tag, which I'm not sure is exactly the mainstream use case, as these aren't really the same pages.

Which should I use?

Read the article

Google I/O 2010 - Google Wave Media APIs

Google I/O 2010 - Google Wave Media APIs: Attachments can surf too! Wave 201. Seth Covitz, Jimin Li, Phil Liao. Google Wave is used by diverse groups to communicate and collaborate on projects from work to school to plain old having fun. To make users even more productive, we are providing capabilities that enable them to collaborate on and around any piece of third-party content (e.g. attachments). In this session, we will introduce the Wave Media APIs which enable robots and gadgets to create, access, and modify third-party content in Wave. For all I/O 2010 sessions, please go to code.google.com. From: GoogleDevelopers. Time: 41:04.

Read the article

RewriteRule not working at server level?

- by Alexis Wilke

I wanted to forbid some robots from doing certain things to my websites and decided to add a RewriteRule for that purpose. The rule works when placed in one of my <VirtualHost *:80> blocks and looks like this:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} libwww-perl
RewriteCond %{REQUEST_METHOD} POST
RewriteRule . - [F,L]

However, I wanted to apply it to all my websites instead of just one of them. So, with the newest version of the Apache2 settings layout, I decided to put that code in the security.conf file. This file is defined under /etc/apache2/conf-available/... (and yes, I have a symlink from the /etc/apache2/conf-enabled/... directory). However, if the definition is only in the conf-available/security.conf file, it somehow gets ignored. According to the documentation, these Rewrite* directives all work at server level! Any idea what I might be missing?
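What the quoted rule is meant to do can be restated as a small Python sketch: consecutive RewriteCond lines are combined with a logical AND by default, and the `F` flag answers with 403 Forbidden. The function name `response_status` and the 200 returned for requests that pass through are hypothetical stand-ins; this mirrors only the rule's logic, not Apache's configuration handling.

```python
import re

# Patterns copied from the two RewriteCond lines in the question.
# Like RewriteCond patterns, they are unanchored regular expressions.
UA_PATTERN = re.compile(r"libwww-perl")
METHOD_PATTERN = re.compile(r"POST")


def response_status(user_agent: str, method: str) -> int:
    """Return 403 when both conditions match, otherwise 200 (pass-through)."""
    if UA_PATTERN.search(user_agent) and METHOD_PATTERN.search(method):
        return 403
    return 200


if __name__ == "__main__":
    print(response_status("libwww-perl/6.05", "POST"))
    print(response_status("libwww-perl/6.05", "GET"))
    print(response_status("Mozilla/5.0", "POST"))
```

Separately, the Apache mod_rewrite documentation notes that rewrite configuration in the main server context is not inherited by virtual hosts by default, which may be related to the behaviour described.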

Read the article

Does a "nofollow" attribute on a link prevent URL discovery by search engines?

- by Stephen Ostermiller

I know that nofollow prevents link juice from being passed across a link. But if search engine robots discover a link with a nofollow on it, will they add that link to their crawl queue? In other words, if I create a link to a brand new page and put a rel=nofollow attribute on that link, will it prevent search engine bots (particularly Googlebot) from crawling the page? (Assume that this link remains the only link into that page.) I've read conflicting reports about this over the years and I'm looking for authoritative references about the current state of affairs. Official statements from Google or published results of independent testing would be ideal.
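To make the distinction concrete, here is a sketch of how a hypothetical crawler could separate nofollow links from ordinary ones while parsing a page. It says nothing about what Googlebot or any real search engine does with the second list, which is exactly the open question; the class name `LinkCollector` and the sample markup are assumptions for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect <a href> targets, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followable = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel is a space-separated token list, e.g. rel="nofollow noopener".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followable.append(href)


if __name__ == "__main__":
    collector = LinkCollector()
    collector.feed('<a href="/a">A</a> <a rel="nofollow" href="/new-page">New</a>')
    print(collector.followable, collector.nofollow)
```

The crawler still sees the nofollow URL at parse time; whether it then queues that URL is a policy decision made after this step.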

Read the article

Why is Google Webmaster Tools crawling invalid URLs and showing 500 errors?

- by Amos Kane

Google Webmaster Tools is reporting 12k+ 500 errors. Eeek! None of the URLs are valid; they all contain www.youtube.com. First, why is Google crawling these URLs if they don't exist? I supplied a sitemap, and they are of course not in the sitemap. I don't have a robots.txt blocking anything. I've checked for invalid redirects (none), and checked for unclosed tags or something that would throw www.youtube.com into the URL by accident (none). In every 'linked from', the referring URL is also a bad URL, with www.youtube.com in it. The Google tools report no malware, and I can't check the server logs because the host won't give me access. Really stuck!! Any ideas appreciated!

Read the article

Duplicate content in Top Level Domain and country specific website

- by Ando

I have myproduct.com, which is my master product page. For the UK I also own myproduct.co.uk, which is a copy of myproduct.com with some localized content: landing page, promotions, prices, and specific tags. But there is also duplicate content: myproduct.com/FAQs/ is the same as myproduct.co.uk/FAQs/. I don't want to redirect from myproduct.co.uk/FAQs/ to myproduct.com/FAQs/, as I don't want people to leave the localized website. myproduct.com/FAQs/ is my "go-to" FAQ page and is the most likely to be up to date, so I want this page to be indexed by search engines, whereas I don't care about myproduct.co.uk/FAQs/ being indexed (unless indexing this page would increase my page rank :) ). What should I do to be SEO friendly and SEO optimal? Stop indexing of myproduct.co.uk/FAQs/ via robots.txt? Configure rel="alternate" hreflang="x" on both /FAQs/ pages? Something else?

Read the article

Maker Faire 2012 Attendees build with Java Technology

- by hinkmond

Looks like Daniel Green, systems engineer from Oracle, and the panel of Java experts had a successful Java Technology booth at this year's Maker Faire 2012. See: Maker Faire 2012 adds Java Here's a quote: "We made a huge impact for Java and Oracle, creating positive perception, building brand awareness, and introducing fun and engaging ways for future technologists to learn Java programming," says Michelle Kovac, Oracle director, Java Marketing and Operations. Good stuff, considering all the future developers of exploding robots and fire-breathing dragon metal sculptures attend the Maker Faire. They can blow up stuff with Java technology just as effectively as other programming languages. Hinkmond

Search Results

Search found 499 results on 20 pages for 'robots'.

