Search Results

Search found 1733 results on 70 pages for 'shutdown'.

Page 60/70 | < Previous Page | 56 57 58 59 60 61 62 63 64 65 66 67  | Next Page >

  • Need help running Python app as service in Ubuntu with Upstart

    - by GreeenGuru
    I have written a logging application in Python that is meant to start at boot, but I've been unable to start the app with Ubuntu's Upstart init daemon. When run from the terminal with sudo /usr/local/greeenlog/main.pyw, the application works perfectly. Here is what I've tried for the Upstart job: /etc/init/greeenlog.conf # greeenlog description "I log stuff." start on startup stop on shutdown script exec /usr/local/greeenlog/main.pyw end script My application starts one child thread, in case that is important. I've tried the job with the expect fork stanza without any change in the results. I've also tried this with sudo and without the script statements (just a lone exec statement). In all cases, after boot, running status greeenlog returns greeenlog stop/waiting and running start greeenlog returns: start: Rejected send message, 1 matched rules; type="method_call", sender=":1.61" (uid=1000 pid=2496 comm="start) interface="com.ubuntu.Upstart0_6.Job" member="Start" error name="(unset)" requested_reply=0 destination="com.ubuntu.Upstart" (uid=0 pid=1 comm="/sbin/init")) Can anyone see what I'm doing wrong? I appreciate any help you can give. Thanks.

    Read the article

  • How to asynchronously read to std::string using Boost::asio?

    - by SpyBot
    Hello. I'm learning Boost::asio and all that async stuff. How can I asynchronously read to variable user_ of type std::string? Boost::asio::buffer(user_) works only with async_write(), but not with async_read(). It works with vector, so what is the reason for it not to work with string? Is there another way to do that besides declaring char user_[max_len] and using Boost::asio::buffer(user_, max_len)? Also, what's the point of inheriting from boost::enable_shared_from_this<Connection> and using shared_from_this() instead of this in async_read() and async_write()? I've seen that a lot in the examples. Here is a part of my code: class Connection { public: Connection(tcp::acceptor &acceptor) : acceptor_(acceptor), socket_(acceptor.get_io_service(), tcp::v4()) { } void start() { acceptor_.get_io_service().post( boost::bind(&Connection::start_accept, this)); } private: void start_accept() { acceptor_.async_accept(socket_, boost::bind(&Connection::handle_accept, this, placeholders::error)); } void handle_accept(const boost::system::error_code& err) { if (err) { disconnect(); } else { async_read(socket_, boost::asio::buffer(user_), boost::bind(&Connection::handle_user_read, this, placeholders::error, placeholders::bytes_transferred)); } } void handle_user_read(const boost::system::error_code& err, std::size_t bytes_transferred) { if ( err or (bytes_transferred != sizeof(user_)) ) { disconnect(); } else { ... } } ... void disconnect() { socket_.shutdown(tcp::socket::shutdown_both); socket_.close(); socket_.open(tcp::v4()); start_accept(); } tcp::acceptor &acceptor_; tcp::socket socket_; std::string user_; std::string pass_; ... };

    Read the article

  • Any way to turn off quips in OOWeb?

    - by Misha Koshelev
    http://ooweb.sourceforge.net/tutorial.html Not really a question, but I can't seem to stop writing stuff like this. Maybe someone will find it useful. I know rewriting an HTTP server is not the way to turn off the quips ;) /* Copyright 2010 Misha Koshelev. All Rights Reserved. */ package com.mksoft.common; import java.io.BufferedReader; import java.io.InputStreamReader; import java.io.IOException; import java.io.PrintWriter; import java.io.UnsupportedEncodingException; import java.net.URLDecoder; import java.text.SimpleDateFormat; import java.util.Date; import java.util.LinkedHashMap; import java.net.ServerSocket; import java.net.Socket; /** * Simple HTTP Server. * * @author Misha Koshelev */ public class HttpServer extends Thread { /* * Constants */ /** * 404 Not Found Result */ protected final static String result404NotFound="<html><head><title>404 Not Found</title></head><body bgcolor='#ffffff'><h1>404 Not Found</h1></body></html>"; /* * Variables */ /** * Port on which HTTP server handles requests. */ protected int port; public int getPort() { return port; } public void setPort(int _port) { port=_port; } /* * Constructors */ public HttpServer(int _port) { setPort(_port); } /* * Helpers */ /** * Errors */ protected void error(String message) { System.err.println(message); System.err.flush(); } /** * Debugging */ protected boolean debugOutput=true; protected void debug(String message) { if (debugOutput) { error(message); } } /** * Lock object */ private Object lock=new Object(); /** * Should we quit? */ protected boolean doQuit=false; /** * Are we done? */ protected boolean areWeDone=false; /** * Process POST request headers */ protected String processPostRequest(String url,LinkedHashMap<String,String> headers,String inputLine) { debug("HttpServer.processPostRequest: url=\""+url); if (debugOutput) { for (String key: headers.keySet()) { debug("HttpServer.processPostRequest: headers."+key+"=\""+headers.get(key)+"\""); } } debug("HttpServer.processPostRequest: inputLine=\""+inputLine+"\""); try { inputLine=new URLDecoder().decode(inputLine,"UTF-8"); } catch (UnsupportedEncodingException uee) { uee.printStackTrace(); } String[] keyValues=inputLine.split("&"); LinkedHashMap<String,String> post=new LinkedHashMap<String,String>(); for (int i=0;i<keyValues.length;i++) { String keyValue=keyValues[i]; int equals=keyValue.indexOf('='); String key=keyValue.substring(0,equals); String value=keyValue.substring(equals+1); post.put(key,value); } return post(url,headers,post); } /** * Server loop (here for exception handling purposes) */ protected void serverLoop() throws IOException { /* Start server socket */ ServerSocket serverSocket=null; try { serverSocket=new ServerSocket(getPort()); } catch (IOException ioe) { ioe.printStackTrace(); System.exit(1); } Socket clientSocket=null; while (true) { /* Quit if necessary */ if (doQuit) { break; } /* Accept incoming connections */ try { clientSocket=serverSocket.accept(); } catch (IOException ioe) { ioe.printStackTrace(); System.exit(1); } /* Read request */ BufferedReader in=null; String inputLine=null; String firstLine=null; String blankLine=null; LinkedHashMap<String,String> headers=new LinkedHashMap<String,String>(); try { in=new BufferedReader(new InputStreamReader(clientSocket.getInputStream())); while (true) { if (blankLine==null) { inputLine=in.readLine(); } else { /* POST request, read Content-length bytes */ int contentLength=new Integer(headers.get("Content-Length")).intValue(); StringBuilder sb=new StringBuilder(contentLength); for (int i=0;i<contentLength;i++) { sb.append((char)in.read()); } inputLine=sb.toString(); break; } if (firstLine==null) { firstLine=inputLine; } else if (blankLine==null) { if (inputLine.equals("")) { if (firstLine.startsWith("GET ")) { break; } blankLine=inputLine; } else { int colon=inputLine.indexOf(": "); String key=inputLine.substring(0,colon); String value=inputLine.substring(colon+2); headers.put(key,value); } } } } catch (IOException ioe) { ioe.printStackTrace(); } /* Process request */ String result=null; firstLine=firstLine.replaceAll(" HTTP/.*",""); if (firstLine.startsWith("GET ")) { result=get(firstLine.replaceFirst("GET ",""),headers); } else if (firstLine.startsWith("POST ")) { result=processPostRequest(firstLine.replaceFirst("POST ",""),headers,inputLine); } else { error("HttpServer.ServerLoop: Unhandled request \""+firstLine+"\""); } debug("HttpServer.ServerLoop: result=\""+result+"\""); /* Send response */ PrintWriter out=null; try { out=new PrintWriter(clientSocket.getOutputStream(),true); } catch (IOException ioe) { ioe.printStackTrace(); } if (result!=null) { out.println("HTTP/1.1 200 OK"); } else { out.println("HTTP/1.0 404 Not Found"); result=result404NotFound; } Date now=new Date(); out.println("Date: "+new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss z").format(now)); out.println("Content-Type: text/html; charset=UTF-8"); out.println("Content-Length: "+result.length()); out.println(""); out.print(result); /* Clean up */ out.close(); if (in!=null) { in.close(); } clientSocket.close(); } serverSocket.close(); areWeDone=true; synchronized(lock) { lock.notifyAll(); } } /* * Methods */ /** * Run server on port specified in constructor. */ public void run() { try { serverLoop(); } catch (IOException ioe) { ioe.printStackTrace(); System.exit(1); } } /** * Process GET request (should be overwritten). */ public String get(String url,LinkedHashMap<String,String> headers) { debug("HttpServer.get: url=\""+url+"\""); if (debugOutput) { for (String key: headers.keySet()) { debug("HttpServer.get: headers."+key+"=\""+headers.get(key)+"\""); } } if (url.equals("/")) { return "<html><head><title>HttpServer GET Test Page</title></head>\r\n"+ "<body bgcolor='#ffffff'>\r\n"+ "<center><h1>HttpServer GET Test Page</h1></center>\r\n"+ "<hr />\r\n"+ "<center><table>\r\n"+ "<form method='post' action='/'>\r\n"+ "<tr><td align=right>Test 1:</td>\r\n"+ " <td><input type='text' name='text 1' value='test me !!! !@#$'></td></tr>\r\n"+ "<tr><td align=right>Test 2:</td>\r\n"+ " <td><input type='text' name='text 2' value='type smthng'></td></tr>\r\n"+ "<tr><td>&nbsp;</td>\r\n"+ " <td align=right><input type='submit' value='Submit'></td></tr>\r\n"+ "</form>\r\n"+ "</table></center>\r\n"+ "<hr />\r\n"+ "<center><a href='/quit'>Shutdown Server</a></center>\r\n"+ "</html>"; } else if (url.equals("/quit")) { quit(); return ""; } else { return null; } } /** * Process POST request (should be overwritten). */ public String post(String url,LinkedHashMap<String,String> headers,LinkedHashMap<String,String> post) { debug("HttpServer.post: url=\""+url+"\""); if (debugOutput) { for (String key: headers.keySet()) { debug("HttpServer.post: headers."+key+"=\""+headers.get(key)+"\""); } } if (url.equals("/")) { String result="<html><head><title>HttpServer Post Test Page</title></head>\r\n"+ "<body bgcolor='#ffffff'>\r\n"+ "<center><h1>HttpServer Post Test Page</h1></center>\r\n"+ "<hr />\r\n"+ "<center><table>\r\n"+ "<tr><th>Key</th><th>Value</th></tr>\r\n"; for (String key: post.keySet()) { result+="<tr><td align=right>"+key+"</td><td align=left>"+post.get(key)+"</td></tr>\r\n"; } result+="</table></center>\r\n"+ "</html>"; return result; } else { return null; } } /** * Wait for server to quit. */ public void waitForCompletion() { while (areWeDone==false) { synchronized(lock) { try { lock.wait(); } catch (InterruptedException ie) { } } } } /** * Shutdown server. */ public void quit() { doQuit=true; } }

    Read the article

  • Python 3.3 Webserver restarting problems

    - by IPDGino
    I have made a simple webserver in python, and had some problems with it before as described here: Python (3.3) Webserver script with an interesting error In that question, the answer was to use a While True: loop so that any crashes or errors would be resolved instantly, because it would just start itself again. I've used this for a while, and still want to make the server restart itself every few minutes, but on Linux for some reason it won't work for me. On windows the code below works fine, but on linux it keeps saying Handler class up here ... ... class Server: def __init__(self): self.server_class = HTTPServer self.server_adress = ('MY IP GOES HERE, or localhost', 8080) global httpd httpd = self.server_class(self.server_adress, Handler) self.main() def main(self): if count > 1: global SERVER_UP_SINCE HOUR_CHECK = int(((count - 1) * RESTART_INTERVAL) / 60) SERVER_UPTIME = str(HOUR_CHECK) + " MINUTES" if HOUR_CHECK > 60: minutes = int(HOUR_CHECK % 60) hours = int(HOUR_CHECK // 60) SERVER_UPTIME = ("%s HOURS, %s MINUTES" % (str(hours), str(minutes))) SERVING_ON_ADDR = self.server_adress SERVER_UP_SINCE = str(SERVER_UP_SINCE) SERVER_RESTART_NUMBER = count - 1 print(""" SERVER INFO ------------------------------------- SERVER_UPTIME: %s SERVER_UP_SINCE: %s TOTAL_FILES_SERVED: %d SERVING_ON_ADDR: %s SERVER_RESTART_NUMBER: %s \n\nSERVER HAS RESTARTED """ % (SERVER_UPTIME, SERVER_UP_SINCE, TOTAL_FILES, SERVING_ON_ADDR, SERVER_RESTART_NUMBER)) else: print("SERVER_BOOT=1\nSERVER_ONLINE=TRUE\nRESTART_LOOP=TRUE\nSERVING_ON_ADDR:%s" % str(self.server_adress)) while True: try: httpd.serve_forever() except KeyboardInterrupt: print("Shutting down...") break httpd.shutdown() httpd.socket.close() raise(SystemExit) return def server_restart(): """If you want the restart timer to be longer, replace the number after the RESTART_INTERVAL variable""" global RESTART_INTERVAL RESTART_INTERVAL = 10 threading.Timer(RESTART_INTERVAL, server_restart).start() global count count = count + 1 instance = Server() if __name__ == "__main__": global SERVER_UP_SINCE SERVER_UP_SINCE = strftime("%d-%m-%Y %H:%M:%S", gmtime()) server_restart() Basically, I make a thread to restart it every 10 seconds (For testing purposes) and start the server. After ten seconds it will say File "/home/username/Desktop/Webserver/server.py", line 199, in __init__ httpd = self.server_class(self.server_adress, Handler) File "/usr/lib/python3.3/socketserver.py", line 430, in __init__ self.server_bind() File "/usr/lib/python3.3/http/server.py", line 135, in server_bind socketserver.TCPServer.server_bind(self) File "/usr/lib/python3.3/socketserver.py", line 441, in server_bind self.socket.bind(self.server_address) OSError: [Errno 98] Address already in use As you can see in the except KeyboardInterruption line, I tried everything to make the server stop, and the program stop, but it will NOT stop. But the thing I really want to know is how to make this server able to restart, without giving some wonky errors.

    Read the article

  • SQLBrowser will not start

    - by Oliver
    SQL Server 2005 x64 on Windows Server 2003 x64, with multiple instances (default + 2 named). Engineers moved server to a different domain. Since then, cannot get SQLBrowser to start. Still able to query the default instance, and can access named instances by port (TCP:hostname,port#). When on server, can use SSMS to connect to the instances, all is well from that perspective. No errors in the SQL Server logs. As SQLBrowser is starting, an entry in EventViewer.Application says that one of the named instances has an invalid configuration, but I haven't been able to figure out what is invalid. Startup continues, and next message says "The SQLBrowser service was unable to establish SQL instance and connectivity discovery." Next, it enables instance and connectivity discovery support; next, another message about that same named instance having an invalid configuration; then an event says that SQLBrowser has started; last, an event shows the SQLBrowser service has shutdown. I got SQLBrowser to get past the issue with the first named instance by temporarily renaming a registry entry, and now the second named instance can be accessed by name rather than port. Still, cannot access the first named instance by name. Advice?

    Read the article

  • Java: design for using many executors services and only few threads

    - by Guillaume
    I need to run in parallel multiple threads to perform some tests. My 'test engine' will have n tests to perform, each one doing k sub-tests. Each test result is stored for a later usage. So I have n*k processes that can be ran concurrently. I'm trying to figure how to use the java concurrent tools efficiently. Right now I have an executor service at test level and n executor service at sub test level. I create my list of Callables for the test level. Each test callable will then create another list of callables for the subtest level. When invoked a test callable will subsequently invoke all subtest callables test 1 subtest a1 subtest ...1 subtest k1 test n subtest a2 subtest ...2 subtest k2 call sequence: test manager create test 1 callable test1 callable create subtest a1 to k1 testn callable create subtest an to kn test manager invoke all test callables test1 callable invoke all subtest a1 to k1 testn callable invoke all subtest an to kn This is working fine, but I have a lot of new treads that are created. I can not share executor service since I need to call 'shutdown' on the executors. My idea to fix this problem is to provide the same fixed size thread pool to each executor service. Do you think it is a good design ? Do I miss something more appropriate/simple for doing this ?

    Read the article

  • Deploying a WAR to tomcat only using a context descriptor

    - by DanglingElse
    i need to deploy a web application in WAR format to a remote tomcat6 server. The thing is that i don't want to do that the easy way, meaning not just copy/paste the WAR file to /webapps. So the second choice is to create a unique "Context Descriptor" and pointing this out to the WAR file. (Hope i got that right till here) So i have a few questions: Is the WAR file allowed to be anywhere in the file system? Meaning can i copy the WAR file anywhere in the remote file system, except /webapps or any other folder of the tomcat6 installation? Is there an easy way to test whether the deployment was successful or not? Without using any browser or anything, since i'm reaching to the remote server only via SSH and terminal. (I'm thinking ping?) Is it normal that the startup.sh/shutdown.sh don't exist? I'm not the admin of the server and don't know how the tomcat6 is installed. But i'm sure that in my local tomcat installations these files are in /bin and ready to use. I mean you can still start/restart/stop the tomcat etc., but not with the these -standard?- scripts. Thanks a lot.

    Read the article

  • Deploy to web container, bundle web container or embed web container...

    - by Jason
    I am developing an application that needs to be as simple as possible to install for the end user. While the end users will likely be experience Linux users (or sales engineers), they don't really know anything about Tomcat, Jetty, etc, nor do I think they should. So I see 3 ways to deploy our applications. I should also state that this is the first app that I have had to deploy that had a web interface, so I haven't really faced this question before. First is to deploy the application into an existing web container. Since we only deploy to Suse or RedHat this seems easy enough to do. However, we're not big on the idea of multiple apps running in one web container. It makes it harder to take down just one app. The next option is to just bundle Tomcat or Jetty and have the startup/shutdown scripts launch our bundled web container. Or 3rd, embed.. This will probably provide the same user experience as the second option. I'm curious what others do when faced with this problem to make it as fool proof as possible on the end user. I've almost ruled out deploying into an existing web container as we often like to set per application resource limits and CPU affinity, which I believe would affect all apps deployed into a web container/app server and not just a specific application. Thank you.

    Read the article

  • Android - Where to store generated bitmaps?

    - by Josh
    I've got an app which dynamically generates anywhere from 6 to 100 small bitmaps for the user to move around the screen in a given session. I currently generate them in onCreate and store them to the sd card, so that after an orientation change I can grab them out of external storage and display them again. However, this takes time (the loading) and I'd like to keep the bitmap references around between lifecyle changes for quicker access. My question is, is there a better place to store my generated bitmaps? I was thinking about creating a static storage library in my base activity, something that would only need to be reloaded when the app is completely removed from memory (shutdown, other apps need resources, 30 minute restart, etc). Ideally, I'd like the user to be able to back out to the title screen, click a "Resume" button, and in onCreate I just have access to those resident bitmap references instead of having to load them from storage again. For this reason I don't think Activity.onRetainNonConfigurationInstance is what I need. Alternatively, is there a better way to handle multiple generated bitmaps than what I'm doing or the plan I described?

    Read the article

  • Simple POSIX threads question

    - by Andy
    Hi, I have this POSIX thread: void subthread(void) { while(!quit_thread) { // do something ... // don't waste cpu cycles if(!quit_thread) usleep(500); } // free resources ... // tell main thread we're done quit_thread = FALSE; } Now I want to terminate subthread() from my main thread. I've tried the following: quit_thread = TRUE; // wait until subthread() has cleaned its resources while(quit_thread); But it does not work! The while() clause does never exit although my subthread clearly sets quit_thread to FALSE after having freed its resources! If I modify my shutdown code like this: quit_thread = TRUE; // wait until subthread() has cleaned its resources while(quit_thread) usleep(10); Then everything is working fine! Could someone explain to me why the first solution does not work and why the version with usleep(10) suddenly works? I know that this is not a pretty solution. I could use semaphores/signals for this but I'd like to learn something about multithreading, so I'd like to know why my first solution doesn't work. Thanks!

    Read the article

  • Random number generation in MVC applications

    - by SlimShaggy
    What is the correct way of generating random numbers in an ASP.NET MVC application if I need exactly one number per request? According to MSDN, in order to get randomness of sufficient quality, it is necessary to generate multiple numbers using a single System.Random object, created once. Since a new instance of a controller class is created for each request in MVC, I cannot use a private field initialized in the controller's constructor for the Random object. So in what part of the MVC app should I create and store the Random object? Currently I store it in a static field of the controller class and lazily initialize it in the action method that uses it: public class HomeController : Controller { ... private static Random random; ... public ActionResult Download() { ... if (random == null) random = new Random(); ... } } Since the "random" field can be accessed by multiple instances of the controller class, is it possible for its value to become corrupted if two instances attempt to initialize it simultaneously? And one more question: I know that the lifetime of statics is the lifetime of the application, but in case of an MVC app what is it? Is it from IIS startup till IIS shutdown?

    Read the article

  • I need help solving a rather weird error in a WCF service.

    - by Moulde
    Hi.. I have a solution that contains three projects. A main project with my MVC app, a silverlight application and a (silverlight enabled) WCF service project. In my silverlight project i have made a Service Reference to my WCF service. And i pretty much got that working. In my WCF service i have a method that returns an Book object, which got some random fields like title, date etc. In the book class, i have a ICollection field that contains a list of events. The book class is generated using entity framework 4.0, and Lazy Loading is enabled. If i in my getBook(int id) method return a book with the events field not initialized, it works as a charm. But if i initialize the field, i'm getting this error. The server did not provide a meaningful reply; this might be caused by a contract mismatch, a premature session shutdown or an internal server error. I have a few ideas why that is happening, and while writing this i just got another one. The wcf service somehow threw away the reference to the event class. That would be very weird since i have a reference between my main mvc app (with the models) and my WCF service. Since i have enabled lazy loading in EF 4.0, i suspect that it may be the thing generating the error. But i'm not sure why that would be, because i'm not in any way accessing that field. I could understand that i may not be able to access the events field after i recive the object in my silverlight application since the connection between the book object and the entity framework is like broken. Did i mention that Lazy Loading is enabled on my EF instance? And there is no inner exception in the thrown exception. Thanks in advance. Malte Baden Hansen

    Read the article

  • Accessing Application Scoped Bean Causes NullPointerException

    - by user2946861
    What is an Application Scoped Bean? I understand it to be a bean which will exist for the life of the application, but that doesn't appear to be the correct interpretation. I have an application which creates an application scoped bean on startup (eager=true) and then a session bean that tries to access the application scoped bean's objects (which are also application scoped). But when I try to access the application scoped bean from the session scoped bean, I get a null pointer exception. Here's excerpts from my code: Application Scoped Bean: @ManagedBean(eager=true) @ApplicationScoped public class Bean1 implements Serializable{ private static final long serialVersionUID = 12345L; protected ArrayList<App> apps; // construct apps so they are available for the session scoped bean // do time consuming stuff... // getters + setters Session Scoped Bean: @ManagedBean @SessionScoped public class Bean2 implements Serializable{ private static final long serialVersionUID = 123L; @Inject private Bean1 bean1; private ArrayList<App> apps = bean1.getApps(); // null pointer exception What appears to be happening is, Bean1 is created, does it's stuff, then is destroyed before Bean2 can access it. I was hoping using application scoped would keep Bean1 around until the container was shutdown, or the application was killed, but this doesn't appear to be the case.

    Read the article

  • how to get http responses continuously to a java application?

    - by senrulz
    I have the coding shown below where I have sheduled to get a http response from a php web server page. public static void stopPHPDataChecker() { canStop=true; } public static void main (String args[]) { // http request to the php page and get the response PHPDataChecker pdc = new PHPDataChecker(); ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1); final ScheduledFuture<?> pdcHandle = scheduler.scheduleAtFixedRate(pdc, 0L, 10L, TimeUnit.MILLISECONDS);// Start schedule scheduler.schedule(new Runnable() { public void run() { System.out.println(">> TRY TO STOP!!!"); pdcHandle.cancel(true); Sheduler.stopPHPDataChecker(); System.out.println("DONE"); } }, 1L, TimeUnit.MILLISECONDS); do { if (canStop) { scheduler.shutdown(); } } while (!canStop); System.out.println("END"); } this coding only returns one response but I want it to get responses continuously so i can do different tasks according to the returned value. how can i do it? Thank you in advance :)

    Read the article

  • External USB drive is failing

    - by dma_k
    I have an external USB 2.0 drive WD My Book Mirror Edition, running in RAID 1 (mirroring) mode. A while ago the hard drive started to fail: it stops responding (directories are not listed returning an error after a big timeout). Sometimes it works for weeks before a failure, sometimes – few hours. Small write operations (like removing few files or editing a small file) do not harm, but when copying large files to the drive over the network, or creating the archive locally, the kernel dumps. Also interesting to note that once kernel has failed, Linux does not want to reboot normally (reboot hangs); when Linux box is shutdown with power button, WD drive does not go to sleep mode (as it usually does): leds continue to run, pressing and holding the "shutdown" button on drive's back panel does not do anything; only unplugging the power cord helps. Here goes the boot log: Aug 16 00:32:21 kernel: [ 1.514106] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver Aug 16 00:32:21 kernel: [ 1.657738] ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23 Aug 16 00:32:21 kernel: [ 1.673747] ehci_hcd 0000:00:1d.7: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 1.673751] ehci_hcd 0000:00:1d.7: EHCI Host Controller Aug 16 00:32:21 kernel: [ 1.725224] ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1 Aug 16 00:32:21 kernel: [ 1.741647] ehci_hcd 0000:00:1d.7: using broken periodic workaround Aug 16 00:32:21 kernel: [ 1.761790] ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported Aug 16 00:32:21 kernel: [ 1.761873] ehci_hcd 0000:00:1d.7: irq 23, io mem 0xfdfff000 Aug 16 00:32:21 kernel: [ 1.796043] ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00 Aug 16 00:32:21 kernel: [ 1.879069] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002 Aug 16 00:32:21 kernel: [ 1.895446] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 1.911796] usb usb1: Product: EHCI Host Controller Aug 16 00:32:21 kernel: [ 1.928015] usb usb1: Manufacturer: Linux 2.6.32-5-686 ehci_hcd Aug 16 00:32:21 kernel: [ 1.944331] usb usb1: SerialNumber: 0000:00:1d.7 Aug 16 00:32:21 kernel: [ 1.961285] usb usb1: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 1.994412] hub 1-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 2.010864] hub 1-0:1.0: 8 ports detected Aug 16 00:32:21 kernel: [ 2.085939] uhci_hcd: USB Universal Host Controller Interface driver Aug 16 00:32:21 kernel: [ 2.191945] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23 Aug 16 00:32:21 kernel: [ 2.226029] uhci_hcd 0000:00:1d.0: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 2.226034] uhci_hcd 0000:00:1d.0: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.243237] uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2 Aug 16 00:32:21 kernel: [ 2.260390] uhci_hcd 0000:00:1d.0: irq 23, io base 0x0000fe00 Aug 16 00:32:21 kernel: [ 2.277517] usb usb2: New USB device found, idVendor=1d6b, idProduct=0001 Aug 16 00:32:21 kernel: [ 2.294815] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 2.312173] usb usb2: Product: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.329534] usb usb2: Manufacturer: Linux 2.6.32-5-686 uhci_hcd Aug 16 00:32:21 kernel: [ 2.346828] usb usb2: SerialNumber: 0000:00:1d.0 Aug 16 00:32:21 kernel: [ 2.412989] usb usb2: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 2.430651] usb 1-2: new high speed USB device using ehci_hcd and address 2 Aug 16 00:32:21 kernel: [ 2.449046] hub 2-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 2.466514] hub 2-0:1.0: 2 ports detected Aug 16 00:32:21 kernel: [ 2.484639] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19 Aug 16 00:32:21 kernel: [ 2.537750] uhci_hcd 0000:00:1d.1: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 2.537756] uhci_hcd 0000:00:1d.1: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.555085] uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3 Aug 16 00:32:21 kernel: [ 2.572231] uhci_hcd 0000:00:1d.1: irq 19, io base 0x0000fd00 Aug 16 00:32:21 kernel: [ 2.589593] usb usb3: New USB device found, idVendor=1d6b, idProduct=0001 Aug 16 00:32:21 kernel: [ 2.606869] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 2.624134] usb usb3: Product: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.641329] usb usb3: Manufacturer: Linux 2.6.32-5-686 uhci_hcd Aug 16 00:32:21 kernel: [ 2.658505] usb usb3: SerialNumber: 0000:00:1d.1 Aug 16 00:32:21 kernel: [ 2.675843] usb usb3: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 2.692864] hub 3-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 2.709651] hub 3-0:1.0: 2 ports detected Aug 16 00:32:21 kernel: [ 2.727378] uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18 Aug 16 00:32:21 kernel: [ 2.768252] uhci_hcd 0000:00:1d.2: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 2.768258] uhci_hcd 0000:00:1d.2: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.806679] uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4 Aug 16 00:32:21 kernel: [ 2.824117] uhci_hcd 0000:00:1d.2: irq 18, io base 0x0000fc00 Aug 16 00:32:21 kernel: [ 2.841405] usb 1-2: New USB device found, idVendor=1058, idProduct=1104 Aug 16 00:32:21 kernel: [ 2.858448] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 Aug 16 00:32:21 kernel: [ 2.875347] usb 1-2: Product: My Book Aug 16 00:32:21 kernel: [ 2.892113] usb 1-2: Manufacturer: Western Digital Aug 16 00:32:21 kernel: [ 2.908915] usb 1-2: SerialNumber: 575532553130303530353538 Aug 16 00:32:21 kernel: [ 2.943242] usb usb4: New USB device found, idVendor=1d6b, idProduct=0001 Aug 16 00:32:21 kernel: [ 2.960405] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 2.977615] usb usb4: Product: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.994687] usb usb4: Manufacturer: Linux 2.6.32-5-686 uhci_hcd Aug 16 00:32:21 kernel: [ 3.011711] usb usb4: SerialNumber: 0000:00:1d.2 Aug 16 00:32:21 kernel: [ 3.029589] usb usb4: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 3.082027] sd 2:0:0:0: [sda] Attached SCSI disk Aug 16 00:32:21 kernel: [ 3.103953] usb 1-2: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 3.122625] hub 4-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 3.140484] hub 4-0:1.0: 2 ports detected Aug 16 00:32:21 kernel: [ 3.161680] uhci_hcd 0000:00:1d.3: PCI INT D -> GSI 16 (level, low) -> IRQ 16 Aug 16 00:32:21 kernel: [ 3.181257] uhci_hcd 0000:00:1d.3: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 3.181263] uhci_hcd 0000:00:1d.3: UHCI Host Controller Aug 16 00:32:21 kernel: [ 3.198614] uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5 Aug 16 00:32:21 kernel: [ 3.216012] uhci_hcd 0000:00:1d.3: irq 16, io base 0x0000fb00 Aug 16 00:32:21 kernel: [ 3.249877] Uniform CD-ROM driver Revision: 3.20 Aug 16 00:32:21 kernel: [ 3.267765] usb usb5: New USB device found, idVendor=1d6b, idProduct=0001 Aug 16 00:32:21 kernel: [ 3.284947] usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 3.302023] usb usb5: Product: UHCI Host Controller Aug 16 00:32:21 kernel: [ 3.319215] usb usb5: Manufacturer: Linux 2.6.32-5-686 uhci_hcd Aug 16 00:32:21 kernel: [ 3.336298] usb usb5: SerialNumber: 0000:00:1d.3 Aug 16 00:32:21 kernel: [ 3.368377] Initializing USB Mass Storage driver... Aug 16 00:32:21 kernel: [ 3.390652] usbcore: registered new interface driver hiddev Aug 16 00:32:21 kernel: [ 3.408109] scsi4 : SCSI emulation for USB Mass Storage devices Aug 16 00:32:21 kernel: [ 3.425281] sr 0:0:1:0: Attached scsi CD-ROM sr0 Aug 16 00:32:21 kernel: [ 3.438978] sr 0:0:1:0: Attached scsi generic sg0 type 5 Aug 16 00:32:21 kernel: [ 3.456328] usbcore: registered new interface driver usb-storage Aug 16 00:32:21 kernel: [ 3.474564] usb-storage: device found at 2 Aug 16 00:32:21 kernel: [ 3.474567] usb-storage: waiting for device to settle before scanning Aug 16 00:32:21 kernel: [ 3.475320] sd 2:0:0:0: Attached scsi generic sg1 type 0 Aug 16 00:32:21 kernel: [ 3.492587] USB Mass Storage support registered. Aug 16 00:32:21 kernel: [ 3.510930] usb usb5: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 3.531076] hub 5-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 3.548399] hub 5-0:1.0: 2 ports detected Aug 16 00:32:21 kernel: [ 3.591743] input: Western Digital My Book as /devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2:1.1/input/input2 Aug 16 00:32:21 kernel: [ 3.609515] generic-usb 0003:1058:1104.0001: input,hidraw0: USB HID v1.11 Device [Western Digital My Book] on usb-0000:00:1d.7-2/input1 Aug 16 00:32:21 kernel: [ 3.627466] usbcore: registered new interface driver usbhid Aug 16 00:32:21 kernel: [ 8.581664] usb-storage: device scan complete Aug 16 00:32:21 kernel: [ 8.624270] scsi 4:0:0:0: Direct-Access WD My Book 1008 PQ: 0 ANSI: 4 Aug 16 00:32:21 kernel: [ 8.655135] scsi 4:0:0:1: Enclosure WD My Book Device 1008 PQ: 0 ANSI: 4 Aug 16 00:32:21 kernel: [ 8.675393] sd 4:0:0:0: Attached scsi generic sg2 type 0 Aug 16 00:32:21 kernel: [ 8.698669] scsi 4:0:0:1: Attached scsi generic sg3 type 13 Aug 16 00:32:21 kernel: [ 8.723370] sd 4:0:0:0: [sdb] 1953513472 512-byte logical blocks: (1.00 TB/931 GiB) Aug 16 00:32:21 kernel: [ 8.750477] sd 4:0:0:0: [sdb] Write Protect is off Aug 16 00:32:21 kernel: [ 8.769411] sd 4:0:0:0: [sdb] Mode Sense: 10 00 00 00 Aug 16 00:32:21 kernel: [ 8.769414] sd 4:0:0:0: [sdb] Assuming drive cache: write through Aug 16 00:32:21 kernel: [ 8.822971] sd 4:0:0:0: [sdb] Assuming drive cache: write through Aug 16 00:32:21 kernel: [ 8.841978] sdb: sdb1 Aug 16 00:32:21 kernel: [ 8.905580] sd 4:0:0:0: [sdb] Assuming drive cache: write through Aug 16 00:32:21 kernel: [ 8.924173] sd 4:0:0:0: [sdb] Attached SCSI disk Aug 16 00:32:21 kernel: [ 11.600492] XFS mounting filesystem sdb1 Aug 16 00:32:21 kernel: [ 12.222948] Ending clean XFS mount for filesystem: sdb1 After a while the following appears in a log: Aug 16 09:30:56 kernel: [32359.112029] usb 1-2: reset high speed USB device using ehci_hcd and address 2 Aug 16 09:31:59 kernel: [32422.112035] usb 1-2: reset high speed USB device using ehci_hcd and address 2 Aug 16 09:33:00 kernel: [32483.112029] usb 1-2: reset high speed USB device using ehci_hcd and address 2 And then it is followed by few kernel dumps, which I think, are not good: Aug 16 09:33:40 kernel: [32520.428027] INFO: task xfssyncd:1002 blocked for more than 120 seconds. Aug 16 09:33:40 kernel: [32520.462689] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 16 09:33:40 kernel: [32520.497422] xfssyncd D c3d84a60 0 1002 2 0x00000000 Aug 16 09:33:40 kernel: [32520.532117] f6c9aa80 00000046 c1132742 c3d84a60 00000286 c1418100 c1418100 00000000 Aug 16 09:33:40 kernel: [32520.566867] f6c9ac3c c2808100 00000000 f653b18b 00001d76 00000001 f6c9aa80 c3c3f0e0 Aug 16 09:33:40 kernel: [32520.601343] 08e59242 f6c9ac3c 2e41392b 00000000 08e59242 00000000 c3f7fb48 0067385a Aug 16 09:33:40 kernel: [32520.635533] Call Trace: Aug 16 09:33:40 kernel: [32520.668991] [<c1132742>] ? cfq_set_request+0x0/0x290 Aug 16 09:33:40 kernel: [32520.702804] [<c126b532>] ? io_schedule+0x5f/0x98 Aug 16 09:33:40 kernel: [32520.736555] [<c1128be0>] ? get_request_wait+0xcb/0x146 Aug 16 09:33:40 kernel: [32520.770360] [<c10437ba>] ? autoremove_wake_function+0x0/0x2d Aug 16 09:33:40 kernel: [32520.804110] [<c112907c>] ? __make_request+0x2cc/0x3d9 Aug 16 09:33:40 kernel: [32520.837713] [<c1128230>] ? blk_peek_request+0x135/0x143 Aug 16 09:33:40 kernel: [32520.871265] [<f8582987>] ? scsi_dispatch_cmd+0x185/0x1e5 [scsi_mod] Aug 16 09:33:40 kernel: [32520.904407] [<c1127cf1>] ? generic_make_request+0x266/0x2b4 Aug 16 09:33:40 kernel: [32520.937007] [<c10cf821>] ? bvec_alloc_bs+0x95/0xaf Aug 16 09:33:40 kernel: [32520.969033] [<c1127dfb>] ? submit_bio+0xbc/0xd6 Aug 16 09:33:40 kernel: [32521.000485] [<c10cffd1>] ? bio_add_page+0x28/0x2e Aug 16 09:33:40 kernel: [32521.031403] [<f8918d38>] ? _xfs_buf_ioapply+0x206/0x22b [xfs] Aug 16 09:33:40 kernel: [32521.061888] [<f89197bd>] ? xfs_buf_iorequest+0x38/0x60 [xfs] Aug 16 09:33:40 kernel: [32521.091845] [<f8907230>] ? xlog_bdstrat_cb+0x16/0x3d [xfs] Aug 16 09:33:40 kernel: [32521.121222] [<f8905781>] ? XFS_bwrite+0x32/0x64 [xfs] Aug 16 09:33:40 kernel: [32521.150007] [<f89059be>] ? xlog_sync+0x20b/0x311 [xfs] Aug 16 09:33:40 kernel: [32521.178214] [<f89112fc>] ? xfs_trans_ail_tail+0x12/0x27 [xfs] Aug 16 09:33:40 kernel: [32521.205914] [<f8906261>] ? xlog_state_sync_all+0xa2/0x141 [xfs] Aug 16 09:33:40 kernel: [32521.233074] [<f8906611>] ? _xfs_log_force+0x51/0x68 [xfs] Aug 16 09:33:40 kernel: [32521.259664] [<c103abaf>] ? process_timeout+0x0/0x5 Aug 16 09:33:40 kernel: [32521.285662] [<f8906636>] ? xfs_log_force+0xe/0x27 [xfs] Aug 16 09:33:40 kernel: [32521.311171] [<f89202df>] ? xfs_sync_worker+0x17/0x5c [xfs] Aug 16 09:33:40 kernel: [32521.336117] [<f891fbb7>] ? xfssyncd+0x134/0x17d [xfs] Aug 16 09:33:40 kernel: [32521.360498] [<f891fa83>] ? xfssyncd+0x0/0x17d [xfs] Aug 16 09:33:40 kernel: [32521.384211] [<c1043588>] ? kthread+0x61/0x66 Aug 16 09:33:40 kernel: [32521.407890] [<c1043527>] ? kthread+0x0/0x66 Aug 16 09:33:40 kernel: [32521.430876] [<c1003d47>] ? kernel_thread_helper+0x7/0x10 Aug 16 09:33:40 kernel: [32521.453394] INFO: task flush-8:16:12945 blocked for more than 120 seconds. Aug 16 09:33:40 kernel: [32521.476116] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 16 09:33:40 kernel: [32521.498579] flush-8:16 D 00000000 0 12945 2 0x00000000 Aug 16 09:33:40 kernel: [32521.520649] f4e4d540 00000046 e412e940 00000000 00000002 c1418100 c1418100 c14136ac Aug 16 09:33:40 kernel: [32521.542426] f4e4d6fc c2808100 00000000 00000000 000008b4 00000001 f4e4d540 c3c3f0e0 Aug 16 09:33:40 kernel: [32521.563745] 02e905a8 f4e4d6fc 007a5399 00000000 02e905a8 00000000 f4e2db48 00670b98 Aug 16 09:33:40 kernel: [32521.585077] Call Trace: Aug 16 09:33:40 kernel: [32521.605790] [<c126b532>] ? io_schedule+0x5f/0x98 Aug 16 09:33:40 kernel: [32521.626184] [<c1128be0>] ? get_request_wait+0xcb/0x146 Aug 16 09:33:40 kernel: [32521.646133] [<c10437ba>] ? autoremove_wake_function+0x0/0x2d Aug 16 09:33:40 kernel: [32521.665659] [<c112907c>] ? __make_request+0x2cc/0x3d9 Aug 16 09:33:40 kernel: [32521.684716] [<f891796e>] ? xfs_convert_page+0x30a/0x331 [xfs] Aug 16 09:33:40 kernel: [32521.703366] [<c1127cf1>] ? generic_make_request+0x266/0x2b4 Aug 16 09:33:40 kernel: [32521.721644] [<c10cf821>] ? bvec_alloc_bs+0x95/0xaf Aug 16 09:33:40 kernel: [32521.739465] [<c1127dfb>] ? submit_bio+0xbc/0xd6 Aug 16 09:33:40 kernel: [32521.756896] [<c10cfa45>] ? bio_alloc_bioset+0x7b/0xba Aug 16 09:33:40 kernel: [32521.774046] [<f8917af0>] ? xfs_submit_ioend_bio+0x3b/0x44 [xfs] Aug 16 09:33:40 kernel: [32521.790694] [<f8917ba3>] ? xfs_submit_ioend+0xaa/0xc4 [xfs] Aug 16 09:33:40 kernel: [32521.806736] [<f891817d>] ? xfs_page_state_convert+0x5c0/0x61c [xfs] Aug 16 09:33:40 kernel: [32521.822859] [<c113705b>] ? __lookup_tag+0x8e/0xee Aug 16 09:33:40 kernel: [32521.838958] [<f891840d>] ? xfs_vm_writepage+0x91/0xc4 [xfs] Aug 16 09:33:40 kernel: [32521.855039] [<c108bbcc>] ? __writepage+0x8/0x22 Aug 16 09:33:40 kernel: [32521.871067] [<c108c17b>] ? write_cache_pages+0x1af/0x29f Aug 16 09:33:40 kernel: [32521.886616] [<c108bbc4>] ? __writepage+0x0/0x22 Aug 16 09:33:40 kernel: [32521.901593] [<c108c285>] ? generic_writepages+0x1a/0x21 Aug 16 09:33:40 kernel: [32521.916455] [<f8918338>] ? xfs_vm_writepages+0x0/0x38 [xfs] Aug 16 09:33:40 kernel: [32521.931484] [<c108c2a5>] ? do_writepages+0x19/0x25 Aug 16 09:33:40 kernel: [32521.946648] [<c10c80d9>] ? writeback_single_inode+0xc7/0x273 Aug 16 09:33:40 kernel: [32521.961675] [<c10c8c44>] ? writeback_inodes_wb+0x3dd/0x49c Aug 16 09:33:40 kernel: [32521.976831] [<c10c8e18>] ? wb_writeback+0x115/0x178 Aug 16 09:33:40 kernel: [32521.991778] [<c10c901f>] ? wb_do_writeback+0x121/0x131 Aug 16 09:33:40 kernel: [32522.006538] [<c103abaf>] ? process_timeout+0x0/0x5 Aug 16 09:33:40 kernel: [32522.021091] [<c10c9050>] ? bdi_writeback_task+0x21/0x89 Aug 16 09:33:40 kernel: [32522.035493] [<c10979e5>] ? bdi_start_fn+0x59/0xa4 Aug 16 09:33:40 kernel: [32522.049765] [<c109798c>] ? bdi_start_fn+0x0/0xa4 Aug 16 09:33:40 kernel: [32522.063792] [<c1043588>] ? kthread+0x61/0x66 Aug 16 09:33:40 kernel: [32522.077612] [<c1043527>] ? kthread+0x0/0x66 Aug 16 09:33:40 kernel: [32522.091260] [<c1003d47>] ? kernel_thread_helper+0x7/0x10 Aug 16 09:33:40 kernel: [32522.104966] INFO: task smartctl:13098 blocked for more than 120 seconds. Aug 16 09:33:40 kernel: [32522.118883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 16 09:33:40 kernel: [32522.133012] smartctl D 00000020 0 13098 13097 0x00000000 Aug 16 09:33:40 kernel: [32522.147221] e50b9540 00000086 c11d28a8 00000020 00000770 c1418100 c1418100 c14136ac Aug 16 09:33:40 kernel: [32522.161720] e50b96fc c2808100 00000000 e53e8800 00000000 00000020 c3cec000 c13886c0 Aug 16 09:33:40 kernel: [32522.176217] f99dab68 e50b96fc 007a4f1e 00000001 c4082f24 c4082ed8 00000001 c3c3f0e0 Aug 16 09:33:40 kernel: [32522.190737] Call Trace: Aug 16 09:33:40 kernel: [32522.205038] [<c11d28a8>] ? __netdev_alloc_skb+0x14/0x2d Aug 16 09:33:40 kernel: [32522.219605] [<c126b799>] ? schedule_timeout+0x20/0xb0 Aug 16 09:33:40 kernel: [32522.234144] [<c112820d>] ? blk_peek_request+0x112/0x143 Aug 16 09:33:40 kernel: [32522.248649] [<f85873b6>] ? scsi_request_fn+0x3c1/0x47a [scsi_mod] Aug 16 09:33:40 kernel: [32522.263233] [<c103aba8>] ? del_timer+0x55/0x5c Aug 16 09:33:40 kernel: [32522.277773] [<c126b6a2>] ? wait_for_common+0xa4/0x100 Aug 16 09:33:40 kernel: [32522.292342] [<c102cd8d>] ? default_wake_function+0x0/0x8 Aug 16 09:33:40 kernel: [32522.306958] [<c112b3d1>] ? blk_execute_rq+0x8b/0xb2 Aug 16 09:33:40 kernel: [32522.321569] [<c112b2ac>] ? blk_end_sync_rq+0x0/0x23 Aug 16 09:33:40 kernel: [32522.336070] [<c112b58b>] ? blk_recount_segments+0x13/0x20 Aug 16 09:33:40 kernel: [32522.350583] [<c1127307>] ? blk_rq_bio_prep+0x44/0x74 Aug 16 09:33:40 kernel: [32522.365059] [<c112b0b2>] ? blk_rq_map_kern+0xc5/0xee Aug 16 09:33:40 kernel: [32522.379439] [<c112e2a5>] ? sg_scsi_ioctl+0x221/0x2aa Aug 16 09:33:40 kernel: [32522.393801] [<c112e672>] ? scsi_cmd_ioctl+0x344/0x39a Aug 16 09:33:40 kernel: [32522.408140] [<c1024c87>] ? update_curr+0x106/0x1b3 Aug 16 09:33:40 kernel: [32522.422566] [<c1024c87>] ? update_curr+0x106/0x1b3 Aug 16 09:33:40 kernel: [32522.436832] [<f87676aa>] ? sd_ioctl+0x90/0xb5 [sd_mod] Aug 16 09:33:40 kernel: [32522.451228] [<c112c35f>] ? __blkdev_driver_ioctl+0x53/0x63 Aug 16 09:33:40 kernel: [32522.465689] [<c112cbbf>] ? blkdev_ioctl+0x850/0x891 Aug 16 09:33:40 kernel: [32522.479982] [<c1020474>] ? __wake_up_common+0x34/0x59 Aug 16 09:33:40 kernel: [32522.494138] [<c10244cd>] ? complete+0x28/0x36 Aug 16 09:33:40 kernel: [32522.507986] [<c1086c64>] ? find_get_page+0x1f/0x81 Aug 16 09:33:40 kernel: [32522.521671] [<c10abed5>] ? add_partial+0xe/0x40 Aug 16 09:33:40 kernel: [32522.535285] [<c1086e68>] ? lock_page+0x8/0x1d Aug 16 09:33:40 kernel: [32522.548797] [<c1087432>] ? filemap_fault+0xb5/0x2e6 Aug 16 09:33:40 kernel: [32522.562141] [<c109941c>] ? __do_fault+0x381/0x3b1 Aug 16 09:33:40 kernel: [32522.575441] [<c10d0c30>] ? block_ioctl+0x27/0x2c Aug 16 09:33:40 kernel: [32522.588708] [<c10d0c09>] ? block_ioctl+0x0/0x2c Aug 16 09:33:40 kernel: [32522.601858] [<c10bcd78>] ? vfs_ioctl+0x1c/0x5f Aug 16 09:33:40 kernel: [32522.614917] [<c10bd30c>] ? do_vfs_ioctl+0x4aa/0x4e5 Aug 16 09:33:40 kernel: [32522.627961] [<c10350db>] ? __do_softirq+0x115/0x151 Aug 16 09:33:40 kernel: [32522.640901] [<c126e270>] ? do_page_fault+0x2f1/0x307 Aug 16 09:33:40 kernel: [32522.653803] [<c10bd388>] ? sys_ioctl+0x41/0x58 Aug 16 09:33:40 kernel: [32522.666674] [<c10030fb>] ? sysenter_do_call+0x12/0x28 Then again few messages reset high speed USB device using ehci_hcd and address 2. I have browsed and read similar error reports here and there and I tried: I have upgraded the kernel from v2.6.26-2 to 2.6.32-5, which has not solved the problem. They say, this might a cable problem. I have tried to replace the USB-to-miniUSB cable (that connects external drive with computer) with another one. No changes. Somebody suggests to try another USB port. I have only 4 external USB ports, tried another one with no success. They say to try uhci_hcd. I have unmounted the device, unloaded ehci_hcd and mounted again. The difference was that now in log I get reset full speed USB device using uhci_hcd and address 2 and similar kernel dumps after a while. They say to echo 128 > /sys/block/sdb/device/max_sectors. I tried it with ehci_hcd with no success (note: I have issued this command after the drive was mounted but before using it actively). I have lauched smartmond and checking periodically the output of smartctl: drive temperature is OK, number of bad sectors and uncorrectable errors is 0. Nothing suspicious is reported by S.M.A.R.T. except maybe the following: Aug 16 12:40:12 kernel: [43715.314566] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO Aug 16 12:40:13 kernel: [43715.705622] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO Of course, I have not tried all combinations of above. But unfortunately, I am run out of cardinal ideas. If anybody can advice something specific about the problem, you are very welcome.

    Read the article

  • How can I run supervisord without using root?

    - by Jason Baker
    I seem to be having trouble figuring out why supervisord won't run as a non-root user. If I start it with the user set to jason (pid 1000), I get the following in the log file: 2010-05-24 08:53:32,143 CRIT Set uid to user 1000 2010-05-24 08:53:32,143 WARN Included extra file "/home/jason/src/tsched/celeryd.conf" during parsing 2010-05-24 08:53:32,189 INFO RPC interface 'supervisor' initialized 2010-05-24 08:53:32,189 WARN cElementTree not installed, using slower XML parser for XML-RPC 2010-05-24 08:53:32,189 CRIT Server 'unix_http_server' running without any HTTP authentication checking 2010-05-24 08:53:32,190 INFO daemonizing the supervisord process 2010-05-24 08:53:32,191 INFO supervisord started with pid 3444 ...then the process dies for some unknown reason. If I start it without sudo (under the user jason), I get similar output: 2010-05-24 08:51:32,859 INFO supervisord started with pid 3306 2010-05-24 08:52:15,761 CRIT Can't drop privilege as nonroot user 2010-05-24 08:52:15,761 WARN Included extra file "/home/jason/src/tsched/celeryd.conf" during parsing 2010-05-24 08:52:15,807 INFO RPC interface 'supervisor' initialized 2010-05-24 08:52:15,807 WARN cElementTree not installed, using slower XML parser for XML-RPC 2010-05-24 08:52:15,807 CRIT Server 'unix_http_server' running without any HTTP authentication checking 2010-05-24 08:52:15,808 INFO daemonizing the supervisord process 2010-05-24 08:52:15,809 INFO supervisord started with pid 3397 ...and it still doesn't run. If it's any help, here's the supervisord.conf file I'm using: [unix_http_server] file=/tmp/supervisor.sock ; path to your socket file [supervisord] logfile=./supervisord.log ; supervisord log file logfile_maxbytes=50MB ; maximum size of logfile before rotation logfile_backups=10 ; number of backed up logfiles loglevel=debug ; info, debug, warn, trace pidfile=./supervisord.pid ; pidfile location nodaemon=false ; run supervisord as a daemon minfds=1024 ; number of startup file descriptors minprocs=200 ; number of process descriptors user=jason ; default user childlogdir=./supervisord/ ; where child log files will live [rpcinterface:supervisor] supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface [supervisorctl] serverurl=unix:///tmp/supervisor.sock ; use unix:// schem for a unix sockets. [include] # Uncomment this line for celeryd for Python files=celeryd.conf # Uncomment this line for celeryd for Django. ;files=django/celeryd.conf ...and here's celeryd.conf: [program:celery] command=bin/celeryd --loglevel=INFO --logfile=./celeryd.log environment=PYTHONPATH='./tsched_worker', JIVA_DB_PLATFORM='oracle', ORACLE_HOME='/usr/lib/oracle/xe/app/oracle/product/10.2.0/server', LD_LIBRARY_PATH='/usr/lib/oracle/xe/app/oracle/product/10.2.0/server/lib', TNS_ADMIN='/home/jason', CELERY_CONFIG_MODULE='tsched_worker.celeryconfig' directory=. user=jason numprocs=1 stdout_logfile=/var/log/celeryd.log stderr_logfile=/var/log/celeryd.log autostart=true autorestart=true startsecs=10 ; Need to wait for currently executing tasks to finish at shutdown. ; Increase this if you have very long running tasks. stopwaitsecs = 600 ; if rabbitmq is supervised, set its priority higher ; so it starts first priority=998 Can anyone help me figure out what's going on?

    Read the article

  • vSphere Client vCenter Template Customization Specification Using Windows Sysprep Unattended Answer XML File

    - by Brian
    I'm trying to setup a vSphere Client vCenter v5.0.0 Build 455964 Template Customization Specification using a Windows Sysprep unattended answer XML file for Win2008R2. However I didn't know how Sysprep worked before attempting this so it was a time-consuming nightmare (even after reviewing VMware vSphere ESXi 5's documentation)! I think I've figure out what I'm supposed to be doing, but it's still not working. The biggest problem at this point is that vSphere Client vCenter Customization Specification IP address information is not sticking when I load a Sysprep XML file with just 1 basic setting! This can only be a bug. Here is the process I'm using: PROCESS for Windows - vSphere Client Install Windows OS install VM Tools customize Windows (GPOs can be used to do this after deployment) install Applications (GPOs can be used to do this after deployment too) shutdown the VM convert the VM to a template create a custom Windows Sysprep XML answer file with desired customizations View Management Customization Specifications Manager create "New" Specification for "Target Virtual Machine OS" select Windows check "Use Custom Sysprep Answer File" (ADDS: Custom Sysprep File. KEEPS: Network (IP), Operating System Options (SID, Sysprep /generalize). REPLACES: Registration Information of Owner Name & Organization, Computer Name, Windows License (Key), Administrator Password, Time Zone, Run Once, Workgroup or Domain) name it as "VMwareCS-OS####R#x32/64w/Sysprep-TEST" (CS=Customization Specification) set Description as "Created YYYY/MM/DD by FLast" NEXT import a Sysprep answer file from secure location NEXT Custom settings NEXT click "..." box to right of "Use DHCP" set "Use the following IP settings:" for "IP Address" fill out the first 2 octets set appropriate values for other 2-3 fields set DNS server addresses OK NEXT check "Generate New Security ID (SID)" ALWAYS as template is likely a domain-member computer so it can be updated occasionally NEXT Finish View Inventory VMs and Templates right-click previously completed template Deploy Virtual Machine from this Template provide the new OS name (max15char) select inventory location NEXT select Host/Cluster (wait for validation to succeed) NEXT select Resource Pool (wait for validation to succeed) NEXT select Storage location NEXT check "Power on this virtual machine after creation" select "Customize using an existing customization specification" select desired specification select "Use the Customization Wizard to temporarily adjust the specification before deployment" NEXT NEXT Custom settings? NEXT check "Generate New Security ID (SID)" ALWAYS as template is likely a domain-member computer so it can be updated occasionally NEXT Finish Finish. I know a community member named "brian" (http://serverfault.com/users/25904/brian) has worked with this scenario before, but I couldn't figure out how to contact him directly, so Brian if you see this message could you provide some information to help? Thanks, Brian

    Read the article

  • OpenSSL Handshake Failure (14094410) - Erroneous Client Certificate Check from Mobile Phone

    - by Clayton Sims
    I'm running a proxy server through Apache with modssl, which we're using to proxy POSTs from mobile devices to another internal server. This works successfully for most clients, but requests from a specific phone model (Nokia 2690) are showing a bizarre handshake failure. It looks as though OpenSSL is either requesting (or attempting to read an unsolicited) client certificate from the phone (which is especially bizarre because j2me's kssl implementation doesn't support client certs). I've disabled client certificates with the SSLVerifyClient none directive in both the virtual host conf and the modssl conf. The trace from error.log on debug level is (details redacted): [client 41.220.207.10] Connection to child 0 established (server www.myserver.org:443) [info] Seeding PRNG with 656 bytes of entropy [debug] ssl_engine_kernel.c(1866): OpenSSL: Handshake: start [debug] ssl_engine_kernel.c(1874): OpenSSL: Loop: before/accept initialization [debug] ssl_engine_io.c(1882): OpenSSL: read 11/11 bytes from BIO#7fe3fbaf17a0 [mem: 7fe3fbaf90d0] (BIO dump follows) [debug] ssl_engine_io.c(1815): +-------------------------------------------------------------------------+ [debug] ssl_engine_io.c(1860): +-------------------------------------------------------------------------+ [debug] ssl_engine_io.c(1882): OpenSSL: read 49/49 bytes from BIO#7fe3fbaf17a0 [mem: 7fe3fbaf90db] (BIO dump follows) [debug] ssl_engine_io.c(1815): +-------------------------------------------------------------------------+ [debug] ssl_engine_io.c(1860): +-------------------------------------------------------------------------+ [debug] ssl_engine_kernel.c(1874): OpenSSL: Loop: SSLv3 read client hello A [debug] ssl_engine_kernel.c(1874): OpenSSL: Loop: SSLv3 write server hello A [debug] ssl_engine_kernel.c(1874): OpenSSL: Loop: SSLv3 write certificate A [debug] ssl_engine_kernel.c(1874): OpenSSL: Loop: SSLv3 write server done A [debug] ssl_engine_kernel.c(1874): OpenSSL: Loop: SSLv3 flush data [debug] ssl_engine_io.c(1882): OpenSSL: read 5/5 bytes from BIO#7fe3fbaf17a0 [mem: 7fe3fbaf90d0] (BIO dump follows) [debug] ssl_engine_io.c(1815): +-------------------------------------------------------------------------+ [debug] ssl_engine_io.c(1860): +-------------------------------------------------------------------------+ [debug] ssl_engine_io.c(1882): OpenSSL: read 2/2 bytes from BIO#7fe3fbaf17a0 [mem: 7fe3fbaf90d5] (BIO dump follows) [debug] ssl_engine_io.c(1815): +-------------------------------------------------------------------------+ [debug] ssl_engine_io.c(1860): +-------------------------------------------------------------------------+ [debug] ssl_engine_kernel.c(1879): OpenSSL: Read: SSLv3 read client certificate A [debug] ssl_engine_kernel.c(1898): OpenSSL: Exit: failed in SSLv3 read client certificate A [client 41.220.207.10] SSL library error 1 in handshake (server www.myserver.org:443) [info] SSL Library Error: 336151568 error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert handshake failure [client 41.220.207.10] Connection closed to child 0 with abortive shutdown (server www.myserver.org:443) I've tried enabling all ciphers and all protocols temporarily with modssl, neither of which seemed to be the issue. The phone should be using RSA_RC4_128_MD5 and SSLv3, all of which are available. Am I missing something more fundamental about what's failing here? It seemed like the certificate request might have been part of a renegotiation failure. I tried enabling SSLInsecureRenegotiation On on the virtual host, in case it was an issue of the phone's SSL not supporting the new protocol, but to no avail. Currently running: Apache/2.2.16 (Ubuntu) mod_ssl/2.2.16 OpenSSL/0.9.8o Apache proxy_html/3.0.1

    Read the article

  • Monitoring slow nginx/unicorn requests

    - by injekt
    I'm currently using Nginx to proxy requests to a Unicorn server running a Sinatra application. The application only has a couple of routes defined, those of which make fairly simple (non costly) queries to a PostgreSQL database, and finally return data in JSON format, these services are being monitored by God. I'm currently experiencing extremely slow response times from this application server. I have another two Unicorn servers being proxied via Nginx, and these are responding perfectly fine, so I think I can rule out any wrong doing from Nginx. Here is my God configuration: # God configuration APP_ROOT = File.expand_path '../', File.dirname(__FILE__) God.watch do |w| w.name = "app_name" w.interval = 30.seconds # default w.start = "cd #{APP_ROOT} && unicorn -c #{APP_ROOT}/config/unicorn.rb -D" # -QUIT = graceful shutdown, waits for workers to finish their current request before finishing w.stop = "kill -QUIT `cat #{APP_ROOT}/tmp/unicorn.pid`" w.restart = "kill -USR2 `cat #{APP_ROOT}/tmp/unicorn.pid`" w.start_grace = 10.seconds w.restart_grace = 10.seconds w.pid_file = "#{APP_ROOT}/tmp/unicorn.pid" # User under which to run the process w.uid = 'web' w.gid = 'web' # Cleanup the pid file (this is needed for processes running as a daemon) w.behavior(:clean_pid_file) # Conditions under which to start the process w.start_if do |start| start.condition(:process_running) do |c| c.interval = 5.seconds c.running = false end end # Conditions under which to restart the process w.restart_if do |restart| restart.condition(:memory_usage) do |c| c.above = 150.megabytes c.times = [3, 5] # 3 out of 5 intervals end restart.condition(:cpu_usage) do |c| c.above = 50.percent c.times = 5 end end w.lifecycle do |on| on.condition(:flapping) do |c| c.to_state = [:start, :restart] c.times = 5 c.within = 5.minute c.transition = :unmonitored c.retry_in = 10.minutes c.retry_times = 5 c.retry_within = 2.hours end end end Here is my Unicorn configuration: # Unicorn configuration file APP_ROOT = File.expand_path '../', File.dirname(__FILE__) worker_processes 8 preload_app true pid "#{APP_ROOT}/tmp/unicorn.pid" listen 8001 stderr_path "#{APP_ROOT}/log/unicorn.stderr.log" stdout_path "#{APP_ROOT}/log/unicorn.stdout.log" before_fork do |server, worker| old_pid = "#{APP_ROOT}/tmp/unicorn.pid.oldbin" if File.exists?(old_pid) && server.pid != old_pid begin Process.kill("QUIT", File.read(old_pid).to_i) rescue Errno::ENOENT, Errno::ESRCH # someone else did our job for us end end end I have checked God status logs but it appears CPU and Memory Usage are never out of bounds. I also have something to kill high memory workers, which can be found on the GitHub blog page here. When running a tail -f on the Unicorn logs I see some requests, but they're far and few between, when I was at around 60-100 a second before this trouble seemed to have arrived. This log also shows workers being reaped and started as expected. So my question is, how would I go about debugging this? What are the next steps I should be taking? I'm extremely baffled that the server will sometimes respond quickly, but at others time it's very slow, for long periods of time (which may or may not be peak traffic times). Any advice is much appreciated.

    Read the article

  • Debian Apache2 SSL Issues - Error code: ssl_error_rx_record_too_long

    - by Tone
    I'm setting up apache on Debian lenny and having issues with SSL. I've been through numberous tutorials and i had this working on Ubuntu server, but for the life of me can't get anywhere with Debian. Port 80 (http) works fine, but port 443 (https) gives me the following error (in firefox) - homeserver is my hostname and my dhcp assigned ip is 192.168.1.109. I have a feeling it's something with my config and not with the cert/key generation. An error occurred during a connection to homeserver. SSL received a record that exceeded the maximum permissible length. (Error code: ssl_error_rx_record_too_long) Anyone see any issues with the following config files? /etc/apache2/sites-available/default-ssl <IfModule mod_ssl.c> <VirtualHost *:443> ServerAdmin webmaster@localhost ServerName homeserver DocumentRoot /var/www/ <Directory /> Options FollowSymLinks AllowOverride None </Directory> <Directory /var/www/> Options Indexes FollowSymLinks MultiViews AllowOverride None Order allow,deny allow from all </Directory> ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/ <Directory "/usr/lib/cgi-bin"> AllowOverride None Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch Order allow,deny Allow from all </Directory> ErrorLog /var/log/apache2/error.log LogLevel warn CustomLog /var/log/apache2/ssl_access.log combined Alias /doc/ "/usr/share/doc/" <Directory "/usr/share/doc/"> Options Indexes MultiViews FollowSymLinks AllowOverride None Order deny,allow Deny from all Allow from 127.0.0.0/255.0.0.0 ::1/128 </Directory> SSLEngine on SSLCertificateFile /etc/ssl/certs/server.crt SSLCertificateKeyFile /etc/ssl/private/server.key SSLOptions +FakeBasicAuth +ExportCertData +StrictRequire <FilesMatch "\.(cgi|shtml|phtml|php)$"> SSLOptions +StdEnvVars </FilesMatch> <Directory /usr/lib/cgi-bin> SSLOptions +StdEnvVars </Directory> BrowserMatch ".*MSIE.*" \ nokeepalive ssl-unclean-shutdown \ downgrade-1.0 force-response-1.0 </VirtualHost> </IfModule> /etc/apache2/ports.conf NameVirtualHost *:80 Listen 80 Listen 443 #<IfModule mod_ssl.c> # SSL name based virtual hosts are not yet supported, therefore no # NameVirtualHost statement here #Listen 443 #</IfModule> /etc/hosts 127.0.0.1 localhost 127.0.0.1 homeserver #192.168.1.109 homeserver #tried this but it didn't work # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters ff02::3 ip6-allhosts

    Read the article

  • Old operational master still thinks it is the "one"

    - by Doug
    Hi there, I have a domain with 3 AD servers for now i'll just call them: AD01 (Win 2008 GC, Operations master) AD02 (Win 2008 GC) AD03 (Win 2003 GC) A couple of months there was some hardware issues with AD01 so the operations master, PDC and Infrastructure Master was moved to AD02. All machines where on while this was happening. AD01 (Win 2008 GC) AD02 (Win 2008 GC, Operations master) AD03 (Win 2003 GC) AD01 was then shutdown for a month. Upon starting this machine up with replaced hardware (NIC and RAID card) i now have a weird problem. AD01 Thinks it is operations master still in AD on the local box AD02 & AD03 Thinks AD02 is operations master in AD on both boxes When running DCDIAG on AD01 i get a number of issues (listed below) When running "dcdiag /test:advertising" on AD01: Doing primary tests Testing server: Default-First-Site-Name\AD01 Starting test: Advertising Warning: DsGetDcName returned information for \\ad02.domain.local, when we were trying to reach AD01. SERVER IS NOT RESPONDING or IS NOT CONSIDERED SUITABLE. ......................... AD01 failed test Advertising Running partition tests on : ForestDnsZones Running partition tests on : DomainDnsZones Running partition tests on : Schema Running partition tests on : Configuration Running partition tests on : domain Running enterprise tests on : domain.local When running "dcdiag" on AD01 i get the following errors (excerpt of the Final output): Testing server: Default-First-Site-Name\AD01 Starting test: Advertising Warning: DsGetDcName returned information for \\ad02.domain.local, when we were trying to reach AD01. SERVER IS NOT RESPONDING or IS NOT CONSIDERED SUITABLE. ......................... AD01 failed test Advertising Starting test: FrsEvent There are warning or error events within the last 24 hours after the SYSVOL has been shared. Failing SYSVOL replication problems may cause Group Policy problems. Starting test: NCSecDesc Error NT AUTHORITY\ENTERPRISE DOMAIN CONTROLLERS doesn't have Replicating Directory Changes In Filtered Set access rights for the naming context: DC=ForestDnsZones,DC=domain,DC=local Error NT AUTHORITY\ENTERPRISE DOMAIN CONTROLLERS doesn't have Replicating Directory Changes In Filtered Set access rights for the naming context: DC=DomainDnsZones,DC=domain,DC=local Starting test: Replications [Replications Check,Replications Check] Inbound replication is disabled. To correct, run "repadmin /options AD01 -DISABLE_INBOUND_REPL" [Replications Check,AD01] Outbound replication is disabled. To correct, run "repadmin /options AD01 -DISABLE_OUTBOUND_REPL" So the problem appeasr to be that when i moved the operations master, AD01 never got the memo, and now that it's started up, all the other AD servers don't think its the boss anymore when it trys to replicate etc. So i really need to manually update AD01 so that it knows who the operations master, instrastructure and PDC is - but i'm not having any luck I've been googling for nearly a day and all solutions lead to "the cake is a lie" Your ninja skills will be greatly appreciated

    Read the article

  • ActiveMQ Pure Master / Slave - Out of sync

    - by pico
    What i have : 1 master broker and 1 slave broker both in ActiveMQ 5.4.0 What i use : waitForSlave on master side and failover uri on slave side (in the master connector URI) What i want to do : I want to wait some delay (like 5 seconds) in case of a tiny network failures between master and slave before starting slave transpôrt connectors So i put this in slave config : <broker xmlns="http://activemq.apache.org/schema/core" brokerName="slave" dataDirectory="${activemq.base}/data" useJmx="true" persistent="true" populateJMSXUserID="true" masterConnectorURI="failover://(tcp://master:61616)?initialReconnectDelay=1000&amp;maxReconnectDelay=30000" shutdownOnMasterFailure="false" advisorySupport="false"> It seems to work but after a network hang between master and slave, the slave reconnect successfully and then the master logs a lot of : 2010-10-18 17:08:44,421 | ERROR | Slave Failed | org.apache.activemq.broker.ft.MasterBroker | ActiveMQ Task java.lang.IllegalStateException: Cannot lookup a connection that had not been registered: ID:master-1040-634226732611718750-0:0 at org.apache.activemq.broker.MapTransportConnectionStateRegister.lookupConnectionState(MapTransportConnectionStateRegister.java:93) at org.apache.activemq.broker.TransportConnection.lookupConnectionState(TransportConnection.java:1412) at org.apache.activemq.broker.TransportConnection.processRemoveConsumer(TransportConnection.java:561) at org.apache.activemq.command.RemoveInfo.visit(RemoveInfo.java:76) at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:309) at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:185) at org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116) at org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:69) at org.apache.activemq.transport.vm.VMTransport.iterate(VMTransport.java:218) at org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:98) at org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:36) On the slave side everything is fine. So after that, i've tried to stop the master to see if the slave is capable of turning master after these "network hangs". The master took long time to shutdown (10 seconds) and then some error message appears in slave logs : 2010-10-18 17:09:32,915 | WARN | Async error occurred: java.lang.IllegalStateException: Cannot lookup a connection that had not been registered: ID:master-1049-634226732657812500-0:3 | org.apache.activemq.broker.TransportConnection.Service | VMTransport: vm://slave#5 java.lang.IllegalStateException: Cannot lookup a connection that had not been registered: ID:master-1049-634226732657812500-0:3 at org.apache.activemq.broker.MapTransportConnectionStateRegister.lookupConnectionState(MapTransportConnectionStateRegister.java:93) at org.apache.activemq.broker.TransportConnection.lookupConnectionState(TransportConnection.java:1412) at org.apache.activemq.broker.TransportConnection.processRemoveSession(TransportConnection.java:600) at org.apache.activemq.command.RemoveInfo.visit(RemoveInfo.java:74) at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:309) at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:185) at org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116) at org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:69) at org.apache.activemq.transport.vm.VMTransport.iterate(VMTransport.java:218) at org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:98) at org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:36) Are they any ways to keep my kaha stores (they are individual stores) synchronised? The main problem is that the slave never turn master after a master failure, it stay block on this message : 2010-10-18 17:09:33,681 | WARN | Transport (master/172.21.60.61:61616) failed to tcp://master:61616 , attempting to automatically reconnect due to: java.net.SocketException: Software caused connection abort: socket write error | org.apache.activemq.transport.failover.FailoverTransport | ActiveMQ Transport: tcp://master/172.21.60.61:61616 I'm totally stuck with these syncs problems, any help welcome! Regards

    Read the article

  • Issues with Server 2012 using DFSR running on Hyper-V 2012

    - by Bryan
    We have a number of Server 2012 systems, all of which run virtualised on Hyper-V 2012 server. We are having problems with two such virtual instances, both of which are used as file servers, whereby they occasionally stop responding to requests to serve files to clients. After logging on to the server, attempts to shut it down gracefully fail (no error, it just fails to acknowledge a shutdown request). Recovery is a case of power cycling the server(s) from the Hyper-V console. These two servers don't server a large number of users (one serves no more than 6 users, and the other serves around 20 users), they are in the same domain, but on different physical hardware (and at different sites). They don't lock up at the same time. They both use DFSR to replicate a fairly large amount of data between themselves (200GB) over ADSL connections, this is working fine, and we have been using DFSR to do this on the previous two generations of server OS we have used (Server 2008 R2 and Server 2003 - both of which were physical installs however). Today, when one of the servers crashed, I noticed an entry in the event log, which looked similar to the following: Log Name: Application Source: ESENT Date: 27/11/2012 10:25:55 Event ID: 533 Task Category: General Level: Warning Keywords: Classic User: N/A Computer: HAL-FS-01.example.com Description: DFSRs (1500) \\.\E:\System Volume Information\DFSR\database_C8CC_101_CC00_EC0E\ dfsr.db: A request to write to the file "\\.\E:\System Volume Information\ DFSR\database_C8CC_101_CC00_EC0E\fsr.log" at offset 4423680 (0x0000000000438000) for 4096 (0x00001000) bytes has not completed for 36 second(s). This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem. When the server started up again, I went to find the event log entry to investigate further and found that the event log entry was no longer there (I assume it was in memory but failed to write to disk before the server was powered off, for the reason mentioned in the message). I found the above message by searching back further in the event log. Both of these virtual servers have their E: volumes fully allocated as opposed to dynamically expanding, and there are no other issues on any of the other virtual servers (which include server 2012, server 2008 R2 and Ubuntu 12.04 x64). There are no signs of IO, memory or CPU starvation on the host systems. I've used performance counters on the affected virtual servers to monitor memory usage (including non paged pool usage), as well as CPU and network utilisation, and none of these show any signs of trouble when the issue arises. I would have thought our configuration isn't that uncommon, so I'm wondering if anyone else has seen this, and managed to resolve the problem?

    Read the article

  • kill -9 doesn't work

    - by Daniel
    I have a server with 3 oracle instances on it, and the file system is nfs with netapp. After shutdown the databases, one process for each database doesn't quit for a long time. Each kill -i doesn't work. I tried to truss, pfile it, the command through error. And iostat shows there are lots of IO to the netapp server. So someone said the process was busy writing data to remote netapp server, and before the write complete, it won't quit. So what need to be done was just wait until all the IO was done. After wait for longer time (about 1.5 hours), the processes exit. So my question is: how can a process ignore the kill signal? As far as I know, if we kill -9, it will stop immediately. Do you encounter such situation kill -i doesn't kill the process right away? TEST7-stdby-phxdbnfs11$ ps -ef|grep dbw0 oracle 1469 25053 0 22:36:53 pts/1 0:00 grep dbw0 oracle 26795 1 0 21:55:23 ? 0:00 ora_dbw0_TEST7 oracle 1051 1 0 Apr 08 ? 3958:51 ora_dbw0_TEST2 oracle 471 1 0 Apr 08 ? 6391:43 ora_dbw0_TEST1 TEST7-stdby-phxdbnfs11$ kill -9 1051 TEST7-stdby-phxdbnfs11$ ps -ef|grep dbw0 oracle 1493 25053 0 22:37:07 pts/1 0:00 grep dbw0 oracle 26795 1 0 21:55:23 ? 0:00 ora_dbw0_TEST7 oracle 1051 1 0 Apr 08 ? 3958:51 ora_dbw0_TEST2 oracle 471 1 0 Apr 08 ? 6391:43 ora_dbw0_TEST1 TEST7-stdby-phxdbnfs11$ kill -9 471 TEST7-stdby-phxdbnfs11$ ps -ef|grep dbw0 oracle 26795 1 0 21:55:23 ? 0:00 ora_dbw0_TEST7 oracle 1051 1 0 Apr 08 ? 3958:51 ora_dbw0_TEST2 oracle 471 1 0 Apr 08 ? 6391:43 ora_dbw0_TEST1 oracle 1495 25053 0 22:37:22 pts/1 0:00 grep dbw0 TEST7-stdby-phxdbnfs11$ ps -ef|grep smon oracle 1524 25053 0 22:38:02 pts/1 0:00 grep smon TEST7-stdby-phxdbnfs11$ ps -ef|grep dbw0 oracle 1526 25053 0 22:38:06 pts/1 0:00 grep dbw0 oracle 26795 1 0 21:55:23 ? 0:00 ora_dbw0_TEST7 oracle 1051 1 0 Apr 08 ? 3958:51 ora_dbw0_TEST2 oracle 471 1 0 Apr 08 ? 6391:43 ora_dbw0_TEST1 TEST7-stdby-phxdbnfs11$ kill -9 1051 471 26795 TEST7-stdby-phxdbnfs11$ ps -ef|grep dbw0 oracle 1528 25053 0 22:38:19 pts/1 0:00 grep dbw0 oracle 26795 1 0 21:55:23 ? 0:00 ora_dbw0_TEST7 oracle 1051 1 0 Apr 08 ? 3958:51 ora_dbw0_TEST2 oracle 471 1 0 Apr 08 ? 6391:43 ora_dbw0_TEST1 TEST7-stdby-phxdbnfs11$ truss -p 26795 truss: unanticipated system error: 26795 TEST7-stdby-phxdbnfs11$ pfiles 26795 pfiles: unanticipated system error: 26795

    Read the article

  • Windows Vista: Networking can only connect "local only"

    - by Damien
    I am attempting to debug a problem on a Windows Vista laptop - not mine! Until just recently (last week or so), it was operating normally for about 4 years :) The problem is that I am having issues connecting to the local network (a basic wireless home router; more later) and the internet (via ADSL). This is both for wired [Broadcom chipset] and wireless [Intel chipset]. I will elaborate further later. To detail the network. I have three other clients (HTC phone, Ubuntu 12.04 desktop [wired] and Ubuntu 10.04 laptop [wireless]), all of whom are able to connect to the network and internet normally. A windows 7 virtual machine running on said desktop connects normally. I have tried two different wireless routers - Netgear DG834G and Netgear DGN3500. The same error mode is common to both. Updating the firmware to the latest on both routers does not help. Overall, it seems safe to say it's localised to the laptop in question. I do not have another Vista client to test with. The specific symptoms are as follows: When "connected", it says "Local Only", and says it cannot connect to the internet. This is true for both wired and wireless. It can get an IP address (192.168.0.5), and the router (192.168.0.1) reports that it can see the device. When I try to ping, I get the following results: ping 192.168.0.1 - (router) all packets lost ping 192.168.0.5 - (laptop's address) OK ping 192.168.0.4 - (desktop) all packets lost Pinging from the desktop to the problematic laptop results in "From 192.168.0.4 icmp_seq=1 Destination Host Unreachable" The most promising "fix" from trawling forums is KB928233 which does not work for me. The problem is persistent across reports (both full shutdown and hibernate) so it appears not to be sleep related. I am not a regular vista user, though I can fumble my way about a bit. Is there any other suggestions as to what I should do? Is there any further information I can provide?

    Read the article

< Previous Page | 56 57 58 59 60 61 62 63 64 65 66 67  | Next Page >