Search Results

Search found 416 results on 17 pages for 'hung'.

Page 1/17 | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Hadoop reduce task gets hung

    - by user806098
    I set up a hadoop cluster with 4 nodes, When running a map-reduce task, the map task finishes quickly, while the reduce task hangs at 27% percent. I checked the log, it's that the reduce task fails to fetch map output from map nodes. The job tracker log of master shows messages like this: 2011-06-27 19:55:14,748 INFO org.apache.hadoop.mapred.JobTracker: Adding task (REDUCE) 'attempt_201106271953_0001_r_000000_0' to tip task_201106271953_0001_r_000000, for tracker 'tracker_web30.bbn.com.cn:localhost/127.0.0.1:56476' And the name node log of master shows messages like this: 2011-06-27 14:00:52,898 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 54310, call register(DatanodeRegistration(202.106.199.39:50010, storageID=DS-1989397900-202.106.199.39-50010-1308723051262, infoPort=50075, ipcPort=50020)) from 192.168.225.19:16129: error: java.io.IOException: verifyNodeRegistration: unknown datanode 202.106.199.3 9:50010 However, neither the "web30.bbn.com.cn" or 202.106.199.39, 202.106.199.3 is the slave node. I think such ip/domains appear because hadoop fails to resolve a node(first in the Intranet DNS server), then it goes to a higher-level DNS server, later to the top, still fails, then the "junk" ip/domains are returned. But I checked my config, it goes like this: /etc/hosts: 127.0.0.1 localhost.localdomain localhost ::1 localhost6.localdomain6 localhost6 192.168.225.16 master 192.168.225.66 slave1 192.168.225.20 slave5 192.168.225.17 slave17 conf/core-site.xml: hadoop.tmp.dir /root/hadoop_tmp/hadoop_${user.name} fs.default.name hdfs://master:54310 io.sort.mb 1024 hdfs-site.xml: dfs.replication 3 masters: master slaves: master slave1 slave5 slave17 Also, all firewalls(iptables) are turned off, and ssh between each 2 nodes is ok. so I don't know where exact the error comes from. Please help. Thanks a lot.

    Read the article

  • Ubuntu 10.04 hung on unattended apt-get upgrade?

    - by hafichuk
    I'm looking into why our ubuntu web server hung this morning and see that there was some package upgrades a few hours prior to the problem. I was able to ssh into the system and get a snapshot from top: top - 08:13:54 up 210 days, 8:25, 2 users, load average: 433.30, 422.40, 375.70 Tasks: 1192 total, 381 running, 810 sleeping, 0 stopped, 1 zombie Cpu(s): 0.5%us, 6.1%sy, 0.0%ni, 93.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 49549772k total, 48518392k used, 1031380k free, 960152k buffers Swap: 11595768k total, 279368k used, 11316400k free, 39355664k cached This is a 16 processor system, so I would typically expect a load in the low teens. I tried to restart apache, which didn't work, and subsequently had to do a hard reboot to get it working again (which it is). One thing I found was that the server did an unattended package update. Is it possible that upgrading php or curl (which our web sites use) might have caused apache to stop responding? Here is the snip from the unattended-upgrades.log from this morning: 2012-09-18 06:48:30,076 INFO Initial blacklisted packages: 2012-09-18 06:48:30,076 INFO Starting unattended upgrades script 2012-09-18 06:48:30,077 INFO Allowed origins are: ["['Ubuntu', 'lucid-security']"] 2012-09-18 06:49:37,017 INFO Packages that are upgraded: gnupg-curl php5-dev linux-server dhcp3-common linux-libc-dev php5-curl gpgv gnupg linux-headers-server linux-image-server php5 php5-mysql php-pear php5-cli php5-common libapache2-mod-php5 dhcp3-client 2012-09-18 06:49:37,018 INFO Writing dpkg log to '/var/log/unattended-upgrades/unattended-upgrades-dpkg_2012-09-18_06:49:37.017909.log'

    Read the article

  • GPU hung when switching graphic card

    - by Lie Ryan
    I have a laptop (Dell Inspiron N4110) with a switchable graphic. $ lspci | grep VGA 00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) 01:00.0 VGA compatible controller: ATI Technologies Inc NI Whistler [AMD Radeon HD 6600M Series] (rev ff) Normally, my laptop starts with both graphic cards enabled, which caused the laptop to turn very hot and the fan to become very noisy. I have been using a small script to disable the Radeon card. For some time, I'm quite happy with this arrangement. However, I have been having some issues with the Intel card (IGD), the Intel card often randomly hang when running OpenGL apps; and so I want to give the Radeon card (DIS) another chance. I have never been able to switch to the Radeon card, but recently, I found out that if I do a "delayed switching" (DDIS): # echo "DDIS" > /sys/kernel/debug/vgaswitcheroo/switch root@lieryan-dell-ubuntu:/sys/kernel/debug/vgaswitcheroo# cat switch 0:IGD:+:Pwr:0000:00:02.0 1:DIS: :Pwr:0000:01:00.0 then I logoff (i.e. to restart X), the screen switch to pseudo-tty and then it stuck there freezing. At this situation, mouse and keyboard stops working so I can't switch to another ptty. I tried ssh-ing from another computer to salvage logs (dmesg at that point) and whatnot; I found out that when freezing, the active graphic card is the AMD card: -- this is from ssh -- # cat switch 0:IGD: :Off:0000:00:02.0 1:DIS:+:Pwr:0000:01:00.0 but the GPU is apparently hung, looking at dmesg gives: ... [ 1411.649974] vga_switcheroo: client 0 refused switch [ 1411.649985] vga_switcheroo: setting delayed switch to client 1 [ 1423.911759] vga_switcheroo: processing delayed switch to 1 [ 1424.006564] fbcon: Remapping primary device, fb1, to tty 1-63 [ 1424.006799] i915: switched off [ 1424.840351] [drm:drm_mode_getfb] *ERROR* invalid framebuffer id [ 1425.718088] [drm:drm_mode_getfb] *ERROR* invalid framebuffer id [ 1426.622377] [drm:drm_mode_getfb] *ERROR* invalid framebuffer id [ 1427.355683] [drm:drm_mode_getfb] *ERROR* invalid framebuffer id [ 1428.193549] [drm:drm_mode_getfb] *ERROR* invalid framebuffer id ... the invalid framebuffer id error is repeated for many times over ... I were able to successfully recover by switching back to the Intel card and restarting X from ssh; indicating that only the Radeon card has problems switching. System info: $ uname -a Linux lieryan-dell-ubuntu 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:28:43 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux $ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 11.10 Release: 11.10 Codename: oneiric The laptop also do not have the option to set graphic card at BIOS and the proprietary driver, fglrx, also have never worked; when I installed it through jockey ("Additional Drivers"), glxinfo showed that it still being rendered by Mesa, the /sys/kernel/debug/vgaswitcheroo directory has gone missing, and the driver crashes with a traceback if I use xorg.conf to tell X to use fglrx. Anyone had any idea if it is possible to use this AMD card either with the radeon or the fglrx driver? logs: dmesg

    Read the article

  • SQL 2005 Transaction Rollback Hung–unresolved deadlock

    - by steveh99999
    Encountered an interesting issue recently with a SQL 2005 sp3 Enterprise Edition system. Every weekend, a full database reindex was being run on this system – normally this took around one and a half hours. Then, one weekend, the job ran for over 17 hours  - and had yet to complete... At this point, DBA cancelled the job. Job status is now cancelled – issue over…   However, cancelling the job had not killed the reindex transaction – DBCC OPENTRAN was still showing the transaction being open. The oldest open transaction in the database was now over 17 hours old.  Consequently, transaction log % used growing dramatically and locks still being held in the database... Further attempts to kill the transaction did nothing. ie we had a transaction which could not be killed. In sysprocesses, it was apparent the SPID was in rollback status, but the spid was not accumulating CPU or IO. Was the SPID stuck ? On examination of the SQL errorlog – shortly after the reindex had started, a whole bunch of deadlock output had been produced by trace flag 1222. Then this :- spid5s      ***Stack Dump being sent to   xxxxxxx\SQLDump0042.txt spid5s      * ******************************************************************************* spid5s      * spid5s      * BEGIN STACK DUMP: spid5s      *   12/05/10 01:04:47 spid 5 spid5s      * spid5s      * Unresolved deadlock spid5s      * spid5s      *   spid5s      * ******************************************************************************* spid5s      * ------------------------------------------------------------------------------- spid5s      * Short Stack Dump spid5s      Stack Signature for the dump is 0x000001D7 spid5s      External dump process return code 0x20000001. Unresolved deadlock – don’t think I’ve ever seen one of these before…. A quick call to Microsoft support confirmed the following bug had been hit :- http://support.microsoft.com/kb/961479 So, only option to get rid of the hung spid – to restart SQL Server… Fortunately SQL Server restarted without any issues. I was pleasantly surprised to see that recovery on this particular database was fast. However, restarting SQL Server to fix an issue is not something I would normally rush to do... Short term fix – the reindex was changed to use MAXDOP of 1. Longer term fix will be to apply the correct CU, or wait for SQL 2005 sp 4 ?? This should be released any day soon I hope..

    Read the article

  • Installing netflix, terminal is getting hung up

    - by Twiggins
    Just got Ubuntu up and running, I am attempting to follow the steps laid our here to install Netflix via Wine. I have done the 3 terminal commands, and I believe the package has downloaded. The commands: sudo apt-add-repository ppa:ehoover/compholio sudo apt-get update sudo apt-get install netflix-desktop After this the terminal spits out a user agreement, and I cannot get past this. Is there some way to minimize the window within the terminal? This is what I see: . Netflix is not yet in my dash. It also appears that a variety of other updates are being held up, I assume, because my terminal is midthought.

    Read the article

  • Keyboard layout hung up

    - by Erlend
    I have a problem with the keyboard layout. I use Ubuntu 12.04. I configured the layout so that I could interchange between a Norwegian and Hebrew keyboard. The system language of my Ubuntu is Norwegian and both my user name and password are written in latin characters. I had been typing Hebrew for some while, then I left the computer for a break. When I came back, I had to unlock the account but then the keyboard layout was locked in a Hebrew keyboard layout and I could not switch back to Norwegian. I tried to reboot the machine and to turn it off and on but not matter what I did I could only type Hebrew letters. So it was impossible for me to login with my own account which had a password written with latin characters. Finally I gave up and installed Ubuntu from scratch. Now I would like to be able to change between Hebrew and Norwegian keyboard layouts but I don't dare to do it before I know what went wrong. Any solutions?

    Read the article

  • Network Interface Lost Functionality after Firewall Installation Hung

    - by Sadeq Dousti
    I tried to install Agnitum Outpost firewall, but the setup hung while installing network drivers. Oddly, the NIC properties shows no connect string whatsoever, nor any services: http://pic-ups.com/images/1fjf.png Device Manager shows problematic drivers as well: www.pic-ups.com/images/2aqa.png Any suggestions? PS: I'm using Windows XP SP3. PS2: I applied instructions below, but all were in vein: www.agnitum.com/support/kb/article.php?id=1000041 www.agnitum.com/support/kb/article.php?id=1000159

    Read the article

  • Outlook 2010 hung on updating outlook.com email account

    - by warren
    For the past week, through restarts of Outlook, and even a bounce of my machine, Outlook has been hung synchronizing my outlook.com account, after having moved messages from a different email inbox into the outlook.com inbox. The old account does not have the moved message any more (those synched-out correctly). The new account does not have the moved messages when acessing via the webui - ie, they are stuck in just Outlook. My problem? I need to reimage this laptop for a friend I need Outlook to finish syncing all those messages out to the hosted email How can I force this to happen?

    Read the article

  • Apache service hung and incoming requests not accepted

    - by Gnanam
    Hi, My production server is running Apache v2.2.4 with mod_mono v1.2.4 on CentOS release 5.2 (Final). Suddenly, Apache service hung during normal usage time (approx. 1 pm EDT). Traffic is not too high at this time. This is the first time we're noticing this kind of behaviour in our server. I noticed from access log that even subsequent requests are also not received, even though there were incoming requests. I then manually tried to invoke my application call from web browser, it never returned successfully but it was still loading. I found no unusual behaviour/activity in: 1) Apache access_log and error_log 2) No kernel level errors found in /var/log/messages I've no other option but ended up restarting Apache service. Any idea on what would cause Apache to hang and thereby not allowing subsequent incoming requests? How do I debug/diagnose when this happens next time? Experts advice/recommendation on this are highly appreciated.

    Read the article

  • server 2008 sp2 hung not responding with kvm reset as well

    - by dasko
    using server 2008 sp2 64bit os standard edition with all updates from ms site. server was hung this morning, resetting kvm did nothing, plugging in another usb mouse on the back did not let the mouse light up red on it's optic end. other machines on the kvm worked fine including the mouse. server is rack mounted 4u supermicro systems or superserver. had to hard power off and restart. any thoughts? i burnt this system in well for a couple of weeks before deploying so it is kind of odd that this happened. any help is greatly appreciated, or if anyone can suggest software to install that can maybe send out the email when something like that goes down. i looked for the minidump but nothing. nothing in the event viewers either. gd

    Read the article

  • Server hung with "blocked for more than 120 seconds", diskless

    - by alterpub
    I have server which hung every 2-5 days. dmesg show following situation: kernel: [490894.231753] INFO: task munin-html:10187 blocked for more than 120 seconds. kernel: [490894.231799] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kernel: [490894.231843] munin-html D 0000000000000000 0 10187 9796 0x00000000 kernel: [490894.231878] ffff88063174b968 0000000000000082 ffff88063174b8d8 0000000000015b80 kernel: [490894.231930] ffff88063174bfd8 0000000000015b80 ffff88063174bfd8 ffff88062fe644d0 kernel: [490894.231982] 0000000000015b80 0000000000015b80 ffff88063174bfd8 0000000000015b80 kernel: [490894.232033] Call Trace: kernel: [490894.232059] [<ffffffff8117fce0>] ? sync_buffer+0x0/0x50 kernel: [490894.232089] [<ffffffff815a1f13>] io_schedule+0x73/0xc0 kernel: [490894.232115] [<ffffffff8117fd25>] sync_buffer+0x45/0x50 kernel: [490894.232143] [<ffffffff815a258f>] __wait_on_bit+0x5f/0x90 kernel: [490894.232170] [<ffffffff8117fce0>] ? sync_buffer+0x0/0x50 kernel: [490894.232197] [<ffffffff815a2638>] out_of_line_wait_on_bit+0x78/0x90 kernel: [490894.232227] [<ffffffff81080250>] ? wake_bit_function+0x0/0x40 kernel: [490894.232255] [<ffffffff8117fcd6>] __wait_on_buffer+0x26/0x30 kernel: [490894.232288] [<ffffffffa00131be>] squashfs_read_data+0x1be/0x520 [squashfs] kernel: [490894.232320] [<ffffffff8114f0f1>] ? __mem_cgroup_try_charge+0x71/0x450 kernel: [490894.232350] [<ffffffffa0013963>] squashfs_cache_get+0x1c3/0x320 [squashfs] kernel: [490894.232381] [<ffffffffa00136eb>] ? squashfs_copy_data+0x10b/0x130 [squashfs] kernel: [490894.232426] [<ffffffff815a3dbe>] ? _raw_spin_lock+0xe/0x20 kernel: [490894.232454] [<ffffffffa0013b68>] ? squashfs_read_metadata+0x48/0xf0 [squashfs] kernel: [490894.232499] [<ffffffffa0013ae1>] squashfs_get_datablock+0x21/0x30 [squashfs] kernel: [490894.232544] [<ffffffffa0015026>] squashfs_readpage+0x436/0x4a0 [squashfs] kernel: [490894.232575] [<ffffffff8111a375>] ? __inc_zone_page_state+0x35/0x40 kernel: [490894.232606] [<ffffffff8110d072>] __do_page_cache_readahead+0x172/0x210 kernel: [490894.232636] [<ffffffff8110d131>] ra_submit+0x21/0x30 kernel: [490894.232662] [<ffffffff811045f3>] filemap_fault+0x3f3/0x450 kernel: [490894.232691] [<ffffffff812bd156>] ? prio_tree_insert+0x256/0x2b0 kernel: [490894.232726] [<ffffffffa009225d>] aufs_fault+0x11d/0x170 [aufs] kernel: [490894.232755] [<ffffffff8111f6d4>] __do_fault+0x54/0x560 kernel: [490894.232782] [<ffffffff81122f39>] handle_mm_fault+0x1b9/0x440 kernel: [490894.232811] [<ffffffff811286f5>] ? do_mmap_pgoff+0x335/0x380 kernel: [490894.232840] [<ffffffff815a7af5>] do_page_fault+0x125/0x350 kernel: [490894.232867] [<ffffffff815a4675>] page_fault+0x25/0x30 Os info cat /etc/issue.net Ubuntu 10.04.2 LTS uname -a Linux Shard1Host3 2.6.35-32-server #68~lucid1-Ubuntu SMP Wed Mar 28 18:33:00 UTC 2012 x86_64 GNU/Linux This system load via ltsp and hasn't harddrives also it has a lot of memory 24Gb. free -m total used free shared buffers cached Mem: 24152 17090 7061 0 50 494 -/+ buffers/cache: 16545 7607 Swap: 0 0 0 I put vmstat info here but I think it won't give results after reboot vmstat 1 30 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 7231156 52196 506400 0 0 0 0 172 143 7 0 92 0 1 0 0 7231024 52196 506400 0 0 0 0 7859 16233 5 0 94 0 0 0 0 7231024 52196 506400 0 0 0 0 7870 16446 2 0 98 0 0 0 0 7230900 52196 506400 0 0 0 0 7308 15661 5 0 95 0 0 0 0 7231100 52196 506400 0 0 0 0 7960 16543 6 0 94 0 0 0 0 7231100 52196 506400 0 0 0 0 7542 16047 5 1 94 0 3 0 0 7231100 52196 506400 0 0 0 0 7709 16621 3 0 96 0 0 0 0 7231220 52196 506400 0 0 0 0 7857 16552 4 0 96 0 0 0 0 7231220 52196 506400 0 0 0 0 7192 15491 6 0 94 0 0 0 0 7231220 52196 506400 0 0 0 0 7423 15792 5 1 94 0 1 0 0 7231260 52196 506404 0 0 0 0 7686 16296 2 0 98 0 0 0 0 7231260 52196 506404 0 0 0 0 6976 15183 5 0 95 0 0 0 0 7231260 52196 506404 0 0 0 0 7303 15600 4 0 95 0 0 0 0 7231320 52196 506404 0 0 0 0 7967 16241 1 0 98 0 0 0 0 7231444 52196 506404 0 0 0 0 6948 15113 6 0 94 0 0 0 0 7231444 52196 506404 0 0 0 0 7931 16181 6 0 94 0 1 0 0 7231516 52196 506404 0 0 0 0 7715 15829 6 0 94 0 0 0 0 7231516 52196 506404 0 0 0 0 7771 16036 2 0 97 0 0 0 0 7231268 52196 506404 0 0 0 0 7782 16202 6 0 94 0 1 0 0 7231212 52196 506404 0 0 0 0 7457 15622 4 0 96 0 0 0 0 7231212 52196 506404 0 0 0 0 7573 16045 2 0 98 0 2 0 0 7231216 52196 506404 0 0 0 0 7689 16076 6 0 94 0 0 0 0 7231424 52196 506404 0 0 0 0 7429 15650 4 0 95 0 3 0 0 7231424 52196 506404 0 0 0 0 7534 16168 3 0 97 0 1 0 0 7230548 52196 506404 0 0 0 0 8559 15926 7 1 92 0 0 0 0 7230672 52196 506404 0 0 0 0 7720 15905 2 0 98 0 0 0 0 7230548 52196 506404 0 0 0 0 7677 16313 5 0 95 0 1 0 0 7230676 52196 506404 0 0 0 0 7209 15432 5 0 95 0 0 0 0 7230800 52196 506404 0 0 0 0 7522 15861 2 0 98 0 0 0 0 7230552 52196 506404 0 0 0 0 7760 16661 5 0 95 0 In the munin(monitoring system) graphs I see(before server hung): Disk(nbd0) IOs per device: read: 289m but avg by week 2.09m Disk(nbd0) throughput per device: read: 4.73k but avg by week 108.76 Disk(nbd0) utilization per device: 100% but avg by week 1.2% Eth0 traffic was low: in/out only 2Mbps Number of threads increased to 566 usually 392 Fork rate 1.08 but usually 2.82 VMStat(processes state) Increased to 17.77(from 0 as far as I could see in the graph) CPU usage iowait 880.46% when usually 6.67% It'll be great if somebody help me to understand what's up.

    Read the article

  • MapReduce job is hung after 1 of 5 reducers completed on single-node environment

    - by Marboni
    I have only one Data Node on my dev environment on EC2. I ran heavy MR job and in 6 hours noticed that 100% of mappers and 20% of reducers finished (1 of reducer shows 100% competition, other ones - 0%). Looks like job is hung between 2 reducer runs. I don't see any errors in log files. What it can be? P.S. Last logs of successfully finished reducer: 2012-11-09 11:29:21,576 INFO org.apache.hadoop.mapred.Task: Task:attempt_201211090523_0004_r_000000_0 is done. And is in the process of commiting 2012-11-09 11:29:22,692 INFO org.apache.hadoop.mapred.Task: Task attempt_201211090523_0004_r_000000_0 is allowed to commit now 2012-11-09 11:29:22,719 INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_201211090523_0004_r_000000_0' to /data/output/1352457275873/20121109-053433-common 2012-11-09 11:29:22,721 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201211090523_0004_r_000000_0' done. 2012-11-09 11:29:22,725 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1

    Read the article

  • kill a hung mount process

    - by John P
    I have a virtual machine drive that ran out of space, so I shutdown the VM, extended the volume using lvextend. After resizing the partition (ext3), I ran e2fsck on it, and it found and corrected errors. Unfortunately, when I ran efsck one more time, there were more errors that had to be fixed. I went through 3 rounds of e2fsck before I decided to try mounting it to clean up some space manually. I tried mounting it, but the mount process hung. I tried to "kill -9" the mount process, but that did not kill it. I killed the parent process, but that did not kill it either. Any ideas on how to kill a rogue mount process? Some evidence: ps -l 13292 F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD 4 R 0 13292 1 99 85 0 - 17964 - ? 11:27 mount /dev/mapper/xen7-123p3 /tmp/p3/ lsof -p 13292 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME mount 13292 root cwd DIR 9,2 4096 25264129 /root mount 13292 root rtd DIR 9,2 4096 2 / mount 13292 root txt REG 9,2 61656 2916434 /bin/mount mount 13292 root mem REG 9,2 144776 31457282 /lib64/ld-2.5.so mount 13292 root mem REG 9,2 1718232 31457284 /lib64/libc-2.5.so mount 13292 root mem REG 9,2 23360 31457291 /lib64/libdl-2.5.so mount 13292 root mem REG 9,2 43808 31457783 /lib64/libblkid.so.1.0 mount 13292 root mem REG 9,2 247496 31457331 /lib64/libsepol.so.1 mount 13292 root mem REG 9,2 95464 31457337 /lib64/libselinux.so.1 mount 13292 root mem REG 9,2 154640 31457491 /lib64/libdevmapper.so.1.02 mount 13292 root mem REG 9,2 17936 31457472 /lib64/libuuid.so.1.2 mount 13292 root mem REG 9,2 56438208 12684878 /usr/lib/locale/locale-archive mount 13292 root 0u CHR 136,11 0t0 13 /dev/pts/11 (deleted) mount 13292 root 1u CHR 136,11 0t0 13 /dev/pts/11 (deleted) mount 13292 root 2u CHR 136,11 0t0 13 /dev/pts/11 (deleted) umount -f /tmp/p3/ umount2: Invalid argument umount: /tmp/p3/: not mounted

    Read the article

  • How can I keep gnu screen from becoming unresponsive after losing my SSH connection?

    - by Mikey
    I use a VPN tunnel to connect to my work network and then SSH to connect to my work PC running cygwin. Once logged in I can attach to a screen session and everything works great. Now, after a while, I walk away from my computer and sooner or later, the VPN tunnel times out. The SSH connection on each end eventually times out and then I eventually come back to my computer to do some work. Theoretically, this should be a simple matter of just restarting the VPN, reconnecting via SSH, and then running "screen -r -d". However apparently when the sshd daemon times out on the cygwin PC, it leaves the screen session in some kind of hung state. I can reproduce a similar hung state by clicking the close box on a cygwin bash shell window while it's running a screen session. Is there any way to get the screen session to recover once this has happened, so that I don't lose anything?

    Read the article

  • WordPress website getting hung up for 10-15 seconds

    - by synergy989
    Problem I have a website (which loads fine): http://testupg.videve.com/ Go to the blog section (and it begins to load, gets hung up, then loads): http://testupg.videve.com/blog/ It's all one WordPress install. The only pages that are getting hung up are the blog page itself and any article page. The video pages load fine etc. What Have I done and noticed. If I disable all plugins, the blog page loads fine. I narrowed it down to DisplayBuddy plugins (Featured posts, Carousel, Slider)...as soon as I enable any single one of them, the site loads slow. The thing that doesn't make sense is, I disabled the sidebar (deleted it from the template) and the article page still loaded slow and it has ZERO instances of this plugin. I disable the plugin and the article page loads fine. How on earth can this be? I am hoping this case above can give just a little bit of insight! One other thing. The video pages use a custom taxonomy so I am wondering if its something to do with the default wordpress taxonomy. Any help is GREATLY appreciated, I have been at this thing for hours and it's time to call in some support. Cheers

    Read the article

  • Hung JVM consuming 100% CPU

    - by Bogdan
    We have a JAVA server running on Sun JRE 6u20 on Linux 32-bit (CentOS). We use the Server Hotspot with CMS collector with the following options (I've only provided the relevant ones): -Xmx896m -Xss128k -XX:NewSize=384M -XX:MaxPermSize=96m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC Sometimes, after running for a while, the JVM seems to slip into a hung state, whereby even though we don't make any requests to the application, the CPU continues to spin at 100% (we have 8 logical CPUs, so it looks like only one CPU does the spinning). In this state the JVM doesn't respond to SIGHUP signals (kill -3) and we can't connect to it normally with jstack. We CAN connect with "jstack -F", but the output is dodgy (we can see lots of NullPointerExceptions from JStack apparently because it wasn't able to 'walk' some stacks). So the "jstack -F" output seems to be useless. We have run a stack dump from "gdb" though, and we were able to match the thread id that spins the CPU (we found that using "top" with a per-thread view - "H" option) with a thread stack that appears in the gdb result and this is how it looks like: Thread 443 (Thread 0x7e5b90 (LWP 26310)): #0 0x0115ebd3 in CompactibleFreeListSpace::block_size(HeapWord const*) const () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #1 0x01160ff9 in CompactibleFreeListSpace::prepare_for_compaction(CompactPoint*) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #2 0x0123456c in Generation::prepare_for_compaction(CompactPoint*) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #3 0x01229b2c in GenCollectedHeap::prepare_for_compaction() () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #4 0x0122a7fc in GenMarkSweep::invoke_at_safepoint(int, ReferenceProcessor*, bool) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #5 0x01186024 in CMSCollector::do_compaction_work(bool) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #6 0x011859ee in CMSCollector::acquire_control_and_collect(bool, bool) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #7 0x01185705 in ConcurrentMarkSweepGeneration::collect(bool, bool, unsigned int, bool) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #8 0x01227f53 in GenCollectedHeap::do_collection(bool, bool, unsigned int, bool, int) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #9 0x0115c7b5 in GenCollectorPolicy::satisfy_failed_allocation(unsigned int, bool) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #10 0x0122859c in GenCollectedHeap::satisfy_failed_allocation(unsigned int, bool) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #11 0x0158a8ce in VM_GenCollectForAllocation::doit() () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #12 0x015987e6 in VM_Operation::evaluate() () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #13 0x01597c93 in VMThread::evaluate_operation(VM_Operation*) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #14 0x01597f0f in VMThread::loop() () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #15 0x015979f0 in VMThread::run() () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #16 0x0145c24e in java_start(Thread*) () from /usr/java/jdk1.6.0_20/jre/lib/i386/server/libjvm.so #17 0x00ccd46b in start_thread () from /lib/libpthread.so.0 #18 0x00bc2dbe in clone () from /lib/libc.so.6 It seems that a JVM thread is spinning while doing some CMS related work. We have checked the memory usage on the box, there seems to be enough memory available and the system is not swapping. Has anyone come across such a situation? Does it look like a JVM bug? UPDATE I've obtained some more information about this problem (it happened again on a server that has been running for more than 7 days). When the JVM entered the "hung" state it stayed like that for 2 hours until the server was manually restarted. We have obtained a core dump of the process and the gc log. We tried to get a heap dump as well, but "jmap" failed. We tried to use jmap -F but then only a 4Mb file was written before the program aborted with an exception (something about the a memory location not being accessible). So far I think the most interesting information comes from the gc log. It seems that the GC logging stopped as well (possibly at the time when the VM thread went into the long loop): 657501.199: [Full GC (System) 657501.199: [CMS: 400352K->313412K(524288K), 2.4024120 secs] 660634K->313412K(878208K), [CMS Perm : 29455K->29320K(68568K)], 2.4026470 secs] [Times: user=2.39 sys=0.01, real=2.40 secs] 657513.941: [GC 657513.941: [ParNew: 314624K->13999K(353920K), 0.0228180 secs] 628036K->327412K(878208K), 0.0230510 secs] [Times: user=0.08 sys=0.00, real=0.02 secs] 657523.772: [GC 657523.772: [ParNew: 328623K->17110K(353920K), 0.0244910 secs] 642036K->330523K(878208K), 0.0247140 secs] [Times: user=0.08 sys=0.00, real=0.02 secs] 657535.473: [GC 657535.473: [ParNew: 331734K->20282K(353920K), 0.0259480 secs] 645147K->333695K(878208K), 0.0261670 secs] [Times: user=0.11 sys=0.00, real=0.02 secs] .... .... 688346.765: [GC [1 CMS-initial-mark: 485248K(524288K)] 515694K(878208K), 0.0343730 secs] [Times: user=0.03 sys=0.00, real=0.04 secs] 688346.800: [CMS-concurrent-mark-start] 688347.964: [CMS-concurrent-mark: 1.083/1.164 secs] [Times: user=2.52 sys=0.09, real=1.16 secs] 688347.964: [CMS-concurrent-preclean-start] 688347.969: [CMS-concurrent-preclean: 0.004/0.005 secs] [Times: user=0.00 sys=0.01, real=0.01 secs] 688347.969: [CMS-concurrent-abortable-preclean-start] CMS: abort preclean due to time 688352.986: [CMS-concurrent-abortable-preclean: 2.351/5.017 secs] [Times: user=3.83 sys=0.38, real=5.01 secs] 688352.987: [GC[YG occupancy: 297806 K (353920 K)]688352.987: [Rescan (parallel) , 0.1815250 secs]688353.169: [weak refs processing, 0.0312660 secs] [1 CMS-remark: 485248K(524288K)] 783055K(878208K), 0.2131580 secs] [Times: user=1.13 sys =0.00, real=0.22 secs] 688353.201: [CMS-concurrent-sweep-start] 688353.903: [CMS-concurrent-sweep: 0.660/0.702 secs] [Times: user=0.91 sys=0.07, real=0.70 secs] 688353.903: [CMS-concurrent-reset-start] 688353.912: [CMS-concurrent-reset: 0.008/0.008 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 688354.243: [GC 688354.243: [ParNew: 344928K->30151K(353920K), 0.0305020 secs] 681955K->368044K(878208K), 0.0308880 secs] [Times: user=0.15 sys=0.00, real=0.03 secs] .... .... 688943.029: [GC 688943.029: [ParNew: 336531K->17143K(353920K), 0.0237360 secs] 813250K->494327K(878208K), 0.0241260 secs] [Times: user=0.10 sys=0.00, real=0.03 secs] 688950.620: [GC 688950.620: [ParNew: 331767K->22442K(353920K), 0.0344110 secs] 808951K->499996K(878208K), 0.0347690 secs] [Times: user=0.11 sys=0.00, real=0.04 secs] 688956.596: [GC 688956.596: [ParNew: 337064K->37809K(353920K), 0.0488170 secs] 814618K->515896K(878208K), 0.0491550 secs] [Times: user=0.18 sys=0.04, real=0.05 secs] 688961.470: [GC 688961.471: [ParNew (promotion failed): 352433K->332183K(353920K), 0.1862520 secs]688961.657: [CMS I suspect this problem has something to do with the last line in the log (I've added some "...." in order to skip some lines that were not interesting). The fact that the server stayed in the hung state for 2 hours (probably trying to GC and compact the old generation) seems quite strange to me. Also, the gc log stops suddenly with that message and nothing else gets printed any more, probably because the VM Thread gets into some sort of infinite loop (or something that takes 2+ hours).

    Read the article

  • Ubuntu hardy to intrepid upgrade hung on starting bluetooth

    - by srboisvert
    I have no bluetooth. Preliminary googling indicates that is probably an issue with some usb devices. I had an external drive, a mouse and a network dongle attached. It is just stalled during the Installing the Upgrades phase - the last commands were "Creating device nodes" Cancel will leave the system in a broken state. What next?

    Read the article

  • MySQL replication hung after slave goes offline and comes back online again

    - by Ed Manet
    I have a master server and several slave servers replicating a single database. I am using in MySQL 5.0 in SLES 11. During fault tolerance testing I found that when the slave's network connection is broken (cable un-plugged) and then restored, replication hangs. It shows no errors and the slave appears to be running but the Read_Master_Log_Pos and Exec_Master_Log_Pos values do not match the log postion on the master server. The Slave_IO_State is "Waiting for master to send event". The Slave_IO_Running and Slave_SQL_Running values are both are "Yes". The Master_Log_File and Relay_Master_Log_File match. If I stop and start the slave or restart the mysql daemon, replication starts working again. Any ideas on what I can do about this?

    Read the article

  • hung up troubleshooting packet discards

    - by Chris Satola
    I realize my question is generic, but hopefully someone may have some guidance for me. My network consists of Cisco switches. I am seeing a significant amount (upwards of millions of packets per day) transmit drops between two switches. One being a 3750 and the other a 3560. The peak throughput of this link is only upper 400Mbps, so it shouldn't be a bandwidth issue. At this point, I am sort of clueless where to look or what tools I can use to determine what packets are dropping and why. I can setup a SPAN port on that link and wireshark it, but I don't know if that could tell me anything. Does anyone have any suggestions? Thanks in advance.

    Read the article

  • Hung Java JVM failing to respond to kill -3

    - by Hans
    I have a Java VM that is hanging "randomly". I quote the randomly bit, because there is obviously a reason that the VM is hanging, but the hang does not occur periodically. We have the same software running in different customer environments and in those environments the JVM is not hanging. In the process of attempting to troubleshoot the hang the process exists with zero CPU utilization. I then attempt to execute kill -3 and the kill command hangs. No JVM Thread Dump is produced. I have spent time instrumenting the code to periodically log the thread stack traces hoping to catch the JVM in a state that would indicate where the issue lies, but so far this attempt has not born much fruit. Unfortunately I have not been able to reproduce this issue in my lab environment so I am limited by what can be done at the Customer site. The OS's in question are Red Hat Enterprise 5.4 and SUSE 10 running java version 1.6.0_05-b13 Has anyone had this problem? Any ideas on why kill -3 is failing to produce a Java Thread Dump? Thanks!

    Read the article

  • Git fatal: remote end hung up

    - by Bill
    So I thought I had finally got everything setup on Windows ... then ran into this issue. Current setup URL: ssh://user@host:port/myapp.git Already run Putty - and can connect using valid .ppk keys through the ~/.ssh/authorized_keys direct. In Git and TortoiseGIT - I set both to use "plink.exe". Putty works fine - no issues - but when I run that URL into bash I get for a git clone (url) fatal: the remote end hung up expectedly In a cygwin bash terminal - running "ssh user@host" - works no probs at all. Anyone suggest anything?

    Read the article

  • python: can't terminate a thread hung in socket.recvfrom() call

    - by Dihlofos
    Hello, everyone I cannot get a way to terminate a thread that is hung in a socket.recvfrom() call. For example, ctrl+c that should trigger KeyboardInterrupt exception can't be caught. Here is a script I've used for testing: from socket import * from threading import Thread from sys import exit class TestThread(Thread): def __init__(self,host="localhost",port=9999): self.sock = socket(AF_INET,SOCK_DGRAM) self.sock.bind((host,port)) super(TestThread,self).__init__() def run(self): while True: try: recv_data,addr = self.sock.recvfrom(1024) except (KeyboardInterrupt, SystemExit): sys.exit() if __name__ == "__main__": server_thread = TestThread() server_thread.start() while True: pass The main thread (the one that executes infinite loop) exits. However the thread that I explicitly create, keeps hanging in recvfrom(). Please, help me resolve this.

    Read the article

  • How can I keep gnu screen from becoming unresponsive after losing my SSH connection?

    - by Mikey
    I use a VPN tunnel to connect to my work network and then SSH to connect to my work PC running cygwin. Once logged in I can attach to a screen session and everything works great. Now, after a while, I walk away from my computer and sooner or later, the VPN tunnel times out. The SSH connection on each end eventually times out and then I eventually come back to my computer to do some work. Theoretically, this should be a simple matter of just restarting the VPN, reconnecting via SSH, and then running "screen -r -d". However apparently when the sshd daemon times out on the cygwin PC, it leaves the screen session in some kind of hung state. I can reproduce a similar hung state by clicking the close box on a cygwin bash shell window while it's running a screen session. Is there any way to get the screen session to recover once this has happened, so that I don't lose anything?

    Read the article

  • SQL Server Full-Text Search: Hung processes with MSSEARCH wait type

    - by CheeseInPosition
    We have a SQL Server 2005 SP2 machine running a large number of databases, all of which contain full-text catalogs. Whenever we try to drop one of these databases or rebuild a full-text index, the drop or rebuild process hangs indefinitely with a MSSEARCH wait type. The process can’t be killed, and a server reboot is required to get things running again. Based on a Microsoft forums post[1], it appears that the problem might be an improperly removed full-text catalog. Can anyone recommend a way to determine which catalog is causing the problem, without having to remove all of them? [1] [http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2681739&SiteID=1] “Yes we did have full text catalogues in the database, but since I had disabled full text search for the database, and disabled msftesql, I didn't suspect them. I got however an article from Microsoft support, showing me how I could test for catalogues not properly removed. So I discovered that there still existed an old catalogue, which I ,after and only after re-enabling full text search, were able to delete, since then my backup has worked”

    Read the article

1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >