iotop - Page 2 - Developer IT

Per process I/O accounting on AIX

- by ipozgaj

Is there a way of getting per process I/O statistics on AIX, i.e. to get current disk I/O rate of a process? Commands like iostat, nmon, topas etc. can't display such data. Filemon also doesn't help. Actually, what I would need is something much like iotop(1) command on Linux. Update: it seems there is no builtin command(s) to do this. I will most probably make my own by using the SPMI API.

Read the article

How can I pin point a USB file transfer bottleneck in Unix?

- by HankHendrix

I'm experiencing very slow data transfer speeds over USB 2.0 on my nix box and was wondering how I can pin-point the cause of the problem. I've looked into iotop and top but the cpu and mem figures look normal (compared to guides I have checked). The box which is affected is Ubuntu 12.04 32bit Server running on an Asus EEE 701 2G model and I am transferring from the OS over USB 2.0 to an external HDD (which transfers at 30MB/s+ on Windows 7 on other machine). I get rsync write speeds of 1MB/s from OS to USB HDD which seems ridiculously slow. These speeds are consistent with other USB HDDs and sticks.

Read the article

ionice idle is ignored

- by Ferran Basora

I have been testing the ionice command for a while and the idle (3) mode seems to be ignored in most cases. My test is to run both command at the same time: du <big folder> ionice -c 3 du <another big folder> If I check both process in iotop I see no difference in the percentage of io utilization for each process. To provide more information about the CFQ scheduler I'm using a 3.5.0 linux kernel. I started doing this test because I'm experimenting a system lag each time a daily cron job updatedb.mlocate is executed in my Ubuntu 12.10 machine. If you check the /etc/cron.daily/mlocate file you realize that the command is executed like: /usr/bin/ionice -c3 /usr/bin/updatedb.mlocate Also, the funny thing is that whenever my system for some reason starts using swap memory, the updatedb.mlocate io process is been scheduled faster than kswapd0 process, and then my system gets stuck. Some suggestion? References: http://ubuntuforums.org/showthread.php?t=1243951&page=2 https://bugs.launchpad.net/ubuntu/+source/findutils/+bug/332790

Read the article

Why does 'top' say my machine is only 50% idle?

- by Chris Moore

What's going on here? I'm running nothing on the system, iotop and iftop show the network and hard drive are both idle, and top (sorted by %CPU) shows nothing running. So why is the system only 50% idle? What's the other 50% waiting for? How can I find out? top - 12:01:05 up 3 days, 15:03, 1 user, load average: 6.00, 6.01, 6.05 Tasks: 179 total, 1 running, 178 sleeping, 0 stopped, 0 zombie Cpu(s): 0.7%us, 0.0%sy, 0.0%ni, 49.7%id, 49.7%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 2053996k total, 1992600k used, 61396k free, 81680k buffers Swap: 4092924k total, 10740k used, 4082184k free, 1338636k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1042 deb 20 0 21468 1412 1000 R 1 0.1 0:00.03 top 1 root 20 0 24188 1952 1152 S 0 0.1 0:01.44 init 2 root 20 0 0 0 0 S 0 0.0 0:00.05 kthreadd Update: dmesg shows the printer driver misbehaving: [28858.561847] cnijnetprn[1503]: segfault at 29 ip 00007f56cf3480f7 sp 00007fffb964ec30 error 4 in libcnnet.so.1.2.0[7f56cf345000+9000] [68851.187802] cnijnetprn[9180]: segfault at 29 ip 00007ffe7636a0f7 sp 00007fff9a8b1990 error 4 in libcnnet.so.1.2.0[7ffe76367000+9000] [155412.107826] cnijnetprn[19966]: segfault at 29 ip 00007fc31de770f7 sp 00007fffc03aa8e0 error 4 in libcnnet.so.1.2.0[7fc31de74000+9000] and also some issue with cp: [248041.172067] INFO: task cp:27488 blocked for more than 120 seconds. [248041.172071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [248041.172075] cp D ffffffff81805120 0 27488 27345 0x00000004 [248041.172080] ffff880078d57a38 0000000000000046 ffff880078d579d8 ffffffff81032a79 [248041.172085] ffff880078d57fd8 ffff880078d57fd8 ffff880078d57fd8 0000000000012a40 [248041.172090] ffff88007b818000 ffff880069acc560 ffff880078d57a18 ffff88007f8532c0 [248041.172095] Call Trace: [248041.172104] [<ffffffff81032a79>] ? default_spin_lock_flags+0x9/0x10 [248041.172109] [<ffffffff8110a360>] ? __lock_page+0x70/0x70 [248041.172114] [<ffffffff815f0ecf>] schedule+0x3f/0x60 I did try copying something to the USB stick that's plugged into the router and mounted onto this computer using mount.cifs. That almost always causes everything to lock up, so I'm guessing that's the problem. I'll reboot and stop using mount.cifs.

Read the article

Memory limiting solutions for greedy applications that can crash OS?

- by Hooked

I use my computer for scientific programming. It has a healthy 8GB of RAM and 12GB of swap space. Often, as my problems have gotten larger, I exceed all of the available RAM. Rather than crashing (which would be preferred), it seems Ubuntu starts loading everything into swap, including Unity and any open terminals. If I don't catch a run-away program in time, there is nothing I can do but wait - it takes 4-5 minutes to switch to a command prompt eg. Ctrl-Alt-F2 where I can kill the offending process. Since my own stupidity is out of scope of this forum, how can I prevent Ubuntu from crashing via thrashing when I use up all of the available memory from a single offending program? At-home experiment*! Open a terminal, launch python and if you have numpy installed try this: >>> import numpy >>> [numpy.zeros((10**4, 10**4)) for _ in xrange(50)] * Warning: may have adverse effects, monitor the process via iotop or top to kill it in time. If not, I'll see you after your reboot.

Read the article

Shutdown Hangs for 5 Minutes on Kubuntu 14.04

- by Augustinus

I've had persistent problems with a 5 minute hang at shutdown for the last three versions of Kubuntu (13.04, 13.10, and now 14.04). I suspect this is not a KDE-specific problem. Recently, I performed a fresh installation of Kubuntu 14.04 from a live-USB, and shutdown worked normally for about a week. The hang-up is now happening again, and I can't figure out why. A brief description of the problem: The hang-up occurs with all methods of initiating a normal shutdown: Clicking the shutdown or restart button in KDE, sudo shutdown -h now, sudo reboot The shutdown splash screen appears. Using the down-arrow to access verbose messages, I see "Asking all remaining processes to terminate." This message remains for 5 minutes with no disk activity. Finally, a rapid series of messages flurries to the screen: * All processes ended within 300 seconds... [ OK ] nm-dispatcher.action: Caught signal 15, shutting down... ModemManager[852]: <warn> Could not acquire the 'org.freedesktop.ModemManager1' service name ModemManager[852]: <info> ModemManager is shut down * Deactivating swap... [ OK ] * Unmounting local filesystems... [ OK ] * Will now restart` Possible Sources of the Problem: Before the problem re-appeared, I have mainly been doing routine computing. I have kept the system up-to-date using apt-get upgrade and apt-get dist-upgrade. The only other notable incident was a power failure. I do not have the computer connected to a UPS, so the power failure resulted in an immediate shutdown. Could this have corrupted an important file which must be accessed at shutdown? Is there any way that could cause a 5-minute hang-up? Here is a list of packages that have been updated before the problem appeared: bash iotop dpkg dpkg-dev python3-software-properties libdpkg-perl software-properties-kde software-properties-common akonadi-backend-mysql libakonadiprotocolinternals1 akonadi-server firefox-locale-en firefox flashplugin-installer libqapt2 libqapt2-runtime thunderbird openjdk-7-jre-headless thunderbird-locale-en kubuntu-driver-manager qapt-deb-installer openjdk-7-jre qapt-batch icedtea-7-jre-jamvm libelf1 dpkg dpkg-dev libdpkg-perl libjbig0 gettext-base libgettextpo-dev libssl1.0.0 libgettextpo0 libasprintf-dev linux-headers-3.13.0-24 gettext libasprintf0c2 linux-headers-3.13.0-24-generic openssl linux-libc-dev gstreamer0.10-qapt kubuntu-desktop linux-image-extra-3.13.0-24-generic linux-image-3.13.0-24-generic I would appreciate any help with this.

Read the article

What kind of “sysadmin stuff” should I show to students during a talk?

- by Gregory Eric Sanderaon

A teacher asked me If I could talk about my job as a linux sysadmin in his class. The course is called "Introduction to Operating systems" and i've been given 45 minutes to talk. The students are beginning their second year, so they've had a bit of experience with programming in different languages. What i'm like to do is show a series of hands-on examples of the kinds of things I do on a regular basis. I've already got a few ideas jotted down, but I'm afraid that they might be either too advanced or too simple for the students to appreciate. Another concern is that a topic might be too long to explain and use too much time overall. Here are a few ideas : Program deployment using version control (git in my case) filtering apache logs using grep, awk, uniq, tail A couple of bash scripts that i've made for various stuff on servers live montitoring (htop, iotop, iptraf) creating databases and assigning roles in mysql/postgresql So, are these ideas any good ? Do you have better ideas ? are the ideas too simple and should I go for more "advanced" stuff ?

Read the article

SAN with iSCSI-Target Performance Horrendous

- by Justin

We have a poor man's SAN setup in a 1U Ubuntu server running iSCSI-Target with two 300GB drives in RAID-0. We then are using it for block level storage for virtual machines. The hypervisor is connected to the SAN via gigabit on a dedicated VLAN and interfaces. We only have a single virtual machine setup and doing some benchmarks. If we run hdparm -t /dev/sda1 from the virtual machine, we get 'ok' performance of 75MB/s from the virtual machine to the SAN. Then we basically compile a package with ./configure and make. Things start ok, but then all the sudden the load average on the SAN grows to 7+ and things slow down to a crawl. When we SSH into the SAN and run top, sure the load is 7+, but the CPU usage is basically nothing, also the server has 1.5GB of memory available. When we kill the compile on the virtual machine, slowly the LOAD on the SAN goes back to sub 1 figures. What in the world is causing this? How can we diagnosis this further? Here are two screenshot from the SAN during high load. 1> Output of iotop on the SAN: 2> Output of top on the SAN:

Read the article

webserver horrible slow, sometimes incredible fast

- by dhanke

i am running a small community ( 6000+ Members ) on a non-virtual 64-bit ubuntu 11.04 system. I am not a Linux-pro, not even advanced, i just tried to setup a webserver, which does nothing special actually. Delivering some dynamic PHP and RoR websites is its task. So it might be that my configuration files do look horrible bad. Also, i might use the wrong vocabulary, so in doubt, please ask. Having a current all-time record of 520 registered users (board-accounts, no system-users) online at same time, average server-load is about 2.0 - 5.0. Meantime (~250 users) average server load value is at about 0.4 - 0.8, sometimes, on some expensive searches a bit higher. everything fine. From time to time however, the load increases up to 120 (120.0, not 12.0 ;) ). In this time, its hard to even connect via SSH, but when i reach the server, and use top/htop/iotop to see whats happening, i cannot identify any process causing high CPU load. iotop tells me about a current reading/writing speed of about approx. 70kb/s, which is quite equal to power-off i think. Memory-Usage is max. at ~ 12GB of 16GB, so swap remains empty. now the odd (at least for me:) waiting some minutes ( since i always get a bit into a panic when this happens, it feels like 5 minutes, but i suppose its more like 20-30 minutes) and the server is back to normal. everything continues as normal. another odd fact: when i run hdparm -tT /dev/sda, i get answer like: /dev/sda: Timing cached reads: 7180 MB in 2.00 seconds = 3591.13 MB/sec Timing buffered disk reads: 348 MB in 3.02 seconds = 115.41 MB/sec when i run the same command while the server is "frozen", the answer is like /dev/sda: <- takes about 5 minutes until this line appears Timing cached reads: 7180 MB in 2.00 seconds = 3591.13 MB/sec <- 5 more minutes Timing buffered disk reads: 348 MB in 3.02 seconds = 115.41 MB/sec <- another 5 minutes so the values are the same, but the quoted time is completely wrong. using time command as prefix also tells me that ~ 15 minutes were used. I searched in dmesg, /var/log/[messages|syslog] - nothing found. /var/log/errors however tells me that: Jul 4 20:28:30 localhost kernel: [19080.671415] INFO: task php5-fpm:27728 blocked for more than 120 seconds. Jul 4 20:28:30 localhost kernel: [19080.671419] "echo 0 /proc/sys/kernel/hung_task_timeout_secs" disables this message. multiple times. now that message does tell me that php5-fpm task was blocked or did block ? - but not if that is the cause or just one of the results of that "freeze". Anyone? to cut the long story short, i dont know where even to start analyzing. So if you can give me any advice by looking at following specs and configs, or ask me to provide more information, i`d be glad. Specs: 6 Core AMD Phenom(tm) II X6 1055T Processor * 16 Gigabyte Ram 2x 1.5 TB Seagate ST1500DL003-9VT16L via SATA 3 via SoftwareRaid (i suppose) Services: (due to service --status-all, those with [ + ]) nginx Webserver 1.0.14 mySQL 5.1.63 Server Ruby on Rails 2.3.11 ( passenger-nginx-module ) php5-fpm 5.3.6-13ubuntu3.7 SSH ido2db Further services: default crontab + nightly backup. syslog-ng Website consists of 2 subdomains, forum. and www. where forum is a phpBB3.x PHP-Board, and www a Ruby on Rails 2.3.11 application (portal). Mini-Note: sometimes i notice that the forum is pretty slow, in contrast to the always-fast (except for this "freeze") portal. Both share the same Database, but the portal is using it read-only. The Webserver is nginx, using phusion passenger module to communicate with the ruby-application. Also, for the forum it communicates with php5-fpm via socket: relevant nginx configuration parts ( with comments/questions starting by ; ) ; in case of freeze due to too high Filesystem activity, maybe adding a limit? #worker_rlimit_nofile 50000; user www-data; ; 6 cores, so i read 6 fits. maybe already wrong? worker_processes 6; pid /var/run/nginx.pid; events { worker_connections 1024; } http { passenger_root /var/lib/gems/1.8/gems/passenger-3.0.11; passenger_ruby /usr/bin/ruby1.8; ; the forum once featured a chat, which was working w/o websockets. ; so it was a hell of pull requests (deactivated now, freeze still happening) keepalive_timeout 65; keepalive_requests 50; gzip on; server { listen 80; server_name www.domain.tld; root /var/www/domain/rails/public; passenger_enabled on; } server { listen 80; server_name forum.domain.tld; location / { root /var/www/domain/forum; index index.php; } ; satic stuff to be handled by nginx location ~* ^/style/.+.(jpg|jpeg|gif|css|png|js|ico|xml)$ { access_log off; expires 30d; root /var/www/domain/forum/; } ; now the php magic, note the "backend"-fcgi_pass location ~ .php$ { fastcgi_split_path_info ^(.+\.php)(.*)$; fastcgi_pass backend; fastcgi_index index.php; fastcgi_param SCRIPT_FILENAME /var/www/domain/forum$fastcgi_script_name; include fastcgi_params; fastcgi_param QUERY_STRING $query_string; fastcgi_param REQUEST_METHOD $request_method; fastcgi_param CONTENT_TYPE $content_type; fastcgi_param CONTENT_LENGTH $content_length; fastcgi_intercept_errors on; fastcgi_ignore_client_abort off; fastcgi_connect_timeout 60; fastcgi_send_timeout 180; fastcgi_read_timeout 180; fastcgi_buffer_size 128k; fastcgi_buffers 256 16k; fastcgi_busy_buffers_size 256k; fastcgi_temp_file_write_size 256k; fastcgi_max_temp_file_size 0; } location ~ /\.ht { deny all; } } ;the php5-fpm socket. i read that /dev/shm/ whould be the fastes place for this. bad idea in general? upstream backend { server unix:/dev/shm/phpfpm; } ... } php5-fpm settings (i changed this values due to php5-fpm error log messages higher and higher.. (freeze-problem was there before as well)* listen = /dev/shm/phpfpm user = www-data group = www-data pm = dynamic ; holy, 4000! well, shinking this value to earth-level gave me ; 100s of 502 bad gateway commands. this values were quite stable. ; since there are only max 520 users online i dont get it, why i would need ; as many children as configured here. due to keep-alive maybe? ; asking questions is easier for me since restarting server will make ; my community-members angry ;) pm.max_children = 4000 pm.start_servers = 100 pm.min_spare_servers = 50 pm.max_spare_servers = 150 pm.max_requests = 10 pm.status_path = /status ping.path = /ping ping.response = pong slowlog = log/$pool.log.slow ;should i use rlimit? ;rlimit_files = 1024 chdir = / mysql/my.cnf [client] port = 3306 socket = /var/run/mysqld/mysqld.sock [mysqld_safe] socket = /var/run/mysqld/mysqld.sock nice = 0 [mysqld] user = mysql socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /usr datadir = /var/lib/mysql tmpdir = /tmp skip-external-locking bind-address = 127.0.0.1 key_buffer = 16M max_allowed_packet = 16M thread_stack = 192K thread_cache_size = 8 myisam-recover = BACKUP ; high number, but less gives some phpBB errors. max_connections = 450 table_cache = 512 ; i read twice the cpu cores, bad? thread_concurrency = 12 join_buffer_size = 2084K concurrent_insert = 3 query_cache_limit = 64M query_cache_size = 512M query_cache_type = 1 log_error = /var/log/mysql/error.log log_slow_queries = /var/log/mysql/mysql-slow.log long_query_time = 2 expire_logs_days = 10 max_binlog_size = 100M low_priority_updates=1 [mysqldump] quick quote-names max_allowed_packet = 16M [isamchk] key_buffer = 16M !includedir /etc/mysql/conf.d/ I used smartctl already, hdds seem to be fine. /proc/mdstatus quotes: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md3 : active raid1 sda3[1] 1459264192 blocks [2/1] [_U] md1 : active raid1 sda1[0] 3911680 blocks [2/1] [U_] unused devices: ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 127727 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 127727 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited I quote some questions in my configuration files, these are not (intentional) directly problem-related, but would be nice for me to know wether they are indeed questionable or done right. One additional Fact: my MYSQL-database is at 12GB size. i dont know if that does matter, but mytop sometimes shows me 4-5 seconds long insert queries, some are 20-30 seconds long. Its just a feeling that i am unable to prove (because i dont know how), but when i disable the database, the freeze seems not to happen. Example: i created a dummy rails application to see the development log. the app made some sql-queries, reads and inserts. the log quite often was like: DbTest Load (0.3ms) SELECT * FROM `db_test` WHERE (`db_test`.`id` = 31722) LIMIT 1 SQL (0.1ms) BEGIN DbTest Update (0.3ms) UPDATE `db_test` SET `updated_at` = '2012-07-04 23:32:34' WHERE `id` = 31722 - now the log stands still for 5-60 seconds. SQL (49.1ms) COMMIT - SQL-Update time in the log does not include freeze time Rendering test/index Completed in 96ms (View: 16, DB: 59) | 200 OK [http://localhost:9000/test] Bad part is: this mini-freeze here only happens from time to time as well. note: meanwhile i cannot even upload files via scp. I currently feel like running form bad to worse and back by googling for my server-problem due to immense lack of knowledge regarding server configurations. It still makes me wonder, why those problems even appear, since 250 users a time is not such a high amount, right? So my questions: whats wrong and how to fix? ;) or: what information can i provide to make the situation more clear? can you point at some critical bad configuration-line which i should consider to catch up in the documentation? are there any tools i can run to see some possible bottlenecks? any further advice? (next to: "pay someone who knows what he does" - its a private project, server costs enough already. :)) Thanks for your time and help. Best Regards, Daniel P.S.: i renamed the configfiles to domain.tld since i dont want to have any % more load to the server until its fixed. might be a exaggeratedly thought.. P.P.S: if i asked a complete duplicate question, sorry. my search results seemed to be quite specific in their own way.

Read the article

MySQL is killing the server IO.

- by OneOfOne

I manage a fairly large/busy vBulletin forums (running on gigenet cloud), the database is ~ 10 GB (~9 milion posts, ~60 queries per second), lately MySQL have been grinding the disk like there's no tomorrow according to iotop and slowing the site. The last idea I can think of is using replication, but I'm not sure how much that would help and worried about database sync. I'm out of ideas, any tips on how to improve the situation would be highly appreciated. Specs : Debian Lenny 64bit ~12Ghz (6x2GHz) CPU, 7520gb RAM, 160gb disk. Kernel : 2.6.32-4-amd64 mysqld Ver 5.1.54-0.dotdeb.0 for debian-linux-gnu on x86_64 ((Debian)) Other software: vBulletin 3.8.4 memcached 1.2.2 PHP 5.3.5-0.dotdeb.0 (fpm-fcgi) (built: Jan 7 2011 00:07:27) lighttpd/1.4.28 (ssl) - a light and fast webserver PHP and vBulletin are configured to use memcached. MySQL Settings : [mysqld] key_buffer = 128M max_allowed_packet = 16M thread_cache_size = 8 myisam-recover = BACKUP max_connections = 1024 query_cache_limit = 2M query_cache_size = 128M expire_logs_days = 10 max_binlog_size = 100M key_buffer_size = 128M join_buffer_size = 8M tmp_table_size = 16M max_heap_table_size = 16M table_cache = 96 Other : From the cloud's IO chart, we're averaging 100mb/s read. > vmstat procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 9 0 73140 36336 8968 1859160 0 0 42 15 3 2 6 1 89 5 > /etc/init.d/mysql status Threads: 49 Questions: 252139 Slow queries: 164 Opens: 53573 Flush tables: 1 Open tables: 337 Queries per second avg: 61.302. moved from superuser

Read the article

no disk io but iowait very high

- by Dan

there is no disk io going results of iotop Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s TID PRIO USER DISK READ DISK WRITE SWAPIN IO< COMMAND 1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init [3] 1930 be/4 named 0.00 B/s 0.00 B/s 0.00 % 0.00 % named -u ~d/run-root 1931 be/4 named 0.00 B/s 0.00 B/s 0.00 % 0.00 % named -u ~d/run-root 1932 be/4 named 0.00 B/s 0.00 B/s 0.00 % 0.00 % named -u ~d/run-root 1933 be/4 named 0.00 B/s 0.00 B/s 0.00 % 0.00 % named -u ~d/run-root 1810 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % sh /usr/b~user=mysql 9795 be/4 apache 0.00 B/s 0.00 B/s 0.00 % 0.00 % httpd 8004 be/4 apache 0.00 B/s 0.00 B/s 0.00 % 0.00 % httpd 3226 be/4 postfix 0.00 B/s 0.00 B/s 0.00 % 0.00 % tlsmgr -l -t unix -u 8154 be/4 apache 0.00 B/s 0.00 B/s 0.00 % 0.00 % httpd 9759 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % find -name php.ini 9249 be/4 apache 0.00 B/s 0.00 B/s 0.00 % 0.00 % httpd 1756 be/4 postfix 0.00 B/s 0.00 B/s 0.00 % 0.00 % psa-pc-re~@localhost 1863 be/4 mysql 0.00 B/s 0.00 B/s 0.00 % 0.00 % mysqld --~mysql.sock 3123 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % crond 1758 be/4 postfix 0.00 B/s 0.00 B/s 0.00 % 0.00 % psa-pc-re~@localhost 1865 be/4 mysql 0.00 B/s 0.00 B/s 0.00 % 0.00 % mysqld --~mysql.sock 1592 be/4 sw-cp-se 0.00 B/s 0.00 B/s 0.00 % 0.00 % sw-cp-ser~ver/config 7612 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % sshd: root@pts/0 7614 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % sftp-server 7615 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % -bash 1602 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % sshd 8003 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % httpd but iowait very high ? iostat report avg-cpu: %user %nice %system %iowait %steal %idle 0.83 0.00 0.13 13.83 0.00 85.20 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn server runs like a snail what could be wrong here ? thanks

Read the article

Nginx + uWSGI + Django performance stuck on 100rq/s

- by dancio

I have configured Nginx with uWSGI and Django on CentOS 6 x64 (3.06GHz i3 540, 4GB), which should easily handle 2500 rq/s but when I run ab test ( ab -n 1000 -c 100 ) performance stops at 92 - 100 rq/s. Nginx: user nginx; worker_processes 2; events { worker_connections 2048; use epoll; } uWSGI: Emperor /usr/sbin/uwsgi --master --no-orphans --pythonpath /var/python --emperor /var/python/*/uwsgi.ini [uwsgi] socket = 127.0.0.2:3031 master = true processes = 5 env = DJANGO_SETTINGS_MODULE=x.settings env = HTTPS=on module = django.core.handlers.wsgi:WSGIHandler() disable-logging = true catch-exceptions = false post-buffering = 8192 harakiri = 30 harakiri-verbose = true vacuum = true listen = 500 optimize = 2 sysclt changes: # Increase TCP max buffer size setable using setsockopt() net.ipv4.tcp_rmem = 4096 87380 8388608 net.ipv4.tcp_wmem = 4096 87380 8388608 net.core.rmem_max = 8388608 net.core.wmem_max = 8388608 net.core.netdev_max_backlog = 5000 net.ipv4.tcp_max_syn_backlog = 5000 net.ipv4.tcp_window_scaling = 1 net.core.somaxconn = 2048 # Avoid a smurf attack net.ipv4.icmp_echo_ignore_broadcasts = 1 # Optimization for port usefor LBs # Increase system file descriptor limit fs.file-max = 65535 I did sysctl -p to enable changes. Idle server info: top - 13:34:58 up 102 days, 18:35, 1 user, load average: 0.00, 0.00, 0.00 Tasks: 118 total, 1 running, 117 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 3983068k total, 2125088k used, 1857980k free, 262528k buffers Swap: 2104504k total, 0k used, 2104504k free, 606996k cached free -m total used free shared buffers cached Mem: 3889 2075 1814 0 256 592 -/+ buffers/cache: 1226 2663 Swap: 2055 0 2055 **During the test:** top - 13:45:21 up 102 days, 18:46, 1 user, load average: 3.73, 1.51, 0.58 Tasks: 122 total, 8 running, 114 sleeping, 0 stopped, 0 zombie Cpu(s): 93.5%us, 5.2%sy, 0.0%ni, 0.2%id, 0.0%wa, 0.1%hi, 1.1%si, 0.0%st Mem: 3983068k total, 2127564k used, 1855504k free, 262580k buffers Swap: 2104504k total, 0k used, 2104504k free, 608760k cached free -m total used free shared buffers cached Mem: 3889 2125 1763 0 256 595 -/+ buffers/cache: 1274 2615 Swap: 2055 0 2055 iotop 30141 be/4 nginx 0.00 B/s 7.78 K/s 0.00 % 0.00 % nginx: wo~er process Where is the bottleneck ? Or what am I doing wrong ?

Read the article

Reducing IO caused by nginx

- by glumbo

I have a lot of free RAM but my IO is always 100 %util or very close. What ways can I reduce IO by using more RAM? My iotop shows nginx worker processes with the highest io rate. This is a file server serving files ranging from 1mb to 2gb. Here is my nginx.conf #user nobody; worker_processes 32; worker_rlimit_nofile 10240; worker_rlimit_sigpending 32768; error_log logs/error.log crit; #pid logs/nginx.pid; events { worker_connections 51200; } http { include mime.types; default_type application/octet-stream; access_log off; limit_conn_log_level info; log_format xfs '$arg_id|$arg_usr|$remote_addr|$body_bytes_sent|$status'; sendfile off; tcp_nopush off; tcp_nodelay on; directio 4m; output_buffers 3 512k; reset_timedout_connection on; open_file_cache max=5000 inactive=20s; open_file_cache_valid 30s; open_file_cache_min_uses 2; open_file_cache_errors on; client_body_buffer_size 32k; server_tokens off; autoindex off; keepalive_timeout 0; #keepalive_timeout 65;

Read the article

Ubuntu lucid, maverick high iowait

- by netom

I'm using Ubuntu, and I've got the same problem with Lucid and Maverick. From time to time, especially a few minutes after boot, the iowait goes between 50-100% and the box is unusable. Everything that tries to access the disk freezes. I have the following setup: Hard disk: Model Family: Western Digital Caviar Green family Device Model: WDC WD15EADS-00P8B0 Serial Number: WD-WMAVU0391287 Firmware Version: 01.00A01 User Capacity: 1.500.301.910.016 bytes I have a quad core Intel Core2 Q6600 processor, and 4G of memory. When the high iowait occurs, usually 4 processes are active: kdmflush (two procs) jbd2/dm-0-8 jbd2/db-1-8 and a few more starving user processes of course. I know this from top and iotop. Any suggestions about why this is happening? There are a lot of q/a-s about linux and high iowait, but none of them helped so far, I even tweaked the hard disk not to park the head in every 8 seconds (Load cycle count is 50334!!!!! :o ), but nothing. Problem persists. Thank you in advance.

Read the article

recursive grep started at / hangs

- by Martin

I have used following grep search pattern on multiple platforms: grep -r -I -D skip 'string_to_match' / For example on FreeBSD 8.0, FreeBSD 6.4 and Debian 6.0(squeeze). Command does a recursive search starting from root directory, assumes that binary files do not have the 'string_to_match' and skips devices, sockets and named pipes. FreeBSD 8.0 and FreeBSD 6.4 use GNU grep version 2.5.1 and Debian 6.0 uses GNU grep version 2.6.3. On FreeBSD 6.4, last information printed to stderr was "grep: /dev/cuad0: Device busy". After this grep just idles as according to "top -m io -o total" the I/O usage of grep is nonexistent. Same behavior is true under FreeBSD 8.0, but last information sent to stderr is "grep: /tmp/.wine-0: Permission denied" on my installation. In case of Debian, last output to stderr is "grep: /proc/sysrq-trigger: Input/output error". If I check the I/O usage of grep process under Debian, it is following: root@Debian:~# iotop -bp 22439 Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND 22439 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % grep -r -I -D skip 10.10.10.99 / Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND 22439 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % grep -r -I -D skip 10.10.10.99 / Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND 22439 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % grep -r -I -D skip 10.10.10.99 / ^Croot@Debian:~# What might cause this? Is there a way to view which file grep is currently processing in case lsof is not present? I'm able to use lsof under Debian and looks like the problematic file name there is "0xc6b2c230 file struct, ty=0, op=0xc0d34120". I'm not sure what this is.. I'm not able to use lsof or fstat under FreeBSD. PS: I know I could use find utility, but this is not the question.

Read the article

MySQL is killing the server IO.

- by OneOfOne

I manage a fairly large/busy vBulletin forums (running on gigenet cloud), the database is ~ 10 GB (~9 milion posts, ~60 queries per second), lately MySQL have been grinding the disk like there's no tomorrow according to iotop and slowing the site. The last idea I can think of is using replication, but I'm not sure how much that would help and worried about database sync. I'm out of ideas, any tips on how to improve the situation would be highly appreciated. Specs : Debian Lenny 64bit ~12Ghz (6 cores) CPU, 7520gb RAM, 160gb disk. Kernel : 2.6.32-4-amd64 mysqld Ver 5.1.54-0.dotdeb.0 for debian-linux-gnu on x86_64 ((Debian)) Other software: vBulletin 3.8.4 memcached 1.2.2 PHP 5.3.5-0.dotdeb.0 (fpm-fcgi) (built: Jan 7 2011 00:07:27) lighttpd/1.4.28 (ssl) - a light and fast webserver PHP and vBulletin are configured to use memcached. MySQL Settings : [mysqld] key_buffer = 128M max_allowed_packet = 16M thread_cache_size = 8 myisam-recover = BACKUP max_connections = 1024 query_cache_limit = 2M query_cache_size = 128M expire_logs_days = 10 max_binlog_size = 100M key_buffer_size = 128M join_buffer_size = 8M tmp_table_size = 16M max_heap_table_size = 16M table_cache = 96 Other : > vmstat procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 9 0 73140 36336 8968 1859160 0 0 42 15 3 2 6 1 89 5 > /etc/init.d/mysql status Threads: 49 Questions: 252139 Slow queries: 164 Opens: 53573 Flush tables: 1 Open tables: 337 Queries per second avg: 61.302. Edit Additional info.

Read the article

How to diagnose storage system scaling problems?

- by Unknown

We are currently testing the maximum sequential read throughput of a storage system (48 disks total behind two HP P2000 arrays) connected to HP DL580 G7 running RHEL 5 with 128 GB of memory. Initial testing has been mainly done by running DD-commands like this: dd if=/dev/mapper/mpath1 of=/dev/null bs=1M count=3000 In parallel for each disk. However, we have been unable to scale the results from one array (maximum throughput of 1.3 GB/s) to two (almost the same throughput). Each array is connected to a dedicated host bust adapter, so they should not be the bottleneck. The disks are currently in JBOD configuration, so each disk can be addressed directly. I have two questions: Is running multiple DD commands in parallel really a good way to test maximum read throughput? We have noticed very high SWAPIN-% numbers in iotop, which I find hard to explain because the target is /dev/null How shoud we proceed in trying to find the reason for the scaling problem? Do you thing the server itself is the bottleneck here, or could there be some linux parameters that we have overlooked?

Read the article

Frequent freezes with constant disk activity on SSD netbook

- by SamsLembas

I am running Arch Linux on an HP Mini 1000 with a SSD. The machine is a little under a year old and fairly heavily used. About a month ago the machine started freezing up. During the freezes, the system is almost completely unresponsive, seemingly especially for disk-intensive tasks such as launching an application for the first time since reboot. The disk activity led is always constantly illuminated during the freezes. After somewhere between 30 sec and 3 minutes, the machine returns to normal operation. I am pretty sure that the SSD is the source or the problem. Iotop reports a disk transfer rate of 0 during the freezes, so I think it must be getting "stuck" and simply not performing any r/w during the time. I can't seem to find any information on these symptoms on the Internet, so any input on exactly what might be the cause of this would be greatly appreciated. The machine is under warranty, but I would rather not deal with HP until I actually know what is going on. Thanks.

Read the article

How to keep subtree removal (`rm -rf`) from starving other processes for Disk I/O?

- by David Eyk

We have a very large (multi-GB) Nginx cache directory for a busy site, which we occasionally need to clear all at once. I've solved this in the past by moving the cache folder to a new path, making a new cache folder at the old path, and then rm -rfing the old cache folder. Lately, however, when I need to clear the cache on a busy morning, the I/O from rm -rf is starving my server processes of disk access, as both Nginx and the server it fronts for are read-intensive. I can watch the load average climb while the CPUs sit idle and rm -rf takes 98-99% of Disk IO in iotop. I've tried ionice -c 3 when invoking rm, but it seems to have no appreciable effect on the observed behavior. Is there any way to tame rm -rf to share the disk more? Do I need to use a different technique that will take its cues from ionice? Update: The filesystem in question is an AWS EC2 instance store (the primary disk is EBS). The /etc/fstab entry looks like this: /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2

Read the article

High disk I/O activity in CentOS server

- by triiim

I have about 16 websites in a CentOS dedicated, and I am having some problems on high traffic hours, it seems to be a high disk I/O activity causing a general slowdown. I've installed atop and this is what I see on the bottom (the server has been restarted thats why the values are so low): *** system and process activity since boot *** PID RDDSK WRDSK WCANCL DSK CMD 1/18 2176 1.7G 7.3G 854.4M 39 mysqld 671 1248K 3.0G 0K 13 flush-8:0 566 0K 1.1G 0K 5 jbd2/sda2-8 2401 124.2M 529.1M 22408K 3 crond 2032 2.2G 502.0M 0K 12 nginx 2360 425.8M 115.3M 4188K 2 httpd flush-8:0 and jbd2/sda2-8 are the processes I see with iotop using 99% on the IO column, and they are the processes that write the most on the hdd (after mysql). From what I saw in google this could be caused by some ext4 related bug, the current kernel is: Linux srvr.com 2.6.32-71.29.1.el6.x86_64 #1 SMP Mon Jun 27 19:49:27 BST 2011 x86_64 x86_64 x86_64 GNU/Linux I asked the hosting support to update the kernel and they tried but they now say that the server wont boot with the new installed kernel and they had to go back to the previous, they are not helping very much. Does someone has any idea how could I solve the high disk usage caused by flush-8:0 and jbd2/sda2-8 processes?

Read the article

Server down at 23:26 every night

- by miccet

We are having a big problem with our sites stability the last couple of weeks and after endless hours of troubleshooting I don't get anywhere. So I turn to you dear community. Setup: 2 x VPS servers - Front end, 8 core, 8G RAM. - Database, 5 core, 3G RAM. Both running Ubuntu. Ruby on Rails EE with Passenger 3 and Rails 2.3.11. MySQL 5.1.67. The problem is that each night, at the exact same time (23:26) the SQL server suddenly shows a processlist full of COMMIT with an increasing Time. After 30-40 seconds (can go longer) a wave seems processed and the site responds for a few seconds before it repeats. During this hick up the database server load spikes while the front end is relaxing. I have looked at slow queries, but is not finding any locks or other unusual queries ran at this time. I have looked at iotop at the time of the halt and there is no activity from mysql. I also tried turning off query_cache and messed around with the mysql configuration file without much change. Any ideas?

Read the article

MySQL Extremely High Disk Activity (Read Operations)

- by Jake Schoermer

I have 1GB Linode VPS with a standard LAMP stack. Apache is tuned fine but for some reason MySQL's disk usage is high. This is causing really slow site load times. RAM and CPU usage are fine. Can anyone give me any pointers on tuning mysql's disk performance? I'm using InnoDB. iotop output is below. Total DISK READ: 38.50 M/s | Total DISK WRITE: 27.20 K/s TID PRIO USER DISK READ> DISK WRITE SWAPIN IO COMMAND 9808 be/4 mysql 22.40 M/s 0.00 B/s 0.00 % 63.75 % mysqld 10045 be/4 mysql 2.06 M/s 0.00 B/s 0.00 % 26.65 % mysqld 9987 be/4 mysql 1694.38 K/s 0.00 B/s 0.00 % 18.33 % mysqld 10015 be/4 mysql 1554.47 K/s 0.00 B/s 0.00 % 12.71 % mysqld 10019 be/4 mysql 1461.21 K/s 0.00 B/s 0.00 % 5.58 % mysqld 9839 be/4 mysql 1383.48 K/s 0.00 B/s 0.00 % 25.69 % mysqld 10031 be/4 mysql 1243.58 K/s 0.00 B/s 0.00 % 5.68 % mysqld 10023 be/4 mysql 1057.04 K/s 0.00 B/s 0.00 % 2.02 % mysqld 10020 be/4 mysql 1025.95 K/s 0.00 B/s 0.00 % 7.05 % mysqld 10001 be/4 mysql 808.33 K/s 683.97 K/s 0.00 % 1.16 % mysqld 10025 be/4 mysql 746.15 K/s 0.00 B/s 0.00 % 3.28 % mysqld 10043 be/4 mysql 715.06 K/s 0.00 B/s 0.00 % 0.48 % mysqld 10044 be/4 mysql 672.31 K/s 0.00 B/s 0.00 % 5.25 % mysqld 10034 be/4 mysql 668.42 K/s 1989.73 K/s 0.00 % 5.31 % mysqld 9985 be/4 mysql 450.80 K/s 124.36 K/s 0.00 % 8.83 % mysqld 9989 be/4 mysql 357.53 K/s 0.00 B/s 0.00 % 5.21 % mysqld 10033 be/4 mysql 186.54 K/s 0.00 B/s 0.00 % 1.59 % mysqld 10021 be/4 mysql 155.45 K/s 435.25 K/s 0.00 % 1.23 % mysqld 10007 be/4 mysql 124.36 K/s 0.00 B/s 0.00 % 0.53 % mysqld 9763 be/4 www-data 38.86 K/s 0.00 B/s 0.00 % 4.56 % apache2 -k start 10027 be/4 mysql 31.09 K/s 0.00 B/s 0.00 % 4.24 % mysqld 1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init 2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd] 3 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/0] 4 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/0:0] 5 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/u:0] 6 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0] 7 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/1]

Read the article

Postmaster uses excessive CPU and Disk Writes

- by wolfcastle

using PostgreSQL 9.1.2 I'm seeing excessive CPU usage and large amounts of writes to disk from postmaster tasks. This happens even while my application is doing almost nothing (10s of inserts per MINUTE). There are a reasonable number of connections open however. I've been trying to determine what in my application is causing this. I'm pretty newb with postgresql, and haven't gotten anywhere so far. I've turned on some logging options in my config file, and looked at connections in the pg_stat_activity table, but they are all idle. Yet each connection consumes ~ 50% CPU, and is writing ~15M/s to disk (reading nothing). I'm basically using the stock postgresql.conf with very little tweaks. I'd appreciate any advice or pointers on what I can do to track this down. Here is a sample of what top/iotop is showing me: Cpu(s): 18.9%us, 14.4%sy, 0.0%ni, 53.4%id, 11.8%wa, 0.0%hi, 1.5%si, 0.0%st Mem: 32865916k total, 7263720k used, 25602196k free, 575608k buffers Swap: 16777208k total, 0k used, 16777208k free, 4464212k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 17057 postgres 20 0 236m 33m 13m R 45.0 0.1 73:48.78 postmaster 17188 postgres 20 0 219m 15m 11m R 42.3 0.0 61:45.57 postmaster 17963 postgres 20 0 219m 16m 11m R 42.3 0.1 27:15.01 postmaster 17084 postgres 20 0 219m 15m 11m S 41.7 0.0 63:13.64 postmaster 17964 postgres 20 0 219m 17m 12m R 41.7 0.1 27:23.28 postmaster 18688 postgres 20 0 219m 15m 11m R 41.3 0.0 63:46.81 postmaster 17088 postgres 20 0 226m 24m 12m R 41.0 0.1 64:39.63 postmaster 24767 postgres 20 0 219m 17m 12m R 41.0 0.1 24:39.24 postmaster 18660 postgres 20 0 219m 14m 9.9m S 40.7 0.0 60:51.52 postmaster 18664 postgres 20 0 218m 15m 11m S 40.7 0.0 61:39.61 postmaster 17962 postgres 20 0 222m 19m 11m S 40.3 0.1 11:48.79 postmaster 18671 postgres 20 0 219m 14m 9m S 39.4 0.0 60:53.21 postmaster 26168 postgres 20 0 219m 15m 10m S 38.4 0.0 59:04.55 postmaster Total DISK READ: 0.00 B/s | Total DISK WRITE: 195.97 M/s TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND 17962 be/4 postgres 0.00 B/s 14.83 M/s 0.00 % 0.25 % postgres: aggw aggw [local] idle 17084 be/4 postgres 0.00 B/s 15.53 M/s 0.00 % 0.24 % postgres: aggw aggw [local] idle 17963 be/4 postgres 0.00 B/s 15.00 M/s 0.00 % 0.24 % postgres: aggw aggw [local] idle 17188 be/4 postgres 0.00 B/s 14.80 M/s 0.00 % 0.24 % postgres: aggw aggw [local] idle 17964 be/4 postgres 0.00 B/s 15.50 M/s 0.00 % 0.24 % postgres: aggw aggw [local] idle 18664 be/4 postgres 0.00 B/s 15.13 M/s 0.00 % 0.23 % postgres: aggw aggw [local] idle 17088 be/4 postgres 0.00 B/s 14.71 M/s 0.00 % 0.13 % postgres: aggw aggw [local] idle 18688 be/4 postgres 0.00 B/s 14.72 M/s 0.00 % 0.00 % postgres: aggw aggw [local] idle 24767 be/4 postgres 0.00 B/s 14.93 M/s 0.00 % 0.00 % postgres: aggw aggw [local] idle 18671 be/4 postgres 0.00 B/s 16.14 M/s 0.00 % 0.00 % postgres: aggw aggw [local] idle 17057 be/4 postgres 0.00 B/s 13.58 M/s 0.00 % 0.00 % postgres: aggw aggw [local] idle 26168 be/4 postgres 0.00 B/s 15.50 M/s 0.00 % 0.00 % postgres: aggw aggw [local] idle 18660 be/4 postgres 0.00 B/s 15.85 M/s 0.00 % 0.00 % postgres: aggw aggw [local] idle

Read the article

Slow disk transfer rate

- by Nooklez

I have problem with slow disk transfer rate. It's static files server for our website. I was making backup of data and noticed that tar is very slow. So I did hdparm -t and... hdparm -t /dev/sda3 /dev/sda3: Timing buffered disk reads: 6 MB in 4.70 seconds = 1.28 MB/sec It's low traffic hour now on our site, so huge I/O traffic is not a reason (iotop show less than 1 MB/s). It's RAID10 setup (2x2 SATA drives). Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy ------------------------------------------------------------------------------ u0 RAID-10 OK - - 64K 1396.96 W ON VPort Status Unit Size Type Phy Encl-Slot Model ------------------------------------------------------------------------------ p0 OK u0 698.63 GB SATA 0 - WDC WD7500AADS-00M2 p1 OK u0 698.63 GB SATA 1 - WDC WD7500AADS-00M2 p2 OK u0 698.63 GB SATA 2 - WDC WD7500AADS-00M2 p3 OK u0 698.63 GB SATA 3 - WDC WD7500AADS-00M2 We have recently changed almost all components of server (excluding 3ware controller + disks). And I think problems started since then. Can it be configuration problem or hardware? EDIT: I found something like that in dmesg [166843.625843] irq 16: nobody cared (try booting with the "irqpoll" option) [166843.625846] Pid: 0, comm: swapper Not tainted 3.1.5-gentoo #3 [166843.625847] Call Trace: [166843.625848] <IRQ> [<ffffffff810859d5>] __report_bad_irq+0x35/0xc1 [166843.625856] [<ffffffff81085cec>] note_interrupt+0x165/0x1e1 [166843.625859] [<ffffffff8108445f>] handle_irq_event_percpu+0x16f/0x187 [166843.625861] [<ffffffff810844a9>] handle_irq_event+0x32/0x51 [166843.625863] [<ffffffff8108640b>] handle_fasteoi_irq+0x75/0x99 [166843.625866] [<ffffffff810039d7>] handle_irq+0x83/0x8b [166843.625868] [<ffffffff810036ad>] do_IRQ+0x48/0xa0 [166843.625871] [<ffffffff8155082b>] common_interrupt+0x6b/0x6b [166843.625872] <EOI> [<ffffffff812981e8>] ? acpi_safe_halt+0x22/0x35 [166843.625877] [<ffffffff812981e2>] ? acpi_safe_halt+0x1c/0x35 [166843.625879] [<ffffffff81298216>] acpi_idle_do_entry+0x1b/0x2b [166843.625881] [<ffffffff81298276>] acpi_idle_enter_c1+0x50/0x99 [166843.625884] [<ffffffff813b792a>] cpuidle_idle_call+0xed/0x171 [166843.625886] [<ffffffff81001257>] cpu_idle+0x55/0x81 [166843.625888] [<ffffffff81532a69>] rest_init+0x6d/0x6f [166843.625891] [<ffffffff81aa1aca>] start_kernel+0x329/0x334 [166843.625893] [<ffffffff81aa12a6>] x86_64_start_reservations+0xb6/0xba [166843.625894] [<ffffffff81aa139c>] x86_64_start_kernel+0xf2/0xf9 [166843.625896] handlers: [166843.625898] [<ffffffff812dc8de>] twl_interrupt [166843.625900] Disabling IRQ #16 It's related to problem? EDIT #2: Based on feedback in comments, here is more informations. cat /proc/interrupts 16: 390813 0 0 0 IO-APIC-fasteoi 3w-sas Controller model: [ 1.095350] 3ware Storage Controller device driver for Linux v1.26.02.003. [ 1.095467] 3ware 9000 Storage Controller device driver for Linux v2.26.02.014. [ 1.095641] LSI 3ware SAS/SATA-RAID Controller device driver for Linux v3.26.02.000. [ 1.095787] 3w-sas 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 [ 1.095881] 3w-sas 0000:01:00.0: setting latency timer to 64 [ 1.910801] 3w-sas: scsi0: Found an LSI 3ware 9750-4i Controller at 0xfe560000, IRQ: 16. [ 2.216537] 3w-sas: scsi0: Firmware FH9X 5.08.00.008, BIOS BE9X 5.07.00.011, Phys: 8. [ 2.216836] scsi 0:0:0:0: Direct-Access LSI 9750-4i DISK 5.08 PQ: 0 ANSI: 5 And motherboard: description: Motherboard product: P8H67-M vendor: ASUSTeK Computer INC.

Read the article

e2fsck extremly slow, although enough memory exists

- by kaefert

I've got this external USB-Disk: kaefert@blechmobil:~$ lsusb -s 2:3 Bus 002 Device 003: ID 0bc2:3320 Seagate RSS LLC As can be seen in this dmesg output, there are some problems that prevents that disk from beeing mounted: kaefert@blechmobil:~$ dmesg | grep sdb [ 114.474342] sd 5:0:0:0: [sdb] 732566645 4096-byte logical blocks: (3.00 TB/2.72 TiB) [ 114.475089] sd 5:0:0:0: [sdb] Write Protect is off [ 114.475092] sd 5:0:0:0: [sdb] Mode Sense: 43 00 00 00 [ 114.475959] sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 114.477093] sd 5:0:0:0: [sdb] 732566645 4096-byte logical blocks: (3.00 TB/2.72 TiB) [ 114.501649] sdb: sdb1 [ 114.502717] sd 5:0:0:0: [sdb] 732566645 4096-byte logical blocks: (3.00 TB/2.72 TiB) [ 114.504354] sd 5:0:0:0: [sdb] Attached SCSI disk [ 116.804408] EXT4-fs (sdb1): ext4_check_descriptors: Checksum for group 3976 failed (47397!=61519) [ 116.804413] EXT4-fs (sdb1): group descriptors corrupted! So I went and fired up my favorite partition manager - gparted, and told it to verify and repair the partition sdb1. This made gparted call e2fsck (version 1.42.4 (12-Jun-2012)) e2fsck -f -y -v /dev/sdb1 Although gparted called e2fsck with the "-v" option, sadly it doesn't show me the output of my e2fsck process (bugreport https://bugzilla.gnome.org/show_bug.cgi?id=467925 ) I started this whole thing on Sunday (2012-11-04_2200) evening, so about 48 hours ago, this is what htop says about it now (2012-11-06-1900): PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command 3704 root 39 19 1560M 1166M 768 R 98.0 19.5 42h56:43 e2fsck -f -y -v /dev/sdb1 Now I found a few posts on the internet that discuss e2fsck running slow, for example: http://gparted-forum.surf4.info/viewtopic.php?id=13613 where they write that its a good idea to see if the disk is just that slow because maybe its damaged, and I think these outputs tell me that this is not the case in my case: kaefert@blechmobil:~$ sudo hdparm -tT /dev/sdb /dev/sdb: Timing cached reads: 3562 MB in 2.00 seconds = 1783.29 MB/sec Timing buffered disk reads: 82 MB in 3.01 seconds = 27.26 MB/sec kaefert@blechmobil:~$ sudo hdparm /dev/sdb /dev/sdb: multcount = 0 (off) readonly = 0 (off) readahead = 256 (on) geometry = 364801/255/63, sectors = 5860533160, start = 0 However, although I can read quickly from that disk, this disk speed doesn't seem to be used by e2fsck, considering tools like gkrellm or iotop or this: kaefert@blechmobil:~$ iostat -x Linux 3.2.0-2-amd64 (blechmobil) 2012-11-06 _x86_64_ (2 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 14,24 47,81 14,63 0,95 0,00 22,37 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0,59 8,29 2,42 5,14 43,17 160,17 53,75 0,30 39,80 8,72 54,42 3,95 2,99 sdb 137,54 5,48 9,23 0,20 587,07 22,73 129,35 0,07 7,70 7,51 16,18 2,17 2,04 Now I researched a little bit on how to find out what e2fsck is doing with all that processor time, and I found the tool strace, which gives me this: kaefert@blechmobil:~$ sudo strace -p3704 lseek(4, 41026998272, SEEK_SET) = 41026998272 write(4, "\212\354K[_\361\3nl\212\245\352\255jR\303\354\312Yv\334p\253r\217\265\3567\325\257\3766"..., 4096) = 4096 lseek(4, 48404766720, SEEK_SET) = 48404766720 read(4, "\7t\260\366\346\337\304\210\33\267j\35\377'\31f\372\252\ffU\317.y\211\360\36\240c\30`\34"..., 4096) = 4096 lseek(4, 41027002368, SEEK_SET) = 41027002368 write(4, "\232]7Ws\321\352\t\1@[+5\263\334\276{\343zZx\352\21\316`1\271[\202\350R`"..., 4096) = 4096 lseek(4, 48404770816, SEEK_SET) = 48404770816 read(4, "\17\362r\230\327\25\346//\210H\v\311\3237\323K\304\306\361a\223\311\324\272?\213\tq \370\24"..., 4096) = 4096 lseek(4, 41027006464, SEEK_SET) = 41027006464 write(4, "\367yy>x\216?=\324Z\305\351\376&\25\244\210\271\22\306}\276\237\370(\214\205G\262\360\257#"..., 4096) = 4096 lseek(4, 48404774912, SEEK_SET) = 48404774912 read(4, "\365\25\0\21|T\0\21}3t_\272\373\222k\r\177\303\1\201\261\221$\261B\232\3142\21U\316"..., 4096) = 4096 ^CProcess 3704 detached around 16 of these lines every second, so 4 read and 4 write operations every second, which I don't consider to be a lot.. And finally, my question: Will this process ever finish? If those numbers from fseek (48404774912) represent bytes, that would be something like 45 gigabytes, with this beeing a 3 terrabyte disk, which would give me 134 days to go, if the speed stays constant, and he scans the disk like this completly and only once. Do you have some advice for me? I have most of the data on that disk elsewhere, but I've put a lot of hours into sorting and merging it to this disk, so I would prefer to getting this disk up and running again, without formatting it anew. I don't think that the hardware is damaged since the disk is only a few months and since I can't see any I/O errors in the dmesg output. UPDATE: I just looked at the strace output again (2012-11-06_2300), now it looks like this: lseek(4, 1419860611072, SEEK_SET) = 1419860611072 read(4, "3#\f\2447\335\0\22A\355\374\276j\204'\207|\217V|\23\245[\7VP\251\242\276\207\317:"..., 4096) = 4096 lseek(4, 43018145792, SEEK_SET) = 43018145792 write(4, "]\206\231\342Y\204-2I\362\242\344\6R\205\361\324\177\265\317C\334V\324\260\334\275t=\10F."..., 4096) = 4096 lseek(4, 1419860615168, SEEK_SET) = 1419860615168 read(4, "\262\305\314Y\367\37x\326\245\226\226\320N\333$s\34\204\311\222\7\315\236\336\300TK\337\264\236\211n"..., 4096) = 4096 lseek(4, 43018149888, SEEK_SET) = 43018149888 write(4, "\271\224m\311\224\25!I\376\16;\377\0\223H\25Yd\201Y\342\r\203\271\24eG<\202{\373V"..., 4096) = 4096 lseek(4, 1419860619264, SEEK_SET) = 1419860619264 read(4, ";d\360\177\n\346\253\210\222|\250\352T\335M\33\260\320\261\7g\222P\344H?t\240\20\2548\310"..., 4096) = 4096 lseek(4, 43018153984, SEEK_SET) = 43018153984 write(4, "\360\252j\317\310\251G\227\335{\214`\341\267\31Y\202\360\v\374\307oq\3063\217Z\223\313\36D\211"..., 4096) = 4096 So this number of the lseeks before the reads, like 1419860619264 are already a lot bigger, standing for 1.29 terabytes if the numbers are bytes, so it doesn't seem to be a linear progress on a big scale, maybe there are only some areas that need work, that have big gaps in between them. (times are in CET)

Search Results

Search found 52 results on 3 pages for 'iotop'.

Page 2/3 | < Previous Page | 1 2 3 | Next Page >

- by ipozgaj

- by HankHendrix

- by Ferran Basora

- by Chris Moore

- by Hooked

- by Augustinus

- by Gregory Eric Sanderaon

- by Justin

- by dhanke

- by OneOfOne

- by Dan

- by dancio

- by glumbo

- by netom

- by Martin

- by OneOfOne

- by Unknown

- by SamsLembas

- by David Eyk

- by triiim

- by miccet

- by Jake Schoermer

- by wolfcastle

- by Nooklez

- by kaefert

< Previous Page | 1 2 3 | Next Page >