Search Results

Search found 22968 results on 919 pages for 'stuck again'.

Page 523/919 | < Previous Page | 519 520 521 522 523 524 525 526 527 528 529 530 | Next Page >

HAProxy + Percona XtraDB Cluster

- by rottmanj

I am attempting to setup HAproxy in conjunction with Percona XtraDB Cluster on a series of 3 EC2 instances. I have found a few tutorials online dealing with this specific issue, but I am a bit stuck. Both the Percona servers and the HAproxy servers are running ubuntu 12.04. The HAProxy version is 1.4.18, When I start HAProxy I get the following error: Server pxc-back/db01 is DOWN, reason: Socket error, check duration: 2ms. I am not really sure what the issue could be. I have verified the following: EC2 security groups ports are open Poured over my config files looking for issues. I currently do not see any. Ensured that xinetd was installed Ensured that I am using the correct ip address of the mysql server. Any help with this is greatly appreciated. Here are my current config Load Balancer /etc/haproxy/haproxy.cfg global log 127.0.0.1 local0 log 127.0.0.1 local1 notice maxconn 4096 user haproxy group haproxy debug #quiet daemon defaults log global mode http option tcplog option dontlognull retries 3 option redispatch maxconn 2000 contimeout 5000 clitimeout 50000 srvtimeout 50000 frontend pxc-front bind 0.0.0.0:3307 mode tcp default_backend pxc-back frontend stats-front bind 0.0.0.0:22002 mode http default_backend stats-back backend pxc-back mode tcp balance leastconn option httpchk server db01 10.86.154.105:3306 check port 9200 inter 12000 rise 3 fall 3 backend stats-back mode http balance roundrobin stats uri /haproxy/stats MySql Server /etc/xinetd.d/mysqlchk # default: on # description: mysqlchk service mysqlchk { # this is a config for xinetd, place it in /etc/xinetd.d/ disable = no flags = REUSE socket_type = stream port = 9200 wait = no user = nobody server = /usr/bin/clustercheck log_on_failure += USERID #only_from = 0.0.0.0/0 # recommended to put the IPs that need # to connect exclusively (security purposes) per_source = UNLIMITED } MySql Server /etc/services Added the line mysqlchk 9200/tcp # mysqlchk MySql Server /usr/bin/clustercheck # GNU nano 2.2.6 File: /usr/bin/clustercheck #!/bin/bash # # Script to make a proxy (ie HAProxy) capable of monitoring Percona XtraDB Cluster nodes properly # # Author: Olaf van Zandwijk <[email protected]> # Documentation and download: https://github.com/olafz/percona-clustercheck # # Based on the original script from Unai Rodriguez # MYSQL_USERNAME="testuser" MYSQL_PASSWORD="" ERR_FILE="/dev/null" AVAILABLE_WHEN_DONOR=0 # # Perform the query to check the wsrep_local_state # WSREP_STATUS=`mysql --user=${MYSQL_USERNAME} --password=${MYSQL_PASSWORD} -e "SHOW STATUS LIKE 'wsrep_local_state';" 2>${ERR_FILE} | awk '{if (NR!=1){print $2}}' 2>${ERR_FILE}` if [[ "${WSREP_STATUS}" == "4" ]] || [[ "${WSREP_STATUS}" == "2" && ${AVAILABLE_WHEN_DONOR} == 1 ]] then # Percona XtraDB Cluster node local state is 'Synced' => return HTTP 200 /bin/echo -en "HTTP/1.1 200 OK\r\n" /bin/echo -en "Content-Type: text/plain\r\n" /bin/echo -en "\r\n" /bin/echo -en "Percona XtraDB Cluster Node is synced.\r\n" /bin/echo -en "\r\n" else # Percona XtraDB Cluster node local state is not 'Synced' => return HTTP 503 /bin/echo -en "HTTP/1.1 503 Service Unavailable\r\n" /bin/echo -en "Content-Type: text/plain\r\n" /bin/echo -en "\r\n" /bin/echo -en "Percona XtraDB Cluster Node is not synced.\r\n" /bin/echo -en "\r\n" fi

Read the article
How can I simulate blocking RTMP over port 80 on Windows?

- by Christian Nunciato

It seems like this should be so simple, but since this isn't my area of expertise, I'm having a hell of a time figuring out how to do it. Basically, I have a Flash app and I'm connecting to a Flash Media Server to stream some content. The URL I'm using to do this, for example, looks like this: rtmp://someserver.com/some/path/mp3:somefile Everything works -- but that's sort of the problem. When I'm trying to do is simulate my users attempting to play back my media under more restrictive conditions than the ones I have here (i.e., none) -- namely being stuck behind firewalls or proxy servers that block access to RTMP streams. Flash, according to Adobe, is equipped to handle proxy servers and firewalls automatically, like so (from the docs): When you do not specify a port number in an RTMP address, Flash will attempt to connect to port 1935. If it fails it will then try to connect to port 443; if that fails, it will try port 80. [And if that fails, it will attempt to connect via RTMPT (i.e., HTTP tunneling) on port 80.] So no coding is required to access ports 1935, 443, or port 80 if you do not specify a port in the RTMP address. The problem I'm having is setting up a reliable environment in which to test that this behavior actually happens. I'm on a Windows machine, for example, so with Windows Firewall, I can block certain ports and protocols (1935, 443), but I don't want to block port 80, because the final fallback protocol (RTMPT) is supposed to run on port 80, and Windows Firewall only gives me enough granularity (as far as I know, anyway) to block "all outbound TCP traffic to remote port 80" -- that is, I can't, apparently, block "all outbound RTMP traffic to port 80" while leaving RTMPT traffic to port 80 unaffected. My understanding thus far is that I'll probably need to set up a proxy server to do this. Is this correct? Or is there a simpler way (on Win 7, at least) to filter out RTMP to 1935, RTMP to 443, RTMP to 80, but still allow RTMPT to 80 (where all four hostnames are identical)? And if I do have to set up a proxy server, what's the simplest way to go on Windows? I've set up WinProxy, which seems a bit janky but apparently works -- but then what I can't figure out is how to tell Windows to force all TCP traffic (including RTMP, RTMPT and HTTO) through this proxy server so I can turn around and reject the requests for RTMP. Any help would be hugely appreciated. This isn't my realm of expertise and I've alreasdy spent more time on it than I probably should. :)

Read the article
File/folder permissions and groups on Linux with Apache

- by phobia

I'm trying to learn about permissions on linux webserver with apache. Some clues to the system: The server I have to play around with is Fedora based. Apache runs as apache:apache. To allow for e.g. php to write to a file the file needs to be chmod 777. 755 is not sufficiant. What I'm wondering is basically how set up permissions like they should be on e.g. a "shared web host". My main problem is that if I set a permission so that one user cannot access anothers home folder, then apache can't read from the public_html folder either. To keep the users out I need to set chmod 700. But to let apache to read I need to have at least execute on world, so a 701 basically works, but won't let some users in. So I'm really stuck on what to do. Have been concidering adding the apache user to the frous grours below to avoid having to add the world execute flag, but is that a bad thing? Should it be the other way around, the users in the groups below should also be in the apache group? I was aiming at having 4 groups: 1. webapp same as dev_int, but is the only one that can go inside the webapp/live folder to e.g. do an update from the repo. 2. dev_int can read,write and execute everything in the "web root", including the two below, but nothing outside of the web root 3. dev_ext can read write and execute in all client folders, but cannot access anything outside of the webapp root 4. clientsBasic ftp accounts. Has a home folder with a public_html, but cannot access any other home folders An example of folder structure: webroot no users in the aforementioned groups can go outside of here some_project :dev_int only webapp live :webapp only staging :dev_int and :dev_ext clients :dev_int and :dev_ext client_1 :dev_int, :dev_ext and client1:clients public_html dev developer_1 developer_1:dev_int OR :dev_ext public_html

Read the article
centos 6 ps aux hangs up

- by Guntis

I have problem with my server. Server is running centos 6 (CloudLinux Server release 6.2). uname -a = 2.6.32-320.4.1.lve1.1.4.el6.x86_64 That is a kvm guest. On host is debian 6. If i run command ps aux, it stuck on random process (shows some processes only), top command is working fine. htop doesn't work too (black screen). top - 12:11:51 up 34 min, 1 user, load average: 4.26, 6.71, 16.15 Tasks: 201 total, 7 running, 192 sleeping, 0 stopped, 2 zombie Cpu(s): 7.9%us, 2.8%sy, 0.0%ni, 87.5%id, 1.6%wa, 0.0%hi, 0.2%si, 0.0%st Mem: 9862044k total, 2359484k used, 7502560k free, 171720k buffers Swap: 10485720k total, 0k used, 10485720k free, 1336872k cached server has one Intel(R) Xeon(R) CPU E5606 @ 2.13GHz, free -m total used free shared buffers cached Mem: 9630 2336 7293 0 170 1324 -/+ buffers/cache: 841 8789 Swap: 10239 0 10239 php -v PHP 5.3.19 (cli) (built: Nov 28 2012 10:03:07) Copyright (c) 1997-2012 The PHP Group Zend Engine v2.3.0, Copyright (c) 1998-2012 Zend Technologies with the ionCube PHP Loader v4.2.2, Copyright (c) 2002-2012, by ionCube Ltd., and with Zend Guard Loader v3.3, Copyright (c) 1998-2010, by Zend Technologies with Suhosin v0.9.33, Copyright (c) 2007-2012, by SektionEins GmbH mysql Server version: 5.1.63-cll php -i disable_functions => apache_child_terminate, apache_setenv, define_syslog_variables, escapeshellarg, escapeshellcmd, eval, exec, fp, fput, ftp_connect, ftp_e xec, ftp_get, ftp_login, ftp_nb_fput, ftp_put, ftp_raw, ftp_rawlist, highlight_file, ini_alter, ini_get_all, ini_restore, inject_code, openlog, passthru, php _uname, phpAds_remoteInfo, phpAds_XmlRpc, phpAds_xmlrpcDecode, phpAds_xmlrpcEncode, popen, posix_getpwuid, posix_kill, posix_mkfifo, posix_setpgid, posix_set sid, posix_setuid, posix_setuid, posix_uname, proc_close, proc_get_status, proc_nice, proc_open, proc_terminate, shell_exec, syslog, system, xmlrpc_entity_de code, xmlrpc_server_create, putenv, show_source,mail => apache_child_terminate, apache_setenv, define_syslog_variables, escapeshellarg, escapeshellcmd, eval, exec, fp, fput, ftp_connect, ftp_exec, ftp_get, ftp_login, ftp_nb_fput, ftp_put, ftp_raw, ftp_rawlist, highlight_file, ini_alter, ini_get_all, ini_restore, inject_code, openlog, passthru, php_uname, phpAds_remoteInfo, phpAds_XmlRpc, phpAds_xmlrpcDecode, phpAds_xmlrpcEncode, popen, posix_getpwuid, posix_kill, pos ix_mkfifo, posix_setpgid, posix_setsid, posix_setuid, posix_setuid, posix_uname, proc_close, proc_get_status, proc_nice, proc_open, proc_terminate, shell_exe c, syslog, system, xmlrpc_entity_decode, xmlrpc_server_create, putenv, show_source,mail ... suhosin.executor.disable_eval => Off => Off suhosin.executor.eval.blacklist => include,include_once,require,require_once,curl_init,fpassthru,base64_encode,base64_decode,mail,exec,system,proc_open,leak, syslog,pfsockopen,shell_exec,ini_restore,symlink,stream_socket_server,proc_nice,popen,proc_get_status,dl, pcntl_exec, pcntl_fork, pcntl_signal,pcntl_waitpid, pcntl_wexitstatus, pcntl_wifexited, pcntl_wifsignaled,pcntl_wifstopped, pcntl_wstopsig, pcntl_wtermsig, socket_accept,socket_bind, socket_connect, socket_cr eate, socket_create_listen,socket_create_pair,link,register_shutdown_function,register_tick_function,gzinflate => include,include_once,require,require_once,c url_init,fpassthru,base64_encode,base64_decode,mail,exec,system,proc_open,leak,syslog,pfsockopen,shell_exec,ini_restore,symlink,stream_socket_server,proc_nic e,popen,proc_get_status,dl, pcntl_exec, pcntl_fork, pcntl_signal,pcntl_waitpid, pcntl_wexitstatus, pcntl_wifexited, pcntl_wifsignaled,pcntl_wifstopped, pcntl _wstopsig, pcntl_wtermsig, socket_accept,socket_bind, socket_connect, socket_create, socket_create_listen,socket_create_pair,link,register_shutdown_function, register_tick_function,gzinflate Sometimes i cannot kill httpd process. I run kill -9 PID even several times, and nothing happens. php runs via suphp. I learned somewhere that it can be trojan. I ran strace ps aux and it stops on open("/proc/PID/cmdline", O_RDONLY) If i reboot server, problem is gone but after some time it is back again .. :( Thanks.

Read the article
Debugging IO limitation

- by Martin F

I have a Fedora box with some severe IO limitations which I have no idea how to debug. The server has a Areca Technology Corp. ARC-1130 12-Port PCI-X to SATA RAID Controller with 12 7200 RPM 1.5 TB disks and a Marvell Technology Group Ltd. 88E8050 PCI-E ASF Gigabit Ethernet Controller. uname -a output: 2.6.32.11-99.fc12.x86_64 #1 SMP Mon Apr 5 19:59:38 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux The server is a file server running Nginx with the stub status module enabled, so I can see the current amount of connections. The problem present itself when I have a high number of simultaneous connections in a writing state. Usually around 350, at this very moment it's at 590 and the server is almost unusable and stuck at 230mbit/s. If I run stop and hit 1 to see CPU core usages I have all 4 cores with around 99% io wait, if I run iotop the nginx workers are the only processes producing any read load, currently at around 25MB/s. I have each of the workers bound to their own core. Initially I figured it was just the disks being bugged. But I've run fscheck and smartmontools checks and found no errors. I also ran an iozone test which you can see the result of here: http://www.pastie.org/951667.txt?key=fimcvljulnuqy2dcdxa Additionally, when the amount of connections are low I have no problem getting a good speed. If I wget over the local network it easily hits 60MB/sec. Right now I just tried putting a file in /dev/shm, then I symlinked a file from the public dir to it and used wget over the local network and only got 50KB/s. Also, if I try to cp /dev/shm/test /root/test it quickly copies around 740MB and then slows down HEAVILY. Again with iotop reporting 99% iowait. I'm not really sure how to go about figuring out what the problems are. It could be a natural disk limitation but then the file from /dev/shm ought to transfer so it seems there's a network limit, but that's fine when there's not many connections. Perhaps it's a TCP stack problem but I really have no idea how to check that. Any suggestions on how to proceed with debugging would be very welcome. If additional information is required then let me know and I'll try to get it. Thanks.

Read the article
Can't set up printing from Mac OS X (10.5.7) to an HP PSC 2410 shared from PC running Ubuntu 9.10

- by Weston C

I've got an HP PSC 2410 printer shared from a fresh Ubuntu 9.10 installation. I'm able to send documents to this printer over the network from another Ubuntu machine. But so far, I haven't been able to find a setup where I can send documents to that printer from a MacBook running 10.5.7. On the Mac side, when setting things up, I go into System Prefs Print & Fax, click on the "+" mark, select "IP", pick "IPP", enter the IP address of the Ubuntu box, leave the queue blank, enter the Name and location, and I think it's when I get to the "Print Using" (driver selection) part that I'm running into issues. If I use "Auto Select", it defaults to "Generic PostScript Printer", which I doubt the PSC 2410 is (and sure enough, if I print, the jobs don't go through). If I try "Select a driver to use...", there's not an option for an HP PSC 2400. This seems a little odd: I can plug the printer directly into one of our Macs and it immediately figures out the driver and I can print no problem, but that's apparently the way things work. So, that leaves one option: "Other", which, when selected, brings up a dialog apparently for the purpose of manually locating a driver. I've tried visiting HP's web site. They have drivers for earlier versions of Mac OS X, but state that after 10.4, Mac OS X should just come with the relevant drivers. I've also tried setting things up by interacting with the CUPS server on the Mac through a browser: I go to http://localhost:631/, select "Add New Printer", pick "Internet Printing Protocol (http)" for the Device selection, enter "http://ubuntu.machine.ip.address:631/printers/hp-psc-2400-series" for the Device URI, select "HP" for Make, and then on the next screen, we're back to the problem where the PSC 2400 just doesn't show up. There's an option to "provide a PPD file", which I assume would be the printer driver I can't find. A Google search for "HP PSC 2410 ppd Leopard" doesn't seem to yield much other than a reminder that the printer is supposed to just work out of the box on Leopard. A local search for ".ppd" or "2410" on either Mac also doesn't yield anything that looks like a relevant print driver. I'm totally stuck at this point. Any advice?

Read the article
Problems migrating an EBS backed instance over AWS Regions

- by gshankar

Note: I asked this question on the EC2 forums too but haven't received any love there. Hopefully the ServerFault community will be more awesome. The new AWS Sydney region opening up is something that we've been waiting for for a long time but I'm having a lot of trouble migrating our instances over from N. California. I managed to migrate 1 instance over using CloudyScripts to move a snapshot and then firing up a new instance in the Sydney region. This was a very new instance so both the source and destination were running on a Ubuntu 12.04 LTS server and I had no issues there. However, the rest of our instances are all Ubuntu 10.04 LTS and with these, I'm having a lot of problems. I've tried following: 1- following the AWS whitepaper on moving instances which was given to us at the recent Customer Appreciation Day in Sydney where the new region was launched. The problem with this approach was with the last step (Step 19) here you register the image: ec2-register -s snap-0f62ec3f -n "Wombat" -d "migrated Wombat" --region ap-southeast-2 -a x86_64 --kernel aki-937e2ed6 --block-device-mapping "/dev/sdk=ephemeral0" I keep getting this error: Client.InvalidAMIID.NotFound: The AMI ID 'ami-937e2ed6' does not exist which I think is due to the kernel_id not existing in the Sydney region? 2- Using CloudyScripts to move a snapshot and then creating a new volume and attaching to a new instance in Sydney This results in the instance just hanging on boot and failing the status checks. I can't SSH in or look at the server log I suspect that my issue is with finding the right kernel_id for the volume in the new region. However I can't seem to work out how to go about finding this kernel_id, the ones I've tried (from the original instance) don't result in the Client.InvalidAMIID.NotFound: The AMI ID 'ami-937e2ed6' error and any other kernel_id just won't boot. I've tried both 12.04 and 10.04 versions of Ubuntu. Nothing seems to work, I've been banging my head against a wall for a while now, please help! New (broken) instance i-a1acda9b ami-9b8611a1 aki-31990e0b Source instance i-08a6664e ami-b37e2ef6 aki-937e2ed6 p.s. I also tried following this guide on updating my Ubuntu LTS version to 12.04 before doing the migration but it didn't seem to work either, still getting stuck on updating the kernel_id http://ubuntu-smoser.blogspot.com.au/2010/04/upgrading-ebs-instance.html

Read the article
Windows XP installation problems

- by Samurai Waffle

I recently asked a question on here, and thought I had it working... Here is a link to it. Windows XP Installation problems So basically I'm having trouble getting XP installed. To sum it up, a computer I have had a boot sector virus, and I used Darik's Nuke and Boot to wipe the hard drive clean. So the hard drive has nothing on it. I had to try and install Windows through a DOS prompt, because for some reason it won't read it off the DVD. The UBCD is able to look at the files located on the DVD I have in, but I can't boot from it for some reason. So I extracted it to a USB drive, booted to DOS and started the setup process. Here's the weird thing with DOS... It can only find the C: drive. The C: drive in DOS is the flash drive that I have in, running DOS. I can't find the hard drive anywhere! So anyways, after starting the setup process, it copied the files over to the "hard drive" (which took 16 hours because the version of DOS I ran couldn't run smartdrv.exe), and it said the computer had to reboot. So I let it reboot, and it stopped and said there is no boot device. So I popped in UBCD that I have installed on a flash drive, and I discovered that it had copied the Windows files over to the flash drive and not the hard drive. It never asked where it should extract the files... So I toyed around with UBCD, ran a memory test on the hard drive to make sure it was fine, and it came out clean. So I'm stuck now. How can I get this installed? Writing this, I came up with an idea. If I copy the DOS startup files over to the hard drive, would I be able to start DOS from it? If so, I believe that could fix my problems. Any help is greatly appreciated, because I am running out of ideas and am at my wits end with this computer.

Read the article
Postfix: How to apply header_checks only for specific Domains?

- by Lukas

Basically what I want to do is rewriting the From: Header, using header_checks, but only if the mail goes to a certain domain. The problem with header_check is, that I can't check for a combination of To: and From: Headers. Now I was wondering if it was possible to use the header_checks in combination with smtpd_restriction_classes or something similar. I've found a lot information about header_checks and multiple header fields, when searching the net. All of them basically telling me, that one can't combine two header for checking. But I didn't find any information if it was possible to only do a header check if a condition (eg. mail goes to example.com) was met. Edit: While doing some more Research I've found the following article which suggests to add a Service in postfix master.cf, use a transportmap to pass mails for the Domain to that service and have a separate header_check defined with -o. The thing is that I can't get it to work... What I did so far is adding the Service to the master.cf: example unix - - n - - smtpd -o header_checks=regexp:/etc/postfix/check_headers_example Adding the followin Line to the transportmap: example.com example: Last but not least I have two regexp-files for header checks, one for the newly added service, and one to redirect answers to the rewritten domain. check_headers_example: /From:(.*)@mydomain.ain>(.*)/ REPLACE From:[email protected]>$2 Obviously if someone answers, the mail would go to nirvana, so I have the following check_headers defined in the main postfix process: /To:(.*)<(.*)@mydomain.example.com>(.*)/ REDIRECT [email protected]$2 Somehow the Transport is ignored. Any help is appreciated. Edit 2: I'm still stuck... I did try the following: smtpd_restriction_classes = header_rewrite header_rewrite = regexp:/etc/postfix/rewrite_headers_domain smtpd_recipient_restrictions = (some checks) check_recipient_access hash:/etc/postfix/rewrite_table, (more checks) In the rewrite_table the following entries exist: /From:(.*)@mydomain.ain>(.*)/ REPLACE From:[email protected]>$2 All it gets me is a NOQUEUE: reject: 451 4.3.5 Server configuration error. I couldn't find any resources on how you would do that but some people saying it wasn't possible. Edit 3: The reason I asked this question was, that we have a customer (lets say customer.com) who uses some aliases that will forward mail to a domain, let's say example.com. The mailserver at example.com does not accept any mail from an external server that come from a sender @example.com. So all mails that are written from example.com to [email protected] will be rejected in the end. An exception on example.com's mailserver is not possible. We didn't really solve this problem, but will try to work around it by using lists (mailman) instead of aliases. This is not really nice though, nor a real solution. I'd appreciate all suggestions how this could be done in a proper way.

Read the article
No Telnet login prompt when used over SSH tunnel

- by SCO

Hi there ! I have a device, let's call it d1, runnning a lightweight Linux. This device is NATed by my internet box/router, hence not reachable from the Internet. That device runs a telnet daemon on it, and only has root as user (no pwd). Its ip address is 192.168.0.126 on the private network. From the private network (let's say 192.168.0.x), I can do: telnet 192.168.0.126 Where 192.168.0.126 is the IP address in the private network. This works correctly. However, to allow administration, I'd need to access that device from outside of that private network. Hence, I created an SSH tunnel like this on d1 : ssh -R 4455:localhost:23 ussh@s1 s1 is a server somewhere in the private network (but this is for testing purposes only, it will endup somewhere in the Internet), running a standard Linux distro and on which I created a user called 'ussh'. s1 IP address is 192.168.0.48. When I 'telnet' with the following, let's say from c1, 192.168.0.19 : telnet -l root s1 4455 I get : Trying 192.168.0.48... Connected to 192.168.0.48. Escape character is '^]'. Connection closed by foreign host . The connection is closed after roughly 30 seconds, and I didn't log. I tried without the -l switch, without any success. I tried to 'telnet' with IP addresses instead of names to avoid reverse DNS issues (although I added to d1 /etc/hosts a line refering to s1 IP/name, just in case), no success. I tried on another port than 4455, no success. I gathered Wireshark logs from s1. I can see : s1 sends SSH data to c1, c1 ACK s1 performs an AAAA DNS request for c1, gets only the Authoritave nameservers. s1 performs an A DNS request, then gets c1's IP address s1 sends a SYN packet to c1, c1 replies with a RST/ACK s1 sends a SYN to c1, C1 RST/ACK (?) After 0.8 seconds, c1 sends a SYN to s1, s1 SYN/ACK and then c1 ACK s1 sends SSH content to d1, d1 sends an ACK back to s1 s1 retries AAAA and A DNS requests After 5 seconds, s1 retries a SYN to c1, once again it is RST/ACKed by c1. This is repeated 3 more times. The last five packets : d1 sends SSH content to s1, s1 sends ACK and FIN/ACK to c1, c1 replies with FIN/ACK, s1 sends ACK to c1. The connection seems to be closed by the telnet daemon after 22 seconds. AFAIK, there is no way to decode the SSH stream, so I'm really stuck here ... Any ideas ? Thank you !

Read the article
Varnish 503 Guru Mediation errors with pfsense and healthy apache

- by Fammy

We are running a pfsense firewall / load balancer with varnish as service, In front of Fedora linux webservers running apache. We are getting intermittent 503 guru mediation errors. We are a bit stuck scratching our heads because it is not easily repeatable. The timeouts are set to 30s (connect and first byte) but yet the 503 page will show instantly, not after 30s. Then if you refresh immediately it may very well work instantly and sometimes for a 100 refreshes. The load average on the web servers is < 1, the DB server is < 3 (all servers (web, db, pfsense/varnish) are physical rather than VM. I would have thought if the timeouts were being hit then the 503 page would only appear after 30s am I mistaken? Also when an error happens there does not appear to be any corresponding error in apache's log files. This seems to affect pages as well as images, so it is possible to have the page load fine, and for 9/10 images on the page to be fine but 1 not work An example of the varnish debug is below. It says no backend connection but I can't figure out why, if the load was high on apache I could understand it being flaky The machines are on the same gig ethernet lan 21 ReqStart c *IP-REMOVED* 33418 1274368062 21 RxRequest c GET 21 RxURL c /fashion/ 21 RxProtocol c HTTP/1.1 21 RxHeader c User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.5) Gecko/2008121622 Fedora/3.0.5-1.fc10 Firefox/3.0.5 21 RxHeader c Host: *ourdomain.com* 21 RxHeader c Accept: */* 21 RxHeader c Accept-Encoding: deflate, gzip 21 VCL_call c recv lookup 21 VCL_call c hash 21 Hash c /fashion/ 21 Hash c *ourdomain.com* 21 VCL_return c hash 21 VCL_call c miss fetch 21 FetchError c no backend connection 21 VCL_call c error restart 21 VCL_call c recv lookup 21 VCL_call c hash 21 Hash c /fashion/ 21 Hash c *ourdomain.com* 21 VCL_return c hash 21 VCL_call c miss fetch 21 FetchError c no backend connection 21 VCL_call c error restart 21 VCL_call c recv lookup 21 VCL_call c hash 21 Hash c /fashion/ 21 Hash c *ourdomain.com* 21 VCL_return c hash 21 VCL_call c miss fetch 21 FetchError c no backend connection 21 VCL_call c error deliver 21 VCL_call c deliver deliver 21 TxProtocol c HTTP/1.1 21 TxStatus c 503 21 TxResponse c Service Unavailable 21 TxHeader c Server: Varnish 21 TxHeader c Content-Type: text/html; charset=utf-8 21 TxHeader c Content-Length: 384 21 TxHeader c Accept-Ranges: bytes 21 TxHeader c Date: Wed, 11 Apr 2012 10:36:17 GMT 21 TxHeader c X-Varnish: 1274368062 21 TxHeader c Age: 0 21 TxHeader c Via: 1.1 varnish 21 TxHeader c Connection: close 21 TxHeader c X-Cache: MISS 21 Length c 384 21 ReqEnd c 1274368062 1334140577.449995041 1334140577.450334787 1.794108152 0.000282764 0.000056982

Read the article
Looking for a new backup solution to replace dying tape drive

- by E3 Group

We're running Windows Server 2003 SBS and another machine with Server 2003 Standard on it. The SBS server is about 7 years old running pretty much 24/7 - a HP server of some description. We have an Ultrium 448 cycling LTO2 400GB tapes daily and incrementally backing up approximately 100gb worth of data (20gb C:\ and system state, 40gb exchange, 40gb database for some crap marketing software) on BackupExec 10D. As of 5 months ago, the backups have been consistently failing with IO errors, bad reads and some write errors. When I say consistent, I mean every time and we haven't had a proper backup for the entire 5 months - So if the server explodes tomorrow, 7 years worth of data will just cease to exist. I've only just recently rejoined the company and am looking at rectifying the more concerning problems, so the first thing I did was try a backup to an USB2.0 external drive. It was excruciatingly slow. In fact it was so slow it took 40 hours and it still wasn't finished. I ended up cancelling it and reconfiguring the selections again to reduce file size. This, however, isn't a permanent solution. I concluded that the IO error was either from a faulty tape drive (which has a tape stuck in there right now and not coming out) or from a dying SCSI controller. Neither of them are good news and both are extremely expensive to fix. I'm operating on an extremely low budget so have been looking at outsourcing the backups. A company in Sydney (where I'm located) offer incremental online backups via a NAS. It costs almost double a new tape drive but offers monthly repayments which will let us get through times when cash flow is minimal. It seems like a sweet deal but it is still a little bit pricey. So I'm looking for a cheaper, yet reliable solution. Maybe some in-house NAS or something offsite? The idea is to avoid using tapes. Are there any recommendations for rectifying my current situation? Or are tapes the only way to go? I'm concerned that the server will die one day in the near future and I must be able to restore it to another server with different hardware.

Read the article
Web Service gets unavailable after several concurrent calls

- by Roman

We are testing GoDaddy Virtual Data Center and came to a very strange issue when our web site gets unavailable. GoDaddy Support keeps saying the issue is in our web server settings, but looking at the result of our tests I doubt it. TEST ENVIRONMENT Virtual DataCenter with Windows hosted at GoDaddy.com. All servers have Windows Server 2008 R2 Datacenter, IIS 7. Server One with IP address 10.1.0.4 Server Two with IP address 10.1.0.3 Both servers are in private network not visible from outside. Port Forward with IP address 50.62.13.174. Port Forward is assigned to Server One TEST DESCRIPTION JMeter is used as a Client App to simulate 30 concurrent users sending 100 SOAP requests each. Interval between requests is 1 second. Http link used for testing: http://50.62.13.174/v2/webservices.asmx TEST ONE Test is run from a computer in our office. After JMeter starts running test, almost immediately, the link above becomes unavailable in a browser. After test completion, the link is not available in a browser for about 5 more minutes. Remote Desktop is working well, so we can connect to Server One remotely. After about 5 minutes since test completion, the link becomes available in a browser again. TEST TWO Test is run from Server Two (that is part of our virtual data center). Test works very well, no visible delays in processing. The link is available in a browser all the time. TEST THREE Test is run from Server One using localhost. The result is the same as in TEST TWO - no issues. TEST FOUR We repeated TEST ONE from other computers that we have located in different countries, all with the same result as TEST ONE. CONCLUSION As the test works well from Server Two, but does not work from outside our virtual data center, we feel there are issues with the network or its capacity. The whole behaviour looks like out requests from outside get stuck somewhere before reaching our virtual data center. Has anybody had similar issues in the past? Are there chances that something is wrong with our server settings?

Read the article
Missing startup screens and slow bootup/login after using WinClone to expand Bootcamp partition

- by user26453

I used WinClone to backup my Bootcamp partition, which was a Windows 7 Ultimate install, on my late 2006 Macbook Pro. I desired to expand the Bootcamp partition's size. It worked reasonably well with some hiccups along the way and some remaining issues. First issue I ran into was the Bootcamp Assistant utility - it would not recreate the partition. This was due to a lack of contiguous space that is required for the Bootcamp partition. As a result I wiped the whole drive and reinstalled Snow Leopard, did the minimum amount of system updates, and created and formated a new Bootcamp partition. WinClone restored the image without complaint and the image was automatically resized to the new partition's size. Second issue I ran into was after the first boot into Windows. The first thing I noticed was that instead of the newer "slick" startup screen (4 colors wisping around, a Windows 7 title), there was more of an old school style startup screen (a progress bar with block increments, yellow/greenish color, nothing else really). The initial bootup to a login screen was slow, perhaps as Windows dealt with the partition changes. After logging in, the screen goes blank and the computer seems to hang for a minute, before completing the login. After subsequent restarts, the slick screen is still missing, boot to login screen is normal, but the time from login to desktop active is still very slow. As a side note, this behavior of a long time from login to the desktop finally loaded I've previously only seen when the computer would try to hibernate and fail (battery is really bad). On the next startup, I would see this behavior, but not subsequently. So a potential cause: I imaged the partition after hibernating out of Windows. From reading some posts/guides on the subject, this was not recommended, and perhaps shouldn't even have worked? Could the partition be stuck in some weird mode as a result that makes the boot issues appear? I've attempted to disable hibernation and restart, trying to delete the .sys file that hibernation uses. Other fixes I'm thinking of attempting are booting a Win7 disc and repairing the install/partition. I can't shake the nagging feeling something isn't right as a result of the modified boot screens and the slow login process.

Read the article
Routing RFC1918 addresses through dd-wrt via a switch

- by espenfjo

I am a bit stuck with an experiment of mine. I have a network looking somewhat like this. | Internet | | ---- |Switch| ---- | | Server w/pub IP | DD-WRT router 192.168.1.1 | | RFC1918 clients 192.168.1.0/24 What I want is for the RFC1918 clients to speak directly with each others. On the server with the public IP I have this route: 192.168.1.0/24 dev eth0 scope link and can see that packets are infact reaching the dd-wrt router for 192.168.1.1, even though if I get no answer. Trying to reach one of the RFC1918 clients from the public IP server will get no result, as the dd-wrt router is not announcing that network on to its external interface (arp who-has 192.168.1.107 tell xxx.xxx.xxx.xxx, but no answer). The router being an WLAN dd-wrt router has of course a load of routes, VLANs and interfaces: xxx.xxx.xxx.1 dev vlan2 scope link 192.168.1.0/24 dev br0 proto kernel scope link src 192.168.1.1 192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.244 84.215.64.0/18 dev vlan2 proto kernel scope link src xxx.xxx.xxx.xxx 169.254.0.0/16 dev br0 proto kernel scope link src 169.254.255.1 127.0.0.0/8 dev lo scope link 0.0.0.0 via xxx.xxx.xxx.1 dev vlan2 xxx.xxx.xxx.xxx being the public IP, and xxx.xxx.xxx.1 being the default route for the public IP. I am not sure where to continue with this. I would recon that I both need routing on the dd-wrt router, as well as some iptables magic? Why do something this complex? Why not ;) Also, do not mind that "Internet" can get RFC1918 traffic, it wont go outside of the walls. EDIT 1: Following the tip from stew I do indeed get the correct ARP flowing. And adding an iptables rule for allowing traffic from that specific public IPd machine I get traffic between the systems! Oddly enough though, the speed I get from Server w/pub IP - RFC1918 clients are the same as if the traffic were routed out onto the Internet and back. Edit 2: Ok, disconnecting the external Internet connection will still give the same, crappy transfer speed. So it has to be something else. Edit 3: Ok, I guess there are other reasons for this crappy speed. Case closed. :)

Read the article
Rsync over ssh: "ERROR: module is read only" suddenly appeared

- by user978548

I've used from some time rsync/ssh to backup my shared host contents to my personal Synology NAS (212j for that matter), and it worked quite well. For information, I use a password-less ssh connection. 3 days ago, I updated my NAS software and since (or at least I believe it's since that), the backup won't work anymore. I get the following error on the host: rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32) ERROR: module is read only ..which I do not understand. beside that nothing changed that I know of in both source and destination that can be related to rsync or ssh, I did check a few things and all seems to be alright: I can still connect through ssh from the host to my NAS with the good user, so ssh stuff like keys haven't changed. I also have the correct file permissions on the NAS (I checked, and also tried to create files, directories, .. with the user used by rsync through ssh). I read here and there that the error means that I have to ensure that my rsyncd.conf have the right read only = no in it, but as far as I know, I never used rsyncd as well as I never configured anything for it and until now it worked like a charm.. I use the following command to do the backup: rsync -ab --recursive \ --files-from="$FILES_FROM" \ --backup-dir=backup_$SUFFIX \ --delete \ --filter='protect backup_*' \ $WDIRECTORY/ \ remote_backup:$REMOTE_BACKUP/ So I'm stuck and really can't figure out what happened. Edit: As suggested in comments, I also tried passing commands to ssh (but not from inside a ssh session), that worked as expected, and also tried a single rsync command, which didnt worked, failing just like the complete backup command. (sharedHost):hostuser:~ > touch test.txt (sharedHost):hostuser:~ > rsync test.txt remote_backup:backups/test.txt ERROR: module is read only rsync error: syntax or usage error (code 1) at main.c(1034) [Receiver=3.0.8] rsync: connection unexpectedly closed (9 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(601) [sender=3.0.7] and (sharedHost):hostuser:~ > ssh remote_backup 'touch /abs_path_to_backups/backups/test2.txt && echo "ProoF" > /abs_path_to_backups/backups/test2.txt' (sharedHost):hostuser:~ > ssh remote_backup 'cat /abs_path_to_backups/backups/test2.txt' ProoF

Read the article
How do I permanently delete e-mail messages in the sendmail queue and keep them from coming back?

- by Steven Oxley

I have a pretty annoying problem here. I have been testing an application and have created some test e-mails to bogus e-mail addresses (not to mention that my server isn't really set up to send e-mail anyway). Of course, sendmail is not able to send these messages and they have been getting stuck in the sendmail queue. I want to manually delete the messages that have been building up in the queue instead of waiting the 5 days that sendmail usually takes to stop retrying. I am using Ubuntu 10.04 and /var/spool/mqueue/ is the directory in which every how-to I have read says the e-mails that are queued up are kept. When I delete the files in this directory, sendmail stops trying to process the e-mails until what appears to be a cron script runs and re-populates this directory with the messages I don't want sent. Here are some lines from my syslog: Jun 2 17:35:19 sajo-laptop sm-mta[9367]: o530SlbK009365: to=, ctladdr= (33/33), delay=00:06:27, xdelay=00:06:22, mailer=esmtp, pri=120418, relay=e.mx.mail.yahoo.com. [67.195.168.230], dsn=4.0.0, stat=Deferred: Connection timed out with e.mx.mail.yahoo.com. Jun 2 17:35:48 sajo-laptop sm-mta[9149]: o4VHn3cw003597: to=, ctladdr= (33/33), delay=2+06:46:45, xdelay=00:34:12, mailer=esmtp, pri=3540649, relay=mx2.hotmail.com. [65.54.188.94], dsn=4.0.0, stat=Deferred: Connection timed out with mx2.hotmail.com. Jun 2 17:39:02 sajo-laptop CRON[9510]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -n 200 -r -0 rm) Jun 2 17:39:43 sajo-laptop sm-mta[9372]: o52LHK4s007585: to=, ctladdr= (33/33), delay=03:22:18, xdelay=00:06:28, mailer=esmtp, pri=1470404, relay=c.mx.mail.yahoo.com. [206.190.54.127], dsn=4.0.0, stat=Deferred: Connection timed out with c.mx.mail.yahoo.com. Jun 2 17:39:50 sajo-laptop sm-mta[9149]: o51I8ieV004377: to=, ctladdr= (33/33), delay=1+06:31:06, xdelay=00:03:57, mailer=esmtp, pri=6601668, relay=alt4.gmail-smtp-in.l.google.com. [74.125.79.114], dsn=4.0.0, stat=Deferred: Connection timed out with alt4.gmail-smtp-in.l.google.com. Jun 2 17:40:01 sajo-laptop CRON[9523]: (smmsp) CMD (test -x /etc/init.d/sendmail && /usr/share/sendmail/sendmail cron-msp) Does anyone know how I can get rid of these messages permanently? As a side note, I'd also like to know if there is a way to set up sendmail to "fake" sending e-mail. Is there?

Read the article
PC monitors shut off and system hangs while playing 3D games, but sound continues - Diagnosis?

- by Jon Schneider

Two days ago, I started running into a problem with my Windows PC: The PC's two connected monitors simultaneously lose signal and go black (as though the PC had been powered off). The keyboard's Numlock, Capslock, and Scroll Lights will become "stuck" in their current positions, as though the PC is hung. (For example, the Numlock light on the keyboard remains lit regardless of me pressing the Numlock key repeatedly.) No keyboard input does anything. (Ctrl+Alt+Del, Ctrl+Shift+Esc, Ctrl+C, etc.) However -- Whatever sound/music the PC was playing continues to play, and the PC's fans continue running, so the PC hasn't powered itself off or rebooted itself. Opening up the case, the graphics card is pretty hot to the touch. I had this happen 3 times in one evening. In all cases, I was playing a game with 3D graphics when the problem occurred (Torchlight, Minecraft, Magic: The Gathering 2012, Avadon: The Black Fortress demo). I have yet to have the problem happen when I'm not playing a game. This system has been running stable for about 2.5 years prior to this. I didn't make any changes to the system prior to the problem starting to occur. System specs: OS: Windows 7 64-bit Processor: Intel Core 2 Duo E7200 Wolfdale 2.53GHz Video Card: XFX GeForce 9800 GT 512 MB Motherboard: Foxconn P45A-S LGA 775 Intel ATX RAM: Corsair 4 GB (2x 2GB) DDR2-800 (PC2 6400) Full specs: New PC 2008 Troubleshooting tried so far (the problem occurred again after taking each of these steps, one at a time): Updated the video drivers with the latest drivers from NVidia's site. Opened up the case and cleaned out the video card and processor fans (both were pretty dirty). Installed and ran temperature monitor software. The processor idles at about 50 degrees C, and goes up to about 63 degrees C while playing a game (seems on the warm side, but not excessively so?). The software wasn't able to report the temperature of the GPU -- not sure this particular GPU supports software temperature readout? My initial diagnosis is that maybe the GPU is on its last legs (given that it seems to be running pretty hot, and the problem only occurs while playing 3D games). Does this seem likely? Or is it likely that this problem is caused by the processor, RAM, or motherboard? Or could this be a software issue of some kind? Thanks for any advice!

Read the article
SQL Server database filled the hard drive and freeing up space isn't possible

- by Jon

I have a database in SQL Server 2008 on a 1Tb hard drive and it filled the drive, there is only 4Kb free. The MDF file is 323Gb and the LDF is 653Gb. The hard disk this DB is on has no other files on it other than the MDF and LDF so it's impossible to free up any space on the drive. The main hard disk is smaller but there is enough room to transfer the MDF to that drive, in case that helps. This server is overseas at a customer site and it's not possible at the moment to add more disk space to the server. It's also not possible to delete any records because the DB is in a failed mode (due to no disk space) and it doesn't respond to most commands. The Db is currently in full recovery mode which is why the LDF file is so large. This DB really doesn't need to be in full recovery so going forward we plan on switching it to simple mode which will save us a lot of space. I also don't care about losing the LDF file, but I need all of the data. I've spent a lot of time looking for a way out of this problem but everything I've found first involves either freeing up disk space or adding more disk space, neither of which is an option at this time. I'm stuck and any help would be greatly appreciated. I get the following log when trying to switch the DB to online mode. Msg 945, Level 14, State 2, Line 3 Database 'DBNAME' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details. Msg 5069, Level 16, State 1, Line 3 ALTER DATABASE statement failed. Msg 1101, Level 17, State 12, Line 3 Could not allocate a new page for database 'DBNAME' because of insufficient disk space in filegroup 'DEFAULT'. Create the necessary space by dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup. I've found the following solutions but none work due to having no disk space on that drive, and since the DB is in a failed state I can't run most commmands. - DBCC SHRINKFILE - can't be run because doing a 'use DBNAME' fails - Detaching the DB and then changing the location of the MDF/LDF files, this fails because the DB is in an offline mode so you can't run detach. I'm at a loss about what else to try. Thanks.

Read the article
How to get more information from the system crash

- by viraptor

I'd like to debug an issue I'm having with a linux (debian stable) server, but I'm running out of ideas of how to confirm any diagnosis. Some background: The servers are running DL160 class with hardware raid between two disks. They're running a lot of services, mostly utilising network interface and CPU. There are 8 cpus and 7 "main" most cpu-hungry processes are bound to one core each via cpu affinity. Other random background scripts are not forced anywhere. The filesystem is writing ~1.5k blocks/s the whole time (goes up above 2k/s in peak times). Normal CPU usage for those servers is ~60% on 7 cores and some minimal usage on the last (whatever's running on shells usually). What actually happens is that the "main" services start using 100% CPU at some point, mainly stuck in kernel time. After a couple of seconds, LA goes over 400 and we lose any way to connect to the box (KVM is on it's way, but not there yet). Sometimes we see a kernel reporting hung task (but not always): [118951.272884] INFO: task zsh:15911 blocked for more than 120 seconds. [118951.272955] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [118951.273037] zsh D 0000000000000000 0 15911 1 [118951.273093] ffff8101898c3c48 0000000000000046 0000000000000000 ffffffffa0155e0a [118951.273183] ffff8101a753a080 ffff81021f1c5570 ffff8101a753a308 000000051f0fd740 [118951.273274] 0000000000000246 0000000000000000 00000000ffffffbd 0000000000000001 [118951.273335] Call Trace: [118951.273424] [<ffffffffa0155e0a>] :ext3:__ext3_journal_dirty_metadata+0x1e/0x46 [118951.273510] [<ffffffff804294f6>] schedule_timeout+0x1e/0xad [118951.273563] [<ffffffff8027577c>] __pagevec_free+0x21/0x2e [118951.273613] [<ffffffff80428b0b>] wait_for_common+0xcf/0x13a [118951.273692] [<ffffffff8022c168>] default_wake_function+0x0/0xe .... This would point at raid / disk failure, however sometimes the tasks are hung on kernel's gettsc which would indicate some general weird hardware behaviour. It's also running mysql (almost read-only, 99% cache hit), which seems to spawn a lot more threads during the system problems. During the day it does ~200kq/s (selects) and ~10q/s (writes). The host is never running out of memory or swapping, no oom reports are spotted. We've got many boxes with similar/same hardware and they all seem to behave that way, but I'm not sure which part fails, so it's probably not a good idea to just grab something more powerful and hope the problem goes away. Applications themselves don't really report anything wrong when they're running. I can run anything safely on the same hardware in an isolated environment. What can I do to narrow down the problem? Where else should I look for explanation?

Read the article
Internet Working, Browsing Not.

- by jeffreypriebe

I have a very odd problem that I can't resolve. I am connected to the internet, but my browsing doesn't work. I don't mean a web browser - I mean browsing. Firefox, Chrome, Curl all fail to successfully connect to an HTTP address. However existing connections, e.g. to mail in Outlook (Exchange Server and also IMAP server) continue to work. Also, the internet is on, I can confirm both from my machine (other ports / connections) as well as from any other computer connected to the same network. Additionally, it appears to be HTTP, not simple a port issue as HTTP over port 8443 (Tortoise SVN if you must know - running over HTTP not over SVN) also fails. I am using Windows Vista SP2 (build 6002). It seems to "creep up" in that after running the computer for a few hours it will fail. (No found way to systematically reproduce the problem.) Additionally, it seems to be more prone on days where the internet connection is flaky already (not sure why the internet is flaky, just is, lot's of failed browsing requests and have to retry/reload often). What I have tried (when the problem arises) - none have yielded any resolution: Resetting the network connection (dis-connect, re-connect) Disable/re-enable the network adapter Double-checked the ip settings Double-checked the HOSTS file. Note: DNS continues to work (both new and cached responses to DNS queries). (Thanks for the suggestion Daniel and antenore.) Checked the routing tables (ip4 only as ipv6 is beyond my understanding) resetting all involved hardware (routers and modems) Close and reopen browsers Looked for malware interference: Run HijackThis Looked for suspicious processes using SysInternals procexp. Looked for explorer hijacks, lsa provider interference, winsock provider interference using SysInternals Autoruns. Run a complete anti-virus scan. Reviewed the output of a netstat -onab to see if there were stuck ports open or unusual processes running somewhere The only thing that works is to do a full reboot. That works 100% of the time to restore browsing. What else can I try to nail down the problem?

Read the article
serving mp3s to mobile devices is flooding nginx with partial requests

- by drumfire

I am serving mp3s with a minimalistic nginx server. What I see in my log files is that there are a lot of requests, in particular from AppleCoreMedia and sometimes Android useragents, that flood the server with short requests. Sometimes they keep requesting to download the same partial content for a very long time; sometimes more than an hour. For example: "GET /somefile.mp3 HTTP/1.1" 206 33041 "AppleCoreMedia/1.0.0.9B206 (iPhone; U; CPU OS 5_1_1 like Mac OS X; en_us)" "GET /somefile.mp3 HTTP/1.1" 206 33041 "AppleCoreMedia/1.0.0.9B206 (iPhone; U; CPU OS 5_1_1 like Mac OS X; en_us)" "GET /somefile.mp3 HTTP/1.1" 206 33041 "AppleCoreMedia/1.0.0.9B206 (iPhone; U; CPU OS 5_1_1 like Mac OS X; en_us)" [...] I also get a lot, but not as much, of these: "-" 400 0 "-" "-" 400 0 "-" The IP addresses are always from clients that start downloading shortly after that request, usually they have roughly the same UserAgent as in the first example. emphasized text I have enabled server throttling and connection limits in nginx to limit the huge amount of log entries from equivalent IPs at least somewhat. There was a performance issue when I saw the same behaviour on the previous server that used Apache. I installed nginx on a better server then moved the site. When Apache could not handle more connections from the increasing number of clients effectively that server was ddossed. There was no bandwidth issue with already connected clients and I don't know if the already connected clients were using more than one connection at a time. Please tell me: Are clients that appear to get stuck on a download a Bad Thing™ I heard people say their mobile bandwidth use was much higher than they could account for. I'm thinking this type of client behaviour can account for that. And costs us more bandwidth too. Which up to date alternatives exist out there that can handle serving this type of data better than plain HTTP? Useful general insights for someone who just came into this field straight out of the late 90s. :-)

Read the article
load-causing processes disappearing from "top" ps -o pcpu shows bogus numbers

- by Alec Matusis

I administer a large number of servers, and I have this problem only with Ubuntu 10.04 LTS: I run a server under normal load (say load average 3.0 on an 8-core server). The "top" command shows processes taking certain % of CPU that cause this load average: say PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 11008 mysql 20 0 25.9g 22g 5496 S 67 76.0 643539:38 mysqld ps -o pcpu,pid -p11008 %CPU PID 53.1 11008 , everything is consistent. The all of the sudden, the process causing the load average disappears from "top", but the process continues to run normally (albeit with a slight performance decrease), and the system load average becomes somewhat higher. The output of ps -o pcpu becomes bogus: # ps -o pcpu,pid -p11008 %CPU PID 317910278 1587 This happened to at least 5 different severs (different brand new IBM System X hardware), each running different software: one httpd 2.2, one mysqld 5.1, and one Twisted Python TCP servers. Each time the kernel was between 2.6.32-32-server and 2.6.32-40-server. I updated some machines to 2.6.32-41-server, and it has not happened on those yet, but the bug is rare (once every 60 days or so). This is from an affected machine: top - 10:39:06 up 73 days, 17:57, 3 users, load average: 6.62, 5.60, 5.34 Tasks: 207 total, 2 running, 205 sleeping, 0 stopped, 0 zombie Cpu(s): 11.4%us, 18.0%sy, 0.0%ni, 66.3%id, 4.3%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 74341464k total, 71985004k used, 2356460k free, 236456k buffers Swap: 3906552k total, 328k used, 3906224k free, 24838212k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 805 root 20 0 0 0 0 S 3 0.0 1493:09 fct0-worker 982 root 20 0 0 0 0 S 1 0.0 111:35.05 fioa-data-groom 914 root 20 0 0 0 0 S 0 0.0 884:42.71 fct1-worker 1068 root 20 0 19364 1496 1060 R 0 0.0 0:00.02 top Nothing causing high load is showing on top, but I have two highly loaded mysqld instances on it, that suddenly show crazy %CPU: #ps -o pcpu,pid,cmd -p1587 %CPU PID CMD 317713124 1587 /nail/encap/mysql-5.1.60/libexec/mysqld and #ps -o pcpu,pid,cmd -p1624 %CPU PID CMD 2802 1624 /nail/encap/mysql-5.1.60/libexec/mysqld Here are the numbers from # cat /proc/1587/stat 1587 (mysqld) S 1212 1088 1088 0 -1 4202752 14307313 0 162 0 85773299069 4611685932654088833 0 0 20 0 52 0 3549 27255418880 5483524 18446744073709551615 4194304 11111617 140733749236976 140733749235984 8858659 0 552967 4102 26345 18446744073709551615 0 0 17 5 0 0 0 0 0 the 14th and 15th numbers according to man proc are supposed to be utime %lu Amount of time that this process has been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK). This includes guest time, guest_time (time spent running a virtual CPU, see below), so that applications that are not aware of the guest time field do not lose that time from their calculations. stime %lu Amount of time that this process has been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK). On a normal server, these numbers are advancing, every time I check the /proc/PID/stat. On a buggy server, these numbers are stuck at a ridiculously high value like 4611685932654088833, and it's not changing. Has anyone encountered this bug?

Read the article
Hell: NTFS "Restore previous versions"...

- by ttsiodras

The hell I have experienced these last 24h: Windows 7 installation hosed after bluetooth driver install. Attempting to recover using restore points via "Repair" on the bootable Win7 installation CD. Attempting to go back one day in the restore points. No joy. Attempting to go back two days in the restore points. No joy. Attempting to go back one week in the restore points. Stil no joy. Windows won't boot. Apparently something is REALLY hosed. And then it hits me - PANIC - the restore points somehow reverted DATA files to their older versions! Word, Powerpoint, SPSS, etc document versions are all one week old now. Using the "freshest" restore point. Failed to restore yesterday's restore point!!! I am stuck at old versions of the data!!! Booting KNOPPIX, mounting NTFS partition as read-only under KNOPPIX. Checking. Nope, the data files are still the one week old versions. Booting Win7 CD, Recovery console - Cmd prompt - navigating - yep, data files are still one week old. Removing the drive, mounting it under other Win7 installation. Still old data. Running NTFS undelete on the drive (read-only scan), searching for file created yesterday. Not found. Despair. At this point, idea: I will install a brand new Windows installation, keeping the old one in Windows.old (default behaviour of Windows installs). I boot the new install, I go to my C:\Data\ folder, I choose "Restore previous versions", click on yesterday's date, and click open... YES! It works! I can see the latest versions of my files (e.g. from yesterday). Thank God. And then, I try to view the files under the "yesterday snapshot-version" of c:\Users\MyAccount\Desktop ... And I get "Permission Denied" as soon as I try to open "Users\MyAccount". I make sure I am an administrator. No joy. Apparently, the new Windows installation does not have access to read the "NTFS snapshots" or "Volume Shadow Snapshots" of my old Windows account! Cross-installation permissions? I need to somehow tell the new Windows install that I am the same "old" user... So that I will be able to access the "Users\MyAccount" folder of the snapshot of my old user account. Help?

Read the article
Nginx + Wordpress Multisite 3.4.2 + subdirectories + static pages and permalinks

- by UrkoM

I am trying to setup Wordpress Multisite, using subdirectories, with Nginx, php5-fpm, APC, and Batcache. As many other people, I am getting stuck in the rewrite rules for permalinks. I have followed these two guides, which seem to be as official as you can get: http://evansolomon.me/notes/faster-wordpress-multisite-nginx-batcache/ http://codex.wordpress.org/Nginx#WordPress_Multisite_Subdirectory_rules It is partially working: http://blog.ssis.edu.vn works. http://blog.ssis.edu.vn/umasse/ works. But other permalinks, like these two to a post or to a static page, don't work: http://blog.ssis.edu.vn/umasse/2008/12/12/hello-world-2/ http://blog.ssis.edu.vn/umasse/sample-page/ They either take you to a 404 error, or to some other blog! Here is my configuration: server { listen 80 default_server; server_name blog.ssis.edu.vn; root /var/www; access_log /var/log/nginx/blog-access.log; error_log /var/log/nginx/blog-error.log; location / { index index.php; try_files $uri $uri/ /index.php?$args; } # Add trailing slash to */wp-admin requests. rewrite /wp-admin$ $scheme://$host$uri/ permanent; # Add trailing slash to */username requests rewrite ^/[_0-9a-zA-Z-]+$ $scheme://$host$uri/ permanent; # Directives to send expires headers and turn off 404 error logging. location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ { expires 24h; log_not_found off; } # this prevents hidden files (beginning with a period) from being served location ~ /\. { access_log off; log_not_found off; deny all; } # Pass uploaded files to wp-includes/ms-files.php. rewrite /files/$ /index.php last; if ($uri !~ wp-content/plugins) { rewrite /files/(.+)$ /wp-includes/ms-files.php?file=$1 last; } # Rewrite multisite '.../wp-.*' and '.../*.php'. if (!-e $request_filename) { rewrite ^/[_0-9a-zA-Z-]+(/wp-.*) $1 last; rewrite ^/[_0-9a-zA-Z-]+.*(/wp-admin/.*\.php)$ $1 last; rewrite ^/[_0-9a-zA-Z-]+(/.*\.php)$ $1 last; } location ~ \.php$ { # Forbid PHP on upload dirs if ($uri ~ "uploads") { return 403; } client_max_body_size 25M; try_files $uri =404; fastcgi_pass unix:/var/run/php5-fpm.sock; fastcgi_index index.php; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; include /etc/nginx/fastcgi_params; } } Any ideas are welcome! Have I done something wrong? I have disabled Batcache to see if it makes any difference, but still no go.

Read the article

Search Results

Search found 22968 results on 919 pages for 'stuck again'.

Page 523/919 | < Previous Page | 519 520 521 522 523 524 525 526 527 528 529 530 | Next Page >

- by rottmanj

- by Christian Nunciato

- by phobia

- by Guntis

- by Martin F

- by Weston C

- by gshankar

- by Samurai Waffle

- by Lukas

- by SCO

- by Fammy

- by E3 Group

- by Roman

- by user26453

- by espenfjo

- by user978548

- by Steven Oxley

- by Jon Schneider

- by Jon

- by viraptor

- by jeffreypriebe

- by drumfire

- by Alec Matusis

- by ttsiodras

- by UrkoM

< Previous Page | 519 520 521 522 523 524 525 526 527 528 529 530 | Next Page >