global state - Page 484

Upstart multiple instances of service not working

- by Dax

I started playing with MongoDB on Lucid. Now I would like to run a DB and Config server on the same box. They both use the same binary to launch, but with different config files and running on different ports. All directories for log and lib is split so one goes to mongodb and the other to mongoconf. Each process can be started without any problems on their own. start mongodb stop mongodb start mongoconf stop mongoconf But if I try to start both, the second one would just start and exit. Using 'initctl log-priority debug' I got the following in the logs. Jan 6 12:44:12 mongo4 init: event_finished: Finished started event Jan 6 12:44:12 mongo4 init: job_process_handler: Ignored event 1 (1) for process 5690 Jan 6 12:44:12 mongo4 init: mongoconf (mongoconf) main process (5690) terminated with status 1 Jan 6 12:44:12 mongo4 init: mongoconf (mongoconf) goal changed from start to stop Jan 6 12:44:12 mongo4 init: mongoconf (mongoconf) state changed from running to stopping man 5 init shows that you can use instance names to differentiate the two. I tried using 'instance mongoconf' in the on upstart script and 'instance mongodb' in the other one, and it still fails. I can manually start the other process, so there is definitely no conflicts on port numbers or directories. Any ideas on what to try or how to get output on why it is 'terminated with status 1'? Thanx

Read the article

Ext3 fs: Block bitmap for group 1 not in group (block 0). is fs dead?

- by ip

My company has a server with one big partition with Mysql database and php files. Now this partition seems to be corrupted, as reported from kernel messages when I tried to mount it manually: [329862.817837] EXT3-fs error (device loop1): ext3_check_descriptors: Block bitmap for group 1 not in group (block 0)! [329862.817846] EXT3-fs: group descriptors corrupted! I've tried to recovery it running tools from a PLD livecd. These are the tools I have tested: - e2retrieve - testdisk - photorec - dd_rescue/dd_rhelp - ddrescue - fsck.ext2 - e2salvage without any success. dumpe2fs 1.41.3 (12-Oct-2008) Filesystem volume name: /dev/sda3 Last mounted on: <not available> Filesystem UUID: dd51610b-6de0-4392-a6f3-67160dbc0343 Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal filetype sparse_super Default mount options: (none) Filesystem state: not clean with errors Errors behavior: Continue Filesystem OS type: Linux Inode count: 9502720 Block count: 18987570 Reserved block count: 949378 Free blocks: 11555345 Free inodes: 11858398 First block: 0 Block size: 4096 Fragment size: 4096 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 16384 Inode blocks per group: 512 Last mount time: Wed Mar 24 09:31:03 2010 Last write time: Mon Apr 12 11:46:32 2010 Mount count: 10 Maximum mount count: 30 Last checked: Thu Jan 1 01:00:00 1970 Check interval: 0 (<none>) Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 128 Journal inode: 8 Journal backup: inode blocks dumpe2fs: A block group is missing an inode table while reading journal inode e2fsck 1.41.3 (12-Oct-2008) fsck.ext3: Group descriptors look bad... trying backup blocks... fsck.ext3: A block group is missing an inode table while checking ext3 journal for /dev/sda3 I tried also backup superblocks, same error result. There's any other tools I have to test before considering these disk definitely unrecoverable? Many thanks, ip

Read the article

What steps should I take to debug this non-starting hvm virtual machine?

- by Ophidian

I have a dom0 machine running CentOS 5.4 with all the latest updates using Xen as my hypervisor. I am using Xen in part because this machine was set up prior to KVM being included in RHEL, and in part because KVM's network bridging configuration is not nearly as simple as Xen's. The dom0 machine is headless and I do all of my VM management via virsh from the command line. I have two hvm domU's: A web server running CentOS 5.4 A mail server running Gentoo Both VM's are backed by LV's on the dom0 but do not use LVM in the domU. Both have virtually identical libvirt configurations (differing by expected things like name, UUID, NIC MAC, VNC port, etc). The web server domU (WSdomU hereafter) does not start since applying the most recent kernel update (kernel-xen-2.6.18-164.15.1.el5.x86_64 and kernel-2.6.18-164.15.1.el5.x86_64 for the dom0 and WSdomU respectively). By 'not start' I mean it appears to be running but it does not use an CPU cycles, does not bring up a graphical console, and does not respond on the network. The WSdomU is listed as no state rather than the normal running or blocked in xentop. The mail server domU starts fine and functions normally. Here are the steps I have taken so far that did not solve the problem: Reboot the dom0 to see if things come up on their own Check xen dmesg on dom0 Check xend logs (a cursory viewing did not show anything blatant; specific suggestions of things to look for would be appreciated) Attempted to connect to the WSdomU's graphical (VNC) console from the dom0 Shutdown the mail server domU and attempt to start the WSdomU Check the SELinux labels on backing LV's (they're the same) Set SELinux to permissive and attempt to start the WSdomU Use virsh edit to try tweaking the WSdomU config virsh undefine, reboot, virsh define the WSdomU config dd the WSdomU LV to an .img file, copy it to my Fedora desktop and run it under KVM (works fine) What steps should I take next to debug this? I will edit in any additional configuration's requested in the comments.

Read the article

FreeBSD's ng_nat stopping pass the packets periodically

- by Korjavin Ivan

I have FreeBSD router: #uname 9.1-STABLE FreeBSD 9.1-STABLE #0: Fri Jan 18 16:20:47 YEKT 2013 It's a powerful computer with a lot of memory #top -S last pid: 45076; load averages: 1.54, 1.46, 1.29 up 0+21:13:28 19:23:46 84 processes: 2 running, 81 sleeping, 1 waiting CPU: 3.1% user, 0.0% nice, 32.1% system, 5.3% interrupt, 59.5% idle Mem: 390M Active, 1441M Inact, 785M Wired, 799M Buf, 5008M Free Swap: 8192M Total, 8192M Free PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 11 root 4 155 ki31 0K 64K RUN 3 71.4H 254.83% idle 13 root 4 -16 - 0K 64K sleep 0 101:52 103.03% ng_queue 0 root 14 -92 0 0K 224K - 2 229:44 16.55% kernel 12 root 17 -84 - 0K 272K WAIT 0 213:32 15.67% intr 40228 root 1 22 0 51060K 25084K select 0 20:27 1.66% snmpd 15052 root 1 52 0 104M 22204K select 2 4:36 0.98% mpd5 19 root 1 16 - 0K 16K syncer 1 0:48 0.20% syncer Its tasks are: NAT via ng_nat and PPPoE server via mpd5. Traffic through - about 300Mbit/s, about 40kpps at peak. Pppoe sessions created - 350 max. ng_nat is configured by by the script: /usr/sbin/ngctl -f- <<-EOF mkpeer ipfw: nat %s out name ipfw:%s %s connect ipfw: %s: %s in msg %s: setaliasaddr 1.1.%s There are 20 such ng_nat nodes, with about 150 clients. Sometimes, the traffic via nat stops. When this happens vmstat reports a lot of FAIL counts vmstat -z | grep -i netgraph ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP NetGraph items: 72, 10266, 1, 376,39178965, 0, 0 NetGraph data items: 72, 10266, 9, 10257,2327948820,2131611,4033 I was tried increase net.graph.maxdata=10240 net.graph.maxalloc=10240 but this doesn't work. It's a new problem (1-2 week). The configuration had been working well for about 5 months and no configuration changes were made leading up to the problems starting. In the last few weeks we have slightly increased traffic (from 270 to 300 mbits) and little more pppoe sessions (300-350). Help me please, how to find and solve my problem?

Read the article

How to export SQL Server data from corrupted database (with disk write error)

- by damitamit

IT realised there was a disk write error on our production SQL Server 2005 and hence was causing the backups to fail. By the time they had realised this the nightly backup was old, so were not able to just restore the backup on another server. The database is still running and being used constantly. However DBCC CheckDB fails. Also the SQL Server backup task fails, Copy Database fails, Export Data Wizard fails. However it seems all the data can be read from the tables (i.e using bcp etc) Another observation I have made is that the Transaction Log is nearly double the size of the Database. (Does that mean all the changes arent being written to the MDF?) What would be the best plan of attack to get the database to a state where backups are working and the data is safe? Take the database offline and use the MDF/LDF to somehow create the database on another sql server? Export the data from the database using bcp. Create the database (use the Generate Scripts function on the corrupt db to create the schema on the new db) on another sql server and use bcp again to import the data. Some other option that is the right course of action in this situation? The IT manager says the data is safe as if the server fails, the data can be restored from the mdf/ldf. I'm not sure so insisted that we start exporting the data each night as a failsafe (using bcp for example). IT are also having issues on the hardware side of things as supposedly the disk error in on a virtualized disk and can't be rebuilt like a normal raid array (or something like that). Please excuse my use of incorrect terminology and incorrect assumptions on how Sql Server operates. I'm the application developer and have been called to help (as it seems IT know less about SQL Server than I do). Many Thanks, Amit

Read the article

what to disable on Windows server? (by list of opened ports)

- by javapowered

I'm using HP DL360p Gen8 for HFT trading. I want to disable any network services I don't need cause I also want to try to disable Windows Firewall to test if this will improve perfomance. Could someone suggest what currently is turned on and can be likely turned off having ports list below? I need only RDP (also I drag & drop files via RDP) Proto Local Address Foreign Address State TCP 0.0.0.0:135 Term:0 LISTENING TCP 0.0.0.0:445 Term:0 LISTENING TCP 0.0.0.0:2301 Term:0 LISTENING TCP 0.0.0.0:2381 Term:0 LISTENING TCP 0.0.0.0:3389 Term:0 LISTENING TCP 0.0.0.0:47001 Term:0 LISTENING TCP 0.0.0.0:49152 Term:0 LISTENING TCP 0.0.0.0:49153 Term:0 LISTENING TCP 0.0.0.0:49154 Term:0 LISTENING TCP 0.0.0.0:49156 Term:0 LISTENING TCP 0.0.0.0:49157 Term:0 LISTENING TCP HIDEN:139 Term:0 LISTENING TCP HIDEN:3389 HIDEN:63373 ESTABLISHED TCP HIDEN:139 Term:0 LISTENING TCP HIDEN:139 Term:0 LISTENING TCP [::]:135 Term:0 LISTENING TCP [::]:445 Term:0 LISTENING TCP [::]:2301 Term:0 LISTENING TCP [::]:2381 Term:0 LISTENING TCP [::]:3389 Term:0 LISTENING TCP [::]:47001 Term:0 LISTENING TCP [::]:49152 Term:0 LISTENING TCP [::]:49153 Term:0 LISTENING TCP [::]:49154 Term:0 LISTENING TCP [::]:49156 Term:0 LISTENING TCP [::]:49157 Term:0 LISTENING UDP 0.0.0.0:68 *:* UDP 0.0.0.0:123 *:* UDP 0.0.0.0:161 *:* UDP 0.0.0.0:500 *:* UDP 0.0.0.0:4500 *:* UDP 0.0.0.0:5355 *:* UDP HIDEN:137 *:* UDP HIDEN:138 *:* UDP HIDEN:137 *:* UDP HIDEN:138 *:* UDP HIDEN:137 *:* UDP HIDEN:138 *:* UDP [::]:123 *:* UDP [::]:161 *:* UDP [::]:500 *:* UDP [::]:4500 *:* UDP [::]:5355 *:*

Read the article

CDN Rerouting on 404 (file not yet in synch with original storage)

- by Alan Ristic

Here is the problem. I've setup my app(on EC2) to store uploaded images directly on Amazon S3. I'd like to be able to serve static files(cdn) from my 'home' server so I wrote script that does sync from S3. But there is a window of (at least) one minute in synch. Now I see two solutions on the problem of pics not been available on 'home' server here: 1.I write script on EC2 (where the app resides) to fetch from DB pics that have status of "not-yet-synch", which is default state when user uploads picture. The script then does a ping to picture and if it gets OK response, updates DB from "not-yet-synch" to "synch". 2.Prefered solution would be to let apache (in this case) redirect request for an image if it sees 404 (e.g. doesent find image requested) to S3. This way I wouldn't need script from solution 1. So what approach do you suggest I take in solving this redundancy problem? Or what is practice in production environments? To further clarify; I'd like so serve images first from 'home' server, if that fails serve them from S3. Tnx, Alan

Read the article

Apache2 VirtualHosts 403 Oddity

- by Carson C.

I'm sure this is something I should already understand, but I'm finding myself confused. The configs in play add up to this: NameVirtualHost *:80 Listen 80 <VirtualHost *:80> <Directory /> Options FollowSymLinks AllowOverride None Order deny,allow Deny from all </Directory> </VirtualHost> <VirtualHost *:80> ServerAdmin [email protected] ServerName domain.tld ServerAlias *.domain.tld DocumentRoot /var/www/domain.tld <Directory /var/www/domain.tld> Options -Indexes FollowSymLinks MultiViews AllowOverride None Order allow,deny Allow from all </Directory> ErrorLog ${APACHE_LOG_DIR}/error.log LogLevel warn CustomLog ${APACHE_LOG_DIR}/access.log combined </VirtualHost> DNS is working correctly. The issue is, every variant of http://*.domain.tld/ (including http://domain.tld/) works correctly, except http://www.domain.tld/ which throws a 403. The logs state: client denied by server configuration: /etc/apache2/htdocs If I remove the first VirtualHost block from play, everything works as expected including http://www.domain.tld. This leads me to believe that for some reason, Apache is not considering www.domain.tld to match the second VirtualHost block, and is thereby falling back to deny all. This seems wrong. Shouldn't the second block match www.domain.tld? I've been able to resolve this, but I still don't understand why. In my original configs, I was using the real ip address of the server instead of *. Switching all instances to * as shown above made everything work as expected. Does this have something to do with the way browsers request resources?

Read the article

Solaris 10 invalid ARP requests from 0.0.0.0? Link up/down every hour or 2

- by JWD

The guys at the data center where I'm hosting a server running Solaris 10 are telling me that my server is making a lot of invalid arp requests. This is an example of a portion of what was sent to me from the logs (with Mac addresses and IP addresses changed). [mymacaddress]/0.0.0.0/0000.0000.0000/[myipaddress]/[Datestamp]) It's being logged every hour. I don't see anything in the arp tables (arp -a) or routing tables (netstat -r) and I don't see anything relating to 0.0.0.0 when snoping the arp requests. The only place I see any reference to 0.0.0.0 is if I do netstat -a for the SCTP SCTP: Local Address Remote Address Swind Send-Q Rwind Recv-Q StrsI/O State ------------------------------- ------------------------------- ------ ------ ------ ------ ------- ----------- 0.0.0.0 0.0.0.0 0 0 102400 0 32/32 CLOSED But not really sure what that means. Doesn't seem like I can disable SCTP. There are some tunable SCTP parameters but it's not something I'm familiar with. Do I have to add changes to /etc/system? Looks like sctp_heartbeat_interval might be what I need to change? If it makes any difference, I have a few solaris zones running on this server, each with their own IP address on a virtual interface. eth0:0, eth0:1, etc. Does anyone have any idea what might be causing this and how to stop it? I think the switch I'm connected to doesn't like it and momentarily drops the connection. Is there anyway to at least block those requests using ipfilter or something else? Update: This was happening more frequently but now it seems to be happening roughly every hour or every two hours. It's not consistent. I tried setting setting the link speed and duplex to match the switch port and that seemed to make it stop happening for a few hours but then it started again.

Read the article

Windows 7 Explorer keeps crashing

- by Daniel Liang

I currently have an issue with Windows Explorer. It keeps crashing when I browse through a network drive. This is happening on several computers. I have already obtained a crash dump file but it doesn't state much: Microsoft (R) Windows Debugger Version 6.12.0002.633 X86 Copyright (c) Microsoft Corporation. All rights reserved. Loading Dump File [C:\LocalDumps\explorer.exe.3964.dmp] User Mini Dump File with Full Memory: Only application data is available Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols Executable search path is: Windows 7 Version 7601 (Service Pack 1) MP (2 procs) Free x86 compatible Product: WinNt, suite: SingleUserTS Machine Name: Debug session time: Mon Oct 21 11:21:30.000 2013 (UTC - 4:00) System Uptime: 0 days 0:06:20.449 Process Uptime: 0 days 0:05:54.000 ................................................................ ................................................................ .... Loading unloaded module list ............. This dump file has an exception of interest stored in it. The stored exception information can be accessed via .ecxr. (f7c.fe4): Access violation - code c0000005 (first/second chance not available) eax=00000000 ebx=07a3f080 ecx=00000400 edx=00000000 esi=00000002 edi=00000000 eip=76e170f4 esp=07a3f030 ebp=07a3f0cc iopl=0 nv up ei pl zr na pe nc cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246 ntdll!KiFastSystemCallRet: 76e170f4 c3 ret I've already tried removing certain context menu items. I disabled all unnecessary start-up items. Ran memtest86 and it looks fine on that end. It also happens when I browse through my local disk. Can anyone take a look into this? Thanks!

Read the article

add_header directives in location overwriting add_header directives in server

- by user64204

Using nginx 1.2.1 I am able to add multiple headers using add_header as follows: server { listen 80; server_name localhost; root /var/www; add_header Name1 Value1; <=== HERE add_header Name2 Value2; <=== HERE location / { echo "Nginx localhost site"; } } GET / HTTP/1.1 200 OK Name1: Value1 Name2: Value2 However I soon as I use the add_header directive inside location, the other add_header directives under server are ignored server { listen 80; server_name localhost; root /var/www; add_header Name1 Value1; <=== HERE add_header Name2 Value2; <=== HERE location / { add_header Name3 Value3; <=== HERE add_header Name4 Value4; <=== HERE echo "Nginx localhost site"; } } GET / HTTP/1.1 200 OK Name3: Value3 Name4: Value4 The documentation says that both server and location are valid context and doesn't state that using add_header in one prevents using it in the other. Q1: Do you know if this is a bug or the intended behaviour and why? Q2: Do you see other options to get this fixed than using the HttpHeadersMoreModule module?

Read the article

IIS crashes with unhandled exception in ASP.NET

- by SnowCrash

We had an issue recently with an unhandled exception in an ASP.NET C# application bringing down IIS and all application pools that it was hosting. IIS Manager was unable to restart or stop/start the service and I was unable start IIS again after killing w3wp.exe in the task manager. A system reboot restored IIS to a running state; as a primarily Linux admin, I generally consider an unplanned system reboot to resolve a software error to be an act of high heresy. Is there a way to "harden" IIS so that a faulting application does not affect anything but the request that exposes the fault? Some details on the server and application fault. IIS: 7.5 .NET: 4.0 Windows Server 2008 R2 Faulted on call to System.Net.Dns.Resolve() with a url pointing to a non-existant domain as the argument. (I'm aware that this method is deprecated but the point that a page code issue shouldn't bring down the server still stands) The exception generated was SocketException. The faulting module according to event viewer was KERNELBASE.dll The issue was resolved by wrapping the call in a try-catch, logging the exception and displaying some generic content on the page. I'm hoping that I missed something in the IIS config that would switch it to "production" mode or something.

Read the article

SQL SERVER 2005 with Windows 7 Problems

- by azamsharp

First of all I restored the database from other server and now all the stored procedures are named as [azamsharp].[usp_getlatestposts]. I think [azamsharp] is prefixed since it was the user on the original server. Now, on my local machine this does not run. I don't want the [azamsharp] prefix with all the stored procedures. Also, when I right click on the Sproc I cannot even see the properties option. I am running the SQL SERVER 2005 on Windows 7. UPDATE: The weird thing is that if I access the production database from my machine I can see the properties option. So, there is really something wrong with Windows 7 security. UPDATE 2: When I ran the orphan users stored procedure it showed two users "azamsharp" and "dbo1". I fixed the "azamsharp" user but "dbo1" is not getting fixed. When I run the following script: exec sp_change_users_login 'update_one', 'dbo1', 'dbo1' I get the following error: Msg 15291, Level 16, State 1, Procedure sp_change_users_login, Line 131 Terminating this procedure. The Login name 'dbo1' is absent or invalid.

Read the article

Apache httpd workers retry

- by David Newcomb

I have an Apache httpd web server running mod_proxy and mod_proxy_balancer. The whole of /somedir is sent to 2 worker machines which service the requests using the round robin scheduler. Each worker machine is running IIS but I don't think that is important. I can demonstrate the load balancer working by repeatedly requesting a single page which contains the IP address of the machine and can see that it switches from one to the other in a predictable round robin fashion. If I switch off one of the IIS servers and start requesting the same page then each page only contains the IP address of the machine that is up. However, if I start IIS and don't run my IIS application then /somedir returns 500 (as it should). I've added 500 to the failonstatus (Apache 2.4) so when it hits the error Apache places the worker machine into error state. Apache still returns the proxy error to the client though. How can I make Apache catch the proxy failure and retry using a different worker in the same way that a connection failure does. Update There is almost the same question asked in StackOverflow so joining them together. http://stackoverflow.com/questions/11083707/httpd-mod-proxy-balancer-failover-failonstatus-transperant-switching

Read the article

Windows 7 hangs after going into sleep a second time

- by Brian Stephenson

I've searched everywhere around Google and can't figure out why this is happening so I decide to ask here to see if anyone has a problem like this. Like it says in the title, whenever I sleep ONCE I'm able to wake the system, but going back to sleep again AFTER waking up for the first time results in it hanging on no input and no output, with the fan spinning as fast as possible and alot of heat being spewed out by the fan as well. I've tried various things like setting all USB Hub Root's to not get switched off for power saving, disabling USB selective suspend, disabling PCI-e link state power management, and even unplugging ALL USB devices and it wont wake up after the second attempt. And I've even waited up to a full hour of the CPU fan spinning loudly and it's still stuck trying to wake up. The only USB devices I use are a Microsoft USB Comfort Curve Keyboard 2000 (IntelliType Pro) and a generic HID compliant mouse from Creative model number OMC90S "CREATIVE MOUSE OPTICAL LITE". My other devices like external drives and controllers are unplugged when I'm not using them as having too many USB devices plugged in at a time causes a deadlock on almost all of the ports I have. Here's my system specifications (Most of these are from CPU-Z): Brand: Gateway DX4300-19 Mainboard: Gateway RS780 Chipset: AMD 780G Rev 00 Southbridge: AMD SB700 Rev 00 LPCIO: ITE IT8718 BIOS: American Megatrends Inc. ver P01-A4 09/15/2009 CPU: AMD Phenom II X4 810 at 2.60 GHz RAM: 8.0 GB DDR2 Dual Channel Ganged Mode at 400 MHz GPU: ATI Radeon HD3200 Graphics Intergrated - RS780 OS: Windows 7 Home Premium x64 OEM (Acer Group) HDD: WDC WD10EADS-22M2B0 1.0 TB (Western Digital Green Caviar) My BIOS has absolutely no control over how I setup the sleep mode to be either S1 or S3. So I can't check these settings or even change them. Hybrid sleep is also disabled, I can successfully go into hibernation and wake from hibernation but this is painfully slow due to a harddrive problem I'm having with this "Green Drive". (Hibernation takes over ~3 minutes to complete) Any help would be appreciated, thanks.

Read the article

How do I diagnose the cause of a freeze after resuming in Windows XP (SP3)?

- by Software Monkey

I have just built a new computer from parts. Whenever I resume from any sleep mode (S1, S3 or S4) the computer freezes within about 60 seconds of the welcome screen appearing. I have updated the BIOS and all drivers to current from the motherboard manufacturer's site. I have reset BIOS settings to default, including disabling AMD Cool n Quiet. The windows event logs are not helpful at all. Other than immediately after resuming the system is stable as long as AMD CnQ is disabled. The system is: Mobo : MSI 790GX-G65 CPU : AMD Phenom II 965 BE at 3.6 GHz Memory : Corsair DDR3 1600, at 1333 MHz and 9-9-9-21 HDDs : 1 EIDE, 2 SATA in RAID-0 DVD : 1 Card Reader: 1 multi-card reader Keyboard is attached via PS2 and mouse is USB. Any thoughts or pointers would be most welcome. EDIT: It appears that the computer may not freeze if a program is left running which puts it under significant load. I left a stress test running which keeps all cores under 85% load, and my son put the computer to sleep - while this program is running it I have been able to resume from S3 successfully 4 times, compared against about 20 tests with the computer idle which have all frozen. So this may be related to being in an idle state when it resumes.

Read the article

Sane patch schedule for Windows 2003 cluster

- by sixlettervariables

We've got a cluster of 75 Win2k3 nodes at work in a coarse grained compute cluster. The cluster is behind a mountain of firewalls and resides in its own VLAN. Jobs of all sizes and types run on the cluster and all of the executables running are custom-made. (ed: additional notes on our executables) The jobs range from 30 seconds to 7 days in duration, and may contain one executable or 2000 sub-jobs (of short duration). Obviously we are trying to avoid the situation where our IT schedules a reboot during a 7 day production job. We have scheduling software which accomodates all of the normal tasks for a coarse grained cluster and we can control which machines are active for submission, etc. If WSUS was in some way scriptable (or the client could state it's availability for shutdown) we could coordinate the two systems and help out. Currently, the patch schedule is the Sunday after Super Tuesday regardless of what is running on the cluster. We have to ask for an exemption every time we want to delay patching a machine for a long running production job. Basically, while our group is responsible for the machines we have little control over IT's patch schedule. Is patching monthly with MS's schedule sane for a production Windows cluster? Are there software hooks in WSUS where we could say, "please don't reboot just yet"?

Read the article

Preventing h/w RAID cards from dropping slow JBOD disks

- by Kevin

I'm considering buying a used SAS h/w RAID card for externally attaching HDDs to an HP ProLiant I'm setting up. However, I only require RAID functionality on some of the drives. Theoretically it should be simple to JBOD the other drives, but some of them are inexpensive SATA disks and probably cannot have TLER disabled. I'd like to know, prior to actually ordering a RAID card, whether typically RAID cards would still enforce dropping of disks that do not respond within a few seconds, even if the disk is in a JBOD, and whether there is any way to disable this. Ideally it would be nice to be able to select certain SAS ports that will be pass-through, bypassing the RAID engine entirely and just acting as an HBA for those ports. I know I could buy a separate SAS HBA but that seems like a waste of $ and is also impractical as it's a 1U server so space is extremely limited. My question then is whether the functionality I'm looking for (pass-through on certain ports or at least JBOD drives not getting themselves dropped due to slow response) is typical of proper h/w RAID cards such as PERC 5/E etc. I've browsed through the latter's manual but unfortunately, as with most user manuals, it states the obvious and doesn't state the unobvious. Thanks for any info, Kevin

Read the article

big cpu load on vmware server / linux

- by dezfafara

Hi, I currently using a server 2.x hosting 4 virtual machines on a linux system Today, on my physical server, I saw an enormous load average: this is the "top" of the server, illustrating my 4 virtual guests. top - 11:02:02 up 194 days, 23:09, 5 users, load average: 18.78, 12.05, 13.55 Tasks: 113 total, 4 running, 109 sleeping, 0 stopped, 0 zombie Cpu0 : 71.6%us, 19.0%sy, 0.0%ni, 8.8%id, 0.0%wa, 0.3%hi, 0.3%si, 0.0%st Cpu1 : 74.3%us, 10.4%sy, 0.0%ni, 15.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 72.5%us, 17.6%sy, 0.0%ni, 9.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 79.5%us, 4.6%sy, 0.0%ni, 16.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 8178884k total, 8129980k used, 48904k free, 134904k buffers Swap: 10490436k total, 148k used, 10490288k free, 6129728k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 7312 root 6 -10 1149m 921m 559m R 97 11.5 107947:09 vmware-vmx 6995 root 6 -10 779m 687m 317m R 92 8.6 107374:31 vmware-vmx 6693 root 6 -10 880m 659m 409m S 85 8.3 76947:33 vmware-vmx 12937 root 6 -10 960m 719m 523m S 75 9.0 67219:49 vmware-vmx In bold are the cpu usage for my 4 virtuals guests These guests are running on a linux system, and the appropriate process are usually 5% - 15% of cpu I don't understang why , since a few days I have this big problem. This is the "top" on a virtual guest which is at 95% of cpu load top - 11:23:15 up 194 days, 23:13, 4 users, load average: 0.25, 0.47, 0.59 Tasks: 92 total, 2 running, 90 sleeping, 0 stopped, 0 zombie Cpu(s): 1.4%us, 7.7%sy, 0.0%ni, 90.5%id, 0.5%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 382296k total, 369732k used, 12564k free, 145156k buffers Swap: 979924k total, 13956k used, 965968k free, 86988k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3691 root 20 0 23948 1148 960 S 13.0 0.3 15339:23 vmware-guestd 3840 root 20 0 19880 584 512 S 7.7 0.2 1729:17 hald-addon-stor This virtual guest state is ok ... If anyone has any ideas .. Thanks

Read the article

Is execution of sync(8) still required before shutting down linux?

- by Amos Shapira

I still see people recommend use of "sync; sync; sync; sleep 30; halt" incantations when talking about shutting down or rebooting Linux. I've been running Linux since its inception and although this was the recommended procedure in the BSD 4.2/4.3 and SunOS 4 days, I can't recall that I had to do that for at least the last ten years, during which I probably went through shutdown/reboot of Linux maybe thousands of times. I suspect that this is an anachronism since the days that the kernel couldn't unmount and sync the root filesystem and other critical filesystems required even during single-user mode (e.g. /tmp), and therefore it was necessary to tell it explicitly to flush as much data as it can to disk. These days, without finding the relevant code in the kernel source yet (digging through http://lxr.linux.no and google), I suspect that the kernel is smart enough to cleanly unmount even the root filesystem and the filesystem is smart enough to effectively do a sync(2) before unmounting itself during a normal "shutdown"/"reboot"/"poweorff". The "sync; sync; sync" is only necessary in extreme cases where the filesystem won't unmount cleanly (e.g. physical disk failure) or the system is in a state that only forcing a direct reboot(8) will get it out of its freeze (e.g. the load is too high to let it schedule the shutdown command). I also never do the "sync" procedure before unmounting removable devices, and never hit a problem. Another example - Xen allows the DomU to be sent a "shutdown" command from the Dom0, this is considered a "clean shutdown" without anyone having to login and type the magical "sync; sync; sync" first. Am I right or was I lucky for a few thousands of system shutdowns?

Read the article

How does one skip "Windows did not shut down successfully" in Win7-64?

- by XenonofArcticus

Migrating an app from an expensive and unreliable dedicated embedded x86 box running WinXP-embedded to COTS hardware (Dell E6410 laptop) running normal Win7-64. At this time, it's not feasible to deploy using Windows 7 embedded. The problem is, that the system is still sort of "embedded". The power could shut off at virtually any time without prior warning. We've stripped the OS down and removed the battery capability so that it will power down as desired. The app never writes to the disk, so it's not like we're going to corrupt anything terribly. The system is essentially idle after our app is up and running (with the exception of some computation, graphics, and TCP/IP and serial communications) so the OS enters a pretty stable state rather quickly. After a power-loss however, it rightly complains that Windows did not shut down successfully and presents the user with the Windows Error Recovery text screen. If left alone, it does eventually move on booting just fine, but we'd like to skip that step if possible. WinXP-embedded is designed to do this automatically, so I know it's possible. I've looked at the Kernel Switches but I didn't see anything documented for "Skip Windows Error Recovery". I've also read extensively on the startup process: http://homepage.ntlworld.com./jonathan.deboynepollard/FGA/windows-nt-6-boot-process.html I know I can disable the auto chkdsk in the registry, but that's not the same thing either. So, how do I streamline the boot process to not hassle the user about a situation that will be the regular normal situation?

Read the article

Site hanging in iis7 - how do I troubleshoot?

- by Chris Foot

I am currently having a problem with a windows 2008 server running IIS 7. The server runs several sites but only seems to have the issue with one particular site. Every so often, the whole server slows to a crawl with nearly all requests timing out! Invariably, when we log in to take a look there is always an IIS process using up around 90% cpu. Looking into the worker processes in IIS there are usually one or two requests that have been running for a long time. They are always in the ExecuteRequestHandler state with ManagedPipeline as the module name and the current ones i'm looking at have been running for 7686248 (what units is this in, it doesn't say?). It is also not always the same page, in fact we have seen at least 3 different pages listed under url when this has happened. It seems that the only way to bring the server back to life is to kill the 90% process! The site is running under .Net 4.0 and the code on it is very similar to other sites on the server which do not have the problem! How do I start troubleshooting this?

Read the article

How to back up a database with thousands of files

- by Neal

I am working with a Fedora server that runs a customized software package. The server software is quite old, and its database consists of 1,723 files. The database files are constantly changing - they continually grow and changes are not necessarily appended to the end. So right now, we currently back up every 24 hours at midnight when all users are off of the system and the database is in an internally consistent state. The problem is that we have the potential to lose an entire day's worth of work, which would be unrecoverable. So I'd like to know if there is a way to take some sort of an instantaneous snapshot of these database files that we could back up every 30 minutes or so. I've read about Linux LVM snapshots, and am thinking that I might be able to do accomplish the goal by taking a snapshot, rsync'ing the files to a backup server, then dropping the snapshot. But I've never done this before,so I don't know if this is the "right" fix. Any ideas on this? Any better solutions? Thanks!

Read the article

PowerShell 3.0 x64 bit broken after installing KB2506143

- by Dave Parker

I have searched using all kinds of variations on relevant terms and I cannot find a single other instance of someone else having this excact same problem, so I am hoping someone here may have a clue. Problem I installed Windows Management Framework 3.0 (KB2506143) by downloading and running Windows6.1-KB2506143-x64.msu from Microsoft.com. Once completed I rebooted my machine as requested. After rebooting and logging in, I try to run the 64-bit PowerShell command shell and it comes up for a second then goes away. The 32-bit shell seems to work fine, it is just the 64-bit one that fails. Looking in the Fusion logs, I found: *** Assembly Binder Log Entry (10/4/2012 @ 1:51:48 PM) *** The operation failed. Bind result: hr = 0x80070002. The system cannot find the file specified. Assembly manager loaded from: C:\Windows\Microsoft.NET\Framework64\v2.0.50727\mscorwks.dll Running under executable C:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe --- A detailed error log follows. === Pre-bind state information === LOG: User = ********\***** LOG: DisplayName = Microsoft.PowerShell.ConsoleHost, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL <remainder omitted> GacUtil reveals that there is a Microsoft.PowerShell.ConsoleHost, Version=1.0.0.0, but not 3.0.0.0. I tried uninstalling KB2506143 (which removed MSVCRT90.dll and caused Windows Live Messenger to fail on load after rebooting again, so I ran a repair in stall on Windows Live Essentials and that fixed the Messenger problem) and then re-installing it, but nothing changed. If it helps, here are what I think may be the relevant parts of my hardware/software environment. Environment Dell Latitude E6510, 8GB RAM Windows 7 Professional 64-bit with SP1 Visual Studio 2010 Professional installed (includes .NET 4.0) Visual Studio 2012 Professional installed Microsoft Forefront Client Security Any clues out there? Thanks, Dave

Read the article

EMC VNX iSCSI setup - unsure about SP/port assignment

- by pauska

We have a new VNX5300 waiting to get configured, and I need to plan out the network infrastructure before the EMC tech arrives. It has 4x1gbit iSCSI per SP (8 ports in total), and I'd like to get the most out of the performance until we jump over to 10gig iSCSI. From what I can read from the docs - the recommendation is to use only two ports per SP, with 1 active and 1 passive. Why is this? It seems kind of pointless to have quad-port i/o-modules and then recommend to not use more than two of them? Also - I'm a bit unsure about the zoning. The best practices guide state that you should separate each port on each SP from each other on different logical networks. Does this mean that I have to create 4 logical networks to be able to use all 8 ports? It also gives the following example: Does this mean that A0 and B0 should sit on the same physical switch aswell? Won't this make all traffic go on one switch (if both A1 and B1 are passive)? Edit: Another brainpuzzle I don't get it - each host (as in server) should not have more iSCSI bandwidth available than the storage processor. What on earth does this matter? If serverA have 1gbit and serverB have 100mbit, then the resulting bandwith between them is 100mbit. How can this result in some kind of oversubscription? Edit4: Wait, what. Active and passive ports? The VNX runs in a ALUA configuration with asymmetrical active/active.. there shouldn't be any passive ports, only preferred ones..

Search Results

Search found 15137 results on 606 pages for 'global state'.

Page 484/606 | < Previous Page | 480 481 482 483 484 485 486 487 488 489 490 491 | Next Page >

- by Dax

- by ip

- by Ophidian

- by Korjavin Ivan

- by damitamit

- by javapowered

- by Alan Ristic

- by Carson C.

- by JWD

- by Daniel Liang

- by user64204

- by SnowCrash

- by azamsharp

- by David Newcomb

- by Brian Stephenson

- by Software Monkey

- by sixlettervariables

- by Kevin

- by dezfafara

- by Amos Shapira

- by XenonofArcticus

- by Chris Foot

- by Neal

- by Dave Parker

- by pauska

< Previous Page | 480 481 482 483 484 485 486 487 488 489 490 491 | Next Page >