Search Results

Search found 205 results on 9 pages for 'outage'.

Page 6/9 | < Previous Page | 2 3 4 5 6 7 8 9  | Next Page >

  • How do I know if my PHP application is using too much memory?

    - by John
    I'm working on a PHP web application that let's users network with each other, book events, message etc... I launched it a few months ago and at the moment, there's only about 100 users. I set up the application on a VPS with ubuntu 9.10, apache 2, mysql 5 and php 5. I had 360 Mb of RAM, but upgraded to 720 MB a few minutes ago. Lately, my web application has been experiencing outages due to excessive memory usage. From what I can tell in error logs, it seems the server automatically kills apache processes that consume too much memory. As a result, I upgraded memory from 360 MB to 720 MB as a stop-gap measure. So my question is, how do I go about resolving these outage issues? How do I know if my website's need for more memory is due to poor code or if it's part of the website's natural growth? What's the most efficient way to determine which PHP scripts consume the most memory?

    Read the article

  • Safely deploying changes to production servers

    - by oazabir
    When you deploy incremental changes on a production server, which is running and live all the time, you some times see error messages like “Compiler Error Message: The Type ‘XXX’ exists in both…”. Sometimes you find Application_Start event not firing although you shipped a new class, dll or web.config. Sometimes you find static variables not getting initialized and so on. There are so many weird things happen on webservers when you incrementally deploy changes to the server and the server has been up and running for several weeks. So, I came up with a full proof house keeping steps that we always do whenever we deploy some incremental change to our websites. These steps ensure that the web sites are properly recycled , cached are cleared, all the data stored at Application level is initialized. First of all you should have multiple web servers behind load balancer. This way you can take one server our of the production traffic, do your deployment and house keeping tasks like restarting IIS, and then put it back. Then you can do it for the second server and so on. This ensures there’s no outage for customer. If you can do it reasonable fast, hopefully customers won’t notice discrepancy between the servers some having new code and some having old code. You should only do this when your changes aren’t drastic. For ex, you aren’t delivering a complete revamped UI. In that case, some users hitting server1 with latest UI will suddenly get a completely different experience and then on next page refresh, they might hit server2 with old code and get a totally different experience. This works for incremental non-dramatic changes only.   During deployment you should follow these steps: Take server X out of load balancer so that it does not get any traffic. Stop all windows services on the server. Stop IIS. Delete the Temporary ASP.NET folders of all .NET versions incase you have multiple .NET versions running. You can follow this link. Deploy the changes. Flush any distributed cache you have, for ex, Velocity or Memcached. Start IIS. Start the windows services on the server. Warm up all websites by hitting major URLs on the websites. You should have some automated script to do this. You can use tinyget to hit some major URLs, especially pages that take a lot of time to compile. Read my post on keeping websites warm with zero coding. Put server X back to load balancer so that it starts receiving traffic. That’s it. It should give you a clean deployment and prevent unexpected errors. You should print these steps and hang on the desk of your deployment guys so that they never forget during deployment pressure.

    Read the article

  • App_Offline.htm, taking site down for maintenance

    - by Vipin
    There is much simpler and graceful way to shut down a site while doing upgrade to avoid the problem of people accessing site in the middle of a content update.   Basically, if you place file with name 'app_offline.htm' with below contents in the root of a web application directory, ASP.NET will shut-down the application,  and stop processing any new incoming requests for that application.  ASP.NET will also then respond to all requests for dynamic pages in the application by sending back the content of the app_offline.htm file (for example: you might want to have a “site under construction” or “down for maintenance” message).   Then after upgrade, just rename/delete app_offline.htm file…and the site would be back to normal. Just remember that the size of the file should be greater than 512 bytes, doesn't matter even if you add some comments to it to push the byte size as long as it's of the size greater than 512 - it'll work fine.     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html http://www.w3.org/1999/xhtml"http://www.w3.org/1999/xhtml" ><head>    <title>Maintenance Mode - Outage Message</title></head><body>    <h1>Maintenance Mode</h1>     <p>We're currently undergoing scheduled maintenance. We will come back very shortly.</p>     <p>Sorry for the inconvenience!</p>     <!--            Adding additional hidden content so that IE Friendly Errors don't prevent    this message from displaying (note: it will show a "friendly" 404    error if the content isn't of a certain size).        <h2>Site under maintenance...</h2>      <h2>Site under maintenance...</h2>      <h2>Site under maintenance...</h2>      <h2>Site under maintenance...</h2>      <h2>Site under maintenance...</h2>      <h2>Site under maintenance...</h2>      <h2>Site under maintenance...</h2>      <h2>Site under maintenance...</h2>         --></body></html>

    Read the article

  • Oracle GoldenGate 11gR2 Event Marker System

    - by Doug Reid
    0 false 18 pt 18 pt 0 0 false false false /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:"Times New Roman"; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin;} Oracle GoldenGate 11gR2 includes a number of refinements to the Event Marker system. Using event markers enables GoldenGate processes to take a defined action based on an event in the data stream. This feature within Oracle GoldenGate simplifies methods to embed specific custom processing in the areas of error handling, alerts, and notification. The event marker system effectively allows for DML driven workflows to be created within GoldenGate and enables customers to craft non-standard processing based on special events. There are a number of supported event actions including: trace, log, checkpoint before, suspend, abort, and several others. With 11gR1 events can now be triggered by DDL operations, plus variables can be passed in and out of the system to shell scripts. Some good use cases for this feature are Automatic switchover to the secondary system during planned outages Better monitoring over source systems’ performance and automated switchover to the standby system in case of an outage with the primary system Automatic switchover from initial load to changed data movement Automatic synchronization of any type of batch processing taking place on both the source and target databases for database consistency Automatic stoppage of the Delivery module to allow end-of-day reporting Finding, tracking, and reporting on transactions that are of interest including the ones that do not have primary keys or transaction record numbers If you would like to see a demo, please visit our youtube channel (http://youtube.com/oraclegoldengate)  To learn more about the new features of Oracle GoldenGate 11gR2 and to ask questions to the PM team, please join us on September 12th  8am or 10am PST for our live webcast. Click here to register.

    Read the article

  • Understanding When Social Interactions Should Be Resolved in Another Channel

    - by Christina McKeon
    Guest Blogger: Aphrodite Brinsmead, Senior Analyst at Ovum Agents need to respond to customers’ social comments and questions quickly and in the right tone. But more importantly, they need to offer resolutions. Customers care most about how long it takes to find information rather than which channel they are using. They choose to use social media because they are comfortable with the channel and it offers a convenient way to communicate. Ideally agents will resolve questions within social media, but they need guidance as to how and when to escalate interactions to a more private channel. First, businesses should assess the way in which customers are using social media to communicate with them and categorize posts into groups: complaints, feedback, technical queries or more general support questions. They should then consider the types of interactions that can easily be handled within social media and those that need to be followed up in another channel. This will be very dependent on the industry. Examples of queries that can be resolved in social media include Shipping pricing and timeframes Outage updates and resolution plans Flight status information Product stock check Technical support videos or forum posts Availability of facilities Both customers and agents need to be educated about the types of questions they can expect to resolve within social media. As the channel matures as a customer service tool, it needs to have value other than just as a forum for complaints. Social customer service agents need the power to start a web chat or phone call Any questions where customers need to divulge personal details in order to get a resolution will need to be addressed in a private channel: a private social message, web chat, email or phone call. Customers should never disclose their date of birth, social security, credit card number, or healthcare records in a public forum. Flight issues, changes to a booking, billing queries or account updates will all need to be completed via a private interaction. Agents responding to questions on social media need the ability to start a web chat or phone call with the customer. The customer doesn’t want to have to repeat their question and the agent should be empowered to connect customer records and access account or billing information. These agents will need to be trained across different channels and should be able to view all customer communications in one application. They also need to follow up questions that began on a public forum in the initial channel to make it clear that the issue was addressed. In order to make this possible, social media needs to be integrated as part of a broader customer service strategy. Irrespective of how many channels are used to complete an interaction, businesses should prioritize customer satisfaction and issue resolution. They need a clear strategy and trained agents that can handle and respond to social interactions. Follow me on Twitter @diteb. 

    Read the article

  • Difficulty restoring a differential backup in SQL Server, 2 media families are expected or no files

    - by digiguru
    I have sql backups copied from server A to server B on a nightly basis. We want to move the sql server from server A to server B without much downtime, but the files are very large. I assumed that performing a differential backup and restore would solve the problem with the databases. Copy full backup from server A to copy to server B (10+gb) Open SQL Server Managment Studio on server B Right mouse on databases Restore Database Type in the new DB-name Choose "From Device" and browse to the backup file Click Okay. This is now resorting the original "full" backup. Test new db with dev application - everything works :) On original database rightmouse on DB Tasks Backup... Backup Type = Differential, Backup to disk, add a new file, and remove the old one (it needs to be a small file to transfer for the smallest amount of outage) Copy the diff backup onto the new db Right mouse on DB Tasks Restore Database This is where I get stuck. If I add both the new differential file, and the original backup to the restore process I get an error The media loaded on "M:\path\to\backup\full.bak" is formatted to support 1 media families, but 2 media families are expected according to the backup device specification. RESTORE HEADERONLY is terminating abnormally. But if I try to restore using just the differential file I get System.Data.SqlClient.SqlError: The log or differential backup cannot be restored because no files are ready to rollforward. (Microsoft.SqlServer.Smo) Any idea how to do it? Is there a better way of restoring backups with limited downtime?

    Read the article

  • CHKDSK: What option DOES NOT delete files and turn them into .chk files?

    - by CHKDSKuser
    I had a recent power outage while using my computer, with a 1TB hard drive being directly accessed as the power went out. When the power came back on, and I rebooted my computer, one of my 1TB hard drives would not register with WinXP SP3, and showed a Total Space of 0, and an Available Space of 0. The file system (NTFS) also did not register...every entry for the drive was either blank or zeroed. My assumption is that the file tables were damaged/corrupted because the drive was being directly accessed when the power went out. After doing some research, I ran CHKDSK with whatever default options it runs with (I'm not sure what they are as I didn't see them displayed). Upon completion of CHKDSK, the drive registered with WinXP as a 1TB hard drive, with an accurately-reflected amount of available space. But CHKDSK also deleted about 16GB of files from their original directories, and changed them all into sequentially-named *.chk files. My question is how can CHKDSK be run in a situation like mine where the file tables needed to be restored, but without having CHKDSK delete any files from their original directories, even if they may be damaged/corrupt? I'd simply like to be able to run CHKDSK and have it restore the file tables, and repair bad sector damage, as it did, but not have it do anything else such as delete files and convert them to CHK files. Any ideas? Or is there a CHKDSK alternative that can perform the same functions without the file deletions?

    Read the article

  • SNMPD running but not listening for connections at random

    - by Lukasz
    OS: CentOS release 5.7 (Final) Net-SNMP: net-snmp-5.3.2.2-14.el5_7.1 (from RPM) Periodically my NMS notifies me that SNMP has gone down on this machine. The service is restored in between 10 to 30 minutes. My NMS also pings and check SSH and those services are not affected during the SNMP outage. SNMPD log file shows that it is working and apparently receiving packets (either from local agents from 127.0.0.1 or from my NMS at 172.16.37.37) however attempting to snmpwalk locally or from the NMS system fails with a timeout. I have 7 of these servers running mixture of CentOS 5.7 and RHEL 5.7 with this specific version of Net-SNMP installed from RPM - none of them have this issue except this one. 5 of the machines (including the NMS system and this problem server) are in the same rack connected using one switch. Restarting SNMPD does not fix the issue - it clears up by itself eventually. Any suggestions where I can begin diagnosing the issue? It's a closed subnet so IPTables is not used. SNMPD config below: # Following entries were added by HP Insight Management Agents at # Tue May 15 10:58:17 CLT 2012 dlmod cmaX /usr/lib64/libcmaX64.so rwcommunity public 127.0.0.1 rocommunity public 127.0.0.1 rwcommunity 3adRabRu 172.16.37.37 rocommunity 3adRabRu 172.16.37.37 rwcommunity 3adRabRu 172.16.37.36 rocommunity 3adRabRu 172.16.37.36 trapcommunity callmetraps trapsink 172.16.37.37 callmetraps trapsink 172.16.37.36 callmetraps syscontact Lukasz Piwowarek syslocation Santiago, Chile # ---------------------- END -------------------- agentAddress udp:161 com2sec rwlocal default public com2sec rolocal default public com2sec subnet default 3adRabRu group rwv2c v2c rwlocal group rov2c v2c rolocal group rov2c v2c subnet view all included .1 access rwv2c "" any noauth exact all all none access rov2c "" any noauth exact all none none

    Read the article

  • Virutal Machine loses network connectivity on Hyper V Cluster

    - by Chris W
    We're running a number of VMs on a 6 node failover cluster of blades using Hyper V. We have an intermittent issue (every few days at different times - not a fixed frequency) of VMs losing network connectivity. Console access to the VM suggests all is fine and the underlying blade has normal connectivity. To resolve the problem we either have to re-start the VM or, more usually, we do a live migration to another blade which fires up connectivity and we then migrate it back to the original blade. I've had 3 instances of this happen with a specific VM running on a particular blade however it has happened once with a different VM running on a different blade. All VMs and blades have the same basic setup and are running Windows 2008 R2. Any ideas where I should be looking to diagnose the possible causes of this problem as the event logs provide no help? Edit: I've checked that each blade is running the latest NIC drivers and all seem to be fine. Something that is confusing me - a failover or restart of the VM resolves the issue. Whilst I need to work out the underlying issue that is causing the NICs to hang I'm also concerned that the VM didn't failover to another node which would have solved the outage for me. Is there a way to configure the cluster so that it can tell that the VM guest has lost connectivity and fail it over? As things stand the cluster is assuming that the VM is running happily as I presume Hyper V says everything is great even though there is a problem.

    Read the article

  • Experience with AMCC 3ware 9650se raid cards? Ours seems dead

    - by antiduh
    We have a 8-port 3ware 9650se raid card for our main disk array. We had to bring the server down for a pending power outage, and when we turned the machine back on, the raid card never started. This card has been in service for a couple years without problems, and was working up until the shutdown. Now, when we turn the machine on, the bios option rom that normally kicks in before the bootloader doesn't show up, none of the drives start, and when the OS tries to access the device, it just times out. The firmware on it has been upgraded in the past, so it's possible we've hit some sort of firmware bug. We're using it in a Silicon Mechanics R272 machine with gentoo for the OS. The OS eventually boots, but alas, without the card. We've ordered a new one, but I'm worried that if we replace the card it won't recognize the existing array. Has anybody performed a card swap before? Any help would be greatly appreciated. Edit: These are the kernel errors we see: 3ware 9000 Storage Controller device driver for Linux v2.26.02.012. 3w-9xxx 0000:09:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18 3w-9xxx 0000:09:00.0: setting latency timer to 64 3w-9xxx: scsi0: ERROR: (0x06:0x000D): PCI Abort: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x001F): Microcontroller not ready during reset sequence. 3w-9xxx: scsi0: ERROR: (0x06:0x0036): Response queue (large) empty failed during reset sequence. 3w-9xxx 0000:09:00.0: PCI INT A disabled

    Read the article

  • ProCurve network expansion

    - by Blue Warrior NFB
    I've hit a bit of a wall with our network scale-out. As it stands right now: We have five ProCurve 2910al switches connected as above, but with 10GbE connections (two CX4, two fiber). This fully populates the central switch above, there will be no more 10GbE Ethernet connections from that device. This group of switches is not stacked (no stack directive). Sometime in the next two or three months I'll need to add a sixth, and I'm not sure how deep of a hole I'm in. Ideally I'd replace the core switch with something more capable and has more 10GbE ports. However, that's a major outage and that requires special scheduling. The two edge switches connected via fiber have dual-port 10GbE cards in them, so I could physically put another switch on the far end of one of those. I don't know how much of a good or bad idea that would be though. Is that too many segments between end-points? Some config-excerpts: Running configuration: ; J9147A Configuration Editor; Created on release #W.14.49 hostname "REDACTED-SW01" time timezone 120 module 1 type J9147A module 2 type J9008A module 3 type J9149A no stack trunk B1 Trk3 Trunk trunk B2 Trk4 Trunk trunk A1 Trk11 Trunk trunk A2 Trk12 Trunk vlan 15 name "VM-MGMT" untagged Trk2,Trk5,Trk7 ip helper-address 10.1.10.4 ip address 10.1.11.1 255.255.255.0 tagged 37-40,Trk3-Trk4,Trk11-Trk12 jumbo ip proxy-arp exit

    Read the article

  • is it worth to use load balancer on web server/website

    - by user427969
    I have a website and a while ago, the web server of the company hosting my website was down for about a day. I consulted the company for a solution on how i can stop this from happening in future and they suggested to have a second machine and which will be connected to my current website/web server by a "load balancer" (at an additional huge cost!!!). The second machine will be replicate of the first one and so if i goes down, the other will always be running. ---- Explanation ----- My hosting company suggested that it will be a good idea to have a second machine running at the same time and both the machines will be connected by a load balancer which reduces the rist of a downtime. The second machine will be a mirror of the first and any changes to first must be replicated in the second. I don't mind spending money if it really saves my website from going down. I want to know is it worth having this "load balancer" for my purpose? My website is a 24/7 service. I cannot afford an outage of 24 hours/1 hour. I don't mind using this "load balancer" as far as it is really worth. I am not sure if its just a marketing trick of my hosting company or really a "best" solution Thanks for help. Regards

    Read the article

  • How to direct reverse proxy requests using wildcard vhosts

    - by HonoredMule
    I'm interested in running a reverse proxy with 2-3 virtual machines behind it. Each internal server will run multiple virtual hosts, and rather than manually configuring each individual vhost on the proxy (a variety of vhosts come and go too often for this to be practical), I would like to use something which can employ pattern matching in a sequential order to find the appropriate back-end server. For example: Server 1: *.dev.mysite.com Server 2: *.stage.mysite.com Server 3: *.mysite.com, dev.mysite.com, stage.mysite.com, mysite.com Server 4: * In the above configuration, task.dev.mysite.com would go to Server 1, dev.mysite.com would go to Server 3, yoursite.stage.mysite.com to Server 2, www.mysite.com to Server 3, and yoursite.com to Server 4. I've looked into using Squid, Varnish, and nginx so far. I have my opinions regarding their respective desirability and general suitability, but it's not readily apparent if any of them can handle dynamic server selection in this manner and not require per-vhost configuration. Apache on the other hand can do this handily and simply, but otherwise (aside from being well-known and familiar) seems very poorly suited to the partly-performance-serving task. Performance isn't actually a major concern yet, but it seems foolish to use Apache if another system will perform far better and can also handle the desired 'hands-free' configuration. But so is frequently having to adjust the gateway for all production services and risk network-wide outage...and so also is setting oneself up for longer downtime later if Apache becomes a too-small bottleneck. Which of these (or other) reverse proxies can do it/would do it best? And maybe I should post this as a separate question, but if Apache is the only practical option, how safe/reliable/predictable is apache-mpm-event in apache2.2 (Ubuntu 12.04.1) particularly for a dedicated reverse proxy? As I understand it the Event MPM was declared "safe" as of 2.4 but it's unclear whether reaching stability in 2.4 has any implications for the older (2.2) versions available in official/stable package channels of various distros.

    Read the article

  • ProCurve 1800 switch issue

    - by user98651
    I recently deployed ProCurve 1800-24G switches in place of some older ProCurve 2424M switches in my network. However, I'm having a serious problem with the switch connected to the router. It seems, every night when our Windows 2008 R2 server (off site) runs a backup to a iSCSI target (on site) [facilitated through a PPTP tunnel] the LAN loses connectivity with the router. To clarify, there is only one router which is connected to the switch affected by this problem. The only way to resolve the issue is to either reboot the router or pull the ethernet cable that goes to the router and plug it back in. During the outage, clients cannot receive DHCP requests, DNS requests, ping, or do anything else with the router in this state. Now, neither the switch or router are configured extensively and the issue only seems to have surfaced with the new switch in place. I have tried a number of things including replacing cables, rebooting and checking the switch configuration (it is literally as basic as you can get at this point-- flat LAN, no trunking). Interestingly, the router shows (accessed externally) no changes in configuration or status during this state but similarly cannot ping or access other hosts on the network. This issue occurs in different stages of backup (ie, different amounts transferred). I've also dumped packets from the switch into WireShark but cannot seem to find any anomaly yet (I'm looking at packets around the time the issue appeared and at the time when I reset the NIC). Any suggestions for what to look for? Ideas on what could be causing this? I'm seeing some transmit/receive errors on the NIC from both the router and switch side but nothing serious when compared to the total packet counts. I'm seriously doubting hardware at this point, as I have tried another switch, different cables, and a different NIC on the router.

    Read the article

  • Recovering Data from a Linkstation LS-WXL/R1

    - by kingkool68
    I've been running a Buffalo Linkstation LS-WXL/R1 in RAID1 for a few weeks. Two nights ago we had a brief power outage. Yesterday when I tried to mount the disk it couldn't be found. I logged in to the web admin and it couldn't see any storage attached and no arrays available. Well this sucks. I ended up taking the disks out and mounting them to an Ubuntu virtual machine. They how up as an Array but I can't start the Array to the best of my limited knowledge. I could still see the 6 partitions, so I'm confident the data is there. I can use Photorec (http://www.cgsecurity.org/wiki/PhotoRec) to recover most of the files I want, it just takes some time. Could I make an image of the data partition and mount that in Ubuntu so I can get at the data I want through the file system? 95% of the data on my Linkstation LS-WXL/R1 is backup. I just put a few folders of images on there that I need to get back. I'm already preparing to format the disks when I'm done and re-building the RAID1 array from scratch. Any advice would be appreciated.

    Read the article

  • The Server Fault Wiki of recommended practices [migrated]

    - by Avery Payne
    So I've noticed that there are several recommendations on basic practices on Server Fault, but there doesn't seem to be a cohesive view as to how those recommendations would all fit together. So I thought I would lump these together as a kind of mental exercise to see what the "ServerFault Community IT Department" would look like if it were implemented. This would give a few things: it would make a reasonable wiki (in the true wiki spirit of many contributions), it would provide several links to well-vetted practices, and it would be kind of fun to see what the amalgamation would look like. And who knows, it may even point out some interesting issues between different forms of "best practices", although I would be stunned if there was a conflict hidden in there someplace... Add your favorites from Server Fault as answers, and I'll re-edit this section with the results. Here's a few catagories to collect different ideas together. Hardware Configuration(s) Server room configuration. Server room temperature Firmware Updates and Scheduling Storage Configuration(s) Selecting a NAS box Linux: Dealing with /tmp Linux: Install apps in /var or /opt? Network Configuration(s) checking DNS health and compliance Security Practice(s) Password (General) Best Practices Password sharing methods Windows Update Updating Windows Servers that are hosts for VMs Network Service(s) User Service(s) User Naming & Deletion Upgrade Process(es) Disaster Recovery Checking Backups Documenting an outage for a post-mortem review Last Edit: 2010-02-17

    Read the article

  • MGE UPS Cut power - What happened?

    - by JT.WK
    I have 3 x MGE Pulsar M 3000 2700w UPS units within my server room which have run perfectly up until now. On Saturday morning I noticed that one of these UPS units was no longer outputting power, the lcd displayed a message saying "load not powered" and told me to press the power button to start output. Needless to say that the servers, switches and routers is was supporting were all turned off. I tried pressing and even holding the power button, but the unit refused to start back up again. Only power cycling the unit got it back up again. I have checked the logs on the UPS, although they were useless. Nothing out of the ordinary, and no email notifications had been sent. The output level sits on about 51% and all battery checks are OK. It is now three days on and the UPS is still up and running (although I am scheduling an outage to get it out of there ASAP). Does anyone have any idea what could have gone wrong here? Is there anything else that I can check that could help?

    Read the article

  • Find out when a system went down?

    - by Clinton Blackmore
    I have a Mac OS X 10.5 server, with a RAID set in it, that went down due to a power outage on Thursday, and the machine is not happily booting right now*. It is possible to find out when the machine went down, while not booted off the internal drive? (I'm booted off an external drive, waiting for the RAID sets to initialize.) Normally, I'd run last. The man page doesn't indicate that I can run it against a different startup volume. It looks possible to parse /var/log/utmpx, but I don't think it'd be worthwhile to try to do that from scratch for this one-off problem. * I'm still trying to figure out why it isn't happy, and may ask a follow-up question. Right now I can see that UserNotificationCenter crashed repeatedly early Thursday morning, and that securityd, mdworker, and ARDAgent crash shortly after startup [I think -- I want to verify when the box went up and down]. The login window does not come up right (I think it is crashing or not able to cope with a dead securityd). The box is supposed to be set to go down when the UPS tells it power is out; at the moment, I'm wondering if it went down, and turned back on multiple times! I sure hope not.

    Read the article

  • Can't get past bootcamp Windows logo, freezes immediately after picking boot partition -- Windows 7

    - by cksubs
    Hi, I had a fine install of Windows 7 alongside Snow Leopard. I had a power outage, and Windows restarted. When I booted via the "Hold Option" sceen, it started hanging at the grayscale "Windows logo" (squares) that comes up after picking the Windows drive from the bootcamp loader (the Windows version of the grayscale Apple logo when OS X loads). I think this happened once before, and installing rEFIt helped. I did that, but it continued to hang at the same point. I finally got fed up with it and erased the Windows partition via Disk Utility. I then reinstalled Windows 7 x64 from DVD to the fresh partition. Seemed fine. Ran Windows Update then restarted. FFFFFFFFF. Hangs in the same place. What can I do? Like I said, I already have rEFIt installed. Booting the Windows 7 install CD results in the same infinite loading error. I don't have a spare Snow Leopard disk handy.

    Read the article

  • Dell PS/2 Keyboard Stops Working Randomly After Power Disturbance

    - by Kenneth Murphy
    I have a Dell PS/2 keyboard connected to a desktop PC running Slackware 12.2 & Windows XP. After a recent, brief power outage/disturbance at my home, the keyboard has begun to quit working at random times. It has stopped at POST, but not by keyboard error -- I have to press the F1 key to continue booting, and at times the keyboard has already stopped working. Other times, the keyboard will work perfectly for a long time (a day or more) before it finally quits. It has stopped at boot, in Windows XP, and in Slackware. The led lights continue to work regardless. I have tried another PS/2 keyboard and it seems to be immune to this problem. The USB mouse always works. Does anyone have any ideas about how this might have happened? If this is related to the power disturbance that killed the power to the running PC, is it feasible that it would have only fried the keyboard itself (which still works sometimes) and not the PS/2 port nor anything else? I have experienced no other problems since the event.

    Read the article

  • How to debug silent hang on shutdown of Solaris 10?

    - by jblaine
    We're experiencing a mysterious hang on shutdown of a newly-imaged Oracle/Sun Solaris 10 SPARC box. It is repeatable (in the same spot ... from what we can tell). We let it try to work itself out multiple times for 5-10 minutes and it never progressed. I've never seen this happen before. The last thing displayed on the console is that syslogd was sent signal 15. Prior to us disabling snmpdx on the box, the last thing on the console was that snmpdx was sent signal 15 (after syslogd was sent signal 15). While very rare to find, in Solaris days past, I'd have a better idea from experience where the problem might be, and could then narrow things down further with silly (but effective) debugging echo statments in /etc/*.d scripts. With SMF in the picture, I'm not really quite sure where to start. We forced a crash dump via sync at the {ok} prompt for later analysis, and then let the box come up because it's a production server and our scheduled outage window was closing. /var/adm/messages shows nothing of use. How would you debug this situation? mdb ps of the savecore shows the following processes were running at hang time (afsd is the OpenAFS client and that many are expected): > > ::ps S PID PPID PGID SID UID FLAGS ADDR NAME R 0 0 0 0 0 0x00000001 00000000018387c0 sched R 108 0 0 0 0 0x00020001 00000600110fe010 zpool-silmaril-p R 3 0 0 0 0 0x00020001 0000060010b29848 fsflush R 2 0 0 0 0 0x00020001 0000060010b2a468 pageout R 1 0 0 0 0 0x4a024000 0000060010b2b088 init R 1327 1 1327 329 0 0x4a024002 00000600176ab0c0 reboot R 747 1 7 7 0 0x42020001 0000060017f9d0e0 afsd R 749 1 7 7 0 0x42020001 00000600180104d0 afsd R 752 1 7 7 0 0x42020001 0000060017cb44b8 afsd R 754 1 7 7 0 0x42020001 0000060017fc8068 afsd R 756 1 7 7 0 0x42020001 0000060017fcb0e8 afsd R 760 1 7 7 0 0x42020001 00000600177f4048 afsd R 762 1 7 7 0 0x42020001 000006001800f8b0 afsd R 764 1 7 7 0 0x42020001 000006001800ec90 afsd R 378 1 378 378 0 0x42020000 0000060013aee480 inetd R 7 1 7 7 0 0x42020000 0000060010b28008 svc.startd R 329 7 329 329 0 0x4a024000 00000600110ff850 sh Z 317 7 317 317 0 0x4a014002 0000060013b3a490 sac

    Read the article

  • UPS power requirements for server

    - by captainentropy
    Greetings! So, I just placed an order for a new server. The company recommended that I get a 3000W UPS. (!) As best as I could I calculated the following wattage consumption based on benchmarked data or datasheets provided by the manufacturers of each component: number watts **total watts** MoBo 1 240 240 CPUs (E5540) 2 80 160 RAID cards (3ware) 2 18 36 RAM (6x4GB) 6 3 18 DVD drive 1 7 7 floppy 1 2 2 RE4 drives 8 7 56 WD20 drives 8 6 48 Intel X25 SSD 2 0.15 0.3 total = 567 So that is for the PSU requirements only. The PSUs in the machine are a 720W for the master node and 800W each for two subsystems. That's a total of 2320W that can be delivered by these PSUs. But that is 4X the amount being consumed, at most, by the components. I didn't count case fans or the eSATA card (3W maybe?) or what the PSUs themselves require but assuming I double or triple my calculations I'm not even remotely close to the 3000W UPS I was suggested to get. They run at least $1100. I could get a 2000W for about $750 or a 1500W for $450 and still be well over my estimated power need. I don't think I need a whole lot of run time in the case of a power outage, maybe 20 minutes max, enough time to shutdown if the power doesn't come on within 5-10 minutes. Any thoughts? Am I off on my calculations? Did I overlook something major? If so what are your suggestions for a UPS? Thanks!

    Read the article

  • Data Store/Volume disconnecting. How to resume copy of VMDK?

    - by Serge
    I'm having an issue with my ESXi 4.1 hosts losing the datastore with FC SAN after a power outage. All 3 hosts disconnect so it's definitely a SAN issue. I've tried to resolve the issue on the SAN side with the SAN software support and Adaptec hardware support. No luck there. So I'm stuck with a SAN that will randomly disconnect the volume. I need to get the virtual machines (VMDK files) from the datastore. The problem is I can only get 5-20% before the data store disconnects. I have backups that are slightly older that I can use to replicate the VMDK differences to. What has not worked so far: Powering up the VMs, will boot up for 5-15 minutes then freeze vCenter migrate or clone of VM, will fail after similar period of time vCenter copy/paste of VMDK. Was able to get one 30GB VMDK and no luck after that. vMware Data Recovery. Fails at low %, can't resume, so next backup starts from begining. Veeam Backup & Recovery. Same as above, no resume function. If I can just find a backup solution that will resume from the failed spot that would solve my issue. Anyone have any ideas that I could try? EDIT 1 The SAN is Open-E DSS 6 running on a Supermicro 24 drive enclosure with 4 port Qlogic FC. Adaptec 52445 RAID card.

    Read the article

  • Rebuilding LVM after RAID recovery

    - by Xiong Chiamiov
    I have 4 disks RAID-5ed to create md0, and another 4 disks RAID-5ed to create md1. These are then combined via LVM to create one partition. There was a power outage while I was gone, and when I got back, it looked like one of the disks in md1 was out of sync - mdadm kept claiming that it only could find 3 of the 4 drives. The only thing I could do to get anything to happen was to use mdadm --create on those four disks, then let it rebuild the array. This seemed like a bad idea to me, but none of the stuff I had was critical (although it'd take a while to get it all back), and a thread somewhere claimed that this would fix things. If this trashed all of my data, then I suppose you can stop reading and just tell me that. After waiting four hours for the array to rebuild, md1 looked fine (I guess), but the lvm was complaining about not being able to find a device with the correct UUID, presumably because md1 changed UUIDs. I used the pvcreate and vgcfgrestore commands as documented here. Attempting to run an lvchange -a y on it, however, gives me a resume ioctl failed message. Is there any hope for me to recover my data, or have I completely mucked it up?

    Read the article

  • How to get data from a borked Windows Home Server

    - by harhoo
    Yesterday we had a power surge, followed by a power outage. This left my WHS borked: powering on just gives to a flashing blue light (the led on the power supply also flashes green) - no fan or boot activity, nothing. I urgently needed some files off there in the short term (and the 500GB of photos, music, personal video etc in the long term) so I took the hard drive out and put it in my computer. The files and folders showed up, but I couldn't access them - clicking on an image gave an invalid image error in Picasa, I couldn't play MP3s etc. I changed the ownership and permissions of the files, still nothing. I booted in with a LiveCD, the same: files appear, but won't open. Is there anything else I can do? I'm now wondering if it was just the power cable that's broken, but if so, why can;t I access my files from the hard drive? If it is the power cable, and I replace that and the hard drive, will I have done any harm messing around with ownership and file permissions?

    Read the article

< Previous Page | 2 3 4 5 6 7 8 9  | Next Page >