Search Results

Search found 8021 results on 321 pages for 'resource monitoring'.

Page 217/321 | < Previous Page | 213 214 215 216 217 218 219 220 221 222 223 224 | Next Page >

Linux CentOS strange memory readings

- by user2008937

I am actually a young junior sys admin. I have a question - i am trying to understand how linux deals with memory... while playing around different monitoring programs I found some strange thing. When I run top on my laptop it shows me that FIREFOX process with pid 8778 takes 18,3% of memory (%MEM column). grep "MemTotal" /proc/meminfo Above command give me 1848336kb/1024 = 1805megs of memory (its ok - i have 2 gigs of ram). So if the firefox process takes 18,3% of MEM(according to tops %MEM column) then it takes 0.183 * 1805 which is approximately 325mb of memory. Quite a lot as for firefox... But well, in Linux there are lots of shared libraries that programs commonly uses (like famous libc). And those libraries are added to memory utilization of every process that uses it in the system, despite they are actually reading same file(single object in memory). So top may show too big mem utilization because of those shared libraries. Well, it is time to use PMAP which should show us the real mem utilization of process. But.. pmap -d $(pidof firefox) mapped: 983460K writeable/private: 757164K shared: 66416K so pmap shows that 983460/1024=993MB of memory is mapped to this process. It is in fact much bigger than mem utilization showed by top. Whats wrong here? How pmap can show more than top? even when top adds also the shared libraries (which in fact are single objects in memory) for each process that uses it? and pmap omits it? Regards Krzysztof

Read the article
Intermittently uncommunicative subnets

- by mhd

Last week proved me a veritable Cassandra: I've always said that it's a bad idea to have only one firewall/router, without a backup or failover. And thus our Cisco PIX went haywire, refusing to route properly. And of course, the only one available here on short notice is me, and while I'm quite grounded in Linux, I'm really a developer not a sysadmin (the fact that this hit me on sysadmin appreciation day is a bit ironic). Anyway, this weekend I tried to hack up a temporary solution: I used an old server with enough NICs (two built-in, four on a card) to serve as a gateway and firewall. Due to some problems with the raid controller, I got only two router distros running, and between Untangle and Ebox I decided for the latter. Now everything is quite okay. I've got all the different subnets we've got here (all with separate switches) talking to each other and even to the internet (Cisco 2800 router, T1 lines). But from time to time (20-60 minute intervals), I get a total routing failure. Our main, office subnet can't talk to our server subnet and can't connect to the internet. This is not the end of a gradual slowdown, either everything's working perfectly or I get a total lack of communication for about two minutes each time. Now I'm a bit at wits end what to check. At least with the default EBox setup, nothing in /var/log shows anything weird and it doesn't exactly have lots of built-in monitoring tools. So I'm hoping someone here could give me some pointers about what to look out for. I did change the ethernet cable from the office switch to the firewall, with no results. I might change switches, although within the switch it seems to work ok enough. Edit: I'm not sure whether this is the sole cause of the problem, but after I noticed a few DHCP entries just before the last drop of connectivity, I tried to reproduce that. And alas, whenever I renew a DHCP connection, I can't access other subnets anymore. Running ISC DHCPD 3.0.6.

Read the article
How do I replicate Gmail filtering (forwarding mostly)?

- by projectdp

I have reached the limits of Gmail forwarding. Before there was no need to verify forwarding addresses. It's a problem for me now because the addresses I want to forward to are not natural inboxes but automated systems with no way to track the verification email contents. I want to set this up for example: mobile - email - facebook-email - flickr-email - tumblr-email - posterous-email How do I do this without Gmail filters? I think I need to use fetchmail to watch my inbox and then autoforward to the above addresses. Is fetchmail the best solution to this issue? Any other MRA's? I'd like to do some more complicated things with the emails in an automated fashion too, how would I go about monitoring the inbox, doing some actions to the email before forwarding, and forward everywhere? prerequisites: a server: fetchmail daemon to poll the account local mailbox script to clean & forward appropriately (python probably) sendmail + ~/.forward file backup email account (Gmail probably) Any help would be greatly appreciated. I'm trying to automate my social content distribution.

Read the article
Passwortgeschützter Traffic-meter

- by UncleBob

Hallo erstmal, ich habe hier ein kleines Problem für das ich bis jetzt noch keine Lösung habe. Ich lebe in Bosnien und teile hier die Internetverbindung mit der Vermieterin, und wie es in Bosnien so ist haben wir keine Flatrate, sondern eine 15 Giga traffic limite. Das wäre eigentlich mehr als genug, wenn der Sohn der Vermieterin nicht immer überziehen würde, sodass die Rechnungen immer ziemlich teuer ausfallen. Ich habe ihm bereits ein Messprogramm installiert, aber das schaltet er offensichtlich aus sobald er in die Nähe seiner Limite kommt und behauptet dann die Limite nicht überzogen zu haben. Ich brauche also mindestens ein Messprogramm das Passwortgeschützt ist und/oder im Log Zeiten vermerkt wärend denen es nicht eingeschaltet war. Noch besser wäre ein Programm das ihm den Netzzugriff einfach abklemmt wenn er seinen Anteil überschreitet, also eine Mischung aus Trafic-meter und Parental Guard. Kann mir da jemand weiterhelfen? Gtranslated version Hi first, I have a small problem for which I yet have no solution. I live in Bosnia and share the Internet connection here with the owner, and how it is in Bosnia, we do not have a flat rate, but a 15 Giga traffic limite. That would actually would be more than enough, if the son of the landlady does not always cover so that the bills always turn out quite expensive. I have it already installed a monitoring program, but he apparently turns out as soon as he comes close to its limit and then claims not to have the limit excessive. I therefore need at least a measurement program that is password protected and / or in the log notes During low periods where it has not turned on. Even better would be a program that disconnects him from accessing the network if it simply exceeds its share, ie a mixture of Traffic parameters and Parental Guard. Can someone help me there?

Read the article
Dell OMCI: Wacky values for Temperature and etc? (Win7x64)

- by Yablargo

Hey All. I am running a Dell Precision R5400 Workstation with dell OMCI installed. I am using it to test polling various data over WMI for our monitoring across the enterprise. I'm getting some weird results. perhaps someone can help point me in the direction of some clarification? Posted is the results of my DCIM\SYSMAN\DCIM_NumericSensor probe for sensor type 2(temp sensor) Microsoft (R) Windows Script Host Version 5.8 Copyright (C) Microsoft Corporation. All rights reserved. ----------------------------------- DCIM_NumericSensor instance ----------------------------------- Accuracy: AccuracyUnits: AdditionalAvailability: Availability: AvailableRequestedStates: BaseUnits: 2 Caption: CommunicationStatus: CreationClassName: DCIM_NumericSensor CurrentReading: -214748365 CurrentState: Unknown Description: DetailedStatus: DeviceID: Root/MainSystemChassis/TemperatureObj ElementName: Temperature Sensor:CPU0 EnabledDefault: 2 EnabledState: 2 EnabledThresholds: ErrorCleared: ErrorDescription: HealthState: 5 Hysteresis: IdentifyingDescriptions: InstallDate: IsLinear: LastErrorCode: LocationIndicator: LowerThresholdCritical: LowerThresholdFatal: LowerThresholdNonCritical: MaxQuiesceTime: MaxReadable: MinReadable: Name: NominalReading: NormalMax: NormalMin: OperatingStatus: OperationalStatus: 2 OtherEnabledState: OtherIdentifyingInfo: OtherSensorTypeDescription: PollingInterval: PossibleStates: Unknown,Normal,Fatal,Lower Non-Critical,Upper Non-Critical,Lower Critical,Upper Critical PowerManagementCapabilities: PowerManagementSupported: PowerOnHours: PrimaryStatus: ProgrammaticAccuracy: RateUnits: 0 RequestedState: 12 Resolution: SensorType: 2 SettableThresholds: Status: StatusDescriptions: StatusInfo: SupportedThresholds: SystemCreationClassName: DCIM_ComputerSystem SystemName: dt:5Q7BKK1 TimeOfLastStateChange: Tolerance: TotalPowerOnHours: TransitioningToState: 12 UnitModifier: 0 UpperThresholdCritical: UpperThresholdFatal: UpperThresholdNonCritical: ValueFormulation: 2 I'm not really sure whats going on, but note the CurrentReading: -214748365. I have reinstalled OMCI a few times, installed the OMCI 7x compatability and same thing I consistently get that error. It almost looks like its a issue between 32/64 bit value or something? Do I have to convert it to a float ? :)

Read the article
Joomla performance problems on AWS

- by Bobby Jack

I'm running a site on AWS with the following setup: Single m1.small instance (web server) Single RDS m1.small db Joomla 1.5 Generally, the site is performant, but is fairly low-traffic - say around 50-100 visits / hour. However, at peak time, we see about double that traffic. During peak time, pretty much every day: CPU usage on the web server slowly climbs to 100% CPU usage on the RDS server climbs quite quickly to about 30%, from an average of about 15 Database connections shoot up to about 140, from a normal average of about 2 or 3 The site is then occasionally unreachable, certainly according to pingdom monitoring. Does anyone recognise this behaviour? Can you point me in the right direction to begin investigating? Of course, RDS makes it difficult to do things like slow query logging, so I've started by regularly dumping the mysql process list into a file to see if there's anything I can spot there, but it would be good to have something more concrete to investigate. UPDATE At least, can someone confirm that I'm definitely right in saying that the level of traffic implies the problem must be a specific type of query taking way longer than it should to execute? This would happen if a table gets locked, and many queries need to write to it, right? For this very reason, I've already changed the __session table type to InnoDB.

Read the article
Trouble connecting to a local SQL server instance from the web

- by dfarney

We have a small network behind a firewall (WatchGuard XTM 2 series) and network switch. On our network we have multiple instances of SQL server, but 1 in specific that I would like to be able to access remotely from our website. We have a static IP address from our ISP and then all the machines on the network have a locally assigned dynamic IP address. When trying to connect to the database from outside our network how do I get the request to be directed to the proper machine / SQL instance? Is it a parameter in my connection string or something in my firewall? A few things to rule out: 1) The firewall is allowing access from the website to our network. I added the site's IP and opened up port 1433. Also, when trying to connect and monitoring the firewall no exceptions come up as they did before I added the proper IP address. 2) Remote connections on the SQL server has been setup and enabled. I've done a lot of reading up on remote connections and I am sure it has been setup properly. I am currently getting this error message on my site: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 0 - A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.)

Read the article
central apache log analysis of many hosts

- by Jason Antman

We have 30+ apache httpd servers, and are looking to perform analysis on the logs both for historical trending and near "real time" monitoring/alerting. I'm mainly interested in things like error rates (4xx/5xx), response time, overall request rate, etc. but it would also be very useful to pull out more compute-intensive statistics like unique client IPs and user agents per unit of time. I'm leaning towards building this as a centralized collector/server/storage, and am also considering the possibility of storing non-apache logs (i.e. general syslog, firewall logs, etc.) in the same system. Obviously a large part of this will probably have to be custom (at least the connection between pieces and the parsing/analysis we do), but I haven't been able to find much information on people who have done stuff like this, at least at shops smaller than Google/Facebook/etc. who can throw their log data into a hundred-node compute cluster and run Map/Reduce on it. The main things I'm looking for are: - All open source - Some way of collecting logs from apache machines that isn't too resource-intensive, and transports them relatively quickly over the network - Some way of storing them (NoSQL? key-value store?) on the backend, for a given amount of time (and then rolling them up into historical averages) - In the middle of this, a way of graphing in near-real-time (probably also with some statistical analysis on it) and hopefully alerting off of those graphs. Any suggestions/pointers/ideas, to either "products"/projects or descriptions of how other people do this would be greatly helpful. Unfortunately, we're not exactly a new-age-y devops shop, lots of old stuff, homogeneous infrastructure, and strained boxes.

Read the article
What can I do to determine the root cause of a Windows server hanging/freezing?

- by Aaronaught

We set up a new server here a few weeks ago that I am informally responsible for managing. Almost everything works perfectly except for one thing: Every so often it hangs without warning. To clarify: When I say hangs, I mean completely. None of the services respond and I'm unable to even get onto a local console - the display acts as though there's no VGA signal. One time, the server actually responded to pings, another time I got the "destination host unreachable" response, but most of the time the pings just time out, as one would expect for a hung server. Event logs don't show anything after a reboot. I don't mean that they don't show anything interesting, I mean that they don't show anything at all from before the failure occurs to after the reboot. And there are never any performance problems, strange errors, or other obvious signs of impending doom before it happens. I don't expect any easy answers here. What I'd like to know his I can methodically determine the root cause of this problem, be it a misbehaving service, defective hardware, or something else. Is there any kind of logging I can set up that will help me get to the bottom of this? Any hardware diagnostics or remote monitoring? Anything else I can do to help me discover what's actually happening, or at least be able to eliminate what isn't wrong? Just to reiterate, I really don't want to start speculating about possible causes and take a trial-and-error approach, because it's going to be at least several days at a time before I would have conclusive results. I'm looking for solutions to reliably trace the problem to its source.

Read the article
Random and Selective ARP blindness in VMWare ESXi 4.1

- by Peter Grace

We have multiple VMWare ESX servers spread out amongst our company, doing various tasks. One particular ESXi host is exhibiting very peculiar behavior. We detect it when our monitoring system (Orion) notifies us that it can no longer ping the box. Upon jumping on the local console of the guest in question, we see that it cannot ping any new addresses that aren't already in its ARP table. At first we thought that the problem was just related to one of our guests, as the problem seemed to always happen to another guest, DevRedis. However, this afternoon the problem swapped and started happening on ApacheBox rather than DevRedis. When I have been fortunate to catch the problem, I have run tcpdump on both sides of the connection (one side being vmware, the other side being a physical webserver) and have noticed the following course of events: Guest ApacheBox sends an ARP request for the physical address of server WindowsBeast WindowsBeast tenders an ARP is-at back to the network indicating its physical mac address. ApacheBox never sees the ARP is-at response. The ESX host in question is running VMware ESXi, 4.1.0, 348481 The two guests (DevRedis and ApacheBox) are both running CentOS 6.3, however they are running two separate kernel versions ( 2.6.32-279.9.1.el6.x86_64 and 2.6.32-279.el6.x86_64 ) so I'm not entirely sure it's a CentOS problem. Does anyone have any thoughts on what might cause this? Has anyone run into it before?

Read the article
Apache suddenly very slow on http and faster on https

- by hsnm

Background: I have Apache 2 running on ubuntu. There is a low usage on it and mostly being accessed for a web service URL from mobile apps. It was working fine until I installed SSL certificates. I now have both http and https. When I access the server using https, I get a fairly quick response (but probably not as fast as before). When I use http, it's so slow. What I tried: From this post: I curl localhost from the host and it takes some time, meaning there is no routing issue. The server runs on Amazon EC2 instance and is managed by me only. Also: I see that Apache once running, creates the maximum number of processes it is allowed to, which was not the case before. I lowered the MaxClients to 20 and I think I'm getting faster responses but it still takes over a minute and I always have MaxClients Apache processes. dmesg returns many [ 1953.655703] TCP: Possible SYN flooding on port 80. Sending cookies. When I netstat I get many entries with SYN_RECV. Possibly a DDoS attack? From EC2's monitoring diagrams I see a pattern of high "Maximum Network In (Bytes)" since 2 days ago. By the way the server is still being tested, the actual traffic is very low and not consistent. I tried to go with this solution to limit incoming connections using iptables, still no luck, but I'm trying. Question: What could be the problem? Is this a DDoS attack?

Read the article
vSphere - datastore falling off a host

- by Chadddada

Recently we have been running the vCheck powershell script daily in order to help in monitoring our vSphere ESX 4.0 environment. One of the oddities that we have been seeing is that some of the datastores on the SAN don't always show up on every host. Our hosts are connected redundantly, via FC, to some brocade FC switches, which then connect via fiber to our EMC Ax4 SAN. While all the datastores are presented to each host we have, and they see them initially, they sometimes seem to fall off and are no longer visible. It easy enough to rescan for datastores and add them back to the hosts the hosts but this seems to be an error. Has anyone else seen this or know why it may be happening? Responses to questions: 1. Is it always the same ESX servers that lose their connection? – Scott Warren No this happens randomly on random hosts. If a VM is running on a particular host, of which the VM's disks are on a SAN datastore, then that datastore won't disappear. It seems to happen if a host doesn't touch a datastore for a bit and it just forgets about it.

Read the article
Internet setup for my office

- by prakash

We have two internet connections to our office and our current setup is like this.. The internet connections require pppoe log in so i take each cable and plug it into a wifi router and configure the router to log in to the pppoe and then plug in a cable from the router to a switch and distribute the internet throughout my office. The problem with this setup is it is really hard to monitor and im not able to monitor who is hogging internet usage and what he or she is actually using it for. apart from this we also have a nas setup which is routed through another switch . Could someone please throw a little light on how i can restructure this setup for easy monitoring and better transparency... ? each wan router is connected to a different switch and is distributed to users accordingly.. we have around 40 users in the office.. we want to setup a single linux box to which i want to connect the two wan connections and from there distribute it to all our users.... im looking for a solution where we do not have to invest more that buying a single pc and a couple of nics

Read the article
Juniper router dropping pings to external interface

- by Alexander Garden

My organization has a Juniper SSG20-WLAN that routes our traffic to the outside world. We've been having intermittent problems with our internet connection so I wrote up a Python script to ping the internal interface of the router, the external interface, a couple of our internal servers, the ISP router our router talks to, their upstream provider, and Google and Yahoo for good measure. It does that about every minute. What I have found is that when our internet goes out, our Juniper router ceases responding to pings on the external interface. Everything past that is, of course, unreachable. The internal interface and our internal servers continue to echo back without interruption. None of the counters indicate dropped packets of any type. They all look normal. The logs complain about VIP servers being unavailable but otherwise nothing indicative of network issues. My questions are these: Does this exonerate our ISP? Or, contrawise, might a problem with the connection be causing the external interface to go down? Is there somewhere else in the SSG20, beside the system log and counters, that might help me track down info on the problem? UPDATE: Turned out that one of the switches between my monitoring box and the router was a router itself, and occasionally diverting from the gateway to itself. Kudos to those who made suggestions along those lines. Not really sure which answer to mark as accepted, as it was really stuff in the comments that turned out to be right. Thanks for the suggestions.

Read the article
what means parameter -mailboxcredenctial

- by cotablise

H3llo, I am writing regarding the Exchange powershell commands. When I want to use following cmdlets, I have to insert parameter -mailboxcredential Test-OwaConnectivity Test-OutlookWebServices Test-ImapConnectivity Test-PopConnectivity In the Microsoft official site is written: "The MailboxCredential parameter specifies the mailbox credential for a single URL test." I am not sure why this parameter is needed... I inserted incorrect credentials, however the command was finished successfully... Could you tell me reason why this parameter is needed ? Example: Wrong/incorrect credential [PS] C:\>Test-WebServicesConnectivity -ClientAccessServer EXhub1 -MailboxCredential (Get-Credential blablabla) CasServer LocalSite Scenario Result Latency(MS) Error --------- --------- -------- ------ ----------- ----- EXhub1 Default-Fi... GetFolder Failure [System.Net.WebExcept... Without parameter: [PS] C:\>Test-WebServicesConnectivity -ClientAccessServer EXhub1 WARNING: Test user 'extest_91ef41d34eef4' isn't accessible, so this cmdlet won't be able to test Client Access server connectivity. Could not find or sign in with user ********\extest_91ef41d34eef4. If this task is being run without credentials, sign in as a Domain Administrator, and then run Scripts\new-TestCasConnectivityUser.ps1 to verify that the user exists on Mailbox server EXHUB1.****** + CategoryInfo : ObjectNotFound: (:) [Test-WebServicesConnectivity], CasHealthCouldN...edInfoException + FullyQualifiedErrorId : FB9A14B6,Microsoft.Exchange.Monitoring.TestWebServicesConnectivity WARNING: No Client Access servers were tested. Thank you in advance

Read the article
Windows/IIS Hosting :: How much is too much?

- by bsisupport

I have 4 Windows 2003 servers running IIS 6. These servers host a bunch of unique web sites (in that they are all different in build/architecture/etc). The code behind these sites range from straight HTML, classic ASP, and 1.1/2.0/3.x flavors of .NET. Some (most) of the sites use a SQL backend, which is hosted on one or two different servers – not the IIS servers themselves. No virtualization on these servers and no load balancing for these particular sites. The problem I’m running into is coming up with some baseline metrics to determine, or basically come up with a “baseline score” to know when a web server has reached its hosting limit. Today, some basic information about each server is used: how much bandwidth does the server pump out, hard drive space availability, and basic (very basic) RAM & CPU utilization (what it looks like at peak traffic times.) I would be grateful if those of you that are 1000x smarter than I am could indulge me with your methods of managing IIS environments. Whether performance monitoring specifics, “score” determination as I’m trying to determine, or the obvious combination of both. Thanks in advance.

Read the article
Is there a limit to how many sites can be hosted on a single IP address when using HTTP Host Headers on Windows 2008?

- by Kev

For reasons that are lost in the mists of time, our older Windows (2000, 2003) servers have been configured with a "Administrative" IP address and three further "Hosting" IP addresses. There are also additional IP's for sites with SSL certificates. The "Administrative" IP address is where all our internal provisioning, monitoring and other such apps are bound to. We lock this down and don't permit access to it from the outside world (other than over our VPN). The three "Hosting" IP addresses are used for IIS website hosting (in conjunction with host headers). Historically, new site IP address allocations have been rotated through these three IP addresses. I'm not really sure why. I'm building a new batch of servers and I'm considering just having a single hosting IP address. Our servers can host up to 1200 sites on a single machine. Is there a technical limit to the number of IIS sites that can bind to a single IP address? Our Linux platform seems to do just fine with just a single shared IP + host headers. I initially thought this might be an SEO thing, but given that IPv4 address space conservation is paramount I hardly think Google or other search engines could reasonably penalise site rankings just because hundreds of sites hang off the same IP.

Read the article
ubuntu's average load never below "0.00 0.01 0.05"

- by Karma Fusebox

I have several ubuntu 12.04 VMs running on a ubuntu 12.04 KVM host. Those of the virtual machines that are totally idle with no services (except syslog and the other "small" standard stuff of a fresh installation) show a constant load of "0.00 0.01 0.05" in top/htop as average 1/5/15. When there are "real" applications running, the load averages behave perfectly normal but they never fall below the mentioned values. While this doesn't affect performance at all and could easily be ignored, it screws up the monitoring graphs in a very annoying way: (Notice how load15 behaves nicely if 0.05 for a short time in the right half of the pic) Unfortunately I don't know what diagnostic outputs might be helpful for you, so here's some default stuff: # top top - 16:31:01 up 1:05, 1 user, load average: 0.00, 0.01, 0.05 Tasks: 62 total, 1 running, 61 sleeping, 0 stopped, 0 zombie Cpu(s): 0.2%us, 0.2%sy, 0.0%ni, 99.2%id, 0.5%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 1019464k total, 73452k used, 946012k free, 6140k buffers Swap: 0k total, 0k used, 0k free, 22504k cached . # free -m total used free shared buffers cached Mem: 995 72 923 0 6 21 -/+ buffers/cache: 43 951 Swap: 0 0 0 . # iostat -x /dev/vda Linux 3.2.0-32-virtual (vm3) 11/15/2012 _x86_64_ (2 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 0.25 0.00 0.65 0.20 0.24 98.66 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util vda 0.14 0.12 0.51 0.22 6.74 1.46 22.50 0.02 23.26 20.64 29.30 7.63 0.56 Need something else? Has anyone ever seen this behavior? Might this be a bug in kvm/ubuntu/kernel 3.x in the end? Thanks a lot!

Read the article
Nagios DNX plugins

- by danneh3826

I'm toying with the idea of multiple Nagios instances setup to monitor our infrastructure. I've looked at all the various methods of distributed Nagios checks, and I think DNX comes out the closest. DNX handles failure of worker nodes, that's fine. What happens if the main DNX server fails though? Is there a way to replicate the server too? I'm using AWS EC2 primarily, so I can utilise Elastic Load Balancing for the web UI, but I need to be able to handle the AZ where the monitoring server is to fail over, and essentially for a second to pick up the checking load (active/passive, active/active, so long as it doesn't fail completely) The other thing I'm trying to solve is an issue with routing. What I'd like is to have multiple nodes report a fault before Nagios confirms it as critical. Not the NRPE checks, as they're pretty self explanitory, but things more like check_ping. I often have routing issues out of AWS to certain datacenters, so Nagios can often report bad/no ping/timeout as a critical issue, even though the machine in question is working fine. Would it be possible to have a setup where a worker complains a service check is critical, and have a second worker node (positioned in another datacenter/AZ) also report the service as critical before the Nagios central server issues a critical alert? I realise I might be asking a bit much (how far down the line do you go setting up failover systems before it starts to get ridiculous), however surely someone must have thought of this scenario when developing DNX?

Read the article
Computer randomly shuts itself off

- by Decency

I have not been able to determine a pattern for why this happens, despite my best efforts. I've attempted to run it on full power with Prime95 and this doesn't trigger a restart. Generally the restarts occur while I'm playing games, watching videos, or even just having multiple tabs open in a browser. However, I often play processor intense games for hours without any restarts occurring, and sometimes they'll happen 3-4 times in an hour during less intense activity, so I don't think that is the problem. I imagine it has something to do with overheating or power consumption so I've been monitoring CPU temperature and cleaning with compressed air, but the problem keeps happening. I don't know how to track power consumption, and assume that this is the problem. Whenever this occurs, the sound gets stuck in a short loop of whatever was playing at the time, though restarts also occur when nothing is playing. Here is a screenshot of temperatures: and under load: Here's the parts list: http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=10546754 As shown in the list, the case includes a 585W Power Supply, which I've been told should be plenty. I built the computer myself with a friend's guidance but it's very possible I did something wrong. Right now I'm looking into ensuring that I have the latest drivers for all components. Any help would be appreciated- thanks.

Read the article
Outbound ports to allow through firewall - core requirements

- by dunxd

This question was asked before, but in a rather general way. I'm asking more specifically based on my current requirements. We have a number of remote offices made up of a bunch of PCs and an ASA 5505 which is used as firewall and VPN termination point. In the offices we share the internet connection with one or more other organisations over whom we have very little control, asides from the config on the ASAs. For a bunch of reasons I'd like to lock down these ASA 5505s to only allow outbound traffic to ports used by applications we know we need. I'm putting a standard config to roll out to all the ASAs, and if we need to open up ports for the other orgs we can do it on request. But I want to leave open the most commonly required ports so we can get up and running without waiting on other folks technical staff to get back. I plan to allow the following TCP ports to support email and web access, which I know everyone will need: POP3 (110 and 995) HTTP (80 and 443) IMAP4 (143 and 993) SMTP (25 and and 465) The question really is, what other ports do I need to leave open to allow for "normal" working? I've seen UDP port 53 for DNS as one. Are there any others that would be worth opening up? Just to note - I'll also be setting up monitoring systems to keep an eye on the ports we do allow. Any of the above could be misused of course. We'll also back all this up with signed agreements. But I'm aiming for a technical solutions where I don't have to start out with the full requirements of everyone we share connections with. See also: outbound ports that are always open

Read the article
Can't reliably ping 6224 router from directly-attached system

- by David Mackintosh

OK, here's my situation. This is on the internet. The 6224 is the router in this picture and physically resides in Kanata. Both VLAN 1697 and 3994 are provided by an internet service provider. These VLANs are provided through a single 1Gb ethernet wire. The Kanata hosts are directly attached to the 6224; the other two sites are remote. VLAN 3994 is a single IP address space, so theoretically it shouldn't matter physically where the hosts on that subnet are. Here's the problem. I have a monitoring system which is connected further into the internet, so probes from the monitor would come in to this diagram on the 1697 VLAN. When I ping hosts at Albert or Bells Corners from the internet, there is 0 loss. The connection looks perfect. When I ping hosts at Kanata, I lose anywhere from 10 to 40% of the pings. The loss is not predictable, but: when I do lose them, I always lose at least 3, usually 4, rarely more, pings in a bunch. I have attached a monitor directly to the 6224 in Kanata on 3994.. When the monitor pings the 6224 routing interface, I see exactly the same loss pattern -- but NOT at the same time as the loss from the remote system. Ping time is around 1ms. When the monitor pings another system directly attached to the 6224, there is 0 loss. Ping time is about 0.1ms, one-tenth of the time to ping the router. Anyone know what is going on here?

Read the article
Software for RAID Failure Alerts?

- by QF_Developer

I have two 256 GB Samsung 840 Pro SSD disks in a RAID 1 array. I would like to receive a notification if one of the disks in the array fails. Can anybody recommend an application I can install on the server to fire an email if such an event occurs? Here are some additional specs: Supermicro X9SCM-IIF motherboard utilising the hardware RAID controller. OS = Windows 2012 Standard Also is it possible to simulate a disk failure by pulling it out of the bay? SSDs appear to fail close together when in a mirrored config so I'd like to know ASAP if one goes down so I can swap them out with minimum delay. UPDATE 26th June 2013 ------------------------ None of the software that ships with the Supermicro X9SCM-* motherboards offer support for RAID monitoring. As has been pointed out here, these boards are built on an Intel chipset for RAID and so I installed Intel Rapid Storage Technology that supports automated email notifications on RAID failure http://www.intel.com/support/chipsets/imsm/sb/cs-020784.htm One small issue, the software only allows you to send email notifications without SMTP authentication. There's a bunch of different workarounds here: http://communities.intel.com/thread/30771

Read the article
linux automatic change permissions in resolv.file

- by rikr

In various linux servers I see how the permissions of the /etc/resolv.conf file change automatically. In state normal: -r--r--r-- 1 root root 103 Jul 4 11:50 resolv.conf In changed state: -r--r----- 1 root root 103 Jul 4 11:50 resolv.conf I installed auditd for monitoring it, and these are the two entries between the change: type=PATH msg=audit(07/04/2012 12:20:02.719:303) : item=0 name=/etc/resolv.conf inode=137102 dev=fe:00 mode=file,644 ouid=root ogid=root rdev=00:00 type=CWD msg=audit(07/04/2012 12:20:02.719:303) : cwd=/ type=SYSCALL msg=audit(07/04/2012 12:20:02.719:303) : arch=x86_64 syscall=open success=yes exit=3 a0=7feeb1405dec a1=0 a2=1b6 a3=0 items=1 ppid=1585 pid=3445 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=4294967295 comm=hostid exe=/usr/bin/hostid key=(null) type=PATH msg=audit(07/04/2012 12:50:03.727:304) : item=0 name=/etc/resolv.conf inode=137102 dev=fe:00 mode=file,440 ouid=root ogid=root rdev=00:00 type=CWD msg=audit(07/04/2012 12:50:03.727:304) : cwd=/ type=SYSCALL msg=audit(07/04/2012 12:50:03.727:304) : arch=x86_64 syscall=open success=yes exit=3 a0=7f2bcf7abdec a1=0 a2=1b6 a3=0 items=1 ppid=1585 pid=3610 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=4294967295 comm=hostid exe=/usr/bin/hostid key=(null) any ideas?

Read the article
Does anyone know where I could find a 2 input USB voltage meter?

- by John O

What we really need is a tiny UPS, of sorts. We'll be hooking up a solar cell and a battery to a single board computer. Currently, that SBC is a custom Pic32 device, and it does it's own UPS and voltage monitoring duties. I've been tasked with trying to replicate all of its features with off the shelf products... and for the most part I've succeeded. But I don't currently have any way to switch between two sources of juice, or monitor when they're getting low. These guys have something: http://www.mini-box.com/picoUPS-100-12V-DC-micro-UPS-system-battery-backup-system I really like it, the price is well within the budget. We might even work it in though it does 12V and I'll probably be using 5V... there are enough engineers on hand to figure out something. But I'd still have no idea what the voltage was for the PV or battery. I was hoping that there was some simple little USB multimeter thing that I could use to monitor this with, but I can't seem to come up with anything. I've found all sorts of cool hardware, but nothing that will help us. Does anyone know of anything?

Read the article

Search Results

Search found 8021 results on 321 pages for 'resource monitoring'.

Page 217/321 | < Previous Page | 213 214 215 216 217 218 219 220 221 222 223 224 | Next Page >

- by user2008937

- by mhd

- by projectdp

- by UncleBob

- by Yablargo

- by Bobby Jack

- by dfarney

- by Jason Antman

- by Aaronaught

- by Peter Grace

- by hsnm

- by Chadddada

- by prakash

- by Alexander Garden

- by cotablise

- by bsisupport

- by Kev

- by Karma Fusebox

- by danneh3826

- by Decency

- by dunxd

- by David Mackintosh

- by QF_Developer

- by rikr

- by John O

< Previous Page | 213 214 215 216 217 218 219 220 221 222 223 224 | Next Page >