Search Results

Search found 2429 results on 98 pages for 'monitoring'.

Page 22/98 | < Previous Page | 18 19 20 21 22 23 24 25 26 27 28 29  | Next Page >

  • Monitoring used connections on mysql to debug 'too many connections'

    - by Nir
    On LAMP production server I get the 'too many connections' error from MYSQL occasionally, I want to add monitoring to find if the reason is that I exceed the max-connections limit. My question: How can I query from mysql or from mysqladmin the current number of used connections? (I noticed that show status gives total connections and not the currently used ones.)

    Read the article

  • FTP monitoring and downloading of new files

    - by Jojo
    Hi Guys, I have an FTP monitoring/downloading application using C# sockets. I got this error message: 421 Disconnecting you since you were inactive for 300 seconds. Can someone have an explanation for this? I did a search on this one but still I can't seem to find a good explanation. Thanks.

    Read the article

  • Nagios plug-in check_snmp receives NO SNMP data from a CISCO Router

    - by Shehryar
    I have tried setting up Nagios on Ubuntu 10.10, successfully installed and can login to web interface, I am however stuck on configuring snmp or I am doing something wrong here, i have followed various sites / nagios wiki to setup configuration (cfg) files. When I check on the web interface, it gives the following error on one of my cisco router: Current Status: UNKNOWN (for 0d 2h 55m 56s) Status Information: SNMP problem - No data received from host CMD: /usr/bin/snmpget -t 1 -r 5 -m RFC1213-MIB -v 1 [authpriv] 192.168.1.1:161 ifOperStatus.1 On the command-line itself, when I type the following, it just sits there waiting and waiting : sudo /usr/local/nagios/libexec/check_snmp -H 192.168.1.1 -C Routers -o sysUpTime.0 When I type the following command : I get an OK /usr/bin/snmpget -v1 192.168.1.1:161 1.3.6.1.2.1.1.5.0 -c "Routers" I have configured SNMP properly on our cisco device as we can collect SNMP Data via two other monitoring tool (SolarWinds and Manage Engine), we are tempted towards Nagios as its opensource. Will be grateful if someone could assist in rectifying this situation and guide me with setting up nagios to monitor Cisco Routers, Switches and a Few Servers. We want to monitor Bandwidth, cpu utilization, uptime and other necessary counters. Will be grateful for your assistance Thanks for reading Shehryar

    Read the article

  • Hyperic HQ- Monitor process statistics for 50+ processes on Linux machine

    - by Chris
    Is there an easy way to get metrics on all processes that start with the letters XYZ? I have about 80 processes that I have to monitor individually that all start with the prefix XYZ. I have created a query using the sigar shell: ps State.Name.sw=XYZ, which will give me a list of the processes that I want. What I need to do is define this list of processes through said query and collect and track statistics from the Process service: http://support.hyperic.com/display/hypcomm/Process+service What I need is 3 or 4 key statistics for each of the XYZ processes defined by my query to show up as graphs in the web front end. Note: Hyperic HQ server is installed on a windows machine and I'm monitoring a Linux box via an agent. Thanks, Chris Edit: Here is my try at a plugin that may give me what I want, but it's not being inventoried/detected by the Hyperic web UI. Simply pointing me to one of Hyperic's tutorials won't do. Thanks. <!DOCTYPE plugin [ <!ENTITY process-metrics SYSTEM "/pdk/plugins/process-metrics.xml">]> <plugin> <server name="ABCStats"> <config> <option name="process.query" description="Process Query" default="State.Name.sw=XYZ"/> </config> <metric name="Availability" alias="Availability" template="sigar:Type=ProcState,Arg=%process.query%:State" category="AVAILABILITY" indicator="true" units="percentage" collectionType="dynamic"/> &process-metrics; <plugin type="autoinventory"/> <plugin type="measurement" class="org.hyperic.hq.product.MeasurementPlugin"/> </server> </plugin>

    Read the article

  • Hyperic HQ- Monitor process statistics for 50+ processes on Linux machine

    - by Chris
    Is there an easy way to get metrics on all processes that start with the letters XYZ? I have about 80 processes that I have to monitor individually that all start with the prefix XYZ. I have created a query using the sigar shell: ps State.Name.sw=XYZ, which will give me a list of the processes that I want. What I need to do is define this list of processes through said query and collect and track statistics from the Process service: http://support.hyperic.com/display/hypcomm/Process+service What I need is 3 or 4 key statistics for each of the XYZ processes defined by my query to show up as graphs in the web front end. Note: Hyperic HQ server is installed on a windows machine and I'm monitoring a Linux box via an agent. Thanks, Chris Edit: Here is my try at a plugin that may give me what I want, but it's not being inventoried/detected by the Hyperic web UI. Simply pointing me to one of Hyperic's tutorials won't do. Thanks. <!DOCTYPE plugin [ <!ENTITY process-metrics SYSTEM "/pdk/plugins/process-metrics.xml">]> <plugin> <server name="ABCStats"> <config> <option name="process.query" description="Process Query" default="State.Name.sw=XYZ"/> </config> <metric name="Availability" alias="Availability" template="sigar:Type=ProcState,Arg=%process.query%:State" category="AVAILABILITY" indicator="true" units="percentage" collectionType="dynamic"/> &process-metrics; <plugin type="autoinventory"/> <plugin type="measurement" class="org.hyperic.hq.product.MeasurementPlugin"/> </server> </plugin>

    Read the article

  • 16TB Volumes and SNMP On Windows

    - by John K
    As volumes larger than 16TB became more common, it was recognized that the 32 bit value used to report disk size and usage within the standard "HOST-RESOURCES" MIB in SNMP was not large enough to report the proper disk size. Net-SNMP seems to have addressed this issue by simply manipulating the value of "AllocationUnits" to maintain a 32 bit value for disk utilization (since total disk size/usage is equal to the 32 bit space value times the allocation unit), to allow for the calculation of a volume larger than 8/16TB. Presuming you don't have any reporting interest in the allocation unit, this seems like a fine solution. https://bugzilla.redhat.com/show_bug.cgi?id=654384 Window's built in SNMP service, however, seems to continue to suffer from this error, simply reporting the modulo of the used/assigned disk space, resulting in inaccurate disk size reporting. Is there a way to enable Windows to correctly report disk usage for volumes over 16TB? We attempted to simply install Net-SNMP 5.5 x64 and disable Windows SNMP service entirely, however this unfortunately did not fix our issue. I've seen people in the Cacti community mention simply scripting out a solution. Unfortunately, we're using Observium for quick and basic systems monitoring. If the issue can't be correct on the Window's side, can Observium be made to report custom MIBs?

    Read the article

  • How can I setup BluePill to Monitor a Rails App Running via Passenger (mod_rails)

    - by Jim Jeffers
    I recently launched a site running phusion passenger. Unfortunately, the site went down due to a frozen thread. I was able to save the server by doing kill -9 to the specific PID. Still though, I thought passenger was able to manage this automatically. I have a server with 1GB of memory running one rails app with passenger allotted up to 7 instances. However, when I came to discover the site went down I found that passenger had spawned 6 instances with one of them using up over 800mb of memory causing the server to swap. As a result I am hoping to setup something like bluepill on the server but I'm slightly confused as to how you go about doing it. Mainly because bluepill expects to start/stop the processes it's monitoring. However, in our case, passenger already restarts processes for us so we only need to monitor the pids of passengers instances and kill them once they've gotten too large. Has anyone here setup BluePill to monitor a rails app running under phusion's passenger? Any insight would be useful.

    Read the article

  • How should I monitor memory usage/performance in SunOS/Solaris?

    - by exhuma
    Last week we decided to add some SunOS (uname -a = SunOS bbs-sam-belair 5.10 Generic_127128-11 i86pc i386 i86pc) machines into our running munin instance. First off, the machines are pre-configured appliances, so, I want to avoid touching the system too much without supervision of the service provider. But adding it to munin was fairly easy by writing a small socket-service (if anyone is interested, I put it up on github: https://github.com/munin-monitoring/contrib/tree/master/tools/pypmmn) Yesterday, I implemented/adapted the required plugins for our machines. And here the questions start: First, I have not found a way to determine detailed memory usage values. I get the total memory by running prtconf | grep Memory, and the free memory using vmstat. Fiddling together a munin-plugin, gives me the following graph: This is pretty much uninformative. Compare this to the default plugin for linux nodes which has a lot more detail: Most importantly, this shows me how much memory is actually used by applications. So, first question: Is it possible to get detailed memory information on SunOS with the default system tools (i.e. not using top)? Onto the next puzzle: Seeing the graphs, I noticed activity in the "Paging in/out" graphs, even though the memory graph still has unused memory: Upon further investigation, I found out that df reports that /tmp is mounted on swap. Drilling around on the web, I understood that df will display swap, but in fact, it's mounted as a tmpfs. Now I don't know if this explains the swap activity. The default munin-plugin for solaris uses kstat -p -c misc -m cpu_stat to get these values. I find it already strange that this is using the cpu_stat module. So maybe I simply misinterpret the "paging" graphs? Second question: Do the paging graphs indicate that parts of the memory are paged to disk? Or is the activity caused by file operations in /tmp?

    Read the article

  • Representing server state with a metric

    - by Sal
    I'm using Microsoft's Performance Monitor to dump logs of RAM, CPU, network, and disk usage from multiple servers. I'd like to get a single metric that captures the state of a given variable to a good extent. For instance, disk usage is pretty stable, so if I take a single reading that says I have 50% remaining disk space, that reading will give me an accurate measure for the day. (The servers aren't doing heavy IO writing.) However, the tricky part here is monitoring CPU and network usage. The logs currently dump the % CPU usage every ten seconds. If I take a straight average of the numbers, it may not represent reality, as % CPU will be much lower during the night than day. (We host websites that sell appliance items.) I'd like to get an average over a span during peak hours (about 5 hours in the day) and present a daily peak hour metric. Of course, there are most likely some readings that will come in as overly spiked (if multiple users pinged the server at once) or no use (a momentary idle state). Is there a standard distribution/test industries use in these situation?

    Read the article

  • Integrating Nagios with a ticketing system/incident mnagement system

    - by sektor
    Is there a free ticketing system/incident management system which will help me in achieving the following? 1) If a service goes down then Nagios alerts the on-duty staff and pushes the status to some backend or DB as a ticket, say the initial status is "New". 2) The on-duty staff logs in through a frontend and acknowledges the new ticket by marking it as "In progress", so now the status of the ticket changes from "New" to "In progress". 3) If even after "n" number of minutes no person from on-duty staff has changed the ticket status to "In progress" then Nagios alerts the next level of contacts. Although if the on-duty staff has acknowledged the ticket then there is no need to alert the next level. 4) When the service comes up Nagios closes the ticket by marking it "Closed" Now I already have Nagios monitoring set up and currently it alerts by sending text messages and mails, what I'm looking for is some framework which only escalates the issue(alerts the second level) if the first level(on-duty staff) fails to respond to the initial alert. By "responding to the alert" I mean, the on-duty staff can login via some frontend and basically change the status to something like "Acknowledged" or "In progress".

    Read the article

  • How can I setup a Proxy I can sniff traffic from using an ESX vswitch in promiscuous mode?

    - by sandroid
    I have a pretty specific requirement, detailed below. Here's what I'm not looking for help for, to keep things tidy and on topic: How to configure a standard proxy Any ESX setup required to facilitate traffic sniffing How to sniff traffic Any changes in design (my scope limits me) I need to setup a test environment for a network-sniffing based HTTP app monitoring tool, and I need to troubleshoot a client issue but he only has a prod network, so making changes to the config on client's system "just to try" is costly. The goal here is to create a similar system in my lab, and hit the client's webapp and redirect my traffic - using a proxy - into the lab environment. The reason I want to use a proxy is so that only this specific traffic is redirected for all to see, and not all my web traffic (like my visits to serverfault :P). Everything will run inside an ESX 4.1 machine. In there, there is a traffic collection vswitch in promiscuous mode that is not on the local network for security reasons. The VM containing our listening agent is connected to this vswitch. On the same ESX host, I will setup a basic linux server and install a proxy (either apache + mod_proxy or squid, doesn't matter). I'm looking for ideas on how to deploy this for my needs so I can then figure out how to set it up accordingly. Some ideas I've had were to setup two proxies, and have them talk to eachother through this vswitch in promiscuous mode, but it seems like alot of work. Another idea is a dual-homed proxy, but I've never seen/done that before so I'm not sure how doable it is for what I'd like. I am OK with setting up a second vswitch in promiscuous mode to facilitate this if need be, but I cannot put the vswitch on the lan (which is used so my browser would communicate with the proxy) in promiscuous mode. Any ideas are welcome.

    Read the article

  • How can I setup BluePill to Monitor a Rails App Running via Passenger (mod_rails)

    - by Jim Jeffers
    I recently launched a site running phusion passenger. Unfortunately, the site went down due to a frozen thread. I was able to save the server by doing kill -9 to the specific PID. Still though, I thought passenger was able to manage this automatically. I have a server with 1GB of memory running one rails app with passenger allotted up to 7 instances. However, when I came to discover the site went down I found that passenger had spawned 6 instances with one of them using up over 800mb of memory causing the server to swap. As a result I am hoping to setup something like bluepill on the server but I'm slightly confused as to how you go about doing it. Mainly because bluepill expects to start/stop the processes it's monitoring. However, in our case, passenger already restarts processes for us so we only need to monitor the pids of passengers instances and kill them once they've gotten too large. Has anyone here setup BluePill to monitor a rails app running under phusion's passenger? Any insight would be useful.

    Read the article

  • What Logs / Process Stats to monitor on a Ubuntu FTP server?

    - by Adam Salkin
    I am administering a server with Ubuntu Server which is running pureFTP. So far all is well, but I would like to know what I should be monitoring so that I can spot any potential stability and security issues. I'm not looking for sophisticated software, more an idea of what logs and process statistics are most useful for checking on the health of the system. I'm thinking that I can look at various parameters output from the "ps" command and compare to see if I have things like memory leaks. But I would like to know what experienced admins do. Also, how do I do a disk check so that when I reboot, I don't get a message saying something like "disk not checked for x days, forcing check" which delays the reboot? I assume there is command that I can run as a cron job late at night. How often should it be run? What things should I be looking at to spot intrusion attempts? The only shell access is SSH on a non-standard port through UFW firewall, and I regularly do a grep on auth.log for "Fail" or "Invalid". Is there anything else I should look at? I was logging the firewall (UFW) but I have very few open ports (FTP and SSH on a non standard port) so looking at lists of IP's that have been blocked did not seem useful. Many thanks

    Read the article

  • How to find the reason for a weekly downtime on an Ubuntu web server hosted by AWS?

    - by IceSheep
    We started monitoring our web server using Pingdom and found out that we have a downtime of a few minutes every Sunday at 0:00 UTC. The test runs every minute and checks if a successful HTTP response (code 200) is returned on port 80. The test fails due to a timeout (no response after 30 seconds). Here's what we've already checked – without success: Since we run our webserver behind a load balancer, I've set the Pingdom test on the load balancer's public DNS and the webserver's public DNS in order to find out if there's a problem with the AWS load balancer – both tests return the same result We set up Munin on our webserver. Everything looked fine even after the failure. Since the last failure lasted only 2 minutes I suppose Munin couldn't capture a potential problem (it only checks every 5 minutes) I have checked /var/log/apache2/error.log and /var/log/syslog for suspicious entries I have checked /etc/cron.weekly and /etc/crontab for suspicious entries I have searched for files created or last-modified during 0:00 and 0:15 using this method: touch -t 201209020000 start touch -t 201209020015 end find / -newer start -and ! -newer end (nothing found) Has anybody experienced a similar problem? Any proposals on how to find the reason for this behavior? It's Ubuntu 10.04 LTS running on an AWS m1.large instance. Thanks!

    Read the article

  • What should I use to ping multiple IPs and get notified of time outs?

    - by HumanVirus
    I've been using MultiPing to ping hundreds of IPs (from access points and such) and check their performance (packet loss, latency) and uptime. The program is very easy to use, but I was wondering if someone could recommend me something that would work better and that would also work in Linux. The features I'm looking for are: Notification Types: At least desktop notifications and SMS, but it would be great if it also had e-mail, IM, or other types of notifications. (MultiPing has some of these, but they don't work too well.) Being notified about the root problem only: Since some devices are dependent on others, I'd like to be notified only about the root problem. E.g. Let's say I have A[x.x.x.222]B[x.x.x.33C[x.x.x.44]D[x.x.x.55], and B goes down, therefore C and D will also be down. Is it possible to get a notification only about B being down? Light on resources. Ideally multiplatform or at least available for both Linux and Windows. I've heard about Nagios and Shinken being used for monitoring. Would you recommend that I use something of the sort or would that be too much for my needs? If using Nagios, Shinken, or similar software is recommended, can anyone tell me what sites I should go to or what books I should get that would be good for someone who is totally new at this? I'd appreciate any suggestions.

    Read the article

  • What Logs / Process Stats to monitor on a Ubuntu FTP server?

    - by Adam Salkin
    I am administering a server with Ubuntu Server which is running pureFTP. So far all is well, but I would like to know what I should be monitoring so that I can spot any potential stability and security issues. I'm not looking for sophisticated software, more an idea of what logs and process statistics are most useful for checking on the health of the system. I'm thinking that I can look at various parameters output from the "ps" command and compare to see if I have things like memory leaks. But I would like to know what experienced admins do. Also, how do I do a disk check so that when I reboot, I don't get a message saying something like "disk not checked for x days, forcing check" which delays the reboot? I assume there is command that I can run as a cron job late at night. How often should it be run? What things should I be looking at to spot intrusion attempts? The only shell access is SSH on a non-standard port through UFW firewall, and I regularly do a grep on auth.log for "Fail" or "Invalid". Is there anything else I should look at? I was logging the firewall (UFW) but I have very few open ports (FTP and SSH on a non standard port) so looking at lists of IP's that have been blocked did not seem useful. Many thanks

    Read the article

  • IRP_MJ_WRITE latency up to 15 seconds

    - by racitup
    We have written an application that performs small (22kB) writes to multiple files at once (one thread performing asynchronous queued writes to multiple locations on behalf of other threads) on the same local volume (RAID1). 99.9% of the writes are low-latency but occasionally (maybe every minute or two) we get one or two huge latency writes (I have seen 10 seconds and above) without any real explanation. Platform: Win2003 Server with NTFS. Monitoring: Sysinternals Process Monitor (see link below) and our own application logging. We have tried multiple things to try and solve this that have been gleaned from a few websites, e.g.: Making the first part of file names unique to aid 8.3 name generation Writing files to multiple directories Changing Intel Disk Write Caching Windows File/Printer Sharing Minimize memory used Balance Maximize data throughput for file sharing Maximize data throughput for network applications System-Advanced-Performance-Advanced NtfsDisableLastAccessUpdate - use fsutil behavior set disablelastaccess 1 disable 8.3 name generation - use "fsutil behavior set disable8dot3 1" + restart Enable a large size file system cache Disable paging of the kernel code IO Page Lock Limit Turn Off (or On) the Indexing Service But nothing seems to make much difference. There's a whole host of things we haven't tried yet but we wondered if anyone had come across the same problem, a reason and a solution (programmatic or not)? We can reproduce the problem using IOMeter and a simple setup: Start IOMeter and remove all but the first worker thread in 'Topology' using the disconnect button. Select the Worker thread and put a cross in the box next to the disk you want to use in the Disk Targets tab and put '2000000' in Maximum Disk Size (NOTE: must have at least 1GB free space; sector size is 512 bytes) Next create a new access specification and add it to the worker thread: Transfer Request Size = 22kB 100% Sequential Percent of Access Spec = 100% Percent Read/Write = 100% Write Change Results Display Update Frequency to 5 seconds, Test Setup Run Time to 20 seconds and both 'Number of Workers to Spawn Automatically' settings to zero. Select the Worker Thread in the Topology panel and hit the Duplicate Worker button 59 times to create 60 threads with identical settings. Hit the 'Go' button (green flag) and monitor the Results tab. The 'Maximum I/O Response Time (ms)' always hits at least 3500 on our machine. Our machine isn't exactly slow (Xeon 8 core rack server with 4GB and onboard RAID). I'd be interested to see what other people get. We have a feeling it might be something to do with the NTFS filesystem (ours is currently 75% full of fragmented files) and we are going to try a few things around this principle. But it is also related to disk performance since we don't see it on a RAMDisk and it's not as severe on a RAID10 array. Any help is much appreciated. Richard Right-click and select 'Open Link in New Tab': ProcMon Result

    Read the article

  • Splitting an internet connection between multiple separate subnetworks

    - by pythonian4000
    Problem I have an internet connection that I want to split between four separate networks. My requirements are: I need to be able to monitor the amount of bandwidth and data being used by each network, and notify or control as necessary. The four networks should only be able to connect to the internet, not each other. My parents need to be able to operate it, so it needs a simple, preferably Windows-based GUI. Progress so far Server I have a mini-ITX server with six Gigabit ethernet ports - one for the ethernet internet connection, one for each of the four networks, and one for remote access to the server for administration. Bandwidth control I spent a long time researching solutions here. The majority of the control systems/software I found could control bandwidth usage via QOS, but could not monitor or control the amount of data being used. Eventually I found the SoftPerfect Bandwidth Manager, which has everything I need in terms of monitoring and control - per-interface quota management, usage statistics, a web interface for checking usage, and email notifications when quotas are exceeded. It is also Windows-based and has a simple GUI. Internet sharing This is where I am having issues. I am currently using Windows XP Pro SP2 for the server (yes, I know this is far from ideal, but it's the only spare Windows OS I currently have). I can't use the built-in Internet Connection Sharing for several reasons: The upstream internet router has an IP of 192.168.0.1 which ICS clashes with, and I cannot change the router settings. ICS can only share an internet connection with a single interface, but I have four. I have tried bridging the four network cards, but then the Bandwidth Manager cannot see the four individual interfaces - it only sees the bridge. I have tried setting up Dual DHCP DNS server (and am having issues getting DHCP offers to be received by clients), but that would still require gateway software of some sort, which I have been unable to find. My current attempt is to use OpenVPN, with a server for the internet NIC and a separate client for each of the four networks. My thought is that I could bridge the OpenVPN TAP devices to each NIC, meaning that the Bandwidth Manager would control traffic from the bridge instead of the interface. I have not made much progress here though - I've never used OpenVPN before. Questions Is there a Windows software package that does everything I need? (Unlikely, I know) Is there a Windows software package that will share internet between multiple NICs without bridging? Are either of my about attempts feasible? Would it help to have a newer/server version of Windows? Is there a non-Windows alternative that is easy to use?

    Read the article

  • Snmpd update interface counters slowly or something like this

    - by Korjavin Ivan
    I update one my freebsd box to 9-stable (totally new installation) and install net-snmp for monitoring. uname -r 9.1-PRERELEASE pkg_info net-snmp-5.7.1_7 Information for net-snmp-5.7.1_7: Comment: An extendable SNMP implementation .... cat /var/db/ports/net-snmp/options # This file is auto-generated by 'make config'. # Options for net-snmp-5.7.1_7 _OPTIONS_READ=net-snmp-5.7.1_7 _FILE_COMPLETE_OPTIONS_LIST= IPV6 MFD_REWRITES PERL PERL_EMBEDDED PYTHON DUMMY TKMIB DMALLOC MYSQL AX_SOCKONLY UNPRIVILEGED OPTIONS_FILE_UNSET+=IPV6 OPTIONS_FILE_UNSET+=MFD_REWRITES OPTIONS_FILE_SET+=PERL OPTIONS_FILE_SET+=PERL_EMBEDDED OPTIONS_FILE_UNSET+=PYTHON OPTIONS_FILE_SET+=DUMMY OPTIONS_FILE_UNSET+=TKMIB OPTIONS_FILE_SET+=DMALLOC OPTIONS_FILE_UNSET+=MYSQL OPTIONS_FILE_UNSET+=AX_SOCKONLY OPTIONS_FILE_UNSET+=UNPRIVILEGED I have about 500 vlan on this machine, and collect info about interface through snmpd to 2 different software, zabbix and cacti. And both of them plot the graphs with blank fields. I tryed change polling time in zabbix, from 15, sec to 30,60,90,120,10. And anyway i have blank fields. snmpd.conf is empty - only a access controls. This configuration worked fine on freebsd 8. Where is my fault? How fix this graphs? UPD: Changing pooling time, switch off one of agent, doesnt help. I look at zabbix log (recieved data from snmpd) and see that: sorry for russian locale, just look at numbers: and thats is not true, as my "iftop" show speed was about 90Mbits, but snmpd return 2Mbits. I understand that snmpd doesnt return speed, it return just a counter. But how its possible? why 2Mbit/s ? I tryed recompile snmpd with 64-bit counters, and without it. In both variants this blank fields present. So i think its my OS (freebsd) doesnt update interface counters well. I still collect tcpdump for found this request/response. But have problem with that, to much trash. UPD2: I decrypt tcpdump-ed file, and public this as google doc at gdocfile Timediff looks strange.. Like zabbix sometimes "forget" do request, and then do twice at row, ehh UPD3: I parse log from command "while true; do netstat -bin -I vlan4008 /var/log/netstat; sleep 300; done" and load as google docs, and add formula for speed : link Looks like all counters in OS are good. Now i think problem in : 1. zabbix get request twice at row (and what about cacti) 2. snmpd use counter32

    Read the article

  • Sysadmin 101: How can I figure out why my server crashes and monitor performance?

    - by bflora
    I have a Drupal-powered site that seems to have neverending performance problems. It was butt-slow about 5 months ago. I brought in some guys who installed nginx for anonymous visitors, ajaxified a few queries so they wouldn't fire during page load, and helped me find a few bottlenecks in the code. For about a month, the site was significantly faster, though not "fast" by any stretch of the word. Meanwhile, I'm now shelling out $400/month to Slicehost to host a site that gets less than 5,000/uniques a day. Yes, you read that right. Go Drupal. Recently the site started crashing again and is slow again. I can't afford to hire people to come in, study my code from top to bottom, and make changes that may or may not help anymore. And I can't afford to throw more hardware at the problem. So I need to figure out what the problem is myself. Questions: When apache crashes, is it possible to find out what caused it to crash? There has to be a way, right? If so, how can I do this? Is there software I can use that will tell me which process caused my server to die? (e.g. "Apache crashed because someone visited page X." or "Apache crashed because you were importing too many RSS items from feed X.") There's got to be a way to learn this, right? What's a good, noob-friendly way to monitor my current apache performance? My developer friends tell me to "just use Top, dude," but Top shows me a bunch of numbers without any context. I have no clue what qualifies as a bad number or a good number in Top, or which processes are relevant and which aren't. Are there any noob-friendly server monitoring tools out there? Ideally, I could have a page that would give me a color-coded indicator about how apache is performing and then show me a list of processes or pages that are sucking right now. This way, I could know when performance is bad and then what's causing it to be so bad. Why does PHP memory matter? My apparently has a 30MB memory foot print. Will it run faster if I bring that number down? Thanks for any advice. I spent a year or so trying to boost my advertising income so I could hire a contractor to solve my performance woes. I didn't want to have to learn all this sysadmin voodoo. I'm now resigned to the fact that might not have a choice.

    Read the article

  • Windows DHCP Server - get notification when a non-AD joined device gets an IP address

    - by TheCleaner
    SCENARIO To simplify this down to it's easiest example: I have a Windows 2008 R2 standard DC with the DHCP server role. It hands out IPs via various IPv4 scopes, no problem there. WHAT I'D LIKE I would like a way to create a notification/eventlog entry/similar whenever a device gets a DHCP address lease and that device IS NOT a domain joined computer in Active Directory. It doesn't matter to me whether it is custom Powershell, etc. Bottom line = I'd like a way to know when non-domain devices are on the network without using 802.1X at the moment. I know this won't account for static IP devices. I do have monitoring software that will scan the network and find devices, but it isn't quite this granular in detail. RESEARCH DONE/OPTIONS CONSIDERED I don't see any such possibilities with the built in logging. Yes, I'm aware of 802.1X and have the ability to implement it long-term at this location but we are some time away from a project like that, and while that would solve network authentication issues, this is still helpful to me outside of 802.1X goals. I've looked around for some script bits, etc. that might prove useful but the things I'm finding lead me to believe that my google-fu is failing me at the moment. I believe the below logic is sound (assuming there isn't some existing solution): Device receives DHCP address Event log entry is recorded (event ID 10 in the DHCP audit log should work (since a new lease is what I'd be most interested in, not renewals): http://technet.microsoft.com/en-us/library/dd759178.aspx) At this point a script of some kind would probably have to take over for the remaining "STEPS" below. Somehow query this DHCP log for these event ID 10's (I would love push, but I'm guessing pull is the only recourse here) Parse the query for the name of the device being assigned the new lease Query AD for the device's name IF not found in AD, send a notification email If anyone has any ideas on how to properly do this, I'd really appreciate it. I'm not looking for a "gimme the codez" but would love to know if there are alternatives to the above list or if I'm not thinking clear and another method exists for gathering this information. If you have code snippets/PS commands you'd like to share to help accomplish this, all the better.

    Read the article

  • Problem with Hyperic monitoring on CloudFoundry - frequent alerts

    - by Pavel P
    Hi, I'm running single instance CloudFoundry configuration with one web application. I turned on Hyperic monitoring with notification for case of web app unavailability. Now I randomly receive alert emails (Subject "An alert has been triggered - Deployment myapp - context unavailable") that the application is not running, but it obviously is running fine. In access log of Apache I see two requests every 15 seconds: 127.0.0.1 - - [17/Mar/2010:15:37:33 +0100] "GET /server-status?auto HTTP/1.1" 200 438 "-" "Jakarta Commons-HttpClient/3.1" 127.0.0.1 - - [17/Mar/2010:15:37:33 +0100] "GET /myapp HTTP/1.1" 200 - "-" "Jakarta Commons-HttpClient/3.1" At the time when I get the alert emails, everything in log still seems to be fine - two requests. Do you have idea what could be wrong? Did anybody have this kind of problem and solve it? Thanks, P

    Read the article

< Previous Page | 18 19 20 21 22 23 24 25 26 27 28 29  | Next Page >