Search Results

Search found 334 results on 14 pages for 'nagios'.

Page 12/14 | < Previous Page | 8 9 10 11 12 13 14 | Next Page >

application monitoring tools

- by Shachar

We're an ISV about to deploy our SaaS application over the internet to our end users, and we are currently looking for an application monitoring solution. In addition to monitoring the usual OS-level suspects (I/O, disk space, logs, CPU, RAM, swapping, etc.), we also want to monitor, alert, and report on internal application events, conditions, and counters (think queue size for an internal service, or latency of a service we're getting from a third party via custom APIs). We started looking at Nagios, Zenoss, etc., but found that those do only low-level stuff, and we are currently looking at MOM and ManageEngine. Still, they are far from being a custom app monitoring tool. So, do you have anything to suggest?

Monitoring System for the cloud?

- by Maxim Veksler

I need a monitoring system, much like Ganglia / Nagios, that is built for the cloud. I need it to support:
• Adding / removing nodes dynamically (a node shutting down does not imply node failure).
• Dynamic node-based categorization, meaning a node can identify itself as being part of group X (Ganglia gets this almost right, but lacks the dynamic part).
• Operation without multicast support (generally not allowed in cloud-based setups).
• Plugins for recent cool stuff such as Hadoop, Cassandra, and Mongo would be cool.
More features include an external API, a web interface, and so on. I've looked at Ganglia and Munin, and they both seem to be almost there (but not exactly). I would also go for a reasonably priced Software as a Service solution. I'm currently doing research, so suggestions are highly appreciated. Thank you, Maxim

Spawn phone call from EC2 alerts

- by Matt

I have a system set up on AWS/EC2 that currently uses their CloudWatch alert system. The problem is that it sends alerts only by email, when ideally I would like it to make a phone call and/or send text messages to certain phone numbers when an alert fires (note that I do not need the phone call to actually say anything, just call the person). We are trying to solve the problem that Amazon alerts are only useful if people are checking their email, which isn't always the case because all server problems love to happen at 4am on Saturday. Please respond with any possible solutions/ideas; ideally I do not want to implement an entire monitoring system (e.g. Nagios) on top of everything to handle this.

How to refresh open source software pkg manager on oldish OpenSolaris?

- by Luke404

I've been presented with an OpenSolaris VPS, actually a Solaris Container, which is based on SXCE snv_121 and has been active since mid 2007: the good old Sun days, IIRC even before the Indiana stuff! For various reasons the system itself can't be rebuilt/upgraded, but we can do whatever we want with the additional package manager on it. My Solaris skills, and especially my knowledge of the free package manager ecosystem, are a bit rusty, so I don't know what I can actually use while keeping the somewhat oldish base system. Currently there is pkg-get using some older Blastwave mirror; it has been used to install things such as Apache2, PHP, Python, and Nagios. I would like to remove all the old rusty stuff and all of Blastwave, and start fresh with some newer package distribution. Can the current Blastwave system be used on that snv_121? Is there any better alternative still compatible with that system (e.g. OpenCSW or anything else)?

SNMP keeps crashing

- by jldugger

We're using OpsView/Nagios to monitor our servers. We've added the SNMP service to all our servers and deployed the configuration via GPO, but one win2k3 server seems to have a problem: the service crashes pretty regularly. The event log carries messages like:

    Event Type: Error
    Event Source: Service Control Manager
    Event Category: None
    Event ID: 7034
    Date: 6/11/2009
    Time: 7:11:49 PM
    User: N/A
    Computer: HOSTNAME
    Description: The SNMP Service service terminated unexpectedly. It has done this 2 time(s).

and also:

    Event Type: Error
    Event Source: Application Error
    Event Category: (100)
    Event ID: 1000
    Date: 6/11/2009
    Time: 7:11:18 PM
    User: N/A
    Computer: HOSTNAME
    Description: Faulting application snmp.exe, version 5.2.3790.3959, faulting module ntdll.dll, version 5.2.3790.3959, fault address 0x000417af.

Now, I could probably set it to simply restart on crash in perpetuity, but I think it's better to fix problems like this. Is this a known problem? If not, what should I do to diagnose it?

Disk space mismatch on OS X Server (Leopard)

- by John Gardeniers

My Nagios system sent me an alert to inform me that the disk space on one of the drives on our OS X server is very low. When I run

    df /Volumes/Apps/

I get

    /dev/disk0s3 117209520 114932472 2277048 99% /Volumes/Apps

When I run

    du -c /Volumes/Apps

it reports

    11489944 total

Why might there be such a vast difference? Even more importantly, how do I find the problem and what can I do about it? I'm essentially just a Windows admin, so am well out of my comfort zone here. I use a Mac but I'm not a Mac admin in any real sense of the word.

Lightweight monitoring for a Windows XP laptop

- by kazanaki

Hello, I have a Windows XP laptop in a remote location. I would like to have an overview of its CPU/memory statistics from another location. Monitoring a specific service (a Tomcat instance) would be nice but not essential. I have seen the monitoring solutions (Nagios, Cacti, etc.) and they are all very heavy. I do not want to install MySQL, a web server, and other stuff like that on the laptop. I don't even need a web solution at all. It could just be a simple command-line app listening on a server port, with another GUI application (not a web browser) on my machine connecting to it. Is there something like this available?

Android software for the system administrator on the move

- by GruffTech

My company gets its service through Verizon, and AT&T service in the area is "shoddy" at best, so I haven't been able to join the "iPhone party" like so many of my fellow system administrators have. That being said, this week a phone I like has finally hit Verizon: the HTC Incredible. (I've been waiting for the Desire or Nexus One, but after seeing spec sheets and reviews, the HTC Incredible comes out ahead anyway.) So (finally) I'm looking for Android apps that are "gotta-haves" for system administrators. I've found the three below. If there are others you prefer over these, let me know.
• RDP program: RemoteRDP
• SSH client: ConnectBot
• Nagios: NagMonDroid
Reply with your favorite Android app and why!

Monitoring ASA packet loss via SNMP

- by dunxd

I want to monitor packet loss on my ASA 5505 VPN endpoints using SNMP. This is so I can graph the rates in Cacti and/or get alerts in Nagios. However, I am not sure what SNMP values I should use to measure packet loss. In the ASA I can run sh interface Internet stats to show traffic statistics for the interface connected to the Internet. This shows 1 minute and 5 minute drop rates. Are these measures an indicator of packet loss? Are there SNMP values I can access that correspond to those values? Should I be looking at different values? Is the ASA even able to measure packet loss?

Monitoring several remote servers over different VPNs

- by Ciaran

I'm a developer with about 20 different clients running our server application. I access each of the clients' servers remotely through VPN to provide support, updates, etc. Is there any tool available that I can set up locally that will connect through each of the VPNs automatically to allow me to monitor them? The idea sounds very far-fetched to me, as the VPN software varies a good bit, but maybe someone has had to do something similar before? It's been a few years since I last used Nagios, but I think it'd be quite cool to have that set up pointing at each of the remote servers through VPN somehow.

Restarting rsyslog re-sends logs again

- by Jay Taylor

I am running Ubuntu 12.04.1 LTS on EC2. I have a bunch of application servers which are configured to forward their logs to a central server via rsyslog. Since putting Nagios monitoring on the log files on the central server, I've been getting alerts indicating that particular application servers are failing to forward their logs to the central server. Logging into the machines and restarting the rsyslog service fixes the problem. However, rsyslog then re-transmits logs it has already sent, resulting in duplicates on the collector. Why is it doing this?

CLI-Based monitoring tool for KVM

- by Pinnacle

I am developing a scheduler for running VMs on KVM. The scheduling over-commits resources like memory and CPU. For this, I need a CLI-based monitoring tool that keeps giving me information about the resource usage of each VM, because it might be the case that, due to over-provisioning of resources, VMs on a particular host are running very slowly depending on the benchmarks/programs each VM is running, and then I need to migrate a VM to another host, and so on. I looked into libvirt-based tools like collectd, Munin, Nagios-virt, etc. ( http://libvirt.org/apps.html#monitoring ). I also looked into the Ubuntu utility perf-kvm ( http://manpages.ubuntu.com/manpages/maverick/man1/perf-kvm.1.html ). I want to ask which CLI-based tool the community would recommend, so that I can build an automated scheduler that takes care of the above situation.

Server monitoring for medium scale UNIX network

- by nbartolomeo

I'm looking for suggestions for a good monitoring tool, or tools, to handle a mixed Linux (RedHat 4-5) and HPUX environment. Currently we are using Hobbit, which is working reasonably well, but it is becoming harder to keep track of what alerts are sent out for what servers. Features I'd like to see:
• Easy configuration of servers.
• The ability to monitor CPU, network, memory, and specific processes.
I've looked into Nagios, but from what I have seen it won't be easy to set up the configuration for all of our servers (~200), and without installing a plugin into each agent I won't be able to monitor processes.

What are best monitoring tool customizable for cluster / distributed system?

- by Adil

I am working on a system with multiple servers. I am interested in monitoring some server-specific data like CPU/memory usage, disk/filesystem usage, network traffic, system load, etc., as well as some data specific to my own processes. What open source tools are available that can serve my purpose? Ideally the tool would let me customize the parameters to be monitored and monitor my own data by creating a plugin / agent. Any suggestions? I have heard of Nagios, Zabbix, and Pandora, but I'm not sure if they provide such an interface.

Network monitoring library, or objects, for a cloud

- by Andrew Smith

I am looking for a library to support server / switch monitoring, to actually be able to check with the device whether it's working OK. However, this requires some sort of auto-detection and device support. Basically I need to automatically detect a new device and start monitoring things like its CPU and ping response. So how do I auto-detect the machine remotely? This is what I need a library for. Rackspace has something like this: the "Cloud Monitoring API". But is there anything open source which can be used the same way? Nagios and others don't have such an API, and the big and expensive systems are too big to handle in a public cloud, so there must be some other network monitoring engine with an API which can add new servers automatically and support user isolation, for example so that I don't see other servers except mine.

Graphing/charting of CPU utilisation [on hold]

- by Peter

Nagios can be good at graphing particular resource utilisation or other metrics, but I'm looking for a tool that can output a chart or other graphical representation of how much CPU time/CPU utilisation /all/ services on a server are currently consuming. I think New Relic could probably achieve this to an extent, but I was wondering if there is a popular open source app used for this. In case I am explaining this badly, my actual problem is that I have a shared server with suexec enabled (i.e. httpd CGI running under multiple user accounts). I'd like to know which users are using the most CPU during different periods of the day.

How to manage configuration software installations of non-domain Windows XP machines?

- by Digi

I have a large set of unattended Windows XP machines that are not connected to a domain or even to each other. I am struggling to find any tool that I can use to manage them all from one application. I am hoping to find software with a client I can install on each machine, which would then essentially proxy out configuration information and possibly commands (install, uninstall, stop service, etc.) across the whole network. The closest I've come is Nagios and its client, but it cannot be used to push files through and run commands remotely. Any suggestions?

Debian Wheezy: installing from sources or repositories? upgrading to new software release?

- by user269842

a. For some software, I'm wondering whether it is wiser to install from source or from the official repositories when available, for example: GLPI inventory, Fusion Inventory, and monitoring tools like Nagios. I tried both for GLPI: compiling from source and installing from the repositories. I also installed Zabbix from source.
b. What about new software releases that provide enhancements: is it better to keep the release installed from the repositories or compiled, or is there a 'best practice' such as downloading the new release and compiling it again (I really have no clue)? Could someone make this clearer for me? Thanks!

Enterprise Level Monitoring Solution

- by Garthmeister J.

My company is currently looking to replace our current solution used for monitoring our web-based enterprise solutions for both up-time and performance. Please note this is not intended to be a network monitoring-type solution (internally we currently use Nagios). If anyone has a provider that they have had a positive experience with, it would be much appreciated. Here is a list of our requirements:
• Must have a large number of probes/agents around the globe to be representative of our customer base
• Must have a flexible scripting capability to automate multi-step user actions
• 24 hour a day monitoring
• Flexible alerting system
• Report generation capability
• Mimic browser specific monitoring (optional, not a must-have)

Windows 2008 Server network issues

- by Snowflow

I have this one server that just doesn't want to be on the internet. It's a new server, a twinblade; the other twin works, but not this one. It can connect fine to everything else on the LAN, but cannot go out on the net. It can be reached by ICMP requests over the net (the Nagios server can probe it, but not ping it, for instance), but not TCP. Everything seems fine in both the firewall and the machine; I see no issues. Would anyone care to help me figure out where to start looking? I'm seriously confused. Edit: it can ping the gateway and through the SonicWall site-to-site VPN, and it's also able to resolve DNS. The only thing it can't do is reach anything outside of the LAN/VPN.

Too many files open issue (in CentOS)

- by Ram

Recently I ran into this issue on one of our production machines. The actual error from PHP looked like this: fopen(dberror_20110308.txt): failed to open stream: Too many open files. I am running a LAMP stack along with memcache on this machine. I also run a couple of Java applications on it. While I did increase the limit on the number of files that can be opened to 10000 (from 1024), I would really like to know if there is an easy way to track this (number of files open at any moment) as a metric. I know lsof is a command which will list the file descriptors opened by processes. I'm wondering if there is a better way (in terms of reporting) of tracking this using, say, Nagios.
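One possible way to turn the open-file count into a trackable metric is a small Nagios-style check plugin. The sketch below is only an illustration under stated assumptions: it assumes a Linux host where `/proc/<pid>/fd` is readable by the user running the check, it follows the usual plugin convention of exit codes 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN, and the function names and the 80%/90% thresholds are hypothetical choices, not something taken from the question.

```python
"""Nagios-style check sketch: open file descriptors of one process (Linux)."""
import os

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # conventional plugin exit codes


def count_open_fds(pid):
    """Count entries in /proc/<pid>/fd; each entry is one open descriptor."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def evaluate(count, limit, warn_ratio=0.8, crit_ratio=0.9):
    """Map a descriptor count to (exit_code, status_line).

    The thresholds are illustrative assumptions. The text after '|' is
    performance data in the label=value;warn;crit;min;max form, which
    graphing add-ons can pick up.
    """
    warn, crit = int(limit * warn_ratio), int(limit * crit_ratio)
    perfdata = f"open_fds={count};{warn};{crit};0;{limit}"
    if count >= limit * crit_ratio:
        return CRITICAL, f"CRITICAL - {count}/{limit} descriptors open | {perfdata}"
    if count >= limit * warn_ratio:
        return WARNING, f"WARNING - {count}/{limit} descriptors open | {perfdata}"
    return OK, f"OK - {count}/{limit} descriptors open | {perfdata}"


def main(argv):
    """Hypothetical usage: check_open_fds.py <pid> <limit>."""
    try:
        pid, limit = int(argv[1]), int(argv[2])
        code, line = evaluate(count_open_fds(pid), limit)
    except (IndexError, ValueError, OSError) as exc:
        code, line = UNKNOWN, f"UNKNOWN - {exc}"
    print(line)
    return code
```

To use it as a plugin, a wrapper script would end with `sys.exit(main(sys.argv))` so that the monitoring server sees the exit code. Note that this counts descriptors for a single process, which matches a per-process limit such as the 1024 to 10000 change mentioned above; a system-wide count would need a different source.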

Newbie, deciding Python or Erlang

- by Joe

Hi guys, I'm an administrator (Unix, Linux, and some Windows apps such as Exchange) by experience and have never worked with any programming language besides C#, plus scripting in Bash and lately in PowerShell. I'm starting out as a service provider and using multiple open source network/server monitoring tools (Nagios, OpenNMS, etc.) to monitor my customers' systems. At this moment, being inspired by a design that I came up with to do more than what is currently available in open source, I would like to start programming and test some of these ideas. The requirement is a server application that captures a stream of data and stores it in a database (CouchDB or MongoDB preferably), while the client side (an agent installed on a server) sends this stream of data on a schedule of every 10 minutes or so. For these two core ideas, I have been reading about Python and Erlang, besides Ruby. I plan to run the server platform on either Amazon or Rackspace. This gives me the scalability needed when we have more customers with many servers. For that reason alone, I thought Erlang was a better fit (I could be totally wrong, I'm new to this game), and I understand that Erlang has limited support in some ways compared to Ruby or Python. But I'm also totally new to the programming realm of things, and any advice would be greatly appreciated. Jo
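To make the agent half of that design concrete, here is a minimal Python sketch, offered as one possible shape rather than a recommendation between the languages. Everything named here is an assumption for illustration: the collector URL is a placeholder, the choice of metrics (load average and root disk usage) is arbitrary, and `os.getloadavg` is only available on Unix-like systems. It uses only the standard library.

```python
import json
import os
import shutil
import socket
import time
import urllib.request

# Hypothetical collector endpoint; replace with the real server address.
COLLECTOR_URL = "http://collector.example.invalid/metrics"
INTERVAL_SECONDS = 600  # "every 10 minutes or so", as in the question


def collect_sample():
    """Gather one sample as a plain dict, ready to store as a JSON document."""
    load1, load5, load15 = os.getloadavg()  # Unix only
    disk = shutil.disk_usage("/")
    return {
        "host": socket.gethostname(),
        "timestamp": int(time.time()),
        "load": {"1m": load1, "5m": load5, "15m": load15},
        "disk_root": {"total": disk.total, "used": disk.used, "free": disk.free},
    }


def encode(sample):
    """Serialize a sample to UTF-8 JSON bytes for the HTTP request body."""
    return json.dumps(sample, sort_keys=True).encode("utf-8")


def send(sample, url=COLLECTOR_URL):
    """POST one sample to the collector and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=encode(sample),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def run_forever():
    """Send a sample every INTERVAL_SECONDS; a failed send is just logged."""
    while True:
        try:
            send(collect_sample())
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            print(f"send failed, will retry next cycle: {exc}")
        time.sleep(INTERVAL_SECONDS)
```

Because each sample is a self-describing JSON document, it maps naturally onto document stores such as the CouchDB or MongoDB options mentioned above. The server side and any buffering of samples during outages are left out of this sketch.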

The Mystery of the Vanishing Disk Space

- by Oddthinking

My disk space is dwindling by about 2GB a day! I only have a few more days before I run out of space.

    $ df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda4       143G  126G   11G  93% /
    udev            491M  4.0K  491M   1% /dev
    tmpfs           200M  696K  199M   1% /run
    none            5.0M     0  5.0M   0% /run/lock
    none            499M  144K  499M   1% /run/shm
    /dev/sda2       1.9G  580M  1.2G  33% /tmp
    /dev/sda1        92M   29M   58M  33% /boot

I have been searching for the biggest directories/log files, deleting and compressing. But I am still losing the war. Finally, I realised I have a big misunderstanding:

    julian@server1:~$ sudo du -h / | tail -n 1
    16G /

All of my files in / only add up to 16 GB. That leaves 110 GB unaccounted for! Clearly I have a misunderstanding: I thought the '/dev/sda4' line represented all the files visible from '/'. What should I be reading to understand where the other storage has gone?

More details: I have an Ubuntu 11.10 server that was set up by data-center staff. It is running:
• my own code (which is fairly prolific with log files, but otherwise doesn't store much stuff on the drive)
• duplicity for backups (which tends to store a lot of signature files)
• various other standard services, like Apache, Nagios, etc. They are very lightly used.
It has been up for about 4 months without a reboot.

I lied about the du output (simplified it for effect). It also complained about not being able to access GVFS and the du process's own resources. I believe they are irrelevant:

    du: cannot access `/home/julian/.gvfs': Permission denied
    du: cannot access `/proc/10841/task/10841/fd/4': No such file or directory
    du: cannot access `/proc/10841/task/10841/fdinfo/4': No such file or directory
    du: cannot access `/proc/10841/fd/4': No such file or directory
    du: cannot access `/proc/10841/fdinfo/4': No such file or directory

Synchronizing 3 servers over IP

- by user93078

I'm setting up a medical server for a hospital that has doctors in 3 different locations, meaning there would be 3 servers (1 in each location). All 3 servers would have just the following software:
• Ubuntu Server 12.04 minimal
• MySQL, PHP 5, Apache
• The medical software, which would read/write to the MySQL database
• Remote admin apps like Nagios & Webmin
• Rsync for backup (rsync-over-ssh) as a cron job
The doctors at each location would access patient & billing data from their respective servers. What I'd like is for all of these servers to have synchronized info (especially the MySQL databases): let's say that on an hourly basis each of these servers synchronizes its data to a common remote server, and the data is then brought down to each of the servers. I know an easier way would be to have the medical app running on a remote web server, but since this is medical data we're talking about, and knowing how common it is in our area for the net to go down, I wouldn't like a web-based scenario. Is such a setup possible? Would this be the right way to do things, or is there a better way to do this? I would really appreciate views and comments on this (or on how to set it up).

Is Eclipse Remote System Explorer broken on Windows?

- by Kev

I have the following setup on Windows 7 Ultimate x64:
• Eclipse Indigo 2.7.2 (Build: M20120208-0800)
• Remote System Explorer 3.3.2 (see screenshot)
• (Oracle/Sun) Java 1.6 Update 31 (x86)
Despite all my best efforts I am unable to connect to a remote system (a CentOS 5.6 server on my local LAN) using a Remote System Explorer SSH connection. I've tried both password authentication and using my SSH private key. Here is a screenshot of both the Eclipse error dialogue and what is logged in my /var/log/secure log file:

    /var/log/secure:
    Apr 1 12:00:21 nagios sshd[6176]: Received disconnect from 172.16.3.88: 3: com.jcraft.jsch.JSchException: Auth fail

When I connect for the first time I do get prompted to verify the authenticity of the remote host and the RSA key fingerprint. But that's as far as things go. Performing the same operation with the same credentials on my Fedora Core 16 box (also running the same version of Eclipse and Java) to the same server is successful. This leads me to believe that RSE SSH support on Windows is either broken or there's some piece of the SSH-on-Windows puzzle I'm missing. Is this the case?

