OS Analytics with Oracle Enterprise Manager (by Eran Steiner)

Posted by Zeynep Koch on Oracle Blogs
Published on Tue, 13 Nov 2012 18:28:27 +0000 Indexed on 2012/11/13 23:14 UTC

Oracle Enterprise Manager Ops Center provides a feature called "OS Analytics". This feature allows you to get a better understanding of how the Operating System is being utilized. You can research the historical usage as well as real time data. This post will show how you can benefit from OS Analytics and how it works behind the scenes.

The recording of our call to discuss this blog is available here:

https://oracleconferencing.webex.com/oracleconferencing/ldr.php?AT=pb&SP=MC&rID=71517797&rKey=4ec9d4a3508564b3

Download the presentation here

See also:

Blog about Alert Monitoring and Problem Notification

Blog about Using Operational Profiles to Install Packages and other content


Here is a quick summary of what you can do with OS Analytics in Ops Center:

  • View historical charts and real-time values of CPU, memory, network and disk utilization
  • Find the top CPU and memory processes in real time or on a given day in the past
  • Determine proper monitoring thresholds based on historical data
  • Drill down into the details of a process

Where to start

To start with OS Analytics, choose the OS asset in the tree and click the Analytics tab.

You can see the CPU utilization, Memory utilization and Network utilization, along with the current real-time top 5 processes in each category:


In the above screen, you can click each of the top 5 processes to see a more detailed view of that process. Here is an example of one of the processes:


One of the cool things is that you can see the process tree for this process, along with its port bindings and open file descriptors.

Next, click the "Processes" tab to see real time information of all the processes on the machine:


An interesting column is the "Target" column. If you configured Ops Center to work with Enterprise Manager Cloud Control, the two products talk to each other and Ops Center displays the correlated target from Cloud Control in this table. If you are using only Ops Center, this column remains empty.




The "Threshold" tab is particularly helpful: you can view historical trends of the different monitored values and, based on the graph, determine what the monitoring values should be.

You can ask Ops Center to suggest monitoring levels based on the historical values or you can set your own. The different colors in the graph represent the current set levels: Red for critical, Yellow for warning and Blue for Information, allowing you to quickly see how they're positioned against real data.
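To get a feel for what deriving levels from historical values can look like, here is a minimal Python sketch. It is not the algorithm Ops Center uses (this post does not describe it). The percentile rule and the function name suggest_thresholds are assumptions made up for illustration.

```python
from statistics import quantiles


def suggest_thresholds(samples):
    """Return (information, warning, critical) levels from historical samples.

    Hypothetical rule: use the 90th, 95th and 99th percentiles of the data.
    This is an illustration only, not what Ops Center does internally.
    """
    # n=100 yields 99 cut points; cuts[p - 1] is the p-th percentile.
    cuts = quantiles(samples, n=100, method="inclusive")
    return cuts[89], cuts[94], cuts[98]


# Hypothetical CPU utilization history, in percent.
history = list(range(101))
info, warning, critical = suggest_thresholds(history)
print(info, warning, critical)
```

Whatever rule you pick, the idea is the same as in the graph: the levels should sit above the normal range of the data, so that an alert means something unusual happened.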

It's important to note that when looking at longer periods, Ops Center smooths out the data and uses averages. So when looking at values such as CPU Usage, try shorter time frames which are more detailed, such as one hour or one day.
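A small Python sketch shows why this matters. The readings are made up, and the one-hour bucket is an assumption for illustration, not the interval Ops Center actually uses.

```python
def average_buckets(samples, bucket_size):
    """Average consecutive groups of bucket_size samples."""
    buckets = []
    for start in range(0, len(samples), bucket_size):
        chunk = samples[start:start + bucket_size]
        buckets.append(sum(chunk) / len(chunk))
    return buckets


# One hour of hypothetical per-minute CPU readings:
# mostly idle at 10%, with a five-minute spike at 95%.
minute_cpu = [10.0] * 60
minute_cpu[30:35] = [95.0] * 5

hourly = average_buckets(minute_cpu, 60)
print("peak in the detailed view:", max(minute_cpu))
print("same hour after averaging:", round(hourly[0], 1))
```

The spike that reaches 95% in the per-minute data shrinks to roughly 17% once the hour is averaged, which is why a long time frame can make a busy machine look quiet.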


Applying new monitoring values

When you first apply new values to monitored attributes, a popup asks whether it is OK to take the machine out of its current Monitoring Policy. This is OK if you want either custom monitoring for a specific machine, or to use this machine as a "Gold image" and extract a Monitoring Policy from it. You can later apply the new Monitoring Policy to other machines and also set it as the default Monitoring Policy.

Once you're done applying the different monitoring values, you can review and change them in the "Monitoring" tab. You can also click "Extract a Monitoring Policy" in the actions pane on the right to save all the new values to a new Monitoring Policy, which can then be found under "Plan Management" -> "Monitoring Policies".


Visiting the past

Under the "History" tab you can "go back in time". This is very helpful when you know that a machine was busy a few hours ago (perhaps in the middle of the night?), but you were not around to take a look at it in real time. Here's a view into yesterday's data on one of the machines:


You can see an interesting CPU spike happening at around 3:30 am along with some memory use. In the bottom table you can see the top 5 CPU and Memory consumers at the requested time. Very quickly you can see that this spike is related to the Solaris 11 IPS repository synchronization process using the "pkgrecv" command.

The "time machine" doesn't stop here - you can also view historical data to determine which of the zones was the busiest at a given time:


Under the hood

The data collected is stored on each of the agents under /var/opt/sun/xvm/analytics/historical/

  • An "os.zip" file exists for the main OS. Inside you will find many small text files, each named after the Epoch time stamp at which it was taken
  • If you have any zones, there will be a file called "guests.zip" containing the same small files for all the zones, as well as a folder with the name of the zone along with "os.zip" in it
  • If this is the Enterprise Controller or the Proxy Controller, you will have folders called "proxy" and "sat" in which you will find the "os.zip" for that controller
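Because the files inside the zip are named after Epoch time stamps, a few lines of Python are enough to see which points in time an archive covers. This is only a sketch: it assumes the names are Epoch seconds (optionally with a file extension), which this post does not confirm, and list_samples is a made-up helper name.

```python
import zipfile
from datetime import datetime, timezone
from pathlib import PurePosixPath


def list_samples(zip_path):
    """Return (UTC datetime, entry name) pairs for epoch-named entries, oldest first."""
    samples = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            stem = PurePosixPath(name).stem
            if not stem.isdigit():
                continue  # skip folders and anything not named after a number
            # Assumption: the number is Epoch seconds, not milliseconds.
            taken_at = datetime.fromtimestamp(int(stem), tz=timezone.utc)
            samples.append((taken_at, name))
    return sorted(samples)
```

On an agent you could point it at the "os.zip" under the historical directory mentioned above and print the first and last entries to see the time range the archive holds.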

The actual script collecting the data can be viewed for debugging purposes as well:

  • On Linux, the location is: /opt/sun/xvmoc/private/os_analytics/collect

If you would like to redirect all the standard error into a file for debugging, touch the following file and the output will go into it:

# touch /tmp/.collect.stderr  

The temporary data is collected under /var/opt/sun/xvm/analytics/.collectdb until it is zipped.

If you would like to review the properties for the Analytics, you can view them on each agent in /opt/sun/n1gc/lib/XVM.properties. Find the section "Analytics configurable properties for OS and VSC" to view the Analytics-specific values.
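If you prefer to pull that section out with a script, here is a rough Python sketch. It makes assumptions about the file layout that this post does not confirm: that the section starts at a line containing the title, that properties are written as key=value, and that the section ends at the next comment line following a blank line. The property names in the sample are invented.

```python
MARKER = "Analytics configurable properties for OS and VSC"


def analytics_properties(text, marker=MARKER):
    """Collect key=value pairs from the section that follows the marker line."""
    props = {}
    in_section = False
    after_blank = False
    for raw in text.splitlines():
        line = raw.strip()
        if not in_section:
            in_section = marker in line
            continue
        if not line:
            after_blank = True
            continue
        if line.startswith("#"):
            if after_blank and props:
                break  # assumed start of the next section
            continue
        after_blank = False
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Invented sample content; the real file will differ.
sample = """\
# General
foo.bar=1

# Analytics configurable properties for OS and VSC
example.interval=300
example.retention = 7

# Something else
other.key=x
"""
print(analytics_properties(sample))
```

To try it on an agent, read the contents of XVM.properties into a string and pass it in place of the sample. If the real file is laid out differently, the stop condition is the first thing to adjust.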

I hope you find this helpful! Please post questions in the comments below.

Eran Steiner


© Oracle Blogs or respective owner
