WebLogic domain scale up using EM Grid Control 11gR1

Posted by dmitry.nefedkin(at)oracle.com on Oracle Blogs
Published on Wed, 29 Dec 2010 11:20:55 +0100

As you know, a WebLogic domain consists of a set of servers running independently or in cluster mode and sharing distributed resources. In most environments a WebLogic cluster consists of multiple managed servers running simultaneously and working together to provide increased scalability and reliability. These servers can run on the same machine or be located on different machines.
It's a common task to increase a cluster's capacity by adding new machines to host new server instances. You can do it manually by installing the WebLogic binaries on the new host and using the pack/unpack commands to add a managed server on that host. But with Enterprise Manager Grid Control 11gR1 (EMGC) there is another way: the Fusion Middleware Domain Scale Up procedure. I'm going to show you how it works.
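For comparison, the manual pack/unpack route mentioned above looks roughly like the sketch below. It is a dry run that only prints the two commands. The middleware home, domain path, template path and template name are hypothetical placeholders, not values from my environment, so adjust them before running anything for real.

```shell
#!/bin/sh
# Dry-run sketch: prints the pack/unpack commands instead of running them.
# All paths and the template name are hypothetical placeholders.
WL_HOME=${WL_HOME:-/u01/app/oracle/middleware/wlserver_10.3}
DOMAIN_HOME=${DOMAIN_HOME:-/u01/app/oracle/domains/medrec_oradb}
TEMPLATE=${TEMPLATE:-/tmp/medrec_oradb_managed.jar}

print_pack_cmd() {
    # Run on the admin server host: creates a managed-server template of the domain.
    echo "$WL_HOME/common/bin/pack.sh -managed=true -domain=$DOMAIN_HOME -template=$TEMPLATE -template_name=medrec_oradb_managed"
}

print_unpack_cmd() {
    # Run on the new host, after the WebLogic binaries are installed there
    # and the template jar has been copied over.
    echo "$WL_HOME/common/bin/unpack.sh -domain=$DOMAIN_HOME -template=$TEMPLATE"
}

print_pack_cmd
print_unpack_cmd
```

The Domain Scale Up procedure described in the rest of this post replaces these manual steps, including the installation of the binaries on the new host.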

Here is a picture of my medrec_oradb WebLogic domain, which is registered in EMGC. It contains an admin server and a cluster, MedRecCluster, with a single managed server, MS1. Both the admin and managed servers are on the same host, oel46-vmware, a virtual machine with OEL 4.6 that runs inside our Oracle VM infrastructure.

dn1_emgc_domain_before_scaleup_p1_600.png

And here are the application deployments; note that a couple of applications are deployed to the cluster.
dn1_emgc_domain_before_scaleup_p2.PNG
First of all, I have to prepare a new machine that will host the new managed server of my cluster. I created a new VM with OEL 5.4 using the corresponding Oracle VM template, available on the Oracle E-Delivery site for Oracle Linux and Oracle VM, and named it wls1032.
The next step is to install the Oracle EM Grid Control 11gR1 Agent on this new host. You can download it from the OTN page and install it manually, or you can use the Agent Installation deployment procedure available in EMGC (Deployments->Agent Installation->Install Agent). Either way, when your agent is up and running on the new machine, you will see it in the EMGC Console in the Targets->Hosts subtab.

dn1_emgc_hosts.png
Now we are ready to scale up our WebLogic domain. Click the Deployments tab in Oracle Enterprise Manager Grid Control, and then click Deployment Procedures. Select the Fusion Middleware Domain Scale Up procedure from the list, and click Schedule Deployment. The first page of the FMW Domain Scale Up Wizard is displayed and you can proceed with the deployment process.
Select the domain from the list, enter the working directory on the admin server host, and fill in the WebLogic credentials for the administration server console and the OS credentials for the admin server host. Click the Next button.
dn1_scaleup_wiz_step1_600.png
The next step allows you to configure your domain. To add a new managed server to the cluster, select the cluster in the tree and click the Add Server button. Select the newly added server in the tree, choose the target host, and enter the configuration details of your managed server. You can also add new machine and node manager details. Please note that you cannot change the values in the Domain Location and Fusion Middleware Home fields, so these locations on the target host will be the same as on the admin server host. The working directory on the target host should have enough free space to store the FMW home binaries and domain configuration files. In my experience the working directories should have at least 3 GB of free space. The last thing to fill in is the OS credentials for the target host.
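Since a full working directory is behind several of the failures described later, it may be worth checking the free space on both hosts before submitting the procedure. Here is a minimal sketch using df. The 3 GB threshold is just my rule of thumb from above, and /tmp is a hypothetical default for the working directory.

```shell
#!/bin/sh
# Sketch: check that a directory has at least a given amount of free space.

# Prints the free space of the filesystem holding $1, in kilobytes.
free_kb() {
    # -P forces the portable one-line-per-filesystem layout;
    # column 4 of that layout is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds (exit status 0) if directory $1 has at least $2 gigabytes free.
has_free_gb() {
    required_kb=$(( $2 * 1024 * 1024 ))
    [ "$(free_kb "$1")" -ge "$required_kb" ]
}

WORK_DIR=${1:-/tmp}   # hypothetical working directory
if has_free_gb "$WORK_DIR" 3; then
    echo "$WORK_DIR: OK, at least 3 GB free"
else
    echo "$WORK_DIR: less than 3 GB free"
fi
```

Run it once on the admin server host and once on the target host, passing the working directory you entered in the wizard as the first argument.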
dn1_scaleup_wiz_step2_600.png

The next step allows you to schedule the execution of the procedure; in my example it starts immediately.
dn1_scaleup_wiz_step3.PNG
The last step is just a review of the configuration for the domain scale up. Click Submit to launch the process.
dn1_scaleup_wiz_step4_600.png
You can track the status of the procedure execution by selecting Deployments->Deployment Procedures->Procedure Completion Status in the EMGC Console.
As you can see in the picture below, the procedure consists of many steps, and I'm going to share my experience with the issues I had at some of them. Please keep in mind that you can always continue the execution from the last successfully completed step by clicking the Retry button.
dn1_scaleup_procedure_structure_600.png
  • The Check OUI Prerequisites step may fail if the target host does not pass the prerequisite checks for WebLogic Server installation, such as the amount of RAM, the Linux packages installed, etc.
  • The Create FMW Clone Archive step may fail if you do not have enough free space in the working directory on the administration server host.
  • The Transfer cloning archive to targets step may fail if the EMGC agents on the admin server host or on the target host are not secured. You should secure the agent by issuing the ./emctl secure agent command from the $AGENT_HOME/bin directory and entering the agent registration password.
  • Both the Transfer cloning archive to targets and Apply Clone at target hosts steps may fail if you do not have enough free space in the working directory on the target host.
  • The most complicated issue I had was at the Run Inventory Collection step. The step failed, and I noticed that the agent on the target server had also failed with the following error in the $AGENT_HOME/sysman/log/emagent.trc log file:
2010-12-28 11:50:34,310 Thread-2838952848 ERROR upload: Failed to upload file A0000008.xml: Fatal Error.
Response received: 500|ORA-20603: The timezone of the multiagent target (/Farm_Localhost_MedRec_medrec_oradb/medrec_oradb,weblogic_domain)is not consistent with the timezone (America/Los_Angeles) reported by other agents.
2010-12-28 11:50:34,310 Thread-2838952848 ERROR upload: 1 Failure(s) in a row or XML error for A0000008.xml, retcode = -6, we give up
2010-12-28 11:50:35,552 Thread-2838952848 WARN  upload: FxferSend: received fatal error in header from repository: https://oel46-vmware:1159/em/upload
FATAL_ERROR::500|ORA-20603: The timezone of the multiagent target (/Farm_Localhost_MedRec_medrec_oradb/medrec_oradb,weblogic_domain)is not consistent with the timezone (America/Los_Angeles) reported by other agents.
2010-12-28 11:50:35,552 Thread-2838952848 ERROR upload: number of fatal error exceeds the limit 3
2010-12-28 11:50:35,552 Thread-2838952848 ERROR upload: agent will shutdown now
2010-12-28 11:50:35,552 Thread-2838952848 ERROR : Signalled to Exit with status 55. Too many fatal upload failures
2010-12-28 11:50:35,552 Thread-2838952848 ERROR upload: 1 Failure(s) in a row or XML error for A0000008.xml, retcode = -6, we give up
2010-12-28 11:50:35,552 Thread-3044607680 ERROR main: EMAgent abnormal terminating
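The useful piece of that trace is the timezone the repository expects, which is buried in the ORA-20603 message. Here is a small sketch that pulls it out with grep and sed. The function name is my own, and the pattern assumes the message wording shown in the trace above.

```shell
#!/bin/sh
# Sketch: pull the timezone the repository expects out of ORA-20603 lines.

# Reads log text on stdin and prints each distinct timezone named in an
# ORA-20603 message, one per line.
expected_timezones() {
    grep 'ORA-20603' \
        | sed -n 's/.*not consistent with the timezone (\([^)]*\)).*/\1/p' \
        | sort -u
}

# Demo on one line copied from the trace above.
sample='Response received: 500|ORA-20603: The timezone of the multiagent target (/Farm_Localhost_MedRec_medrec_oradb/medrec_oradb,weblogic_domain)is not consistent with the timezone (America/Los_Angeles) reported by other agents.'
printf '%s\n' "$sample" | expected_timezones
```

Against the real log it would be used as expected_timezones < $AGENT_HOME/sysman/log/emagent.trc.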


I checked the timezone of my domain target inside the EMGC repository:
select timezone_region
from mgmt_targets
where target_type = 'weblogic_domain'
  and display_name = 'medrec_oradb'

"TIMEZONE_REGION"
"America/Los_Angeles"
Then I checked the timezone of my agents and, indeed, they differed:

select target_name, timezone_region
from mgmt_targets
where type_display_name = 'Agent'

"TARGET_NAME"    "TIMEZONE_REGION"
"oel46-vmware:3872"    "America/Los_Angeles"
"wls1032.imc.fors.ru:3872"    "America/New_York"

So I had to change the timezone on the wls1032 host and propagate this change to the agent and to the EMGC repository. Here are the steps:
  • issued the system-config-date command on wls1032.imc.fors.ru and set the timezone to "America/Los_Angeles"
  • propagated the change to the agent by executing the ./emctl resetTZ agent command from the $AGENT_HOME/bin directory
  • connected to the EMGC repository as sysman and executed the following PL/SQL block:
   begin
      mgmt_target.set_agent_tzrgn('wls1032.imc.fors.ru:3872','America/Los_Angeles');
      commit;
   end;

After that I had to clear the pending uploads on wls1032.imc.fors.ru:
  rm -r $AGENT_HOME/sysman/emd/state/*
  rm -r $AGENT_HOME/sysman/emd/collection/*
  rm -r $AGENT_HOME/sysman/emd/upload/*
  rm $AGENT_HOME/sysman/emd/lastupld.xml
  rm $AGENT_HOME/sysman/emd/agntstmp.txt
  $AGENT_HOME/bin/emctl start agent
  $AGENT_HOME/bin/emctl clearstate agent

The last part of this solution was to resync the agent in the EMGC console by clicking the Agent Resynchronization button (please leave the "Unblock agent on successful completion of agent resynchronization" checkbox checked on the next screen).
dn1_resync_agent_600.png
After that I issued the ./emctl upload command from $AGENT_HOME/bin on the wls1032 host, and my previous error disappeared, but I got another one:

EMD upload error: Failed to upload file A0000004.xml: HTTP error.
Response received: ERROR-400|Data will be rejected for upload from agent 'https://wls1032.imc.fors.ru:3872/emd/main/', max size limit for direct load exceeded [7544731/5242880]

So the XML file being uploaded was about 7 MB, and the limit on the OMS was 5 MB.
To increase the maximum file size limit to 20 MB I had to connect to the OMS host and execute the following commands from the $OMS_HOME/bin directory:
./emctl set property -name em.loader.maxDirectLoadFileSz -value 20971520 -module emoms
./emctl stop oms
./emctl start oms
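The property value is given in bytes, so the numbers are worth double-checking. The sketch below reads the two sizes from the bracketed part of the error message and converts megabytes to bytes, counting 1 MB as 1024 * 1024 bytes. The helper names are my own.

```shell
#!/bin/sh
# Sketch: the size arithmetic behind em.loader.maxDirectLoadFileSz.

# Converts megabytes (counted as 1024 * 1024 bytes) to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Prints "actual limit" from a message ending in "[actual/limit]".
upload_sizes() {
    printf '%s\n' "$1" | sed -n 's|.*\[\([0-9]*\)/\([0-9]*\)].*|\1 \2|p'
}

msg='max size limit for direct load exceeded [7544731/5242880]'
sizes=$(upload_sizes "$msg")
actual=${sizes% *}    # part before the space: size of the rejected file
limit=${sizes#* }     # part after the space: current limit on the OMS

echo "file: $actual bytes, limit: $limit bytes"
echo "5 MB  = $(mb_to_bytes 5) bytes"
echo "20 MB = $(mb_to_bytes 20) bytes"
```

The 5 MB line matches the limit reported in the error, and the 20 MB line gives the 20971520 passed to emctl set property.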

After that I issued the ./emctl upload command from $AGENT_HOME/bin on the wls1032 host one more time, and it completed successfully. The agent uploaded the configuration information to the EMGC repository, and I was able to see the results of my WebLogic domain scale-up in the EMGC Console.
dn1_emgc_domain_after_scaleup_p1_600.png
Deployments
dn1_emgc_domain_after_scaleup_p2.PNG
So now the WebLogic cluster contains two managed servers located on different hosts.

This powerful feature of Enterprise Manager Grid Control is part of the WebLogic Server Management Pack Enterprise Edition.

© Oracle Blogs or respective owner
