UAT Testing for SOA 10G Clusters

A lot of customers ask how to verify their SOA clusters and make them production ready. Here is the list of test cases I recommend for 10g SOA clusters.

Test cases for each component - Oracle Application Server 10G

General Application Server test cases


This section covers general test cases to make sure that the Application Server cluster has been set up correctly and that you can start and stop all the components in the server via opmnctl and the AS Console.

Test Case 1

Check if you can see AS instances in the console

Implementation

1. Log on to the AS Console --> check to see if you can see all the nodes in your AS cluster. You should be able to see all the Oracle AS instances that are part of the cluster. This means that the OPMN clustering worked and the AS instances successfully joined the AS cluster.

Result

You should see all the instances in the AS cluster listed in the EM console. If the instances are not listed, here are the files to check to see whether OPMN joined the cluster properly:

  • $ORACLE_HOME\opmn\logs\opmn.log
  • $ORACLE_HOME\opmn\logs\opmn.dbg

If OPMN did not join the cluster properly, please check the opmn.xml file to make sure the discovery multicast address and port are correct (see the OPMN documentation for details). Restart the whole instance using opmnctl stopall followed by opmnctl startall, then log on to the AS console to see if the instance is listed as part of the cluster.
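As a sketch (Unix-style shell; the grep search term is only illustrative, not an exact string from opmn.log), a check-and-restart sequence on one AS node could look like this:

    cd $ORACLE_HOME/opmn/bin
    ./opmnctl status                     # confirm OPMN is up on this node
    grep -i cluster ../logs/opmn.log     # illustrative search for cluster/discovery messages
    ./opmnctl stopall                    # stop the whole AS instance
    ./opmnctl startall                   # start it again, then re-check the AS console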

Test Case 2

Check to see if you can start/stop each component

Implementation

  1. Check each OC4J component on each AS instance
  2. Start and stop each component through the AS console to see if it starts and stops cleanly.
  3. Repeat for each and every instance.

Result

Each component should start and stop through the AS console. You can also verify that a component started by logging on to each box in the cluster and checking opmnctl status.
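For example (a sketch for a Unix-style shell; the OC4J name used with process-type is just a placeholder and will vary with your topology):

    cd $ORACLE_HOME/opmn/bin
    ./opmnctl status                        # lists each ias-component, process-type, PID and status
    ./opmnctl stopproc process-type=home    # stop a single OC4J instance (placeholder name)
    ./opmnctl startproc process-type=home   # start it again and re-check the status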

Test Case 3

Add/modify a data-source entry through the AS console on a remote AS instance (not on the instance where EM is physically running)

Implementation

  1. Pick an OC4J instance
  2. Create a new data-source through the AS console
  3. Modify an existing data-source or connection pool (optional)

Result

Open $ORACLE_HOME\j2ee\<oc4j_name>\config\data-sources.xml to see if the new (and/or the modified) connection details and data-source exist. If they do, then the AS console has successfully updated a remote file and MBeans are communicating correctly.
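For illustration, a managed data source entry in a 10.1.3 data-sources.xml typically looks something like the fragment below; the pool name, JNDI name, URL, and credentials here are made up, so compare against what the console actually wrote to your file:

    <connection-pool name="TestPool">
      <connection-factory factory-class="oracle.jdbc.pool.OracleDataSource"
                          user="scott" password="tiger"
                          url="jdbc:oracle:thin:@//dbhost:1521/ORCL"/>
    </connection-pool>
    <managed-data-source name="TestDS" connection-pool-name="TestPool" jndi-name="jdbc/TestDS"/>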

Test Case 4

Start and stop AS instances using opmnctl @cluster command

Implementation

1. Go to $ORACLE_HOME\opmn\bin and use the opmnctl @cluster commands to start and stop the AS instances

Result

Use opmnctl @cluster status to check for start and stop statuses. 
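A sketch of the commands, run from any node that is part of the OPMN cluster:

    cd $ORACLE_HOME/opmn/bin
    ./opmnctl @cluster status      # show the status of every instance in the cluster
    ./opmnctl @cluster stopall     # stop all AS instances in the cluster
    ./opmnctl @cluster startall    # start them all again
    ./opmnctl @cluster status      # verify everything came back up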

HTTP server test cases


This section will deal with use cases to test HTTP server failover scenarios. In these examples the HTTP server will be talking to the BPEL Console (or any other web application that the client wants), so the URL will be http://hostname:port/BPELConsole

Test Case 1 

Shut down one of the HTTP servers while accessing the BPEL Console and see the request routed to the second HTTP server in the cluster

Implementation

  1. Access the BPELConsole
  2. Check $ORACLE_HOME\Apache\Apache\logs\access_log --> check for the timestamp and the URL that was accessed by the user. The timestamp and URL would look like this:
     1xx.2x.2xx.xxx [24/Mar/2009:16:04:38 -0500] "GET /BPELConsole=System HTTP/1.1" 200 15
  3. After you have figured out which HTTP server the request ran on, shut down that HTTP server by using opmnctl stopproc --> this is a graceful shutdown.
  4. Access the BPELConsole again (please note that you should have a load balancer in front of the HTTP servers and have configured the Apache virtual host; see the EDG for steps)
  5. Check $ORACLE_HOME\Apache\Apache\logs\access_log on the surviving node --> check for the timestamp and the URL that was accessed by the user. The timestamp and URL would look like the entry above
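A possible command sequence for steps 2-5, assuming a Unix-style shell (the grep pattern is only an example of what to search for):

    # on each HTTP server node, find the most recent BPELConsole request in the access log
    grep BPELConsole $ORACLE_HOME/Apache/Apache/logs/access_log | tail -1
    # on the node that served the request, shut Apache down gracefully
    $ORACLE_HOME/opmn/bin/opmnctl stopproc ias-component=HTTP_Server
    # access the BPELConsole through the load balancer again, then repeat the grep on the surviving node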

Result

Even though you shut down one HTTP server, the request is routed to the surviving HTTP server, which is then able to route the request to the BPEL Console, and you are able to access the console. By checking the access log file you can confirm that the request is being picked up by the surviving node.

Test Case 2

Repeat the same test as above but instead of calling opmnctl stopproc, pull the network cord of one of the HTTP servers, so that the LBR routes the request to the surviving HTTP node --> this is simulating a network failure.

Test Case 3

In Test Case 1 we simulated a graceful shutdown; in this case we will simulate an Apache crash.

Implementation

  1. Use opmnctl status -l to get the PID of the HTTP server that you would like to forcefully bring down
  2. On Linux use kill -9 <PID> to kill the HTTP server
  3. Access the BPEL console
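A sketch of the crash test on Linux (the PID shown is just a placeholder; take the real one from the opmnctl output):

    $ORACLE_HOME/opmn/bin/opmnctl status -l     # note the PID of the HTTP_Server process
    kill -9 12345                               # forcefully kill that PID (placeholder value)
    $ORACLE_HOME/opmn/bin/opmnctl status -l     # OPMN should restart HTTP_Server with a new PID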

Result

As soon as you kill the HTTP server, OPMN will restart it. The restart may be so quick that the LBR may still route requests to the same server. One way to check whether the HTTP server restarted is to check for a new PID and the timestamp in the access log for the BPEL Console.

BPEL test cases


This section is going to cover scenarios dealing with BPEL clustering using jGroups, BPEL deployment and testing related to BPEL failover.

Test Case 1

Verify that jGroups has initialized correctly. There is no real testing in this use case, just a visual verification of the log files to confirm that jGroups initialized correctly.

  • Check the opmn log for the BPEL container on all nodes at $ORACLE_HOME/opmn/logs/<group name>~<container name>~<group name>~1.log. This logfile will contain jGroups-related information during startup and steady-state operation. Soon after startup you should find log entries for UDP or TCP.
  • Example jGroups log entries for UDP:

    Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.UDP createSockets
    INFO: sockets will use interface 144.25.142.172

    Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.UDP createSockets
    INFO: socket information:
    local_addr=144.25.142.172:1127, mcast_addr=228.8.15.75:45788, bind_addr=/144.25.142.172, ttl=32
    sock: bound to 144.25.142.172:1127, receive buffer size=64000, send buffer size=32000
    mcast_recv_sock: bound to 144.25.142.172:45788, send buffer size=32000, receive buffer size=64000
    mcast_send_sock: bound to 144.25.142.172:1128, send buffer size=32000, receive buffer size=64000

    Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces
    -------------------------------------------------------
    GMS: address is 144.25.142.172:1127
    -------------------------------------------------------
  • Example jGroups log entries for TCP:

    Apr 3, 2008 6:23:39 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable start
    INFO: server socket created on 144.25.142.172:7900

    Apr 3, 2008 6:23:39 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces
    -------------------------------------------------------
    GMS: address is 144.25.142.172:7900
    -------------------------------------------------------
  • In the log below, "server socket created on" indicates that the TCP socket is established on the node's own IP address and port, and "created socket to" shows that the second node has connected to the first node, matching the IP address and port in the log file above:

    Apr 3, 2008 6:25:40 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable start
    INFO: server socket created on 144.25.142.173:7901

    Apr 3, 2008 6:25:40 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces
    -------------------------------------------------------
    GMS: address is 144.25.142.173:7901
    -------------------------------------------------------

    Apr 3, 2008 6:25:41 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable getConnection
    INFO: created socket to 144.25.142.172:7900

 Result

By reviewing the log files, you can confirm that BPEL clustering at the jGroups level is working and that the jGroups channel is communicating.

Test Case 2

Test connectivity between BPEL nodes

Implementation

  1. Test connections between the different cluster nodes using ping, telnet, and traceroute. Firewalls and the number of hops between cluster nodes can affect performance, as firewalls tend to take down connections after some time or simply block them.
  2. Also reference Metalink Note 413783.1: "How to Test Whether Multicast is Enabled on the Network."
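As a quick sketch, a basic connectivity check between two BPEL nodes could look like this (the hostname and port are placeholders; 7900 is only an example jGroups TCP port):

    ping bpelnode2              # basic reachability
    traceroute bpelnode2        # count hops and spot intermediate firewalls
    telnet bpelnode2 7900       # verify the jGroups TCP port is reachable (example port)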

Result

Using the above tools you can confirm whether multicast is working and whether the BPEL nodes are communicating.

Test Case 3

Test deployment of a BPEL suitcase to one BPEL node.

Implementation

  1. Deploy a HelloWorld BPEL suitcase (or any other client-specific BPEL suitcase) to only one BPEL instance using ant, JDeveloper, or the BPEL console
  2. Log on to the second BPEL console to check whether the BPEL suitcase has been deployed
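For step 1, if the suitcase was built in JDeveloper, the generated build.xml can usually be driven from ant on the command line; the directory name, targets, and properties below are assumptions about a typical project, so treat this purely as a sketch:

    cd HelloWorld        # BPEL project directory (hypothetical name)
    ant                  # the generated default target typically compiles and deploys the suitcase;
                         # the target host, port, and domain come from the project's properties files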

Result

If jGroups has been configured correctly and is communicating, BPEL clustering will allow you to deploy a suitcase to a single node, and jGroups will notify the second instance of the deployment. The second BPEL instance will go to the DB and pick up the new deployment after receiving the notification. The result is that the new deployment is "deployed" to each node, even though you only deployed to a single BPEL instance in the BPEL cluster.

Test Case 4

Test to see if the BPEL server fails over and if all asynch processes are picked up by the secondary BPEL instance

Implementation

  1. Deploy two asynch processes:
    1. A ParentAsynch Process which calls a ChildAsynchProcess with a variable telling it how many times to loop or how many seconds to sleep
    2. A ChildAsynchProcess that loops or sleeps or has an onAlarm
  2. Make sure that the processes are deployed to both servers
  3. Shut down one BPEL server
  4. On the active BPEL server call ParentAsynch a few times (use the load generation page)
  5. When you have enough ParentAsynch instances, shut down this BPEL instance and start the other one. Please wait until this BPEL instance shuts down fully before starting up the second one.
  6. Log on to the BPEL console and see that the instances were picked up by the second BPEL node and completed
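The ChildAsynchProcess only needs something that keeps the instance open for a while, for example a simple BPEL wait activity like the hypothetical fragment below (not taken from the original post):

    <wait name="sleepForAWhile" for="'PT60S'"/>   <!-- suspend the child instance for 60 seconds -->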

Result

The BPEL instance will fail over to the secondary node and complete the flow.

ESB test cases


This section covers the use cases involved with testing an ESB cluster. For this section, please follow Metalink Note 470267.1, which covers the basic tests to verify your ESB cluster.
