I need advice on how to debug a cluster

Posted by alcor on Programmers
Published on 2012-08-23T14:30:15Z Indexed on 2012/09/03 15:51 UTC


I'm the only developer of a complex, critical software system written in Visual C++ 2005. It's deployed on a classic Microsoft active/passive cluster running Windows Server 2003 R2.

If server A goes down, the other one (B) starts up and takes over its duties.

Some background you should know:

  • Both servers have the same Microsoft patches/fixes, the same hardware, the same everything.
  • Both servers use the same shared storage (a RAID-6 array over Fibre Channel).
  • The software has a main module that launches the peripheral modules. If a peripheral module crashes, the main module restarts it.
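
Since the main module already restarts crashed peripheral modules, one thing it could do is log how each one died before restarting it. The following is only a minimal sketch under assumptions, not the actual code of this system: `describeExitCode` and `formatCrashLog` are hypothetical helper names, and it assumes the main module holds a process handle for each peripheral module and can read the exit code with the Win32 call `GetExitCodeProcess`. The table only covers a few well-known NTSTATUS values that a process can exit with after an unhandled exception or a loader failure.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: map a process exit code to the name of a
// well-known NTSTATUS value. Anything not listed is reported as "unknown".
std::string describeExitCode(unsigned long code) {
    switch (code) {
        case 0x80000003UL: return "STATUS_BREAKPOINT";
        case 0xC0000005UL: return "STATUS_ACCESS_VIOLATION";
        case 0xC000001DUL: return "STATUS_ILLEGAL_INSTRUCTION";
        case 0xC0000094UL: return "STATUS_INTEGER_DIVIDE_BY_ZERO";
        case 0xC00000FDUL: return "STATUS_STACK_OVERFLOW";
        case 0xC0000135UL: return "STATUS_DLL_NOT_FOUND";
        case 0xC0000139UL: return "STATUS_ENTRYPOINT_NOT_FOUND";
        case 0xC0000142UL: return "STATUS_DLL_INIT_FAILED";
        case 0xC0000409UL: return "STATUS_STACK_BUFFER_OVERRUN";
        default:           return "unknown";
    }
}

// Hypothetical helper: build one log line for a peripheral module that
// exited, so the two servers can be compared line by line.
std::string formatCrashLog(const std::string& module,
                           unsigned long exitCode,
                           unsigned long uptimeMs) {
    std::ostringstream os;
    os << module << " exited after " << uptimeMs << " ms with code 0x"
       << std::hex << std::uppercase << std::setw(8) << std::setfill('0')
       << exitCode << std::dec
       << " (" << describeExitCode(exitCode) << ")";
    return os.str();
}
```

Calling `formatCrashLog` from the restart path with the value returned by `GetExitCodeProcess` would at least show whether a module on one server dies from an access violation or from a loader problem such as a missing DLL, which points to quite different causes.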

When I switch the application over to one of the two servers (say, server B), two of the main application's peripheral modules start crashing, apparently without reason, about 2 seconds after they are launched.

What could I do to analyze/inspect/resolve this weird situation?
