More than once I have run into timeout problems with the response times of OC4J containers.
Each case involved a SOA Suite that had just been taken into production. The behaviour differed a little from case to case, but the same fix solved the problem every time.
OPMN manages the OC4J containers in Application Server (10g, that is). To do so, it pings each container to check whether it is still alive. If the ping isn't answered quickly enough, OPMN restarts the OC4J container and writes an error to the opmn.log file located under $ORACLE_HOME/opmn/logs. This can lead to various problems: if OPMN restarts the HTTP Server, the website will be temporarily unavailable, which can be rather disruptive. If other OC4J containers get restarted, the result can be errors such as HTTP-500 (Internal Server Error).
The solution is to tune the OC4J container ping parameters in the opmn.xml file found under $ORACLE_HOME/opmn/conf:
For the HTTP Server, look for the tag <process-type id="HTTP_Server" module-id="OHS"> and adjust its process-set:
<process-set id="HTTP_Server" restart-on-death="true" numprocs="1">
<start timeout="300" retry="3"/>
<stop timeout="300"/>
<restart timeout="300" retry="3"/>
<ping timeout="60" interval="600"/>
</process-set>
Next, look for the
<data id="ping-url" value="/"/>
</category>
<category id="restart-parameters">
<data id="reverseping-timeout" value="345"/>
<data id="no-reverseping-failed-ping-limit" value="3"/>
<data id="reverseping-failed-ping-limit" value="6"/>
</category>
This will increase the timeout for the response of the OC4J container, giving it a little more time.
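For orientation, here is a minimal sketch of where such a restart-parameters category usually sits for an OC4J container. The ias-component id "OC4J" and the process-type id "oc4j_soa" are assumptions for illustration only and will differ per installation, so use the ids from your own opmn.xml. The three values are the ones from the fragment above.

```xml
<!-- Sketch only: "OC4J" and "oc4j_soa" are assumed ids, not taken from the post -->
<ias-component id="OC4J">
  <process-type id="oc4j_soa" module-id="OC4J" status="enabled">
    <module-data>
      <!-- leave the existing categories (start-parameters, stop-parameters, ...) as they are -->
      <category id="restart-parameters">
        <!-- values taken from the fragment in this post -->
        <data id="reverseping-timeout" value="345"/>
        <data id="no-reverseping-failed-ping-limit" value="3"/>
        <data id="reverseping-failed-ping-limit" value="6"/>
      </category>
    </module-data>
    <!-- leave the existing start, stop, restart, port and process-set elements as they are -->
  </process-type>
</ias-component>
```

OPMN only picks up the change once it re-reads opmn.xml, for example with opmnctl reload followed by a restart of the affected process.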
Hi,
Appreciate your post on this particular issue. We are having the same issue in our prod environment and would like to know more about it. I have a few questions:
1) Do we need to use all the parameters mentioned in OPMN.xml?
2) Can these parameters also be used for other containers?
3) Can you please provide the default values for all the parameters you mentioned, along with the recommended values for those parameters?
Regards
Thanks for your comment, Anonymous. To answer your questions (as far as I can):
1. I wouldn't recommend removing parameters from opmn.xml. They are there for specific reasons. Pretty much all of them can be tuned, however.
2. The parameters that I specified in my post are usable for all OC4J containers in opmn.xml. Not all parameters can be used for all containers, I guess.
3. I don't know all of the default values. As far as I know:
reverseping-timeout defaults to 300 (I think)
no-reverseping-failed-ping-limit defaults to 1
reverseping-failed-ping-limit defaults to 3
Regards,
Arnoud
PS: you can contact me through LinkedIn if you want more info.