In one of my recent posts I discussed a performance problem with OC4J containers being restarted in SOA Suite. This week I had to go back: the environment had become a lot more stable, but similar problems still occurred.
Conclusion: the system was more stable, but the root cause of the problem had not been removed, so further investigation was needed.
This time I looked further into tuning the Java options (opmn.xml) and increased the heap space to 2 GB, hoping this would help, but to no avail: the problems persisted.
Then my eyes fell on the error_log of the HTTP daemon. I wondered what could be causing all of those "oc4j_socket_recvfull timed out" messages...
There are a number of Metalink Notes about this issue, but the solutions I checked did not solve our problems.
Then I realized that the environment was deployed on AIX 5.3. I have worked for IBM and with AIX for a couple of years. I have seen similar issues on other environments (like Oracle E-Business Suite and RAC), and I still had a note somewhere about network parameter tuning on AIX. I checked these parameters on the server and changed the following ones, which seems to have solved the issues:
rfc1323 - This defaults to 0 on AIX and should be set to 1. By default, the TCP window size is limited to 64 KB (65535 bytes), but it can be set higher if rfc1323 is set to 1 (see tcp_recvspace).
tcp_sendspace - The TCP send buffer. This defaults to 16384 bytes and should be set to a higher value for Oracle Application Server (OHS). I set it to 266140.
tcp_recvspace - The TCP Receive Buffer. This also defaults to 16384 and should be set to a higher value for OHS. I set it to 266140, similar to tcp_sendspace. In combination with rfc1323, this enables a connection to negotiate a larger TCP window. If you are sending data through adapters that have large MTU sizes (32 K or 64 K for example), TCP streaming performance might not be optimal unless this option is enabled because a single packet will consume the entire TCP window size. Therefore, TCP is unable to stream multiple packets as it will have to wait for a TCP acknowledgment and window update from the receiver for each packet.
tcp_nodelayack - This should be set to 1 for Oracle applications. The tcp_nodelayack option prompts TCP to send an immediate acknowledgment rather than waiting for the usual delay of up to 200 ms. Sending an immediate acknowledgment adds a little overhead, but in some cases it greatly improves performance. Performance problems have been seen when TCP delays an acknowledgment for 200 ms, because the sender is waiting for an acknowledgment from the receiver while the receiver is waiting for more data from the sender. This can result in low streaming throughput.
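
On AIX these tunables are managed with the `no` (network options) command. The sketch below is a dry run: it only prints the commands for the four parameters above, using the values from this post, so they can be reviewed before being run as root. The variable `TUNABLES` and the function `print_no_commands` are names made up for this illustration, and the values are what worked on this particular server, not general recommendations.

```shell
#!/bin/sh
# Dry-run sketch: prints the AIX `no` commands for the tunables discussed
# above instead of executing them. Nothing here changes the system.
# The values are the ones quoted in this post; adjust them for your own
# environment before running the printed commands as root.

TUNABLES="rfc1323=1 tcp_sendspace=266140 tcp_recvspace=266140 tcp_nodelayack=1"

print_no_commands() {
    for setting in $TUNABLES; do
        name=${setting%%=*}   # strip "=value" to get the tunable name

        # `no -o <name>` displays the current value of a tunable.
        echo "no -o $name"

        # `no -p -o <name>=<value>` changes the current value and also
        # records it so that it survives a reboot.
        echo "no -p -o $setting"
    done
}

print_no_commands
```

As far as I know, the buffer sizes and rfc1323 are only picked up by connections opened after the change, so the HTTP server and the OC4J containers need a restart before the new values have any effect. It is also worth checking with `ifconfig` whether interface-specific values are set on the network interface, since those can take precedence over the global ones.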