Diagnosing Oracle “reliable message” Waits

Waits on the “reliable message” event are cryptic by nature.  It is a general-purpose wait event that tracks many different types of channel communication within the Oracle database.  I’ve read some blogs that suggest it is a benign wait event that can be ignored.  In my experience, these waits are not benign and should not be ignored.  This post will show you how to decipher these events and resolve the underlying issue.

Here is what you might see in an AWR report:



Using Wireshark to Diagnose a Connection Drop Issue in Oracle

As a long-time performance DBA, I’ve often felt that it is important to know something about troubleshooting the layers upstream and downstream of the database in the technology stack.  Lately, I’ve been using packet captures and Wireshark to solve tough issues at the TCP layer.  We recently resolved a long-standing issue with TCP retransmissions that were causing connection drops between an application server and one of our databases, and I thought the approach might help others faced with similar issues.

This problem started with a series of TNS-12535 messages that were seen in the Oracle alert logs for one of our databases:


Investigating Linux Network Issues with netstat and nstat

One area that I’ve been spending quite a bit of time on lately is the TCP layer on our servers.  We have seen multiple issues that involve TCP, and it is an oft-overlooked area when troubleshooting.

There are two tools that I’d like to focus on today – netstat and nstat.  Both tools read from the following Linux files, which expose network-related statistics and SNMP counters:

/proc/net/netstat
/proc/net/snmp
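
To make the layout of these two files concrete, here is a minimal Python sketch that reads them directly.  It relies on the files being organized as pairs of lines: a header line of counter names followed by a line of values, both starting with the same prefix (for example "Tcp:").  The function names are my own and purely illustrative, and joining the prefix and counter name into one key (e.g. TcpRetransSegs) is just a convenient convention for this example.

```python
from pathlib import Path


def parse_proc_net_counters(text):
    """Parse the header/value line pairs found in /proc/net/snmp and /proc/net/netstat."""
    counters = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: "Prefix: name1 name2 ..." then "Prefix: val1 val2 ...".
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, _, names = header.partition(":")
        value_prefix, _, numbers = values.partition(":")
        if prefix != value_prefix:
            continue  # unexpected layout; skip the pair rather than guess
        for name, number in zip(names.split(), numbers.split()):
            # Key is prefix + counter name, e.g. "Tcp" + "RetransSegs".
            counters[f"{prefix}{name}"] = int(number)
    return counters


def read_counters(paths=("/proc/net/snmp", "/proc/net/netstat")):
    """Read and merge the counters from both files (Linux only)."""
    merged = {}
    for path in paths:
        merged.update(parse_proc_net_counters(Path(path).read_text()))
    return merged
```

On a Linux host, calling read_counters() and looking up keys such as "TcpOutSegs" and "TcpRetransSegs" gives the raw cumulative values since boot, so a rate requires sampling twice and taking the difference.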

Here is what the output of these two files looks like:
