“Result Cache: Channel” Reliable Message Waits

In this earlier post, I wrote about “reliable message” waits in Oracle and how to find out more about the channel that is being waited on.  One of these waits that we’ve been seeing with some frequency is on “Result Cache: Channel”.  In some cases, these waits have been debilitating and I believe it merits its own post, in case someone else runs into this issue.

Here is an example of one issue we experienced in 11.2.0.3.  Users were reporting a slow system and in looking at session waits, we saw long wait times for the “reliable message” event. In querying GV$CHANNEL_WAITS, we saw the following waits on instance 3 of a RAC database:

select inst_id, channel, messages_published, wait_count, WAIT_TIME_USEC/1000000 wait_time_sec
from GV$CHANNEL_WAITS
order by inst_id, wait_time_sec desc;
 INST
   ID CHANNEL                                                MESSAGES_PUBLISHED WAIT_COUNT WAIT_TIME_SEC
----- ------------------------------------------------------ ------------------ ---------- -------------
. . .
    3 Result Cache: Channel                                           474745796  473767959    267825.106
    3 obj broadcast channel                                             4928144    2643616      5065.185
    3 kxfp control signal channel                                       3349090    1140344       484.500
    3 MMON remote action broadcast channel                               401055       6999       266.940
    3 RBR channel                                                        320568      37809       137.223
. . .

Notice how much time is spent on the “Result Cache: Channel” waits.  Foreground processes were waiting on this and we needed to find out why, quickly.  Some fast research showed us that this particular channel wait is used to track time waited for synchronization of the result set caches.  Oracle tracks DML changes to the tables that are being result cached and it sends messages across the interconnect to all other instances to update their caches related to that table.  If those tables have a high DML (insert/update/delete) frequency, it can cause quite a bit of contention on the system.

The RESULT_CACHE Hint

However, to our knowledge, we were not deliberately using result set caching in this database – so where was it coming from?  I checked DBA_TABLES to validate that no tables were being result cached and none were.  I was aware of the RESULT_CACHE hint, so I ran a query against GV$SQLAREA to see if there were any queries using that particular hint.  Lo and behold – there was:

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default)
opt_param('parallel_execution_enabled', 'false')  result_cache */
  COUNT(C1) 
FROM
 (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "PR")  */ 1 AS C1 
  FROM ""."PRVSN_RQST"
  SAMPLE BLOCK(8.72684, 8) SEED(4)  "PR" 
  WHERE ("PR"."PRVSN_RQST_ID"=TO_NUMBER(:B1))) innerQuery

The Culprit?  SQL Tuning Advisor

Since this appeared to be an Oracle-generated query (based on the hints), we searched on MOS and found that this type of query comes from the SQL Tuning Advisor (STA).  Subsequent searching within Oracle Enterprise Manager (OEM) showed us that we did have some STA jobs that had been run recently:

STA

Apparently, one of our DBAs had been using the SQL Tuning Advisor to try to improve the performance of some of the queries.  However, because of this unknown side effect of the STA, it actually ended up causing performance issues.

The table involved has highly concurrent INSERT/UPDATE activity and the system was very busy trying to constantly update the result caches across all four nodes on this RAC cluster.  Consequently, foreground processes/sessions were spending a lot of time waiting on this cache synchronization.

Resolution

Resolving this issue was accomplished by doing two things:  first we flushed the result cache:

set serveroutput on

BEGIN
  IF dbms_result_cache.flush THEN
    dbms_output.put_line('Flush Successful');
  ELSE
    dbms_output.put_line('Flush Failure');
  END IF;
END;
/

Then, we told the DBAs not to use STA for tuning or that if they did, they would need to make sure they clean up after themselves afterwards.

According to MOS Doc ID 1951729.1, another option would be to disable the result cache altogether, using:

alter system set result_cache_max_size=0;

Possible Bug?

This definitely seems like buggy behavior to me – a tool that is supposed to help improve performance should not be able to cause performance issues. I’m not sure what value there is to the STA caching its results and I’m therefore not sure why they felt they needed to add that hint to the code they run.

Any insights are appreciated – please leave them in the comments.

One thought on ““Result Cache: Channel” Reliable Message Waits

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s