ActiveMQ redelivery plugin fails when client side redelivery is active

We recently had serious issues with the ActiveMQ redelivery plugin. Under high load, messages were not redelivered at all (according to the application logs); they went straight to the DLQ. In isolated tests redelivery worked, but not as intended: we would get six redeliveries within a few milliseconds and then nothing. Why?

It turned out to be a conflict between client-side redelivery and broker redelivery. Client-side redelivery kicked in and quickly failed six times. By the time the broker got a chance, the maximum redelivery count had already been reached, so the message was moved to the DLQ. See this post.

Following the advice in the post we added jms.redeliveryPolicy.maximumRedeliveries=0 to the broker URL, and voila! It worked.
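As an illustration, the option can also be appended to an existing broker URL in code. This is only a sketch using the plain JDK: the class and method names are hypothetical, and the URL tcp://localhost:61616 used in the usage note is an assumed example, not our actual setup.

```java
/** Hypothetical helper that disables client-side redelivery via the broker URL. */
final class BrokerUrls {
    // Option named in the post; 0 means the client never redelivers on its own.
    static final String NO_CLIENT_REDELIVERY =
            "jms.redeliveryPolicy.maximumRedeliveries=0";

    /**
     * Appends the option with '?' or '&' depending on whether the URL already
     * has a query part. Assumes a simple URL such as tcp://host:port; composite
     * failover URLs with nested options need more careful handling.
     */
    static String withoutClientRedelivery(String brokerUrl) {
        if (brokerUrl.contains(NO_CLIENT_REDELIVERY)) {
            return brokerUrl; // already configured, leave untouched
        }
        String separator = brokerUrl.contains("?") ? "&" : "?";
        return brokerUrl + separator + NO_CLIENT_REDELIVERY;
    }
}
```

Calling BrokerUrls.withoutClientRedelivery("tcp://localhost:61616") yields a URL that can be handed to the connection factory as usual, leaving redelivery entirely to the broker plugin.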

Categories: Java

Prevent hawtio from phoning home

Hawtio is bundled in several applications as a management console. In one project we are using it with JBoss EAP 6. However, when the application server starts, hawtio attempts to update itself from GitHub:


Performing a pull in git repository .hawtio/config on remote URL:
https://github.com/hawtio/hawtio-config.git.
Subsequent pull attempts will use debug logging
Failed to pull from the remote git repo with credentials null due:
https://github.com/hawtio/hawtio-config.git:
407 Proxy Authentication Required. This exception is ignored.

I certainly don’t want applications in production to update themselves dynamically with unforeseen effects, nor do I want them to phone home. What to do? Fortunately it is possible to control this, see the documentation. Simply add the following Java options:


-Dhawtio.offline=true
-Dhawtio.config.cloneOnStartup=false
-Dhawtio.config.pullOnStartup=false

Problem solved.

Categories: Java

Free toolset for JMS tests released

In 1996 I worked with IBM MQ Series using OS/2 and CICS. That was the first time I worked with performance tuning for messaging. Since then I have been involved in many projects with the same basic goal: make sure it works correctly (no lost messages under any circumstances) and make it fast. It has never been my full-time job, but as a consultant I have been there and done that a few times now. The more memorable have been webMethods, AQ, HornetQ, ActiveMQ/AMQ and (again) AQ. Every time I have ended up writing my own tools. Yes, every time. So, I’ll do it again, but this time I’ll release the code and make it work for as many JMS providers as possible. Or at least the ones I need.

You can find the project, now in 1.0, at GitHub. If you find it useful, please let me know. One major benefit of making the code open source is that the vendors always need to reproduce any issues on their own. With an open source tool that is easier. A widely used tool is even better, as it has some credibility from the start, so spread the word (and the code)!

If I have time during the summer I will write a series of posts on how to use the tools. Let me know if there is anything in particular I should cover!

Categories: Java, Performance

Debug XA transactions in JBoss EAP 6

How do you find out if XA transactions are actually working, and if there are problems, how do you find them? Apart from good tests, the answer is to read the logs. So, how do you enable detailed logs for XA transactions in JBoss EAP 6? Fire up the cli (jbossctl.sh cli) and run:


/subsystem=logging/logger=com.arjuna.ats.jta:add(level=TRACE)
/subsystem=logging/file-handler=ARJUNA:add(file=
 {"path"=>"arjuna.log",
  "relative-to"=>"jboss.server.log.dir"})
/subsystem=logging/file-handler=ARJUNA:read-resource
/subsystem=logging/logger=com.arjuna.ats.jta:assign-handler(name="ARJUNA")

This will log most of the relevant XA-related events to a new log file, arjuna.log. The noise ratio is very high, so the file will grow quickly. Be sure to disable logging after a while:


/subsystem=logging/logger=com.arjuna.ats.jta:write-attribute(
  name="level", value="WARN")

Sometimes it may be necessary to log even more. These are recommended by Red Hat:


/subsystem=logging/logger=org.jboss.jca:add(level=TRACE)
/subsystem=logging/logger=org.jboss.as.connector:add(level=TRACE)

Obviously that will make the log files grow even faster, so take care! Don’t try this on a busy system in production.

Categories: Java

JSF view scope with multiple tabs using JavaScript and sessionStorage

JSF applications often use view scope for server-side state. While that is convenient it can be problematic to handle multiple browser tabs. The backend has no way of knowing which tab the user is working with. The session cookie is global for the entire browser and any URL parameters or hidden parameters will be copied to new tabs. What to do? And even worse, what to do when you have to live with an ancient version of JSF (2.0 in our case)?

It is fairly easy to write a custom view scope on the server side, been there done that a few times. In order to pick the right beans it needs to identify the active tab. The browser will not help us, so as usual these days we must resort to JavaScript.

My first idea was to use the sessionStorage attribute. Add a script to the top of every page that looks for a window id in sessionStorage. If it is found, the script adds a hidden field with the id to all forms in the loaded page, registers a click handler that prevents navigation from links and instead sets the location manually with the id appended as a query parameter, overrides window.open to append the id as a query parameter in that case as well, and finally registers an ajaxPrefilter handler with jQuery to intercept AJAX requests and add the id as a request header. Oh, and it rewrites the history to remove the id from the browser’s address bar as well.

Combined with a servlet filter that denies requests lacking the window id by returning a short JavaScript function that sets it and retries, this worked well – in Chrome. Internet Explorer was not as helpful. According to the specification:

When a new top-level browsing context is created by cloning an existing browsing context, the new browsing context must start with the same session storage areas as the original, but the two sets must from that point on be considered separate, not affecting each other in any way.

The major browsers diverge in how they interpret that. To cut a long story short, my cunning plan failed.

Fortunately there is an attribute that is unique across browser tabs: window.name! I changed the code to store the unique id in window.name instead and voila! It worked.
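On the server side, the custom scope then only needs to map the window id to its own set of beans. The following is a minimal sketch of that idea, not the code we actually run: the class name, the method names and the idea of keeping one such store per HTTP session are all assumptions, and it is written for Java 8 or later.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical per-session store that keeps one set of beans per browser tab. */
final class WindowScopedBeans {
    // window id (the value kept in window.name) -> bean name -> bean instance
    private final Map<String, Map<String, Object>> beansByWindow =
            new ConcurrentHashMap<>();

    /** Returns the bean for this tab, creating it on first access. */
    @SuppressWarnings("unchecked")
    <T> T get(String windowId, String beanName, Supplier<T> factory) {
        Map<String, Object> beans =
                beansByWindow.computeIfAbsent(windowId, id -> new ConcurrentHashMap<>());
        return (T) beans.computeIfAbsent(beanName, name -> factory.get());
    }

    /** Drops all beans for a tab, for example when the tab is known to be gone. */
    void discard(String windowId) {
        beansByWindow.remove(windowId);
    }
}
```

Two tabs with different window ids get separate bean instances for the same bean name, which is exactly what the session cookie alone cannot provide.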

Lesson learned – as usual the browsers can be trusted not to be consistent. Take care with sessionStorage.

Categories: Java

Oracle AQ JMS Performance

Introduction

AQ is Oracle’s message queue implementation. Well, one of them. It supports JMS 1.1 and is included in all versions of the Oracle database, even the free version. It has been battle-tested for twenty years and last time I checked (don’t take my word for it) it required no extra license. What’s not to like?

AQ uses normal database constructs such as tables and SQL commands. That has several advantages, not least that the normal database performance tuning tools and methods can be used. Plus it makes it very easy to instrument and manipulate queues programmatically.

For the most part, tuning AQ is nothing special. However, there are a few dark corners. Stay tuned.

Basic configuration

First of all we need a user with AQ privileges and quota on a tablespace:


create user aqtest identified by whatever
  quota unlimited on users default tablespace users;
grant aq_administrator_role to aqtest;
grant create session to aqtest;

With a user in place we can create a queue table and a queue (as aqtest):


begin
  dbms_aqadm.create_queue_table(
    queue_table        => 'test_qtab',
    queue_payload_type => 'sys.aq$_jms_message',
    storage_clause     =>
    'lob (user_data.bytes_lob) store as securefile ' ||
    '(retention none cache) ' ||
    'lob (user_data.text_lob) store as securefile  ' ||
    '(retention none cache) '  ||
    'opaque type user_prop store as securefile ' ||
    'lob (retention none cache)');
  dbms_aqadm.create_queue(
    queue_name             => 'test_queue',
    queue_table            => 'test_qtab',
    max_retries            => 1,
    retry_delay            => 30,
    retention_time         => 0);
  dbms_aqadm.start_queue (queue_name => 'test_queue');
end;
/

Note the storage clause. It uses securefile (after all we’re in 2016 now), retention none as a message is read exactly once most of the time and cache as most messages are read and deleted almost immediately when posted. Keeping the data in memory makes sense.

Block sizes

Oracle can use many different block sizes. The default block size is typically 8k. Write-heavy applications can often benefit from smaller block sizes, as that reduces contention. The smallest reasonable block size for AQ is 4k:


alter system set db_4k_cache_size=100M scope=both;
create tablespace users4k datafile '/oradata/orcl/users4k.dbf'
  size 100M autoextend on next 5M extent management local
  segment space management auto;
alter user aqtest quota unlimited on users4k;

Recreate the queue:


begin
  dbms_aqadm.stop_queue (queue_name => 'test_queue');
  dbms_aqadm.drop_queue (queue_name => 'test_queue');
  dbms_aqadm.drop_queue_table (queue_table => 'test_qtab');

  dbms_aqadm.create_queue_table(
    queue_table        => 'test_qtab',
    queue_payload_type => 'sys.aq$_jms_message',
    storage_clause     => 'tablespace users4k ' ||
    'lob (user_data.bytes_lob) store as securefile ' ||
    '(retention none cache) ' ||
    'lob (user_data.text_lob) store as securefile  ' ||
    '(retention none cache) '  ||
    'opaque type user_prop store as securefile ' ||
    'lob (retention none cache)');
  dbms_aqadm.create_queue(
    queue_name             => 'test_queue',
    queue_table            => 'test_qtab',
    max_retries            => 1,
    retry_delay            => 30,
    retention_time         => 0);
  dbms_aqadm.start_queue (queue_name => 'test_queue');
end;
/

AQ can store the message payload inline (i.e. in the same row as the metadata) or separately. However, it will only be stored inline if the size of the payload is less than about 4000 bytes. All message queue implementations work best with small messages, but here a small difference in size can theoretically have a significant impact. In practice I haven’t seen any major effects though, up to 3% in some benchmarks and none in others. As usual in engineering it all depends – where is the bottleneck?

If the application uses larger messages, the tablespace for the payload can use a block size optimized for the average message size, making it more likely that the whole message fits into a single block.

Block compression

It is possible to compress the payload and the user properties, and by all means the metadata as well. That costs CPU, but on the other hand it reduces the storage requirements, and perhaps it makes the difference between using one or two blocks for a message?

It is very easy to add. Simply tack on a compress clause when creating the queue table:


dbms_aqadm.create_queue_table(
  queue_table        => 'test_qtab',
  queue_payload_type => 'sys.aq$_jms_message',
  storage_clause     => 'tablespace users4k '    ||
  'lob (user_data.bytes_lob) store as securefile '  ||
  '  (retention none cache compress low) '          ||
  'lob (user_data.text_lob) store as securefile '   ||
  '  (retention none cache compress low) '          ||
  'opaque type user_prop store as securefile lob '  ||
  '  (retention none cache compress low)');

Note that this almost certainly requires an extra license.

Driver versions

AQ JMS uses aqapi.jar and Oracle’s JDBC driver. While most versions are likely to work there can be a tremendous difference between the ancient versions and the latest ones. Make sure that the drivers are current and preferably matched (i.e. for the same target database)!

The JDBC driver can be downloaded from Oracle, but as far as I know aqapi.jar ships with the database and with Oracle’s application servers. Note that the database ships with two versions, one that can be used standalone and one that is intended for WebLogic or Oracle Application Server! Use the correct one.

Receive timeout

The way AQ JMS handles a receive timeout can be a real killer for applications that need to use many threads processing messages from the same queue. That equates to most Java EE applications.

When a client calls receive a single SELECT is issued in order to check if there are any messages on the queue. If there are no messages the client goes to sleep. If a message arrives the client wakes up, issues the same SELECT again and ideally consumes the message.

Unfortunately if there are 100 clients waiting for a message on a queue Oracle wakes them all when a message arrives. They will compete for it. One will succeed, the other 99 will fail. In this case the database needs to process 100 simultaneous SELECT statements and only one actually returns any data. The other 99 represent wasted resources. This can really kill the database, as it consumes large amounts of CPU.

A good test case for this is to spin up 100 consumer threads and one producer thread; then watch the load on the database and the top SQL. There should be an easy to find query similar to:

select  /*+ INDEX(TAB AQ$_TEST_QUEUE_TAB_I) */   tab.rowid, tab.msgid, tab.corrid, tab.priority, tab.delay,   tab.expiration ,tab.retry_count, tab.exception_qschema,   tab.exception_queue, tab.chain_no, tab.local_order_no, tab.enq_time,   tab.time_manager_info, tab.state, tab.enq_tid, tab.step_no,   tab.sender_name, tab.sender_address, tab.sender_protocol,   tab.dequeue_msgid, tab.user_prop, tab.user_data   from "AQTEST"."TEST_QUEUE_TAB" tab  where q_name = :1 and (state = :2  )  order by q_name, state, enq_time, step_no, chain_no, local_order_no for update skip locked

Check the statistics in Enterprise Manager or with SQL:


select rows_processed / executions rows_per_execution
from v$sqlarea where sql_text like
'select  /*+ INDEX(TAB AQ$_TEST_QUEUE_TAB_I) */   tab.rowid,%';

The number of rows returned per execution is very low, less than 0.01. The CPU load on the other hand is substantial.

The simple solution is to keep the number of threads down, but that is seldom possible. A more realistic alternative is to sleep on the client side. Use receiveNoWait instead of receive, and if the method returns without a message, sleep for a short time in Java before the next attempt. At peak load all threads will get messages, so no time is wasted sleeping, and when there is less work available most threads will spend their time sleeping in the application server, not performing DoS attacks on the database. Ideally the number of listeners should ramp up and down based on traffic as well.
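The client-side sleep can be sketched roughly as follows. To keep the example free of dependencies, the JMS consumer is hidden behind a small hypothetical interface, MessageSource, whose poll method is assumed to wrap receiveNoWait; the class name and the sleep interval are likewise assumptions, and transaction handling is left out.

```java
import java.util.function.Consumer;

/** Hypothetical polling loop: non-blocking receive plus a client-side sleep when idle. */
final class PollingReceiver<T> {
    /** Stand-in for a JMS consumer; poll() is assumed to wrap receiveNoWait(). */
    interface MessageSource<T> {
        T poll() throws Exception; // returns null when the queue is empty
    }

    private final MessageSource<T> source;
    private final Consumer<T> handler;
    private final long idleSleepMillis;
    private volatile boolean running = true;

    PollingReceiver(MessageSource<T> source, Consumer<T> handler, long idleSleepMillis) {
        this.source = source;
        this.handler = handler;
        this.idleSleepMillis = idleSleepMillis;
    }

    /** One attempt: handle a message if there is one, otherwise sleep in the JVM. */
    boolean pollOnce() throws Exception {
        T message = source.poll();
        if (message != null) {
            handler.accept(message);
            return true; // busy: try again immediately, no sleep
        }
        Thread.sleep(idleSleepMillis); // idle threads wait here, not in the database
        return false;
    }

    /** Polls until stop() is called from another thread. */
    void run() throws Exception {
        while (running) {
            pollOnce();
        }
    }

    void stop() {
        running = false;
    }
}
```

With a real consumer the source would be something like consumer::receiveNoWait. The sleep interval is a trade-off: a longer sleep means fewer SELECT statements against an empty queue but higher latency for the first message after a quiet period.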

Dynamic destinations

In JMS a destination can be looked up with JNDI, or it can be created dynamically. For example:


Destination testQueue = session.createQueue("test_queue");

This is convenient, but there is a price to pay. Every time the method is called it issues a SELECT in order to find the destination. An application that creates a dynamic destination when it posts a message will do it for every single message. That adds up.

See https://www.javacodegeeks.com/2013/04/jms-and-spring-small-things-sometimes-matter.html for a more in-depth discussion.

What to do? Fortunately Destination is thread-safe, so it can be cached. With Spring the JndiDestinationResolver can be used with cache=true and fallbackToDynamicDestination=true. It will fail to make a JNDI lookup, do the dynamic lookup and then cache the result. Without Spring, use an application cache, for example a ConcurrentHashMap.
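Without Spring, an application cache along these lines might do. It is a sketch under assumptions: the class name and the Resolver interface are hypothetical, and the resolver is assumed to wrap the expensive call, for example session.createQueue(name).

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical application-level cache for dynamically created destinations. */
final class DestinationCache<D> {
    /** Stand-in for the expensive lookup, e.g. name -> session.createQueue(name). */
    interface Resolver<D> {
        D resolve(String name) throws Exception;
    }

    private final ConcurrentHashMap<String, D> cache = new ConcurrentHashMap<>();
    private final Resolver<D> resolver;

    DestinationCache(Resolver<D> resolver) {
        this.resolver = resolver;
    }

    /** Resolves each destination name at most once; later calls only hit the map. */
    D get(String name) {
        return cache.computeIfAbsent(name, key -> {
            try {
                return resolver.resolve(key);
            } catch (Exception e) {
                // computeIfAbsent cannot throw checked exceptions, so wrap them
                throw new IllegalStateException("Cannot resolve destination " + key, e);
            }
        });
    }
}
```

The SELECT is then issued once per destination name instead of once per message. A failed lookup is not cached, so the next call simply tries again.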

Ordered delivery

An application server that processes messages from the same queue in parallel using multiple threads really can’t guarantee that the messages are processed in order. Even if they are delivered in the same order as they were sent, they will be processed in parallel and will complete in non-deterministic order. However, by default Oracle ensures that messages on a queue are delivered in order. That can be a bit expensive, in particular if the application is using selectors.

Set the system property oracle.jms.orderWithSelector=false in order to cut corners here. This is unlikely to give a large boost, but unless ordered delivery is required it is a quick hit.
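The property can be passed on the command line as -Doracle.jms.orderWithSelector=false or set in code. The snippet below is a small sketch with a hypothetical class name; that the property has to be set before the first AQ JMS connection is created is my assumption, not something I have verified.

```java
/** Hypothetical startup hook for AQ JMS related system properties. */
final class AqJmsTuning {
    static final String ORDER_WITH_SELECTOR = "oracle.jms.orderWithSelector";

    /** Assumed to be called early, before any AQ JMS connection is created. */
    static void relaxOrderingWithSelectors() {
        System.setProperty(ORDER_WITH_SELECTOR, "false");
    }
}
```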

Final words

Tuning AQ is mostly about tuning the database. It is also about tuning the Java side and finding where the bottlenecks are. In other words it is fun! Oracle 12c comes with support for sharded queues – I haven’t had the opportunity to test them in a real-world scenario yet, but I look forward to that as they offer horizontal scaling for a single queue with RAC. I’ll be back.

Categories: Java, Oracle, Performance

VirtualBox intermittent network outages

For about a year I have had terrible problems with short but very frequent network outages between Linux guests running in VirtualBox and the world at large at a customer site. At home it works, but there the connection is lost and then restored every few minutes. Very frustrating. Today I finally found a solution. It appears that it is a known bug that goes way back, see ticket 13839. Sure enough, changing the virtual network card to PCnet-PCI II solved the issue! No more outages.

Categories: Networking