I recently helped a customer tune a slow JPQL query. It turned out to be at least partially due to Hibernate’s implementation of fetch joins. The problem is generic, so it is a good thing to be aware of.
The query was supposed to return entities one page at a time and several relations, including collections, had to be initialized. That was done using fetch joins in order to keep the number of database round-trips down.
When I checked the SQL statement I quickly noticed that it had no LIMIT clause. Hibernate normally uses the LIMIT clause with MySQL for paging, but not this time. Why?
The specification for JPA 1 (JSR 220) states that the effect of applying setMaxResults or setFirstResult to a query involving fetch joins over collections is undefined. Hibernate fetches the records and performs paging in memory, not in the database. As this tends to be expensive, a warning is logged:
WARNING: firstResult/maxResults specified with collection fetch; applying in memory!
Well, that was clear enough. But why? Poor thinking? No, rather a difficult problem. Hibernate wants to get a certain number of entities per page, not a certain number of database rows. With fetch joins for collections there is no way to know how many database rows to get for one entity, so what to do? Read database rows and pile them up in memory until the page size has been reached (including pages that have been visited before and are not returned for this request). At least that works.
So, as the specification and the Hibernate documentation advises, stay clear of fetch joins for collections if you want to use paging (or the other way around)!