Lazily read lines from gzip file with Java 8 streams

Java 8 can read lines lazily from a file with the Files.lines method, but what if the file is compressed? I needed exactly that, so here is a class that provides a lazy stream of lines from a gzipped file. It is easy to adapt to other compression formats by swapping GZIPInputStream for another decompressing stream.


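import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

public class GZIPFiles {
  /**
   * Get a lazily loaded stream of lines from a gzipped file, similar to
   * {@link Files#lines(java.nio.file.Path)}. Bytes are decoded as UTF-8,
   * as Files.lines does. The caller must close the returned stream, for
   * example with try-with-resources, to release the underlying file.
   * 
   * @param path
   *          The path to the gzipped file.
   * @return stream with lines.
   * @throws UncheckedIOException
   *           if the file cannot be opened or is not in gzip format.
   */
  public static Stream<String> lines(Path path) {
    InputStream fileIs = null;
    BufferedInputStream bufferedIs = null;
    GZIPInputStream gzipIs = null;
    try {
      fileIs = Files.newInputStream(path);
      // GZIPInputStream has its own buffer, but it reads the header one
      // byte at a time, so put a buffer between it and the file.
      bufferedIs = new BufferedInputStream(fileIs, 65535);
      gzipIs = new GZIPInputStream(bufferedIs);
    } catch (IOException e) {
      // Opening failed part-way: close whatever was created so far.
      closeSafely(gzipIs);
      closeSafely(bufferedIs);
      closeSafely(fileIs);
      throw new UncheckedIOException(e);
    }
    // Decode explicitly as UTF-8 instead of the platform default charset.
    BufferedReader reader = new BufferedReader(
        new InputStreamReader(gzipIs, StandardCharsets.UTF_8));
    // Closing the stream closes the reader, which closes the wrapped streams.
    return reader.lines().onClose(() -> closeSafely(reader));
  }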

  private static void closeSafely(Closeable closeable) {
    if (closeable != null) {
      try {
        closeable.close();
      } catch (IOException e) {
        // Ignore
      }
    }
  }
}
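
As a rough sketch of how the class might be used and adapted to other formats, the example below passes the decompressing stream in as a parameter. The names CompressedLines, Decoder and firstTwoLines are made up for this illustration and are not part of the class above; the 64 KiB buffer and UTF-8 decoding are assumptions as well.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedLines {
  /** Hypothetical hook: wraps a raw stream in a decompressing one. */
  @FunctionalInterface
  interface Decoder {
    InputStream wrap(InputStream raw) throws IOException;
  }

  /** Same idea as GZIPFiles.lines, with the decompressor passed in. */
  static Stream<String> lines(Path path, Decoder decoder) {
    InputStream raw = null;
    try {
      raw = Files.newInputStream(path);
      InputStream decoded = decoder.wrap(new BufferedInputStream(raw, 64 * 1024));
      BufferedReader reader = new BufferedReader(
          new InputStreamReader(decoded, StandardCharsets.UTF_8));
      return reader.lines().onClose(() -> closeSafely(reader));
    } catch (IOException e) {
      // Opening failed: close the file stream so the handle is not leaked.
      closeSafely(raw);
      throw new UncheckedIOException(e);
    }
  }

  private static void closeSafely(Closeable closeable) {
    if (closeable != null) {
      try {
        closeable.close();
      } catch (IOException e) {
        // Ignore
      }
    }
  }

  /** Round trip: writes a small gzip file, then reads the first two lines back. */
  static List<String> firstTwoLines() throws IOException {
    Path file = Files.createTempFile("demo", ".gz");
    try {
      try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(file))) {
        out.write("alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8));
      }
      // try-with-resources runs the onClose handler, which closes the file.
      try (Stream<String> stream = lines(file, GZIPInputStream::new)) {
        return stream.limit(2).collect(Collectors.toList());
      }
    } finally {
      Files.deleteIfExists(file);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(firstTwoLines());
  }
}
```

Other formats should only need a different Decoder, for example InflaterInputStream::new for raw zlib data. The important part for callers is the try-with-resources block: the stream is lazy, so the file stays open until the stream is closed.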

  1. Martin
    2018-06-13 at 11:01

    It is not necessary to close all the wrapped streams. Closing the outermost stream closes the others too.

  2. 2018-06-13 at 20:46

    Indeed, on success I only close the outermost reader. However, if an exception is thrown while the streams are being opened, some of them may still be null, so which one is the outermost? I could check for null and close the first one that is not null, but instead I simply close all that are not null. That is easier and it doesn’t hurt.
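
    To illustrate the point made in this thread, here is a small sketch that checks whether closing only the outermost reader reaches the innermost stream. It works in memory rather than on a file, and CloseChainDemo and TrackingStream are hypothetical names invented for this check.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CloseChainDemo {
  /** Hypothetical helper: remembers whether close() was called on it. */
  static final class TrackingStream extends FilterInputStream {
    boolean closed = false;

    TrackingStream(InputStream in) {
      super(in);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  /** Gzips a string in memory so the demo needs no file. */
  static byte[] gzip(String text) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return bytes.toByteArray();
  }

  /** Returns true if closing only the outermost reader closed the innermost stream. */
  static boolean outermostCloseReachesInnermost() throws IOException {
    TrackingStream innermost =
        new TrackingStream(new ByteArrayInputStream(gzip("a\nb\n")));
    BufferedReader reader = new BufferedReader(new InputStreamReader(
        new GZIPInputStream(new BufferedInputStream(innermost)),
        StandardCharsets.UTF_8));
    reader.readLine();
    // Only the outermost wrapper is closed explicitly.
    reader.close();
    return innermost.closed;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(outermostCloseReachesInnermost()); // prints "true"
  }
}
```

    This only covers the success path. When opening fails part-way there is no outermost wrapper yet, which is why the catch block in the post closes every stream that was created.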
