Lazily read lines from a gzip file with Java 8 streams
Java 8 can read lines lazily from a file with the Files.lines method, but that does not help when the file is compressed. I needed to do just that, so here is a class that provides a lazily loaded stream of lines from a gzipped file. It can easily be adapted to other compressed file formats!
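import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

public class GZIPFiles {

    /**
     * Get a lazily loaded stream of lines from a gzipped file, similar to
     * {@link Files#lines(java.nio.file.Path)}. The returned stream must be
     * closed by the caller to release the underlying file.
     *
     * @param path
     *            The path to the gzipped file.
     * @return stream with lines.
     */
    public static Stream<String> lines(Path path) {
        InputStream fileIs = null;
        BufferedInputStream bufferedIs = null;
        GZIPInputStream gzipIs = null;
        try {
            fileIs = Files.newInputStream(path);
            // Even though GZIPInputStream has a buffer, it reads individual bytes
            // when processing the header, so add a buffer in between.
            bufferedIs = new BufferedInputStream(fileIs, 65535);
            gzipIs = new GZIPInputStream(bufferedIs);
        } catch (IOException e) {
            // Some of these may still be null, so close whichever were opened.
            closeSafely(gzipIs);
            closeSafely(bufferedIs);
            closeSafely(fileIs);
            throw new UncheckedIOException(e);
        }
        // Decode as UTF-8, like Files.lines(Path), instead of the platform default charset.
        BufferedReader reader = new BufferedReader(new InputStreamReader(gzipIs, StandardCharsets.UTF_8));
        // Closing the stream closes the reader, which closes the wrapped streams.
        return reader.lines().onClose(() -> closeSafely(reader));
    }

    private static void closeSafely(Closeable closeable) {
        if (closeable != null) {
            try {
                closeable.close();
            } catch (IOException e) {
                // Ignore
            }
        }
    }
}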
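Because the file is only released by the onClose handler, the returned stream should be closed by the caller, just like the one from Files.lines. Here is a minimal sketch of how that might look with try-with-resources. The class GZIPFilesUsage and the helpers firstLines and writeGzip are hypothetical names made up for this example, and the lines method inside it is a simplified stand-in for GZIPFiles.lines (assuming UTF-8 text) so that the example compiles on its own.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GZIPFilesUsage {

    // Simplified stand-in for GZIPFiles.lines(Path); assumes UTF-8 text.
    static Stream<String> lines(Path path) {
        try {
            InputStream file = Files.newInputStream(path);
            try {
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(new GZIPInputStream(file), StandardCharsets.UTF_8));
                return reader.lines().onClose(() -> close(reader));
            } catch (IOException e) {
                // The gzip header could not be read: release the file before failing.
                file.close();
                throw e;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void close(Closeable closeable) {
        try {
            closeable.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: read at most n lines, then close the stream.
    static List<String> firstLines(Path path, int n) {
        // try-with-resources runs the onClose handler even if the pipeline throws.
        try (Stream<String> lines = lines(path)) {
            return lines.limit(n).collect(Collectors.toList());
        }
    }

    // Hypothetical helper: write text to a gzipped file so there is something to read.
    static void writeGzip(Path path, String text) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(path))) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sample", ".txt.gz");
        writeGzip(file, "alpha\nbeta\ngamma\n");
        System.out.println(firstLines(file, 2)); // prints [alpha, beta]
        Files.delete(file);
    }
}
```

Without the try-with-resources block (or an explicit call to close()), the onClose handler never runs, so the file handle may stay open longer than intended.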
It is not necessary to close all the wrapped streams. Closing the outermost stream closes the others too.
Indeed, on success I’m only closing the outermost reader. However, if there is an exception when I open the streams some may be null, so which one is the outermost? I could check for null and close the first that is not null, but instead I simply close all that are not null. That is easier and it doesn’t hurt.
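For comparison, here is a rough sketch of the alternative mentioned above: tracking the outermost stream opened so far in a single variable and closing only that one on failure. This is not the code from the post. The class name GzipLinesSketch and the variable outermost are made up for illustration, and UTF-8 is assumed for decoding.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

final class GzipLinesSketch {

    static Stream<String> lines(Path path) {
        // Always points at the outermost stream that was opened successfully.
        Closeable outermost = null;
        try {
            InputStream fileIs = Files.newInputStream(path);
            outermost = fileIs;
            InputStream bufferedIs = new BufferedInputStream(fileIs, 65535);
            outermost = bufferedIs;
            InputStream gzipIs = new GZIPInputStream(bufferedIs);
            outermost = gzipIs;
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(gzipIs, StandardCharsets.UTF_8));
            return reader.lines().onClose(() -> closeSafely(reader));
        } catch (IOException e) {
            // Closing the outermost opened stream also closes everything it wraps.
            closeSafely(outermost);
            throw new UncheckedIOException(e);
        }
    }

    private static void closeSafely(Closeable closeable) {
        if (closeable != null) {
            try {
                closeable.close();
            } catch (IOException e) {
                // Ignore
            }
        }
    }

    public static void main(String[] args) {
        // Prints the lines of the gzipped file given as the first argument, if any.
        if (args.length > 0) {
            try (Stream<String> lines = lines(Paths.get(args[0]))) {
                lines.forEach(System.out::println);
            }
        }
    }
}
```

Both variants release the same resources. This one trades the three null checks for one extra assignment per stream, so which is easier to read is largely a matter of taste.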