Proper way to read InputStream to byte array


There are many ways to accomplish this, but this one does not use any external dependencies such as Apache Commons.

Two common pitfalls that I see: people forget to flush the ByteArrayOutputStream, and they call ‘baos.write(buffer)’ instead of ‘baos.write(buffer, 0, read)’ without clearing the buffer first. The second mistake writes the whole buffer on every iteration, so whenever a read returns fewer bytes than the buffer holds (typically the last one), stale bytes left over from the previous read are appended to the output.

	private String extract(InputStream inputStream) throws IOException {
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
		byte[] buffer = new byte[1024];
		int read = 0;
		// read() returns the number of bytes placed in the buffer, or -1 at end of stream
		while ((read = inputStream.read(buffer, 0, buffer.length)) != -1) {
			// write only the bytes that were actually read in this iteration
			baos.write(buffer, 0, read);
		}
		baos.flush();
		return new String(baos.toByteArray(), "UTF-8");
	}
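
To make the second pitfall concrete, here is a minimal sketch that runs both write calls over a 10-byte input with a deliberately small 4-byte buffer. The class name `BufferPitfallDemo` and its helper `copy` are made up for this illustration; everything else comes from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class BufferPitfallDemo {

	// Hypothetical helper: copies the stream with a tiny buffer.
	// When 'correct' is false it reproduces the baos.write(buffer) mistake.
	static String copy(InputStream in, boolean correct) throws IOException {
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
		byte[] buffer = new byte[4];
		int read;
		while ((read = in.read(buffer, 0, buffer.length)) != -1) {
			if (correct) {
				baos.write(buffer, 0, read); // only the bytes just read
			} else {
				baos.write(buffer); // the whole buffer, stale bytes included
			}
		}
		return new String(baos.toByteArray(), StandardCharsets.UTF_8);
	}

	public static void main(String[] args) throws IOException {
		byte[] data = "0123456789".getBytes(StandardCharsets.UTF_8);
		System.out.println(copy(new ByteArrayInputStream(data), true));  // prints 0123456789
		System.out.println(copy(new ByteArrayInputStream(data), false)); // prints 012345678967
	}
}
```

The last read returns only two bytes ("89"), but the buffer still holds "67" from the previous iteration, which is why the faulty variant ends in "8967".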

7 thoughts on “Proper way to read InputStream to byte array”

  1. You can shorten the code and move the declaration of ‘read’ into the scope of the loop using the following trick (turns out for-loops don’t need actual initializer values in the first clause):

    for (int read; (read = inputStream.read(buffer, 0, buffer.length)) != -1; ) {
        baos.write(buffer, 0, read);
    }

  2. BTW, ByteArrayOutputStream doesn’t override flush(), and its superclass OutputStream has an empty body in its definition of flush(), so there’s technically no need to flush a ByteArrayOutputStream. If it is ever needed in some future implementation, I would hope that toByteArray() would call flush() itself.

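  3. Also, the following is faster (it avoids a copy operation) if you know the length of the InputStream. Note that a single read() call may return fewer bytes than requested, so the read has to be repeated until the array is full or the stream ends:

    // Read the file contents into a byte[] array
    final byte[] contents = new byte[lengthBytes];
    int bytesRead = 0;
    // Keep reading until the array is full or the stream ends
    for (int n; bytesRead < lengthBytes && (n = inputStream.read(contents, bytesRead, lengthBytes - bytesRead)) != -1; ) {
        bytesRead += n;
    }
    // For safety, truncate the array if the file was truncated before we finish reading it
    final byte[] contentsRead = bytesRead == lengthBytes ? contents : Arrays.copyOf(contents, bytesRead);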

  4. Thanks for the blog, very helpful.
    I decided to use a tad more than InputStream#available() as the initial buffer size, and I’m sizing the ByteArrayOutputStream to the same size. This should cover the initial read plus some more that may become available in the meantime, avoiding an unnecessary copy/allocation for just a few bytes of overhang. Note that one must not risk an integer overflow (even though it’s unlikely, correctness should be paramount).
    I’m also using “try with resources”, just in case ByteArrayOutputStream is changed in the future to require a close. Again, better safe than sorry. The JIT compiler will eliminate calls to empty methods anyway, so I’m also leaving the flush() in the code. However, as we’re consuming all of the InputStream, it should also be closed, so I declared a new reference to it in the try statement. Here’s the result:

    public static byte[] toByteArray(final InputStream input) throws IOException {
        // available() may return 0, and a zero-length buffer would make read() return 0 forever,
        // so enforce a minimum size; the double arithmetic plus Math.min avoids an int overflow
        final int size = (int) Math.min(Integer.MAX_VALUE, Math.max(1024, input.available() * 1.25));
        final byte[] buffer = new byte[size];
        try (final ByteArrayOutputStream baos = new ByteArrayOutputStream(buffer.length);
                final InputStream autoCloseInputStream = input) { // because it is fully consumed here
            for (int read; (read = autoCloseInputStream.read(buffer, 0, buffer.length)) != -1;) {
                baos.write(buffer, 0, read);
            }
            baos.flush();
            return baos.toByteArray();
        }
    }
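
The known-length approach from comment 3 depends on how many bytes each `read` call actually delivers, and a stream is allowed to return fewer than requested. Below is a minimal sketch of that behaviour, assuming a made-up `TrickleInputStream` wrapper that hands out at most 3 bytes per call. Both it and `readFully` are hypothetical names used for illustration, not JDK classes.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class ShortReadDemo {

	// Hypothetical stream that never returns more than 3 bytes per read call,
	// imitating a socket or pipe that delivers data in small pieces.
	static class TrickleInputStream extends FilterInputStream {
		TrickleInputStream(InputStream in) {
			super(in);
		}

		@Override
		public int read(byte[] b, int off, int len) throws IOException {
			return super.read(b, off, Math.min(len, 3));
		}
	}

	// Keeps reading until 'length' bytes have arrived or the stream ends.
	static byte[] readFully(InputStream in, int length) throws IOException {
		byte[] contents = new byte[length];
		int total = 0;
		while (total < length) {
			int n = in.read(contents, total, length - total);
			if (n == -1) {
				break; // stream ended before 'length' bytes arrived
			}
			total += n;
		}
		return total == length ? contents : Arrays.copyOf(contents, total);
	}

	public static void main(String[] args) throws IOException {
		byte[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

		// A single read call fills only part of the array.
		int single = new TrickleInputStream(new ByteArrayInputStream(data)).read(new byte[data.length]);
		System.out.println(single); // prints 3

		// Looping collects everything.
		byte[] all = readFully(new TrickleInputStream(new ByteArrayInputStream(data)), data.length);
		System.out.println(Arrays.equals(all, data)); // prints true
	}
}
```

With an in-memory or local file stream a single read often happens to fill the array, which is why the problem tends to show up only with sockets, pipes, or compressed streams.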
