Asynchronous downloads from Google Storage with Kotlin
Google has numerous Java libraries for almost all APIs on Google Cloud. Most of these libraries use gRPC APIs, which are very easy to integrate with Kotlin Coroutines because gRPC clients return ListenableFutures.
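To illustrate why future-returning clients fit coroutines so well, here is a minimal sketch of bridging a future to a suspending function using only the Kotlin standard library. java.util.concurrent.CompletableFuture stands in for Guava's ListenableFuture to keep the example dependency-free, and awaitResult is a hypothetical name chosen for illustration. In a real project, the kotlinx-coroutines-guava module offers a ready-made await() extension for ListenableFuture.

```kotlin
import java.util.concurrent.CompletableFuture
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.suspendCoroutine

// Hypothetical helper: suspends the coroutine until the future completes,
// without blocking the calling thread. Cancellation is not handled here,
// which is one reason to prefer the library-provided await() in practice.
suspend fun <T> CompletableFuture<T>.awaitResult(): T =
    suspendCoroutine { continuation ->
        whenComplete { value, error ->
            if (error != null) {
                continuation.resumeWithException(error)
            } else {
                continuation.resume(value)
            }
        }
    }
```

Inside any suspending function, a call such as CompletableFuture.supplyAsync { "response" }.awaitResult() then reads like ordinary sequential code, while the thread stays free until the result arrives.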
Unfortunately, Google Storage doesn’t have a gRPC API and its Java library uses a blocking HTTP client under the hood. 😪
Luckily there is an elegant way to work around the issue! 😉
First, let’s start with how the Google Storage library is normally used. The following code sample reads a blob in chunks using a buffer:
val storage = ... // instance of com.google.cloud.storage.Storage
val blobInfo = ...

val reader = storage.reader(blobInfo.blobId)
val buffer = ByteBuffer.allocate(bufferSize)
while (true) {
    buffer.clear()
    val bytesRead = withContext(Dispatchers.IO) {
        reader.read(buffer)
    }
    buffer.flip()
    if (bytesRead <= 0) {
        break
    }
    // do stuff
}
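Unfortunately, reader is an instance of java.nio.channels.ReadableByteChannel, which is blocking! Wrapping the call in withContext(Dispatchers.IO) only moves the blocking onto another thread; it does not make the read asynchronous.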
Luckily, there is an elegant workaround! Google Storage supports signed URLs, and the Storage Java library can generate them. The only thing left is to use an asynchronous HTTP client. Of course we’ll use Ktor! Here is what the same code sample looks like:
val storage = ... // instance of com.google.cloud.storage.Storage
val blobInfo = ...
val httpClient = ... // Ktor client

// create a read-only signed URL valid for 1 hour
val blobUrl = storage.signUrl(
    blobInfo,
    1,
    TimeUnit.HOURS,
    // HttpMethod here is com.google.cloud.storage.HttpMethod, not Ktor's
    Storage.SignUrlOption.httpMethod(HttpMethod.GET),
    Storage.SignUrlOption.withV4Signature()
)

val reader = httpClient.get<ByteReadChannel>(blobUrl)
reader.read(min = bufferSize) { buffer ->
    // do stuff
}
Here, reader is an instance of io.ktor.utils.io.ByteReadChannel, which is asynchronous!
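To process the whole blob rather than a single chunk, the read has to be repeated until the channel is exhausted. Below is a minimal sketch of such a loop, assuming Ktor's readAvailable on ByteReadChannel, which suspends instead of blocking and returns -1 once the channel is closed. forEachChunk and its onChunk callback are hypothetical names made up for illustration; they are not part of Ktor.

```kotlin
import io.ktor.utils.io.*

// Hypothetical helper (not part of Ktor): passes each chunk to onChunk together
// with the number of valid bytes in it, and returns the total number of bytes read.
suspend fun ByteReadChannel.forEachChunk(
    bufferSize: Int,
    onChunk: (chunk: ByteArray, length: Int) -> Unit
): Long {
    val buffer = ByteArray(bufferSize)
    var total = 0L
    while (true) {
        // Suspends (does not block a thread) until some bytes are available.
        val bytesRead = readAvailable(buffer, 0, buffer.size)
        if (bytesRead < 0) break // -1 signals that the channel is closed
        if (bytesRead > 0) {
            onChunk(buffer, bytesRead)
            total += bytesRead
        }
    }
    return total
}
```

With the reader from the sample above, this could be called as reader.forEachChunk(bufferSize) { chunk, length -> /* do stuff */ }. The buffer is reused between iterations, so the callback should copy anything it needs to keep.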
I hope you enjoyed this little finding, which came out of investigating performance issues related to our use of the Google Storage library. Follow me on Twitter and DM me with any questions!