.NET Coding

Download large files with Web Api as a relay

Consider the following scenario: you have a file storage in the cloud (google docs, openstack object storage etc.) and you need to provide your app consumers a way to download the files through your app. The challenges here are mainly memory related – if you get files in memory and then return them in the response, your app will soon be crushed with OutOfmemory Exception. So you need a way to turn your app into a relay which is kind of a pipe through which bytes flow from the file storage to the client without buffering any of the content. I guess you are already thinking streams and you are right.

Change the buffer policy

Web api by default will try to buffer the content that you are trying to download and that needs to be changed. You will need to be inherit from IHostBufferPolicySelector  or inherit from the only implementor of this interface (WebHostBufferPolicySelector) and override its methods :

nbsp;

public class WebHostBufferPolicySelector : IHostBufferPolicySelector
{
    public virtual bool UseBufferedInputStream(object hostContext);
    public virtual bool UseBufferedOutputStream(HttpResponseMessage response);
}

You will also need to override one of the two methods, depending if you want unbuffered content for download or upload. In our case we need UseBufferedOutputStream. A possible implementation would be:

public class NoBufferPolicy : WebHostBufferPolicySelector
    {
        public override bool UseBufferedOutputStream(HttpResponseMessage response)
        {
            if (response.RequestMessage.RequestUri.LocalPath.Contains("download"))
            {
                return false;
            }

            return base.UseBufferedOutputStream(response);
        }
    }

So now we use not buffered content when anybody hits a route that contains download and buffered for any other case. After we have done that we need to register our class in GlobalConfiguration:

 GlobalConfiguration.Configuration.Services.Replace(typeof(IHostBufferPolicySelector), new NoBufferPolicy());

Back to the controller

In the usual case the code in your controller should be something like:

 

var client = new HttpClient();
var responseWithHeadersOnly = await client.GetAsync(requestUrl, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);

Two things are worth mentioning here – the second parameter of the GetAsync method – HttpCompletionOption.ResponseHeadersRead which basically reads only the headers of the response and not the content. The second thing is the ConfigureAwait(false) which has to do with how async and await play in a web app.

Here is the time to check if the response is ok. Once we have the response, we need to get hold of the content stream without putting it in memory:

Stream streamToReadFrom = await responseWithHeadersOnly.Content.ReadAsStreamAsync();

After we get the headers, make sure that everything is ok (no service unavailable, server errors etc.) and have control over the stream it is time to start streaming to the client. We will use a push approach in this case, contrary to the traditional pull. Luckily web api gives us the tools for that – we will use the PushStreamContent object for our response content. What it does is that it reads from a stream and pushes the bytes to the client chunk-wise. It has a couple of overloads but it basically uses a function that takes a Stream, HttpContext and TransportContext as params and may return void or a Task. Let’s see some code:

 var client = new HttpClient();
var streamToReadFrom = await client.GetAsync(requestUrl, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);

                var buffer = new byte[65536];
                var result = Request.CreateResponse();
                int bytesRead = 0;

                result.Content = new PushStreamContent((stream, content, context) =>
                {
                    while ((bytesRead = streamToReadFrom.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        stream.Write(buffer, 0, bytesRead);
                    }

                    stream.Close();
                    streamToReadFrom.Close();
                    client.Displose();
                });

                result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment");
                result.Content.Headers.ContentDisposition.FileName = filename;
                result.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
                return result;

Every time you go through the while loop, you read bytes from the incoming stream and pass them to the outgoing stream.

The flow here is as follows – a request comes in (1), you read the headers of the requested file from the file storage provider (2), return response from the method (return result; line (3)), file starts to download (stream.Write(buffer, 0, bytesRead); (4)), file is done downloading (stream.Close(); (5)). The whole thing ends when you close the outgoing stream and not when you return from the method.

That should be all to turn your web api in a working and scalable file download relay. You can find the source code HERE under the LargeFilesWebApi folder

1 comment

Leave a Reply

Your email address will not be published. Required fields are marked *