Combining FileStream and MemoryStream to avoid disk accesses/paging while receiving gigabytes of data?
Posted by w128 on Stack Overflow
Published on 2013-10-30T15:38:22Z
I'm receiving a file as a stream of byte[] data packets (the total size isn't known in advance) that I need to store somewhere before processing it immediately after it has been received (I can't do the processing on the fly). The total file size can vary from as little as 10 KB to over 4 GB.
- One option for storing the received data is to use a MemoryStream, i.e. a sequence of MemoryStream.Write(bufferReceived, 0, count) calls to store the received packets. This is very simple, but will obviously result in an out-of-memory exception for large files.
- An alternative is to use a FileStream, i.e. FileStream.Write(bufferReceived, 0, count). This way, no out-of-memory exceptions occur, but what I'm unsure about is the performance cost of the disk writes (which I don't want to occur as long as plenty of memory is still available). I'd like to avoid disk access as much as possible, but I don't know of a way to control this.
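For illustration, the receive loop is the same for either option, since both classes derive from Stream; only the backing store differs. This is a minimal sketch — ReceiveInto and the packet-producing delegate are hypothetical names, not part of my actual code:

```csharp
using System;
using System.IO;

static class Receiver
{
    // Writes each incoming packet to the given backing stream.
    // receivePacket is assumed to return null at end of transfer.
    public static void ReceiveInto(Stream target, Func<byte[]> receivePacket)
    {
        byte[] bufferReceived;
        while ((bufferReceived = receivePacket()) != null)
            target.Write(bufferReceived, 0, bufferReceived.Length);
    }
}

// Option 1 (memory only):  var ms = new MemoryStream();
// Option 2 (disk-backed):  var fs = new FileStream("recv.tmp", FileMode.Create);
```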
I did some testing, and most of the time there seems to be little performance difference between, say, 10 000 consecutive calls of MemoryStream.Write() vs. FileStream.Write(); a lot seems to depend on the buffer size and the total amount of data in question (i.e. the number of writes). Obviously, MemoryStream size reallocation is also a factor.
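A micro-benchmark along the lines of the testing described might look like the sketch below (TimeWrites is a hypothetical helper; actual numbers will vary with buffer size and data volume, as noted):

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class WriteBenchmark
{
    // Times N consecutive Write() calls of bufferSize bytes each
    // against an arbitrary backing stream.
    public static TimeSpan TimeWrites(Stream target, int writes, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < writes; i++)
            target.Write(buffer, 0, buffer.Length);
        target.Flush();   // include the cost of flushing the internal buffer
        sw.Stop();
        return sw.Elapsed;
    }
}

// using (var ms = new MemoryStream())
//     Console.WriteLine(WriteBenchmark.TimeWrites(ms, 10000, 4096));
// using (var fs = new FileStream("bench.tmp", FileMode.Create))
//     Console.WriteLine(WriteBenchmark.TimeWrites(fs, 10000, 4096));
```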
Does it make sense to use a combination of MemoryStream and FileStream, i.e. write to a memory stream by default, but once the total amount of data received exceeds e.g. 500 MB, switch to a FileStream; then, read in chunks from both streams when processing the received data (first process the 500 MB from the MemoryStream, dispose it, then read from the FileStream)?

Another solution is to use a custom memory stream implementation that doesn't require contiguous address space for its internal array allocation (i.e. a linked list of memory buffers); this way, at least on 64-bit environments, out-of-memory exceptions should no longer be an issue. Con: extra work, more room for mistakes.
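The spill-to-disk combination described above could be sketched roughly as follows. SpillStream, the 500 MB threshold constant, and the processing delegate are all illustrative assumptions, not an established pattern from any library:

```csharp
using System;
using System.IO;

// Sketch: buffer in memory until Threshold, then spill further data to disk.
public sealed class SpillStream : IDisposable
{
    private const long Threshold = 500L * 1024 * 1024; // 500 MB, adjustable
    private readonly string _path;
    private MemoryStream _memory = new MemoryStream();
    private FileStream _file;                          // created lazily on first spill

    public SpillStream(string spillPath) { _path = spillPath; }

    public void Write(byte[] bufferReceived, int offset, int count)
    {
        if (_file == null && _memory.Length + count > Threshold)
            _file = new FileStream(_path, FileMode.Create, FileAccess.ReadWrite);

        if (_file != null)
            _file.Write(bufferReceived, offset, count);
        else
            _memory.Write(bufferReceived, offset, count);
    }

    // Process the memory part first, dispose it, then read the disk part.
    public void ReadAll(Action<byte[], int> process, int chunkSize = 81920)
    {
        var chunk = new byte[chunkSize];
        int read;
        _memory.Position = 0;
        while ((read = _memory.Read(chunk, 0, chunk.Length)) > 0)
            process(chunk, read);
        _memory.Dispose();                             // free RAM before touching disk

        if (_file != null)
        {
            _file.Position = 0;
            while ((read = _file.Read(chunk, 0, chunk.Length)) > 0)
                process(chunk, read);
        }
    }

    public void Dispose()
    {
        _memory?.Dispose();
        _file?.Dispose();
    }
}
```

The ordering matters: disposing the MemoryStream before streaming the disk portion releases the large in-memory buffer, so peak memory use during processing stays at one chunk plus whatever was spilled.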
So, how do FileStream vs. MemoryStream reads/writes behave in terms of disk access and memory caching, i.e. the data size/performance balance? I would expect that as long as enough RAM is available, FileStream would effectively read/write from memory (the OS cache) anyway, and virtual memory would take care of the rest. But I don't know how often FileStream will explicitly access the disk when being written to.
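Two knobs that do exist on FileStream are the internal buffer size and the FileOptions hints; the sketch below shows them (the file name and the 1 MB figure are arbitrary). By default, writes land in the OS page cache and are flushed to the physical disk lazily; FileOptions.WriteThrough would be the opposite, forcing each write through the cache:

```csharp
using System.IO;

// A larger internal buffer batches many small Write() calls into fewer
// OS calls (the default buffer is 4 KB). FileOptions.SequentialScan is a
// hint to the OS cache manager that access will be sequential; it does
// not force physical disk writes.
var fs = new FileStream(
    "recv.tmp",
    FileMode.Create,
    FileAccess.ReadWrite,
    FileShare.None,
    bufferSize: 1 << 20,            // 1 MB internal buffer
    FileOptions.SequentialScan);
```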
Any help would be appreciated.
© Stack Overflow or respective owner