Description
Background and Motivation
I've discovered over the years that providing string-based content via a Stream is a common requirement. The conventional wisdom (as indicated by highly-voted Stack Overflow questions and answers, message board posts, etc.) appears to be to convert the string to a byte array using an Encoding and then wrap it with a MemoryStream. This likely works fine for small strings or other situations where performance and memory aren't of great concern, but is likely not an optimal approach given larger strings. Instead I'd propose a StringStream class be provided in-the-box that correctly provides a given string (or ReadOnlyMemory<char>) as a Stream while buffering chunks in a reusable buffer to reduce the overall memory footprint and avoid allocating an entire byte[] to contain the fully-encoded representation at once.
Proposed API
As with most other Stream-derived classes, the API surface of a StringStream would be essentially identical to that of Stream itself. It might contain the following constructors and additional members (other Stream members omitted for brevity):
    public class StringStream : Stream
    {
        public StringStream(string source);
        public StringStream(string source, Encoding encoding);
        public StringStream(string source, Encoding encoding, int bufferCharCount);
        public StringStream(in ReadOnlyMemory<char> source);
        public StringStream(in ReadOnlyMemory<char> source, Encoding encoding);
        public StringStream(in ReadOnlyMemory<char> source, Encoding encoding, int bufferCharCount);

        // Because the encoding of a string to a given encoding is not necessarily based only on character count,
        // the `StringStream` would have to be forward-only and non-seekable to avoid encoding the entire string up-front
        public override bool CanRead => true;
        public override bool CanSeek => false;
        public override bool CanWrite => false;
        public override long Length => throw new NotSupportedException();

        // The underlying source string should be available
        public ReadOnlyMemory<char> Source { get; }

        // A reset method should be provided that sets the stream
        // back to its initial position and clears the buffers
        public virtual void Reset();

        // As with other non-seekable streams, calling Position or Seek() should throw, but if setting
        // position to 0 or seeking to an offset of 0 from SeekOrigin.Begin, Reset() could be called

        // Other Stream members
        // ...
    }

To further help explain the concept, a sample (possibly naïve) implementation can be found at: https://gist.github.com/daveaglick/e49145d650ea3a4dbc3b6d0f8482fd37 (thanks to @benaadams for several ideas here).
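As a rough illustration of the chunked encoding idea (not the linked gist and not a proposed implementation), the sketch below pushes a string through a stateful `Encoder` a few characters at a time while reusing a single byte buffer. The helper name `EncodeInChunks`, the sample text, and the chunk size of 3 are assumptions made for the example. A real StringStream would presumably do the same work lazily inside `Read` rather than writing to a destination stream.

```csharp
using System;
using System.IO;
using System.Text;

// Sample input (an assumption for this sketch). With a chunk size of 3, the
// surrogate pair for the emoji is split across two chunks.
string text = "ab😀cd";

using var output = new MemoryStream();
EncodeInChunks(text.AsMemory(), Encoding.UTF8, 3, output);
byte[] chunkedBytes = output.ToArray();

// Hypothetical helper: encodes `source` in slices of `bufferCharCount` chars,
// reusing one byte buffer instead of allocating the full encoded payload.
static void EncodeInChunks(ReadOnlyMemory<char> source, Encoding encoding, int bufferCharCount, Stream destination)
{
    // The Encoder keeps state between calls, so a high surrogate at the end of
    // one chunk is held back and combined with the low surrogate in the next.
    Encoder encoder = encoding.GetEncoder();

    // Worst-case byte count for one chunk of chars in this encoding.
    byte[] byteBuffer = new byte[encoding.GetMaxByteCount(bufferCharCount)];

    int offset = 0;
    do
    {
        int count = Math.Min(bufferCharCount, source.Length - offset);
        bool isLastChunk = offset + count == source.Length;

        // flush: true only on the final chunk so pending encoder state is emitted.
        int written = encoder.GetBytes(source.Span.Slice(offset, count), byteBuffer, isLastChunk);
        destination.Write(byteBuffer, 0, written);
        offset += count;
    }
    while (offset < source.Length);
}
```

The bytes produced this way should match what `Encoding.UTF8.GetBytes(text)` returns in one call; only the peak buffer size differs.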
Usage Examples
Since the StringStream is intended to derive from the Stream class and conform to its API, usage would be similar to any other Stream:
    string reallyLongString = "...";
    // Declared as StringStream (rather than Stream) so that Reset() is accessible
    StringStream stringStream = new StringStream(reallyLongString);
    // Do some stuff with the stream
    byte[] buffer = new byte[256];
    stringStream.Read(buffer, 0, 256);
    stringStream.Reset();
    stringStream.Read(buffer, 0, 256);
    // etc...

Alternative Designs
As previously mentioned, similar functionality is often implemented by wrapping a fully-encoded representation of the string with a MemoryStream.
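For reference, the conventional approach looks roughly like the following. This is a minimal sketch with a made-up sample string and UTF-8 chosen as an assumption; the point is that `GetBytes` allocates a `byte[]` for the entire encoded payload before the stream is ever read.

```csharp
using System;
using System.IO;
using System.Text;

// Sample input (an assumption for this sketch).
string text = "Hello, wörld";

// Encode the whole string up front: one byte[] holds the full encoded payload.
byte[] encoded = Encoding.UTF8.GetBytes(text);

// Wrap the array in a read-only MemoryStream so it can be handed to Stream-based APIs.
using var stream = new MemoryStream(encoded, writable: false);

// Read it back to show the round trip.
using var reader = new StreamReader(stream, Encoding.UTF8);
string roundTripped = reader.ReadToEnd();
Console.WriteLine(roundTripped == text);
```

For a short string this is perfectly reasonable; the cost only becomes noticeable when the string is large, because the string and its encoded copy are both held in memory at once.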
C++ has a stringstream class that appears to be more of an iterator over a stack of strings than an actual C#-like stream (and as far as I can tell it makes no attempt to handle character encoding).
Risks
Anything dealing with character encoding has the potential for edge cases and challenging logic, so care would need to be taken that encoding-specific behaviors (such as preamble, fallback, etc.) are handled for all encodings, or at least well documented.
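To make two of those edge cases concrete, the snippet below uses only existing `System.Text` APIs to show that an encoding's preamble is separate from what `GetBytes` produces, and that the fallback configured on an encoding changes what happens to unencodable characters. How a StringStream should treat either case is an open design question; nothing here is meant as the answer.

```csharp
using System;
using System.Text;

// Preamble: Encoding.UTF8 advertises a byte order mark, but GetBytes never emits it,
// so a stream implementation would have to decide whether to write it first.
byte[] preamble = Encoding.UTF8.GetPreamble();
byte[] noPreamble = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false).GetPreamble();
Console.WriteLine($"UTF8 preamble bytes: {preamble.Length}, without identifier: {noPreamble.Length}");

// Fallback: by default the ASCII encoding replaces unencodable characters with '?'.
byte[] replaced = Encoding.ASCII.GetBytes("é");
Console.WriteLine($"Replacement fallback produced: {(char)replaced[0]}");

// The same encoding configured with an exception fallback throws instead.
Encoding strictAscii = Encoding.GetEncoding(
    "us-ascii",
    EncoderFallback.ExceptionFallback,
    DecoderFallback.ExceptionFallback);

bool threw = false;
try
{
    strictAscii.GetBytes("é");
}
catch (EncoderFallbackException)
{
    threw = true;
}
Console.WriteLine($"Exception fallback threw: {threw}");
```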