One day at work, someone needed to store a lot of text in a field of a domain object – and since we are using NHibernate for persistence, that field would get stored to the database, taking up a huge amount of space.
To circumvent this, we made the actual type of the member variable description a byte[], and then we let the accessor property Description zip and unzip when accessing the value.
    readonly Encoding Enc = Encoding.UTF8;
    public string Description
    {
        get
        {
            using (MemoryStream unzippedStream = new MemoryStream(description))
            using (Stream zip = new GZipStream(unzippedStream, CompressionMode.Decompress))
            using (StreamReader reader = new StreamReader(zip, Enc))
                return reader.ReadToEnd();
        }
        set
        {
            using (MemoryStream zippedStream = new MemoryStream())
            {
                byte[] toWrite = Enc.GetBytes(value);
                using (Stream zip = new GZipStream(zippedStream, CompressionMode.Compress))
                    zip.Write(toWrite, 0, toWrite.Length);
                description = zippedStream.ToArray(); // the GZipStream is disposed by now
            }
        }
    }
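For anyone who wants to try the property in isolation, here is a minimal, self-contained sketch of a round trip. The class name Note and the StoredSize property are made up for illustration and are not part of our actual domain model; only the Description property follows the code above.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical domain class; only Description mirrors the code in the post.
public class Note
{
    static readonly Encoding Enc = Encoding.UTF8;
    byte[] description; // what would actually be persisted

    // Illustrative helper: number of bytes that would be stored.
    public int StoredSize { get { return description.Length; } }

    public string Description
    {
        get
        {
            using (MemoryStream unzippedStream = new MemoryStream(description))
            using (Stream zip = new GZipStream(unzippedStream, CompressionMode.Decompress))
            using (StreamReader reader = new StreamReader(zip, Enc))
                return reader.ReadToEnd();
        }
        set
        {
            using (MemoryStream zippedStream = new MemoryStream())
            {
                byte[] toWrite = Enc.GetBytes(value);
                using (Stream zip = new GZipStream(zippedStream, CompressionMode.Compress))
                    zip.Write(toWrite, 0, toWrite.Length);
                description = zippedStream.ToArray(); // after the GZipStream is disposed
            }
        }
    }

    static void Main()
    {
        string text = new string('x', 10000);
        Note note = new Note();
        note.Description = text;

        // The getter should hand back exactly what the setter received.
        Console.WriteLine("round trip ok: {0}", note.Description == text);
        Console.WriteLine("{0} chars stored as {1} bytes", text.Length, note.StoredSize);
    }
}
```

Note that this sketch assumes the setter runs before the getter; reading Description while the field is still null would throw.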
Note that it is crucial to call the ToArray() method of the compressed MemoryStream after the GZipStream has been disposed! That's because the GZipStream will not write the gzip stream footer bytes until it is disposed. So if ToArray() is called before disposing, you will get an incomplete stream of bytes.
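The difference is easy to see by calling ToArray() both before and after the dispose. The following is only a sketch; the names GzipDisposeDemo and Compress are made up for illustration, and the exact byte counts will depend on the runtime.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class GzipDisposeDemo
{
    // Captures the MemoryStream contents before and after the GZipStream is disposed.
    public static void Compress(string text, out byte[] early, out byte[] complete)
    {
        byte[] input = Encoding.UTF8.GetBytes(text);
        using (MemoryStream zippedStream = new MemoryStream())
        {
            using (Stream zip = new GZipStream(zippedStream, CompressionMode.Compress))
            {
                zip.Write(input, 0, input.Length);
                early = zippedStream.ToArray();   // too early: footer not written yet
            }
            complete = zippedStream.ToArray();    // GZipStream disposed: stream is complete
        }
    }

    static void Main()
    {
        byte[] early, complete;
        Compress(new string('a', 1000), out early, out complete);
        Console.WriteLine("before dispose: {0} bytes", early.Length);
        Console.WriteLine("after dispose:  {0} bytes", complete.Length);
    }
}
```

The second number should be the larger one, and only the second array should decompress cleanly. ToArray() still works here even though disposing the GZipStream also closes the underlying MemoryStream.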
Moreover, it should be noted that zipping strings in the database is not always cool because:
- Compression only starts to pay off for strings of around 200-300 characters. For shorter strings, the compressed data is larger than the original string.
- Querying is impossible/hard/weird. 🙂
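To find out where compression starts to pay off for your own data, you can simply measure it. Here is a rough sketch; the names GzipSizeDemo and CompressedSize and the sample sentence are made up for illustration, and the numbers you get will depend on the text and the runtime.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class GzipSizeDemo
{
    // Returns the number of bytes the gzipped UTF-8 form of the text takes up.
    public static int CompressedSize(string text)
    {
        byte[] input = Encoding.UTF8.GetBytes(text);
        using (MemoryStream zippedStream = new MemoryStream())
        {
            using (Stream zip = new GZipStream(zippedStream, CompressionMode.Compress))
                zip.Write(input, 0, input.Length);
            return zippedStream.ToArray().Length; // measured after the GZipStream is disposed
        }
    }

    static void Main()
    {
        string sentence = "The quick brown fox jumps over the lazy dog. ";
        foreach (int repeats in new[] { 1, 5, 20 })
        {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < repeats; i++)
                sb.Append(sentence);

            string text = sb.ToString();
            Console.WriteLine("{0} chars -> {1} bytes uncompressed, {2} bytes gzipped",
                text.Length, Encoding.UTF8.GetByteCount(text), CompressedSize(text));
        }
    }
}
```

Part of the reason short strings lose is that the gzip format adds a fixed header and footer to every stream, no matter how small the payload is.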
But it is a nifty little trick to put in one's backpack, and I had all sorts of trouble figuring out exactly how to make the GZipStream behave, so I figured it would be nice to put a working example on the net.