Annotations and gzcompress

Has anyone had any luck saving gzcompressed text data into annotations?

I'm developing a mod that keeps a page history, and figured it would be better to compress the page data before adding it as an annotation, but it seems annotations can't handle the special characters that gzcompress generates.

  • You're trading off CPU time for storage space on the disk. That seems like an odd exchange.

    I wouldn't expect any issues with saving gzip-compressed data as an annotation, but I've never done it.

  • CPU time doesn't matter; it isn't a critical feature or one that will be used particularly often, mostly just to revert any defamation on the page. The storage could make a difference in the long run.

    I didn't expect any issues either, yet I've run into one :)

    Using gzcompress isn't necessary, just something that might be useful with certain projects.



  • Elgg is quite lax in handling string data. Sometimes it strips HTML tags in the wrong place, and sanitise_string(), which is applied to strings to be stored in the database, calls trim() (at least in 1.7). I'm not sure whether it's inherent in gz-compression, but a quick look at some gzip'ed files shows that they end in a null byte. It would be a problem if sanitise_string() stripped that null byte.

    But even if Elgg did handle strings properly, it wouldn't be a good idea to save binary data in a database string column.

  • Haven't run into any string issues in Elgg beyond this.

    And we aren't talking about gzipped files, which are slightly different.

    The data is getting truncated on special characters.
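
  • If the problem is binary bytes being trimmed or cut off on the way into a string column, one common workaround is to base64-encode the compressed data before saving it, so only plain ASCII reaches the annotation. Below is a minimal sketch of that idea, not a confirmed fix for this case. pack_for_annotation() and unpack_from_annotation() are hypothetical helper names, not part of Elgg, and trim() only stands in for whatever Elgg does to the string before storage.

```php
<?php
// Hypothetical helpers (not part of Elgg): make compressed data safe for a text column.
function pack_for_annotation($text)
{
    // gzcompress() returns raw binary; base64 maps it to A-Z, a-z, 0-9, +, / and =.
    return base64_encode(gzcompress($text, 9));
}

function unpack_from_annotation($stored)
{
    // Strict mode rejects characters outside the base64 alphabet.
    $binary = base64_decode($stored, true);
    if ($binary === false) {
        throw new InvalidArgumentException('Stored value is not valid base64.');
    }
    $text = gzuncompress($binary);
    if ($text === false) {
        throw new RuntimeException('Stored value could not be decompressed.');
    }
    return $text;
}

// trim() strips trailing NUL bytes and whitespace, so it can alter binary strings.
$binary_sample = "payload\x00";
$trimmed_length = strlen(trim($binary_sample)); // one byte shorter than the input

// Round trip, with trim() standing in for the storage layer (an assumption).
$page = str_repeat("Some page history text.\n", 50);
$stored = trim(pack_for_annotation($page));
$restored = unpack_from_annotation($stored);
```

    Base64 makes the stored value about a third larger than the raw compressed bytes, so some of the storage saving is given back, but for repetitive page text the result should still be well below the uncompressed size.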