Seminar: Chris Mullins, “Query-Aware Compression of Join Results”

The proliferation of lightweight client devices, such as iPhones, iPads, and Android phones and tablets, has created an increased demand for cloud-based services. In many of these services, queries over structured data are sent to cloud-based servers for processing, and the results are relayed back to the client devices. Network bandwidth between client devices and cloud-based servers is often a limited resource, and any effort to reduce the amount of data transmitted across the network would not only conserve bandwidth but also help extend the battery life of the client devices.

In this thesis, we propose a novel query-aware compression method for query results sent from database servers to client applications. Our method is based on two key ideas. First, we exploit redundancy information obtained from the query plan, and possibly from the database schema, to achieve better compression than standard non-query-aware compressors. Second, we use a collection of memory-limited dictionaries to encode attribute values in a lightweight and efficient manner. We evaluate our method empirically using the TPC-H benchmark and show that this technique is effective, especially when used in conjunction with standard compressors. Our results show that compression ratios of up to 10 times over gzip are possible.
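
To give a rough feel for what encoding attribute values with memory-limited dictionaries could look like, here is a minimal Python sketch. It is an illustration only, not the method from the thesis: the names BoundedDict, encode_rows and decode_rows, the one-dictionary-per-column layout, the least-recently-used eviction policy, and the ("lit", value) / ("ref", slot) token format are all assumptions made for this example. It also leaves out the query-aware part (using the query plan or schema to find redundancy) entirely.

```python
from collections import OrderedDict


class BoundedDict:
    """Fixed-capacity value<->slot table with least-recently-used eviction.

    Hypothetical helper for illustration; capacity is assumed to be >= 1.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.slot_of = OrderedDict()  # value -> slot, least recently used first
        self.value_at = {}            # slot -> value

    def lookup(self, value):
        """Return the slot holding value (marking it recently used), or None."""
        slot = self.slot_of.get(value)
        if slot is not None:
            self.slot_of.move_to_end(value)
        return slot

    def touch_slot(self, slot):
        """Return the value stored in slot and mark it recently used."""
        value = self.value_at[slot]
        self.slot_of.move_to_end(value)
        return value

    def insert(self, value):
        """Store a new value, reusing the least recently used slot when full."""
        if len(self.slot_of) < self.capacity:
            slot = len(self.slot_of)
        else:
            _, slot = self.slot_of.popitem(last=False)
        self.slot_of[value] = slot
        self.value_at[slot] = value


def encode_rows(rows, capacity=256):
    """Replace repeated attribute values with small slot references."""
    dicts = None
    encoded = []
    for row in rows:
        if dicts is None:
            dicts = [BoundedDict(capacity) for _ in row]  # one per column
        tokens = []
        for d, value in zip(dicts, row):
            slot = d.lookup(value)
            if slot is None:
                d.insert(value)
                tokens.append(("lit", value))  # first sighting: send the value
            else:
                tokens.append(("ref", slot))   # repeat: send only the slot
        encoded.append(tokens)
    return encoded


def decode_rows(encoded, capacity=256):
    """Rebuild the rows by replaying the same dictionary updates."""
    dicts = None
    rows = []
    for tokens in encoded:
        if dicts is None:
            dicts = [BoundedDict(capacity) for _ in tokens]
        row = []
        for d, (kind, payload) in zip(dicts, tokens):
            if kind == "lit":
                d.insert(payload)
                row.append(payload)
            else:
                row.append(d.touch_slot(payload))
        rows.append(tuple(row))
    return rows
```

Because the decoder replays exactly the same inserts and evictions as the encoder, the dictionaries never need to be transmitted, and the capacity bound keeps memory use fixed on both server and client. In a join result, where values from one table repeat across many output rows, most tokens would become short slot references that a standard compressor such as gzip could then compress further.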

Committee: Lipyeow Lim (chair), Henri Casanova, Kyungim Baek

Time/Place: Monday, Dec 3, 2012, 10am, POST 302