The query containment problem is a fundamental algorithmic problem in data
management. While this problem is well understood under set semantics, it is by
far less understood under bag semantics. In this paper we unveil tight
connections between information theory and the conjunctive query containment
under bag semantics.
Authors: Mahmoud Abo Khamis, Phokion G. Kolaitis, Hung Q. Ngo, Dan Suciu. 2020.
In Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of
Database Systems (PODS ‘20).
The query containment problem is a fundamental algorithmic problem in data
management. While this problem is well understood under set semantics, it is by
far less understood under bag semantics. In particular, it is a long-standing
open question whether or not the conjunctive query containment problem under bag
semantics is decidable.We unveil tight connections between information theory
and the conjunctive query containment under bag semantics.These connections are
established using information inequalities, which are considered to be the laws
of information theory. Our first main result asserts that deciding the validity
of a generalization of information inequalities is many-one equivalent to the
restricted case of conjunctive query containment in which the containing query
is acyclic; thus, either both these problems are decidable or both are
undecidable. Our second main result identifies a new decidable case of the
conjunctive query containment problem under bag semantics. Specifically, we give
an exponential-time algorithm for conjunctive query containment under bag
semantics, provided the containing query is chordal and admits a simple junction
tree.
Read the PDF:
Bag Query Containment and Information Theory