This paper developed the theoretical foundation of discriminative learning over
relational data, showing how one can exploit the relational structures in the
data and the feature extraction query to speed up model training.
Mahmoud Abo Khamis, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian
Schleich. 2020.
In ACM Transactions on Database Systems (TODS '20), Vol. 45, No. 2, Article 7.
Integrated solutions for analytics over relational databases are of great
practical importance as they avoid the costly repeated loop data scientists have
to deal with on a daily basis: select features from data residing in relational
databases using feature extraction queries involving joins, projections, and
aggregations; export the training dataset defined by such queries; convert this
dataset into the format of an external learning tool; and train the desired
model using this tool. These integrated solutions are also fertile ground for
theoretically fundamental and challenging problems at the intersection of
relational and statistical data models.
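
As a rough illustration of what exploiting relational structure can mean, the sketch below computes a sum-of-products aggregate over a join in two ways: by materializing the join first, as the export-and-train loop would, and by aggregating each relation per join key before combining. This is a minimal sketch and not the paper's algorithm; the relations R and S, their (key, value) layout, and both function names are assumptions made for the example. Aggregates of this shape appear as entries of the Gram matrix used in least-squares regression.

```python
from collections import defaultdict

def sum_xy_materialized(R, S):
    """Join R and S on their key, then sum x*y over the joined rows."""
    joined = [(x, y) for (k1, x) in R for (k2, y) in S if k1 == k2]
    return sum(x * y for x, y in joined)

def sum_xy_pushed_down(R, S):
    """Aggregate each relation per key first, then combine per key.

    Uses the identity: sum over joined rows of x*y
    = sum over keys of (sum of x for that key) * (sum of y for that key).
    """
    sum_x = defaultdict(float)
    sum_y = defaultdict(float)
    for k, x in R:
        sum_x[k] += x
    for k, y in S:
        sum_y[k] += y
    # Only keys present in both relations contribute to the join.
    return sum(sum_x[k] * sum_y[k] for k in sum_x.keys() & sum_y.keys())

# Hypothetical toy relations: (join key, feature value).
R = [("a", 1.0), ("a", 2.0), ("b", 3.0)]
S = [("a", 10.0), ("a", 20.0), ("b", 5.0), ("c", 7.0)]

print(sum_xy_materialized(R, S))
print(sum_xy_pushed_down(R, S))
```

Both functions return the same value, but the second never builds the joined rows, so its work grows with the sizes of R and S rather than with the size of the join result.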
Read the PDF:
Learning Models over Relational Data using Sparse Tensors and Functional Dependencies