Relational Boosted Regression Trees. (arXiv:2107.12373v1 [cs.DB])

Many tasks use data housed in relational databases to train boosted
regression tree models. In this paper, we give a relational adaptation of the
greedy algorithm for training boosted regression trees. For the subproblem of
calculating the sum of squared residuals of the dataset, which dominates the
runtime of the boosting algorithm, we provide a $(1 + epsilon)$-approximation
using the tensor sketch technique. Employing this approximation within the
relational boosted regression trees algorithm leads to learning similar model
parameters, but with asymptotically better runtime.



Related post