AutoDS: Towards Human-Centered Automation of Data Science. (arXiv:2101.05273v1 [cs.HC])

Data science (DS) projects often follow a lifecycle that consists of
laborious tasks for data scientists and domain experts (e.g., data exploration
and model training). Only recently have machine learning (ML) researchers
developed promising automation techniques to aid data workers in these
tasks. This paper introduces AutoDS, an automated machine learning (AutoML)
system that aims to leverage the latest ML automation techniques to support
data science projects. Data workers only need to upload their dataset; the
system can then automatically suggest ML configurations, preprocess the data,
select an algorithm, and train the model. These suggestions are presented to the user via
a web-based graphical user interface and a notebook-based programming user
interface.

We studied AutoDS with 30 professional data scientists, who completed a data
science project in two groups: one used AutoDS and the other did not. As expected,
AutoDS improves productivity; yet, surprisingly, we find that the models
produced by the AutoDS group have higher quality and fewer errors, but lower
human confidence scores. We reflect on the findings by presenting design
implications for incorporating automation techniques into human work in the
data science lifecycle.


