A JupyterLab Plugin for Data Science Best Practices
DOI: https://doi.org/10.13021/jssr2021.3223
Abstract
Data science is a multidisciplinary approach to extracting insights from large volumes of data. It involves preparing and processing data and performing analysis to reveal patterns. Because data science is widely applied and offers important benefits, many developers need to learn how to do it effectively. However, they face many barriers to learning the best practices they need to be successful. To better understand these best practices, we first examined tutorials and Stack Overflow posts, documenting specific best practices important for success. We then built a series of mockups to explore how new programming tools might better support developers in learning best practices. Next, we focused on one specific best practice, hyperparameter optimization: the process of configuring a machine learning model for a specific problem. Finally, we developed an early prototype of a plugin for JupyterLab, a popular data science environment, which may ultimately help developers work through the process of hyperparameter optimization more easily.
License
Copyright (c) 2022 AARAV BAJAJ, ANYA PAREKH, SARAH ALI, Sahar Mehrpour, Yang Yoo, Thomas LaToza
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.