A JupyterLab Plugin for Data Science Best Practices

Authors

  • Aarav Bajaj, Aspiring Scientists' Summer Internship Program Intern
  • Anya Parekh, Aspiring Scientists' Summer Internship Program Intern
  • Sarah Ali, Aspiring Scientists' Summer Internship Program Intern
  • Sahar Mehrpour, Aspiring Scientists' Summer Internship Program Co-mentor
  • Yang Yoo, Aspiring Scientists' Summer Internship Program Co-mentor
  • Thomas LaToza

DOI:

https://doi.org/10.13021/jssr2021.3223

Abstract

Data science is a multidisciplinary approach to extracting insights from large volumes of data. It involves preparing and processing data and performing analysis to reveal patterns. Because data science is widely applied and offers important benefits, many developers need to learn how to do it effectively. However, they face many barriers to learning the best practices they need to be successful. To better understand these best practices, we first examined tutorials and Stack Overflow posts, documenting specific best practices important for success. We then built a series of mockups to explore how new programming tools might better support developers in learning best practices. Next, we focused on one specific best practice, hyperparameter optimization, the process of configuring a machine learning model for a specific problem. Finally, we developed an early prototype of a plugin for JupyterLab, a popular data science environment, which may ultimately help developers work through the process of hyperparameter optimization more easily.

Published

2022-12-13

Section

College of Engineering and Computing: Department of Computer Science

Categories