Assessing the performance of Kolmogorov-Arnold Networks in preventing catastrophic forgetting on continual learning tasks

Authors

  • Justin Lee, Department of Computer Science, George Mason University, Fairfax, VA
  • Michael Crawshaw, Department of Computer Science, George Mason University, Fairfax, VA
  • Mingrui Liu, Department of Computer Science, George Mason University, Fairfax, VA

Abstract

Kolmogorov-Arnold Networks (KANs) are a recently introduced alternative to multilayer perceptrons (MLPs), motivated by interpretability and approximation efficiency. Previous work conjectured that KANs may be suitable for continual learning, a learning paradigm in which data is presented to the learning algorithm sequentially and may change over time. This conjecture was supported by initial results for learning a single-variable target function in a continual learning setting, where the KAN exhibited negligible forgetting compared to the baseline MLP. Using the pykan package, we investigated the performance of KANs in continual learning settings more challenging than those considered in previous work: we evaluated KANs with deeper architectures on harder tasks, including multi-variable target functions and Split-MNIST (a standard digit recognition benchmark for continual learning). When learning simple functions with small KANs, we reproduced the previous conclusion that KANs exhibit only minor forgetting compared to MLPs. However, on Split-MNIST, both KANs and MLPs suffered from catastrophic forgetting.
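The sketch below illustrates the kind of continual learning protocol described in the abstract: a single-variable target function is presented segment by segment, the model trains only on the current segment, and error on earlier segments measures forgetting. It is a minimal, hypothetical illustration, not the authors' code; the target function, segment boundaries, and MLP architecture are assumptions, and the paper's KAN experiments use the pykan package (its kan.KAN class), whose training API is not reproduced here.

```python
# Hypothetical sketch of continual 1D regression: train on one interval at a
# time (no replay of earlier data) and report error on all intervals seen so far.
import torch
import torch.nn as nn

def target(x):
    # Assumed single-variable target function, for illustration only.
    return torch.sin(5 * x) * torch.exp(-x ** 2)

segments = [(-2.0, -1.0), (-1.0, 0.0), (0.0, 1.0), (1.0, 2.0)]  # sequential tasks

# Baseline MLP; the paper's KAN counterpart would be built with pykan instead.
model = nn.Sequential(nn.Linear(1, 64), nn.Tanh(),
                      nn.Linear(64, 64), nn.Tanh(),
                      nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Fixed test grid per segment, used to measure forgetting after each task.
test_sets = []
for lo, hi in segments:
    xs = torch.linspace(lo, hi, 200).unsqueeze(1)
    test_sets.append((xs, target(xs)))

for task_id, (lo, hi) in enumerate(segments):
    # Train only on the current segment.
    x_train = torch.rand(512, 1) * (hi - lo) + lo
    y_train = target(x_train)
    for _ in range(2000):
        opt.zero_grad()
        loss_fn(model(x_train), y_train).backward()
        opt.step()
    # Forgetting: error on the current and all previously seen segments.
    with torch.no_grad():
        errs = [loss_fn(model(xs), ys).item() for xs, ys in test_sets[: task_id + 1]]
    print(f"after task {task_id}: per-segment MSE = {[round(e, 4) for e in errs]}")
```

In this protocol, a model that forgets catastrophically shows rapidly growing error on earlier segments as later tasks are learned, while a model with minimal forgetting keeps those errors near their post-training values.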

Published

2024-10-13

Section

College of Engineering and Computing: Department of Computer Science