A Generalization of Compositional Matrix-Space Models of Language for Short Answer Scoring
Automated Short Answer Grading (ASAG) is the use of computer programs to grade students' short responses to free-response questions. Current state-of-the-art algorithms for ASAG rely on statistical representations of words, commonly generated through deep learning techniques like word2vec. One shortcoming of word2vec is commutativity: because sentence embeddings are composed by summing word vectors, switching the order of words does not change the embedding. Motivated by this shortcoming, word2mat embeds words as matrices, whose multiplication is noncommutative. However, word2mat is both linear and associative, while English exhibits nonlinear effects in certain contexts. To address these limitations, we propose representing words as shallow neural networks. We generalize word2mat by introducing a simple nonlinear activation function between each matrix multiplication, which we show makes the embedding noncommutative, nonlinear, and non-associative. Empirically, the nonlinear model performs an average of 0.83% worse than word2mat on the SentEval framework, and a neural network trained on either model's embeddings to predict student grades achieves nearly identical accuracy. Possible explanations for the discrepancy between the theoretical advantages and the empirical performance are limited computational power and limited training time.
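The algebraic properties claimed above can be illustrated with a minimal sketch. The matrices and the ReLU activation below are illustrative choices, not the paper's actual trained embeddings: matrix-product composition is noncommutative but associative, and inserting a nonlinear activation between each multiplication breaks associativity.

```python
import numpy as np

# Toy 2x2 "word matrices" (hypothetical values for illustration only)
A = np.array([[1.0, 2.0], [-3.0, 4.0]])
B = np.array([[0.0, -1.0], [2.0, 1.0]])
C = np.array([[1.0, 1.0], [-1.0, 2.0]])

def relu(M):
    """Elementwise nonlinear activation."""
    return np.maximum(M, 0.0)

# Linear composition (word2mat-style): plain matrix product.
# Noncommutative: order of words matters.
assert not np.allclose(A @ B, B @ A)
# But associative: grouping of words does not matter.
assert np.allclose((A @ B) @ C, A @ (B @ C))

# Nonlinear composition: an activation between each multiplication.
def compose(X, Y):
    return relu(X @ Y)

# Non-associative: grouping now changes the result.
left = compose(compose(A, B), C)   # [[3, 6], [1, 22]]
right = compose(A, compose(B, C))  # [[3, 8], [1, 16]]
assert not np.allclose(left, right)
```

Because the nonlinear composition is non-associative, the embedding of a sentence depends on how its words are grouped, not just on their left-to-right order.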
Copyright (c) 2022 Stephen Huan, Dhruv Sundararaman, Shawn Malik, Mihai Boicu
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.