Discovering Equations from Graph Neural Networks using Symbolic Regression


Sebastian Höfler

08. July 2022, 15:00
WW8, Zoom, Fürth


Symbolic regression algorithms that scale exponentially with the number of input variables and operators [21] are not suited to problems in quantum chemistry. Deep learning models can work effectively with the large and complex datasets involved, but they lack interpretability. This thesis builds on the work of Cranmer [3] and attempts to extract symbolic representations from a heavily regularized, trained deep learning framework. The aim is to obtain an interpretable analytic approximation for the energy U_0 of a molecule by applying symbolic regression to the internal message functions. We use different graph neural network architectures and apply L1 regularization to reduce the number of message components in the network. The components are then approximated with symbolic regression. The hypothesis that strong L1 regularization would concentrate the information in a few message components was not confirmed in this case. Symbolic regression performed on the message components was unable to find equations consistent with the underlying governing equations (e.g. Kohn-Sham). Further research should focus on alternative ways to reduce the number of message components and examine whether PySR is suitable for a problem of this complexity.
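
The regularize-then-select step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the penalty weight alpha, the synthetic message array, and the choice to rank components by standard deviation are all assumptions made for illustration. The selected components would then be handed to a symbolic regression tool such as PySR.

```python
import numpy as np


def l1_message_penalty(messages, alpha):
    """L1 penalty: alpha times the mean absolute value of all message components.

    Assumed form of the regularization term; it would be added to the training loss.
    """
    return alpha * np.mean(np.abs(messages))


def top_components(messages, k):
    """Return the indices of the k components with the largest standard deviation.

    Ranking by standard deviation is an assumed selection criterion.
    """
    std = messages.std(axis=0)
    return np.argsort(std)[::-1][:k]


rng = np.random.default_rng(0)

# Hypothetical messages: 1000 edges with 8 components each.
# Only components 2 and 5 carry signal; the rest are near zero,
# which is what strong L1 regularization is hoped to produce.
messages = 1e-3 * rng.normal(size=(1000, 8))
messages[:, 2] += rng.normal(size=1000)
messages[:, 5] += 0.5 * rng.normal(size=1000)

penalty = l1_message_penalty(messages, alpha=1e-2)
selected = top_components(messages, k=2)
```

In this synthetic case the two informative components stand out clearly. The abstract reports that such a clean separation was not observed for the trained networks, which is why the choice of selection criterion matters.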