Multi-task neural networks by learned contextual inputs
Journal article, Peer reviewed
Published version
Date
2024
Abstract
This paper explores learned-context neural networks, a multi-task learning architecture based on a fully shared neural network and an augmented input vector containing trainable task parameters. The architecture is interesting due to its powerful task adaptation mechanism, which facilitates a low-dimensional task parameter space. Theoretically, we show that a scalar task parameter is sufficient for universal approximation of all tasks, which is not necessarily the case for more common architectures. Empirically, it is shown that, for homogeneous tasks, the dimension of the task parameter may vary with the complexity of the tasks, but a small task parameter space is generally viable. The task parameter space is found to be well-behaved, which simplifies workflows related to updating models as new data arrives and learning new tasks when the shared parameters are frozen. Additionally, the architecture displays robustness towards datasets where tasks have few data points. The architecture’s performance is compared to similar neural network architectures on ten datasets, with competitive results.
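The core idea in the abstract, a fully shared network whose input is augmented with a trainable per-task parameter, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the layer sizes, the function names (init_shared, forward, mse, fit_new_task), the randomly initialised (untrained) shared weights, and the finite-difference gradient with step halving are all assumptions made to keep the example self-contained.

```python
import math
import random


def init_shared(n_in, n_ctx, n_hidden, seed=0):
    """Randomly initialise one hidden layer shared by all tasks (assumed sizes)."""
    rng = random.Random(seed)
    d = n_in + n_ctx  # width of the augmented input [x, task parameter]
    W1 = [[rng.gauss(0, 1 / math.sqrt(d)) for _ in range(d)] for _ in range(n_hidden)]
    w2 = [rng.gauss(0, 1 / math.sqrt(n_hidden)) for _ in range(n_hidden)]
    return {"W1": W1, "b1": [0.0] * n_hidden, "w2": w2, "b2": 0.0}


def forward(shared, x, ctx):
    """Evaluate the shared network on the augmented input [x, ctx]."""
    z = list(x) + list(ctx)
    h = [math.tanh(sum(w * v for w, v in zip(row, z)) + b)
         for row, b in zip(shared["W1"], shared["b1"])]
    return sum(w * v for w, v in zip(shared["w2"], h)) + shared["b2"]


def mse(shared, ctx, data):
    """Mean squared error of one task's data under task parameter ctx."""
    return sum((forward(shared, x, ctx) - y) ** 2 for x, y in data) / len(data)


def fit_new_task(shared, data, n_ctx=1, lr=0.5, steps=200, eps=1e-5):
    """Fit only the task parameter of a new task; shared weights stay frozen.

    Uses a central finite-difference gradient (an assumption for brevity;
    a real implementation would use automatic differentiation).
    """
    ctx = [0.0] * n_ctx
    loss = mse(shared, ctx, data)
    for _ in range(steps):
        grad = []
        for i in range(n_ctx):
            up = ctx[:i] + [ctx[i] + eps] + ctx[i + 1:]
            dn = ctx[:i] + [ctx[i] - eps] + ctx[i + 1:]
            grad.append((mse(shared, up, data) - mse(shared, dn, data)) / (2 * eps))
        trial = [c - lr * g for c, g in zip(ctx, grad)]
        trial_loss = mse(shared, trial, data)
        if trial_loss < loss:
            ctx, loss = trial, trial_loss  # accept the improving step
        else:
            lr *= 0.5  # otherwise shrink the step size and retry
    return ctx
```

In this sketch, every task uses the same weights in shared, and tasks differ only in the value of ctx passed to forward. Learning a new task with the shared parameters frozen then reduces to a low-dimensional search, for example fit_new_task(shared, data) with a scalar task parameter.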