Parallel learning, namely the simultaneous learning of multiple patterns, constitutes a modern challenge for neural networks. While this cannot be accomplished by standard Hebbian associative neural networks, in this paper we show how the multitasking Hebbian network (a variation on the theme of the Hopfield model, working on sparse datasets) is naturally able to perform this complex task. We focus on systems processing in parallel a finite (up to logarithmic growth in the size of the network) number of patterns, mirroring the low-storage setting of standard associative neural networks. When patterns to be reconstructed are mildly diluted, the network handles them hierarchically, distributing the amplitudes of their signals as power laws w.r.t. the pattern information content (hierarchical regime), while, for strong dilution, the signals pertaining to all the patterns are simultaneously raised with the same strength (parallel regime). Further, we prove that the training protocol (either supervised or unsupervised) neither alters the multitasking performances nor changes the thresholds for learning. We also highlight (analytically and by Monte Carlo simulations) that a standard cost function (i.e. the Hamiltonian) used in statistical mechanics exhibits the same minima as a standard loss function (i.e. the sum of squared errors) used in machine learning.
Parallel learning by multitasking neural networks
Andrea Alessandrelli;Adriano Barra
;
2023-01-01
Abstract
Parallel learning, namely the simultaneous learning of multiple patterns, constitutes a modern challenge for neural networks. While this cannot be accomplished by standard Hebbian associative neural networks, in this paper we show how the multitasking Hebbian network (a variation on the theme of the Hopfield model, working on sparse datasets) is naturally able to perform this complex task. We focus on systems processing in parallel a finite (up to logarithmic growth in the size of the network) number of patterns, mirroring the low-storage setting of standard associative neural networks. When patterns to be reconstructed are mildly diluted, the network handles them hierarchically, distributing the amplitudes of their signals as power laws w.r.t. the pattern information content (hierarchical regime), while, for strong dilution, the signals pertaining to all the patterns are simultaneously raised with the same strength (parallel regime). Further, we prove that the training protocol (either supervised or unsupervised) neither alters the multitasking performances nor changes the thresholds for learning. We also highlight (analytically and by Monte Carlo simulations) that a standard cost function (i.e. the Hamiltonian) used in statistical mechanics exhibits the same minima as a standard loss function (i.e. the sum of squared errors) used in machine learning.File | Dimensione | Formato | |
---|---|---|---|
Agliari_2023_J._Stat._Mech._2023_113401.pdf
accesso aperto
Tipologia:
Versione editoriale
Licenza:
Creative commons
Dimensione
2.11 MB
Formato
Adobe PDF
|
2.11 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.