Publisher link (DOI): 10.48550/arXiv.2405.00743
Title: On the weight dynamics of learning networks
Language: English
Authors: Sharafi, Nahal 
Martin, Christoph 
Hallerberg, Sarah 
Date of publication: 30-Apr-2024
Publisher: Arxiv.org
Journal or series: De.arxiv.org 
Abstract: 
Neural networks have become a widely adopted tool for tackling a variety of problems in machine learning and artificial intelligence. In this contribution we use the mathematical framework of local stability analysis to gain a deeper understanding of the learning dynamics of feed-forward neural networks. To this end, we derive equations for the tangent operator of the learning dynamics of three-layer networks learning regression tasks. The results are valid for an arbitrary number of nodes and arbitrary choices of activation functions. Applying the results to a network learning a regression task, we investigate numerically how stability indicators relate to the final training loss. Although the specific results vary with different choices of initial conditions and activation functions, we demonstrate that it is possible to predict the final training loss by monitoring finite-time Lyapunov exponents during the training process.
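
The abstract describes monitoring finite-time Lyapunov exponents (FTLEs) of the weight dynamics during training. The following is a minimal, illustrative sketch (not the authors' code) of one standard way to estimate the largest FTLE of plain gradient-descent weight dynamics for a small three-layer regression network, using a Benettin-style two-trajectory method; the network size, learning rate, tanh activation, and toy sine-regression data are assumptions made for this example.

```python
# Minimal sketch: largest finite-time Lyapunov exponent (FTLE) of
# gradient-descent weight dynamics, estimated with a Benettin-style
# two-trajectory method. All hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: y = sin(x) on [-pi, pi]
X = rng.uniform(-np.pi, np.pi, size=(200, 1))
Y = np.sin(X)

def init_params(n_in=1, n_hid=16, n_out=1):
    return {
        "W1": rng.normal(0, 1 / np.sqrt(n_in), (n_in, n_hid)),
        "b1": np.zeros(n_hid),
        "W2": rng.normal(0, 1 / np.sqrt(n_hid), (n_hid, n_out)),
        "b2": np.zeros(n_out),
    }

def loss_and_grads(p, X, Y):
    # Three-layer (input-hidden-output) network with tanh hidden units
    h = np.tanh(X @ p["W1"] + p["b1"])
    out = h @ p["W2"] + p["b2"]
    err = out - Y
    loss = 0.5 * np.mean(np.sum(err**2, axis=1))
    N = X.shape[0]
    dh = (err @ p["W2"].T) * (1 - h**2)          # backprop through tanh
    grads = {
        "W1": X.T @ dh / N, "b1": dh.mean(axis=0),
        "W2": h.T @ err / N, "b2": err.mean(axis=0),
    }
    return loss, grads

def gd_step(p, g, lr):
    return {k: p[k] - lr * g[k] for k in p}

def flatten(p):
    return np.concatenate([p[k].ravel() for k in ("W1", "b1", "W2", "b2")])

def displace(p, d):
    # Return a copy of p shifted by the flat perturbation vector d
    q, off = {}, 0
    for k in ("W1", "b1", "W2", "b2"):
        n = p[k].size
        q[k] = p[k] + d[off:off + n].reshape(p[k].shape)
        off += n
    return q

lr, steps, eps = 0.1, 2000, 1e-7
p_ref = init_params()
d = rng.normal(size=flatten(p_ref).size)
d *= eps / np.linalg.norm(d)                     # separation of length eps
p_pert = displace(p_ref, d)

log_stretch = 0.0
for t in range(steps):
    loss, g_ref = loss_and_grads(p_ref, X, Y)
    _, g_pert = loss_and_grads(p_pert, X, Y)
    p_ref = gd_step(p_ref, g_ref, lr)
    p_pert = gd_step(p_pert, g_pert, lr)

    # Benettin renormalization: accumulate log stretch, rescale back to eps
    d = flatten(p_pert) - flatten(p_ref)
    dist = np.linalg.norm(d)
    log_stretch += np.log(dist / eps)
    d *= eps / dist
    p_pert = displace(p_ref, d)

ftle = log_stretch / steps    # largest FTLE per gradient-descent step
print(f"final training loss {loss:.4f}, largest FTLE {ftle:+.5f}")
```

A negative FTLE indicates that nearby weight trajectories contract (stable learning dynamics), while a positive value indicates local divergence; the paper studies how such indicators, tracked during training, relate to the final training loss.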
URI: http://hdl.handle.net/20.500.12738/15745
Review status: Preprints only: this version has not yet been peer-reviewed
Institution: Fakultät Technik und Informatik 
Department Maschinenbau und Produktion 
Document type: Preprint
Appears in collections: Publications without full text
