Publisher link (DOI): 10.48550/arXiv.2405.00743
Title: On the weight dynamics of learning networks
Language: English
Authors: Sharafi, Nahal; Martin, Christoph; Hallerberg, Sarah
Publication date: 30-Apr-2024
Publisher: Arxiv.org
Journal or series: De.arxiv.org
Abstract: Neural networks have become a widely adopted tool for tackling a variety of problems in machine learning and artificial intelligence. In this contribution we use the mathematical framework of local stability analysis to gain a deeper understanding of the learning dynamics of feed-forward neural networks. To this end, we derive equations for the tangent operator of the learning dynamics of three-layer networks learning regression tasks. The results are valid for an arbitrary number of nodes and arbitrary choices of activation functions. Applying the results to a network learning a regression task, we investigate numerically how stability indicators relate to the final training loss. Although the specific results vary with different choices of initial conditions and activation functions, we demonstrate that it is possible to predict the final training loss by monitoring finite-time Lyapunov exponents during the training process.
URI: http://hdl.handle.net/20.500.12738/15745
Review status (preprints only): This version has not yet been peer-reviewed
Institution: Fakultät Technik und Informatik, Department Maschinenbau und Produktion
Document type: Preprint
Appears in collections: Publications without full text
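The abstract describes monitoring finite-time Lyapunov exponents (FTLEs) of the weight dynamics during training to predict the final training loss. Below is a minimal, hypothetical sketch of that idea, not the authors' code: a small three-layer network learns a toy regression task, the gradient-descent update is treated as a discrete map on the weight vector, its Jacobian is approximated by finite differences, and the leading FTLEs are accumulated with QR re-orthonormalization. The network sizes, toy task, and hyperparameters are illustrative assumptions.

```python
# Sketch (assumed setup, not from the paper): FTLEs of gradient-descent weight dynamics.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: y = sin(x) on [-pi, pi]
X = np.linspace(-np.pi, np.pi, 64).reshape(-1, 1)
Y = np.sin(X)

n_in, n_hid, n_out = 1, 8, 1   # three-layer network: input, hidden, output
eta = 0.05                     # learning rate
T = 500                        # training steps over which FTLEs are averaged
k = 3                          # number of leading exponents to track
eps = 1e-5                     # finite-difference step for the Jacobian

def unpack(w):
    """Split the flat weight vector into the two layer matrices (biases omitted)."""
    W1 = w[:n_in * n_hid].reshape(n_in, n_hid)
    W2 = w[n_in * n_hid:].reshape(n_hid, n_out)
    return W1, W2

def loss_grad(w):
    """Mean-squared-error loss and its gradient for a tanh hidden layer."""
    W1, W2 = unpack(w)
    H = np.tanh(X @ W1)          # hidden activations
    E = H @ W2 - Y               # output error
    loss = 0.5 * np.mean(E ** 2)
    gW2 = H.T @ E / len(X)
    gH = E @ W2.T * (1.0 - H ** 2)
    gW1 = X.T @ gH / len(X)
    return loss, np.concatenate([gW1.ravel(), gW2.ravel()])

def step(w):
    """One gradient-descent step: the map whose tangent dynamics we follow."""
    return w - eta * loss_grad(w)[1]

def jacobian(w):
    """Finite-difference Jacobian of the update map around w (the tangent operator)."""
    d = len(w)
    J = np.empty((d, d))
    for i in range(d):
        dw = np.zeros(d)
        dw[i] = eps
        J[:, i] = (step(w + dw) - step(w - dw)) / (2 * eps)
    return J

w = rng.normal(scale=0.5, size=n_in * n_hid + n_hid * n_out)
Q = np.linalg.qr(rng.normal(size=(len(w), k)))[0]   # orthonormal tangent basis
log_r = np.zeros(k)

for t in range(T):
    Q, R = np.linalg.qr(jacobian(w) @ Q)             # propagate and re-orthonormalize
    log_r += np.log(np.abs(np.diag(R)))
    w = step(w)

ftle = log_r / T                                      # finite-time Lyapunov exponents
print("final training loss:", loss_grad(w)[0])
print("leading finite-time Lyapunov exponents:", ftle)
```

In this sketch the FTLEs are reported per training step; tracking them over the course of training, rather than only at the end, is the kind of monitoring the abstract refers to.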