Publisher DOI: | 10.48550/arXiv.2405.00743 | Title: | On the weight dynamics of learning networks | Language: | English | Authors: | Sharafi, Nahal; Martin, Christoph; Hallerberg, Sarah |
Issue Date: | 30-Apr-2024 | Publisher: | arXiv.org | Journal or Series Name: | De.arxiv.org | Abstract: | Neural networks have become a widely adopted tool for tackling a variety of problems in machine learning and artificial intelligence. In this contribution, we use the mathematical framework of local stability analysis to gain a deeper understanding of the learning dynamics of feedforward neural networks. To this end, we derive equations for the tangent operator of the learning dynamics of three-layer networks learning regression tasks. The results are valid for arbitrary numbers of nodes and arbitrary choices of activation functions. Applying the results to a network learning a regression task, we investigate numerically how stability indicators relate to the final training loss. Although the specific results vary with different choices of initial conditions and activation functions, we demonstrate that it is possible to predict the final training loss by monitoring finite-time Lyapunov exponents during the training process. |
URI: | http://hdl.handle.net/20.500.12738/15745 | Review status: | Only preprints: This version has not yet been reviewed | Institute: | Fakultät Technik und Informatik; Department Maschinenbau und Produktion |
Type: | Preprint |
Appears in Collections: | Publications without full text |
Items in REPOSIT are protected by copyright, with all rights reserved, unless otherwise indicated.