Please use this identifier to cite or link to this item:
https://doi.org/10.48441/4427.3295.2
| Field | Value |
|---|---|
| Publisher URL | https://hdl.handle.net/10630/39287 |
| Title | Novel distributional reinforcement and ensemble learning algorithms |
| Other Titles | Nuevos algoritmos de aprendizaje por refuerzo distributivo y aprendizaje ensamblador |
| Language | English |
| Authors | Aziz, Vanya |
| Keywords | Robótica - Tesis doctorales; Programación lineal; Aprendizaje automático (Inteligencia artificial); Redes neuronales (Informática); Distributional Reinforcement Learning; Soft Actor-Critic; Robotics; Linear Programming; Ensemble |
| Issue Date | 2025 |
| Examination Date | 2025 |
| Publisher | UMA Editorial |
| URI | https://hdl.handle.net/20.500.12738/19053.2 |
| DOI | 10.48441/4427.3295.2 |
| Review status | This version was reviewed (alternative review procedure) |
| Institute | Universidad de Málaga; Universidad de Málaga, Departamento de Ingeniería mecánica, térmica y de fluidos; Department Maschinenbau und Produktion (former, dissolved 10/2025); Fakultät Technik und Informatik (former, dissolved 10/2025) |
| Type | Thesis |
| Thesis type | Doctoral Thesis |
| Additional note | Aziz, Vanya. (2025). Novel distributional reinforcement and ensemble learning algorithms, I-VII, 1-153. Dissertation. UMA Editorial. https://hdl.handle.net/10630/39287 |
| Advisors | Hendrix, Eligius María Theodorus; Nowak, Ivo |
| Appears in Collections | Publications with full text |

Abstract:

The term “Industry 4.0” describes the fourth industrial revolution and is characterized by the integration of digital technology into manufacturing processes. The transformative concepts of Industry 4.0 enable economic production at radically small lot sizes, requiring unprecedented levels of automation and adaptability; production facilities must therefore be autonomously acting, self-optimizing systems. The field of machine learning offers promising solutions to achieve these objectives, particularly by enabling data-driven decision-making and adaptive control mechanisms. This dissertation focuses on Deep Reinforcement Learning (DRL), a neural-network-based approach for solving Markov Decision Processes in high-dimensional spaces with unknown transition dynamics.

The main contribution of this thesis is the development of a novel state-of-the-art distributional reinforcement learning algorithm within the maximum-entropy Actor-Critic framework. This algorithm, termed “Cramér-based Distributional Soft Actor-Critic” (CDSAC), demonstrates performance superior to that of other RL algorithms, especially in environments with high-dimensional spaces and complex dynamics. Its performance is shown to be partly rooted in a phenomenon arising in Cramér-metric-based distributional reinforcement learning, referred to as confidence-driven model updates: the value function approximator is updated more conservatively when confidence in its estimates is low. Theoretical justifications for the algorithm are provided, demonstrating its convergence in the policy evaluation setting and, under widely accepted mild assumptions, in the control setting as well.

Beyond foundational algorithmic research, this thesis contributes to the practical application of RL in robotics. Given the crucial role of multi-joint robotic systems in modern production technology, an RL meta-algorithm called “Reinforcement Learning - Inverse Kinematics” (RL-IK) is devised. This approach enhances the applicability of reinforcement learning to robotic control tasks by significantly accelerating convergence to near-optimal policies.
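As background to the abstract, the snippet below is a minimal NumPy sketch of the Cramér (l2) distance between two categorical return distributions on a fixed support, the kind of quantity a Cramér-metric-based distributional critic minimizes. It is illustrative only, not code from the thesis; the function name, the support, and the toy distributions are assumptions.

```python
import numpy as np

def cramer_distance(p: np.ndarray, q: np.ndarray, support: np.ndarray) -> float:
    """Cramér (l2) distance between two categorical distributions.

    Computes sqrt( sum_i (F_p(z_i) - F_q(z_i))^2 * dz ), i.e. the l2 norm
    of the difference of the two CDFs on an evenly spaced support
    z_1 < ... < z_N. Illustrative sketch, not the thesis implementation.
    """
    dz = support[1] - support[0]           # atom spacing (assumed uniform)
    cdf_gap = np.cumsum(p) - np.cumsum(q)  # pointwise CDF difference
    return float(np.sqrt(np.sum(cdf_gap ** 2) * dz))

# Toy example: a sharp (confident) estimate vs. a flat (uncertain) one.
support = np.linspace(-10.0, 10.0, 51)  # 51 evenly spaced return atoms
sharp = np.zeros(51)
sharp[30] = 1.0                         # point mass at z = 2.0
flat = np.full(51, 1.0 / 51)            # uniform, maximally uncertain
print(cramer_distance(sharp, flat, support))
```

Unlike an expected-value temporal-difference loss, this loss compares whole cumulative distribution functions, so the spread of the current estimate enters the loss directly; that sensitivity is the kind of behaviour the abstract refers to as confidence-driven model updates.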
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| Thesis_30_03_26.pdf | | 3.75 MB | Adobe PDF |
Version History
| Version | Item | Date | Summary |
|---|---|---|---|
| 2 | doi:10.48441/4427.3295.2 | 2026-04-02 07:58:52 | updated version (only minor changes): 2026-03-30 |
| 1 | doi:10.48441/4427.3295 | 2026-03-12 14:23:29 | old version: 2025 |
This item is licensed under a Creative Commons License