Please use this identifier to cite or link to this item: https://doi.org/10.48441/4427.3295.2
DC Field: Value [Language]
dc.contributor.advisor: Hendrix, Eligius María Theodorus
dc.contributor.advisor: Nowak, Ivo
dc.contributor.author: Aziz, Vanya
dc.date.accessioned: 2026-04-08T11:44:41Z
dc.date.available: 2026-03-12T13:23:29Z
dc.date.available: 2026-04-08T11:44:41Z
dc.date.issued: 2025
dc.identifier.uri: https://hdl.handle.net/20.500.12738/19053.2
dc.description.abstract: The term "Industry 4.0" describes the fourth industrial revolution and is characterized by the integration of digital technology into manufacturing processes. The transformative concepts of Industry 4.0 enable economic production at radically small lot sizes, requiring unprecedented levels of automation and adaptability. These requirements on production facilities necessitate autonomously acting, self-optimizing systems. The field of machine learning offers promising solutions for achieving the objectives of Industry 4.0, particularly by enabling data-driven decision-making and adaptive control mechanisms. This dissertation focuses on Deep Reinforcement Learning (DRL), a neural-network-based approach for solving Markov Decision Processes in high-dimensional spaces with unknown transition dynamics. The main contribution of this thesis is the development of a novel state-of-the-art distributional reinforcement learning algorithm within the maximum-entropy Actor-Critic framework. This algorithm, termed "Cramér-based Soft Distributional Soft Actor-Critic" (CDSAC), demonstrates performance superior to that of other RL algorithms, especially in environments with high-dimensional spaces and complex dynamics. Its performance is shown to be partly rooted in a phenomenon arising in Cramér-metric-based distributional reinforcement learning, referred to as confidence-driven model updates: this mechanism ensures that the value function approximator is updated more conservatively when confidence in its estimates is low. Theoretical justifications for the algorithm are provided, demonstrating its convergence in the policy evaluation setting and, under widely accepted mild assumptions, in the control setting as well. Beyond foundational algorithmic research, this thesis contributes to the practical application of RL in robotics. Given the crucial role of multi-joint robotic systems in modern production technology, an RL meta-algorithm called "Reinforcement Learning - Inverse Kinematics" (RL-IK) is devised. This approach enhances the applicability of reinforcement learning to robotic control tasks by significantly accelerating convergence to near-optimal policies. [en]
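The confidence-driven model updates described in the abstract can be illustrated with a toy sketch. This is hypothetical illustrative code, not taken from the thesis: here an ensemble's disagreement (its standard deviation) stands in for low confidence, and the effective step size toward a learning target shrinks accordingly, so uncertain estimates move more conservatively.

```python
# Hypothetical sketch of a confidence-driven value update (illustration only,
# not the CDSAC implementation): disagreement within an ensemble of value
# estimates lowers confidence, which in turn shrinks the update step.
import statistics

def confident_update(ensemble, target, base_lr=0.5):
    """Move each ensemble member toward the target, scaled by confidence."""
    spread = statistics.pstdev(ensemble)   # high spread = low confidence
    confidence = 1.0 / (1.0 + spread)      # maps spread to (0, 1]
    lr = base_lr * confidence              # conservative step when uncertain
    return [v + lr * (target - v) for v in ensemble]

# An agreeing ensemble takes the full base step (each 1.0 -> 1.5),
# while a disagreeing one moves more cautiously toward the same target.
tight = confident_update([1.0, 1.0, 1.0], target=2.0)
loose = confident_update([0.0, 1.0, 2.0], target=2.0)
```

The mapping from spread to confidence is an arbitrary choice for this sketch; the point is only that lower confidence yields a smaller effective learning rate.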
dc.language.iso: en [en_US]
dc.publisher: UMA Editorial [en_US]
dc.relation.replaces: http://dx.doi.org/10.48441/4427.3295
dc.subject: Robotics - Doctoral theses [en_US]
dc.subject: Linear programming [en_US]
dc.subject: Machine learning (Artificial intelligence) [en_US]
dc.subject: Neural networks (Computer science) [en_US]
dc.subject: Distributional Reinforcement Learning [en_US]
dc.subject: Soft Actor-Critic [en_US]
dc.subject: Robotics [en_US]
dc.subject: Linear Programming [en_US]
dc.subject: Ensemble [en_US]
dc.subject.ddc: 620: Engineering sciences [en_US]
dc.title: Novel distributional reinforcement and ensemble learning algorithms [en]
dc.title.alternative: Nuevos algoritmos de aprendizaje por refuerzo distributivo y aprendizaje ensamblador [es]
dc.type: Thesis [en_US]
dc.identifier.doi: 10.48441/4427.3295.2
dcterms.dateAccepted: 2025
dc.description.versionAlternative: Reviewed [en_US]
openaire.rights: info:eu-repo/semantics/openAccess [en_US]
thesis.grantor.department: Universidad de Málaga. Departamento de Ingeniería mecánica, térmica y de fluidos [en_US]
thesis.grantor.place: Málaga [en_US]
thesis.grantor.universityOrInstitution: Universidad de Málaga [en_US]
tuhh.identifier.urn: urn:nbn:de:gbv:18302-reposit-237178
tuhh.oai.show: true [en_US]
tuhh.publication.institute: Universidad de Málaga [en_US]
tuhh.publication.institute: Universidad de Málaga. Departamento de Ingeniería mecánica, térmica y de fluidos [en_US]
tuhh.publication.institute: Department Maschinenbau und Produktion (former; dissolved 10/2025) [en_US]
tuhh.publication.institute: Fakultät Technik und Informatik (former; dissolved 10/2025) [en_US]
tuhh.publisher.url: https://hdl.handle.net/10630/39287
tuhh.type.opus: Dissertation
tuhh.type.rdm: true
dc.rights.cc: https://creativecommons.org/licenses/by-nc-nd/4.0/ [en_US]
dc.type.casrai: Dissertation
dc.type.dini: doctoralThesis
dc.type.driver: doctoralThesis
dc.type.status: info:eu-repo/semantics/updatedVersion [en_US]
dc.type.thesis: doctoralThesis [en_US]
dcterms.DCMIType: Text
local.comment.external: Aziz, Vanya. (2025). Novel distributional reinforcement and ensemble learning algorithms, I-VII, 1-153. Dissertation. UMA Editorial. https://hdl.handle.net/10630/39287 [en_US]
tuhh.apc.status: false [en_US]
item.fulltext: With Fulltext
item.grantfulltext: open
item.creatorOrcid: Aziz, Vanya
item.creatorGND: Aziz, Vanya
item.languageiso639-1: en
item.openairecristype: http://purl.org/coar/resource_type/c_46ec
item.cerifentitytype: Publications
item.advisorGND: Hendrix, Eligius María Theodorus
item.advisorGND: Nowak, Ivo
item.openairetype: Thesis
crisitem.author.dept: Department Maschinenbau und Produktion (former; dissolved 10/2025)
crisitem.author.parentorg: Fakultät Technik und Informatik (former; dissolved 10/2025)
Appears in Collections: Publications with full text
Files in This Item:
Thesis_30_03_26.pdf (3.75 MB, Adobe PDF)
Version History
Version 2: doi:10.48441/4427.3295.2 (2026-04-02) - updated version (only minor changes): 2026-03-30
Version 1: doi:10.48441/4427.3295 (2026-03-12) - old version: 2025

This item is licensed under a Creative Commons License (CC BY-NC-ND 4.0).