Novel distributional reinforcement and ensemble learning algorithms

Aziz, Vanya

doi:10.48441/4427.3295

Notice: This is not the latest version of this item. The latest version can be found at: http://dx.doi.org/10.48441/4427.3295.2

Please use this identifier to cite or link to this item: https://doi.org/10.48441/4427.3295

DC Field	Value	Language
dc.contributor.advisor	Hendrix, Eligius María Theodorus	-
dc.contributor.advisor	Nowak, Ivo	-
dc.contributor.author	Aziz, Vanya	-
dc.date.accessioned	2026-03-12T13:23:29Z	-
dc.date.available	2026-03-12T13:23:29Z	-
dc.date.issued	2025	-
dc.identifier.uri	https://hdl.handle.net/20.500.12738/19053	-
dc.description.abstract	This dissertation focuses on Deep Reinforcement Learning (DRL), a neural network-based approach for solving Markov Decision Processes in high-dimensional spaces with unknown transition dynamics. The main contribution of this thesis is the development of a novel state-of-the-art distributional reinforcement learning algorithm within the maximum-entropy Actor-Critic framework. This algorithm, termed "Cramér-based Soft Distributional Soft Actor-critic" (C-DSAC), demonstrates superior performance to other RL algorithms, especially in environments with high-dimensional spaces and complex dynamics. Its performance is shown to be partly rooted in a phenomenon arising in Cramér-metric-based Distributional Reinforcement Learning, referred to as confidence-driven model updates. This mechanism ensures that the value function approximator is updated more conservatively when confidence in its estimates is low. Theoretical justifications for the algorithm are provided, demonstrating its convergence in the policy evaluation setting and, under widely accepted mild assumptions, in the control setting as well. Beyond foundational algorithmic research, this thesis contributes to the practical application of RL in robotics. Given the crucial role of multi-joint robotic systems in modern production technology, an RL meta-algorithm called "Reinforcement Learning - Inverse Kinematics" (RL-IK) is devised. This approach enhances the applicability of reinforcement learning to robotic control tasks by significantly accelerating convergence to near-optimal policies compared to standard RL methods. An essential prerequisite for real-world RL applications in control systems is machine perception for state identification. To address challenges in this field, this thesis explores novel Supervised Learning (SL) approaches, validated on image classification tasks, with a focus on ensemble learning strategies.	en
dc.language.iso	en	en_US
dc.publisher	UMA Editorial	en_US
dc.relation.isreplacedby	http://dx.doi.org/10.48441/4427.3295.2	-
dc.subject	Robótica - Tesis doctorales	en_US
dc.subject	Programación lineal	en_US
dc.subject	Aprendizaje automático (Inteligencia artificial)	en_US
dc.subject	Redes neuronales (Informática)	en_US
dc.subject	Distributional Reinforcement Learning	en_US
dc.subject	Soft Actor-Critic	en_US
dc.subject	Robotics	en_US
dc.subject	Linear Programming	en_US
dc.subject	Ensemble	en_US
dc.subject.ddc	620: Ingenieurwissenschaften	en_US
dc.title	Novel distributional reinforcement and ensemble learning algorithms	en
dc.title.alternative	Nuevos algoritmos de aprendizaje por refuerzo distributivo y aprendizaje ensamblador	es
dc.type	Thesis	en_US
dc.identifier.doi	10.48441/4427.3295	-
dcterms.dateAccepted	2025	-
dc.description.version	AlternativeReviewed	en_US
openaire.rights	info:eu-repo/semantics/openAccess	en_US
thesis.grantor.department	Universidad de Málaga. Departamento de Ingeniería mecánica, térmica y de fluidos	en_US
thesis.grantor.place	Málaga	en_US
thesis.grantor.universityOrInstitution	Universidad de Málaga	en_US
tuhh.identifier.urn	urn:nbn:de:gbv:18302-reposit-235844	-
tuhh.oai.show	true	en_US
tuhh.publication.institute	Universidad de Málaga	en_US
tuhh.publication.institute	Universidad de Málaga. Departamento de Ingeniería mecánica, térmica y de fluidos	en_US
tuhh.publication.institute	Department Maschinenbau und Produktion (ehemalig, aufgelöst 10.2025)	en_US
tuhh.publication.institute	Fakultät Technik und Informatik (ehemalig, aufgelöst 10.2025)	en_US
tuhh.publisher.url	https://hdl.handle.net/10630/39287	-
tuhh.type.opus	Dissertation	-
tuhh.type.rdm	true	-
dc.rights.cc	https://creativecommons.org/licenses/by-nc-nd/4.0/	en_US
dc.type.casrai	Dissertation	-
dc.type.dini	doctoralThesis	-
dc.type.driver	doctoralThesis	-
dc.type.status	info:eu-repo/semantics/publishedVersion	en_US
dc.type.thesis	doctoralThesis	en_US
dcterms.DCMIType	Text	-
local.comment.external	Aziz, Vanya. (2025). Novel distributional reinforcement and ensemble learning algorithms, I-VII, 1-153. dissertation. UMA Editorial. https://hdl.handle.net/10630/39287	en_US
tuhh.apc.status	false	en_US
item.creatorGND	Aziz, Vanya	-
item.openairetype	Thesis	-
item.openairecristype	http://purl.org/coar/resource_type/c_46ec	-
item.languageiso639-1	en	-
item.advisorGND	Hendrix, Eligius María Theodorus	-
item.advisorGND	Nowak, Ivo	-
item.grantfulltext	open	-
item.fulltext	With Fulltext	-
item.creatorOrcid	Aziz, Vanya	-
item.cerifentitytype	Publications	-
crisitem.author.dept	Department Maschinenbau und Produktion (ehemalig, aufgelöst 10.2025)	-
crisitem.author.parentorg	Fakultät Technik und Informatik (ehemalig, aufgelöst 10.2025)	-
Appears in Collections:	Publications with full text

Files in This Item:

File	Description	Size	Format
TD_AZIZ_Vanya.pdf		3.56 MB	Adobe PDF	View/Open

Version History

Version	Item	Date	Summary
2	doi:10.48441/4427.3295.2	2026-04-02 07:58:52.153	updated version (only minor changes): 2026-03-30
1	doi:10.48441/4427.3295	2026-03-12 14:23:29.0	old version: 2025

This item is licensed under a Creative Commons License