Entwicklung eines Reinforcement Learning basierten Flugzeugautopiloten unter der Verwendung von Deterministic Policy Gradients

Wagner, Stefan Sylvius

DC Element	Value	Language
dc.contributor.advisor	Meisel, Andreas	-
dc.contributor.author	Wagner, Stefan Sylvius
dc.date.accessioned	2020-09-29T14:34:20Z	-
dc.date.available	2020-09-29T14:34:20Z	-
dc.date.created	2018
dc.date.issued	2018-05-14
dc.identifier.uri	http://hdl.handle.net/20.500.12738/8284	-
dc.description.abstract	Eine der schwierigsten Aufgaben im Reinforcement Learning ist die Regelung von Systemen in einem kontinuierlichen Zustandsraum und die anschließende Steuerung in einem kontinuierlichen Aktionsraum. In dieser Arbeit wird ein Reinforcement Learning basierter Flugzeugautopilot konzipiert und implementiert, der einen kontinuierlichen Zustandsraum approximiert und ein Flugzeug mit Aktionen in einem kontinuierlichen Wertebereich steuert. Deterministic Policy Gradients bieten ein spezialisiertes Framework in Form einer Actor-Critic-Architektur, die in der Lage ist, aus einem kontinuierlichen Zustandsraum kontinuierliche Aktionswerte zu ermitteln. Dieses Framework wird im Zusammenhang mit einer Belohnungsfunktion, die Feedback über das Verhalten des Autopiloten liefert, implementiert. Um die Realisierbarkeit und Robustheit des Reinforcement Learning basierten Flugzeugautopiloten zu überprüfen, werden unterschiedliche Szenarien erstellt, die anhand eines kommerziellen Flugsimulators ausgeführt und anschließend statistisch analysiert werden.	de
dc.description.abstract	One of the most difficult challenges in reinforcement learning is the control of systems in continuous state and action spaces. The goal of this thesis is to design and implement a reinforcement learning based airplane autopilot that controls an aircraft in a continuous state and action space. Deterministic Policy Gradients define a framework for this purpose in the form of an actor-critic architecture that derives a continuous action vector from a continuous state space. The framework is implemented together with a reward function that provides the autopilot with behavioral feedback. Finally, the feasibility and robustness of the implemented autopilot are tested inside a commercial flight simulator. For this purpose, multiple scenarios are defined and the resulting data are evaluated with statistical methods.	en
dc.language.iso	de	de
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	-
dc.subject.ddc	004 Informatik
dc.title	Entwicklung eines Reinforcement Learning basierten Flugzeugautopiloten unter der Verwendung von Deterministic Policy Gradients	de
dc.type	Thesis
openaire.rights	info:eu-repo/semantics/openAccess
thesis.grantor.department	Department Informatik
thesis.grantor.place	Hamburg
thesis.grantor.universityOrInstitution	Hochschule für angewandte Wissenschaften Hamburg
tuhh.contributor.referee	Fohl, Wolfgang	-
tuhh.gvk.ppn	1022193562
tuhh.identifier.urn	urn:nbn:de:gbv:18302-reposit-82869	-
tuhh.note.extern	publ-mit-pod
tuhh.note.intern	1
tuhh.oai.show	true	en_US
tuhh.opus.id	4225
tuhh.publication.institute	Department Informatik
tuhh.type.opus	Bachelor Thesis	-
dc.subject.gnd	Operante Konditionierung
dc.type.casrai	Supervised Student Publication	-
dc.type.dini	bachelorThesis	-
dc.type.driver	bachelorThesis	-
dc.type.status	info:eu-repo/semantics/publishedVersion
dc.type.thesis	bachelorThesis
dcterms.DCMIType	Text	-
tuhh.dnb.status	domain	-
item.openairetype	Thesis	-
item.cerifentitytype	Publications	-
item.creatorOrcid	Wagner, Stefan Sylvius	-
item.advisorGND	Meisel, Andreas	-
item.fulltext	With Fulltext	-
item.grantfulltext	open	-
item.languageiso639-1	de	-
item.creatorGND	Wagner, Stefan Sylvius	-
item.openairecristype	http://purl.org/coar/resource_type/c_46ec	-
Appears in Collections:	Theses