Title: | Vorhersage eines Taxifahrpreises in New York City mit maschinellen Lernverfahren (Predicting a taxi fare in New York City with machine learning methods) | Language: | German | Authors: | Cao, Thu-Bao | Keywords: | Machine Learning; KDD; Logistic Regression; Random Forest; Yellow Taxi Dataset; Imbalance Dataset | Issue Date: | 27-Sep-2023 | Abstract: | This study presents the data quality problems of a real-world data set, the New York City Yellow Cab data set, and methods for improving that quality. It discusses which external data and which new features obtained through feature engineering are relevant for predicting the taxi fare in New York City. Finally, different machine learning algorithms and different versions of the training data set are tested to compare their effects on prediction performance. |
URI: | http://hdl.handle.net/20.500.12738/14207 | Institute: | Department Informatik Fakultät Technik und Informatik |
Type: | Thesis | Thesis type: | Bachelor Thesis | Advisor: | von Luck, Kai | Referee: | Tiedemann, Tim |
Appears in Collections: | Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Bachelorarbeit_Cao_Thu-Bao_geschwärzt.pdf | | 4.65 MB | Adobe PDF | View/Open |
Items in REPOSIT are protected by copyright, with all rights reserved, unless otherwise indicated.