| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Taefi, Tessa | - |
| dc.contributor.author | Zach, Sophie | - |
| dc.date.accessioned | 2026-06-02T08:23:12Z | - |
| dc.date.available | 2026-06-02T08:23:12Z | - |
| dc.date.issued | 2025-07-10 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.12738/19399 | - |
| dc.description.abstract | This thesis investigates the influence of training language on the performance of handwritten text recognition (HTR) models. Two separate Vision Transformer-based models were trained on datasets in different languages: one on English data (IAM dataset) and the other on German data (fhswf/german_handwriting). Both models were evaluated on their native test sets as well as on a cross-lingual test set to assess generalization and linguistic robustness. Quantitative evaluation using Character Error Rate (CER) and Word Error Rate (WER) shows a clear degradation in recognition performance when a model is tested on a language different from that of its training set. This highlights the sensitivity of HTR models to language-specific features, even when they are based on language-agnostic decoding mechanisms such as Connectionist Temporal Classification (CTC). A qualitative error analysis illustrates how specific types of language-dependent character sequences contribute to recognition failures. Furthermore, a pipeline for character-level, n-gram-based error attribution was implemented to explore whether misrecognitions correlate with language-dominant character patterns. Although the n-gram analysis could not be fully utilized due to insufficient cross-lingual performance, its results are discussed in the appendix, and the implemented tools remain available for future experimentation. The findings underscore the need for either multilingual training strategies or language-specific adaptation in practical HTR systems. The code (https://github.com/Mir0da/HTR-VT_Bachelor), the German-trained model (https://huggingface.co/Mir0da/HTR-VT-german), and the English-trained model (https://huggingface.co/Mir0da/HTR-VT-english) are publicly available. | en |
| dc.language.iso | en | en_US |
| dc.subject.ddc | 004: Computer science | en_US |
| dc.title | Comparison of language-specific HTR models : “Does the language of the training corpus affect the performance of a handwritten text recognition (HTR) model on crosslingual settings?” | en |
| dc.type | Thesis | en_US |
| openaire.rights | info:eu-repo/semantics/openAccess | en_US |
| thesis.grantor.department | Fakultät Design, Medien und Information (former, dissolved 10/2025) | en_US |
| thesis.grantor.department | Department Medientechnik (former, dissolved 10/2025) | en_US |
| thesis.grantor.universityOrInstitution | Hochschule für Angewandte Wissenschaften Hamburg | en_US |
| tuhh.contributor.referee | Schumann, Sabine | - |
| tuhh.identifier.urn | urn:nbn:de:gbv:18302-reposit-240720 | - |
| tuhh.oai.show | true | en_US |
| tuhh.publication.institute | Fakultät Design, Medien und Information (former, dissolved 10/2025) | en_US |
| tuhh.publication.institute | Department Medientechnik (former, dissolved 10/2025) | en_US |
| tuhh.type.opus | Bachelor Thesis | - |
| dc.type.casrai | Supervised Student Publication | - |
| dc.type.dini | bachelorThesis | - |
| dc.type.driver | bachelorThesis | - |
| dc.type.status | info:eu-repo/semantics/publishedVersion | en_US |
| dc.type.thesis | bachelorThesis | en_US |
| dcterms.DCMIType | Text | - |
| tuhh.dnb.status | domain | en_US |
| item.openairecristype | http://purl.org/coar/resource_type/c_46ec | - |
| item.cerifentitytype | Publications | - |
| item.openairetype | Thesis | - |
| item.fulltext | With Fulltext | - |
| item.creatorGND | Zach, Sophie | - |
| item.grantfulltext | open | - |
| item.languageiso639-1 | en | - |
| item.creatorOrcid | Zach, Sophie | - |
| item.advisorGND | Taefi, Tessa | - |
| Appears in Collections: | Theses | |
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| BA_A_comparison_of_language-specific_HTR_models.pdf | | 1.05 MB | Adobe PDF |
Items in REPOSIT are protected by copyright, with all rights reserved, unless otherwise indicated.