Implementation of a Speech-command-interface on Microcontroller with TinyML

Pham, Duy Anh

Title:	Implementation of a Speech-command-interface on Microcontroller with TinyML
Language:	English
Author:	Pham, Duy Anh
Keywords:	Machine learning; Deep Learning; Embedded System; Speech recognition; Embedded AI; Voice Recognition; Embedded ML
Publication date:	5-Apr-2024
Abstract:	TinyML is a recent technology that enables ML to be implemented and deployed on embedded systems, particularly microcontrollers. The core of a TinyML application is the inference API built upon the TensorFlow Lite/micro-kernel. This thesis presents an experimental implementation of a speech-command interface (SCI) on a microcontroller. The implemented ML model uses MFCCs as speech features because they are commonly used and have proven effective in many applications. Instead of a standard CNN with 2D convolutional filters, a 1D convolution operator extracts information from the inputs, since this reduces the model size further without losing much performance. The result is a tiny 1D-Conv model with a minimal RAM usage of 13.8 kB. The SCI is designed as a standalone speech processing module that interfaces with an external host system over a serial connection (UART), using AT commands as the application message protocol.
URI:	http://hdl.handle.net/20.500.12738/15406
Institution:	Fakultät Technik und Informatik Department Fahrzeugtechnik und Flugzeugbau
Document type:	Thesis
Thesis type:	Bachelor thesis
First examiner:	Meisel, Andreas
Thesis reviewer:	Dahlkemper, Jörg
Appears in collections:	Theses
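
The abstract states that 1D convolution reduces model size compared with 2D convolutional filters. The following is a minimal back-of-the-envelope sketch of why that can hold for a hidden layer. All shapes and layer sizes here (49 frames, 13 coefficients, kernel width 3, 8 input and 16 output channels) are hypothetical values chosen for illustration and are not taken from the thesis; the helper `conv_layer_cost` is likewise an illustrative name.

```python
def conv_layer_cost(kernel_elems, in_channels, filters, out_positions):
    """Return (parameters incl. biases, output activations) for one
    'same'-padded, stride-1 convolution layer."""
    params = kernel_elems * in_channels * filters + filters
    activations = out_positions * filters
    return params, activations


# Hypothetical sizes, NOT taken from the thesis.
FRAMES, COEFFS = 49, 13      # MFCC matrix: time frames x coefficients
K, F_IN, F_OUT = 3, 8, 16    # kernel width, input channels, output filters

# 2D hidden layer: a K x K kernel slides over the time and coefficient axes.
params_2d, acts_2d = conv_layer_cost(K * K, F_IN, F_OUT, FRAMES * COEFFS)

# 1D hidden layer: a width-K kernel slides over the time axis only.
params_1d, acts_1d = conv_layer_cost(K, F_IN, F_OUT, FRAMES)

print(f"2D: {params_2d} parameters, {acts_2d} output activations")
print(f"1D: {params_1d} parameters, {acts_1d} output activations")
```

Under these assumptions the 1D layer needs roughly a third of the weights and far fewer output activations, and the activations are what dominate RAM on a microcontroller. The actual savings depend on the architecture used in the thesis, which this record does not specify.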