Use TensorFlow.js with the WebAudio API and WebGL GPU acceleration in the browser to recognize “keywords”. In our case, without any retraining, “UP” turns the LED on and “DOWN” turns it off.

FFT on ESP32

GitHub: debsahu/SpeechRecognitionTensorFlowJS

Speech recognition in the browser, with the page served by AsyncWebServer on an ESP8266 to control LED_BUILTIN/GPIO16

  • Uses the WebAudio API and WebGL GPU acceleration, so speech recognition is done in the browser, not on the ESP8266
  • Chrome blocks microphone requests from pages served over http://, so use Firefox instead
  • tf.min.js and speech-commands.min.js are served from SPIFFS (1MB program/3MB SPIFFS partition needed)
  • /upload and /update are a modern take on updating the ESP8266
  • Uses HTML templates to report LED_BUILTIN/GPIO16 status
  • Speech recognition: “UP” = ON and “DOWN” = OFF; “RIGHT” and “LEFT” are ignored
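
The keyword-to-LED mapping in the last bullet can be sketched as a small browser-side helper. This is only an illustration under stated assumptions, not the project's actual code: the names `LedAction` and `pickLedAction` and the 0.75 threshold are hypothetical. The commented wiring assumes the `speechCommands` global provided by speech-commands.min.js, and the request that would tell the ESP8266 to switch the LED is left out because the endpoint is not described here.

```typescript
// Hypothetical helper (not from the project): decide what to do with the LED
// from one recognition result of the speech-commands model.
type LedAction = "on" | "off" | null;

function pickLedAction(
  labels: string[],          // word labels, in the same order as the scores
  scores: ArrayLike<number>, // one probability per label
  threshold = 0.75,          // assumed confidence cutoff
): LedAction {
  // Find the label with the highest score.
  let bestLabel = "";
  let bestScore = -Infinity;
  const n = Math.min(labels.length, scores.length);
  for (let i = 0; i < n; i++) {
    if (scores[i] > bestScore) {
      bestScore = scores[i];
      bestLabel = labels[i];
    }
  }
  // Ignore low-confidence results.
  if (bestScore < threshold) return null;

  // "UP" = ON, "DOWN" = OFF; every other word (e.g. "LEFT", "RIGHT") is ignored.
  const word = bestLabel.toLowerCase();
  if (word === "up") return "on";
  if (word === "down") return "off";
  return null;
}

// Possible wiring in the page (assumes the speechCommands global is loaded):
//
//   const recognizer = speechCommands.create("BROWSER_FFT");
//   await recognizer.ensureModelLoaded();
//   const labels = recognizer.wordLabels();
//   recognizer.listen(async (result) => {
//     const action = pickLedAction(labels, result.scores as Float32Array);
//     if (action !== null) {
//       // send the action to the ESP8266 here (endpoint not shown)
//     }
//   }, { probabilityThreshold: 0.75 });
```

Keeping the decision in a pure function separates it from the microphone and the model, so it can be checked with hand-written score arrays before anything is flashed to the board.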

Arduino Libraries needed

platformio.ini is included; build with PlatformIO and it will take care of installing the following libraries.

ESP Async WebServer