
20210624


pytorch-image-models

https://github.com/rwightman/pytorch-image-models

My Python Examples

https://github.com/geekcomputers/Python

500+ Artificial Intelligence Project List with code

https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code

AndroidEnv - The Android Learning Environment

https://github.com/deepmind/android_env

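"Computer Vision in Action: Algorithms and Applications": Chinese e-book, source code, and reader community (still being updated; star it for now)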

https://github.com/Charmve/computer-vision-in-action

[DIY] I turned a bicycle into a self-driving one!! [Hardcore]

https://github.com/peng-zhihui/XUAN-Bike

TFCC is a C++ deep learning inference framework.

http://github.com/Tencent/WeChat-TFCC

Making flexible use of matrix operations in Python's numpy library

http://www.elecfans.com/emb/580136.html

(IMP???) MDK (Keil 5) TensorFlow pack, includes the TFLM speech example

https://github.com/MDK-Packs/tensorflow-pack

tensorflow-arduino-examples

https://github.com/antmicro/tensorflow-arduino-examples

(IMP) Create a USB Microphone with the Raspberry Pi Pico

https://www.hackster.io/sandeep-mistry/create-a-usb-microphone-with-the-raspberry-pi-pico-cc9bd5
https://github.com/sandeepmistry/pico-microphone

VCC<->VCC  
GND<->GND  
SEL<->GND  
DAT<->GPIO2  
CLK<->GPIO3

Note: Connecting the PDM mic's SEL pin to GND makes the mic clock
out new data after the clock signal falls (goes from logic level 1 to 0).

3.3V<->VCC    
GND<->GND  
GPIO26<->OUT

ref OpenPDM2PCM, see below

OpenPDM2PCM

https://os.mbed.com/teams/ST/code/X_NUCLEO_CCA02M1//file/53f8b511f2a1/Middlewares/OpenPDM2PCM/
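A PDM microphone outputs a 1-bit stream whose density of ones tracks the sound pressure, so it has to be low-pass filtered and decimated to get PCM samples. The following is a minimal sketch of that idea using plain block averaging. It is not the OpenPDM2PCM algorithm; the function name `pdm_to_pcm` and the decimation factor of 64 are assumptions chosen for illustration, and real converters typically use better low-pass filters than a simple average.

```python
def pdm_to_pcm(bits, decimation=64):
    """Convert a 1-bit PDM stream (list of 0/1) to PCM samples in [-1.0, 1.0].

    Hypothetical helper: each output sample is the average of `decimation`
    consecutive PDM bits (a crude boxcar low-pass filter plus decimation).
    """
    pcm = []
    for start in range(0, len(bits) - decimation + 1, decimation):
        ones = sum(bits[start:start + decimation])
        # Map the density of ones (0..decimation) to the signed range [-1, 1].
        pcm.append(2.0 * ones / decimation - 1.0)
    return pcm


if __name__ == "__main__":
    # All ones, all zeros, then an alternating pattern (50% density).
    stream = [1] * 64 + [0] * 64 + [1, 0] * 32
    print(pdm_to_pcm(stream))
```

With a PDM clock of 1.024 MHz, a decimation factor of 64 would give a 16 kHz PCM rate, which is a common choice for speech.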

Learn how to responsibly deliver value with ML.

https://github.com/GokuMohandas/MadeWithML

Cloud-native neural search framework for any kind of data

https://github.com/jina-ai/jina

A repository to index and organize the latest machine learning courses found on YouTube.

https://github.com/dair-ai/ML-YouTube-Courses

ESP32 One (like ESP-EYE), waveshare

https://www.waveshare.net/wiki/ESP32_One
https://github.com/espressif/esp-who
https://github.com/espressif/esp-who/blob/master/docs/zh_CN/get-started/ESP-EYE_Getting_Started_Guide.md
wake word: say "Hi Lexin" (你好, 乐鑫; "Hello, Espressif")
https://github.com/espressif/esp-who/blob/master/examples/single_chip/face_recognition_solution/README.md
https://dl.espressif.com/dl/Hi_Lexin_wake-up_commend.wav
https://github.com/espressif/esp-who/blob/master/examples/single_chip/face_recognition_solution/main/app_main.c
https://github.com/espressif/esp-skainet/blob/master/examples/garbage_classification/README_cn.md

EMW3162, nuttx RTOS

https://github.com/apache/incubator-nuttx/blob/fd46d7a74fda13d6b4eb5505bc0689f3a2b78d5e/boards/Kconfig
https://www.waveshare.com/wiki/EMW3162

PaddleX -- PaddlePaddle's end-to-end development toolkit

https://github.com/PaddlePaddle/PaddleX

xr872 tflm

https://gitee.com/tutuwin/xr872_-audio
search baidupan

google streaming kws

https://github.com/google-research/google-research/tree/master/kws_streaming
https://github.com/StuartIanNaylor/g-kws
stream_kws_cnn
Streaming keyword spotting on mobile devices
https://blog.csdn.net/weixin_37598106/article/details/106801481

(IMP) Edge Impulse

https://github.com/awih97/Custom-KWS-for-STM32-using-Edge-Impulse
https://github.com/awih97/Custom-KWS-for-STM32-using-Edge-Impulse/blob/main/Home-Automation-traininig/README.md
Custom-KWS-for-STM32-using-Edge-Impulse, STM32F446RE, INMP441
Edge Impulse
https://docs.edgeimpulse.com/docs/getting-started
https://github.com/smlee00/STM32-Keyword-Spotting-with-Edge-Impulse
https://github.com/smlee00/STM32-Keyword-Spotting-with-Edge-Impulse/blob/main/nucleo-f446-ei-kws/ei-keyword-spotting/model-parameters/model_metadata.h
ei_classifier_inferencing_categories[] = { "_noise", "_unknown", "bed", "marvin", "off", "on" };
https://github.com/smlee00/STM32-Keyword-Spotting-with-Edge-Impulse/blob/main/nucleo-f446-ei-kws/ei-keyword-spotting/model-parameters/dsp_blocks.h
extract_mfcc_features
https://github.com/smlee00/STM32-Keyword-Spotting-with-Edge-Impulse/blob/main/nucleo-f446-ei-kws/ei-keyword-spotting/edge-impulse-sdk/classifier/ei_run_classifier.h
run_classifier
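run_classifier_continuous (my guess: this is probably the sliding-window version, so the same keyword can be hit several times in a row and the repeats have to be detected and suppressed)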
https://cdn.edgeimpulse.com/datasets/keywords2.zip
https://docs.edgeimpulse.com/docs/keyword-spotting
official prebuilt dataset for ASR
https://docs.edgeimpulse.com/docs/running-your-impulse-locally
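(Linux, simple command-line version) how to build and run the inference code from the command line on Linux; the program needs a features.txt file passed as an argument in order to run
Merge the downloaded files into: https://github.com/edgeimpulse/example-standalone-inferencing
(Linux, more complex version) see https://github.com/edgeimpulse/example-standalone-inferencing-linux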
https://docs.edgeimpulse.com/docs/using-cubeai
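
The note on run_classifier_continuous above guesses that a sliding-window classifier reports the same keyword in several consecutive windows. A minimal sketch of one way to suppress those repeats is shown below, in Python rather than the SDK's C++. The function name `suppress_repeated_hits`, the score threshold, and the cooldown length are all assumptions for illustration and are not part of the Edge Impulse SDK; only the labels come from the `ei_classifier_inferencing_categories` list quoted above.

```python
def suppress_repeated_hits(windows, threshold=0.8, cooldown_windows=2,
                           ignore=("_noise", "_unknown")):
    """Collapse runs of consecutive hits into single detections.

    `windows` is a sequence of (label, score) pairs, one per classifier
    window. After a detection, the next `cooldown_windows` windows are
    skipped so one spoken keyword is reported only once.
    """
    detections = []
    cooldown = 0
    for index, (label, score) in enumerate(windows):
        if cooldown > 0:
            cooldown -= 1  # still inside the refractory period
            continue
        if label not in ignore and score >= threshold:
            detections.append((index, label))
            cooldown = cooldown_windows
    return detections


if __name__ == "__main__":
    results = [("_noise", 0.9), ("marvin", 0.85), ("marvin", 0.95),
               ("marvin", 0.9), ("_noise", 0.7), ("on", 0.9)]
    print(suppress_repeated_hits(results))
```

The cooldown should roughly cover the number of overlapping windows that can contain the same utterance, so it depends on the window length and stride chosen for the model.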

Clone a voice in 5 seconds to generate arbitrary speech in real-time

https://github.com/CorentinJ/Real-Time-Voice-Cloning

AOIDE KAZOO: audio player for the Raspberry Pi Zero, from UGEEK (Raspberry Pi accessories shop)

https://github.com/u-geek/AOIDE_KAZOO
https://github.com/u-geek/st7789-python

(IMP???) RaspiVoiceHAT

https://github.com/u-geek/RaspiVoiceHAT
https://github.com/u-geek/RaspiVoiceHAT/blob/main/examples/record.py
http://ukonline2000.com/?p=1207
https://github.com/weimingtom/wmt_ai_study/blob/master/RaspiVoiceHAT_001.txt

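(IMP) aicontroler, an intelligent voice control center written in C, based on Baidu speech recognition, Baidu speech synthesis, and the Turing Robot chatbot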

https://gitee.com/withsalt/aicontroler
https://github.com/jjzhang166/aicontroler
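An intelligent voice control program for the Raspberry Pi board
https://cloud.tencent.com/developer/article/1475526
Gitee hosts an old (circa 2017) Raspberry Pi speech recognition project implemented in C (rare, a lot of work, and hard to write),
the intelligent voice control center withsalt/aicontroler.
Python feels like a better fit (according to the author, C was chosen so the code could be ported to NanoPi and OrangePi),
because hardly anyone writes Raspberry Pi hardware driver code in C these days. It is still of some use, though: if you want to build a product in C,
the algorithm is worth a look (it seems to be in record.c: a simple VAD that uses a threshold to cut out the part that contains sound).
There is also a related write-up online, "An intelligent voice control program for the Raspberry Pi board". record.c: https://github.com/jjzhang166/aicontroler/blob/master/aicontroler/src/record.c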
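
The threshold-based VAD mentioned above can be illustrated with a short sketch. This is not a port of record.c; it is a generic energy-threshold trimmer written in Python, and the function names, the 160-sample frame length (10 ms at 16 kHz), and the threshold value are assumptions for illustration.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of integer PCM samples."""
    return sum(s * s for s in frame) / len(frame)


def trim_silence(samples, frame_len=160, threshold=1.0e6):
    """Keep the span from the first to the last frame above the threshold.

    Frames whose mean energy is below `threshold` count as silence.
    Returns an empty list if no frame is loud enough.
    """
    voiced_starts = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        if frame_energy(samples[start:start + frame_len]) >= threshold:
            voiced_starts.append(start)
    if not voiced_starts:
        return []
    return samples[voiced_starts[0]:voiced_starts[-1] + frame_len]


if __name__ == "__main__":
    # 20 ms of silence, 20 ms of loud signal, 10 ms of silence.
    pcm = [0] * 320 + [2000] * 320 + [0] * 160
    print(len(trim_silence(pcm)))
```

A fixed threshold only works when the noise floor is known; on real recordings it usually needs tuning for the microphone and gain in use.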

Smart home design with speech recognition based on the LD3320 and MSP430

http://www.eepw.com.cn/article/200760.htm
why use MMD ???

(video, open course) Alibaba Cloud intelligent speech interaction technology in practice

https://edu.csdn.net/course/detail/4514

(Too old, may not be usable) iFLYTEK qtts.h

https://github.com/ferstar/tts_sample

(video, open course) CloudWalk: a detailed look at the CNN-pFSMN model and its application to speech recognition

https://edu.csdn.net/course/detail/10195

(TODO???) xinbanben.rar

Simple speech recognition on an MSP430 microcontroller, built in the Workbench environment and based on the LPCC algorithm.
http://www.verysource.com/item/xinbanben_rar-723699.html
search Baidu Pan for xinbanben.rar (not the original; downloaded by hand)

Design and implementation of a digital voice recorder based on the MSP430F449 microcontroller development board

https://www.dianyuan.com/upload/community/2014/01/23/1390462614-34427.pdf

Nuvoton M480 series, M487JIDAE, NuMaker-PFM-M487 development board

https://github.com/OpenNuvoton/M480BSP/blob/master/SampleCode/StdDriver/SPII2S_PDMA_PlayRecord/main.c
32mcu

Android Neural Networks API (NNAPI) is an Android C API

https://developer.android.google.cn/ndk/guides/neuralnetworks/
https://github.com/android/ndk-samples/blob/main/nn-samples/basic/src/main/cpp/simple_model.cpp

PPLNN, A primitive library for neural network

https://github.com/openppl-public/ppl.nn

(IMP) NVIDIA NeMo is a conversational AI toolkit built ...

...for researchers working on automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech synthesis (TTS)
https://github.com/NVIDIA/NeMo/releases/tag/v1.1.0
https://developer.nvidia.com/nvidia-nemo
https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/starthere/tutorials.html

[New book: a guide to on-device AI and machine learning development]

《AI and Machine Learning for On-Device Development [Book]》by Laurence Moroney (O'Reilly 2021)
https://www.oreilly.com/library/view/ai-and-machine/9781098101732/

Machine Learning for Beginners - A Curriculum

https://github.com/microsoft/ML-For-Beginners

(IMP) Windowed spectral estimation (加窗谱估计)

https://baike.baidu.com/item/加窗谱估计/22685832?fr=aladdin
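(IMP, TODO) Audacity: menu, frequency analysis
Audacity is actually a very handy piece of software (apart from not being able to export numeric text; I don't know whether that feature will be added later). I plan to use it later for my numpy FFT spectrum-analysis experiments. Whether it is reliable for that is still hard to say.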
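
For the numpy side of that experiment, a minimal sketch of a windowed spectrum estimate is given below. It assumes a Hann window and a single real-valued frame; the function name `windowed_spectrum` and the 1 kHz test tone are made up for illustration, and the amplitude scaling is one common convention that will not necessarily match the dB values a given audio editor displays.

```python
import numpy as np


def windowed_spectrum(signal, sample_rate):
    """Return (frequencies in Hz, amplitude spectrum) of one Hann-windowed frame."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(len(signal))
    spectrum = np.fft.rfft(signal * window)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # Divide by sum(window)/2 so a sine of amplitude A peaks near A.
    amplitude = np.abs(spectrum) / (np.sum(window) / 2.0)
    return freqs, amplitude


if __name__ == "__main__":
    sample_rate = 16000
    t = np.arange(1024) / sample_rate
    tone = np.sin(2 * np.pi * 1000 * t)  # 1 kHz test tone
    freqs, amplitude = windowed_spectrum(tone, sample_rate)
    print(freqs[np.argmax(amplitude)], amplitude.max())
```

Without the window, a tone that does not fall exactly on an FFT bin leaks energy into neighbouring bins; the Hann window trades a slightly wider main lobe for much lower leakage, which is the point of windowed spectral estimation.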