Skip to content

Latest commit

 

History

History
343 lines (269 loc) · 15.9 KB

File metadata and controls

343 lines (269 loc) · 15.9 KB

20210624

-

pytorch-image-models

https://github.com/rwightman/pytorch-image-models

My Python Examples

https://github.com/geekcomputers/Python

500 + 𝗔𝗿𝘁𝗶𝗳𝗶𝗰𝗶𝗮𝗹 𝗜𝗻𝘁𝗲𝗹𝗹𝗶𝗴𝗲𝗻𝗰𝗲 𝗣𝗿𝗼𝗷𝗲𝗰𝘁 𝗟𝗶𝘀𝘁 𝘄𝗶𝘁𝗵 𝗰𝗼𝗱𝗲

https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code

AndroidEnv - The Android Learning Environment

https://github.com/deepmind/android_env

《计算机视觉实战演练:算法与应用》中文电子书、源码、读者交流社区(更新中,可以先 star)

https://github.com/Charmve/computer-vision-in-action

【自制】我把自行车做成了 自 动 驾 驶 !!【硬核】

https://github.com/peng-zhihui/XUAN-Bike

TFCC is a C++ deep learning inference framework.

http://github.com/Tencent/WeChat-TFCC

灵活运用Python中numpy库的矩阵运算

http://www.elecfans.com/emb/580136.html

(IMP???) mdk keil 5 tensorflow pack, include TFLM speech example

https://github.com/MDK-Packs/tensorflow-pack

tensorflow-arduino-examples

https://github.com/antmicro/tensorflow-arduino-examples

(IMP) Create a USB Microphone with the Raspberry Pi Pico

https://www.hackster.io/sandeep-mistry/create-a-usb-microphone-with-the-raspberry-pi-pico-cc9bd5
https://github.com/sandeepmistry/pico-microphone

VCC<->VCC  
GND<->GND  
SEL<->GND  
DAT<->GPIO2  
CLK<->GPIO3  

Note: Connecting PDM Mic. SEL to GND will result it clock
out new data after the clock signal falls (goes from logic level 1 to 0).

3.3V<->VCC    
GND<->GND  
GPIO26<->OUT    

ref OpenPDM2PCM, see below

OpenPDM2PCM

https://os.mbed.com/teams/ST/code/X_NUCLEO_CCA02M1//file/53f8b511f2a1/Middlewares/OpenPDM2PCM/

Learn how to responsibly deliver value with ML.

https://github.com/GokuMohandas/MadeWithML

Cloud-native neural search framework for 𝙖𝙣𝙮 kind of data

https://github.com/jina-ai/jina

A repository to index and organize the latest machine learning courses found on YouTube.

https://github.com/dair-ai/ML-YouTube-Courses

ESP32 One (like ESP-EYE), waveshare

https://www.waveshare.net/wiki/ESP32_One
https://github.com/espressif/esp-who
https://github.com/espressif/esp-who/blob/master/docs/zh_CN/get-started/ESP-EYE_Getting_Started_Guide.md
say Hi Lexin (你好, 乐鑫)
https://github.com/espressif/esp-who/blob/master/examples/single_chip/face_recognition_solution/README.md
https://dl.espressif.com/dl/Hi_Lexin_wake-up_commend.wav
https://github.com/espressif/esp-who/blob/master/examples/single_chip/face_recognition_solution/main/app_main.c
https://github.com/espressif/esp-skainet/blob/master/examples/garbage_classification/README_cn.md

EMW3162, nuttx RTOS

https://github.com/apache/incubator-nuttx/blob/fd46d7a74fda13d6b4eb5505bc0689f3a2b78d5e/boards/Kconfig
https://www.waveshare.com/wiki/EMW3162

PaddleX -- 飞桨全流程开发工具

https://github.com/PaddlePaddle/PaddleX

xr872 tflm

https://gitee.com/tutuwin/xr872_-audio
search baidupan

google streaming kws

https://github.com/google-research/google-research/tree/master/kws_streaming
https://github.com/StuartIanNaylor/g-kws
stream_kws_cnn
Streaming keyword spotting on mobile devices
https://blog.csdn.net/weixin_37598106/article/details/106801481

(IMP) Edge Impulse

Clone a voice in 5 seconds to generate arbitrary speech in real-time

https://github.com/CorentinJ/Real-Time-Voice-Cloning

Aodie KAZOO: 树莓派Zero音频播放器, Raspberry Pi 配件店, UGEEK

https://github.com/u-geek/AOIDE_KAZOO
https://github.com/u-geek/st7789-python

(IMP???) RaspiVoiceHAT

https://github.com/u-geek/RaspiVoiceHAT
https://github.com/u-geek/RaspiVoiceHAT/blob/main/examples/record.py
http://ukonline2000.com/?p=1207
https://github.com/weimingtom/wmt_ai_study/blob/master/RaspiVoiceHAT_001.txt

(IMP) aicontroler, C语言编写的基于百度语音识别、语音合成和图灵机器人的智能语音控制中心

https://gitee.com/withsalt/aicontroler
https://github.com/jjzhang166/aicontroler
树莓派开发板的智能语音控制程序
https://cloud.tencent.com/developer/article/1475526
gitee上有一个古老(大概2017年)的树莓派语音识别项目,用C实现(罕见,工作量大,难写),
智能语音控制中心:withsalt/aicontroler
感觉用Python会更好(根据作者说法,是为了移植到Nanopi和OrangePi),
因为现在已经很少有人用C写树莓派的硬件驱动代码。。。不过有一些用,如果想用C语言做产品,
可以参考一下算法(算法好像在record.c,简单的VAD阈值判断截取有声音的部分)。
网上还有篇相关的介绍,《树莓派开发板的智能语音控制程序》 https://github.com/jjzhang166/aicontroler/blob/master/aicontroler/src/record.c

基于LD3320和MSP430语音识别智能家居设计

http://www.eepw.com.cn/article/200760.htm
why use MMD ???

(video, open course) 阿里云智能语音交互技术实战

https://edu.csdn.net/course/detail/4514

(Too old, may be not usable) 科大讯飞qtts.h

https://github.com/ferstar/tts_sample

(video, open course) 云从科技:详解CNN-pFSMN模型以及在语音识别中的应用

https://edu.csdn.net/course/detail/10195

(TODO???) xinbanben.rar

基于MSP430单片机,用workbench环境,基于LPCC算法,实现简单的语音识别。
http://www.verysource.com/item/xinbanben_rar-723699.html
search baidupan, xinbanben.rar (not origin, downloaded with hand)

基于MSP430F449单片机开发板的数码录音机设计与实现

https://www.dianyuan.com/upload/community/2014/01/23/1390462614-34427.pdf

新唐M480系列, M487JIDAE, NuMaker-PFM-M487开发板

https://github.com/OpenNuvoton/M480BSP/blob/master/SampleCode/StdDriver/SPII2S_PDMA_PlayRecord/main.c
32mcu

Android Neural Networks API (NNAPI) 是一个 Android C API

https://developer.android.google.cn/ndk/guides/neuralnetworks/
https://github.com/android/ndk-samples/blob/main/nn-samples/basic/src/main/cpp/simple_model.cpp

PPLNN, A primitive library for neural network

https://github.com/openppl-public/ppl.nn

(IMP) NVIDIA NeMo is a conversational AI toolkit built ...

...for researchers working on automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech synthesis (TTS)
https://github.com/NVIDIA/NeMo/releases/tag/v1.1.0
https://developer.nvidia.com/nvidia-nemo
https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/starthere/tutorials.html

【新书:设备端AI与机器学习开发指南】

《AI and Machine Learning for On-Device Development [Book]》by Laurence Moroney (O'Reilly 2021)
https://www.oreilly.com/library/view/ai-and-machine/9781098101732/

Machine Learning for Beginners - A Curriculum

https://github.com/microsoft/ML-For-Beginners

(IMP) 加窗谱估计

https://baike.baidu.com/item/加窗谱估计/22685832?fr=aladdin
(IMP, TODO) audacity, 菜单, 频率分析
其实audacity这个软件很好用的(除了不能导出数值文本,不知道以后会不会加这个功能),我打算后面用它来做numpy的fft频谱分析实验,不知道可不可靠,现在还很难说

Implementation of popular deep learning networks with TensorRT network definition API

https://github.com/wang-xinyu/tensorrtx

(IMP) A wifi webradio with only low cost boards ESP8266 and VS1053 by Jp Cocatrix

https://github.com/karawin/Ka-Radio
https://github.com/karawin/Ka-Radio32
https://github.com/karawin/karaClass

SparQL

https://baike.baidu.com/item/SparQL/10956405?fr=aladdin
知识图谱的自然语言查询和关键词查询 SPARQL 自然语言语义关系识别 自然语言处理机器学习数据工程师书

(IMP???) 云的智能, USB 4麦克风阵列 阵列麦克风 免驱动可用于android linux windows

https://item.taobao.com/item.htm?id=625080479278
https://github.com/jim2meng/dueros/tree/master/example
https://github.com/voice-engine/voice-engine
https://github.com/jim2meng/dueros/wiki/Alexa-,DuerOs-配合PSound-AI-Card使用方法

(IMP) apollo-DuerOS, FM1388, CarLife+, 车联网

https://github.com/ApolloAuto/apollo-DuerOS
FM1288开发板套件
http://www.chuntek.com/content/yuyinmokuaizhuanti/94.html

YOLOv5 rocket is a family of object detection architectures and models

https://github.com/ultralytics/yolov5

(IMP???) MNIST inference on i.MT RT1062 (Teensy 4.0) using TensorFlow Lite for Microcontrollers

https://github.com/dimtass/imxrt1062-tflite-micro-mnist

(IMP???) 智能电动床, 语音控制模块:LD3320,提供基于关键词语列表的语音识别

https://github.com/zhupingqi/SmartBedAli

The minimal opencv for Android, iOS, ARM Linux, Windows, Linux, MacOS, WebAssembly

https://github.com/nihui/opencv-mobile

(IMP) 野火, vs1053

http://doc.embedfire.com/products/link/zh/latest/module/audio/mp3_vs1053.html
search baidupan, F103_指南者开发板.rar, STM32F103VE

(IMP) 正点原子, vs1053

http://www.openedv.com/docs/modules/other/VS1053.html
search baidupan, 【正点原子】MP3音乐模块VS1053

(貌似是大学生的报告,没有代码)...但我们探索研究并构建了一个mfcc+VGG16(with Densenet)模型(虽然我们并没有将其用于最终的项目中)

https://github.com/info-ruc/ai20projects/blob/master/2018202064/实验报告/实验报告.md
实际上是个OCR的项目报告,ASR也是用OCR的思路做

(NLP) 中国AI研究新突破:「盘古α」发布

https://zhuanlan.zhihu.com/p/368261642
2000亿开源中文预训练语言模型「鹏程·盘古α」
https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-Alpha
在线推理服务测试版(吐槽:测试过似乎有些NG,例如问中国第一个皇帝是谁,答案有误)
https://pangu-alpha.openi.org.cn/index_params3_usingJiebaPrint_addLoginTimeout_hu.html
OpenI 启智 新一代人工智能开源开放平台
https://openi.org.cn

(IMP) PyTorch torchaudio教程, M5 CNN

https://www.w3cschool.cn/pytorch/pytorch-52d93bnc.html
使用torchaudio的语音命令识别
https://github.com/Estom/notes/blob/master/pytorch/官方教程/25.md
https://pytorch.org/tutorials/intermediate/speech_command_recognition_with_torchaudio.html
M5 CNN (LeNet-5? VGGNet-16?) https://arxiv.org/pdf/1610.00087.pdf
https://pytorch.org/tutorials/intermediate/speech_command_recognition_with_torchaudio.html
mfcc, vincentqb/audio-tutorial
https://github.com/vincentqb/audio-tutorial/blob/f54c3665a8134d81302f9658336cfa8b33791729/PipelineTrain.py
https://jovian.ai/gtaljaard/z2g-speech-command-recognition-with-m3-m5-m11-m18-cnn-networks-v0-4-6
sesach baidupan,m5_cnn.docx
常见CNN网络结构的详解和代码实现
https://blog.csdn.net/u012897374/article/details/79199935

VGG2, speechVGG

https://github.com/zashin-AI/project/blob/f7b0420aaee41d9923c6ac61d98b4044ee8c4b9b/Speaker_Recognition_Model/Transfer%20Learning/speech_vgg2.py
https://github.com/zhongyuchen/speech-classification

audionet

https://github.com/jingxianWang9401/speech-recognition/blob/ea116c40fc6caf264d420347b090e08c648bd7c8/method5/AudioNet-V1-master/AudioNet32.py

(AIoT) Pyaiot, connecting nodes to the web

https://github.com/humminglab/pyaiot

IoT-For-Beginners

https://github.com/microsoft/IoT-For-Beginners

Baidu B-L475E-IOT01 BSP, iot-edge-sdk-samples

https://github.com/baidu/baidu-iot-samples
STM32和百度云-天工最新物联网开发板,B-L475E-IOT01A探索套件操作说明
https://blog.csdn.net/annic9/article/details/80434389
(IMP) 音频输入输出API
https://github.com/baidu/baidu-iot-samples/blob/master/STM32/I-CUBE-BAIDU/Drivers/BSP/STM32L496G-Discovery/stm32l496g_discovery_audio.h

小米AI实验室声学团队 获婴儿啼哭声识别的挑战赛任务第一名

http://www.elecfans.com/d/1378886.html
gx8008
http://www.nationalchip.com/index.php/products/info/12

机器学习实战(Python3):kNN、决策树、贝叶斯、逻辑回归、SVM、线性回归、树回归

https://github.com/Jack-Cherish/Machine-Learning

(IMP???) kws for Andriod

https://github.com/shenpengjie/Voice_WakeUp-master
search voice_wakeup

sndkit is a sonic toolkit for everyone. It is a collection of DSP algorithms written in a literate style placed in the public domain

https://github.com/PaulBatchelor/sndkit

Soundpipe is a lightweight music DSP library for musicians and creative coders.

https://pbat.ch/proj/soundpipe.html

LiveKit - Open source, distributed video/audio rooms over WebRTC

https://github.com/livekit/livekit-server

respeaker

https://github.com/respeaker/mic_array

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

https://github.com/facebookresearch/ParlAI

快速傅里叶变换:算法与应用

https://baike.baidu.com/item/快速傅里叶变换:算法与应用/7835536?fr=aladdin

斑梨电子

http://spotpear.cn/public/index/study/index.html

C++科学库, gsl

http://www.360doc.com/content/17/1228/21/7378868_717225755.shtml
目前号称有三大库支持科学计算,它们是GNU的gsl,blitz++以及MTL,
GSL快速傅里叶变换FFT
https://blog.csdn.net/Augusdi/article/details/9982919
数学计算库
https://blog.csdn.net/shichaog/article/details/80762367

(IMP???) Official repository of my book: "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide"

https://github.com/dvgodoy/PyTorchStepByStep

digital signal processing using the arm cortex-m4 pdf

https://download.csdn.net/download/wawahust/9893809?utm_source=bbsseo

ifttt, 触发你的智能生活:IFTTT 入门

https://sspai.com/post/25270

(IMP???) Vision AI Developer Kit for Audio Processing

https://github.com/ksaye/vision-ai-developer-kit-audio/tree/master/documentation
https://www.adlinktech.com/en/Vizi-AI-devkit
Vizi-AI LEC-AL-E3940-AI-4G-32G/US
Mouser Part No
976-VIZIAI-4G-32G/US

音声合成エンジン

https://github.com/Hiroshiba/voicevox
https://github.com/Hiroshiba/voicevox_engine