Notes on "Utterance-level Aggregation For Speaker Recognition In The Wild"
Paper: https://arxiv.org/abs/1902.10107v1
Open-source code: http://www.robots.ox.ac.uk/~vgg/research/speakerID/
Network architecture
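- Input: a 257-dimensional vector per frame, made up of 256 frequency components plus 1 DC component.
- Backbone: Thin-ResNet, which extracts frame-level features.
- NetVLAD or GhostVLAD layer: aggregates the frame-level features into a single utterance-level feature. Most methods apply an average pooling layer directly over the frame axis, which has the drawback that every frame receives the same weight. In practice frames contribute unequally to the result; for example, frames that contain speech contribute more than frames without speech. The approach in this paper effectively learns to assign a different weight to each frame automatically.
- Training loss: standard softmax loss and additive margin softmax (AM-Softmax).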
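
To make the aggregation and loss steps above more concrete, here is a minimal pure-Python sketch. It is not the authors' code (the official implementation is at the URL above). The function names `ghostvlad_aggregate` and `am_softmax_loss`, their parameters, and the defaults `scale=30.0` and `margin=0.35` are illustrative assumptions, not values taken from these notes. The final whole-descriptor normalization follows the original NetVLAD formulation and is also an assumption here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps avoids division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def ghostvlad_aggregate(frames, weights, biases, centers, num_ghost=0):
    """Aggregate T frame-level features (T x D) into one utterance-level vector.

    weights, centers: (K + G) x D, biases: K + G, where the last G clusters
    are "ghost" clusters. With num_ghost=0 this reduces to plain NetVLAD.
    Returns a flat list of length K * D.
    """
    num_real = len(centers) - num_ghost
    dim = len(frames[0])
    residual_sums = [[0.0] * dim for _ in range(num_real)]
    for x in frames:
        # Soft-assign the frame to every cluster, real and ghost alike.
        scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
                  for w, b in zip(weights, biases)]
        assign = softmax(scores)
        # Only real clusters accumulate residuals; weight absorbed by ghost
        # clusters is discarded, so such frames contribute less.
        for k in range(num_real):
            for d in range(dim):
                residual_sums[k][d] += assign[k] * (x[d] - centers[k][d])
    descriptor = []
    for row in residual_sums:
        descriptor.extend(l2_normalize(row))  # per-cluster normalization
    return l2_normalize(descriptor)  # assumption: final global normalization


def am_softmax_loss(embedding, class_weights, label, scale=30.0, margin=0.35):
    """AM-Softmax loss for one sample: subtract a margin from the target cosine."""
    unit_emb = l2_normalize(embedding)
    cosines = [sum(a * b for a, b in zip(unit_emb, l2_normalize(w)))
               for w in class_weights]
    logits = [scale * (c - margin) if j == label else scale * c
              for j, c in enumerate(cosines)]
    return -math.log(softmax(logits)[label])
```

Unlike average pooling, each frame's contribution to a cluster is scaled by its learned soft-assignment weight. Setting `margin=0.0` in `am_softmax_loss` gives an ordinary softmax cross-entropy on scaled cosine similarities, so the margin only makes the target class harder to satisfy.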