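KDD2021 | Industry Papers on Search, Recommendation, Advertising, and NLP
This post collects the industry papers on search, recommendation, advertising, and NLP from the KDD21 Accepted Papers [1]. The list reflects my personal taste: I picked papers that appear, judging from the author list, to be led by a company, but that judgment is fairly subjective and some papers may have been missed. The papers are organized by topic and by company, in no particular order.
1. By topic
I picked a few topics I am most interested in and listed the corresponding paper titles. Readers can skim the titles to judge whether a paper matches their own research or work, and then choose the interesting ones for a close read.
1.1 Recommender systems
1.1.1 Samples
Covers sampling, negative samples, and related topics.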
Google: Bootstrapping for Batch Active Sampling
Google: Bootstrapping Recommendations at Chrome Web Store
Alibaba: Real Negatives Matter: Continuous Training with Real Negatives for Delayed Feedback Modeling
1.1.2 Representation learning
Google: Learning to Embed Categorical Features without Embedding Tables for Recommendation
Huawei: An Embedding Learning Framework for Numerical Features in CTR Prediction
Tencent: Learning Reliable User Representations from Volatile and Sparse Data to Accurately Predict Customer Lifetime Value
Alibaba: Representation Learning for Predicting Customer Orders
1.1.3 Cross-domain recommendation
Alibaba: Debiasing Learning based Cross-domain Recommendation
Tencent: Adversarial Feature Translation for Multi-domain Recommendation
1.1.4 Debiasing
Alibaba: Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems
Alibaba: Debiasing Learning based Cross-domain Recommendation
1.1.5 Graph neural networks
Huawei: Dual Graph enhanced Embedding Neural Network for CTR Prediction
Meituan: Signed Graph Neural Network with Latent Groups
Alibaba: DMBGN: Deep Multi-Behavior Graph Networks for Voucher Redemption Rate Prediction
Baidu: MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal
1.1.6 Multi-task learning
Google: Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning
Meituan: Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning for Customer Acquisition
Baidu: MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal
1.1.7 Multimodal / micro-video recommendation
Alibaba: SEMI: A Sequential Multi-Modal Information Transfer Network for E-Commerce Micro-Video Recommendations
1.1.8 Knowledge graphs
Microsoft: Reinforced Anchor Knowledge Graph Generation for News Recommendation Reasoning
1.1.9 Recommender system architecture
Facebook: Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism
Facebook: Hierarchical Training: Scaling Deep Recommendation Models on Large CPU Clusters
Alibaba: FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters
Tencent: Large-Scale Network Embedding in Apache Spark
Microsoft: On Post-Selection Inference in A/B Testing
1.2 Search
1.2.1 Vector retrieval
Alibaba: Embedding-based Product Retrieval in Taobao Search
1.2.2 Query/content understanding
Facebook: Que2Search: Fast and Accurate Query and Document Understanding for Search at Facebook
1.2.3 Concept graphs
Alibaba: AliCG: Fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba
Alibaba: AliCoCo2: Commonsense Knowledge Extraction, Representation and Application in E-commerce
1.2.4 Pretraining
Baidu: Pretrained Language Models for Web-scale Retrieval in Baidu Search
Microsoft: Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature
1.2.5 Query rewriting / auto-completion
Microsoft: Diversity driven Query Rewriting in Search Advertising
Baidu: Meta-Learned Spatial-Temporal POI Auto-Completion for the Search Engine at Baidu Maps
1.2.6 Graph neural networks
Baidu: HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps
1.2.7 Multimodal
Google: Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries
Facebook: VisRel: Media Search at Scale
1.2.8 Feature interaction (edge search)
Alibaba: FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data
1.2.9 Search engine architecture
Baidu: Norm Adjusted Proximity Graph for Fast Inner Product Retrieval
Baidu: JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu
1.3 Advertising
There are not many papers in this area, so it is not subdivided further.
Google: Clustering for Private Interest-based Advertising
Alibaba: A Unified Solution to Constrained Bidding in Online Display Advertising
Alibaba: Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning
Alibaba: Neural Auction: End-to-End Learning of Auction Mechanisms for E-Commerce Advertising
Alibaba: We Know What You Want: An Advertising Strategy Recommender System for Online Advertising
1.4 NLP
1.4.1 Pretraining
Microsoft: NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
Alibaba: M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining
Microsoft: TUTA: Tree-based Transformers for Generally Structured Table Pre-training
1.4.2 Named entity recognition
Microsoft: Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
1.4.3 Few-shot learning
Microsoft: Generalized Zero-Shot Extreme Multi-label Learning
Microsoft: Zero-shot Multi-lingual Interrogative Question Generation for "People Also Ask" at Bing
1.4.4 Summarization
Microsoft: Reinforcing Pretrained Models for Generating Attractive Text Advertisements
1.4.5 Intent recognition
Alibaba: MeLL: Large-scale Extensible User Intent Classification for Dialogue Systems with Meta Lifelong Learning
1.4.6 Multimodal
Alibaba: M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining
2. By company
2.1 Google
Learning to Embed Categorical Features without Embedding Tables for Recommendation
NewsEmbed: Modeling News through Pre-trained Document Representations
Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning
Bootstrapping for Batch Active Sampling
Bootstrapping Recommendations at Chrome Web Store
Clustering for Private Interest-based Advertising
Dynamic Language Models for Continuously Evolving Content
Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries
On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition
2.2 Facebook
Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism
Preference Amplification in Recommender Systems
Hierarchical Training: Scaling Deep Recommendation Models on Large CPU Clusters
Network Experimentation at Scale
Que2Search: Fast and Accurate Query and Document Understanding for Search at Facebook
VisRel: Media Search at Scale
Balancing Consistency and Disparity in Network Alignment
2.3 Microsoft
Generalized Zero-Shot Extreme Multi-label Learning
Learning Multiple Stock Trading Patterns with Temporal Routing Adaptor and Optimal Transport
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
Reinforced Anchor Knowledge Graph Generation for News Recommendation Reasoning
Table2Charts: Recommending Charts by Learning Shared Table Representations
TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data
TUTA: Tree-based Transformers for Generally Structured Table Pre-training
Contextual Bandit Applications in a Customer Support Bot
Diversity driven Query Rewriting in Search Advertising
Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature
On Post-Selection Inference in A/B Testing
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
Reinforcing Pretrained Models for Generating Attractive Text Advertisements
Zero-shot Multi-lingual Interrogative Question Generation for "People Also Ask" at Bing
2.4 Alibaba
A Unified Solution to Constrained Bidding in Online Display Advertising
AliCG: Fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba
AliCoCo2: Commonsense Knowledge Extraction, Representation and Application in E-commerce
Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems
Debiasing Learning based Cross-domain Recommendation
Device-Cloud Collaborative Learning for Recommendation
Deep Inclusion Relation-aware Network for User Response Prediction at Fliggy
DMBGN: Deep Multi-Behavior Graph Networks for Voucher Redemption Rate Prediction
Dual Attentive Sequential Learning for Cross-Domain Click-Through Rate Prediction
Embedding-based Product Retrieval in Taobao Search
Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning
FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data
FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters
Intention-aware Heterogeneous Graph Attention Networks for Fraud Transactions Detection
Live-Streaming Fraud Detection: A Heterogeneous Graph Neural Network Approach
M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining
Markdowns in E-Commerce Fresh Retail: A Counterfactual Prediction and Multi-Period Optimization Approach
MeLL: Large-scale Extensible User Intent Classification for Dialogue Systems with Meta Lifelong Learning
Multi-Agent Cooperative Bidding Games for Multi-Objective Optimization in e-Commercial Sponsored Search
Neural Auction: End-to-End Learning of Auction Mechanisms for E-Commerce Advertising
Real Negatives Matter: Continuous Training with Real Negatives for Delayed Feedback Modeling
Representation Learning for Predicting Customer Orders
SEMI: A Sequential Multi-Modal Information Transfer Network for E-Commerce Micro-Video Recommendations
We Know What You Want: An Advertising Strategy Recommender System for Online Advertising
2.5 Baidu
Norm Adjusted Proximity Graph for Fast Inner Product Retrieval
Curriculum Meta-Learning for Next POI Recommendation
Pretrained Language Models for Web-scale Retrieval in Baidu Search
HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps
JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu
Meta-Learned Spatial-Temporal POI Auto-Completion for the Search Engine at Baidu Maps
MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal
SSML: Self-Supervised Meta-Learner for En Route Travel Time Estimation at Baidu Maps
Talent Demand Forecasting with Attentive Neural Sequential Model
2.6 Tencent
Why Attentions May Not Be Interpretable?
Adversarial Feature Translation for Multi-domain Recommendation
Large-Scale Network Embedding in Apache Spark
Learn to Expand Audience via Meta Hybrid Experts and Critics
Learning Reliable User Representations from Volatile and Sparse Data to Accurately Predict Customer Lifetime Value
2.7 Meituan
Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning for Customer Acquisition
User Consumption Intention Prediction in Meituan
Signed Graph Neural Network with Latent Groups
A Deep Learning Method for Route and Time Prediction in Food Delivery Service
2.8 Huawei
An Embedding Learning Framework for Numerical Features in CTR Prediction
Dual Graph enhanced Embedding Neural Network for CTR Prediction
Discrete-time Temporal Network Embedding via Implicit Hierarchical Learning
Retrieval & Interaction Machine for Tabular Data Prediction
A Multi-Graph Attributed Reinforcement Learning Based Optimization Algorithm for Large-scale Hybrid Flow Shop Scheduling Problem
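Closing remarks
I will write up the papers I find interesting in follow-up posts. If there are papers you would like to see covered, feel free to leave me a message through the official account, and I will prioritize those. And if you have already written up notes on a paper, submissions and discussion are welcome too.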
References
[1] KDD2021 Accepted Papers: https://kdd.org/kdd2021/accepted-papers/index
[2] KDD2021 | Recommender System Paper Roundup