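This post collects the industry papers on search, recommendation, advertising, and NLP from the KDD 2021 Accepted Papers [1]. The list reflects personal taste: a paper is included when its author list suggests the work was led by a company. That is a subjective call, so some papers may have been missed. The papers are grouped by topic and by company, in no particular order.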

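1. By Topic

I picked the topics I find most interesting and listed the corresponding paper titles. Skim the titles to see which ones match your own research or work, then choose the ones worth reading closely.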

1.1 Recommender Systems

1.1.1 Samples

Covers sampling, negative samples, and related topics.

  • Google: Bootstrapping for Batch Active Sampling

  • Google: Bootstrapping Recommendations at Chrome Web Store

  • Alibaba: Real Negatives Matter: Continuous Training with Real Negatives for Delayed Feedback Modeling

1.1.2 Representation Learning

  • Google: Learning to Embed Categorical Features without Embedding Tables for Recommendation

  • Huawei: An Embedding Learning Framework for Numerical Features in CTR Prediction

  • Tencent: Learning Reliable User Representations from Volatile and Sparse Data to Accurately Predict Customer Lifetime Value

  • Alibaba: Representation Learning for Predicting Customer Orders

1.1.3 Cross-Domain Recommendation

  • Alibaba: Debiasing Learning based Cross-domain Recommendation

  • Tencent: Adversarial Feature Translation for Multi-domain Recommendation

1.1.4 Debiasing

  • Alibaba: Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems

  • Alibaba: Debiasing Learning based Cross-domain Recommendation

1.1.5 Graph Neural Networks

  • Huawei: Dual Graph enhanced Embedding Neural Network for CTR Prediction

  • Meituan: Signed Graph Neural Network with Latent Groups

  • Alibaba: DMBGN: Deep Multi-Behavior Graph Networks for Voucher Redemption Rate Prediction

  • Baidu: MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal

1.1.6 Multi-Task Learning

  • Google: Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning

  • Meituan: Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning for Customer Acquisition

  • Baidu: MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal

1.1.7 Multimodal / Short-Video Recommendation

  • Alibaba: SEMI: A Sequential Multi-Modal Information Transfer Network for E-Commerce Micro-Video Recommendations

1.1.8 Knowledge Graphs

  • Microsoft: Reinforced Anchor Knowledge Graph Generation for News Recommendation Reasoning

1.1.9 Recommender System Architecture

  • Facebook: Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism

  • Facebook: Hierarchical Training: Scaling Deep Recommendation Models on Large CPU Clusters

  • Alibaba: FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters

  • Tencent: Large-Scale Network Embedding in Apache Spark

  • Microsoft: On Post-Selection Inference in A/B Testing

1.2 Search

1.2.1 Vector Retrieval

  • Alibaba: Embedding-based Product Retrieval in Taobao Search

1.2.2 Query / Content Understanding

  • Facebook: Que2Search: Fast and Accurate Query and Document Understanding for Search at Facebook

1.2.3 Concept Graphs

  • Alibaba: AliCG: Fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba

  • Alibaba: AliCoCo2: Commonsense Knowledge Extraction, Representation and Application in E-commerce

1.2.4 Pretraining

  • Baidu: Pretrained Language Models for Web-scale Retrieval in Baidu Search

  • Microsoft: Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature

1.2.5 Query Rewriting / Auto-Completion

  • Microsoft: Diversity driven Query Rewriting in Search Advertising

  • Baidu: Meta-Learned Spatial-Temporal POI Auto-Completion for the Search Engine at Baidu Maps

1.2.6 Graph Neural Networks

  • Baidu: HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps

1.2.7 Multimodal

  • Google: Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries

  • Facebook: VisRel: Media Search at Scale

1.2.8 Edge Search on Feature Graphs

  • Alibaba: FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data

1.2.9 Search Engine Architecture

  • Baidu: Norm Adjusted Proximity Graph for Fast Inner Product Retrieval

  • Baidu: JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu

1.3 Advertising

There are not many papers in this area, so they are not subdivided further.

  • Google: Clustering for Private Interest-based Advertising

  • Alibaba: A Unified Solution to Constrained Bidding in Online Display Advertising

  • Alibaba: Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning

  • Alibaba: Neural Auction: End-to-End Learning of Auction Mechanisms for E-Commerce Advertising

  • Alibaba: We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

1.4 NLP

1.4.1 Pretraining

  • Microsoft: NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search

  • Alibaba: M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining

  • Microsoft: TUTA: Tree-based Transformers for Generally Structured Table Pre-training

1.4.2 Named Entity Recognition

  • Microsoft: Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition

1.4.3 Few-Shot / Zero-Shot Learning

  • Microsoft: Generalized Zero-Shot Extreme Multi-label Learning

  • Microsoft: Zero-shot Multi-lingual Interrogative Question Generation for "People Also Ask" at Bing

1.4.4 Summarization

  • Microsoft: Reinforcing Pretrained Models for Generating Attractive Text Advertisements

1.4.5 Intent Recognition

  • Alibaba: MeLL: Large-scale Extensible User Intent Classification for Dialogue Systems with Meta Lifelong Learning

1.4.6 Multimodal

  • Alibaba: M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining

2. By Company

2.1 Google

  • Learning to Embed Categorical Features without Embedding Tables for Recommendation

  • NewsEmbed: Modeling News through Pre-trained Document Representations

  • Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning

  • Bootstrapping for Batch Active Sampling

  • Bootstrapping Recommendations at Chrome Web Store

  • Clustering for Private Interest-based Advertising

  • Dynamic Language Models for Continuously Evolving Content

  • Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries

  • On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition

2.2 Facebook

  • Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism

  • Preference Amplification in Recommender Systems

  • Hierarchical Training: Scaling Deep Recommendation Models on Large CPU Clusters

  • Network Experimentation at Scale

  • Que2Search: Fast and Accurate Query and Document Understanding for Search at Facebook

  • VisRel: Media Search at Scale

  • Balancing Consistency and Disparity in Network Alignment

2.3 Microsoft

  • Generalized Zero-Shot Extreme Multi-label Learning

  • Learning Multiple Stock Trading Patterns with Temporal Routing Adaptor and Optimal Transport

  • NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search

  • Reinforced Anchor Knowledge Graph Generation for News Recommendation Reasoning

  • Table2Charts: Recommending Charts by Learning Shared Table Representations

  • TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data

  • TUTA: Tree-based Transformers for Generally Structured Table Pre-training

  • Contextual Bandit Applications in a Customer Support Bot

  • Diversity driven Query Rewriting in Search Advertising

  • Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature

  • On Post-Selection Inference in A/B Testing

  • Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition

  • Reinforcing Pretrained Models for Generating Attractive Text Advertisements

  • Zero-shot Multi-lingual Interrogative Question Generation for "People Also Ask" at Bing

2.4 Alibaba

  • A Unified Solution to Constrained Bidding in Online Display Advertising

  • AliCG: Fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba

  • AliCoCo2: Commonsense Knowledge Extraction, Representation and Application in E-commerce

  • Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems

  • Debiasing Learning based Cross-domain Recommendation

  • Device-Cloud Collaborative Learning for Recommendation

  • Deep Inclusion Relation-aware Network for User Response Prediction at Fliggy

  • DMBGN: Deep Multi-Behavior Graph Networks for Voucher Redemption Rate Prediction

  • Dual Attentive Sequential Learning for Cross-Domain Click-Through Rate Prediction

  • Embedding-based Product Retrieval in Taobao Search

  • Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning

  • FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data

  • FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters

  • Intention-aware Heterogeneous Graph Attention Networks for Fraud Transactions Detection

  • Live-Streaming Fraud Detection: A Heterogeneous Graph Neural Network Approach

  • M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining

  • Markdowns in E-Commerce Fresh Retail: A Counterfactual Prediction and Multi-Period Optimization Approach

  • MeLL: Large-scale Extensible User Intent Classification for Dialogue Systems with Meta Lifelong Learning

  • Multi-Agent Cooperative Bidding Games for Multi-Objective Optimization in e-Commercial Sponsored Search

  • Neural Auction: End-to-End Learning of Auction Mechanisms for E-Commerce Advertising

  • Real Negatives Matter: Continuous Training with Real Negatives for Delayed Feedback Modeling

  • Representation Learning for Predicting Customer Orders

  • SEMI: A Sequential Multi-Modal Information Transfer Network for E-Commerce Micro-Video Recommendations

  • We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

2.5 Baidu

  • Norm Adjusted Proximity Graph for Fast Inner Product Retrieval

  • Curriculum Meta-Learning for Next POI Recommendation

  • Pretrained Language Models for Web-scale Retrieval in Baidu Search

  • HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps

  • JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu

  • Meta-Learned Spatial-Temporal POI Auto-Completion for the Search Engine at Baidu Maps

  • MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal

  • SSML: Self-Supervised Meta-Learner for En Route Travel Time Estimation at Baidu Maps

  • Talent Demand Forecasting with Attentive Neural Sequential Model

2.6 Tencent

  • Why Attentions May Not Be Interpretable?

  • Adversarial Feature Translation for Multi-domain Recommendation

  • Large-Scale Network Embedding in Apache Spark

  • Learn to Expand Audience via Meta Hybrid Experts and Critics

  • Learning Reliable User Representations from Volatile and Sparse Data to Accurately Predict Customer Lifetime Value

2.7 Meituan

  • Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning for Customer Acquisition

  • User Consumption Intention Prediction in Meituan

  • Signed Graph Neural Network with Latent Groups

  • A Deep Learning Method for Route and Time Prediction in Food Delivery Service

2.8 Huawei

  • An Embedding Learning Framework for Numerical Features in CTR Prediction

  • Dual Graph enhanced Embedding Neural Network for CTR Prediction

  • Discrete-time Temporal Network Embedding via Implicit Hierarchical Learning

  • Retrieval & Interaction Machine for Tabular Data Prediction

  • A Multi-Graph Attributed Reinforcement Learning Based Optimization Algorithm for Large-scale Hybrid Flow Shop Scheduling Problem

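Closing Remarks

In follow-up posts I will write up the papers I find most interesting. If there is a paper you would like to see covered, leave me a message through the official account, and I will prioritize the ones readers ask for. If you have already written up notes on a paper, submissions and discussion are welcome too.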


References

[1] KDD2021 Accepted Papers: https://kdd.org/kdd2021/accepted-papers/index

[2] KDD2021 | A Collection of Recommender System Papers
