[NLP] Text Classification: Sentiment Classification
1 Common NLP Text Classification Models
1.1 TextCNN
Paper: "Convolutional Neural Networks for Sentence Classification"
Link: https://arxiv.org/abs/1408.5882
Architecture diagram: see the original paper (figure not reproduced here).
It is worth mentioning that in the 2016 follow-up paper, "A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification", the authors ran extensive experiments on TextCNN's hyperparameter choices and distilled them into practical recommendations. Paper link: https://arxiv.org/pdf/1510.03820.pdf (its well-known architecture figure is likewise not reproduced here).
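The gist of those recommendations is sketched below as a plain Python dict. The names and value ranges are a hedged paraphrase of the paper's guidance, chosen for illustration rather than quoted verbatim:

# Hedged summary of Zhang & Wallace's TextCNN tuning advice (illustrative values, not exact quotes)
textcnn_tuning_hints = {
    'word_vectors': 'pretrained word2vec/GloVe, fine-tuned (non-static)',
    'filter_sizes': [3, 4, 5],      # line-search single sizes in 1-10, then combine sizes near the best one
    'filter_num': (100, 600),       # feature maps per filter size
    'activation': ('relu', 'tanh'),
    'pooling': '1-max pooling',
    'dropout': (0.0, 0.5),          # increase regularization when using many feature maps
}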
1.2 TextRNN
TextRNN refers to using a recurrent neural network (RNN) to solve the text classification problem.
Paper: "Recurrent Neural Network for Text Classification with Multi-Task Learning"
Link: https://www.ijcai.org/Proceedings/16/Papers/408.pdf
Architecture diagram: see the original paper (figure not reproduced here).
1.3 TextRCNN
Paper: "Recurrent Convolutional Neural Networks for Text Classification"
Link: TextRCNN paper (AAAI 2015)
Architecture diagram: see the original paper (figure not reproduced here).
1.4 FastText
Paper: "Bag of Tricks for Efficient Text Classification"
Link: https://arxiv.org/pdf/1607.01759v2.pdf
Architecture diagram: see the original paper (figure not reproduced here).
1.5 HAN
Paper: "Hierarchical Attention Networks for Document Classification"
Link: https://aclanthology.org/N16-1174.pdf
Architecture diagram: see the original paper (figure not reproduced here).
1.6 CharCNN
Paper: "Character-level Convolutional Networks for Text Classification"
Link: CharCNN paper (NIPS 2015)
Architecture diagram: see the original paper (figure not reproduced here).
1.7 Transformer
Paper: "Attention Is All You Need"
Link: https://arxiv.org/pdf/1706.03762.pdf
Architecture diagram: see the original paper (figure not reproduced here).
2 Code Implementation
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
import math
import copy

# TextCNN
class TextCNN(nn.Module):
    def __init__(self, args):
        super(TextCNN, self).__init__()
        self.args = args
        class_num = args.class_num
        chanel_num = 1
        filter_num = args.filter_num
        filter_sizes = args.filter_sizes
        vocabulary_size = args.vocabulary_size
        embedding_dimension = args.embedding_dim
        self.embedding = nn.Embedding(vocabulary_size, embedding_dimension)
        if args.static:
            self.embedding = self.embedding.from_pretrained(args.vectors, freeze=not args.non_static)
        if args.multichannel:
            self.embedding2 = nn.Embedding(vocabulary_size, embedding_dimension).from_pretrained(args.vectors)
            chanel_num += 1
        else:
            self.embedding2 = None
        # one 2D convolution per filter size, each spanning the full embedding dimension
        self.convs = nn.ModuleList(
            [nn.Conv2d(chanel_num, filter_num, (size, embedding_dimension)) for size in filter_sizes])
        self.dropout = nn.Dropout(args.dropout)
        self.fc = nn.Linear(len(filter_sizes) * filter_num, class_num)

    def forward(self, x):
        if self.embedding2:
            x = torch.stack([self.embedding(x), self.embedding2(x)], dim=1)
        else:
            x = self.embedding(x)
            x = x.unsqueeze(1)
        x = [F.relu(conv(x)).squeeze(3) for conv in self.convs]
        # 1-max pooling over time for each feature map
        x = [F.max_pool1d(item, int(item.size(2))).squeeze(2) for item in x]
        x = torch.cat(x, 1)
        x = self.dropout(x)
        logits = self.fc(x)
        return logits
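A minimal smoke test for the TextCNN above; the argument values are hypothetical placeholders (vocabulary size, filter settings, etc. are chosen only for illustration):

from argparse import Namespace

args = Namespace(class_num=2, filter_num=100, filter_sizes=[3, 4, 5],
                 vocabulary_size=5000, embedding_dim=128, dropout=0.5,
                 static=False, non_static=True, multichannel=False, vectors=None)
model = TextCNN(args)
tokens = torch.randint(0, args.vocabulary_size, (8, 50))  # batch of 8 sequences, 50 token ids each
logits = model(tokens)
print(logits.shape)  # expected: torch.Size([8, 2])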
# TextRNN
class LSTM(torch.nn.Module):
    def __init__(self, args):
        super(LSTM, self).__init__()
        self.embed_size = args.embedding_dim
        self.label_num = args.class_num
        self.embed_dropout = 0.1
        self.fc_dropout = 0.1
        self.hidden_num = 1
        self.hidden_size = 50
        self.hidden_dropout = 0
        self.bidirectional = True
        vocabulary_size = args.vocabulary_size
        embedding_dimension = args.embedding_dim
        self.embeddings = nn.Embedding(vocabulary_size, embedding_dimension)
        # self.embeddings.weight.data.copy_(torch.from_numpy(vocabulary_size))
        self.embeddings.weight.requires_grad = False
        self.lstm = nn.LSTM(
            self.embed_size,
            self.hidden_size,
            dropout=self.hidden_dropout,
            num_layers=self.hidden_num,
            batch_first=True,
            bidirectional=True)
        self.embed_dropout = nn.Dropout(self.embed_dropout)
        self.fc_dropout = nn.Dropout(self.fc_dropout)
        self.linear1 = nn.Linear(self.hidden_size * 2, self.label_num)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, input):
        x = self.embeddings(input)
        x = self.embed_dropout(x)
        batch_size = len(input)
        # use the final hidden states of both directions as the sentence representation
        _, (lstm_out, _) = self.lstm(x)
        lstm_out = lstm_out.permute(1, 0, 2)
        lstm_out = lstm_out.contiguous().view(batch_size, -1)
        out = self.linear1(lstm_out)
        out = self.fc_dropout(out)
        out = self.softmax(out)
        return out
# TextRCNN
class BiLSTM(nn.Module):
    def __init__(self, args):
        super(BiLSTM, self).__init__()
        self.embed_size = args.embedding_dim
        self.label_num = args.class_num
        self.embed_dropout = 0.1
        self.fc_dropout = 0.1
        self.hidden_num = 2
        self.hidden_size = 50
        self.hidden_dropout = 0
        self.bidirectional = True
        vocabulary_size = args.vocabulary_size
        embedding_dimension = args.embedding_dim
        self.embeddings = nn.Embedding(vocabulary_size, embedding_dimension)
        # self.embeddings.weight.data.copy_(torch.from_numpy(word_embeddings))
        self.embeddings.weight.requires_grad = False
        self.lstm = nn.LSTM(
            self.embed_size,
            self.hidden_size,
            dropout=self.hidden_dropout,
            num_layers=self.hidden_num,
            batch_first=True,
            bidirectional=self.bidirectional)
        self.embed_dropout = nn.Dropout(self.embed_dropout)
        self.fc_dropout = nn.Dropout(self.fc_dropout)
        self.linear1 = nn.Linear(self.hidden_size * 2, self.hidden_size // 2)
        self.linear2 = nn.Linear(self.hidden_size // 2, self.label_num)

    def forward(self, input):
        out = self.embeddings(input)
        out = self.embed_dropout(out)
        out, _ = self.lstm(out)
        out = torch.transpose(out, 1, 2)
        out = torch.tanh(out)
        out = F.max_pool1d(out, out.size(2))
        out = out.squeeze(2)
        out = self.fc_dropout(out)
        out = self.linear1(F.relu(out))
        output = self.linear2(F.relu(out))
        return output
# FastText
class FastText(nn.Module):
    def __init__(self, args):
        super().__init__()
        self.output_dim = args.class_num
        vocabulary_size = args.vocabulary_size
        embedding_dimension = args.embedding_dim
        self.embeddings = nn.Embedding(vocabulary_size, embedding_dimension)
        self.fc = nn.Linear(embedding_dimension, self.output_dim)

    def forward(self, text):
        # text = [sent len, batch size]
        text = text.permute(1, 0)
        embedded = self.embeddings(text)
        # embedded = [sent len, batch size, emb dim]
        embedded = embedded.permute(1, 0, 2)
        # embedded = [batch size, sent len, emb dim]
        pooled = F.avg_pool2d(embedded, (embedded.shape[1], 1)).squeeze(1)
        # pooled = [batch size, embedding_dim]
        return self.fc(pooled)
# HAN
class SelfAttention(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(SelfAttention, self).__init__()
        self.W = nn.Linear(input_size, hidden_size, True)
        self.u = nn.Linear(hidden_size, 1)

    def forward(self, x):
        u = torch.tanh(self.W(x))
        a = F.softmax(self.u(u), dim=1)
        x = a.mul(x).sum(1)
        return x
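This SelfAttention module is the word/sentence-level attention from the HAN paper. For a sequence of hidden states h_t it computes

u_t = tanh(W h_t + b)
a_t = softmax(u^T u_t)
s = sum_t a_t * h_t

and returns the attention-weighted summary vector s.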
class HAN(nn.Module):
    def __init__(self, args):
        super(HAN, self).__init__()
        hidden_size_gru = 50   # 50
        hidden_size_att = 100  # 100
        num_classes = args.class_num
        vocabulary_size = args.vocabulary_size
        embedding_dimension = args.embedding_dim
        self.num_words = 64  # padded number of words per sentence
        self.embed = nn.Embedding(vocabulary_size, embedding_dimension)
        self.gru1 = nn.GRU(embedding_dimension, hidden_size_gru, bidirectional=True, batch_first=True)
        self.att1 = SelfAttention(hidden_size_gru * 2, hidden_size_att)
        self.gru2 = nn.GRU(hidden_size_att, hidden_size_gru, bidirectional=True, batch_first=True)
        self.att2 = SelfAttention(hidden_size_gru * 2, hidden_size_att)
        # the fc layer has very few parameters, so no dropout is applied here
        self.fc = nn.Linear(hidden_size_att, num_classes, True)

    def forward(self, x):
        # 64 512 200
        x = x.view(x.size(0) * self.num_words, -1).contiguous()
        x = self.embed(x)
        x, _ = self.gru1(x)   # word-level encoder
        x = self.att1(x)      # word-level attention -> sentence vectors
        x = x.view(x.size(0) // self.num_words, self.num_words, -1).contiguous()
        x, _ = self.gru2(x)   # sentence-level encoder
        x = self.att2(x)      # sentence-level attention -> document vector
        x = self.fc(x)
        x = F.log_softmax(x, dim=1)  # softmax
        return x
# CharCNN
class CharCNN(nn.Module):
    def __init__(self, args):
        super(CharCNN, self).__init__()
        self.num_chars = 64
        self.features = [128, 128, 128, 128, 128, 128]
        self.kernel_sizes = [7, 7, 3, 3, 3, 3]
        self.dropout = args.dropout
        self.num_labels = args.class_num
        vocabulary_size = args.vocabulary_size
        embedding_dimension = args.embedding_dim
        # Embedding Layer
        self.embeddings = nn.Embedding(vocabulary_size, embedding_dimension)
        self.embeddings.weight.requires_grad = False
        self.in_features = [self.num_chars] + self.features[:-1]
        self.out_features = self.features
        self.conv1d_1 = nn.Sequential(
            nn.Conv1d(self.in_features[0], self.out_features[0], self.kernel_sizes[0], stride=1),
            nn.BatchNorm1d(self.out_features[0]),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3, stride=3))
        self.conv1d_2 = nn.Sequential(
            nn.Conv1d(self.in_features[1], self.out_features[1], self.kernel_sizes[1], stride=1),
            nn.BatchNorm1d(self.out_features[1]),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3, stride=3))
        self.conv1d_3 = nn.Sequential(
            nn.Conv1d(self.in_features[2], self.out_features[2], self.kernel_sizes[2], stride=1),
            nn.BatchNorm1d(self.out_features[2]),
            nn.ReLU())
        self.conv1d_4 = nn.Sequential(
            nn.Conv1d(self.in_features[3], self.out_features[3], self.kernel_sizes[3], stride=1),
            nn.BatchNorm1d(self.out_features[3]),
            nn.ReLU())
        self.conv1d_5 = nn.Sequential(
            nn.Conv1d(self.in_features[4], self.out_features[4], self.kernel_sizes[4], stride=1),
            nn.BatchNorm1d(self.out_features[4]),
            nn.ReLU())
        self.conv1d_6 = nn.Sequential(
            nn.Conv1d(self.in_features[5], self.out_features[5], self.kernel_sizes[5], stride=1),
            nn.BatchNorm1d(self.out_features[5]),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3, stride=3))
        self.fc1 = nn.Sequential(
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Dropout(self.dropout))
        self.fc2 = nn.Sequential(
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Dropout(self.dropout))
        self.fc3 = nn.Linear(128, self.num_labels)

    def forward(self, x):
        # x = torch.Tensor(x).long()  # batch_size=128, num_chars=128, seq_len=64
        x = self.embeddings(x)
        # x = x.permute(0, 2, 1)
        x = self.conv1d_1(x)  # (batch, out_features[0], ((seq_len-7+1)-3)/3+1); e.g. seq_len=1014 -> 336
        x = self.conv1d_2(x)  # e.g. ((336-7+1)-3)/3+1 = 110
        x = self.conv1d_3(x)  # e.g. 110-3+1 = 108
        x = self.conv1d_4(x)  # e.g. 108-3+1 = 106
        x = self.conv1d_5(x)  # e.g. 106-3+1 = 104
        x = self.conv1d_6(x)  # e.g. ((104-3+1)-3)/3+1 = 34
        x = x.view(x.size(0), -1)  # flatten to (batch, channels * length)
        # NOTE: fc1's in_features must match channels * length after conv1d_6 for the chosen seq_len
        out = self.fc1(x)
        out = self.fc2(out)
        out = self.fc3(out)
        return out
# Transformer
class Transformer_Config(object):
    """Configuration parameters"""
    def __init__(self, args):
        # self.model_name = 'Transformer'
        # self.train_path = dataset + '/data/train.txt'                       # training set
        # self.dev_path = dataset + '/data/dev.txt'                           # validation set
        # self.test_path = dataset + '/data/test.txt'                         # test set
        # self.class_list = [x.strip() for x in open(
        #     dataset + '/data/class.txt', encoding='utf-8').readlines()]     # class names
        # self.vocab_path = dataset + '/data/vocab.pkl'                       # vocabulary
        # self.save_path = dataset + '/saved_dict/' + self.model_name + '.ckpt'  # trained model checkpoint
        # self.log_path = dataset + '/log/' + self.model_name
        # self.embedding_pretrained = torch.tensor(
        #     np.load(dataset + '/data/' + embedding)["embeddings"].astype('float32'))\
        #     if embedding != 'random' else None                              # pretrained word vectors
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # device
        self.dropout = 0.5                   # dropout rate
        self.require_improvement = 2000      # stop training early if no improvement after this many batches
        self.num_classes = args.class_num    # number of classes
        self.n_vocab = args.vocabulary_size  # vocabulary size, assigned at runtime
        self.num_epochs = args.epochs        # number of epochs
        self.batch_size = args.batch_size    # mini-batch size
        self.pad_size = 64                   # every sentence is padded/truncated to this length
        self.learning_rate = 5e-4            # learning rate
        self.embedding_pretrained = None
        # self.embed = self.embedding_pretrained.size(1)\
        #     if self.embedding_pretrained is not None else 300  # embedding dimension
        self.embed = 128
        self.dim_model = args.embedding_dim
        self.hidden = 1024
        self.last_hidden = 512
        self.num_head = 2
        self.num_encoder = 2
'''Attention Is All You Need'''
class Transformer(nn.Module):
    def __init__(self, config):
        super(Transformer, self).__init__()
        if config.embedding_pretrained is not None:
            self.embedding = nn.Embedding.from_pretrained(config.embedding_pretrained, freeze=False)
        else:
            self.embedding = nn.Embedding(config.n_vocab, config.embed)
        self.postion_embedding = Positional_Encoding(config.embed, config.pad_size, config.dropout, config.device)
        self.encoder = Encoder(config.dim_model, config.num_head, config.hidden, config.dropout)
        self.encoders = nn.ModuleList([
            copy.deepcopy(self.encoder)
            # Encoder(config.dim_model, config.num_head, config.hidden, config.dropout)
            for _ in range(config.num_encoder)])
        self.fc1 = nn.Linear(config.pad_size * config.dim_model, config.num_classes)
        # self.fc2 = nn.Linear(config.last_hidden, config.num_classes)
        # self.fc1 = nn.Linear(config.dim_model, config.num_classes)

    def forward(self, x):
        out = self.embedding(x)
        out = self.postion_embedding(out)
        for encoder in self.encoders:
            out = encoder(out)
        out = out.view(out.size(0), -1)
        # out = torch.mean(out, 1)
        out = self.fc1(out)
        return out
class Encoder(nn.Module):
    def __init__(self, dim_model, num_head, hidden, dropout):
        super(Encoder, self).__init__()
        self.attention = Multi_Head_Attention(dim_model, num_head, dropout)
        self.feed_forward = Position_wise_Feed_Forward(dim_model, hidden, dropout)

    def forward(self, x):
        out = self.attention(x)
        out = self.feed_forward(out)
        return out
class Positional_Encoding(nn.Module):
    def __init__(self, embed, pad_size, dropout, device):
        super(Positional_Encoding, self).__init__()
        self.device = device
        self.pe = torch.tensor(
            [[pos / (10000.0 ** (i // 2 * 2.0 / embed)) for i in range(embed)] for pos in range(pad_size)])
        self.pe[:, 0::2] = np.sin(self.pe[:, 0::2])
        self.pe[:, 1::2] = np.cos(self.pe[:, 1::2])
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        out = x + nn.Parameter(self.pe, requires_grad=False).to(self.device)
        out = self.dropout(out)
        return out
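For reference, the sinusoidal table built above corresponds to the standard formulation from "Attention Is All You Need":

PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))

where pos is the token position and i indexes the embedding dimension.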
class Scaled_Dot_Product_Attention(nn.Module):
    '''Scaled Dot-Product Attention'''
    def __init__(self):
        super(Scaled_Dot_Product_Attention, self).__init__()

    def forward(self, Q, K, V, scale=None):
        '''
        Args:
            Q: [batch_size, len_Q, dim_Q]
            K: [batch_size, len_K, dim_K]
            V: [batch_size, len_V, dim_V]
            scale: scaling factor; the paper uses 1/sqrt(dim_K)
        Return:
            the context tensor after self-attention
        '''
        attention = torch.matmul(Q, K.permute(0, 2, 1))
        if scale:
            attention = attention * scale
        # if mask:  # TODO change this
        #     attention = attention.masked_fill_(mask == 0, -1e9)
        attention = F.softmax(attention, dim=-1)
        context = torch.matmul(attention, V)
        return context
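The forward pass above implements scaled dot-product attention from the Transformer paper,

Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V

where the factor 1/sqrt(d_k) is passed in as `scale` by the multi-head module below.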
class Multi_Head_Attention(nn.Module):
    def __init__(self, dim_model, num_head, dropout=0.0):
        super(Multi_Head_Attention, self).__init__()
        self.num_head = num_head
        assert dim_model % num_head == 0
        self.dim_head = dim_model // self.num_head
        self.fc_Q = nn.Linear(dim_model, num_head * self.dim_head)
        self.fc_K = nn.Linear(dim_model, num_head * self.dim_head)
        self.fc_V = nn.Linear(dim_model, num_head * self.dim_head)
        self.attention = Scaled_Dot_Product_Attention()
        self.fc = nn.Linear(num_head * self.dim_head, dim_model)
        self.dropout = nn.Dropout(dropout)
        self.layer_norm = nn.LayerNorm(dim_model)

    def forward(self, x):
        batch_size = x.size(0)
        Q = self.fc_Q(x)
        K = self.fc_K(x)
        V = self.fc_V(x)
        Q = Q.view(batch_size * self.num_head, -1, self.dim_head)
        K = K.view(batch_size * self.num_head, -1, self.dim_head)
        V = V.view(batch_size * self.num_head, -1, self.dim_head)
        # if mask:  # TODO
        #     mask = mask.repeat(self.num_head, 1, 1)  # TODO change this
        scale = K.size(-1) ** -0.5  # scaling factor
        context = self.attention(Q, K, V, scale)
        context = context.view(batch_size, -1, self.dim_head * self.num_head)
        out = self.fc(context)
        out = self.dropout(out)
        out = out + x  # residual connection
        out = self.layer_norm(out)
        return out
class Position_wise_Feed_Forward(nn.Module):
    def __init__(self, dim_model, hidden, dropout=0.0):
        super(Position_wise_Feed_Forward, self).__init__()
        self.fc1 = nn.Linear(dim_model, hidden)
        self.fc2 = nn.Linear(hidden, dim_model)
        self.dropout = nn.Dropout(dropout)
        self.layer_norm = nn.LayerNorm(dim_model)

    def forward(self, x):
        out = self.fc1(x)
        out = F.relu(out)
        out = self.fc2(out)
        out = self.dropout(out)
        out = out + x  # residual connection
        out = self.layer_norm(out)
        return out
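The models above can be trained with a standard loop. The sketch below is a minimal example under assumed conditions: a hypothetical `train_loader` that yields (token_id_batch, label_batch) tensor pairs, and a model that returns raw logits (e.g. TextCNN, BiLSTM, or Transformer). The LSTM and HAN classes above already apply a (log-)softmax, so they would pair with NLLLoss instead, or have the softmax removed.

def train_model(model, train_loader, epochs=5, lr=1e-3, device='cpu'):
    # Minimal training-loop sketch; the data pipeline itself is not shown here.
    model = model.to(device)
    optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=lr)
    criterion = nn.CrossEntropyLoss()  # expects raw logits
    for epoch in range(epochs):
        model.train()
        total_loss = 0.0
        for tokens, labels in train_loader:
            tokens, labels = tokens.to(device), labels.to(device)
            optimizer.zero_grad()
            logits = model(tokens)
            loss = criterion(logits, labels)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f'epoch {epoch + 1}: mean loss = {total_loss / len(train_loader):.4f}')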
3 Results and Discussion
The models were trained on a binary sentiment classification task; the dataset contains 56,700 training samples and 7,000 evaluation samples. The test results are summarized in the table below (table not reproduced here).
Because the dataset used here is relatively small, the smaller models actually achieve better results; the Transformer is somewhat over-powered for this task and has little room to show its strengths. Discussion and feedback are welcome.