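1. Introduction
BLEU (Bilingual Evaluation Understudy) is a widely used metric for evaluating machine translation, and introductions to it are easy to find online. It was proposed by IBM in the paper "BLEU: a Method for Automatic Evaluation of Machine Translation".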

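This article walks through a worked example of how BLEU is computed, alongside the source code of NLTK's nltk.align.bleu_score module (newer NLTK releases ship the BLEU implementation as nltk.translate.bleu_score).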

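First, the formula:

$$\mathrm{BLEU} = BP \cdot \exp\left(\sum_{n=1}^{N} w_n \log P_n\right)$$

where $P_n$ is the modified n-gram precision, $w_n$ is its weight, and

$$BP = \begin{cases} 1 & \text{if } c > r \\ e^{1 - r/c} & \text{if } c \le r \end{cases}$$

with $c$ the length of the candidate translation and $r$ the length of the reference translation closest to it.
Note that the BLEU value here is computed for a single translation (one sample).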
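As a small worked step, assuming the uniform weights $w_n = 1/4$ and $N = 4$ used in the example later in this article, the exponential of the weighted log sum is simply the geometric mean of the four precisions:

```latex
\mathrm{BLEU}
  = BP \cdot \exp\left(\frac{1}{4}\sum_{n=1}^{4} \log P_n\right)
  = BP \cdot \left(P_1 \, P_2 \, P_3 \, P_4\right)^{1/4}
```

This form makes it easy to see that a single $P_n = 0$ drives the whole score to 0.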

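NLTK's nltk.align.bleu_score module implements this formula. It consists of three functions: two private functions that compute P and BP, and one function that combines them into the BLEU value.

Computing the BLEU value

def bleu(candidate, references, weights)

(1) Private function that computes the modified n-gram precision

def _modified_precision(candidate, references, n)

(2) Private function that computes the brevity penalty (BP)

def _brevity_penalty(candidate, references)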
Example:

Candidate translation (Predicted):
It is a guide to action which ensures that the military always obeys the commands of the party

Reference translations (Gold Standard):
1: It is a guide to action that ensures that the military will forever heed Party commands
2: It is the guiding principle which guarantees the military forces always being under the command of the Party
3: It is the practical guide for the army always to heed the directions of the party

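2. Computing the Modified n-gram Precision (i.e. $P_n$)
from collections import Counter

from nltk.util import ngrams


def _modified_precision(candidate, references, n):
    # Count every n-gram in the candidate translation.
    counts = Counter(ngrams(candidate, n))

    if not counts:
        return 0

    # For each candidate n-gram, record its highest count in any single reference.
    max_counts = {}
    for reference in references:
        reference_counts = Counter(ngrams(reference, n))
        for ngram in counts:
            max_counts[ngram] = max(max_counts.get(ngram, 0), reference_counts[ngram])

    # Clip each candidate count at that maximum.
    clipped_counts = dict((ngram, min(count, max_counts[ngram])) for ngram, count in counts.items())

    # True division under Python 3.
    return sum(clipped_counts.values()) / sum(counts.values())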
Here we take $N = 4$, that is, we compute the precision for 1-grams through 4-grams.
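The four precisions for the example sentences can be checked with a short script. This is a minimal sketch, not the NLTK source: `ngrams` and `modified_precision` are stand-in helpers written here with the standard library only, tokenization is assumed to be a plain whitespace split, and matching is case-sensitive (so "Party" and "party" differ).

```python
from collections import Counter
from fractions import Fraction


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram matches divided by the number of candidate n-grams."""
    counts = Counter(ngrams(candidate, n))
    if not counts:
        return Fraction(0)
    max_counts = Counter()
    for reference in references:
        reference_counts = Counter(ngrams(reference, n))
        for ngram in counts:
            max_counts[ngram] = max(max_counts[ngram], reference_counts[ngram])
    clipped = sum(min(count, max_counts[ngram]) for ngram, count in counts.items())
    return Fraction(clipped, sum(counts.values()))


candidate = ("It is a guide to action which ensures that the military "
             "always obeys the commands of the party").split()
references = [
    ("It is a guide to action that ensures that the military "
     "will forever heed Party commands").split(),
    ("It is the guiding principle which guarantees the military forces "
     "always being under the command of the Party").split(),
    ("It is the practical guide for the army always to heed "
     "the directions of the party").split(),
]

# Exact fractions make it easy to compare against the hand-built tables.
precisions = [modified_precision(candidate, references, n) for n in range(1, 5)]
for n, p in enumerate(precisions, start=1):
    print(f"P{n} = {p} = {float(p):.6f}")
```

Using `Fraction` keeps the numerator and denominator visible, which is convenient when comparing against per-n-gram count tables.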

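Modified 1-gram precision:
First count how many times each word occurs in the candidate translation, then how many times it occurs in each reference translation. Max is the largest count across the three references, and Min is the smaller of the candidate count and Max.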

Word      Candidate  Ref 1  Ref 2  Ref 3  Max  Min
the       3          1      4      4      4    3
obeys     1          0      0      0      0    0
a         1          1      0      0      1    1
which     1          0      1      0      1    1
ensures   1          1      0      0      1    1
guide     1          1      0      1      1    1
always    1          0      1      1      1    1
is        1          1      1      1      1    1
of        1          0      1      1      1    1
to        1          1      0      1      1    1
commands  1          1      0      0      1    1
that      1          2      0      0      2    1
It        1          1      1      1      1    1
action    1          1      0      0      1    1
party     1          0      0      1      1    1
military  1          1      1      0      1    1
Next, sum the Min values (3 + 0 + 14 × 1 = 17) and sum the candidate counts (3 + 15 × 1 = 18), then divide the first by the second: $P_1 = 17/18 \approx 0.944$.
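The per-word clipping can be reproduced with `collections.Counter`. The snippet below is only an illustrative sketch under the same assumptions as the hand calculation (whitespace tokenization, case-sensitive matching); the `rows` dictionary is a name introduced here for illustration.

```python
from collections import Counter

candidate = ("It is a guide to action which ensures that the military "
             "always obeys the commands of the party").split()
references = [
    ("It is a guide to action that ensures that the military "
     "will forever heed Party commands").split(),
    ("It is the guiding principle which guarantees the military forces "
     "always being under the command of the Party").split(),
    ("It is the practical guide for the army always to heed "
     "the directions of the party").split(),
]

candidate_counts = Counter(candidate)
reference_counts = [Counter(reference) for reference in references]

# word -> (count in candidate, Max over references, Min of the two)
rows = {}
for word, count in candidate_counts.items():
    max_ref = max(rc[word] for rc in reference_counts)
    rows[word] = (count, max_ref, min(count, max_ref))

for word, (count, max_ref, clipped) in rows.items():
    print(f"{word:10s} candidate={count} max={max_ref} min={clipped}")

clipped_total = sum(clipped for _, _, clipped in rows.values())
print(f"P1 = {clipped_total}/{len(candidate)}")
```

The word "the" shows why clipping matters in the other direction too: it appears 3 times in the candidate and up to 4 times in a reference, so its clipped count stays at 3, while "obeys" appears in no reference and contributes 0.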

The higher-order precisions follow in the same way:

Modified 2-gram precision:
2-gram           Candidate  Ref 1  Ref 2  Ref 3  Max  Min
ensures that     1          1      0      0      1    1
guide to         1          1      0      0      1    1
which ensures    1          0      0      0      0    0
obeys the        1          0      0      0      0    0
commands of      1          0      0      0      0    0
that the         1          1      0      0      1    1
a guide          1          1      0      0      1    1
of the           1          0      1      1      1    1
always obeys     1          0      0      0      0    0
the commands     1          0      0      0      0    0
to action        1          1      0      0      1    1
the party        1          0      0      1      1    1
is a             1          1      0      0      1    1
action which     1          0      0      0      0    0
It is            1          1      1      1      1    1
military always  1          0      0      0      0    0
the military     1          1      1      0      1    1

$P_2 = 10/17 = 0.588235294$
Modified 3-gram precision:
3-gram                 Candidate  Ref 1  Ref 2  Ref 3  Max  Min
ensures that the       1          1      0      0      1    1
which ensures that     1          0      0      0      0    0
action which ensures   1          0      0      0      0    0
a guide to             1          1      0      0      1    1
military always obeys  1          0      0      0      0    0
the commands of        1          0      0      0      0    0
commands of the        1          0      0      0      0    0
to action which        1          0      0      0      0    0
the military always    1          0      0      0      0    0
obeys the commands     1          0      0      0      0    0
It is a                1          1      0      0      1    1
of the party           1          0      0      1      1    1
is a guide             1          1      0      0      1    1
that the military      1          1      0      0      1    1
always obeys the       1          0      0      0      0    0
guide to action        1          1      0      0      1    1

$P_3 = 7/16 = 0.4375$
Modified 4-gram precision:
4-gram                     Candidate  Ref 1  Ref 2  Ref 3  Max  Min
to action which ensures    1          0      0      0      0    0
action which ensures that  1          0      0      0      0    0
guide to action which      1          0      0      0      0    0
obeys the commands of      1          0      0      0      0    0
which ensures that the     1          0      0      0      0    0
commands of the party      1          0      0      0      0    0
ensures that the military  1          1      0      0      1    1
a guide to action          1          1      0      0      1    1
always obeys the commands  1          0      0      0      0    0
that the military always   1          0      0      0      0    0
the commands of the        1          0      0      0      0    0
the military always obeys  1          0      0      0      0    0
military always obeys the  1          0      0      0      0    0
is a guide to              1          1      0      0      1    1
It is a guide              1          1      0      0      1    1

$P_4 = 4/15 = 0.266666667$
We then take $w_1 = w_2 = w_3 = w_4 = 0.25$, i.e. uniform weights.

Therefore, using the natural logarithm:

$$\sum_{n=1}^{N} w_n \log P_n = 0.25 \log P_1 + 0.25 \log P_2 + 0.25 \log P_3 + 0.25 \log P_4 = -0.684055269517$$
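The weighted log sum is easy to verify numerically. The sketch below plugs in the four precisions as exact ratios (17/18, 10/17, 7/16, 4/15) and uniform weights; the variable names are chosen here for illustration.

```python
import math

# Modified n-gram precisions for n = 1..4, written as clipped / total.
precisions = [17 / 18, 10 / 17, 7 / 16, 4 / 15]
weights = [0.25, 0.25, 0.25, 0.25]  # uniform weights

# math.log is the natural logarithm.
log_sum = sum(w * math.log(p) for w, p in zip(weights, precisions))
print(log_sum)
```

Note that `math.log` raises `ValueError` for an argument of 0, so a candidate with no matching n-grams at some order needs separate handling before this step.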
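3. Computing the Brevity Penalty
import math


def _brevity_penalty(candidate, references):
    c = len(candidate)
    ref_lens = (len(reference) for reference in references)
    # Python compares tuples element by element, e.g. (0, 1) > (1, 0) is False.
    # The tuple key below therefore picks the reference length closest to the
    # candidate length and, when several are equally close, the shortest one.
    # Example: candidate length 10 with reference lengths 9 and 11 gives r = 9.
    r = min(ref_lens, key=lambda ref_len: (abs(ref_len - c), ref_len))

    if c > r:
        return 1
    else:
        # True division under Python 3.
        return math.exp(1 - r / c)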

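Next we compute the BP (Brevity Penalty). From its formula, BP takes values in (0, 1], and the shorter the candidate is relative to the reference, the closer BP gets to 0.

The candidate translation has length 18, and the reference translations have lengths 16, 18 and 16.
So $c = 18$ and $r = 18$ (the reference length closest to the candidate length is chosen as $r$).

Therefore $BP = e^{0} = 1$.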
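The choice of $r$ and the resulting penalty can be sketched as follows. The helper names `closest_ref_length` and `brevity_penalty` are introduced here for illustration and work directly on lengths; the last call uses a hypothetical candidate of length 9 against a reference of length 18 to show the penalty taking effect.

```python
import math


def closest_ref_length(candidate_len, reference_lens):
    """Pick the reference length closest to the candidate; ties go to the shorter."""
    return min(reference_lens,
               key=lambda ref_len: (abs(ref_len - candidate_len), ref_len))


def brevity_penalty(c, r):
    """BP = 1 if c > r, otherwise exp(1 - r / c)."""
    return 1.0 if c > r else math.exp(1 - r / c)


# The example in this article: candidate length 18, references 16, 18, 16.
r = closest_ref_length(18, [16, 18, 16])
bp = brevity_penalty(18, r)
print(r, bp)

# Tie-breaking: candidate length 10, references 9 and 11 -> the shorter one.
tie_r = closest_ref_length(10, [9, 11])
print(tie_r)

# Hypothetical short candidate: half the reference length.
short_bp = brevity_penalty(9, 18)
print(short_bp)
```

A candidate half as long as its reference gets $BP = e^{-1}$, so even perfect n-gram precision would be scaled down to roughly 0.37.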
4. Putting It Together
The final score is $\mathrm{BLEU} = 1 \cdot \exp(-0.684055269517) = 0.504566684006$.
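The whole calculation can be sketched end to end with the standard library alone. This is not the NLTK implementation: the function names are stand-ins, tokenization is a plain whitespace split, and returning 0.0 when some precision is zero is an assumption made here to avoid taking log(0). The clipping uses `Counter`'s `|` (element-wise max) and `&` (element-wise min) operators.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    counts = Counter(ngrams(candidate, n))
    if not counts:
        return 0.0
    max_counts = Counter()
    for reference in references:
        max_counts |= Counter(ngrams(reference, n))  # element-wise maximum
    clipped = counts & max_counts                    # element-wise minimum
    return sum(clipped.values()) / sum(counts.values())


def brevity_penalty(candidate, references):
    c = len(candidate)
    r = min((len(reference) for reference in references),
            key=lambda ref_len: (abs(ref_len - c), ref_len))
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(candidate, references, weights):
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, len(weights) + 1)]
    if min(precisions) == 0:
        return 0.0  # assumption: a zero precision yields BLEU = 0
    log_sum = sum(w * math.log(p) for w, p in zip(weights, precisions))
    return brevity_penalty(candidate, references) * math.exp(log_sum)


candidate = ("It is a guide to action which ensures that the military "
             "always obeys the commands of the party").split()
references = [
    ("It is a guide to action that ensures that the military "
     "will forever heed Party commands").split(),
    ("It is the guiding principle which guarantees the military forces "
     "always being under the command of the Party").split(),
    ("It is the practical guide for the army always to heed "
     "the directions of the party").split(),
]

score = bleu(candidate, references, [0.25, 0.25, 0.25, 0.25])
print(score)
```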

BLEU ranges over [0, 1], where 0 is the worst and 1 is the best.

As the computation shows, BLEU is essentially a modified n-gram precision combined with a brevity penalty.
————————————————
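Copyright notice: this is an original article by CSDN blogger 「手撕机」, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/guolindonggld/article/details/56966200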
