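Preface
I. Basic Concepts
- 1.1 The four kinds of samples in a classification problem
- 1.2 Precision
- 1.3 Recall
- 1.4 Accuracy
- 1.5 Other metrics
- 1.6 Discussion
II. Using mAP to measure object detection accuracy
- 2.1 Computing the AP of one category at one IoU threshold
  - 2.1.1 Deciding whether each DT box is positive or negative
  - 2.1.2 Computing P and R
  - 2.1.3 Computing AP
- 2.2 Discussion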

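Preface

I. Basic Concepts

1.1 The four kinds of samples in a classification problem

These four concepts come from the two tables below. The tables are identical in content and differ only in terminology: the first uses the names common in machine learning, the second the names used in medicine.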

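Machine learning terminology:
Actual \ Predicted	Positive example	Negative example
Positive example	TP (true positive)	FN (false negative)
Negative example	FP (false positive)	TN (true negative)

Medical terminology:
Actual \ Predicted	Positive	Negative
Positive	TP (true positive)	FN (false negative)
Negative	FP (false positive)	TN (true negative)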

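In a binary classification problem, comparing each prediction with the ground truth gives the following four cases.

True positive (TP): predicted positive, actually positive
False negative (FN): predicted negative, actually positive
False positive (FP): predicted positive, actually negative
True negative (TN): predicted negative, actually negative

Beginners often mix these four up, but they are easy to remember once two points are clear:
(1) True or False says whether the prediction is correct.
(2) Positive or negative is the predicted result. A false positive, for example, is a sample that was predicted positive and whose prediction is wrong, so it is actually negative.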
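The two rules above translate directly into code. The following is a minimal sketch, assuming labels are encoded as 1 (positive) and 0 (negative); the function name `confusion_counts` and the sample data are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN for binary labels (1 = positive, 0 = negative)."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1  # predicted positive, prediction correct
        elif predicted == 0 and actual == 1:
            fn += 1  # predicted negative, prediction wrong
        elif predicted == 1 and actual == 0:
            fp += 1  # predicted positive, prediction wrong
        else:
            tn += 1  # predicted negative, prediction correct
    return tp, fn, fp, tn


# Illustrative data only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 1, 1, 3)
```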

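1.2 Precision

Precision comes from information retrieval, where it measures the signal-to-noise ratio of a retrieval system: the percentage of retrieved documents that are relevant.
In classification, precision is the fraction of the samples labeled positive by the algorithm that really are positive: Precision = TP / (TP + FP).
In object detection, precision says how accurate the predictions for one category are, so precision without a category is meaningless. In binary classification there are only two classes, positive and negative, so no category needs to be named. In object detection the bounding boxes belong to many categories, so any precision figure must state which category it was computed for.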

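1.3 Recall

Recall also comes from information retrieval, where it measures how successfully a retrieval system finds the relevant documents in a collection: the percentage of all relevant documents that are retrieved.
In classification, recall is the ability of the algorithm to find all positive samples: Recall = TP / (TP + FN).
In object detection, the recall of a category is the ratio of correctly predicted bounding boxes to all GT boxes of that category. As with precision, recall without a category is meaningless.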
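As a small per-category illustration, the sketch below computes precision and recall from three counts: correct detections, all detections, and all GT boxes of the category. The function name and the counts for "car" and "airplane" are invented for this example.

```python
def precision_recall(num_tp, num_dt, num_gt):
    """Per-category precision and recall in a detection setting.

    num_tp: detections of this category that matched a GT box (TP)
    num_dt: all detections of this category (TP + FP)
    num_gt: all GT boxes of this category (TP + FN)
    """
    precision = num_tp / num_dt if num_dt > 0 else 0.0
    recall = num_tp / num_gt if num_gt > 0 else 0.0
    return precision, recall


# Hypothetical counts: category -> (num_tp, num_dt, num_gt)
counts = {"car": (8, 12, 10), "airplane": (3, 3, 10)}
for name, (num_tp, num_dt, num_gt) in counts.items():
    p, r = precision_recall(num_tp, num_dt, num_gt)
    print(f"{name}: precision={p:.3f} recall={r:.3f}")
```

With these numbers "car" has high recall but lower precision, while "airplane" has perfect precision but low recall, which is why each figure has to be reported per category.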

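1.4 Accuracy

Accuracy = number of correctly predicted samples / total number of samples. It is the fraction of samples predicted correctly, counting both correctly predicted positives and correctly predicted negatives: Acc = (TP + TN) / (TP + TN + FP + FN). Object detection has no notion of a correctly predicted negative, so accuracy is not used there.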

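1.5 Other metrics

True positive rate (TPR), also called sensitivity: TPR = TP / (TP + FN) = TP / P, where P = TP + FN is the number of actual positives. This is the same quantity as recall.
True negative rate (TNR), also called specificity: TNR = TN / (FP + TN)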
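The sketch below puts accuracy, TPR and TNR side by side. It is only an illustration: the function name `rates` and the heavily imbalanced counts are assumptions chosen to show that a high accuracy can coexist with a very low true positive rate.

```python
def rates(tp, fn, fp, tn):
    """Return (accuracy, TPR, TNR) from the four confusion counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity, same as recall
    tnr = tn / (fp + tn) if (fp + tn) else 0.0  # specificity
    return accuracy, tpr, tnr


# Hypothetical data set: 10 positives, 990 negatives, only 1 positive found.
accuracy, tpr, tnr = rates(tp=1, fn=9, fp=0, tn=990)
print(f"accuracy={accuracy:.3f} TPR={tpr:.3f} TNR={tnr:.3f}")
```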

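1.6 Discussion

1. Why are both precision and recall needed to judge a classification algorithm?

Precision and recall measure a classifier from two different angles. Image classification is often evaluated by accuracy, as in the ImageNet benchmark, but for a single category the two metrics expose different failures. High recall with low precision, for example when most cars are found but many trucks are also labeled as cars, means the model produces too many false positives for that category. Low recall with high precision, for example when every detected airplane really is an airplane but many airplanes are missed, means the model produces too many false negatives.

Recall asks whether all positive samples were found. In tumor screening, for example, the model needs high recall: no tumor may be missed.

Precision asks whether every sample reported as positive really is positive. In spam filtering, for example, high precision matters more: every message moved to the trash should really be spam.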
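The trade-off between the two metrics shows up when the decision threshold on the classifier score is moved. The following is a minimal sketch with invented scores and labels; `pr_at_threshold` is a hypothetical helper, and returning precision 1.0 when nothing is predicted positive is a convention chosen here, not something the text prescribes.

```python
def pr_at_threshold(scores, labels, threshold):
    """Precision and recall when samples with score >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # convention for "no predictions"
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative scores and ground-truth labels.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.85, 0.25):
    p, r = pr_at_threshold(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.3f} recall={r:.3f}")
```

A strict threshold (0.85) keeps only confident predictions, which favors precision, as in the spam filter. A loose threshold (0.25) finds every positive at the cost of more false positives, as in tumor screening.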

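II. Using mAP to measure object detection accuracy

Take one image from the test set as an example. It contains objects of three categories labeled 1, 2 and 3, with 6 GT (ground-truth) boxes in total: 1 object of category 1, 2 objects of category 2 and 3 objects of category 3. The detector outputs 10 DT (detection) boxes for this image.

[Figure: the example image with its 6 GT boxes and 10 DT boxes]

2.1 Computing the AP of one category at one IoU threshold

2.1.1 Deciding whether each DT box is positive or negative

(1) Select all predicted boxes of the same category and sort them by score in descending order.
(2) Decide positive or negative: compute the IoU between the highest-scoring DT box and every GT box of the same category. If the largest IoU is below the given threshold, the DT box is a negative. Otherwise, match the DT box to the GT box with the largest IoU; the DT box then counts as a correct detection of that GT box at this IoU threshold. This step gives the DT box its best-matching GT box.
(3) Remove the GT box that was just matched, take the DT box with the next-highest score, and repeat the decision against the remaining unmatched GT boxes.
(4) Repeat (2) and (3) until every DT box has been labeled or every GT box has been matched. Any DT boxes still unlabeled at that point are negatives.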
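Step (2) relies on the IoU between a DT box and a GT box. A minimal sketch of that computation follows, assuming boxes are stored as [x, y, width, height] with (x, y) the top-left corner, which is the layout COCO annotations use; the function name `iou_xywh` is made up for this example.

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes overlapping on half their width: IoU = 50 / 150.
print(iou_xywh([0, 0, 10, 10], [5, 0, 10, 10]))
```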

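The pseudo-code is:

Input: DTSet and GTSet, the DT boxes and GT boxes of one category; the IoU threshold iou_thresh

DTSet = sort(DTSet)                      # sort by predicted score, descending
while DTSet is not empty and GTSet is not empty do
    DT = DTSet(0)                        # take the highest-scoring box left in DTSet
    i = argmax IoU(DT, GTSet)            # index of the GT box with the largest IoU with DT
    if IoU(DT, GTSet(i)) >= iou_thresh:
        DT is a positive
        GTSet = GTSet - GTSet(i)         # remove the GT box that was just matched
    else:
        DT is a negative
    DTSet = DTSet - DT                   # remove the DT box that has just been labeled
# after the loop, every DT box that was never examined is a negative
for DT in DTSet:
    DT is a negative

Output: the set of DT boxes, each labeled positive or negative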
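A runnable version of this greedy matching might look like the sketch below. It assumes the IoU values have already been computed into a matrix `ious[d][g]` (detection d against GT box g) and treats an IoU equal to the threshold as a match; the function name, argument layout and sample numbers are all illustrative.

```python
def label_detections(scores, ious, num_gt, iou_thresh=0.5):
    """Greedily label the detections of one category as positive or negative.

    scores[d]  : confidence of detection d
    ious[d][g] : IoU between detection d and GT box g
    Returns a list of booleans in the original detection order.
    """
    # Visit detections from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda d: -scores[d])
    unmatched = set(range(num_gt))
    is_positive = [False] * len(scores)
    for d in order:
        if not unmatched:
            break  # every GT box is taken; the remaining detections stay negative
        best_g = max(unmatched, key=lambda g: ious[d][g])
        if ious[d][best_g] >= iou_thresh:
            is_positive[d] = True
            unmatched.remove(best_g)  # each GT box can be matched only once
    return is_positive


# Hypothetical example: 4 detections, 3 GT boxes.
scores = [0.9, 0.8, 0.7, 0.6]
ious = [
    [0.8, 0.1, 0.0],  # matches GT 0
    [0.7, 0.2, 0.0],  # duplicate of GT 0, which is already taken
    [0.0, 0.6, 0.1],  # matches GT 1
    [0.0, 0.0, 0.3],  # best IoU is below the threshold
]
print(label_detections(scores, ious, num_gt=3))  # [True, False, True, False]
```

Removing a matched GT box from the candidate set is what turns a second, lower-scoring detection of the same object into a false positive.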

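Applying this algorithm to the DT boxes with pred_label = 3 labels each of them as positive or negative. (For simplicity, the actual GT and DT box coordinates are not listed.)

[Table: the DT boxes with pred_label = 3, sorted by score, each marked positive or negative]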

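2.1.2 Computing P and R

Work down the table from the highest score to the lowest. The precision and recall at each row are computed over that box together with every box whose score is at least as high.
(1) Row 1: the DT box with id = 10 is positive, so there is 1 positive among 1 box. With 3 GT boxes in total, P = 1/1 and R = 1/3.
(2) Row 2: taking id = 10 and id = 6 together, there is still 1 positive, so P = 1/2 and R = 1/3.
(3) Row 3: taking id = 10, id = 6 and id = 4 together, there are 2 positives, so P = 2/3 and R = 2/3.
The remaining rows follow in the same way.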
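The row-by-row computation is a running count of positives. As a minimal sketch, the function below takes the positive/negative flags in descending score order; the name `precision_recall_curve` is made up here, and the input reproduces only the three rows worked through above (id = 10, 6, 4).

```python
def precision_recall_curve(is_positive, num_gt):
    """Cumulative precision and recall for detections sorted by descending score."""
    precisions, recalls = [], []
    tp = 0
    for rank, positive in enumerate(is_positive, start=1):
        tp += int(positive)
        precisions.append(tp / rank)   # positives among the top `rank` boxes
        recalls.append(tp / num_gt)    # positives relative to all GT boxes
    return precisions, recalls


# Rows 1 to 3 of the example: positive, negative, positive; 3 GT boxes.
precisions, recalls = precision_recall_curve([True, False, True], num_gt=3)
print(precisions)  # 1/1, 1/2, 2/3
print(recalls)     # 1/3, 1/3, 2/3
```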

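2.1.3 Computing AP

AP is the area under the PR curve. For the table above it can be approximated as
AP = Σ_k P_k · (R_k − R_{k−1}), with R_0 = 0,
that is, the dot product of the precision column with the column of recall increments: multiply corresponding elements and add the products.

The other categories are handled in the same way.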
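One common way to turn the P and R columns into an area is to weight each precision by the recall gained at that row. The sketch below does exactly that; it is a plain rectangle-sum approximation, not the interpolated variant used by the COCO API, and the function name and three-row input are illustrative.

```python
def average_precision(precisions, recalls):
    """Area under the PR curve as the sum of precision times recall increment."""
    ap = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)  # rows that add no recall contribute nothing
        prev_recall = r
    return ap


# The three rows from section 2.1.2.
precisions = [1 / 1, 1 / 2, 2 / 3]
recalls = [1 / 3, 1 / 3, 2 / 3]
print(average_precision(precisions, recalls))  # 1/3 + 0 + 2/9 = 5/9
```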

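Note:

Whether a DT box is positive or negative is decided image by image, one category at a time.
The P, R and AP of a category are computed over all DT boxes of that category, pooled from all images.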
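The two notes can be sketched as follows: labels are produced per image, then pooled per category before AP is computed, and mAP is the mean of the per-category APs. The data layout (a list of `(detections, num_gt)` pairs per image), the function names and the sample values are assumptions made for this example.

```python
def ap_for_class(per_image_results):
    """AP of one category from per-image results.

    per_image_results: list of (detections, num_gt), one entry per image, where
    detections is a list of (score, is_positive) already labeled on that image.
    Returns None when the category has no GT box at all.
    """
    pooled = [det for dets, _ in per_image_results for det in dets]
    total_gt = sum(num_gt for _, num_gt in per_image_results)
    if total_gt == 0:
        return None
    pooled.sort(key=lambda det: -det[0])  # rank all images' detections together
    ap, tp, prev_recall = 0.0, 0, 0.0
    for rank, (_, positive) in enumerate(pooled, start=1):
        tp += int(positive)
        recall = tp / total_gt
        ap += (tp / rank) * (recall - prev_recall)
        prev_recall = recall
    return ap


def mean_ap(ap_per_class):
    """mAP: mean of the per-category APs, skipping categories without GT."""
    valid = [ap for ap in ap_per_class.values() if ap is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical category with detections on two images.
image_a = ([(0.9, True), (0.6, False)], 2)
image_b = ([(0.8, True)], 1)
print(ap_for_class([image_a, image_b]))
print(mean_ap({"1": 0.25, "2": 0.75, "3": None}))  # 0.5
```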

Finally, here is the COCO API evaluation source file, annotated with plain-language comments.

2.2 Discussion

__author__ = 'tsungyi'

import numpy as np
import datetime
import time
from collections import defaultdict
from . import mask as maskUtils
import copy


class COCOeval:
    # Interface for evaluating detection on the Microsoft COCO dataset.
    #
    # The usage for CocoEval is as follows:
    #  cocoGt=..., cocoDt=...       # load dataset and results
    #  E = CocoEval(cocoGt,cocoDt); # initialize CocoEval object
    #  E.params.recThrs = ...;      # set parameters as desired
    #  E.evaluate();                # run per image evaluation
    #  E.accumulate();              # accumulate per image results
    #  E.summarize();               # display summary metrics of results
    # For example usage see evalDemo.m and http://mscoco.org/.
    #
    # The evaluation parameters are as follows (defaults in brackets):
    #  imgIds     - [all] N img ids to use for evaluation
    #  catIds     - [all] K cat ids to use for evaluation
    #  iouThrs    - [.5:.05:.95] T=10 IoU thresholds for evaluation
    #  recThrs    - [0:.01:1] R=101 recall thresholds for evaluation
    #  areaRng    - [...] A=4 object area ranges for evaluation
    #  maxDets    - [1 10 100] M=3 thresholds on max detections per image
    #  iouType    - ['segm'] set iouType to 'segm', 'bbox' or 'keypoints'
    #  iouType replaced the now DEPRECATED useSegm parameter.
    #  useCats    - [1] if true use category labels for evaluation
    # Note: if useCats=0 category labels are ignored as in proposal scoring.
    # Note: multiple areaRngs [Ax2] and maxDets [Mx1] can be specified.
    #
    # evaluate(): evaluates detections on every image and every category and
    # concats the results into the "evalImgs" with fields:
    #  dtIds      - [1xD] id for each of the D detections (dt)
    #  gtIds      - [1xG] id for each of the G ground truths (gt)
    #  dtMatches  - [TxD] matching gt id at each IoU or 0
    #  gtMatches  - [TxG] matching dt id at each IoU or 0
    #  dtScores   - [1xD] confidence of each dt
    #  gtIgnore   - [1xG] ignore flag for each gt
    #  dtIgnore   - [TxD] ignore flag for each dt at each IoU
    #
    # accumulate(): accumulates the per-image, per-category evaluation
    # results in "evalImgs" into the dictionary "eval" with fields:
    #  params     - parameters used for evaluation
    #  date       - date evaluation was performed
    #  counts     - [T,R,K,A,M] parameter dimensions (see above)
    #  precision  - [TxRxKxAxM] precision for every evaluation setting
    #  recall     - [TxKxAxM] max recall for every evaluation setting
    # Note: precision and recall==-1 for settings with no gt objects.
    #
    # See also coco, mask, pycocoDemo, pycocoEvalDemo
    #
    # Microsoft COCO Toolbox.      version 2.0
    # Data, paper, and tutorials available at:  http://mscoco.org/
    # Code written by Piotr Dollar and Tsung-Yi Lin, 2015.
    # Licensed under the Simplified BSD License [see coco/license.txt]
    def __init__(self, cocoGt=None, cocoDt=None, iouType='segm'):
        '''
        Initialize CocoEval using coco APIs for gt and dt
        :param cocoGt: coco object with ground truth annotations
        :param cocoDt: coco object with detection results
        :return: None
        '''
        if not iouType:
            print('iouType not specified. use default iouType segm')
        self.cocoGt   = cocoGt              # ground truth COCO API
        self.cocoDt   = cocoDt              # detections COCO API
        self.params   = {}                  # evaluation parameters
        self.evalImgs = defaultdict(list)   # per-image per-category evaluation results [KxAxI] elements
        self.eval     = {}                  # accumulated evaluation results
        self._gts = defaultdict(list)       # gt for evaluation
        self._dts = defaultdict(list)       # dt for evaluation
        self.params = Params(iouType=iouType) # parameters
        self._paramsEval = {}               # parameters for evaluation
        self.stats = []                     # result summarization
        self.ious = {}                      # ious between all gts and dts
        if not cocoGt is None:
            self.params.imgIds = sorted(cocoGt.getImgIds())
            self.params.catIds = sorted(cocoGt.getCatIds())

    def _prepare(self):
        '''
        Prepare ._gts and ._dts for evaluation based on params
        :return: None
        '''
        def _toMask(anns, coco):
            # modify ann['segmentation'] by reference
            for ann in anns:
                rle = coco.annToRLE(ann)
                ann['segmentation'] = rle
        p = self.params
        if p.useCats:
            gts=self.cocoGt.loadAnns(self.cocoGt.getAnnIds(imgIds=p.imgIds, catIds=p.catIds))
            dts=self.cocoDt.loadAnns(self.cocoDt.getAnnIds(imgIds=p.imgIds, catIds=p.catIds))
        else:
            gts=self.cocoGt.loadAnns(self.cocoGt.getAnnIds(imgIds=p.imgIds))
            dts=self.cocoDt.loadAnns(self.cocoDt.getAnnIds(imgIds=p.imgIds))

        # Not relevant for bounding-box detection
        # convert ground truth to mask if iouType == 'segm'
        if p.iouType == 'segm':
            _toMask(gts, self.cocoGt)
            _toMask(dts, self.cocoDt)
        # set ignore flag
        for gt in gts:
            gt['ignore'] = gt['ignore'] if 'ignore' in gt else 0
            gt['ignore'] = 'iscrowd' in gt and gt['iscrowd']
            if p.iouType == 'keypoints':
                gt['ignore'] = (gt['num_keypoints'] == 0) or gt['ignore']
        self._gts = defaultdict(list)       # gt for evaluation
        self._dts = defaultdict(list)       # dt for evaluation
        for gt in gts:
            self._gts[gt['image_id'], gt['category_id']].append(gt)
        for dt in dts:
            self._dts[dt['image_id'], dt['category_id']].append(dt)
        self.evalImgs = defaultdict(list)   # per-image per-category evaluation results
        self.eval     = {}                  # accumulated evaluation results

    def evaluate(self):
        '''
        Run per image evaluation on given images and store results (a list of dict) in self.evalImgs
        :return: None
        '''
        tic = time.time()
        print('Running per image evaluation...')
        # 1. Preprocess the parameters in self.params
        p = self.params
        # add backward compatibility if useSegm is specified in params
        # p.useSegm defaults to None; for object detection, p.iouType == 'bbox'
        if not p.useSegm is None:
            p.iouType = 'segm' if p.useSegm == 1 else 'bbox'
            print('useSegm (deprecated) is not None. Running {} evaluation'.format(p.iouType))
        print('Evaluate annotation type *{}*'.format(p.iouType))
        # p.imgIds may contain duplicates, so deduplicate it
        p.imgIds = list(np.unique(p.imgIds))
        # Detection is evaluated per category by default, so deduplicate the category ids too
        if p.useCats:
            p.catIds = list(np.unique(p.catIds))
        # p.maxDets defaults to [1, 10, 100]; sort it
        p.maxDets = sorted(p.maxDets)
        # Store the processed parameters back in self.params
        self.params=p

        # 2. Index gt and dt by image id and category id
        self._prepare()
        # 3. loop through images, area range, max detection number
        # For object detection, catIds is simply p.catIds
        catIds = p.catIds if p.useCats else [-1]

        # For object detection, self.computeIoU is the function used
        if p.iouType == 'segm' or p.iouType == 'bbox':
            computeIoU = self.computeIoU
        elif p.iouType == 'keypoints':
            computeIoU = self.computeOks
        # COCO2017val has N=5000 images and K=80 categories, so there are 5000*80 distinct (imgId, catId) tuples
        # These tuples are the keys of the dict self.ious; the IoU matrices are the values
        self.ious = {(imgId, catId): computeIoU(imgId, catId) \
                        for imgId in p.imgIds
                        for catId in catIds}

        evaluateImg = self.evaluateImg
        maxDet = p.maxDets[-1]  # default 100
        # p.areaRng (detection default) = [[0 ** 2, 1e5 ** 2], [0 ** 2, 32 ** 2], [32 ** 2, 96 ** 2], [96 ** 2, 1e5 ** 2]]
        self.evalImgs = [evaluateImg(imgId, catId, areaRng, maxDet)
                 for catId in catIds
                 for areaRng in p.areaRng
                 for imgId in p.imgIds
             ]
        self._paramsEval = copy.deepcopy(self.params)
        toc = time.time()
        print('DONE (t={:0.2f}s).'.format(toc-tic))

    def computeIoU(self, imgId, catId):
        # Compute the IoU for one category in one image
        p = self.params
        if p.useCats:
            # Object detection takes this branch: for the given imgId and catId,
            # fetch the corresponding gt boxes and dt boxes. Four cases can occur:
            # (1) len(gt) == 0 and len(dt) == 0: the "good" case, nothing to evaluate
            # (2) len(gt) == 0 and len(dt) != 0: false detections
            # (3) len(gt) != 0 and len(dt) == 0: missed detections
            # (4) len(gt) != 0 and len(dt) != 0: the category was at least detected; how good it is depends on localization
            gt = self._gts[imgId,catId]
            dt = self._dts[imgId,catId]
        else:
            gt = [_ for cId in p.catIds for _ in self._gts[imgId,cId]]
            dt = [_ for cId in p.catIds for _ in self._dts[imgId,cId]]
        if len(gt) == 0 and len(dt) ==0:
            return []  # nothing here, return immediately
        # Stable merge sort on the negated scores gives the indices for descending score order
        inds = np.argsort([-d['score'] for d in dt], kind='mergesort')
        # Reorder dt by these indices, so dt is now sorted by score, highest first
        dt = [dt[i] for i in inds]
        # p.maxDets[-1] defaults to 100, so if there are more than 100 dt boxes only the 100 highest-scoring ones are kept
        if len(dt) > p.maxDets[-1]:
            dt=dt[0:p.maxDets[-1]]

        if p.iouType == 'segm':
            g = [g['segmentation'] for g in gt]
            d = [d['segmentation'] for d in dt]
        elif p.iouType == 'bbox':
            # Object detection runs these two lines
            g = [g['bbox'] for g in gt]
            d = [d['bbox'] for d in dt]
        else:
            raise Exception('unknown iouType for iou computation')

        # compute iou between each dt and gt region
        iscrowd = [int(o['iscrowd']) for o in gt]
        # IoU of every dt with every gt
        ious = maskUtils.iou(d,g,iscrowd)
        return ious

    def computeOks(self, imgId, catId):
        p = self.params
        # dimension here should be Nxm
        gts = self._gts[imgId, catId]
        dts = self._dts[imgId, catId]
        inds = np.argsort([-d['score'] for d in dts], kind='mergesort')
        dts = [dts[i] for i in inds]
        if len(dts) > p.maxDets[-1]:
            dts = dts[0:p.maxDets[-1]]
        # if len(gts) == 0 and len(dts) == 0:
        if len(gts) == 0 or len(dts) == 0:
            return []
        ious = np.zeros((len(dts), len(gts)))
        sigmas = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62,.62, 1.07, 1.07, .87, .87, .89, .89])/10.0
        vars = (sigmas * 2)**2
        k = len(sigmas)
        # compute oks between each detection and ground truth object
        for j, gt in enumerate(gts):
            # create bounds for ignore regions(double the gt bbox)
            g = np.array(gt['keypoints'])
            xg = g[0::3]; yg = g[1::3]; vg = g[2::3]
            k1 = np.count_nonzero(vg > 0)
            bb = gt['bbox']
            x0 = bb[0] - bb[2]; x1 = bb[0] + bb[2] * 2
            y0 = bb[1] - bb[3]; y1 = bb[1] + bb[3] * 2
            for i, dt in enumerate(dts):
                d = np.array(dt['keypoints'])
                xd = d[0::3]; yd = d[1::3]
                if k1>0:
                    # measure the per-keypoint distance if keypoints visible
                    dx = xd - xg
                    dy = yd - yg
                else:
                    # measure minimum distance to keypoints in (x0,y0) & (x1,y1)
                    z = np.zeros((k))
                    dx = np.max((z, x0-xd),axis=0)+np.max((z, xd-x1),axis=0)
                    dy = np.max((z, y0-yd),axis=0)+np.max((z, yd-y1),axis=0)
                e = (dx**2 + dy**2) / vars / (gt['area']+np.spacing(1)) / 2
                if k1 > 0:
                    e=e[vg > 0]
                ious[i, j] = np.sum(np.exp(-e)) / e.shape[0]
        return ious

    def evaluateImg(self, imgId, catId, aRng, maxDet):
        '''
        perform evaluation for single category and image
        :return: dict (single image results)
        '''
        p = self.params
        if p.useCats:
            # Object detection takes this branch: for the given imgId and catId,
            # fetch the corresponding gt boxes and dt boxes
            gt = self._gts[imgId,catId]
            dt = self._dts[imgId,catId]
        else:
            gt = [_ for cId in p.catIds for _ in self._gts[imgId,cId]]
            dt = [_ for cId in p.catIds for _ in self._dts[imgId,cId]]
        if len(gt) == 0 and len(dt) ==0:
            return None

        # gt boxes whose area falls outside the given range are ignored, which makes it easy to compute
        # area-specific AP such as AP_small, AP_medium, AP_large and AP_all
        for g in gt:
            if g['ignore'] or (g['area']<aRng[0] or g['area']>aRng[1]):
                g['_ignore'] = 1
            else:
                g['_ignore'] = 0

        # sort dt highest score first, sort gt ignore last
        # Ascending sort on _ignore, so the ignored gt boxes end up last
        gtind = np.argsort([g['_ignore'] for g in gt], kind='mergesort')
        gt = [gt[i] for i in gtind]
        # Descending sort, so higher-scoring dt boxes come first
        dtind = np.argsort([-d['score'] for d in dt], kind='mergesort')
        dt = [dt[i] for i in dtind[0:maxDet]]
        iscrowd = [int(o['iscrowd']) for o in gt]
        # load computed ious
        # len(self.ious[imgId, catId]) > 0 holds in only one situation: case (4) in computeIoU
        # In that case the IoU(dt, gt) columns are reordered by gtind, so the ignored gt boxes come last
        ious = self.ious[imgId, catId][:, gtind] if len(self.ious[imgId, catId]) > 0 else self.ious[imgId, catId]

        T = len(p.iouThrs)  # T = 10
        G = len(gt)
        D = len(dt)
        gtm  = np.zeros((T,G))
        dtm  = np.zeros((T,D))
        gtIg = np.array([g['_ignore'] for g in gt])
        dtIg = np.zeros((T,D))
        if not len(ious)==0:
            for tind, t in enumerate(p.iouThrs):
                for dind, d in enumerate(dt):
                    # information about best match so far (m=-1 -> unmatched)
                    iou = min([t,1-1e-10])
                    m   = -1
                    for gind, g in enumerate(gt):
                        # if this gt already matched, and not a crowd, continue
                        if gtm[tind,gind]>0 and not iscrowd[gind]:
                            continue
                        # if dt matched to reg gt, and on ignore gt, stop
                        # If dt[dind] already has a match (m > -1), the matched gt is a regular one,
                        # and the current gt is an ignored one, stop searching.
                        # Consequently, a dt that has found no regular match goes on to search the ignored gt boxes.
                        if m>-1 and gtIg[m]==0 and gtIg[gind]==1:
                            break
                        # continue to next gt unless better match made
                        if ious[dind,gind] < iou:
                            # Reaching here means the IoU of dt[dind] and gt[gind] is below the best value so far
                            # (initially the IoU threshold), so dt[dind] keeps searching the remaining
                            # unmatched gt boxes until none are left
                            continue
                        # if match successful and best so far, store appropriately
                        iou=ious[dind,gind]
                        m=gind
                    # if match made store id of match for both dt and gt
                    if m ==-1:
                        continue
                    # Reaching here means that at IoU threshold p.iouThrs[tind], dt[dind] found its best-matching gt[m]
                    dtIg[tind,dind] = gtIg[m]
                    # dtm[tind,dind]: at IoU threshold p.iouThrs[tind], dt[dind] is matched to gt[m], so store the id of gt[m]
                    dtm[tind,dind]  = gt[m]['id']
                    # gtm[tind,m]: at IoU threshold p.iouThrs[tind], gt[m] is matched to dt[dind], so store the id of dt[dind]
                    gtm[tind,m]     = d['id']
        # The matching above does not look at the dt area; the dt area is only taken into account here
        # set unmatched detections outside of area range to ignore
        a = np.array([d['area']<aRng[0] or d['area']>aRng[1] for d in dt]).reshape((1, len(dt)))
        dtIg = np.logical_or(dtIg, np.logical_and(dtm==0, np.repeat(a,T,0)))
        # store results for given image and category
        return {
                'image_id':     imgId,
                'category_id':  catId,
                'aRng':         aRng,
                'maxDet':       maxDet,
                'dtIds':        [d['id'] for d in dt],
                'gtIds':        [g['id'] for g in gt],
                'dtMatches':    dtm,
                'gtMatches':    gtm,
                'dtScores':     [d['score'] for d in dt],
                'gtIgnore':     gtIg,
                'dtIgnore':     dtIg,
            }

    def accumulate(self, p = None):
        '''
        Accumulate per image evaluation results and store the result in self.eval
        :param p: input params for evaluation
        :return: None
        '''
        print('Accumulating evaluation results...')
        tic = time.time()
        if not self.evalImgs:
            print('Please run evaluate() first')
        # allows input customized parameters
        if p is None:
            p = self.params
        p.catIds = p.catIds if p.useCats == 1 else [-1]
        T           = len(p.iouThrs)
        R           = len(p.recThrs)
        K           = len(p.catIds) if p.useCats else 1
        A           = len(p.areaRng)
        M           = len(p.maxDets)
        precision   = -np.ones((T,R,K,A,M)) # -1 for the precision of absent categories
        recall      = -np.ones((T,K,A,M))
        scores      = -np.ones((T,R,K,A,M))

        # create dictionary for future indexing
        _pe = self._paramsEval
        catIds = _pe.catIds if _pe.useCats else [-1]
        setK = set(catIds)
        setA = set(map(tuple, _pe.areaRng))
        setM = set(_pe.maxDets)
        setI = set(_pe.imgIds)
        # get inds to evaluate
        k_list = [n for n, k in enumerate(p.catIds)  if k in setK]
        # k_list = [0, 1, 2, ..., 79]
        m_list = [m for n, m in enumerate(p.maxDets) if m in setM]
        # m_list = [1, 10, 100]
        a_list = [n for n, a in enumerate(map(lambda x: tuple(x), p.areaRng)) if a in setA]
        # a_list = [0, 1, 2, 3]
        i_list = [n for n, i in enumerate(p.imgIds)  if i in setI]
        # i_list = [0, 1, 2, ..., 4999]
        I0 = len(_pe.imgIds)   # I0 = 5000
        A0 = len(_pe.areaRng)  # A0 = 4
        # retrieve E at each category, area range, and max number of detections
        for k, k0 in enumerate(k_list):
            Nk = k0*A0*I0
            for a, a0 in enumerate(a_list):
                Na = a0*I0
                for m, maxDet in enumerate(m_list):
                    # Map the 3-D (category, area range, image) index to the flat index into evalImgs
                    E = [self.evalImgs[Nk + Na + i] for i in i_list]
                    E = [e for e in E if not e is None]
                    if len(E) == 0:
                        continue
                    dtScores = np.concatenate([e['dtScores'][0:maxDet] for e in E])

                    # different sorting method generates slightly different results.
                    # mergesort is used to be consistent as Matlab implementation.
                    inds = np.argsort(-dtScores, kind='mergesort')
                    dtScoresSorted = dtScores[inds]

                    dtm  = np.concatenate([e['dtMatches'][:,0:maxDet] for e in E], axis=1)[:,inds]
                    dtIg = np.concatenate([e['dtIgnore'][:,0:maxDet]  for e in E], axis=1)[:,inds]
                    gtIg = np.concatenate([e['gtIgnore'] for e in E])
                    npig = np.count_nonzero(gtIg==0 )
                    if npig == 0:
                        continue
                    tps = np.logical_and(               dtm,  np.logical_not(dtIg) )
                    fps = np.logical_and(np.logical_not(dtm), np.logical_not(dtIg) )

                    # np.float was removed in NumPy 1.24; the built-in float is equivalent
                    tp_sum = np.cumsum(tps, axis=1).astype(dtype=float)
                    fp_sum = np.cumsum(fps, axis=1).astype(dtype=float)
                    for t, (tp, fp) in enumerate(zip(tp_sum, fp_sum)):
                        tp = np.array(tp)
                        fp = np.array(fp)
                        nd = len(tp)
                        rc = tp / npig
                        pr = tp / (fp+tp+np.spacing(1))
                        q  = np.zeros((R,))
                        ss = np.zeros((R,))

                        if nd:
                            recall[t,k,a,m] = rc[-1]
                        else:
                            recall[t,k,a,m] = 0

                        # numpy is slow without cython optimization for accessing elements
                        # use python array gets significant speed improvement
                        pr = pr.tolist(); q = q.tolist()

                        for i in range(nd-1, 0, -1):
                            if pr[i] > pr[i-1]:
                                pr[i-1] = pr[i]

                        inds = np.searchsorted(rc, p.recThrs, side='left')
                        try:
                            for ri, pi in enumerate(inds):
                                q[ri] = pr[pi]
                                ss[ri] = dtScoresSorted[pi]
                        except:
                            pass
                        precision[t,:,k,a,m] = np.array(q)
                        scores[t,:,k,a,m] = np.array(ss)
        self.eval = {
            'params': p,
            'counts': [T, R, K, A, M],
            'date': datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
            'precision': precision,
            'recall':   recall,
            'scores': scores,
        }
        toc = time.time()
        print('DONE (t={:0.2f}s).'.format( toc-tic))

    def summarize(self):
        '''
        Compute and display summary metrics for evaluation results.
        Note this function can *only* be applied on the default parameter setting
        '''
        def _summarize( ap=1, iouThr=None, areaRng='all', maxDets=100 ):
            p = self.params
            iStr = ' {:<18} {} @[ IoU={:<9} | area={:>6s} | maxDets={:>3d} ] = {:0.3f}'
            titleStr = 'Average Precision' if ap == 1 else 'Average Recall'
            typeStr = '(AP)' if ap==1 else '(AR)'
            iouStr = '{:0.2f}:{:0.2f}'.format(p.iouThrs[0], p.iouThrs[-1]) \
                if iouThr is None else '{:0.2f}'.format(iouThr)

            aind = [i for i, aRng in enumerate(p.areaRngLbl) if aRng == areaRng]
            mind = [i for i, mDet in enumerate(p.maxDets) if mDet == maxDets]
            if ap == 1:
                # dimension of precision: [TxRxKxAxM]
                s = self.eval['precision']
                # IoU
                if iouThr is not None:
                    t = np.where(iouThr == p.iouThrs)[0]
                    s = s[t]
                s = s[:,:,:,aind,mind]
            else:
                # dimension of recall: [TxKxAxM]
                s = self.eval['recall']
                if iouThr is not None:
                    t = np.where(iouThr == p.iouThrs)[0]
                    s = s[t]
                s = s[:,:,aind,mind]
            if len(s[s>-1])==0:
                mean_s = -1
            else:
                mean_s = np.mean(s[s>-1])
            print(iStr.format(titleStr, typeStr, iouStr, areaRng, maxDets, mean_s))
            return mean_s

        def _summarizeDets():
            stats = np.zeros((12,))
            stats[0] = _summarize(1)
            stats[1] = _summarize(1, iouThr=.5, maxDets=self.params.maxDets[2])
            stats[2] = _summarize(1, iouThr=.75, maxDets=self.params.maxDets[2])
            stats[3] = _summarize(1, areaRng='small', maxDets=self.params.maxDets[2])
            stats[4] = _summarize(1, areaRng='medium', maxDets=self.params.maxDets[2])
            stats[5] = _summarize(1, areaRng='large', maxDets=self.params.maxDets[2])
            stats[6] = _summarize(0, maxDets=self.params.maxDets[0])
            stats[7] = _summarize(0, maxDets=self.params.maxDets[1])
            stats[8] = _summarize(0, maxDets=self.params.maxDets[2])
            stats[9] = _summarize(0, areaRng='small', maxDets=self.params.maxDets[2])
            stats[10] = _summarize(0, areaRng='medium', maxDets=self.params.maxDets[2])
            stats[11] = _summarize(0, areaRng='large', maxDets=self.params.maxDets[2])
            return stats

        def _summarizeKps():
            stats = np.zeros((10,))
            stats[0] = _summarize(1, maxDets=20)
            stats[1] = _summarize(1, maxDets=20, iouThr=.5)
            stats[2] = _summarize(1, maxDets=20, iouThr=.75)
            stats[3] = _summarize(1, maxDets=20, areaRng='medium')
            stats[4] = _summarize(1, maxDets=20, areaRng='large')
            stats[5] = _summarize(0, maxDets=20)
            stats[6] = _summarize(0, maxDets=20, iouThr=.5)
            stats[7] = _summarize(0, maxDets=20, iouThr=.75)
            stats[8] = _summarize(0, maxDets=20, areaRng='medium')
            stats[9] = _summarize(0, maxDets=20, areaRng='large')
            return stats

        if not self.eval:
            raise Exception('Please run accumulate() first')
        iouType = self.params.iouType
        if iouType == 'segm' or iouType == 'bbox':
            summarize = _summarizeDets
        elif iouType == 'keypoints':
            summarize = _summarizeKps
        self.stats = summarize()

    def __str__(self):
        self.summarize()


class Params:
    '''
    Params for coco evaluation api
    '''
    def setDetParams(self):
        self.imgIds = []
        self.catIds = []
        # np.arange causes trouble.  the data point on arange is slightly larger than the true value
        # np.linspace needs an integer sample count, hence the int() around np.round
        self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
        # [0.5  0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95]: 10 IoU thresholds
        self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
        # 0:0.01:1, 101 recall thresholds in total
        self.maxDets = [1, 10, 100]
        self.areaRng = [[0 ** 2, 1e5 ** 2], [0 ** 2, 32 ** 2], [32 ** 2, 96 ** 2], [96 ** 2, 1e5 ** 2]]
        self.areaRngLbl = ['all', 'small', 'medium', 'large']
        self.useCats = 1

    def setKpParams(self):
        self.imgIds = []
        self.catIds = []
        # np.arange causes trouble.  the data point on arange is slightly larger than the true value
        self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
        self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
        self.maxDets = [20]
        self.areaRng = [[0 ** 2, 1e5 ** 2], [32 ** 2, 96 ** 2], [96 ** 2, 1e5 ** 2]]
        self.areaRngLbl = ['all', 'medium', 'large']
        self.useCats = 1

    def __init__(self, iouType='segm'):
        if iouType == 'segm' or iouType == 'bbox':
            self.setDetParams()
        elif iouType == 'keypoints':
            self.setKpParams()
        else:
            raise Exception('iouType not supported')
        self.iouType = iouType
        # useSegm is deprecated
        self.useSegm = None
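The precision sampling inside accumulate() can be hard to follow in context. The sketch below isolates that step in plain Python for a single category at a single IoU threshold: it first makes the precision values non-increasing from left to right, then samples them at 101 evenly spaced recall thresholds and averages. It uses `bisect_left` in place of `np.searchsorted(..., side='left')`; the function name `coco_style_ap` and the three-point input are assumptions for illustration, not part of the COCO API.

```python
from bisect import bisect_left


def coco_style_ap(precisions, recalls, num_points=101):
    """Interpolated AP sampled at evenly spaced recall thresholds.

    precisions, recalls: cumulative values for detections sorted by descending
    score (recalls must therefore be non-decreasing).
    """
    # Precision envelope: each value becomes the maximum precision to its right.
    envelope = list(precisions)
    for i in range(len(envelope) - 1, 0, -1):
        if envelope[i] > envelope[i - 1]:
            envelope[i - 1] = envelope[i]

    sampled = []
    for step in range(num_points):
        thr = step / (num_points - 1)      # 0.00, 0.01, ..., 1.00
        idx = bisect_left(recalls, thr)    # first detection whose recall reaches thr
        # Thresholds that are never reached contribute a precision of 0.
        sampled.append(envelope[idx] if idx < len(envelope) else 0.0)
    return sum(sampled) / num_points


# The three rows from section 2.1.2.
print(coco_style_ap([1.0, 0.5, 2 / 3], [1 / 3, 1 / 3, 2 / 3]))
```

For this input, the 34 thresholds up to 0.33 sample a precision of 1, the 33 thresholds from 0.34 to 0.66 sample 2/3, and the rest sample 0, giving 56/101. In the real code the resulting values are further averaged over IoU thresholds and categories by summarize().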

