[Notes] A Summary of Data Processing Methods and Their Implementation

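Background

Data augmentation is a key preprocessing step and plays a pivotal role throughout computer vision.

It is often what decides the quality of a dataset. Its main use is to enlarge the dataset: in deep-learning tasks, the diversity and quantity of the data often set the upper bound on what a model can achieve.

These notes mainly record source-code implementations of several data augmentation methods.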

Common Data Augmentation Methods

First, if you use the PyTorch framework, its torchvision package already wraps many data augmentation methods:

from torchvision import transforms

data_aug = transforms.Compose([
    transforms.Resize(size=240),
    transforms.RandomHorizontalFlip(0.5),
    transforms.ToTensor(),
])

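Next, let's implement some of the main methods ourselves. The snippets below appear to assume `import cv2`, `import numpy as np`, and `from numpy import random`: calls such as `random.randint(2)` only work with NumPy's random module, not with the standard library's.

Common data augmentation methods include: Compose, RandomHflip, RandomVflip, Resize, RandomCrop, Normalize, Rotate, RandomRotate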

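1. Compose

Purpose: chains several transforms in a fixed order and calls them one after another.

# Compose: chain transforms
class Compose(object):
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        for t in self.transforms:
            img = t(img)    # call each transform in the list in turn
        return img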
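As a quick illustration of how the chaining works, the sketch below feeds a value through two stand-in transforms. The `Compose` class is repeated so the snippet runs by itself; `add_one` and `double` are hypothetical placeholders for real image transforms.

```python
class Compose(object):
    """Apply a list of callables in order."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img


# Hypothetical stand-ins for image transforms
def add_one(x):
    return x + 1


def double(x):
    return x * 2


pipeline = Compose([add_one, double])
result = pipeline(3)  # add_one runs first, then double: (3 + 1) * 2
print(result)
```

The order of the list matters: swapping the two entries gives `3 * 2 + 1` instead.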

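2. RandomHflip

Purpose: random horizontal flip.

# Random horizontal flip
# `random` is numpy.random here, so randint(2) returns 0 or 1
class RandomHflip(object):
    def __call__(self, image):
        if random.randint(2):
            return cv2.flip(image, 1)
        else:
            return image

A random number, 0 or 1, decides whether the image is flipped or returned unchanged.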

3. RandomVflip

Purpose: random vertical flip.

# Random vertical flip
class RandomVflip(object):
    def __call__(self, image):
        if random.randint(2):
            return cv2.flip(image, 0)
        else:
            return image
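To make the two flip codes concrete, here is a minimal NumPy-only sketch of what `cv2.flip(image, 1)` and `cv2.flip(image, 0)` produce. It swaps in array slicing for the OpenCV call, so it is an illustration of the effect rather than the code used above.

```python
import numpy as np

# A tiny 2x3 single-channel "image" so the effect is easy to read
image = np.array([[1, 2, 3],
                  [4, 5, 6]])

h_flipped = image[:, ::-1]  # mirror left-right, like flip code 1
v_flipped = image[::-1, :]  # mirror top-bottom, like flip code 0

print(h_flipped)
print(v_flipped)
```

One difference to keep in mind: slicing returns a view of the original array, while `cv2.flip` returns a new array.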

4. RandomCrop

Purpose: random crop.

# Scale down: shrink the target (w, h) until it fits inside the source
def scale_down(src_size, size):
    w, h = size
    sw, sh = src_size
    if sh < h:
        w, h = float(w * sh) / h, sh
    if sw < w:
        w, h = sw, float(h * sw) / w
    return int(w), int(h)

# Fixed crop: cut a w x h window at (x0, y0), optionally resizing it to `size`
def fixed_crop(src, x0, y0, w, h, size=None):
    out = src[y0:y0 + h, x0:x0 + w]
    if size is not None and (w, h) != size:
        out = cv2.resize(out, (size[0], size[1]), interpolation=cv2.INTER_CUBIC)
    return out

# Random crop
class RandomCrop(object):
    def __init__(self, size):
        self.size = size

    def __call__(self, image):
        h, w, _ = image.shape
        new_w, new_h = scale_down((w, h), self.size)

        # numpy's randint needs low < high, so the zero-offset case is handled separately
        if w == new_w:
            x0 = 0
        else:
            x0 = random.randint(0, w - new_w)

        if h == new_h:
            y0 = 0
        else:
            y0 = random.randint(0, h - new_h)

        out = fixed_crop(image, x0, y0, new_w, new_h, self.size)
        return out
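The arithmetic in `scale_down` is easier to follow with numbers. The sketch below repeats the function so it runs standalone and tries three source sizes; the sizes themselves are arbitrary illustrative values.

```python
def scale_down(src_size, size):
    """Shrink the requested (w, h) so it fits inside src_size, keeping its aspect ratio."""
    w, h = size
    sw, sh = src_size
    if sh < h:
        w, h = float(w * sh) / h, sh
    if sw < w:
        w, h = sw, float(h * sw) / w
    return int(w), int(h)


fits = scale_down((640, 480), (224, 224))        # source is large enough: unchanged
too_short = scale_down((100, 50), (80, 80))      # source height 50 < 80
too_narrow = scale_down((100, 300), (200, 150))  # source width 100 < 200
print(fits, too_short, too_narrow)
```

In the second and third cases the crop window is shrunk to fit the source, and `fixed_crop` then resizes the patch back up to the requested size.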

5. Normalize

Purpose: standardizes the image data, that is, subtracts the mean and divides by the standard deviation.

# Normalize
class Normalize(object):
    def __init__(self, mean, std):
        '''
        :param mean: RGB order
        :param std:  RGB order
        '''
        self.mean = np.array(mean).reshape(3, 1, 1)
        self.std = np.array(std).reshape(3, 1, 1)

    def __call__(self, image):
        '''
        :param image:  (H,W,3)  RGB
        :return: (3,H,W) float array
        '''
        return (image.transpose((2, 0, 1)) / 255. - self.mean) / self.std
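A small worked example may help show what the expression in `__call__` does to shapes and values. The mean and std of 0.5 per channel are assumed values chosen only to make the arithmetic easy; real pipelines use dataset statistics.

```python
import numpy as np

# Assumed per-channel statistics, for illustration only
mean = np.array([0.5, 0.5, 0.5]).reshape(3, 1, 1)
std = np.array([0.5, 0.5, 0.5]).reshape(3, 1, 1)

image = np.zeros((2, 4, 3), dtype=np.uint8)  # (H, W, 3)
image[0, :, :] = 255                         # top row white, bottom row black

# Same expression as in Normalize.__call__
out = (image.transpose((2, 0, 1)) / 255. - mean) / std
print(out.shape)
print(out.min(), out.max())
```

The layout changes from (H, W, 3) to (3, H, W), white pixels map to 1.0 and black pixels to -1.0.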

6. Rotate

Purpose: rotates the image about a center point while keeping the original output size, so corners that leave the frame are cut off.

# Rotate
def rotate_nobound(image, angle, center=None, scale=1.):
    (h, w) = image.shape[:2]

    # if the center is None, initialize it as the center of the image
    if center is None:
        center = (w // 2, h // 2)

    # perform the rotation
    M = cv2.getRotationMatrix2D(center, angle, scale)    # build the rotation matrix
    rotated = cv2.warpAffine(image, M, (w, h))            # apply the affine transform
    return rotated
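To see what the rotation matrix contains, the sketch below rebuilds it with NumPy, following the formula given in the OpenCV documentation for `getRotationMatrix2D`. The function name `rotation_matrix_2d` is hypothetical, and the snippet is meant as an illustration of the math rather than a replacement for the OpenCV call.

```python
import math
import numpy as np


def rotation_matrix_2d(center, angle, scale=1.0):
    """Build the 2x3 affine matrix for a rotation about `center` (angle in degrees)."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle))
    beta = scale * math.sin(math.radians(angle))
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])


M = rotation_matrix_2d(center=(50, 40), angle=90)
center_mapped = M @ np.array([50, 40, 1.0])  # the rotation center stays fixed
right_mapped = M @ np.array([51, 40, 1.0])   # one pixel to the right of the center
print(center_mapped, right_mapped)
```

The point to the right of the center ends up one pixel above it, which shows that a positive angle rotates counter-clockwise in image coordinates (origin at the top-left).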

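7. RandomRotate

Purpose: random rotation, widely used in image augmentation.

# Random rotate
class FixRandomRotate(object):
    # "random rotation" here means picking one of the four angles 0, 90, 180, 270
    def __init__(self, angles=(0, 90, 180, 270), bound=False):
        self.angles = angles
        self.bound = bound

    def __call__(self, img):
        # numpy's randint excludes the upper bound, so the index is always valid
        do_rotate = random.randint(0, len(self.angles))
        angle = self.angles[do_rotate]
        if self.bound:
            img = rotate_bound(img, angle)    # rotate_bound is not shown in these notes
        else:
            img = rotate_nobound(img, angle)
        return img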
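The `rotate_bound` helper is not included in these notes. For the four right angles used here, one possible alternative is `np.rot90`, which rotates without interpolation and without cutting off any pixels. The sketch below is that substitute technique, not the author's `rotate_bound`; the name `rotate_right_angle` is hypothetical.

```python
import numpy as np


def rotate_right_angle(img, angle):
    """Rotate counter-clockwise by a multiple of 90 degrees without interpolation."""
    assert angle % 90 == 0, "only multiples of 90 degrees are supported"
    return np.rot90(img, k=(angle // 90) % 4)  # rotates in the (H, W) plane


img = np.arange(2 * 3 * 3).reshape(2, 3, 3)  # (H=2, W=3, C=3)
rotated = rotate_right_angle(img, 90)
print(rotated.shape)
```

For 90 and 270 degrees the height and width swap, so a later resize or crop step has to cope with the changed shape.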

8. Resize

Purpose: resizes the image to a given size.

# Resize
class Resize(object):
    def __init__(self, size, inter=cv2.INTER_CUBIC):
        self.size = size
        self.inter = inter

    def __call__(self, image):
        # cv2.resize expects the target size as (width, height)
        return cv2.resize(image, (self.size[0], self.size[1]), interpolation=self.inter)

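Other Data Augmentation Methods

The remaining methods covered here are a special kind of crop, photometric adjustments (brightness, contrast, saturation), and border padding.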

1. Center crop

# Center crop
def center_crop(src, size):
    h, w = src.shape[0:2]
    new_w, new_h = scale_down((w, h), size)

    x0 = int((w - new_w) / 2)
    y0 = int((h - new_h) / 2)

    out = fixed_crop(src, x0, y0, new_w, new_h, size)
    return out
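The offset computation can be checked on a tiny array. The sketch below is a simplified NumPy-only variant (hypothetical name `center_crop_np`) that assumes the requested size already fits inside the source, so it leaves out the `scale_down` and resize steps.

```python
import numpy as np


def center_crop_np(src, size):
    """Cut a (w, h) window from the middle of src; no resizing is done."""
    h, w = src.shape[:2]
    new_w, new_h = size
    x0 = (w - new_w) // 2
    y0 = (h - new_h) // 2
    return src[y0:y0 + new_h, x0:x0 + new_w]


src = np.arange(4 * 6).reshape(4, 6)  # H=4, W=6
patch = center_crop_np(src, (2, 2))   # offsets: x0 = 2, y0 = 1
print(patch)
```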

2. Random brightness

# Random brightness
class RandomBrightness(object):
    def __init__(self, delta=10):
        assert delta >= 0
        assert delta <= 255
        self.delta = delta

    def __call__(self, image):
        if random.randint(2):
            delta = random.uniform(-self.delta, self.delta)
            image = (image + delta).clip(0.0, 255.0)    # shift all pixels, then clip to the valid range
            # print('RandomBrightness,delta ',delta)
        return image
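The effect of the shift-and-clip step is easiest to see with a fixed delta. The sketch below uses a constant of 10 instead of a random draw so the result is reproducible; the pixel values are arbitrary.

```python
import numpy as np

image = np.array([[0, 100, 250]], dtype=np.float32)

brighter = (image + 10.0).clip(0.0, 255.0)  # 250 + 10 is clipped to 255
darker = (image - 10.0).clip(0.0, 255.0)    # 0 - 10 is clipped to 0
print(brighter, darker)
```

Clipping keeps the result in the 0 to 255 range, at the cost of losing detail in pixels that were already close to either end.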

3. Random contrast

# Random contrast
class RandomContrast(object):
    def __init__(self, lower=0.9, upper=1.05):
        self.lower = lower
        self.upper = upper
        assert self.upper >= self.lower, "contrast upper must be >= lower."
        assert self.lower >= 0, "contrast lower must be non-negative."

    # expects float image
    def __call__(self, image):
        if random.randint(2):
            alpha = random.uniform(self.lower, self.upper)
            # print('contrast:', alpha)
            image = (image * alpha).clip(0.0, 255.0)    # scale all pixels, then clip
        return image

4. Random saturation

# Random saturation
# presumably expects a float HSV image, where channel 1 is saturation
class RandomSaturation(object):
    def __init__(self, lower=0.8, upper=1.2):
        self.lower = lower
        self.upper = upper
        assert self.upper >= self.lower, "saturation upper must be >= lower."
        assert self.lower >= 0, "saturation lower must be non-negative."

    def __call__(self, image):
        if random.randint(2):
            alpha = random.uniform(self.lower, self.upper)
            image[:, :, 1] *= alpha    # in place: the input array is modified
            # print('RandomSaturation,alpha',alpha)
        return image
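Because the saturation channel is scaled in place, the class needs a float array: an in-place multiply of a uint8 array by a float factor typically raises a casting error in NumPy. The sketch below shows one way to handle uint8 input, assuming an HSV layout with saturation in channel 1; `scale_saturation` is a hypothetical helper, and it also adds a clip that the class above does not perform.

```python
import numpy as np


def scale_saturation(image_hsv_u8, alpha):
    """Scale channel 1 of a uint8 image by alpha and return a new uint8 image."""
    out = image_hsv_u8.astype(np.float32)  # astype copies, so the input is untouched
    out[:, :, 1] *= alpha
    return out.clip(0.0, 255.0).astype(np.uint8)


hsv = np.full((1, 2, 3), 200, dtype=np.uint8)
result = scale_saturation(hsv, 1.5)  # 200 * 1.5 = 300, clipped to 255
print(result[0, 0])
```

Working on a copy also avoids silently changing the caller's image, which the in-place version does.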

5. Border expansion

# Expand border: pad the shorter side so the image becomes square
class ExpandBorder(object):
    def __init__(self, mode='constant', value=255, size=(336, 336), resize=False):
        self.mode = mode
        self.value = value
        self.resize = resize
        self.size = size

    def __call__(self, image):
        h, w, _ = image.shape
        if h > w:
            # taller than wide: pad the left and right
            pad1 = (h - w) // 2
            pad2 = h - w - pad1
            if self.mode == 'constant':
                image = np.pad(image, ((0, 0), (pad1, pad2), (0, 0)),
                               self.mode, constant_values=self.value)
            else:
                image = np.pad(image, ((0, 0), (pad1, pad2), (0, 0)), self.mode)
        elif h < w:
            # wider than tall: pad the top and bottom
            pad1 = (w - h) // 2
            pad2 = w - h - pad1
            if self.mode == 'constant':
                image = np.pad(image, ((pad1, pad2), (0, 0), (0, 0)),
                               self.mode, constant_values=self.value)
            else:
                image = np.pad(image, ((pad1, pad2), (0, 0), (0, 0)), self.mode)
        if self.resize:
            # cv2.resize expects the target size as (width, height)
            image = cv2.resize(image, (self.size[0], self.size[1]), interpolation=cv2.INTER_LINEAR)
        return image
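The padding branch for a tall image can be checked in isolation. The sketch below pads a 4x2 black image to 4x4 with white borders, using the same `np.pad` call as the `h > w` case; the image size is an arbitrary illustrative choice.

```python
import numpy as np

image = np.zeros((4, 2, 3), dtype=np.uint8)  # H=4, W=2: taller than wide
h, w, _ = image.shape

pad1 = (h - w) // 2   # columns added on the left
pad2 = h - w - pad1   # columns added on the right
squared = np.pad(image, ((0, 0), (pad1, pad2), (0, 0)),
                 mode='constant', constant_values=255)
print(squared.shape)
```

Splitting the padding into `pad1` and `pad2` keeps the original content centered even when the difference between height and width is odd.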

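Of course, there are many other data augmentation methods, which are not covered further here.

Further Reading

Besides the augmentation utilities that ship with PyTorch, you can also use imgaug, a fully open-source image augmentation package that contains a large number of processing methods.

Code: https://github.com/aleju/imgaug

Documentation: https://imgaug.readthedocs.io/en/latest/index.html

I strongly recommend reading the documentation. Many of the processing methods in it can be applied quickly in real projects, and it also deepens your understanding of image processing.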