References

Official tutorial link

http://www.tensorfly.cn/tfdoc/how_tos/adding_an_op.html#AUTOGENERATED-implement-the-gradient-in-python
From this tutorial we learn some details to keep in mind when registering a gradient function with ops.RegisterGradient:

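For an Op with a single output, the gradient function takes an Operation op and a Tensor grad as arguments, and builds new Ops from op.inputs[i], op.outputs[i], and grad. Attribute information can be obtained through op.get_attr.

If the Op has multiple outputs, the gradient function takes op and grads as arguments, where grads is a list of gradient Ops, one for each output. The output of the gradient function must be a list of Tensor objects, one for the gradient of each input.

If no gradient is defined for some inputs, for example integers used as indices, the gradient returned for those inputs is None. For example, if an Op takes a floating-point tensor x and an integer index i as inputs, the gradient function returns [x_grad, None].

If a gradient is meaningless for an Op, use ops.NoGradient("OpName") to disable automatic differentiation.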

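Note that when the gradient function is called, it operates on the Ops in the dataflow graph, not on the tensor data itself. The gradient computation is therefore only triggered, through the execution of other TensorFlow Ops, when the graph is run.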
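The contract described above can be sketched without TensorFlow. In the toy below, FakeOp, gather_forward and gather_grad are hypothetical names invented for illustration and are not TensorFlow APIs. The sketch only mimics the shape of a gradient function for an Op with a float input x and an integer index i, which returns [x_grad, None].

```python
from collections import namedtuple

# Hypothetical stand-in for a graph Operation; not a TensorFlow class.
FakeOp = namedtuple("FakeOp", ["inputs", "outputs"])


def gather_forward(x, i):
    """Forward pass of a toy 'pick element i of x' op."""
    return FakeOp(inputs=[x, i], outputs=[x[i]])


def gather_grad(op, grad):
    """Gradient function: returns one entry per input of the op."""
    x, i = op.inputs
    # Only position i of x influenced the output, so only it receives grad.
    x_grad = [grad if k == i else 0.0 for k in range(len(x))]
    # The integer index has no meaningful gradient, so None is returned for it.
    return [x_grad, None]


op = gather_forward([1.0, 2.0, 3.0], 1)
grads = gather_grad(op, 5.0)
print(grads)
```

The returned list has the same length as op.inputs, which is the property the tutorial text asks for.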

A more concise method in newer versions

@tf.custom_gradient
def log1pexp(x):
    e = tf.exp(x)

    def grad(dy):
        return dy * (1 - 1 / (1 + e))

    return tf.log(1 + e), grad
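As a sanity check on the formula used in grad, the following TensorFlow-free sketch compares it with a finite-difference estimate of the derivative of log(1 + e^x). It uses only the standard math module. The helper finite_difference and the sample points are assumptions made for illustration.

```python
import math


def log1pexp(x):
    """Return log(1 + e^x) and a grad function, mirroring the snippet above."""
    e = math.exp(x)

    def grad(dy):
        return dy * (1 - 1 / (1 + e))

    return math.log(1 + e), grad


def finite_difference(f, x, h=1e-6):
    """Central-difference estimate of df/dx (illustrative helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (-2.0, 0.0, 3.0):
    _, grad = log1pexp(x)
    analytic = grad(1.0)
    numeric = finite_difference(lambda t: log1pexp(t)[0], x)
    print(x, analytic, numeric)
```

The two columns should agree to several decimal places, which suggests that the hand-written gradient is the true derivative rewritten in a different algebraic form.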

https://blog.csdn.net/u014061630/article/details/81369787
https://blog.csdn.net/qq_39216794/article/details/86183668

TensorFlow's official GitHub repository: this approach is implemented with the function module (function.Defun)

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/framework/function.py#L349
From this we can see that Python offers two ways to define a custom gradient

import tensorflow as tf
from tensorflow.core.framework import function_pb2
from tensorflow.core.protobuf import config_pb2
from tensorflow.core.protobuf import rewriter_config_pb2
from tensorflow.python.client import session
from tensorflow.python.framework import constant_op
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import errors_impl
from tensorflow.python.framework import function
from tensorflow.python.framework import graph_to_function_def
from tensorflow.python.framework import ops
from tensorflow.python.framework import tensor_shape
from tensorflow.python.framework import test_util
from tensorflow.python.framework.errors import InvalidArgumentError
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import control_flow_ops
from tensorflow.python.ops import functional_ops
from tensorflow.python.ops import gen_logging_ops
from tensorflow.python.ops import gradients_impl
from tensorflow.python.ops import init_ops
from tensorflow.python.ops import linalg_ops
from tensorflow.python.ops import logging_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn_ops
from tensorflow.python.ops import random_ops
from tensorflow.python.ops import variable_scope
from tensorflow.python.ops import variables
from tensorflow.python.platform import test
from tensorflow.python.platform import tf_logging


def testSameFunctionDifferentGrads():

    def PartOne(x):
        # Default grad is dx = dy * 2
        @function.Defun(dtypes.float32)
        def Foo(x):
            return x * 2

        return Foo(x)

    def PartTwo(x):
        # Low-level approach
        @function.Defun(dtypes.float32, dtypes.float32)
        def Bar(x, dy):
            return x + dy  # crazy backprop

        @function.Defun(dtypes.float32, grad_func=Bar)
        def Foo(x):
            return x * 2

        return Foo(x)

    def PartThree(x):
        # High-level approach
        def Bar(op, dy):
            print("op->{} dy->{}".format(op, dy))
            return op.inputs[0] * dy / 2  # crazy backprop

        @function.Defun(dtypes.float32, python_grad_func=Bar)
        def Foo(x):
            return x * 2

        return Foo(x)

    g = ops.Graph()
    with g.as_default():
        x = constant_op.constant(100.)
        x0 = x
        y0 = PartOne(x0)
        dx0, = gradients_impl.gradients(ys=[y0], xs=[x0])
        x1 = x
        y1 = PartTwo(x1)
        dx1, = gradients_impl.gradients(ys=[y1], xs=[x1])
        x2 = x
        y2 = PartThree(x2)
        dx2, = gradients_impl.gradients(ys=[y2], xs=[x2])

    with tf.Session(graph=g) as sess:
        v0, v1, v2 = sess.run([dx0, dx1, dx2])
        print(v0)
        print(v1)
        print(v2)


if __name__ == "__main__":
    testSameFunctionDifferentGrads()
    # self.assertAllEqual(v0, 2.)
    # self.assertAllEqual(v1, 101.)
    # self.assertAllEqual(v2, 50.)
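The three values in the commented-out assertions (2, 101 and 50) follow directly from the three gradient definitions. The plain-Python sketch below reproduces the arithmetic without TensorFlow. The function names are made up for illustration, and the upstream gradient dy is assumed to be 1 because each y is differentiated directly.

```python
x = 100.0  # value fed to each Foo, as in the test above
dy = 1.0   # upstream gradient; assumed to be 1


def default_grad(x, dy):
    # PartOne: ordinary derivative of x * 2.
    return dy * 2


def low_level_grad(x, dy):
    # PartTwo: mirrors Bar(x, dy), which is passed through grad_func.
    return x + dy


def high_level_grad(op_inputs, dy):
    # PartThree: mirrors Bar(op, dy), which is passed through python_grad_func;
    # op_inputs stands in for op.inputs.
    return op_inputs[0] * dy / 2


v0 = default_grad(x, dy)
v1 = low_level_grad(x, dy)
v2 = high_level_grad([x], dy)
print(v0, v1, v2)
```

The low-level function receives the inputs and upstream gradients as separate arguments, while the high-level one reads the inputs from a single op-like object.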

Some experiments follow

import time
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import function

Reference paper: Stochastic Generative Hashing

https://github.com/doubling/Stochastic_Generative_Hashi
The author uses the reparameterization trick so that ξ receives a gradient

dtype = tf.float32


@function.Defun(dtype, dtype, dtype, dtype)
def DoublySNGrad(logits, epsilon, dprev, dpout):
    '''
    As the name suggests, this is the gradient of a continuous sign.
    Given ξ (epsilon), the method described in section 2.4 of the paper
    (Reparametrization via Stochastic Neur) can be used for reparameterized sampling.
    return -> dlogits (new gradient for logits), depsilon (new gradient for ξ)
    '''
    prob = 1.0 / (1 + tf.exp(-logits))
    yout = (tf.sign(prob - epsilon) + 1.0) / 2.0
    # unbiased
    dlogits = prob * (1 - prob) * (dprev + dpout)
    depsilon = dprev
    return dlogits, depsilon


# This appears to rely on TensorFlow's gradient override mechanism
@function.Defun(dtype, dtype, grad_func=DoublySNGrad)
def DoublySN(logits, epsilon):
    prob = 1.0 / (1 + tf.exp(-logits))
    yout = (tf.sign(prob - epsilon) + 1.0) / 2.0
    return yout, prob


a = tf.constant(10.0)
b = tf.constant(10.0)
g = tf.Graph()
y = DoublySN(a, b)
dx = tf.gradients(y, [a, b])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    grad = sess.run(dx)
    print(grad)

WARNING:tensorflow:From F:\Anaconda3\lib\site-packages\tensorflow\python\framework\function.py:987: calling Graph.create_op (from tensorflow.python.framework.ops) with compute_shapes is deprecated and will be removed in a future version.
Instructions for updating:
Shapes are always computed; don't use the compute_shapes as it has no effect.
[9.083335e-05, 1.0]
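The printed gradient can be checked by hand. The sketch below recomputes the formula from DoublySNGrad with the standard math module, assuming that both upstream gradients (dprev for yout and dpout for prob) equal 1, since both outputs are differentiated directly. It runs in float64, so the result is expected to be close to the float32 value printed above but not bit-identical.

```python
import math

logits = 10.0
dprev = 1.0  # assumed upstream gradient for yout
dpout = 1.0  # assumed upstream gradient for prob

# Same expressions as in DoublySNGrad, evaluated in plain Python.
prob = 1.0 / (1.0 + math.exp(-logits))
dlogits = prob * (1 - prob) * (dprev + dpout)
depsilon = dprev
print(dlogits, depsilon)
```

With logits = 10 the sigmoid is almost saturated, which is why the gradient for logits is so small.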

An example of a custom gradient

The gradient of the sign function is zero wherever it is defined, so here we replace it
https://blog.csdn.net/LoseInVain/article/details/83108001

@function.Defun(tf.float32, tf.float32)
def DoublySignGrad(x, dx):
    '''
    dx is the upstream gradient, i.e. the gradient arriving from the
    following layer during backpropagation.
    In this example the input is scaled by 10, so the final gradient
    with respect to x is 10 wherever it is nonzero.
    '''
    input = x
    cond = (input >= -1) & (input <= 1)
    zeros = tf.zeros_like(dx)
    return tf.where(cond, dx, zeros)


@function.Defun(tf.float32, grad_func=DoublySignGrad)
def DoublySign(x):
    return tf.sign(x)


x = tf.constant([10., 0., -10., -0.05, 0.05])
g = tf.Graph()
y = DoublySign(x*10)  # <- note that x is multiplied by 10 here, which makes grad 10
dx = tf.gradients(y, x)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    y = sess.run(y)
    print(y)
    grad = sess.run(dx)
    print(grad)

[ 1.  0. -1. -1.  1.]
[array([ 0., 10.,  0., 10., 10.], dtype=float32)]
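The output above combines two effects: the custom gradient passes the upstream gradient through only where the input lies in [-1, 1], and the chain rule then multiplies by the factor 10 from x*10. The following plain-Python sketch reproduces both steps. The helper names are invented for illustration, and the upstream gradient is assumed to be 1.

```python
def sign(v):
    """Sign of v as -1, 0 or 1."""
    return (v > 0) - (v < 0)


def doubly_sign_grad(v, upstream):
    # Pass the upstream gradient through only where -1 <= v <= 1.
    return upstream if -1 <= v <= 1 else 0.0


xs = [10.0, 0.0, -10.0, -0.05, 0.05]
scale = 10.0

forward = [float(sign(x * scale)) for x in xs]
# Chain rule: the custom gradient evaluated at scale * x, times scale.
backward = [doubly_sign_grad(x * scale, 1.0) * scale for x in xs]
print(forward)
print(backward)
```

Inputs whose scaled value falls outside [-1, 1] get a zero gradient, and the remaining ones get exactly the scale factor.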

Multiple outputs

@function.Defun(tf.float32, tf.float32, tf.float32, tf.float32)
def DoublySignGrad(a, b, pa, pb):
    '''
    a and b are the a and b in DoublySign(a, b).
    pa and pb are the upstream gradients for the return values of
    DoublySign(a, b), namely 2*a+10 and b+10.
    '''
    return 123.0, 456.0


@function.Defun(tf.float32, tf.float32, grad_func=DoublySignGrad)
def DoublySign(a, b):
    return 2*a+10, b+10


def Sign(a, b):
    return 2*a+10, b+10


a = tf.constant(10.0)
b = tf.constant(10.0)
g = tf.Graph()
y = DoublySign(a, b)
y_ = Sign(a, b)
dx = tf.gradients(y, [a, b])
dx_ = tf.gradients(y_, [a, b])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print("Modified gradient")
    y = sess.run(y)
    print(y)
    grad = sess.run(dx)
    print(grad)
    print("Original gradient")
    y_ = sess.run(y_)
    print(y_)
    grad = sess.run(dx_)
    print(grad)

Modified gradient
(30.0, 20.0)
[123.0, 456.0]
Original gradient
(30.0, 20.0)
[2.0, 1.0]

Using the high-level API

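It differs from the low-level API in only two ways:
The overriding gradient function is specified with python_grad_func
The gradient function receives an op object that carries the op's information, so the inputs no longer have to be listed one by one as in the low-level API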

# @tf.RegisterGradient("QuantizeGrad")
def sign_grad(op, grad1, grad2):
    a = op.inputs[0]  # first input of the op
    b = op.inputs[1]  # second input of the op
    out = op.outputs[0]  # first output of the op
    return grad1*out, grad2*b


# Use sign_grad as the Python gradient function of DoublySign
@function.Defun(tf.float32, tf.float32, python_grad_func=sign_grad)
def DoublySign(x, b):
    return x*100, b*10


a = tf.constant([1., 2.])
b = tf.constant([3., 4.])
g = tf.Graph()
y = DoublySign(a, b)
dx = tf.gradients(y, [a, b])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    y = sess.run(y)
    print(y)
    grad = sess.run(dx)
    print(grad)

(array([100., 200.], dtype=float32), array([30., 40.], dtype=float32))
[array([100., 200.], dtype=float32), array([3., 4.], dtype=float32)]
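The two printed gradients can be reproduced by applying the expressions from sign_grad by hand. The sketch below does this with plain Python lists, assuming both upstream gradients are all ones. The variable names are chosen for illustration only.

```python
a = [1.0, 2.0]
b = [3.0, 4.0]

# Forward pass of DoublySign(x, b) = (x * 100, b * 10).
out0 = [v * 100 for v in a]
out1 = [v * 10 for v in b]

# Upstream gradients, assumed to be all ones.
grad1 = [1.0, 1.0]
grad2 = [1.0, 1.0]

# sign_grad returns grad1 * op.outputs[0] and grad2 * op.inputs[1].
da = [g * o for g, o in zip(grad1, out0)]
db = [g * v for g, v in zip(grad2, b)]
print(da, db)
```

The first gradient equals the first output and the second equals the second input, which is what the custom function defines rather than the true derivatives 100 and 10.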

An example of rewriting the low-level API version with the high-level API

def DoublySignGrad(op, grad):
    '''
    grad is the upstream gradient, i.e. the gradient arriving from the
    following layer during backpropagation.
    '''
    input = op.inputs[0]
    cond = (input >= -1) & (input <= 1)
    zeros = tf.zeros_like(grad)
    return tf.where(cond, grad, zeros)


@function.Defun(tf.float32, python_grad_func=DoublySignGrad)
def DoublySign(x):
    return tf.sign(x)


x = tf.constant([10., 0., -10., -0.05, 0.05])
g = tf.Graph()
y = DoublySign(x)
dx = tf.gradients(y, x)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    y = sess.run(y)
    print(y)
    grad = sess.run(dx)
    print(grad)

[ 1.  0. -1. -1.  1.]
[array([0., 1., 0., 1., 1.], dtype=float32)]

