Gradient clipping is generally used to address the exploding-gradient problem. Exploding gradients occur especially often when training RNNs (recurrent neural networks), so RNN training almost always enables this option.
There are two common ways to do gradient clipping:
1. Clip each parameter's gradient directly by its value.
2. Clip based on the L2 norm of the vector formed by the gradients of several parameters.
The first method fixes a range for the gradient, e.g. (-1, 1): any gradient smaller than -1 is set to -1, and any gradient larger than 1 is set to 1.
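As a minimal sketch of this first, value-based method (the gradient values here are made up for illustration), TensorFlow's tf.clip_by_value implements exactly this element-wise rule:

import tensorflow as tf

grads = tf.constant([-2.0, 0.5, 3.0])       # hypothetical gradient values
clipped = tf.clip_by_value(grads, -1., 1.)  # element-wise clamp to [-1, 1]
# clipped == [-1.0, 0.5, 1.0]

(Under TensorFlow 2's eager mode the result is available immediately; under the 1.x graph API you would evaluate it in a session.)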
As for what gradient clipping accomplishes: without it, an overly large gradient can make the optimization algorithm step right past the optimum.
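A tiny one-dimensional sketch in plain Python (all numbers hypothetical) of what such an overshoot looks like:

lr = 0.01
x = 10.0                            # f(x) = x**4 has its minimum at x = 0
g = 4 * x**3                        # f'(10) = 4000: an exploded gradient
print(x - lr * g)                   # -30.0: the update flies far past the minimum
g_clipped = max(-1.0, min(1.0, g))  # value clipping to [-1, 1]
print(x - lr * g_clipped)           # 9.99: a small, stable step toward the minimum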
In some frameworks, gradient clipping is likewise configured as part of the Optimizer setup. In TensorFlow (1.x API), for example:
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
gvs = optimizer.compute_gradients(cost)  # list of (gradient, variable) pairs
# Clamp each gradient to [-1, 1] element-wise; skip variables with no gradient.
capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs if grad is not None]
train_op = optimizer.apply_gradients(capped_gvs)
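For the second, norm-based method, a minimal sketch along the same lines uses tf.clip_by_global_norm (the threshold 5.0 is an arbitrary choice for illustration):

grads, variables = zip(*optimizer.compute_gradients(cost))
clipped_grads, _ = tf.clip_by_global_norm(grads, 5.0)  # rescale so the global L2 norm is at most 5.0
train_op = optimizer.apply_gradients(list(zip(clipped_grads, variables)))

Unlike per-element value clipping, global-norm clipping rescales all gradients by a common factor, so the overall direction of the gradient is preserved.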