#50 What is the purpose of the clipnorm gradient-clipping method used in the code?

Closed
created 1 year ago by ZhangY · 5 comments
ZhangY commented 1 year ago
Owner
Gradient clipping is generally used to deal with the exploding-gradient problem. Exploding gradients show up especially often when training RNNs (recurrent neural networks), so RNN training code almost always passes this parameter.
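In Keras this is exactly what the `clipnorm` argument on an optimizer does: each gradient tensor whose L2 norm exceeds the threshold is rescaled down to it before the update step. A minimal sketch (the threshold 1.0 and the tiny model are illustrative, not from the original code):

```python
from tensorflow import keras

# clipnorm=1.0 rescales any gradient tensor whose L2 norm exceeds 1.0
# down to norm 1.0 before the optimizer applies the update.
optimizer = keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)

model = keras.Sequential([keras.layers.SimpleRNN(32), keras.layers.Dense(1)])
model.compile(optimizer=optimizer, loss="mse")
```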
ZhangY commented 1 year ago
Owner
There are two common ways to do gradient clipping:

1. Clip each parameter's gradient directly by value.
2. Clip based on the L2 norm of the vector formed by the gradients of several parameters.
ZhangY commented 1 year ago
Owner
The first approach fixes a range for the gradient, e.g. (-1, 1): any gradient smaller than -1 is set to -1, and any gradient larger than 1 is set to 1.
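The second approach instead rescales the whole gradient vector when its L2 norm exceeds a threshold, so the update direction is preserved and only its length shrinks. A minimal sketch using TensorFlow's `tf.clip_by_global_norm` (the toy gradients and the threshold 5.0 are only illustrative):

```python
import tensorflow as tf

# Method 2: rescale all gradients together so their combined L2 norm
# is at most clip_norm; directions are preserved, only the length shrinks.
grads = [tf.constant([3.0, 4.0]), tf.constant([0.0, 12.0])]  # toy gradients
clipped, global_norm = tf.clip_by_global_norm(grads, clip_norm=5.0)
# global_norm here is sqrt(9 + 16 + 144) = 13.0,
# so every gradient is scaled by 5/13.
```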
ZhangY commented 1 year ago
Owner
As for what gradient clipping actually buys you: without it, an overly large gradient makes the optimizer take a step that jumps right past the optimum, as the toy example below shows.
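A toy NumPy illustration (the function and step size are chosen purely to show the effect): minimizing f(x) = 10x² with too large a learning rate diverges without clipping, while the value-clipped run stays bounded and converges.

```python
import numpy as np

def grad(x):
    return 20.0 * x          # gradient of f(x) = 10 * x**2, minimum at x = 0

lr = 0.2
x_plain = x_clipped = 1.0
for _ in range(5):
    x_plain   -= lr * grad(x_plain)                    # steps grow each iteration
    x_clipped -= lr * np.clip(grad(x_clipped), -1, 1)  # step capped at lr * 1 = 0.2

print(x_plain)    # -243.0: overshoots further and further past the optimum
print(x_clipped)  # ~0: bounded steps walk down to the minimum
```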
ZhangY commented 1 year ago
Owner
In some frameworks, gradient clipping is also configured on the Optimizer. In TensorFlow (1.x) it looks like this:

```python
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
gvs = optimizer.compute_gradients(cost)
# Clip every gradient to [-1, 1] before the update is applied
capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
train_op = optimizer.apply_gradients(capped_gvs)
```
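For completeness, the same per-value clipping in TensorFlow 2.x is usually written with a `GradientTape`; a rough sketch with a dummy model and batch (none of this is from the original code):

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
x = tf.random.normal((8, 4))   # dummy batch, just for illustration
y = tf.random.normal((8, 1))

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x) - y))
grads = tape.gradient(loss, model.trainable_variables)
grads = [tf.clip_by_value(g, -1.0, 1.0) for g in grads]  # same [-1, 1] clipping
optimizer.apply_gradients(zip(grads, model.trainable_variables))
```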
ZhangY closed this issue 1 year ago