ICML 18 10-26-2020
The original BPTT is not efficient because it requires a matrix inverse.
Instead, this paper uses a Neumann series to approximate the inverse matrix, which reduces the cost.
This is similar to the approach in the hyperparameter optimization paper.
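
The Neumann series idea can be sketched in a few lines. For a matrix J whose spectral radius is below 1, (I - J)^{-1} v equals v + J v + J^2 v + ..., so truncating the sum gives an approximation that needs only matrix-vector products. This is a generic illustration, not the paper's algorithm: the names `matvec` and `neumann_solve`, the 2x2 matrix, and the number of terms are all assumptions made for the example.

```python
def matvec(J, v):
    """Multiply a square matrix J (given as a list of rows) by a vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in J]


def neumann_solve(J, v, num_terms):
    """Approximate (I - J)^{-1} v with the truncated Neumann series
    v + J v + J^2 v + ... + J^num_terms v.

    Only matrix-vector products are used, so no inverse is ever formed.
    Assumes the spectral radius of J is below 1 so the series converges.
    """
    term = list(v)    # current term J^k v, starting at k = 0
    total = list(v)   # running sum of the series
    for _ in range(num_terms):
        term = matvec(J, term)
        total = [t + s for t, s in zip(total, term)]
    return total


# Hypothetical 2x2 example, chosen only for illustration.
J = [[0.5, 0.0],
     [0.0, 0.25]]
v = [1.0, 1.0]
approx = neumann_solve(J, v, num_terms=50)
```

Here the exact answer is [2, 4/3], and `approx` gets close to it as `num_terms` grows. Fewer terms trade accuracy for cost, which is the knob the truncation exposes.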