chemprop.schedulers
===================

.. py:module:: chemprop.schedulers


Functions
---------

.. autoapisummary::

   chemprop.schedulers.build_NoamLike_LRSched


Module Contents
---------------

.. py:function:: build_NoamLike_LRSched(optimizer, warmup_steps, cooldown_steps, init_lr, max_lr, final_lr)

   Build a Noam-like learning rate scheduler that applies a linear warmup followed
   by an exponential decay.

   The learning rate increases linearly from ``init_lr`` to ``max_lr`` over the first
   ``warmup_steps`` steps, then decreases exponentially to ``final_lr`` over the remaining
   ``total_steps - warmup_steps`` steps (where ``total_steps = total_epochs * steps_per_epoch``).
   In terms of this function's parameters, that decay interval is ``cooldown_steps`` steps long.
   This is loosely based on the learning rate schedule from [1]_, section 5.3.

   Formally, the learning rate schedule is defined as:

   .. math::
       \mathtt{lr}(i) &=
           \begin{cases}
               \mathtt{init\_lr} + \delta \cdot i &\text{if } i < \mathtt{warmup\_steps} \\
               \mathtt{max\_lr} \cdot \left( \frac{\mathtt{final\_lr}}{\mathtt{max\_lr}} \right)^{\gamma(i)} &\text{otherwise} \\
           \end{cases}
       \\
       \delta &\mathrel{:=}
           \frac{\mathtt{max\_lr} - \mathtt{init\_lr}}{\mathtt{warmup\_steps}} \\
       \gamma(i) &\mathrel{:=}
           \frac{i - \mathtt{warmup\_steps}}{\mathtt{total\_steps} - \mathtt{warmup\_steps}}
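
   As a rough illustration of the formula above, the following pure-Python sketch evaluates the
   schedule at a given step. The helper ``noam_like_lr`` is hypothetical and not part of
   ``chemprop``. It assumes that ``total_steps - warmup_steps`` equals ``cooldown_steps`` and that
   the rate is held at ``final_lr`` once the cooldown has finished. The actual scheduler may
   differ in these details.

   .. code-block:: python

      def noam_like_lr(step, warmup_steps, cooldown_steps, init_lr, max_lr, final_lr):
          """Evaluate the piecewise schedule at ``step`` (illustrative helper only)."""
          if step < warmup_steps:
              # Linear warmup: init_lr + delta * i
              delta = (max_lr - init_lr) / warmup_steps
              return init_lr + delta * step
          # Exponential decay: max_lr * (final_lr / max_lr) ** gamma(i)
          # Assumption: total_steps - warmup_steps == cooldown_steps.
          gamma = (step - warmup_steps) / cooldown_steps
          # Assumption: hold the rate at final_lr after the cooldown ends.
          gamma = min(gamma, 1.0)
          return max_lr * (final_lr / max_lr) ** gamma


      # Example values chosen only for illustration.
      steps = (0, 1, 2, 4, 6, 10)
      lrs = [noam_like_lr(i, 2, 4, 1e-4, 1e-3, 1e-4) for i in steps]

   With these example values the rate starts at ``init_lr``, reaches ``max_lr`` at step 2 (the end
   of the warmup), and settles at ``final_lr`` from step 6 onward.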


   :param optimizer: A PyTorch optimizer.
   :type optimizer: Optimizer
   :param warmup_steps: The number of steps during which to linearly increase the learning rate.
   :type warmup_steps: int
   :param cooldown_steps: The number of steps during which to exponentially decay the learning rate.
   :type cooldown_steps: int
   :param init_lr: The initial learning rate.
   :type init_lr: float
   :param max_lr: The maximum learning rate (achieved after ``warmup_steps``).
   :type max_lr: float
   :param final_lr: The final learning rate (achieved after ``cooldown_steps`` steps of decay following the warmup).
   :type final_lr: float

   .. rubric:: References

   .. [1] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. and Polosukhin, I. "Attention is all you need." Advances in neural information processing systems, 2017, 30. https://arxiv.org/abs/1706.03762