Models
- class todd.models.CheckMixin
Mixin to perform a check before the first forward pass of a module.
You should implement check in your subclass:

>>> class Model(CheckMixin):
...     def check(
...         self,
...         module: nn.Module,
...         args: tuple[Any],
...         kwargs: dict[str, Any],
...     ) -> None:
...         print(
...             f"Checking {module.__class__.__name__!r} with "
...             f"{args=} and {kwargs=}"
...         )
...     def forward(self, *args, **kwargs) -> None:
...         print(f"Forwarding with {args=} and {kwargs=}")

The check method executes prior to __call__:

>>> model = Model()
>>> model(1, 2, a=3, b=4)
Checking 'Model' with args=(1, 2) and kwargs={'a': 3, 'b': 4}
Forwarding with args=(1, 2) and kwargs={'a': 3, 'b': 4}

If Store.DRY_RUN is False, the check method executes only once:

>>> Store.DRY_RUN
False
>>> model(1, a=2)
Forwarding with args=(1,) and kwargs={'a': 2}

If Store.DRY_RUN is True, the check method executes every time __call__ is invoked:

>>> Store.DRY_RUN = True
>>> model = Model()
>>> model(1, 2, a=3, b=4)
Checking 'Model' with args=(1, 2) and kwargs={'a': 3, 'b': 4}
Forwarding with args=(1, 2) and kwargs={'a': 3, 'b': 4}
>>> model(1, a=2)
Checking 'Model' with args=(1,) and kwargs={'a': 2}
Forwarding with args=(1,) and kwargs={'a': 2}
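The check-once behaviour described for CheckMixin can be pictured with a dependency-free sketch. This is not todd's implementation: CheckOnce and Flags are hypothetical stand-ins (Flags.DRY_RUN plays the role of Store.DRY_RUN), and the real mixin operates on nn.Module objects rather than a plain class.

```python
from typing import Any


class Flags:
    """Hypothetical stand-in for a global store holding the dry-run flag."""

    DRY_RUN = False


class CheckOnce:
    """Run check before forward; skip it after the first call unless dry-running."""

    def __init__(self) -> None:
        self._checked = False
        self.log: list[str] = []

    def check(self, args: tuple[Any, ...], kwargs: dict[str, Any]) -> None:
        self.log.append(f"check {args} {kwargs}")

    def forward(self, *args: Any, **kwargs: Any) -> None:
        self.log.append(f"forward {args} {kwargs}")

    def __call__(self, *args: Any, **kwargs: Any) -> None:
        # Check on the first call, or on every call while dry-running.
        if Flags.DRY_RUN or not self._checked:
            self.check(args, kwargs)
            self._checked = True
        return self.forward(*args, **kwargs)


model = CheckOnce()
model(1, a=2)  # checked, then forwarded
model(3)       # forwarded only
first_run = list(model.log)

Flags.DRY_RUN = True
model.log.clear()
model(4)  # checked again because of the dry-run flag
model(5)
dry_run = list(model.log)
Flags.DRY_RUN = False
```

Keeping the flag outside the instance means a single switch re-enables checks for every model at once, which matches how the doctest toggles Store.DRY_RUN globally.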
- class todd.models.ConvNeXtRegistry
Bases: TorchVisionRegistry
- data = {'ConvNeXt': <class 'torchvision.models.convnext.ConvNeXt'>, 'convnext_base': <function convnext_base>, 'convnext_large': <function convnext_large>, 'convnext_small': <function convnext_small>, 'convnext_tiny': <function convnext_tiny>}
- class todd.models.EvalMixin
Bases: InitWeightsMixin, BuildPreHookMixin, CheckMixin
A mixin class that provides evaluation functionality for a model.
This mixin class is intended to be used as a base class for models that must keep some of their modules in evaluation mode. It provides methods for checking that those modules are in evaluation mode and for keeping them there when the model is toggled between training and evaluation modes.
- Parameters:
eval_ – A filter specifying the modules to be kept in evaluation mode.
To use the mixin class, first define a model:

>>> class Eval(EvalMixin):
...     def __init__(self, *args, **kwargs):
...         super().__init__(*args, **kwargs)
...         self.conv = nn.Conv2d(1, 2, 3)
...         self.bn = nn.BatchNorm2d(2)
...     def forward(self) -> None:
...         pass

Set up a module filter:

>>> nmf = NamedModulesFilter(name='bn')

Construct the model:

>>> model = Eval(eval_=nmf)
>>> model.conv.training
True
>>> model.bn.training
True
Use init_weights to put the filtered modules into evaluation mode:

>>> model.init_weights(Config())
True
>>> model.conv.training
True
>>> model.bn.training
False

Use the check method to verify that the filtered modules are still in evaluation mode:

>>> model.check(model, tuple(), dict())
>>> model.bn.training = True
>>> model.check(model, tuple(), dict())
Traceback (most recent call last):
    ...
RuntimeError: bn is in training mode

Use train to enforce the requirement:

>>> _ = model.train()
>>> model.conv.training
True
>>> model.bn.training
False
>>> _ = model.eval()
>>> model.conv.training
False
>>> model.bn.training
False

The model can be used as a component in other models:

>>> sequential = nn.Sequential(model)
>>> _ = sequential.train()
>>> sequential[0].conv.training
True
>>> sequential[0].bn.training
False
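The train behaviour shown above can be approximated in plain PyTorch by overriding train. The following is a minimal sketch under stated assumptions, not todd's implementation: PinnedEval and its eval_names argument are hypothetical, and a set of module names stands in for NamedModulesFilter.

```python
import torch.nn as nn


class PinnedEval(nn.Module):
    """Hypothetical helper: submodules named in eval_names stay in eval mode."""

    def __init__(self, eval_names: set[str]) -> None:
        super().__init__()
        self.conv = nn.Conv2d(1, 2, 3)
        self.bn = nn.BatchNorm2d(2)
        # Names are relative to this module; the root name '' is not supported,
        # because calling eval() on the root would re-enter this override.
        self._eval_names = eval_names

    def train(self, mode: bool = True) -> "PinnedEval":
        # Let PyTorch switch every submodule first ...
        super().train(mode)
        # ... then put the pinned submodules back into evaluation mode.
        for name, module in self.named_modules():
            if name in self._eval_names:
                module.eval()
        return self


model = PinnedEval({"bn"})
model.train()
states = {name: m.training for name, m in model.named_modules()}
```

Because nn.Module.train calls train on each child module, the override also takes effect when the model is nested inside a container such as nn.Sequential.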
- __init__(*args, eval_=None, **kwargs)
- Parameters:
eval_ (NamedModulesFilter | None)
- Return type:
None
- class todd.models.FilterRegistry
Bases: ModelRegistry
- data = {'NamedModulesFilter': <class 'todd.models.filters.named_module.NamedModulesFilter'>, 'NamedParametersFilter': <class 'todd.models.filters.named_parameter.NamedParametersFilter'>}
- class todd.models.FreezeMixin
Bases: NoGradMixin, EvalMixin
A mixin class that provides freezing functionality to a model.
Examples

>>> class Freeze(FreezeMixin):
...     def __init__(self, *args, **kwargs) -> None:
...         super().__init__(*args, **kwargs)
...         self.c = nn.Conv2d(1, 2, 3)
...         self.b = nn.BatchNorm2d(2)
>>> nmf = NamedModulesFilter(name='b')
>>> freeze = Freeze(freeze=nmf)
>>> freeze.init_weights(Config())
True
>>> {n: p.requires_grad for n, p in freeze.named_parameters()}
{'c.weight': True, 'c.bias': True, 'b.weight': False, 'b.bias': False}
>>> {n: m.training for n, m in freeze.named_modules()}
{'': True, 'c': True, 'b': False}
- __init__(*args, freeze=None, **kwargs)
- Parameters:
freeze (NamedModulesFilter | None)
- Return type:
None
- class todd.models.FrozenMixin
Bases: FreezeMixin
A mixin class that freezes an entire module.
This mixin class is used to create frozen modules, where all parameters are excluded from gradient computation and all modules are kept in evaluation mode.
Examples

>>> class Frozen(FrozenMixin):
...     def __init__(self) -> None:
...         super().__init__()
...         self.conv = nn.Conv2d(1, 2, 3)
>>> frozen = Frozen()
>>> frozen.init_weights(Config())
True
>>> {n: p.requires_grad for n, p in frozen.named_parameters()}
{'conv.weight': False, 'conv.bias': False}
>>> {n: m.training for n, m in frozen.named_modules()}
{'': False, 'conv': False}
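For comparison, the end state in the FrozenMixin example can be reached in plain PyTorch with two standard calls. The sketch below uses a hypothetical freeze helper; it is a one-shot operation and, unlike the behaviour shown in the EvalMixin doctests, does not survive a later train() call.

```python
import torch.nn as nn


def freeze(module: nn.Module) -> nn.Module:
    """Hypothetical one-shot freeze: no gradients, evaluation mode."""
    module.requires_grad_(False)  # exclude all parameters from gradient computation
    module.eval()                 # put all submodules in evaluation mode
    return module


frozen = freeze(nn.Sequential(nn.Conv2d(1, 2, 3), nn.BatchNorm2d(2)))
grad_flags = {n: p.requires_grad for n, p in frozen.named_parameters()}
modes = {n: m.training for n, m in frozen.named_modules()}

# A later train() call switches every submodule back to training mode.
frozen.train()
modes_after_train = {n: m.training for n, m in frozen.named_modules()}
```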
- class todd.models.LossRegistry
Bases: ModelRegistry
- data = {'BCELoss': <class 'todd.models.losses.functional.BCELoss'>, 'BCEWithLogitsLoss': <class 'todd.models.losses.functional.BCEWithLogitsLoss'>, 'CosineEmbeddingLoss': <class 'todd.models.losses.functional.CosineEmbeddingLoss'>, 'CrossEntropyLoss': <class 'todd.models.losses.functional.CrossEntropyLoss'>, 'FunctionalLoss': <class 'todd.models.losses.functional.FunctionalLoss'>, 'L1Loss': <class 'todd.models.losses.functional.L1Loss'>, 'MSELoss': <class 'todd.models.losses.functional.MSELoss'>}
- class todd.models.NoGradMixin
Bases: InitWeightsMixin, BuildPreHookMixin, CheckMixin
A mixin class that excludes specific parameters from gradient computation.
This mixin class is designed to be a base class for models that need certain parameters to be excluded from gradient computation. It offers methods to check the model’s parameters and to filter the state dictionary so that it excludes frozen parameters.
- Parameters:
no_grad – A filter specifying the parameters to be excluded from gradient computation.
filter_state_dict – A flag controlling whether to filter the state dictionary to exclude frozen parameters.
Given a model that inherits from this mixin class:

>>> class NoGrad(NoGradMixin):
...     def __init__(self, *args, **kwargs):
...         super().__init__(*args, **kwargs)
...         self.conv = nn.Conv2d(1, 2, 3)
...     def forward(self) -> None:
...         pass

Users can specify parameters to be excluded from gradient computation via a filter:

>>> npf = NamedParametersFilter(name='conv.weight')
>>> model = NoGrad(no_grad=npf)

The exclusion of parameters from gradient computation does not happen immediately after the model is constructed:

>>> {n: p.requires_grad for n, p in model.named_parameters()}
{'conv.weight': True, 'conv.bias': True}

Instead, the exclusion is triggered by calling init_weights:

>>> _ = model.init_weights(Config())
>>> {n: p.requires_grad for n, p in model.named_parameters()}
{'conv.weight': False, 'conv.bias': True}

or by calling requires_grad_:

>>> _ = model.requires_grad_()
>>> {n: p.requires_grad for n, p in model.named_parameters()}
{'conv.weight': False, 'conv.bias': True}
>>> _ = model.requires_grad_(False)
>>> {n: p.requires_grad for n, p in model.named_parameters()}
{'conv.weight': False, 'conv.bias': False}

Note that parameters that should be excluded from gradient computation can still end up requiring gradients. A typical example is when the model is used as a component in another model and requires_grad_ is called on the outer model:

>>> sequential = nn.Sequential(model)
>>> _ = sequential.requires_grad_()
>>> {n: p.requires_grad for n, p in sequential.named_parameters()}
{'0.conv.weight': True, '0.conv.bias': True}

To catch this, the check method can be used to verify that parameters are correctly excluded from gradient computation:

>>> model.check(model, tuple(), dict())
Traceback (most recent call last):
    ...
RuntimeError: conv.weight requires grad
>>> _ = model.requires_grad_()
>>> model.check(model, tuple(), dict())
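
The pitfall above follows from how nn.Module.requires_grad_ works: a parent module sets the flag on each parameter directly, so an override on a child is never called. The sketch below reproduces this in plain PyTorch under stated assumptions: PinnedNoGrad, its no_grad_names argument, and the zero-argument check are hypothetical simplifications, not todd's API.

```python
import torch.nn as nn


class PinnedNoGrad(nn.Module):
    """Hypothetical helper: parameters named in no_grad_names never require grad."""

    def __init__(self, no_grad_names: set[str]) -> None:
        super().__init__()
        self.conv = nn.Conv2d(1, 2, 3)
        self._no_grad_names = no_grad_names

    def requires_grad_(self, requires_grad: bool = True) -> "PinnedNoGrad":
        super().requires_grad_(requires_grad)
        # Re-apply the exclusion after PyTorch has toggled every parameter.
        for name, param in self.named_parameters():
            if name in self._no_grad_names:
                param.requires_grad_(False)
        return self

    def check(self) -> None:
        # Fail loudly if an excluded parameter has been re-enabled.
        for name, param in self.named_parameters():
            if name in self._no_grad_names and param.requires_grad:
                raise RuntimeError(f"{name} requires grad")


def grad_flags(module: nn.Module) -> dict[str, bool]:
    return {n: p.requires_grad for n, p in module.named_parameters()}


model = PinnedNoGrad({"conv.weight"})
model.requires_grad_()
direct = grad_flags(model)

# The parent toggles each parameter itself; the child's override is bypassed.
nn.Sequential(model).requires_grad_()
via_parent = grad_flags(model)

try:
    model.check()
    error = None
except RuntimeError as exc:
    error = str(exc)
```

A check that runs before the forward pass is one way to surface the problem early, since the override alone cannot intercept calls made on an enclosing module.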

Refer to CheckMixin for more information on the check method.
By default, the state dictionary includes all parameters:

>>> list(model.state_dict())
['conv.weight', 'conv.bias']

However, users often want to exclude frozen parameters from the state dictionary. This can be achieved by setting filter_state_dict to True:

>>> model = NoGrad(no_grad=npf, filter_state_dict=True)
>>> list(model.state_dict())
['conv.bias']

State dictionary filtering works even if the model is used as a component in another model:

>>> sequential = nn.Sequential(model)
>>> list(sequential.state_dict())
['0.conv.bias']
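
One way to get comparable filtering in plain PyTorch is to override state_dict and drop the entries of parameters that do not require gradients. This is a sketch under stated assumptions, not how todd implements filter_state_dict: FilteredStateDict is a hypothetical class, filtering keys off requires_grad instead of a NamedParametersFilter, and the override assumes PyTorch 2.x, where parent modules pass prefix as a keyword argument.

```python
import torch.nn as nn


class FilteredStateDict(nn.Module):
    """Hypothetical helper: omit parameters that do not require grad."""

    def __init__(self) -> None:
        super().__init__()
        self.conv = nn.Conv2d(1, 2, 3)

    def state_dict(self, *args, prefix: str = "", **kwargs):
        state = super().state_dict(*args, prefix=prefix, **kwargs)
        # Keys are stored as prefix + parameter name, so remove them that way.
        for name, param in self.named_parameters():
            if not param.requires_grad:
                state.pop(prefix + name, None)
        return state


model = FilteredStateDict()
model.conv.weight.requires_grad_(False)

own_keys = list(model.state_dict())
# A parent module calls the child's state_dict with a prefix such as '0.'.
nested_keys = list(nn.Sequential(model).state_dict())
```

A state dictionary filtered like this is missing keys, so loading it with load_state_dict in plain PyTorch would need strict=False or a separate source for the frozen weights.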
- __init__(*args, no_grad=None, filter_state_dict=False, **kwargs)
- Parameters:
no_grad (NamedParametersFilter | None)
filter_state_dict (bool)
- Return type:
None
- class todd.models.NormRegistry
Bases: ModelRegistry
- data = {'AdaGN': <class 'todd.models.norms.AdaptiveGroupNorm'>, 'AdaLN': <class 'todd.models.norms.AdaptiveLayerNorm'>, 'BN': <class 'torch.nn.modules.batchnorm.BatchNorm2d'>, 'BN1d': <class 'torch.nn.modules.batchnorm.BatchNorm1d'>, 'BN2d': <class 'torch.nn.modules.batchnorm.BatchNorm2d'>, 'BN3d': <class 'torch.nn.modules.batchnorm.BatchNorm3d'>, 'GN': <class 'torch.nn.modules.normalization.GroupNorm'>, 'IN': <class 'torch.nn.modules.instancenorm.InstanceNorm2d'>, 'IN1d': <class 'torch.nn.modules.instancenorm.InstanceNorm1d'>, 'IN2d': <class 'torch.nn.modules.instancenorm.InstanceNorm2d'>, 'IN3d': <class 'torch.nn.modules.instancenorm.InstanceNorm3d'>, 'LN': <class 'torch.nn.modules.normalization.LayerNorm'>, 'SyncBN': <class 'torch.nn.modules.batchnorm.SyncBatchNorm'>}
- class todd.models.ShadowRegistry
Bases: ModelRegistry
- data = {'EMAShadow': <class 'todd.models.shadows.ema.EMAShadow'>}
- class todd.models.TorchVisionRegistry
Bases: ModelRegistry
- data = {}
- class todd.models.ViTRegistry
Bases: TorchVisionRegistry
- data = {'VisionTransformer': <class 'torchvision.models.vision_transformer.VisionTransformer'>, 'vit_b_16': <function vit_b_16>, 'vit_b_32': <function vit_b_32>, 'vit_h_14': <function vit_h_14>, 'vit_l_16': <function vit_l_16>, 'vit_l_32': <function vit_l_32>}