Adapter Implementation
The following classes define the common interfaces for all adapter methods and hold logic shared across adapter implementations. Newly added adapter methods should inherit from one of these classes.
- class adapters.AdapterLayerBase
Base class for all adaptation methods that require per-layer modules.
Make sure the ‘adapter_modules_name’ attribute is overridden in derived classes.
- abstract add_adapter(adapter_name: str, layer_idx: int) → bool
Adds a new adapter module to the layer.
- Parameters
adapter_name (str) – The name of the new adapter to add.
layer_idx (int) – The index of the adapter layer (this should be set once by the first adapter added and then kept fixed).
- Returns
True if the adapter was added, False otherwise.
- Return type
bool
- average_adapter(adapter_name: str, input_adapters: Dict[str, float], combine_strategy, **kwargs) → bool
Averages a set of adapter modules into a new adapter module.
- Parameters
adapter_name (str) – The name of the new (averaged) adapter module to add.
input_adapters (Dict[str, float]) – Dictionary of adapter names and their corresponding weights.
combine_strategy (str) – The strategy to combine the adapters. Available strategies depend on the used adapter method, see: https://docs.adapterhub.ml/adapter_composition.html#merging-adapters
**kwargs – Additional arguments that are specific to the combine_strategy. E.g. svd_rank for LoRA.
- Returns
True if the adapter was added, False otherwise.
- Return type
bool
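For a plain linear combine strategy, averaging boils down to a weighted sum of the input adapters' parameters. The following is a minimal sketch of that computation over plain parameter dictionaries; the function name and data layout are illustrative, not the library's internals:

```python
def linear_average(input_adapters, params_by_name):
    """Weighted sum of per-adapter parameter dicts.

    input_adapters: Dict[str, float] mapping adapter name -> averaging weight.
    params_by_name: Dict[str, Dict[str, list]] mapping adapter name -> parameters.
    """
    avg = {}
    for name, weight in input_adapters.items():
        for param_key, values in params_by_name[name].items():
            acc = avg.setdefault(param_key, [0.0] * len(values))
            for i, v in enumerate(values):
                acc[i] += weight * v
    return avg

avg = linear_average(
    {"a": 0.25, "b": 0.75},
    {"a": {"w": [1.0, 2.0]}, "b": {"w": [3.0, 4.0]}},
)
# "w" becomes 0.25 * [1, 2] + 0.75 * [3, 4] = [2.5, 3.5]
```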
- delete_adapter(adapter_name: str)
Deletes an adapter module from the layer.
- Parameters
adapter_name (str) – The name of the adapter to delete.
- enable_adapters(adapter_setup: AdapterCompositionBlock, unfreeze_adapters: bool, unfreeze_fusion: bool)
Enables/disables a set of adapter modules within the layer.
- Parameters
adapter_setup (AdapterCompositionBlock) – The adapter setup to enable/disable.
unfreeze_adapters (bool) – Whether to unfreeze the adapters.
unfreeze_fusion (bool) – Whether to unfreeze the fusion layers.
- freeze_adapter(adapter_name: str, freeze: bool = True)
Freezes/unfreezes an adapter module.
- Parameters
adapter_name (str) – The name of the adapter to freeze/unfreeze.
freeze (bool, optional) – Whether to freeze the adapter. Defaults to True.
- get_adapter(adapter_name: str) → Module
Returns the adapter module with the given name.
- Parameters
adapter_name (str) – The name of the adapter module.
- pre_save_adapters()
Called before saving the adapters to disk.
- class adapters.ComposableAdapterLayerBase(*args, **kwargs)
Base class for all adapter methods that support composition.
Make sure the ‘adapter_modules_name’ and ‘supported_compositions’ attributes as well as all abstract methods are overridden in derived classes. ‘allow_multi_parallelize’ can be set to True to allow inputs to be parallelized independently multiple times. This is useful when there are multiple parallel input flows through an adapter layer (e.g. in LoRA).
- check_composition_valid(parent: AdapterCompositionBlock, child: AdapterCompositionBlock, lvl: int)
Checks whether the given composition is valid.
- Parameters
parent (AdapterCompositionBlock) – The parent composition block.
child (AdapterCompositionBlock) – The child composition block.
lvl (int) – The composition depth.
- Raises
ValueError – If the composition is invalid.
- compose(adapter_setup: Union[AdapterCompositionBlock, str], state: NamedTuple) → NamedTuple
The main composition forward method, which recursively calls the composition blocks' forward methods. This method should be called by the forward method of the derived class.
- Parameters
adapter_setup (Union[AdapterCompositionBlock, str]) – The adapter setup to be used.
state (NamedTuple) – The current state.
- Returns
The state after forwarding through the adapter setup.
- Return type
NamedTuple
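The recursive dispatch can be pictured with a toy stand-in: composition blocks route to compose_* handlers, and plain strings fall through to compose_single. This sketch uses deliberately simplified types (a minimal Stack and State), not the library's actual classes:

```python
from typing import List, NamedTuple, Union

class Stack(NamedTuple):
    # Toy stand-in for the Stack composition block.
    children: List[Union["Stack", str]]

class State(NamedTuple):
    # Toy state: records the adapters it has passed through.
    trace: List[str]

def compose_single(adapter_name: str, state: State, lvl: int = 0) -> State:
    # A real implementation would run the adapter module's forward pass.
    return State(trace=state.trace + [adapter_name])

def compose(adapter_setup: Union[Stack, str], state: State, lvl: int = 0) -> State:
    if isinstance(adapter_setup, str):
        return compose_single(adapter_setup, state, lvl)
    # Stack: feed the output of each child into the next one.
    for child in adapter_setup.children:
        state = compose(child, state, lvl + 1)
    return state

out = compose(Stack(["a", Stack(["b", "c"])]), State(trace=[]))
# out.trace == ["a", "b", "c"]
```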
- compose_average(adapter_setup: Average, state: NamedTuple, lvl: int = 0)
For averaging the output representations of multiple adapters.
- compose_batch_split(adapter_setup: BatchSplit, state: NamedTuple, lvl: int = 0)
For splitting to multiple adapters along the batch size dimension.
- compose_fuse(adapter_setup: Fuse, state: NamedTuple, lvl: int = 0)
For fusing multiple adapters using adapter fusion. NOTE: This method has no default implementation.
- compose_parallel(adapter_setup: Parallel, state: NamedTuple, lvl: int = 0)
For parallel execution of the adapters on the same input. This means that the input is repeated N times before feeding it to the adapters (where N is the number of adapters).
- abstract compose_single(adapter_setup: str, state: NamedTuple, lvl: int = 0) → NamedTuple
Forwards the given state through the given single adapter.
- Parameters
adapter_setup (str) – The name of the adapter.
state (NamedTuple) – The state to be forwarded.
lvl (int, optional) – The composition depth. Defaults to 0.
- Returns
The state after forwarding through the adapter.
- Return type
NamedTuple
- compose_split(adapter_setup: Split, state: NamedTuple, lvl: int = 0)
For splitting to multiple adapters along the sequence length dimension. NOTE: This method has no default implementation.
- compose_stack(adapter_setup: Stack, state: NamedTuple, lvl: int = 0) → NamedTuple
For sequentially stacking multiple adapters.
- abstract mean(states: List[NamedTuple], weights: Tensor) → NamedTuple
Averages the given states along the batch size dimension by the given weights. This is e.g. used by the Average composition block. IMPORTANT: Has to be implemented by all derived classes.
- Parameters
states (List[NamedTuple]) – The states to be averaged.
weights (torch.Tensor) – The averaging weights.
- Returns
The averaged state.
- Return type
NamedTuple
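A possible implementation for a state with a single 2D hidden-states field, sketched here with nested Python lists instead of torch tensors (a real derived class would stack the tensors and compute the weighted sum with torch ops):

```python
from typing import List, NamedTuple

class State(NamedTuple):
    hidden_states: List[List[float]]  # [batch, hidden] as nested lists

def mean(states: List[State], weights: List[float]) -> State:
    # Weighted elementwise average across the given states.
    batch = len(states[0].hidden_states)
    hidden = len(states[0].hidden_states[0])
    avg = [
        [
            sum(w * s.hidden_states[b][h] for s, w in zip(states, weights))
            for h in range(hidden)
        ]
        for b in range(batch)
    ]
    return State(hidden_states=avg)

s = mean([State([[1.0, 2.0]]), State([[3.0, 6.0]])], [0.5, 0.5])
# s.hidden_states == [[2.0, 4.0]]
```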
- abstract pad_and_concat(states: List[NamedTuple]) → NamedTuple
Concatenates the given states along the batch size dimension. Pads the states before concatenation if necessary. This is e.g. used by the BatchSplit and Parallel composition blocks. IMPORTANT: Has to be implemented by all derived classes.
- Parameters
states (List[NamedTuple]) – The states to be concatenated.
- Returns
The concatenated state.
- Return type
NamedTuple
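A sketch of the expected behavior, again over nested lists as a stand-in for tensors: shorter sequences are right-padded to the longest length, then all batches are concatenated. The zero padding value is an assumption for illustration:

```python
from typing import List, NamedTuple

class State(NamedTuple):
    hidden_states: List[List[float]]  # [batch, seq] as nested lists

def pad_and_concat(states: List[State]) -> State:
    # Right-pad every row to the longest sequence length, then
    # concatenate the batches of all states.
    max_len = max(len(row) for s in states for row in s.hidden_states)
    rows = [
        row + [0.0] * (max_len - len(row))
        for s in states
        for row in s.hidden_states
    ]
    return State(hidden_states=rows)

out = pad_and_concat([State([[1.0, 2.0]]), State([[3.0]])])
# out.hidden_states == [[1.0, 2.0], [3.0, 0.0]]
```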
- pre_block(adapter_setup: Union[AdapterCompositionBlock, str], state: NamedTuple) → NamedTuple
Optional state pre-processing method which is invoked before passing the state to the first child block of a composition. By default, this method does not contain any logic. Used, e.g., by bottleneck adapters to implement residuals and layer norms.
- Parameters
adapter_setup (Union[AdapterCompositionBlock, str]) – The current composition or single adapter.
state (NamedTuple) – The current state.
- Returns
The pre-processed state.
- Return type
NamedTuple
- abstract repeat(state: NamedTuple, channels: int) → NamedTuple
Repeats the given state along the batch size dimension for the given number of times. This is e.g. used by the Parallel composition block. IMPORTANT: Has to be implemented by all derived classes.
- Parameters
state (NamedTuple) – The state to be repeated.
channels (int) – The number of times the state should be repeated.
- Returns
The repeated state.
- Return type
NamedTuple
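In the toy list-based representation used above, repeating along the batch dimension is simply tiling the rows (a tensor implementation would use the corresponding torch repeat operation):

```python
from typing import List, NamedTuple

class State(NamedTuple):
    hidden_states: List[List[float]]  # [batch, hidden] as nested lists

def repeat(state: State, channels: int) -> State:
    # Tile the whole batch `channels` times along the batch dimension.
    return State(hidden_states=state.hidden_states * channels)

out = repeat(State([[1.0], [2.0]]), 2)
# out.hidden_states == [[1.0], [2.0], [1.0], [2.0]]
```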
- abstract vslice(state: NamedTuple, slice_obj: slice) → NamedTuple
Slices the given state along the batch size (vertical) dimension. This is e.g. used by the BatchSplit and Parallel composition blocks. IMPORTANT: Has to be implemented by all derived classes.
- Parameters
state (NamedTuple) – The state to be sliced.
slice_obj (slice) – The slice object.
- Returns
The sliced state.
- Return type
NamedTuple
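The same toy representation makes vslice a one-liner: the slice object is applied to every batch-indexed field of the state (a real implementation would slice each tensor field):

```python
from typing import List, NamedTuple

class State(NamedTuple):
    hidden_states: List[List[float]]  # [batch, hidden] as nested lists

def vslice(state: State, slice_obj: slice) -> State:
    # Apply the slice along the batch (vertical) dimension.
    return State(hidden_states=state.hidden_states[slice_obj])

out = vslice(State([[1.0], [2.0], [3.0]]), slice(1, 3))
# out.hidden_states == [[2.0], [3.0]]
```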