vllm.v1.sample.logits_processor.interface ¶
AddedRequest module-attribute ¶
BatchUpdate dataclass ¶
Describes changes to the persistent batch state, for use by logits processors.
Source code in vllm/v1/sample/logits_processor/interface.py
__init__ ¶
__init__(
    batch_size: int,
    removed: Sequence[RemovedRequest],
    added: Sequence[AddedRequest],
    moved: Sequence[MovedRequest],
) -> None
LogitsProcessor ¶
__init__ ¶
__init__(
    vllm_config: VllmConfig,
    device: device,
    is_pin_memory: bool,
) -> None
apply abstractmethod ¶
Apply LogitsProcessor to batch logits tensor.
The updated tensor must be returned but may be modified in-place.
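The return contract above can be illustrated with a minimal sketch. It uses plain nested lists in place of a tensor so that it runs without dependencies; the class name `BanTokensSketch` and its `banned` mapping are hypothetical and not part of vLLM, and a real subclass would operate on the batch logits tensor instead.

```python
import math


class BanTokensSketch:
    """Hypothetical stand-in, not a vLLM class."""

    def __init__(self) -> None:
        # Assumed layout: batch row index -> token ids to suppress.
        self.banned: dict[int, list[int]] = {}

    def apply(self, logits: list[list[float]]) -> list[list[float]]:
        # Modify the batch in place ...
        for row, token_ids in self.banned.items():
            for tok in token_ids:
                logits[row][tok] = -math.inf
        # ... and still return it, as the contract requires.
        return logits


proc = BanTokensSketch()
proc.banned[1] = [0, 2]
batch = [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0]]
out = proc.apply(batch)
```

Here `out` is the same object as `batch`, so a caller that always uses the return value behaves the same whether the processor mutates its input or allocates a new one.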
get_state_from_params abstractmethod ¶
get_state_from_params(
    params: SamplingParams,
    prompt_tok_ids: list[int],
    out_tok_ids: list[int],
) -> Optional[T]
Produce a minimal representation of the initial logits processor state for a newly added request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | SamplingParams | | required |
| prompt_tok_ids | list[int] | list of new request prompt token ids | required |
| out_tok_ids | list[int] | list of request generated tokens as of current engine step | required |

Returns:

| Type | Description |
|---|---|
| Optional[T] | instance of initial logits processor state representation |
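One possible shape for such a function is sketched below, written as a free function for brevity. `ParamsSketch` is a hypothetical stand-in for `SamplingParams` that models a single field, and treating `None` as "this request needs no state" is an assumption suggested by the `Optional[T]` return type rather than something stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParamsSketch:
    """Hypothetical stand-in for SamplingParams; only one field is modeled."""

    min_tokens: int = 0


def get_state_from_params(
    params: ParamsSketch,
    prompt_tok_ids: list[int],
    out_tok_ids: list[int],
) -> Optional[int]:
    # Tokens already generated count toward the minimum.
    remaining = params.min_tokens - len(out_tok_ids)
    if remaining <= 0:
        # Assumption: None signals that no state is needed for this request.
        return None
    # Minimal state: how many more tokens must still be generated.
    return remaining


state_a = get_state_from_params(ParamsSketch(min_tokens=4), [1, 2, 3], [7])
state_b = get_state_from_params(ParamsSketch(), [1, 2, 3], [])
```

In this sketch `state_a` is `3` and `state_b` is `None`; keeping the state to a single integer reflects the "minimal representation" wording of the description.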
is_argmax_invariant ¶
is_argmax_invariant() -> bool
Returns True if the logits processor has no impact on the argmax computation in greedy sampling, in which case the logits processor is optimized away in greedy sampling scenarios. The base-class default is False, but a subclass can override it.

NOTE: the value may or may not be the same for all instances of a given LogitsProcessor subclass, depending on the subclass implementation.
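To make "no impact on the argmax" concrete, the sketch below compares two generic transforms on a single row of logits. The helper names are hypothetical and nothing here is vLLM code; it only shows that a positive rescaling leaves the greedy choice unchanged, while masking a token can change it.

```python
import math


def argmax(row: list[float]) -> int:
    # Index of the largest logit, i.e. the greedy choice.
    return max(range(len(row)), key=row.__getitem__)


def rescale(row: list[float], factor: float) -> list[float]:
    # Multiplying by a positive factor preserves the ordering of logits.
    return [x * factor for x in row]


def mask_token(row: list[float], token_id: int) -> list[float]:
    # Forcing one logit to -inf can remove the current winner.
    return [-math.inf if i == token_id else x for i, x in enumerate(row)]


row = [0.5, 2.0, 1.0]
greedy_before = argmax(row)
greedy_rescaled = argmax(rescale(row, 1.5))
greedy_masked = argmax(mask_token(row, 1))
```

A processor that behaves like `rescale` could reasonably report itself as argmax invariant, whereas one that behaves like `mask_token` could not, because skipping it would change the greedy result.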
state_update_callback ¶
update_state ¶
update_state(batch_update: Optional[BatchUpdate]) -> None
Called when there are new output tokens, prior to each forward pass.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch_update | Optional[BatchUpdate] | Non-None iff there have been changes to the batch makeup. | required |
process_dict_updates ¶
process_dict_updates(
    req_entries: dict[int, T],
    batch_update: Optional[BatchUpdate],
    new_state: Callable[
        [SamplingParams, Optional[list[int]], list[int]],
        Optional[T],
    ],
) -> bool
Utility function to update dict state for sparse LogitsProcessors.
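The sketch below shows one way a dict-based update of this kind could work. It is not the vLLM implementation: `BatchUpdateSketch` and `update_dict_sketch` are hypothetical names, the tuple layouts of the `added` and `moved` entries are assumed, swaps are not modeled, the processing order (removed, added, moved) is a choice made for this example, and reading the `bool` result as "something changed" is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class BatchUpdateSketch:
    """Hypothetical stand-in for BatchUpdate; tuple layouts are assumed."""

    batch_size: int
    removed: list[int] = field(default_factory=list)
    # Assumed layout: (index, params, prompt_tok_ids, out_tok_ids)
    added: list[tuple[int, Any, Optional[list[int]], list[int]]] = field(
        default_factory=list
    )
    # Assumed layout: (from_index, to_index); swaps are not modeled.
    moved: list[tuple[int, int]] = field(default_factory=list)


def update_dict_sketch(
    req_entries: dict[int, Any],
    batch_update: Optional[BatchUpdateSketch],
    new_state: Callable[[Any, Optional[list[int]], list[int]], Optional[Any]],
) -> bool:
    if batch_update is None:
        # No change to the batch makeup, so nothing to do.
        return False
    updated = False
    # Order (removed, added, moved) is an assumption of this sketch.
    for index in batch_update.removed:
        if req_entries.pop(index, None) is not None:
            updated = True
    for index, params, prompt_ids, out_ids in batch_update.added:
        state = new_state(params, prompt_ids, out_ids)
        if state is not None:
            req_entries[index] = state
            updated = True
        elif req_entries.pop(index, None) is not None:
            # A request without state replaced one that had state.
            updated = True
    for src, dst in batch_update.moved:
        entry = req_entries.pop(src, None)
        had_dst = req_entries.pop(dst, None) is not None
        if entry is not None:
            req_entries[dst] = entry
        updated = updated or entry is not None or had_dst
    return updated


entries: dict[int, Any] = {0: 5, 2: 9}
update = BatchUpdateSketch(
    batch_size=3,
    removed=[0],
    added=[(0, {"min_tokens": 4}, [1, 2], [])],
    moved=[(2, 1)],
)
changed = update_dict_sketch(
    entries, update, lambda params, prompt, out: params.get("min_tokens") or None
)
```

Only requests that actually carry state occupy an entry in the dict, which is what makes this layout suit sparse processors: a batch in which few requests use the feature costs little to track.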