Add vit attribution hugo#154
Additive image-modality extension on top of the merged attr-inference-refacto:
- ImageClassificationInferenceWrapper (ClassificationInferenceWrapper subclass): __init__ drops pad_token_id lookup; _prepare_inputs stacks pixel tensors without padding; _compute_gradients differentiates w.r.t. pixel_values and collapses channels via .abs().mean(dim=1).flatten() to fit the 1D `l` contract (l = H*W). Runtime assert on (3, H, W) channel dim in _prepare_inputs.
- ImageGranularity (standalone Enum, can't subclass Granularity): PIXEL/PATCH with DEFAULT=PATCH; duck-typed get_indices, get_association_matrix, granularity_score_aggregation (no generation branch), and get_decomposition returning (row, col) int tuples instead of strings. Generation/text-only branches are stripped; PATCH aggregation asserts >=2 pixels per unit.
- ImageAttributionOutput + ImageClassificationAttributionExplainer in a new attributions/image_base.py: AttributionOutput mirror with ImageGranularity default and tuple-coordinate elements; explainer subclasses ClassificationAttributionExplainer, swaps tokenizer for image_processor, drops the text-side setup_token_ids call (no pad/mask tokens for ViT), adds a preprocess flag, accepts PIL/numpy/torch.Tensor/BatchFeature in process_model_inputs, and rewrites explain() with patch_size in place of tokenizer and ImageAttributionOutput as the output type.
No tests, no perturbator, no visualization yet: gradient-only MVP.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…on method)
- ImagePerturbator: no-op image-side perturbator. Subclass of Perturbator that returns (model_inputs, None), replacing the text-keyed default that would KeyError on a ViT BatchFeature.
- ImageSaliency: thin subclass of ImageClassificationAttributionExplainer with use_gradient=True, input_x_gradient=True; no MultitaskExplainerMixin (classification-only MVP). Defaults to ImagePerturbator + default Aggregator.
- Wire ImagePerturbator as the default fallback in ImageClassificationAttributionExplainer.
- Re-exports through perturbations/, methods/, attributions/, and top-level interpreto/ (alongside ImageGranularity).
- Sanity-runs end-to-end against hf-internal-testing/tiny-random-vit: returns (1, 225) attributions matching the model's 15x15 patch grid.
- first_tests/first_test_image.py: ad-hoc sanity script (not a pytest test).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ded the first_test_image.py file to check that it works
…ianNoiseImagePertubator logic
…near_interpolation_image_perturbation and the downstream methods that depend on it: ImageSmoothGrad, ImageIntergratedGradients and ImageGradientShap. I also added the tests of the methods in first_tests/first_test_image.py and plotted the results of the methods on a similar graph
…s, i.e. the counterpart of IdsPerturbator for images. Added the specific perturbator logic and the actual explainer methods for sobol, lime, occlusion, var_grad and square_grad. Also added kernel_shap, but modified compared to the text version because the weighted sampling of the text version does not correspond to the actual Shapley kernel. Modified first_tests/ to check with a tiny random ViT that it prints something. Also modified the relevant __init__.py files to add the newly defined techniques
…default number of perturbations to 10 for easier testing. For image_base.py, the source of truth for patch_size is in the attribution explainer
…the same plot and decided to leave the rescaling to the imshow function rather than to _prepare_heatmap, so that we can plot a color bar next to each image that corresponds to the unnormalized scores. Adapted the actual_vit.py example to work with the new function and plot several different techniques on the same plot
…n star import) (new file)
AntoninPoche
left a comment
The commit messages are clear, so I can review them commit by commit. It's nice! I guess thanks, Claude ^^
Overall, it is clear, and it is nice that you already managed to have something running. My main comment is that ImageGranularity is far too complex for what you need for images. But you will really know after trying perturbations at both granularities.
In any case, you can discuss all comments; they are my opinions, not based on having tried the thing myself.
    inference_mode: Callable[[torch.Tensor], torch.Tensor] = InferenceModes.LOGITS,
    use_gradient: bool = False,
    input_x_gradient: bool = True,
    preprocess: bool = True,
Are you sure preprocessing by default is the way to go?
I would have said yes because, in my head, the typical use case is calling this on a PIL image, which requires preprocessing. Of course, maybe I'm wrong and it will more generally be used on already preprocessed tensors, in which case it may be better to set preprocess = False.
I think this has to be resolved by how it will most often be used, which I have little visibility on.
What happens if we process already processed inputs? Does it do something weird?
If so, then it might be tricky, otherwise, default preprocess is the way to go.
In most cases it will do something weird indeed. But if the user does not preprocess the inputs it also does something weird (the ViT will not function as expected). So I don't know which one is best.
The current behavior looks good to me. It is just that I was not familiar with how things work in ViTs, and I wanted the default choice to preprocess or not to be made consciously.
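To make the "something weird" in this thread concrete, here is a minimal sketch of what double preprocessing does to a single pixel value. It assumes a processor that rescales by 1/255 and then normalizes with mean 0.5 and std 0.5; those constants and the helper name `preprocess_pixel` are illustrative assumptions, not taken from the PR.

```python
def preprocess_pixel(value: float, scale: float = 1 / 255,
                     mean: float = 0.5, std: float = 0.5) -> float:
    """Rescale then normalize one pixel value.

    Hypothetical stand-in for an image processor; the constants are assumptions.
    """
    return (value * scale - mean) / std


raw = 255.0                     # a white pixel in a uint8 image
once = preprocess_pixel(raw)    # what the model expects to receive
twice = preprocess_pixel(once)  # what it receives if the input was already processed

print(once, twice)
```

Under these assumed constants, the first pass maps raw values into roughly [-1, 1], while a second pass squeezes every value to within about 0.01 of -1, so the image contrast is essentially lost before the model sees it.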
        inference_mode: Callable[[torch.Tensor], torch.Tensor] = InferenceModes.LOGITS


    class ImageClassificationAttributionExplainer(ClassificationAttributionExplainer):
You cannot inherit directly from the ClassificationAttributionExplainer class, in particular if your goal is just to reuse the target's preprocessing.
The goal is to find a way to use or adapt the MultitaskExplainerMixin that kind of serves as a factory. When you instantiate Saliency, it checks the model class and assigns the base task-explainer to inherit from both and have what you need.
The thing is that this hinders clarity. IMO, they should all be at the same level.
But for now, it can be kept like this if you want easy tests.
    # Coordinate labels per granularity unit (replaces text's decoded-token strings).
    # All samples share the same H, W after image_processor normalization, so the
    # decomposition is identical across samples: compute it once on the first sample
    # and share the reference. Text iterates per-sample because each sample can have
    # a different sequence length; for image that variation doesn't exist.
    # TODO: revisit; see project_vit_explainer_decomposition_refactor in auto-memory.
    # `get_decomposition` already replicates internally based on the input's batch dim
    # (`pixel_values.shape[0]`), but our per-sample BatchFeatures all have batch=1, so
    # the internal replication is a no-op and we redo it here. Cleaner long-term: either
    # strip the internal replication (return one decomposition) or concat samples into a
    # single batched BatchFeature upstream and let `get_decomposition` produce the full
    # `n_samples` list directly.
    shared_coords: list[tuple[int, int]] = self.granularity.get_decomposition(
        model_inputs_to_explain[0],
        patch_size=self.patch_size,
        return_coordinates=True,
    )[0]  # type: ignore
    granular_inputs_coords: list[list[tuple[int, int]]] = [shared_coords for _ in model_inputs_to_explain]
Not sure what you use this for.
This was used in the text to know to which words or tokens attributions correspond.
For images, you can just store the images in their original (h, w) and the attributions in the granularity (h, w). Then, for visualizations, we resize (which does nothing for pixels and something for patches). (Note that the way of resizing can be a visualization parameter.)
In summary, if you already store the image, I would set None for the elements or omit it.
We discussed this, and this comment is now outdated.
        sanitized_targets,
        strict=True,
    ):
        model_task, clean_contribution = self.post_processing(contribution)
You do not implement self.post_processing. However, the parent class just does return ModelTask.CLASSIFICATION, contribution.
I do not think you should use the same ModelTask, at least for now.
So you can remove this line and just create an ImageClassification task to pass to your ImageAttributionOutput. Or just omit it.
If you omit everything you do not need, it might be easier to identify what's common and not for later merging.
Okay I removed it.
    for row_start in range(0, h, patch_size):
        for col_start in range(0, w, patch_size):
            patch = []
            for i in range(patch_size):
                for j in range(patch_size):
                    # pixel positions in this patch, starting from row_start and col_start and moving by patch_size pixels
                    patch.append((i + row_start) * w + j + col_start)

            per_sample.append(patch)
There should be a simpler solution than 4 nested for loops.
In any case, my first intuition is that you do not need get_indices, at least for the two granularities you have here.
The granularity is used at two steps:
- For perturbations, you perturb either the pixel_values or the patches, so there is no need for granularity.
- For aggregation, if your attributions are not already at the correct granularity. For now, the only such case is when you request patch granularity for gradients, where you need to downscale the attributions from pixel-wise to patch-wise.
So IMO, you do not need most of the granularity elements for images. It is useful for text because each sample has a different size, with words and sentences distributed across the whole sample. Images are much more consistent.
Well, the above comments do not make sense if you have special patches/tokens which do not correspond to pixels but to constants (to my understanding, these are used in ViTs and such).
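As one possible answer to the "simpler than 4 nested loops" remark, the same flat indices can be produced by precomputing the offsets inside one patch and shifting them per patch. This is only a sketch: the function name `patch_indices` and the divisibility check are assumptions, and it covers a single sample with row-major flattening, as in the excerpt above.

```python
def patch_indices(h: int, w: int, patch_size: int) -> list[list[int]]:
    """Flat (row-major) pixel indices of each patch, patches listed row by row.

    Assumes h and w are multiples of patch_size (not checked in the PR excerpt).
    """
    if h % patch_size or w % patch_size:
        raise ValueError("h and w must be multiples of patch_size")
    # Offsets of the pixels of one patch relative to its top-left pixel.
    offsets = [i * w + j for i in range(patch_size) for j in range(patch_size)]
    return [
        [row_start * w + col_start + off for off in offsets]
        for row_start in range(0, h, patch_size)
        for col_start in range(0, w, patch_size)
    ]


print(patch_indices(4, 4, 2))
# [[0, 1, 4, 5], [2, 3, 6, 7], [8, 9, 12, 13], [10, 11, 14, 15]]
```

The index formula is the same as in the loop version, `(i + row_start) * w + j + col_start`, just split into a per-patch base and a reusable offset list.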
    # text does `.abs().mean(dim=-1)` to collapse the d=768/4096 width dim for memory;
    # here we collapse channels for the 1D l contract (not memory) and flatten H,W to l=H*W
    target_wise_mean_grads: Float[torch.Tensor, f"{c} l"] = (
        target_wise_grads.abs().mean(dim=1).flatten(start_dim=1)
Why flatten? I know it breaks the shape from text, but it makes much more sense to have an (H,W) shape IMO.
This depends on how the rest is done.
You told me it was for aggregation, right?
This allows us to reuse the aggregator entirely without changing anything. It also allowed me to copy, with minimal changes, a lot of the functions already written for text (such as the perturbators or the granularity), so that we can refactor more easily afterwards.
Then it is a good choice!
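For readers following the flatten discussion, here is a dependency-free sketch of the collapse for a single target: absolute value, mean over channels, then row-major flattening to length H*W. Nested lists stand in for tensors, and the helper names `collapse_and_flatten` and `to_coords` are hypothetical. The second helper shows that nothing is lost by flattening, since `(row, col)` can be recovered from the flat index.

```python
def collapse_and_flatten(grads: list[list[list[float]]]) -> list[float]:
    """Mean of absolute gradients over channels, flattened row-major to length H*W.

    `grads` is indexed as [channel][row][col] (a toy stand-in for a (C, H, W) tensor).
    """
    n_channels = len(grads)
    h, w = len(grads[0]), len(grads[0][0])
    return [
        sum(abs(grads[ch][r][c]) for ch in range(n_channels)) / n_channels
        for r in range(h)
        for c in range(w)
    ]


def to_coords(flat_index: int, w: int) -> tuple[int, int]:
    """Recover (row, col) from a row-major flat index."""
    return divmod(flat_index, w)


toy_grads = [
    [[1.0, -2.0], [3.0, -4.0]],
    [[-1.0, 2.0], [-3.0, 4.0]],
    [[1.0, 2.0], [3.0, 4.0]],
]
flat = collapse_and_flatten(toy_grads)
print(flat)             # [1.0, 2.0, 3.0, 4.0]
print(to_coords(3, 2))  # (1, 1)
```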
    image_processor: BaseImageProcessor,
    batch_size: int = 4,
    granularity: ImageGranularity = ImageGranularity.DEFAULT,
    granularity_aggregation_strategy: GranularityAggregationStrategy = GranularityAggregationStrategy.MEAN,
I imagine that for images, this should correspond to the interpolation mode.
I'm not sure if I correctly understand the comment but granularity_aggregation_strategy is the strategy used when use_gradient = True to go from pixel scores to granularity scores.
If the interpolation mode you reference here is the one in the aggregator then, from what I understand, this aggregates on the perturbations, not the granularity.
So I'm not sure how to respond.
I am talking about the granularity of the aggregation here.
After gradients on the pixel_values, if you want to go to the patch granularity, you need to resize the explanations.
The image resizing operation takes an interpolation parameter. This parameter is a kind of choice we make with the GranularityAggregationStrategy. Well, you could also call this pooling, but resize allows you to think at the image level.
Okay, I understand better.
Interpolation and pooling both work but I have a preference for pooling since this looks a lot like your typical max pooling operation (when the strategy is MAX) and those terms are generally used in CNNs (here it would be a patch_size * patch_size pooling operation). If you think interpolation is better I can put interpolation.
I think resizing is better adapted, indeed, because it considers the image as a whole. While pooling makes small windows and extracts a value from them.
Here, taking the maximum, mean, or any other value does not really make sense. Well, it kind of does the job, but resizing is a much better way to change an image's size. (Which also applies to attributions with shape (h, w)).
Some methods, like the CAM family or Rise, provide smooth, human-friendly attributions as a result.
OK, I'll look into replacing GranularityAggregationStrategy with an InterpolationStrategy and defining several interpolation strategies.
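The pooling and resizing views coincide in one case: when H and W are exact multiples of the patch size, area-style downscaling of pixel attributions reduces to a plain mean over each patch. The sketch below illustrates only that case in pure Python on flat row-major scores. The function name `area_resize` is hypothetical, and in the library itself this would more plausibly go through `torch.nn.functional.interpolate(..., mode="area")` rather than Python loops.

```python
def area_resize(flat_scores: list[float], h: int, w: int, patch_size: int) -> list[float]:
    """Downscale flat (H*W) pixel scores to flat (H/p * W/p) patch scores.

    Each output value is the mean of one patch_size x patch_size block.
    Assumes h and w are multiples of patch_size.
    """
    if h % patch_size or w % patch_size:
        raise ValueError("h and w must be multiples of patch_size")
    out = []
    for row_start in range(0, h, patch_size):
        for col_start in range(0, w, patch_size):
            block = [
                flat_scores[(row_start + i) * w + col_start + j]
                for i in range(patch_size)
                for j in range(patch_size)
            ]
            out.append(sum(block) / len(block))
    return out


pixel_scores = [float(i) for i in range(16)]  # toy 4x4 attribution map
print(area_resize(pixel_scores, 4, 4, 2))     # [2.5, 4.5, 10.5, 12.5]
```

Other interpolation modes blend values across patch borders, which is where the resizing view starts to differ from window-wise pooling.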
    "nvidia-cusparse-cu11>=11.7.4.91; sys_platform=='Linux'",
    "nvidia-nccl-cu11>=2.14.3; sys_platform=='Linux'",
    "nvidia-nvtx-cu11>=11.7.91; sys_platform=='Linux'",
    "torchvision>=0.27.0",
We should discuss with @fanny-jourdan if we want torchvision to be a main dependency or in something like pip install interpreto[vision].
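If torchvision does end up as an optional extra, image-only modules would need to fail with a clear message when it is missing. A minimal sketch of such a guard follows; the helper name `require_extra` is hypothetical, and the `vision` extra name is only the one floated in the comment above, not a decided packaging choice.

```python
import importlib.util


def require_extra(module_name: str, extra: str) -> None:
    """Raise a helpful ImportError if an optional top-level dependency is missing."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install interpreto[{extra}]"
        )


# Hypothetical call at the top of an image-only module:
# require_extra("torchvision", extra="vision")
```

The check uses `importlib.util.find_spec`, so it does not import the package itself; it only verifies that it can be found.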
        The output of `image_processor(image, return_tensors="pt")` (a `BatchFeature`,
        satisfies `TensorMapping`). Holds `pixel_values` of shape `(1, 3, H, W)`.

    raw_image (PIL.Image | np.ndarray | torch.Tensor | None):
I think it should be harmonized before being saved. It will be much easier for visualizations.
So it should have a single type.
I had kept it this way because I didn't want to plot a modified image for the user (I thought it was best if they saw the same image they had first loaded). I already wrote the visualization script, so do I still need to modify this field?
If the concern was that the visualization was going to be hard then maybe I can keep it.
If it was simplicity then I can just use model_inputs_to_explain as the base image onto which to display the heatmap.
What you are saying is that once processed, if we plot the image, it looks weird, so we need to keep a raw image, right?
If that is the case, I agree with you.
I haven't actually tried to plot it with the processed image. Since the comment had not been submitted before I wrote the visualization functions, I used the raw_image and it works. Do you want me to try and see if I can also make it work with the processed image, so that we can drop this field if it is not useful?
    # Preserve each user-supplied raw image alongside its sanitized BatchFeature so
    # the per-sample ImageAttributionOutput can carry it for visualization. The
    # post-normalization pixel_values in model_inputs_to_explain are not directly
    # displayable. None for samples that came in as BatchFeature or under preprocess=False.
    raw_images: list[PILImage | np.ndarray | torch.Tensor | None]
    if isinstance(model_inputs, list):
        raw_images = [
            m if self.preprocess and isinstance(m, (PILImage, np.ndarray, torch.Tensor)) else None
            for m in model_inputs
        ]
    elif self.preprocess and isinstance(model_inputs, (PILImage, np.ndarray, torch.Tensor)):
        raw_images = [model_inputs]
    else:
        raw_images = [None] * len(model_inputs_to_explain)
I think this should go in process_model_inputs. So you know that you return None if inputs are already processed.
Hi @HugoDeBosschere, with @thomas-mullor we discussed how to tackle the perturbators problem. @fanny-jourdan do not hesitate to give your opinion.
To summarize, the problem was that perturbators depend both on the method and the modality. So we found a way to dynamically create the necessary perturbator by inheriting from both the method and modality specific classes.
Here is a summary of the modifications:

    # attributions/perturbations/base.py
    class Perturbator(ABC):
        @abstractmethod
        def perturb(self, inputs):
            pass

    class TensorPerturbator(Perturbator):  # new class (just for typing and clarity)
        @abstractmethod
        def perturb_tensor(self):  # renaming of `perturb_embeds`
            pass

    class TextTensorPerturbator(TensorPerturbator):  # renaming of `EmbeddingsPerturbator`
        def perturb(self, inputs):  # already exists
            # calls `perturb_tensor`

    class ImageTensorPerturbator(TensorPerturbator):  # renaming of your `ImageEmbeddingsPerturbator`
        def perturb(self, inputs):  # already exists but can surely be simplified
            # calls `perturb_tensor`

    class MaskPerturbator(Perturbator):  # new class (just for typing and clarity)
        @abstractmethod
        def get_mask(self):
            pass

    class TextMaskPerturbator(MaskPerturbator):  # renaming of `IdsPerturbator`
        def perturb(self, inputs):  # already exists
            # calls `get_mask`

    class ImageMaskPerturbator(MaskPerturbator):  # renaming of your `ImageIdsPerturbator`
        def perturb(self, inputs):  # already exists
            # calls `get_mask`

    # attributions/perturbations/occlusion_perturbation.py
    from .base import MaskPerturbator

    class OcclusionPerturbator(MaskPerturbator):  # change inheritance, might make the type checker unhappy
        def get_mask(self, mask_dim):
            # should support both `TextMaskPerturbator` and `ImageMaskPerturbator` requirements
            # it might be nothing at first

    # attributions/perturbations/gaussian_noise_perturbation.py
    from .base import TensorPerturbator

    class GaussianNoisePerturbator(TensorPerturbator):  # change inheritance, might make the type checker unhappy
        def perturb_tensor(self, inputs_embeds):
            # should support both `TextTensorPerturbator` and `ImageTensorPerturbator` requirements
            # therefore, support (1, l, d) and (1, c, h, w) shapes
            # (1, ...) -> (p, ...)

    # attributions/base.py
    from .perturbations.base import (
        Perturbator,
        TensorPerturbator,
        TextTensorPerturbator,
        ImageTensorPerturbator,
        MaskPerturbator,
        TextMaskPerturbator,
        ImageMaskPerturbator,
    )

    class AttributionExplainer(ABC):
        associated_inference_wrapper: InferenceWrapper
        base_tensor_perturbator_class: type[TensorPerturbator]
        base_mask_perturbator_class: type[MaskPerturbator]
        # does not impact the methods

    class TextClassificationExplainer(AttributionExplainer):
        associated_inference_wrapper: TextClassificationInferenceWrapper
        base_tensor_perturbator_class = TextTensorPerturbator
        base_mask_perturbator_class = TextMaskPerturbator

    class TextGenerationExplainer(AttributionExplainer):
        associated_inference_wrapper: TextGenerationInferenceWrapper
        base_tensor_perturbator_class = TextTensorPerturbator
        base_mask_perturbator_class = TextMaskPerturbator

    class ImageClassificationExplainer(AttributionExplainer):
        associated_inference_wrapper: ImageClassificationInferenceWrapper
        base_tensor_perturbator_class = ImageTensorPerturbator
        base_mask_perturbator_class = ImageMaskPerturbator

    class MultitaskExplainerMixin:
        # no modifications specific to the new perturbator classes
        # still, it should include the ImageClassificationExplainer at some point

    # attributions/methods/occlusion.py
    class Occlusion(MultitaskExplainerMixin, AttributionExplainer):
        def __init__(...):
            # create the perturbator dynamically by inheriting from both the method and modality specific classes
            perturbator_class = type(
                "ModalitySpecific" + self.__class__.__name__,  # name
                (OcclusionPerturbator, self.base_mask_perturbator_class,),  # parent classes
                {}
            )
            perturbator = perturbator_class(
                tokenizer=tokenizer,
                granularity=granularity,
                replace_token_id=replace_token_id,
            )
            ...

    # attributions/methods/smooth_grad.py
    class SmoothGrad(MultitaskExplainerMixin, AttributionExplainer):
        def __init__(...):
            # create the perturbator dynamically by inheriting from both the method and modality specific classes
            perturbator_class = type(
                "ModalitySpecific" + self.__class__.__name__,  # name
                (GaussianNoisePerturbator, self.base_tensor_perturbator_class,),  # parent classes
                {}
            )
            perturbator = perturbator_class(
                inputs_embedder=model.get_input_embeddings(),
                n_perturbations=n_perturbations,
                std=noise_std
            )
            ...
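The dynamic-inheritance idea in the summary above can be tried in isolation. The sketch below is a toy, self-contained version: the class names mirror the proposal, but the method bodies (a diagonal occlusion mask, zeroing out masked units) and the helper `build_perturbator_class` are illustrative assumptions, and the generated class is named after the method class rather than the explainer.

```python
from abc import ABC, abstractmethod


class Perturbator(ABC):
    @abstractmethod
    def perturb(self, inputs): ...


class MaskPerturbator(Perturbator):
    @abstractmethod
    def get_mask(self, mask_dim: int) -> list[list[int]]: ...


class ImageMaskPerturbator(MaskPerturbator):
    """Modality side: knows how to apply a mask to (toy) image inputs."""

    def perturb(self, inputs: list[float]) -> list[list[float]]:
        mask = self.get_mask(len(inputs))
        # A 1 in the mask means "occlude this unit" (replace it with 0.0).
        return [[0.0 if m else x for x, m in zip(inputs, row)] for row in mask]


class OcclusionPerturbator(MaskPerturbator):
    """Method side: knows which units to occlude, one unit per perturbation."""

    def get_mask(self, mask_dim: int) -> list[list[int]]:
        return [[int(i == j) for j in range(mask_dim)] for i in range(mask_dim)]


def build_perturbator_class(method_cls: type, modality_cls: type) -> type:
    """Create a class inheriting from both the method and the modality class."""
    return type("ModalitySpecific" + method_cls.__name__, (method_cls, modality_cls), {})


perturbator_class = build_perturbator_class(OcclusionPerturbator, ImageMaskPerturbator)
perturbator = perturbator_class()
print(perturbator_class.__name__)
print(perturbator.perturb([1.0, 2.0, 3.0]))
```

Neither parent can be instantiated on its own, since each leaves one abstract method open; the combined class resolves `get_mask` from the method side and `perturb` from the modality side through the MRO.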
…ngsPerturbator now becomes TensorPerturbator; perturb_embeds thus becomes perturb_tensor. IdsPerturbator becomes MaskPerturbator. Both kinds of perturbation have (task, modality) tuple children (Image Generation does not exist). There are also new ImageInferenceWrapper and TextInferenceWrapper classes that both inherit from the InferenceWrapper class (which may need to be abstracted). AttributionExplainer has been abstracted and has the same (task, modality) children as the perturbations. All the other changes just cascade from these modifications (import changes, inheritance changes). I also had to copy and paste process_targets and process_inputs_to_explain_and_targets from TextClassificationAttributionExplainer to ImageClassificationAttributionExplainer, since the latter used to inherit from the former and needed those methods to function correctly.
…he necessary imports into the files that were affected by this change
…of FactoryGeneratedMeta to avoid metaclass conflict. Changed the attributions test and ran them to ensure that nothing was broken by the new modifications
…y and created a new parent class called Granularity to harmonize typing in order to be able to merge the image attribution methods with the text attribution methods. Executed the pytest tests and they still work.
… for images in order to have a more image-oriented point of view on resizing explanations. Added 3 types of interpolation: BILINEAR, BICUBIC and AREA (which is just a mean), all derived from the torch library so that the interpolation can run on GPU (I also changed the point at which contributions are moved to CPU in order to be stored in ImageAttributionOutput). GranularityAggregationStrategy and GranularityResizeStrategy now both inherit from GranularityCombinationStrategy, following the same pattern as Granularity. This is done so that all the methods can later be implemented in a single class. All the Granularity-related classes and methods have been moved to the granularity.py file, though the image_granularity.py files remain because I have not yet tested the changes.
Description
Added a small MVP for ViT attribution. For now there is no visualization, but there is a Saliency method that executes.
I did not modify any preexisting code; the change is purely additive.
Creation of:
It returns flattened 2D tensors so that the methods developed for text can be reused for free. In this class, 3 methods were overridden:
4. ImagePerturbator in interpreto/attributions/perturbations/image_base.py. Only the base class for the MVP, so the perturbator is a no-op. Since we don't need to check for model_inputs or model_embeds, it just returns the inputs. This allows instantiating ImageSaliency to check that the pipeline runs.
Checklist
- CODE_OF_CONDUCT.md document
- CONTRIBUTING.md guide
- make lint
- make test