Fine-Grained Perturbation Guidance via Attention Head Selection
Paper
• 2506.10978 • Published
• Pass device=["cuda:0", "cuda:1"] or device=["cpu"]*4 on the model.predict or model.rank calls.
• Pass a dataset_id, e.g. dataset_id="lightonai/NanoBEIR-de" for the German benchmark.
• Set output_scores=True to get similarity scores returned. This can be useful for some distillation losses!
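As a rough illustration of the device option above, here is a minimal sketch. It assumes the notes refer to the sentence-transformers CrossEncoder class, which the text does not state. The helper names make_device_list and rerank and the model name are hypothetical choices for this example, not part of the original notes.

```python
from typing import List


def make_device_list(num_gpus: int, cpu_workers: int = 4) -> List[str]:
    """Build a device list in the shape described above (hypothetical helper)."""
    if num_gpus > 0:
        # One worker per GPU, e.g. ["cuda:0", "cuda:1"]
        return [f"cuda:{i}" for i in range(num_gpus)]
    # No GPUs: fall back to several CPU workers, e.g. ["cpu"] * 4
    return ["cpu"] * cpu_workers


def rerank(query: str, documents: List[str], devices: List[str]):
    """Rank documents against a query across several devices.

    Assumption: the library is sentence-transformers, and the model
    name below is only an illustrative choice.
    """
    from sentence_transformers import CrossEncoder  # assumed library

    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")
    # The device list is passed on the call itself, as the notes describe.
    return model.rank(query, documents, device=devices)


if __name__ == "__main__":
    # The main guard matters because a device list implies worker processes.
    devices = make_device_list(num_gpus=0, cpu_workers=4)
    docs = ["Berlin is the capital of Germany.", "Paris is in France."]
    print(rerank("What is the capital of Germany?", docs, devices))
```

The device list is built separately from the call so that the same script can run on a multi-GPU machine or a CPU-only one by changing a single argument.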