SentenceTransformer based on FacebookAI/roberta-base

This is a sentence-transformers model finetuned from FacebookAI/roberta-base on the all-nli dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: FacebookAI/roberta-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: all-nli
  • Language: en

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'RobertaModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
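The Pooling module above applies mean pooling (`pooling_mode_mean_tokens: True`): token embeddings are averaged over the sequence axis, weighted by the attention mask so padding tokens are ignored. A minimal NumPy sketch of that operation, with random arrays standing in for real token embeddings:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Masked mean over the token axis.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), a_min=1e-9, a_max=None)       # avoid division by zero
    return summed / counts

# Toy batch: 2 sentences, 3 token slots, 768-dim token embeddings
rng = np.random.default_rng(0)
emb = rng.normal(size=(2, 3, 768))
mask = np.array([[1, 1, 0], [1, 1, 1]])  # first sentence has one padding token
pooled = mean_pool(emb, mask)
print(pooled.shape)  # (2, 768)
```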

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sobamchan/roberta-base-mrl-768-512-256-128-64")
# Run inference
sentences = [
    'A construction worker peeking out of a manhole while his coworker sits on the sidewalk smiling.',
    'A worker is looking out of a manhole.',
    'The workers are both inside the manhole.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6264, 0.0948],
#         [0.6264, 1.0000, 0.2493],
#         [0.0948, 0.2493, 1.0000]])
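Because the model was trained with MatryoshkaLoss (see Training Details), its embeddings can also be truncated to 512, 256, 128, or 64 dimensions with limited quality loss; sentence-transformers exposes this through the `truncate_dim` argument of `SentenceTransformer`. The truncate-and-renormalize step itself can be sketched in NumPy (the vectors below are random stand-ins for real embeddings):

```python
import numpy as np

def truncate_and_renormalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and rescale to unit length,
    so cosine similarity reduces to a dot product."""
    truncated = embeddings[..., :dim]
    norms = np.linalg.norm(truncated, axis=-1, keepdims=True)
    return truncated / norms

rng = np.random.default_rng(0)
full = rng.normal(size=(3, 768))        # stand-ins for 768-dim sentence embeddings
small = truncate_and_renormalize(full, 256)
print(small.shape)                      # (3, 256)
similarities = small @ small.T          # cosine similarities of the truncated embeddings
```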

Evaluation

Metrics

Semantic Similarity

Metric            sts-dev   sts-test
pearson_cosine    0.8242    0.7899
spearman_cosine   0.8259    0.7921
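The spearman_cosine metric above is the Spearman rank correlation between the model's cosine similarities and the gold STS scores. It can be computed as the Pearson correlation of the rank vectors; a minimal NumPy sketch with made-up scores:

```python
import numpy as np

def spearman(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks.
    (No tie handling; ties would need fractional ranks.)"""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])

# Made-up cosine scores vs. gold similarity labels
predicted = np.array([0.9, 0.2, 0.6, 0.4, 0.8])
gold      = np.array([4.8, 1.0, 3.5, 2.0, 4.5])
print(spearman(predicted, gold))  # 1.0: the rankings agree perfectly
```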

Training Details

Training Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 557,850 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
      anchor:   string, min 7 / mean 10.38 / max 45 tokens
      positive: string, min 6 / mean 12.8 / max 39 tokens
      negative: string, min 6 / mean 13.4 / max 50 tokens
  • Samples:
      anchor:   A person on a horse jumps over a broken down airplane.
      positive: A person is outdoors, on a horse.
      negative: A person is at a diner, ordering an omelette.

      anchor:   Children smiling and waving at camera
      positive: There are children present
      negative: The kids are frowning

      anchor:   A boy is jumping on skateboard in the middle of a red bridge.
      positive: The boy does a skateboarding trick.
      negative: The boy skates down the sidewalk.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
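MatryoshkaLoss wraps MultipleNegativesRankingLoss: the in-batch ranking loss is computed on the embeddings truncated to each listed dimension, and the per-dimension losses are summed with the given weights (all 1 here). A self-contained NumPy sketch of that computation, with random embeddings standing in for model outputs:

```python
import numpy as np

def mnrl(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Multiple negatives ranking loss: cross-entropy over in-batch
    cosine-similarity scores, where the matching positive is the label."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                      # (batch, batch)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.diag(log_probs).mean())        # diagonal = correct pairs

def matryoshka_mnrl(anchors, positives,
                    dims=(768, 512, 256, 128, 64), weights=(1, 1, 1, 1, 1)):
    """Sum the base loss over truncated embeddings, one term per dimension."""
    return sum(w * mnrl(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
positives = anchors + 0.1 * rng.normal(size=(8, 768))  # noisy copies as easy positives
loss = matryoshka_mnrl(anchors, positives)
```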
    

Evaluation Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 6,584 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
      anchor:   string, min 6 / mean 18.02 / max 66 tokens
      positive: string, min 5 / mean 9.81 / max 29 tokens
      negative: string, min 5 / mean 10.37 / max 29 tokens
  • Samples:
      anchor:   Two women are embracing while holding to go packages.
      positive: Two woman are holding packages.
      negative: The men are fighting outside a deli.

      anchor:   Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.
      positive: Two kids in numbered jerseys wash their hands.
      negative: Two kids in jackets walk to school.

      anchor:   A man selling donuts to a customer during a world exhibition event held in the city of Angeles
      positive: A man selling donuts to a customer.
      negative: A woman drinks her coffee in a small cafe.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 15
  • warmup_ratio: 0.1
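The non-default values above map directly onto SentenceTransformerTrainingArguments. A configuration sketch (output_dir is a placeholder; every argument not listed keeps its default):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",              # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=15,
    warmup_ratio=0.1,                 # linear warmup over the first 10% of steps
)
```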

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.6500 -
0.0287 500 12.9834 4.4446 0.8142 -
0.0574 1000 5.0558 2.6336 0.8433 -
0.0860 1500 4.0677 2.1055 0.8449 -
0.1147 2000 3.5554 1.8425 0.8488 -
0.1434 2500 3.1731 1.6827 0.8506 -
0.1721 3000 2.9998 1.6020 0.8574 -
0.2008 3500 2.8479 1.5138 0.8518 -
0.2294 4000 2.6391 1.4776 0.8531 -
0.2581 4500 2.6123 1.4301 0.8609 -
0.2868 5000 2.4732 1.3546 0.8591 -
0.3155 5500 2.3969 1.3039 0.8591 -
0.3442 6000 2.3997 1.3233 0.8656 -
0.3729 6500 2.3758 1.3031 0.8595 -
0.4015 7000 2.3403 1.2585 0.8578 -
0.4302 7500 2.2049 1.3044 0.8553 -
0.4589 8000 2.1295 1.2580 0.8525 -
0.4876 8500 2.2281 1.2584 0.8576 -
0.5163 9000 2.2291 1.2483 0.8470 -
0.5449 9500 2.1401 1.2757 0.8610 -
0.5736 10000 2.0297 1.2819 0.8525 -
0.6023 10500 2.0616 1.2209 0.8499 -
0.6310 11000 1.9647 1.2464 0.8547 -
0.6597 11500 2.0016 1.1847 0.8503 -
0.6883 12000 1.9446 1.2070 0.8525 -
0.7170 12500 1.9737 1.2414 0.8557 -
0.7457 13000 1.9648 1.1860 0.8508 -
0.7744 13500 1.9201 1.2226 0.8496 -
0.8031 14000 1.8554 1.2228 0.8484 -
0.8318 14500 1.8868 1.2439 0.8603 -
0.8604 15000 1.8792 1.1713 0.8495 -
0.8891 15500 1.8812 1.1805 0.8519 -
0.9178 16000 1.8516 1.2316 0.8514 -
0.9465 16500 1.8007 1.2269 0.8461 -
0.9752 17000 1.7603 1.2313 0.8490 -
1.0038 17500 1.7137 1.2067 0.8416 -
1.0325 18000 1.5488 1.1928 0.8384 -
1.0612 18500 1.5885 1.2342 0.8400 -
1.0899 19000 1.587 1.2192 0.8470 -
1.1186 19500 1.5847 1.2362 0.8412 -
1.1472 20000 1.6477 1.3148 0.8488 -
1.1759 20500 1.5791 1.2597 0.8412 -
1.2046 21000 1.6322 1.2991 0.8431 -
1.2333 21500 1.6587 1.2976 0.8355 -
1.2620 22000 1.5471 1.3057 0.8479 -
1.2907 22500 1.5954 1.3160 0.8394 -
1.3193 23000 1.5961 1.3306 0.8393 -
1.3480 23500 1.6175 1.3224 0.8338 -
1.3767 24000 1.58 1.3298 0.8423 -
1.4054 24500 1.5811 1.2604 0.8372 -
1.4341 25000 1.5488 1.3224 0.8380 -
1.4627 25500 1.5905 1.3510 0.8395 -
1.4914 26000 1.5759 1.3830 0.8355 -
1.5201 26500 1.5344 1.3678 0.8328 -
1.5488 27000 1.5502 1.3485 0.8268 -
1.5775 27500 1.6084 1.3426 0.8345 -
1.6061 28000 1.5494 1.3062 0.8434 -
1.6348 28500 1.5705 1.2855 0.8384 -
1.6635 29000 1.5043 1.3451 0.8387 -
1.6922 29500 1.5479 1.3092 0.8405 -
1.7209 30000 1.5569 1.2877 0.8352 -
1.7496 30500 1.4811 1.3088 0.8375 -
1.7782 31000 1.503 1.2647 0.8407 -
1.8069 31500 1.5416 1.3738 0.8445 -
1.8356 32000 1.4459 1.2674 0.8474 -
1.8643 32500 1.5038 1.2700 0.8440 -
1.8930 33000 1.4559 1.2964 0.8420 -
1.9216 33500 1.3829 1.3190 0.8395 -
1.9503 34000 1.4752 1.3236 0.8413 -
1.9790 34500 1.4366 1.2968 0.8402 -
2.0077 35000 1.3896 1.3126 0.8433 -
2.0364 35500 1.0903 1.3262 0.8322 -
2.0650 36000 1.1466 1.2708 0.8427 -
2.0937 36500 1.1498 1.2716 0.8434 -
2.1224 37000 1.1693 1.3225 0.8473 -
2.1511 37500 1.1437 1.3153 0.8346 -
2.1798 38000 1.1813 1.3388 0.8340 -
2.2085 38500 1.1034 1.2852 0.8428 -
2.2371 39000 1.0881 1.2978 0.8432 -
2.2658 39500 1.1366 1.2986 0.8397 -
2.2945 40000 1.1261 1.2811 0.8397 -
2.3232 40500 1.136 1.3165 0.8372 -
2.3519 41000 1.1146 1.2736 0.8423 -
2.3805 41500 1.1484 1.2822 0.8411 -
2.4092 42000 1.1052 1.2153 0.8443 -
2.4379 42500 1.0838 1.2890 0.8395 -
2.4666 43000 1.099 1.3082 0.8446 -
2.4953 43500 1.1038 1.3195 0.8352 -
2.5239 44000 1.0813 1.2516 0.8348 -
2.5526 44500 1.1078 1.3428 0.8381 -
2.5813 45000 1.1094 1.2394 0.8324 -
2.6100 45500 1.0594 1.2751 0.8369 -
2.6387 46000 1.0994 1.2607 0.8312 -
2.6674 46500 1.0946 1.2326 0.8308 -
2.6960 47000 1.0575 1.2957 0.8316 -
2.7247 47500 1.0936 1.3358 0.8234 -
2.7534 48000 1.0449 1.2434 0.8358 -
2.7821 48500 1.0078 1.3299 0.8292 -
2.8108 49000 1.0929 1.3079 0.8291 -
2.8394 49500 1.0517 1.2575 0.8338 -
2.8681 50000 1.0385 1.2555 0.8388 -
2.8968 50500 1.0458 1.2732 0.8368 -
2.9255 51000 1.0525 1.3205 0.8327 -
2.9542 51500 1.0272 1.2707 0.8428 -
2.9828 52000 0.9977 1.2854 0.8401 -
3.0115 52500 0.9116 1.3667 0.8371 -
3.0402 53000 0.7987 1.2610 0.8437 -
3.0689 53500 0.8297 1.3018 0.8428 -
3.0976 54000 0.8309 1.3320 0.8406 -
3.1263 54500 0.8354 1.2898 0.8353 -
3.1549 55000 0.8209 1.2937 0.8369 -
3.1836 55500 0.8346 1.2745 0.8413 -
3.2123 56000 0.8002 1.2647 0.8382 -
3.2410 56500 0.828 1.3219 0.8351 -
3.2697 57000 0.8146 1.2671 0.8421 -
3.2983 57500 0.7883 1.2576 0.8430 -
3.3270 58000 0.8037 1.2747 0.8402 -
3.3557 58500 0.8312 1.2527 0.8371 -
3.3844 59000 0.8239 1.3148 0.8344 -
3.4131 59500 0.8105 1.2951 0.8349 -
3.4417 60000 0.7824 1.3018 0.8330 -
3.4704 60500 0.7864 1.2749 0.8338 -
3.4991 61000 0.7713 1.2763 0.8368 -
3.5278 61500 0.7846 1.3069 0.8343 -
3.5565 62000 0.8137 1.2746 0.8372 -
3.5852 62500 0.8086 1.2858 0.8326 -
3.6138 63000 0.7581 1.3138 0.8373 -
3.6425 63500 0.8743 1.2905 0.8375 -
3.6712 64000 0.79 1.2679 0.8425 -
3.6999 64500 0.8028 1.3027 0.8392 -
3.7286 65000 0.7923 1.2678 0.8388 -
3.7572 65500 0.8628 1.2883 0.8362 -
3.7859 66000 0.8007 1.2476 0.8356 -
3.8146 66500 0.8163 1.2941 0.8327 -
3.8433 67000 0.8147 1.2695 0.8328 -
3.8720 67500 0.806 1.2737 0.8393 -
3.9006 68000 0.8017 1.2550 0.8380 -
3.9293 68500 0.7735 1.2912 0.8357 -
3.9580 69000 0.7936 1.2877 0.8403 -
3.9867 69500 0.7962 1.2525 0.8387 -
4.0154 70000 0.6995 1.2901 0.8382 -
4.0441 70500 0.6376 1.3132 0.8381 -
4.0727 71000 0.6268 1.3263 0.8401 -
4.1014 71500 0.6339 1.3005 0.8321 -
4.1301 72000 0.5927 1.3181 0.8355 -
4.1588 72500 0.674 1.3045 0.8361 -
4.1875 73000 0.6086 1.2831 0.8296 -
4.2161 73500 0.6055 1.2399 0.8369 -
4.2448 74000 0.6089 1.2837 0.8354 -
4.2735 74500 0.6201 1.2798 0.8370 -
4.3022 75000 0.6112 1.2372 0.8404 -
4.3309 75500 0.6228 1.3101 0.8372 -
4.3595 76000 0.6411 1.2884 0.8371 -
4.3882 76500 0.6705 1.2782 0.8376 -
4.4169 77000 0.6208 1.2618 0.8346 -
4.4456 77500 0.6243 1.2773 0.8417 -
4.4743 78000 0.6598 1.2746 0.8423 -
4.5030 78500 0.6265 1.3135 0.8350 -
4.5316 79000 0.6285 1.2945 0.8316 -
4.5603 79500 0.6514 1.2947 0.8373 -
4.5890 80000 0.6222 1.2861 0.8358 -
4.6177 80500 0.6499 1.3138 0.8352 -
4.6464 81000 0.6528 1.2633 0.8347 -
4.6750 81500 0.647 1.2300 0.8353 -
4.7037 82000 0.6339 1.3199 0.8304 -
4.7324 82500 0.6241 1.2745 0.8379 -
4.7611 83000 0.6196 1.2750 0.8333 -
4.7898 83500 0.6415 1.2656 0.8319 -
4.8184 84000 0.6285 1.2675 0.8349 -
4.8471 84500 0.6704 1.2692 0.8424 -
4.8758 85000 0.6346 1.3031 0.8366 -
4.9045 85500 0.6332 1.2819 0.8327 -
4.9332 86000 0.6201 1.2538 0.8396 -
4.9619 86500 0.5993 1.2576 0.8335 -
4.9905 87000 0.6233 1.2494 0.8403 -
5.0192 87500 0.5449 1.2462 0.8431 -
5.0479 88000 0.522 1.3205 0.8367 -
5.0766 88500 0.5046 1.2828 0.8404 -
5.1053 89000 0.5002 1.2980 0.8343 -
5.1339 89500 0.5024 1.2879 0.8413 -
5.1626 90000 0.4996 1.3013 0.8345 -
5.1913 90500 0.5184 1.2622 0.8326 -
5.2200 91000 0.5001 1.2429 0.8405 -
5.2487 91500 0.4899 1.2539 0.8314 -
5.2773 92000 0.5163 1.2123 0.8359 -
5.3060 92500 0.5124 1.2529 0.8321 -
5.3347 93000 0.5399 1.2868 0.8359 -
5.3634 93500 0.5095 1.2217 0.8355 -
5.3921 94000 0.5154 1.2764 0.8393 -
5.4208 94500 0.4904 1.2653 0.8386 -
5.4494 95000 0.4924 1.2685 0.8374 -
5.4781 95500 0.5201 1.2494 0.8280 -
5.5068 96000 0.5172 1.2479 0.8401 -
5.5355 96500 0.5385 1.3059 0.8389 -
5.5642 97000 0.5029 1.2922 0.8371 -
5.5928 97500 0.5158 1.2882 0.8394 -
5.6215 98000 0.5084 1.3059 0.8298 -
5.6502 98500 0.4991 1.2331 0.8354 -
5.6789 99000 0.5211 1.3025 0.8335 -
5.7076 99500 0.5106 1.2624 0.8374 -
5.7362 100000 0.5329 1.2238 0.8335 -
5.7649 100500 0.4982 1.2890 0.8362 -
5.7936 101000 0.4856 1.3022 0.8347 -
5.8223 101500 0.5195 1.2774 0.8434 -
5.8510 102000 0.5012 1.2318 0.8409 -
5.8797 102500 0.5086 1.3152 0.8296 -
5.9083 103000 0.5422 1.2187 0.8322 -
5.9370 103500 0.4863 1.2511 0.8358 -
5.9657 104000 0.536 1.2410 0.8303 -
5.9944 104500 0.5082 1.2399 0.8369 -
6.0231 105000 0.4542 1.3084 0.8309 -
6.0517 105500 0.4102 1.2852 0.8337 -
6.0804 106000 0.3911 1.2805 0.8333 -
6.1091 106500 0.4227 1.2853 0.8342 -
6.1378 107000 0.3923 1.3140 0.8347 -
6.1665 107500 0.4027 1.2836 0.8337 -
6.1951 108000 0.4268 1.2788 0.8335 -
6.2238 108500 0.4337 1.3135 0.8289 -
6.2525 109000 0.4152 1.2783 0.8375 -
6.2812 109500 0.4386 1.3863 0.8171 -
6.3099 110000 0.4269 1.2786 0.8344 -
6.3386 110500 0.4512 1.2746 0.8272 -
6.3672 111000 0.3879 1.3043 0.8382 -
6.3959 111500 0.3996 1.3095 0.8346 -
6.4246 112000 0.3942 1.3081 0.8339 -
6.4533 112500 0.413 1.2650 0.8392 -
6.4820 113000 0.4243 1.2872 0.8348 -
6.5106 113500 0.403 1.2970 0.8380 -
6.5393 114000 0.4151 1.2704 0.8383 -
6.5680 114500 0.4037 1.2972 0.8331 -
6.5967 115000 0.4269 1.3380 0.8376 -
6.6254 115500 0.4233 1.2927 0.8336 -
6.6540 116000 0.4289 1.2784 0.8337 -
6.6827 116500 0.4373 1.2634 0.8396 -
6.7114 117000 0.3996 1.2934 0.8356 -
6.7401 117500 0.4076 1.2793 0.8363 -
6.7688 118000 0.4091 1.3162 0.8300 -
6.7975 118500 0.4002 1.3168 0.8379 -
6.8261 119000 0.4128 1.2875 0.8385 -
6.8548 119500 0.3872 1.3245 0.8353 -
6.8835 120000 0.4091 1.2763 0.8406 -
6.9122 120500 0.3955 1.3050 0.8390 -
6.9409 121000 0.4223 1.2819 0.8340 -
6.9695 121500 0.4226 1.2982 0.8310 -
6.9982 122000 0.3973 1.2659 0.8354 -
7.0269 122500 0.3507 1.3166 0.8325 -
7.0556 123000 0.3665 1.2739 0.8373 -
7.0843 123500 0.3285 1.3148 0.8344 -
7.1129 124000 0.3441 1.2813 0.8390 -
7.1416 124500 0.3334 1.2939 0.8378 -
7.1703 125000 0.3616 1.2835 0.8372 -
7.1990 125500 0.3424 1.3163 0.8334 -
7.2277 126000 0.3729 1.3179 0.8333 -
7.2564 126500 0.3418 1.2671 0.8385 -
7.2850 127000 0.3425 1.2964 0.8338 -
7.3137 127500 0.3348 1.3023 0.8355 -
7.3424 128000 0.3261 1.3049 0.8408 -
7.3711 128500 0.3775 1.2912 0.8388 -
7.3998 129000 0.3512 1.3120 0.8340 -
7.4284 129500 0.3487 1.2879 0.8355 -
7.4571 130000 0.3346 1.2994 0.8298 -
7.4858 130500 0.3685 1.3090 0.8311 -
7.5145 131000 0.3604 1.2841 0.8314 -
7.5432 131500 0.3525 1.2841 0.8274 -
7.5718 132000 0.3406 1.2683 0.8310 -
7.6005 132500 0.3464 1.2790 0.8304 -
7.6292 133000 0.3559 1.2381 0.8327 -
7.6579 133500 0.3463 1.2864 0.8343 -
7.6866 134000 0.3748 1.2516 0.8307 -
7.7153 134500 0.3575 1.2559 0.8286 -
7.7439 135000 0.3318 1.2782 0.8254 -
7.7726 135500 0.3589 1.2826 0.8265 -
7.8013 136000 0.3473 1.2572 0.8337 -
7.8300 136500 0.3284 1.2599 0.8322 -
7.8587 137000 0.3334 1.3010 0.8307 -
7.8873 137500 0.3253 1.2655 0.8330 -
7.9160 138000 0.3402 1.2753 0.8343 -
7.9447 138500 0.345 1.2521 0.8338 -
7.9734 139000 0.3209 1.2731 0.8327 -
8.0021 139500 0.3481 1.2801 0.8296 -
8.0307 140000 0.2829 1.3438 0.8363 -
8.0594 140500 0.2889 1.3169 0.8290 -
8.0881 141000 0.2933 1.3344 0.8306 -
8.1168 141500 0.3098 1.2897 0.8317 -
8.1455 142000 0.2846 1.3296 0.8342 -
8.1742 142500 0.2738 1.3388 0.8254 -
8.2028 143000 0.2741 1.3294 0.8266 -
8.2315 143500 0.3133 1.3106 0.8303 -
8.2602 144000 0.3043 1.3315 0.8284 -
8.2889 144500 0.3257 1.2867 0.8319 -
8.3176 145000 0.299 1.2985 0.8293 -
8.3462 145500 0.2837 1.3443 0.8301 -
8.3749 146000 0.2886 1.3092 0.8316 -
8.4036 146500 0.2837 1.3053 0.8309 -
8.4323 147000 0.2898 1.3065 0.8285 -
8.4610 147500 0.2967 1.2968 0.8301 -
8.4896 148000 0.3062 1.3315 0.8288 -
8.5183 148500 0.3099 1.2954 0.8310 -
8.5470 149000 0.2906 1.2878 0.8282 -
8.5757 149500 0.3114 1.3483 0.8259 -
8.6044 150000 0.2784 1.2967 0.8318 -
8.6331 150500 0.2977 1.3032 0.8331 -
8.6617 151000 0.3194 1.3009 0.8336 -
8.6904 151500 0.292 1.2783 0.8331 -
8.7191 152000 0.2909 1.3243 0.8357 -
8.7478 152500 0.2867 1.3135 0.8316 -
8.7765 153000 0.2762 1.3196 0.8362 -
8.8051 153500 0.2973 1.3327 0.8344 -
8.8338 154000 0.2976 1.3533 0.8332 -
8.8625 154500 0.281 1.2960 0.8356 -
8.8912 155000 0.2885 1.2833 0.8347 -
8.9199 155500 0.286 1.2873 0.8367 -
8.9485 156000 0.2851 1.3146 0.8308 -
8.9772 156500 0.2999 1.3225 0.8313 -
9.0059 157000 0.2812 1.3481 0.8290 -
9.0346 157500 0.2415 1.3387 0.8276 -
9.0633 158000 0.2282 1.3657 0.8310 -
9.0920 158500 0.2315 1.3422 0.8332 -
9.1206 159000 0.2271 1.3270 0.8338 -
9.1493 159500 0.2578 1.3409 0.8346 -
9.1780 160000 0.2534 1.3682 0.8263 -
9.2067 160500 0.252 1.3480 0.8292 -
9.2354 161000 0.2426 1.3118 0.8324 -
9.2640 161500 0.241 1.3244 0.8342 -
9.2927 162000 0.2641 1.3550 0.8300 -
9.3214 162500 0.235 1.3633 0.8281 -
9.3501 163000 0.2376 1.3247 0.8295 -
9.3788 163500 0.2533 1.2885 0.8280 -
9.4074 164000 0.2496 1.2880 0.8262 -
9.4361 164500 0.2542 1.3147 0.8303 -
9.4648 165000 0.2398 1.3021 0.8283 -
9.4935 165500 0.255 1.3547 0.8331 -
9.5222 166000 0.2566 1.3392 0.8322 -
9.5509 166500 0.253 1.3471 0.8311 -
9.5795 167000 0.2521 1.3255 0.8320 -
9.6082 167500 0.2431 1.3392 0.8286 -
9.6369 168000 0.2455 1.3146 0.8325 -
9.6656 168500 0.2379 1.3177 0.8359 -
9.6943 169000 0.2446 1.3157 0.8338 -
9.7229 169500 0.244 1.2792 0.8323 -
9.7516 170000 0.243 1.2883 0.8327 -
9.7803 170500 0.2607 1.2789 0.8342 -
9.8090 171000 0.2457 1.3022 0.8286 -
9.8377 171500 0.2556 1.3212 0.8319 -
9.8663 172000 0.2483 1.2960 0.8326 -
9.8950 172500 0.2337 1.2957 0.8349 -
9.9237 173000 0.2279 1.3591 0.8322 -
9.9524 173500 0.2537 1.3259 0.8320 -
9.9811 174000 0.2501 1.3193 0.8283 -
10.0098 174500 0.2342 1.3031 0.8306 -
10.0384 175000 0.2107 1.3039 0.8320 -
10.0671 175500 0.2144 1.3254 0.8299 -
10.0958 176000 0.2195 1.3310 0.8289 -
10.1245 176500 0.2224 1.3331 0.8283 -
10.1532 177000 0.2314 1.3470 0.8290 -
10.1818 177500 0.2074 1.3448 0.8295 -
10.2105 178000 0.2144 1.3502 0.8289 -
10.2392 178500 0.2165 1.3407 0.8321 -
10.2679 179000 0.2225 1.3336 0.8300 -
10.2966 179500 0.2049 1.3322 0.8325 -
10.3252 180000 0.235 1.3099 0.8291 -
10.3539 180500 0.223 1.3654 0.8285 -
10.3826 181000 0.2486 1.3403 0.8318 -
10.4113 181500 0.2111 1.3213 0.8294 -
10.4400 182000 0.2272 1.3535 0.8311 -
10.4687 182500 0.2128 1.3455 0.8282 -
10.4973 183000 0.2087 1.3252 0.8326 -
10.5260 183500 0.2134 1.3360 0.8328 -
10.5547 184000 0.1955 1.3268 0.8345 -
10.5834 184500 0.2061 1.3499 0.8300 -
10.6121 185000 0.2089 1.3325 0.8294 -
10.6407 185500 0.2152 1.3630 0.8318 -
10.6694 186000 0.2086 1.3266 0.8285 -
10.6981 186500 0.2163 1.3130 0.8299 -
10.7268 187000 0.2218 1.3009 0.8317 -
10.7555 187500 0.1945 1.3064 0.8337 -
10.7841 188000 0.2323 1.3216 0.8319 -
10.8128 188500 0.1993 1.3334 0.8330 -
10.8415 189000 0.2302 1.3199 0.8315 -
10.8702 189500 0.2095 1.3527 0.8309 -
10.8989 190000 0.2019 1.3195 0.8335 -
10.9276 190500 0.2179 1.3258 0.8289 -
10.9562 191000 0.2264 1.2852 0.8345 -
10.9849 191500 0.2311 1.2949 0.8336 -
11.0136 192000 0.1846 1.3296 0.8317 -
11.0423 192500 0.1773 1.3000 0.8316 -
11.0710 193000 0.206 1.3370 0.8324 -
11.0996 193500 0.1861 1.3470 0.8267 -
11.1283 194000 0.2117 1.3355 0.8300 -
11.1570 194500 0.1756 1.3456 0.8265 -
11.1857 195000 0.2077 1.3432 0.8272 -
11.2144 195500 0.1997 1.3428 0.8280 -
11.2430 196000 0.198 1.3004 0.8309 -
11.2717 196500 0.1841 1.3437 0.8272 -
11.3004 197000 0.1927 1.3217 0.8326 -
11.3291 197500 0.1905 1.3424 0.8270 -
11.3578 198000 0.2011 1.3316 0.8251 -
11.3865 198500 0.1901 1.3385 0.8279 -
11.4151 199000 0.2003 1.3366 0.8264 -
11.4438 199500 0.2099 1.3282 0.8320 -
11.4725 200000 0.1838 1.3151 0.8317 -
11.5012 200500 0.1867 1.3451 0.8309 -
11.5299 201000 0.1887 1.3606 0.8273 -
11.5585 201500 0.2096 1.3450 0.8281 -
11.5872 202000 0.2017 1.3495 0.8288 -
11.6159 202500 0.1911 1.3665 0.8293 -
11.6446 203000 0.1752 1.3612 0.8276 -
11.6733 203500 0.1843 1.3630 0.8307 -
11.7019 204000 0.1771 1.3637 0.8300 -
11.7306 204500 0.1876 1.3336 0.8301 -
11.7593 205000 0.1908 1.3561 0.8291 -
11.7880 205500 0.154 1.3526 0.8284 -
11.8167 206000 0.1891 1.3541 0.8277 -
11.8454 206500 0.1888 1.3914 0.8276 -
11.8740 207000 0.1755 1.3729 0.8305 -
11.9027 207500 0.1747 1.3535 0.8332 -
11.9314 208000 0.1974 1.3643 0.8310 -
11.9601 208500 0.1912 1.3955 0.8251 -
11.9888 209000 0.1655 1.3693 0.8313 -
12.0174 209500 0.17 1.3893 0.8306 -
12.0461 210000 0.1523 1.3973 0.8298 -
12.0748 210500 0.1811 1.3727 0.8303 -
12.1035 211000 0.1631 1.4143 0.8299 -
12.1322 211500 0.1686 1.3605 0.8290 -
12.1608 212000 0.1644 1.3652 0.8295 -
12.1895 212500 0.1777 1.3657 0.8259 -
12.2182 213000 0.1618 1.3921 0.8272 -
12.2469 213500 0.1514 1.3664 0.8273 -
12.2756 214000 0.1651 1.3821 0.8249 -
12.3043 214500 0.1696 1.3777 0.8262 -
12.3329 215000 0.1638 1.3712 0.8265 -
12.3616 215500 0.1824 1.3719 0.8271 -
12.3903 216000 0.1646 1.3850 0.8271 -
12.4190 216500 0.1667 1.3704 0.8298 -
12.4477 217000 0.1764 1.3355 0.8307 -
12.4763 217500 0.1703 1.3312 0.8292 -
12.5050 218000 0.166 1.3544 0.8284 -
12.5337 218500 0.1659 1.3639 0.8275 -
12.5624 219000 0.1617 1.3830 0.8277 -
12.5911 219500 0.1557 1.3394 0.8292 -
12.6197 220000 0.1819 1.3540 0.8274 -
12.6484 220500 0.1611 1.3919 0.8254 -
12.6771 221000 0.1824 1.3366 0.8304 -
12.7058 221500 0.1621 1.3581 0.8277 -
12.7345 222000 0.1716 1.3638 0.8302 -
12.7632 222500 0.1604 1.3631 0.8294 -
12.7918 223000 0.1772 1.3577 0.8274 -
12.8205 223500 0.1592 1.3439 0.8290 -
12.8492 224000 0.1619 1.3722 0.8298 -
12.8779 224500 0.1561 1.3631 0.8294 -
12.9066 225000 0.1681 1.3768 0.8301 -
12.9352 225500 0.1666 1.3711 0.8257 -
12.9639 226000 0.1599 1.3684 0.8271 -
12.9926 226500 0.1702 1.3657 0.8260 -
13.0213 227000 0.1487 1.3755 0.8268 -
13.0500 227500 0.1428 1.3684 0.8280 -
13.0786 228000 0.1549 1.4036 0.8271 -
13.1073 228500 0.1623 1.3891 0.8281 -
13.1360 229000 0.1487 1.3758 0.8289 -
13.1647 229500 0.1619 1.3753 0.8283 -
13.1934 230000 0.1567 1.4021 0.8275 -
13.2221 230500 0.1622 1.3753 0.8288 -
13.2507 231000 0.1478 1.3688 0.8295 -
13.2794 231500 0.1482 1.3960 0.8279 -
13.3081 232000 0.1504 1.3679 0.8285 -
13.3368 232500 0.1651 1.3879 0.8270 -
13.3655 233000 0.1392 1.3828 0.8282 -
13.3941 233500 0.1652 1.3710 0.8271 -
13.4228 234000 0.1356 1.3715 0.8272 -
13.4515 234500 0.1515 1.3681 0.8289 -
13.4802 235000 0.1443 1.3815 0.8295 -
13.5089 235500 0.1438 1.3946 0.8280 -
13.5375 236000 0.1419 1.3664 0.8292 -
13.5662 236500 0.1494 1.3847 0.8285 -
13.5949 237000 0.1517 1.3948 0.8269 -
13.6236 237500 0.1522 1.3967 0.8276 -
13.6523 238000 0.152 1.3933 0.8285 -
13.6809 238500 0.1453 1.3879 0.8275 -
13.7096 239000 0.1514 1.4199 0.8263 -
13.7383 239500 0.1457 1.3954 0.8244 -
13.7670 240000 0.1556 1.3833 0.8261 -
13.7957 240500 0.149 1.3795 0.8262 -
13.8244 241000 0.1466 1.3849 0.8272 -
13.8530 241500 0.1417 1.3761 0.8283 -
13.8817 242000 0.149 1.3752 0.8256 -
13.9104 242500 0.1377 1.3687 0.8276 -
13.9391 243000 0.154 1.3717 0.8275 -
13.9678 243500 0.1419 1.3658 0.8286 -
13.9964 244000 0.1406 1.3816 0.8284 -
14.0251 244500 0.1365 1.3771 0.8281 -
14.0538 245000 0.1486 1.3772 0.8286 -
14.0825 245500 0.1344 1.3811 0.8293 -
14.1112 246000 0.1419 1.3766 0.8288 -
14.1398 246500 0.1399 1.3688 0.8275 -
14.1685 247000 0.1226 1.3883 0.8267 -
14.1972 247500 0.1373 1.3843 0.8257 -
14.2259 248000 0.152 1.3978 0.8269 -
14.2546 248500 0.1377 1.3878 0.8260 -
14.2833 249000 0.1455 1.3833 0.8258 -
14.3119 249500 0.1683 1.3903 0.8262 -
14.3406 250000 0.1248 1.4027 0.8259 -
14.3693 250500 0.1371 1.3947 0.8260 -
14.3980 251000 0.1419 1.3897 0.8265 -
14.4267 251500 0.1431 1.3956 0.8262 -
14.4553 252000 0.1485 1.3880 0.8271 -
14.4840 252500 0.1235 1.3851 0.8270 -
14.5127 253000 0.1119 1.3891 0.8268 -
14.5414 253500 0.1297 1.3973 0.8269 -
14.5701 254000 0.1388 1.3949 0.8261 -
14.5987 254500 0.1234 1.3931 0.8263 -
14.6274 255000 0.1392 1.3906 0.8270 -
14.6561 255500 0.1403 1.3913 0.8269 -
14.6848 256000 0.141 1.3999 0.8263 -
14.7135 256500 0.1356 1.3971 0.8264 -
14.7422 257000 0.1352 1.3998 0.8258 -
14.7708 257500 0.1495 1.3989 0.8256 -
14.7995 258000 0.1315 1.3983 0.8257 -
14.8282 258500 0.1097 1.4002 0.8259 -
14.8569 259000 0.125 1.3997 0.8259 -
14.8856 259500 0.1424 1.4013 0.8259 -
14.9142 260000 0.1393 1.3994 0.8259 -
14.9429 260500 0.1441 1.3972 0.8259 -
14.9716 261000 0.1391 1.3988 0.8259 -
-1 -1 - - - 0.7921

Framework Versions

  • Python: 3.13.0
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Model: sobamchan/roberta-base-mrl-768-512-256-128-64 (0.1B params, F32, Safetensors), finetuned from FacebookAI/roberta-base on the all-nli dataset.