SentenceTransformer based on google/embeddinggemma-300m

This is a sentence-transformers model finetuned from google/embeddinggemma-300m on the sebenx_triplets dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google/embeddinggemma-300m
  • Maximum Sequence Length: 2048 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: sebenx_triplets

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
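
In other words, the mean-pooled token embeddings are projected through the two Dense layers (768 → 3072 → 768) and then L2-normalized, so every output vector has unit length. A minimal sketch to check this (the example sentence is arbitrary):

from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("s7d11/SEmbedv1-0.3b")
embedding = model.encode(["An arbitrary example sentence."])
print(embedding.shape)                    # (1, 768)
print(np.linalg.norm(embedding, axis=1))  # ~1.0, due to the final Normalize() module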

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("s7d11/SEmbedv1-0.3b")
# Run inference
queries = [
    "Lon b\u025b\u025b, a b\u025b feere damina kabini s\u0254g\u0254ma f\u0254 su f\u025b.",
]
documents = [
    'best tak dia?',
    'Sankasojanw camaw lajɛliw cɛkaɲi ani lajɛli kabɔ sankasojanw dɔ sanfɛlala walima kabɔ finɛtiri dɔla min pozisiyɔn kaɲi be seka kɛ fɛn cɛɲuman ye ka lajɛ.',
    'Pua ka wiliwili nanahu ka mano …',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.6840, 0.3838, 0.8201]])
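
Because the model was trained with MatryoshkaLoss on dimensions 768, 512, and 256 (see Training Details below), the embeddings can also be truncated to a smaller size. A minimal sketch using the truncate_dim argument of SentenceTransformer, assuming the 256-dimensional setting (the smallest dimension used during training):

from sentence_transformers import SentenceTransformer

# Load the model so that every embedding is truncated to 256 dimensions
model = SentenceTransformer("s7d11/SEmbedv1-0.3b", truncate_dim=256)

queries = ["Lon bɛɛ, a bɛ feere damina kabini sɔgɔma fɔ su fɛ."]
documents = ["best tak dia?", "Pua ka wiliwili nanahu ka mano …"]

query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 256] [2, 256]

similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities.shape)
# [1, 2]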

Training Details

Training Dataset

sebenx_triplets

  • Dataset: sebenx_triplets at a152d73
  • Size: 2,703,977 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 4 tokens, mean 40.61 tokens, max 212 tokens
    • positive (string): min 4 tokens, mean 40.27 tokens, max 153 tokens
    • negative (string): min 3 tokens, mean 39.66 tokens, max 155 tokens
  • Samples:
    anchor positive negative
    kalata fan wɛrɛw fɛ, kɛmɛ sarada jabi 29 hakilina ye Australie ka kɛ bɛjɛnjamana ye teliyara, waati minna kɛmɛ sarada 31 hakilina ye Australie man kan abada ka kɛ bɛjɛnjamana ye. Tile minna ko nunuw kɛra be wele forobala ko Sanfɛ Cɛmancɛ Waati, Erɔpu tariku waati sankɛmɛsigi 11na, 12na ni 13na (san 1000-1300 Yesu bangeli kɔfɛ). Nɛnɛ damatɛmɛlen be seka kɛ nisongoyala: goniyajakɛ be jigin cogogɛlɛ jali duguma, nka fiyɛn ani sumaya be faara ɲɔgɔnkan ka nɛnɛ tɔ juguya ka tɛmɛ goniyahakɛ sumana be min fɔ kan.
    a mɛn ka kɛ tile kelen ye, walima fila,. A tigi dalen ka kan ka to tile caman. Nin ye mun ɲεnajε ye ?
    Dja ko foyi tèkai ni sababou tala. interj NZ a Māori greeting to two. galoda Sinamuso jugu Nis kasara Gawo taama bɛnkan Ngɔnikɔrɔ bama bamukan Dinbal Fakɔkuru Lolɛ Ncininna Erɔp ntonlan.
  • Loss: MatryoshkaLoss with these parameters (a construction sketch follows this list):
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256
        ],
        "matryoshka_weights": [
            1,
            0.3,
            0.1
        ],
        "n_dims_per_step": -1
    }
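
For reference, this corresponds roughly to the following loss construction in Sentence Transformers (a sketch; the base model is loaded here only to make the snippet self-contained):

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("google/embeddinggemma-300m")

# The in-batch-negatives ranking loss is also applied to truncated
# 512- and 256-dimensional views of the embeddings, with decreasing weights.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256],
    matryoshka_weights=[1, 0.3, 0.1],
)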
    

Evaluation Dataset

sebenx_triplets

  • Dataset: sebenx_triplets at a152d73
  • Size: 271 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 271 samples:
    • anchor (string): min 4 tokens, mean 41.12 tokens, max 141 tokens
    • positive (string): min 4 tokens, mean 41.09 tokens, max 131 tokens
    • negative (string): min 4 tokens, mean 41.08 tokens, max 149 tokens
  • Samples:
    anchor positive negative
    Kaladen kɔrɔ dɔ ko 'a tun be kumakan langolo fɔ kalanso kɔnɔ, ka jalali dɔnniw kalankɛ nɔbilaw la an'a tun be inafɔ kalandenw teri .'. Kabini o tuma na Sini jamana sɔrɔ yiriwara siyɛn 90. Nɔremu 802.11n be barakƐ fiɲƐsira 2,4 Ghz ni 5 Ghz kan.
    maure.» banxanxalle Noun. sore; dartre. Category: 2.5. Healthy. sim: fanqalelle. Bambara, dioula, malinké
    Barabara cia bũrũri itirĩ umithio mũnene angĩkoro irateithia mĩtoka mĩnini, kwa ũguo no nginya mĩtaratara ya gũthondeka na kũnyihia thogora ya cio ĩthondeko. Sɔrɔboso yɛlɛmabolo fɔlɔ in kɛra Deng Xiaoping ka fanga kɔnɔ. jàmanakuntigi ka Irisilataga Krowasiya Shipuru 2025-05 Mùneyisa 2025-04 Bilgariya Bɔsni 2025-03 2025 Selincinin Ne bɛ n.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256
        ],
        "matryoshka_weights": [
            1,
            0.3,
            0.1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 1
  • gradient_accumulation_steps: 8
  • learning_rate: 5e-06
  • weight_decay: 0.01
  • max_steps: 5000
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • prompts: task: sentence similarity | query:
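
A sketch of how these values map onto SentenceTransformerTrainingArguments; the output_dir is a placeholder, everything else is taken from the list above:

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output/SEmbedv1-0.3b",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,      # effective batch size of 8
    learning_rate=5e-6,
    weight_decay=0.01,
    max_steps=5000,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    prompts="task: sentence similarity | query:",
)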

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 1
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 8
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-06
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: 5000
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: task: sentence similarity | query:
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0000 10 1.3857 -
0.0001 20 1.3352 -
0.0001 30 1.1998 -
0.0001 40 1.1487 -
0.0001 50 1.0881 -
0.0002 60 1.0239 -
0.0002 70 1.0615 -
0.0002 80 0.9406 -
0.0003 90 1.0496 -
0.0003 100 0.9214 -
0.0003 110 0.953 -
0.0004 120 0.8999 -
0.0004 130 1.0409 -
0.0004 140 0.9998 -
0.0004 150 0.883 -
0.0005 160 0.9923 -
0.0005 170 0.8498 -
0.0005 180 0.7824 -
0.0006 190 0.8451 -
0.0006 200 0.846 -
0.0006 210 0.8094 -
0.0007 220 0.9448 -
0.0007 230 0.9072 -
0.0007 240 0.7502 -
0.0007 250 0.7043 -
0.0008 260 0.777 -
0.0008 270 0.9626 -
0.0008 280 1.0501 -
0.0009 290 0.7489 -
0.0009 300 0.8305 -
0.0009 310 0.6411 -
0.0009 320 0.8251 -
0.0010 330 0.6632 -
0.0010 340 0.8039 -
0.0010 350 0.6465 -
0.0011 360 0.9541 -
0.0011 370 0.8584 -
0.0011 380 0.8907 -
0.0012 390 0.7243 -
0.0012 400 0.7592 -
0.0012 410 0.8199 -
0.0012 420 0.7394 -
0.0013 430 0.7053 -
0.0013 440 0.7955 -
0.0013 450 0.8382 -
0.0014 460 0.7608 -
0.0014 470 0.7308 -
0.0014 480 0.718 -
0.0014 490 0.6998 -
0.0015 500 0.8602 -
0.0015 510 0.9128 -
0.0015 520 0.6842 -
0.0016 530 0.8519 -
0.0016 540 0.9339 -
0.0016 550 0.8003 -
0.0017 560 0.7795 -
0.0017 570 0.8377 -
0.0017 580 0.5871 -
0.0017 590 0.7091 -
0.0018 600 0.79 -
0.0018 610 0.9473 -
0.0018 620 0.8671 -
0.0019 630 0.7165 -
0.0019 640 0.8825 -
0.0019 650 0.8347 -
0.0020 660 0.6016 -
0.0020 670 0.8572 -
0.0020 680 0.772 -
0.0020 690 0.7865 -
0.0021 700 0.9205 -
0.0021 710 0.8145 -
0.0021 720 0.783 -
0.0022 730 0.7304 -
0.0022 740 0.7809 -
0.0022 750 0.8504 -
0.0022 760 0.6971 -
0.0023 770 0.8535 -
0.0023 780 0.7312 -
0.0023 790 0.6701 -
0.0024 800 0.7899 -
0.0024 810 0.7688 -
0.0024 820 0.7493 -
0.0025 830 0.6789 -
0.0025 840 0.8506 -
0.0025 850 0.7875 -
0.0025 860 0.653 -
0.0026 870 0.7847 -
0.0026 880 0.7561 -
0.0026 890 0.6186 -
0.0027 900 0.6631 -
0.0027 910 0.7191 -
0.0027 920 0.666 -
0.0028 930 0.7304 -
0.0028 940 0.9292 -
0.0028 950 0.5899 -
0.0028 960 0.8594 -
0.0029 970 0.6226 -
0.0029 980 0.6514 -
0.0029 990 0.5353 -
0.0030 1000 0.5977 -
0.0030 1010 0.7201 -
0.0030 1020 0.8047 -
0.0030 1030 0.5933 -
0.0031 1040 0.7361 -
0.0031 1050 0.6687 -
0.0031 1060 0.8877 -
0.0032 1070 0.722 -
0.0032 1080 0.6555 -
0.0032 1090 0.7812 -
0.0033 1100 0.6015 -
0.0033 1110 0.7915 -
0.0033 1120 0.6999 -
0.0033 1130 0.6956 -
0.0034 1140 0.5105 -
0.0034 1150 0.7155 -
0.0034 1160 0.6233 -
0.0035 1170 0.9316 -
0.0035 1180 0.6544 -
0.0035 1190 0.6487 -
0.0036 1200 0.6459 -
0.0036 1210 0.8283 -
0.0036 1220 0.7653 -
0.0036 1230 0.7429 -
0.0037 1240 0.6253 -
0.0037 1250 0.7739 3.2216
0.0037 1260 0.5848 -
0.0038 1270 0.6655 -
0.0038 1280 0.6969 -
0.0038 1290 0.8055 -
0.0038 1300 0.5927 -
0.0039 1310 0.6252 -
0.0039 1320 0.7125 -
0.0039 1330 0.547 -
0.0040 1340 0.7427 -
0.0040 1350 0.8153 -
0.0040 1360 0.6979 -
0.0041 1370 0.6194 -
0.0041 1380 0.6441 -
0.0041 1390 0.5782 -
0.0041 1400 0.6529 -
0.0042 1410 0.8555 -
0.0042 1420 0.7904 -
0.0042 1430 0.5629 -
0.0043 1440 0.6203 -
0.0043 1450 0.6226 -
0.0043 1460 0.7346 -
0.0043 1470 0.8153 -
0.0044 1480 0.7878 -
0.0044 1490 0.769 -
0.0044 1500 0.6265 -
0.0045 1510 0.5634 -
0.0045 1520 0.5782 -
0.0045 1530 0.6093 -
0.0046 1540 0.6989 -
0.0046 1550 0.7951 -
0.0046 1560 0.6215 -
0.0046 1570 0.7065 -
0.0047 1580 0.6772 -
0.0047 1590 0.5745 -
0.0047 1600 0.6832 -
0.0048 1610 0.6318 -
0.0048 1620 0.7641 -
0.0048 1630 0.8019 -
0.0049 1640 0.7143 -
0.0049 1650 0.6369 -
0.0049 1660 0.6575 -
0.0049 1670 0.6055 -
0.0050 1680 0.675 -
0.0050 1690 0.5365 -
0.0050 1700 0.5092 -
0.0051 1710 0.7284 -
0.0051 1720 0.7647 -
0.0051 1730 0.5493 -
0.0051 1740 0.5061 -
0.0052 1750 0.5138 -
0.0052 1760 0.7677 -
0.0052 1770 0.5683 -
0.0053 1780 0.6337 -
0.0053 1790 0.5645 -
0.0053 1800 0.4971 -
0.0054 1810 0.7195 -
0.0054 1820 0.4615 -
0.0054 1830 0.7374 -
0.0054 1840 0.5524 -
0.0055 1850 0.7127 -
0.0055 1860 0.6545 -
0.0055 1870 0.6168 -
0.0056 1880 0.6194 -
0.0056 1890 0.4979 -
0.0056 1900 0.7268 -
0.0057 1910 0.4508 -
0.0057 1920 0.7093 -
0.0057 1930 0.6059 -
0.0057 1940 0.5363 -
0.0058 1950 0.659 -
0.0058 1960 0.5137 -
0.0058 1970 0.6528 -
0.0059 1980 0.6183 -
0.0059 1990 0.8905 -
0.0059 2000 0.6576 -
0.0059 2010 0.6094 -
0.0060 2020 0.5005 -
0.0060 2030 0.5099 -
0.0060 2040 0.631 -
0.0061 2050 0.4429 -
0.0061 2060 0.5831 -
0.0061 2070 0.6217 -
0.0062 2080 0.5121 -
0.0062 2090 0.5428 -
0.0062 2100 0.62 -
0.0062 2110 0.5721 -
0.0063 2120 0.5665 -
0.0063 2130 0.4057 -
0.0063 2140 0.7022 -
0.0064 2150 0.7608 -
0.0064 2160 0.6097 -
0.0064 2170 0.5711 -
0.0064 2180 0.4813 -
0.0065 2190 0.6525 -
0.0065 2200 0.6782 -
0.0065 2210 0.5661 -
0.0066 2220 0.754 -
0.0066 2230 0.6587 -
0.0066 2240 0.5377 -
0.0067 2250 0.8553 -
0.0067 2260 0.4283 -
0.0067 2270 0.6733 -
0.0067 2280 0.6693 -
0.0068 2290 0.5919 -
0.0068 2300 0.5743 -
0.0068 2310 0.7105 -
0.0069 2320 0.4436 -
0.0069 2330 0.6323 -
0.0069 2340 0.5959 -
0.0070 2350 0.6491 -
0.0070 2360 0.7986 -
0.0070 2370 0.5997 -
0.0070 2380 0.4897 -
0.0071 2390 0.5401 -
0.0071 2400 0.7304 -
0.0071 2410 0.5874 -
0.0072 2420 0.5637 -
0.0072 2430 0.5432 -
0.0072 2440 0.5799 -
0.0072 2450 0.5674 -
0.0073 2460 0.846 -
0.0073 2470 0.6006 -
0.0073 2480 0.5279 -
0.0074 2490 0.706 -
0.0074 2500 0.5741 3.0077
0.0074 2510 0.5416 -
0.0075 2520 0.448 -
0.0075 2530 0.5437 -
0.0075 2540 0.662 -
0.0075 2550 0.6424 -
0.0076 2560 0.682 -
0.0076 2570 0.6211 -
0.0076 2580 0.5738 -
0.0077 2590 0.5747 -
0.0077 2600 0.959 -
0.0077 2610 0.56 -
0.0078 2620 0.6612 -
0.0078 2630 0.5008 -
0.0078 2640 0.4839 -
0.0078 2650 0.6241 -
0.0079 2660 0.6323 -
0.0079 2670 0.6601 -
0.0079 2680 0.517 -
0.0080 2690 0.6023 -
0.0080 2700 0.5601 -
0.0080 2710 0.611 -
0.0080 2720 0.7261 -
0.0081 2730 0.515 -
0.0081 2740 0.5517 -
0.0081 2750 0.5843 -
0.0082 2760 0.4607 -
0.0082 2770 0.5416 -
0.0082 2780 0.6806 -
0.0083 2790 0.6127 -
0.0083 2800 0.6366 -
0.0083 2810 0.6962 -
0.0083 2820 0.4876 -
0.0084 2830 0.7263 -
0.0084 2840 0.5974 -
0.0084 2850 0.4835 -
0.0085 2860 0.4579 -
0.0085 2870 0.429 -
0.0085 2880 0.4439 -
0.0086 2890 0.5631 -
0.0086 2900 0.6307 -
0.0086 2910 0.5138 -
0.0086 2920 0.617 -
0.0087 2930 0.5033 -
0.0087 2940 0.6152 -
0.0087 2950 0.5089 -
0.0088 2960 0.4937 -
0.0088 2970 0.5528 -
0.0088 2980 0.5194 -
0.0088 2990 0.772 -
0.0089 3000 0.5303 -
0.0089 3010 0.565 -
0.0089 3020 0.5464 -
0.0090 3030 0.6153 -
0.0090 3040 0.5965 -
0.0090 3050 0.712 -
0.0091 3060 0.4347 -
0.0091 3070 0.4398 -
0.0091 3080 0.6925 -
0.0091 3090 0.8619 -
0.0092 3100 0.7581 -
0.0092 3110 0.8109 -
0.0092 3120 0.4329 -
0.0093 3130 0.4853 -
0.0093 3140 0.5674 -
0.0093 3150 0.6655 -
0.0093 3160 0.48 -
0.0094 3170 0.3521 -
0.0094 3180 0.5814 -
0.0094 3190 0.4354 -
0.0095 3200 0.6543 -
0.0095 3210 0.5167 -
0.0095 3220 0.8639 -
0.0096 3230 0.48 -
0.0096 3240 0.6677 -
0.0096 3250 0.6518 -
0.0096 3260 0.5602 -
0.0097 3270 0.589 -
0.0097 3280 0.6361 -
0.0097 3290 0.6589 -
0.0098 3300 0.5138 -
0.0098 3310 0.5356 -
0.0098 3320 0.533 -
0.0099 3330 0.6241 -
0.0099 3340 0.6112 -
0.0099 3350 0.5351 -
0.0099 3360 0.4903 -
0.0100 3370 0.4544 -
0.0100 3380 0.4495 -
0.0100 3390 0.4382 -
0.0101 3400 0.5671 -
0.0101 3410 0.4735 -
0.0101 3420 0.638 -
0.0101 3430 0.5626 -
0.0102 3440 0.4754 -
0.0102 3450 0.4749 -
0.0102 3460 0.4778 -
0.0103 3470 0.3425 -
0.0103 3480 0.5415 -
0.0103 3490 0.5165 -
0.0104 3500 0.6016 -
0.0104 3510 0.5639 -
0.0104 3520 0.8738 -
0.0104 3530 0.5062 -
0.0105 3540 0.4332 -
0.0105 3550 0.8084 -
0.0105 3560 0.7191 -
0.0106 3570 0.5944 -
0.0106 3580 0.6997 -
0.0106 3590 0.63 -
0.0107 3600 0.4186 -
0.0107 3610 0.5776 -
0.0107 3620 0.4875 -
0.0107 3630 0.5769 -
0.0108 3640 0.509 -
0.0108 3650 0.5627 -
0.0108 3660 0.5159 -
0.0109 3670 0.6378 -
0.0109 3680 0.4965 -
0.0109 3690 0.5775 -
0.0109 3700 0.657 -
0.0110 3710 0.7192 -
0.0110 3720 0.3836 -
0.0110 3730 0.6142 -
0.0111 3740 0.4774 -
0.0111 3750 0.5099 2.9023
0.0111 3760 0.6325 -
0.0112 3770 0.6974 -
0.0112 3780 0.5958 -
0.0112 3790 0.5643 -
0.0112 3800 0.5476 -
0.0113 3810 0.3828 -
0.0113 3820 0.7134 -
0.0113 3830 0.5593 -
0.0114 3840 0.4622 -
0.0114 3850 0.4911 -
0.0114 3860 0.7652 -
0.0114 3870 0.4124 -
0.0115 3880 0.7257 -
0.0115 3890 0.459 -
0.0115 3900 0.4988 -
0.0116 3910 0.5146 -
0.0116 3920 0.5613 -
0.0116 3930 0.6893 -
0.0117 3940 0.4245 -
0.0117 3950 0.4426 -
0.0117 3960 0.8301 -
0.0117 3970 0.3732 -
0.0118 3980 0.516 -
0.0118 3990 0.445 -
0.0118 4000 0.838 -
0.0119 4010 0.6627 -
0.0119 4020 0.3563 -
0.0119 4030 0.532 -
0.0120 4040 0.7707 -
0.0120 4050 0.5832 -
0.0120 4060 0.5266 -
0.0120 4070 0.5309 -
0.0121 4080 0.6722 -
0.0121 4090 0.5141 -
0.0121 4100 0.4724 -
0.0122 4110 0.7266 -
0.0122 4120 0.4685 -
0.0122 4130 0.4988 -
0.0122 4140 0.4194 -
0.0123 4150 0.4976 -
0.0123 4160 0.5164 -
0.0123 4170 0.6077 -
0.0124 4180 0.6547 -
0.0124 4190 0.6342 -
0.0124 4200 0.5514 -
0.0125 4210 0.4814 -
0.0125 4220 0.4895 -
0.0125 4230 0.7219 -
0.0125 4240 0.5481 -
0.0126 4250 0.4702 -
0.0126 4260 0.7058 -
0.0126 4270 0.3936 -
0.0127 4280 0.6489 -
0.0127 4290 0.5032 -
0.0127 4300 0.5088 -
0.0128 4310 0.523 -
0.0128 4320 0.4418 -
0.0128 4330 0.583 -
0.0128 4340 0.564 -
0.0129 4350 0.6308 -
0.0129 4360 0.5444 -
0.0129 4370 0.5474 -
0.0130 4380 0.4261 -
0.0130 4390 0.5347 -
0.0130 4400 0.6137 -
0.0130 4410 0.4739 -
0.0131 4420 0.5185 -
0.0131 4430 0.4315 -
0.0131 4440 0.5913 -
0.0132 4450 0.5222 -
0.0132 4460 0.4818 -
0.0132 4470 0.5603 -
0.0133 4480 0.6157 -
0.0133 4490 0.6436 -
0.0133 4500 0.6227 -
0.0133 4510 0.4639 -
0.0134 4520 0.6379 -
0.0134 4530 0.5369 -
0.0134 4540 0.4951 -
0.0135 4550 0.5235 -
0.0135 4560 0.5048 -
0.0135 4570 0.4953 -
0.0136 4580 0.6981 -
0.0136 4590 0.5543 -
0.0136 4600 0.5432 -
0.0136 4610 0.4719 -
0.0137 4620 0.5418 -
0.0137 4630 0.7021 -
0.0137 4640 0.5176 -
0.0138 4650 0.459 -
0.0138 4660 0.6334 -
0.0138 4670 0.4691 -
0.0138 4680 0.4473 -
0.0139 4690 0.474 -
0.0139 4700 0.5297 -
0.0139 4710 0.6543 -
0.0140 4720 0.5651 -
0.0140 4730 0.5072 -
0.0140 4740 0.5961 -
0.0141 4750 0.5262 -
0.0141 4760 0.6235 -
0.0141 4770 0.4718 -
0.0141 4780 0.497 -
0.0142 4790 0.5046 -
0.0142 4800 0.5694 -
0.0142 4810 0.4202 -
0.0143 4820 0.6833 -
0.0143 4830 0.6341 -
0.0143 4840 0.5327 -
0.0143 4850 0.4914 -
0.0144 4860 0.6098 -
0.0144 4870 0.4093 -
0.0144 4880 0.5317 -
0.0145 4890 0.5809 -
0.0145 4900 0.3474 -
0.0145 4910 0.4408 -
0.0146 4920 0.4957 -
0.0146 4930 0.5085 -
0.0146 4940 0.5089 -
0.0146 4950 0.6008 -
0.0147 4960 0.3984 -
0.0147 4970 0.489 -
0.0147 4980 0.3918 -
0.0148 4990 0.6235 -
0.0148 5000 0.5644 2.8565

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.2.0
  • Transformers: 4.57.3
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.4.2
  • Tokenizers: 0.22.1
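
To reproduce this environment, the listed package versions can be pinned at install time (PyTorch 2.9.1 with CUDA 12.8 is best installed separately, following the official PyTorch instructions for your platform):

pip install "sentence-transformers==5.2.0" "transformers==4.57.3" "accelerate==1.12.0" "datasets==4.4.2" "tokenizers==0.22.1"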

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}