Ethosoft/NedoTurkishTokenizer

Token Classification
Transformers
Turkish
nedo-turkish-tokenizer
tokenizer
morphology
turkish
nlp
NedoTurkishTokenizer
3.88 MB
  • 4 contributors
History: 39 commits
nmstech
remove embedded hugging face tokens
3a2f322 about 16 hours ago
  • nedo_turkish_tokenizer
    merge github main before publish about 16 hours ago
  • results
    zemberek cleanup 25 days ago
  • tests
    use zemberek-python and add regression tests about 17 hours ago
  • .gitattributes
    1.64 kB
    Zemberek added via LFS 25 days ago
  • .gitignore
    75 Bytes
    Fix broken placeholder mechanism: replace with segment-based tokenization 27 days ago
  • README.md
    6.27 kB
    Upload folder using huggingface_hub 23 days ago
  • hf_benchmark.py
    13.9 kB
    remove embedded hugging face tokens about 16 hours ago
  • paper_baseline_check.py
    3.6 kB
    remove embedded hugging face tokens about 16 hours ago
  • pyproject.toml
    1.17 kB
    use zemberek-python and add regression tests about 17 hours ago
  • test_lattice.py
    2.76 kB
    Migrate to zemberek-python, remove JVM dependency and 31MB JAR, apply O(N^2) init fix 24 days ago
  • tokenization_nedo_turkish.py
    6.45 kB
    Rename project from TurkTokenizer to NedoTurkishTokenizer 25 days ago
  • tokenizer_config.json
    398 Bytes
    use zemberek-python and add regression tests about 17 hours ago