Zixi "Oz" Li
OzTianlu
AI & ML interests
My research focuses on deep reasoning with small language models, Transformer architecture innovation, and knowledge distillation for efficient alignment and transfer.
Recent Activity
reacted to Parveshiiii's post with 🔥 about 2 hours ago
Wanna train your own AI Model or Tokenizer from scratch?
Building models isn't just for big labs anymore: with the right data, compute, and workflow, you can create **custom AI models** and **tokenizers** tailored to any domain. Whether it's NLP, domain-specific datasets, or experimental architectures, training from scratch gives you full control over vocabulary, embeddings, and performance.
✨ Why train your own?
- Full control over vocabulary & tokenization
- Domain-specific optimization (medical, legal, technical, etc.)
- Better performance on niche datasets
- Freedom to experiment with architectures
⚡ The best part?
- Tokenizer training (TikToken / BPE) can be done in **just 3 lines of code**; see the sketch right after this list.
- Model training runs smoothly on **Google Colab notebooks**, so no expensive hardware is required.
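The post doesn't tie the "few lines" claim to a specific library, so as a hedged illustration, here is a minimal BPE-training sketch using the Hugging Face `tokenizers` library; the corpus file, vocab size, and special tokens are placeholders, not values from the linked repos.

```python
# Minimal BPE tokenizer training sketch (Hugging Face `tokenizers` library).
# "corpus.txt", the vocab size, and the special tokens are placeholder assumptions.
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))       # byte-pair-encoding model
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()      # split on whitespace/punctuation
trainer = trainers.BpeTrainer(vocab_size=8000, special_tokens=["[UNK]", "[PAD]"])

tokenizer.train(["corpus.txt"], trainer)                   # learn merges from your own corpus
tokenizer.save("my_tokenizer.json")
print(tokenizer.encode("train a tokenizer from scratch").tokens)
```

The saved file can later be reloaded with `Tokenizer.from_file("my_tokenizer.json")`, which is handy when wiring the tokenizer into a model trained from scratch.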
Try out my work:
- https://github.com/OE-Void/Tokenizer-from_scratch
- https://github.com/OE-Void/GPT
liked a model about 17 hours ago
NoesisLab/NanoHammer-1.5B-Instruct
reacted to their post with 🔥 about 18 hours ago
NanoHammer-1.5B-Instruct:
https://huggingface.co/NoesisLab/NanoHammer-1.5B-Instruct
We are excited to introduce NanoHammer, a novel architecture by NoesisLab designed for Causal State Compression and true Linear Inference Complexity.
🧠 The Core: Holographic State Space
Forget the growing KV Cache. NanoHammer leverages Holographic Rotary Embeddings to compress sequence history into a dynamic integral state.
Polynomial Compression: Instead of storing raw history, we "integrate" context into a complex-number space, treating memory as a container of evolving polynomial coefficients.
Dynamic Evolution: The architecture features a custom StateUpdateCell that uses Euler-method fixed-point iteration, allowing the model to perform implicit reasoning via differential state updates.
⚡ Why It Matters: Efficiency Meets Reasoning
- O(1) Inference Memory: State size remains constant regardless of sequence length.
- Causal Modeling: Explicitly models the causal flow of logic through time, perfect for "implicit reasoning" tasks without the verbosity of Chain-of-Thought.
- 1.5B Lightweight Design: High performance, low resource footprint.
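The post doesn't include NanoHammer's code, so the following is only a conceptual sketch of a constant-size state cell refined by Euler-style fixed-point steps; the class name, shapes, and update rule are illustrative assumptions, not the released StateUpdateCell.

```python
# Conceptual sketch only: illustrates a fixed-size state updated per token with a few
# explicit Euler steps, so memory stays constant regardless of sequence length.
import torch
import torch.nn as nn

class ToyStateUpdateCell(nn.Module):
    """Fixed-size state refined with Euler-style fixed-point steps (illustrative only)."""
    def __init__(self, d_model: int, n_steps: int = 3, step_size: float = 0.1):
        super().__init__()
        self.f = nn.Linear(2 * d_model, d_model)     # drift function f(state, token)
        self.n_steps = n_steps
        self.step_size = step_size

    def forward(self, state: torch.Tensor, token: torch.Tensor) -> torch.Tensor:
        # state, token: (batch, d_model); the state never grows with sequence length
        for _ in range(self.n_steps):                 # fixed-point refinement loop
            drift = torch.tanh(self.f(torch.cat([state, token], dim=-1)))
            state = state + self.step_size * drift    # explicit Euler update
        return state

cell = ToyStateUpdateCell(d_model=64)
state = torch.zeros(1, 64)                            # one state vector per sequence
for token in torch.randn(128, 1, 64):                 # walk a length-128 token stream
    state = cell(state, token)
print(state.shape)                                    # torch.Size([1, 64])
```

The point of the toy loop is the memory claim: however long the sequence, only a single `(batch, d_model)` state tensor is kept around, in contrast to a KV cache that grows with every token.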
Model Card Highlights
Type: nanohammer (Hybrid Causal-State Architecture)
License: Apache 2.0
Capabilities: Instruction following, Long-context handling
Try it on Hugging Face: https://huggingface.co/NoesisLab/NanoHammer-1.5B-Instruct
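Loading code isn't shown in the post; as a hedged sketch, a custom Hub architecture like this is usually loaded through `transformers` with `trust_remote_code=True`, and the prompt and generation settings below are placeholders rather than values from the model card.

```python
# Hedged usage sketch: trust_remote_code and the generation parameters are assumptions,
# not confirmed settings for NanoHammer; check the model card before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NoesisLab/NanoHammer-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Explain causal state compression in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```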