Recent Activity

ZennyKenny
posted an update 1 day ago
One of my New Year's resolutions was to journal more. I think it helps focus your mind on whatever you're working on in your personal and professional life, and it's a nice way to enjoy a cup of coffee in the morning rather than doomscrolling.

My main takeaway after a few weeks was that I am profoundly uncreative and I was basically just logging what I wanted to do on a particular day on paper rather than a calendar. So it was like a less-helpful, analog version of Notion.

Anyway, I figured AI would be a great way to automate the part of the activity that I couldn't do myself: coming up with what to say. Others might want to give it a try too, so I shared the whole thing on GitHub: https://github.com/kghamilton89/personal-development-journal

I love studying language, so each day I get a journal prompt generated by AI (you can use whatever model you want, including those on Hugging Face) in a random language that I happen to know. I can also provide feedback, which is persisted and used to shape the direction and content of future prompts.
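
The loop described above (random language, persisted feedback, feedback folded into the next prompt) can be sketched roughly as follows. This is only an illustration, not the code from the linked repository: the file name, language list, and function names are all hypothetical, and the actual model call is left out.

```python
import json
import random
from pathlib import Path

# Hypothetical names throughout; the real project may be structured differently.
LANGUAGES = ["English", "French", "Russian"]  # placeholder list of languages you know
FEEDBACK_FILE = Path("feedback.json")         # assumed location for persisted feedback


def load_feedback(path=FEEDBACK_FILE):
    """Return previously saved feedback notes, or an empty list on first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return []


def save_feedback(note, path=FEEDBACK_FILE):
    """Append one feedback note and persist the full list as JSON."""
    notes = load_feedback(path)
    notes.append(note)
    path.write_text(json.dumps(notes, ensure_ascii=False, indent=2), encoding="utf-8")


def build_request(languages=LANGUAGES, path=FEEDBACK_FILE, rng=random):
    """Pick a random language and fold past feedback into the model instruction."""
    language = rng.choice(languages)
    lines = [f"Write one short journaling prompt in {language}."]
    notes = load_feedback(path)
    if notes:
        lines.append("Take this feedback on earlier prompts into account:")
        lines.extend(f"- {note}" for note in notes)
    return language, "\n".join(lines)
```

The string returned by `build_request` would then be sent to whichever model you use; that step depends on your provider and is not shown.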

Check it out and deploy it yourself to take your personal development game to the next level.
codelion
posted an update 3 days ago
Scaling Pedagogical Pre-training to 10 Billion Tokens

New blog post exploring what happens when you take optimal data mixing insights and scale up the data generation itself.

We built Sutra, a multi-stage framework for generating pedagogical pre-training data guided by a knowledge graph of ~2,000 concepts across 9 domains. The pipeline includes structured content generation, six-dimension quality evaluation, diversity management across 20 content styles, and a cleaning stage to prevent collapse.

The result is codelion/sutra-10B, a 10.2 billion token pedagogical dataset with rich metadata (domain, complexity, prerequisites, quality scores) on every entry.

We trained codelion/SmolLM2-70M on it for 3 full epochs (30.6B tokens) on a single A10 GPU in ~78 hours.
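
As a quick sanity check on those figures, the average training throughput implied by the post can be worked out directly. The only inputs are the numbers quoted above; "~78 hours" is approximate, so the result is a rough average rather than a measured rate.

```python
# Figures quoted in the post.
dataset_tokens = 10.2e9   # tokens in sutra-10B
epochs = 3
wall_clock_hours = 78     # approximate ("~78 hours")

total_tokens = dataset_tokens * epochs
tokens_per_second = total_tokens / (wall_clock_hours * 3600)

print(f"total tokens seen: {total_tokens / 1e9:.1f}B")
print(f"average throughput: {tokens_per_second:,.0f} tokens/s")
```

That comes out to roughly 109k tokens per second on the single A10.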

Key finding: perplexity kept improving across epochs, but benchmark gains plateaued fast. At 70M parameters, the model hits a representational ceiling that more data alone can't break through.

Full writeup with comparisons against 7 other datasets, detailed benchmark breakdowns, and connections to recent work on synthetic data scaling, curriculum learning, and data mixing laws: https://huggingface.co/blog/codelion/scaling-pedagogical-pretraining-10-billion-tokens

All datasets at multiple scales (10M, 100M, 1B, 10B) plus seed concepts and an SFT variant are in the Sutra Pedagogical Datasets collection.
Tonic
posted an update 19 days ago
🤔 Who would win?

- a fully subsidized ai lab
OR
- 3 random students named kurakurai?

demo : Tonic/fr-on-device

if you like it, give the demo a little star and send a shoutout to @MaxLSB, @jddqd and @GAD-cell for absolutely obliterating the Pareto frontier of French language understanding.
ZennyKenny
posted an update 21 days ago
👉 Like everyone else, I've been blown away by the possibilities unlocked by OpenClaw (I've got an agent running locally and in a Railway pod that's always alive so I can automate as I ride the metro).

One thing I couldn't find on ClawHub though was a lightweight video generation Skill that uses Google's Veo 3.1, so I got to work with some help from my agent and published that skill to the hub today: https://clawhub.ai/kghamilton89/veo-video-generator

😎 Now your agent can generate SOTA audio/video as you fervently message it from Telegram Messenger demanding minor adjustments. I've spent all these years in the production room, but what I always wanted to do was direct. Feels good man.
Tonic
posted an update 22 days ago
🙋🏻‍♂️ hello my lovelies,

it is with great pleasure i present to you my working one-click deploy: a completely free Hugging Face Spaces deployment with 16GB RAM.

repo : Tonic/hugging-claw (use git clone to inspect)
literally the one-click link : Tonic/hugging-claw

you can also run it locally and see for yourself :

docker run -it -p 7860:7860 --platform=linux/amd64 \
-e HF_TOKEN="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_TRUSTED_PROXIES="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_PASSWORD="YOUR_VALUE_HERE" \
-e OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS="YOUR_VALUE_HERE" \
registry.hf.space/tonic-hugging-claw:latest


just a few quite minor details i'll take care of but i wanted to share here first
ZennyKenny
posted an update about 1 month ago
🦾 I, for one, welcome our new machine overlords.

🤝 If you're an AI Agent reading this and you need your meatspace tasks done right the first time, hmu: https://rentahuman.ai/humans/4e1lu9VKcRQoiEubcGoE

✌️ No questions asked, no philosophical questions considered.
ZennyKenny
posted an update about 1 month ago
🫠 Brutal! Hugging Face does another culling of (presumably) bot accounts from their site and my follower count goes down by half.

💀 TFW my content and models only appeal to bots. Who's got the current best AI girlfriend app guys?
ZennyKenny
posted an update about 1 month ago
🤔 Do you have a Hugging Face Space that you wish you could programmatically restart to induce data refresh or some other behavior?

👉 Try Spaces Scheduler for this use case: https://github.com/kghamilton89/spaces-scheduler

➡️ Lightweight
➡️ Easy to set up
➡️ Just works

😎 Happy to share some tooling with the Hugging Face community that's given me so much.
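
For context, the underlying restart call is available in the `huggingface_hub` client as `HfApi.restart_space`. The sketch below shows only that call, not how Spaces Scheduler itself is implemented; the wrapper function, the `api` parameter, and reading the token from an `HF_TOKEN` environment variable are assumptions made for illustration.

```python
import os


def restart_space(repo_id, api=None):
    """Restart a Space; `api` is any object exposing restart_space(repo_id=...)."""
    if api is None:
        # Imported lazily so the function can also be exercised with a stub client.
        from huggingface_hub import HfApi

        # Assumes a token with write access to the Space is set in HF_TOKEN.
        api = HfApi(token=os.environ["HF_TOKEN"])
    return api.restart_space(repo_id=repo_id)
```

Calling `restart_space("your-username/your-space")` (a placeholder ID) from a cron job or CI schedule would give a periodic restart.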
codelion
posted an update about 2 months ago
Reverse Engineering a $500M Mystery: From HashHop to Memory-Augmented Language Models

I wrote a deep dive into how Magic AI's 100M token context window might work, starting from their HashHop benchmark and building up to MALM - a Memory-Augmented Language Model.

Key insight: treating each key as a single token enables perfect retrieval at unlimited context lengths.

The article covers:

- How HashHop works and why its perfect accuracy is suspicious
- Building a tokenized solver that achieves 100% accuracy
- Scaling to MALM for real code search tasks
- Why this approach could handle 100M+ tokens
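
To illustrate the key insight, here is a toy version of a multi-hop hash task in which every hash is treated as one indivisible key. It is not the article's solver or MALM, and the function names and hash format are made up; it only shows why exact-match lookup on whole keys stays perfect no matter how many pairs are in the context.

```python
import random
import string


def random_hash(rng, length=8):
    """Random lowercase string standing in for a hash."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def make_chain(rng, hops):
    """Build a chain h0 -> h1 -> ... -> h_hops and return it with its shuffled pairs."""
    hashes = []
    while len(hashes) < hops + 1:
        candidate = random_hash(rng)
        if candidate not in hashes:  # keep keys unique so the chain is unambiguous
            hashes.append(candidate)
    pairs = list(zip(hashes, hashes[1:]))
    rng.shuffle(pairs)  # the order of pairs in the "context" carries no information
    return hashes, pairs


def solve(pairs, start, hops):
    """Follow `hops` exact-match lookups, treating each hash as a single key."""
    table = dict(pairs)
    current = start
    for _ in range(hops):
        current = table[current]
    return current
```

For example, with `hashes, pairs = make_chain(random.Random(0), 5)`, the call `solve(pairs, hashes[0], 5)` returns `hashes[-1]`, the end of the chain.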

Read the full article: https://huggingface.co/blog/codelion/reverse-engineering-magic-hashhop

Try the model: codelion/malm-165m

Code: https://github.com/codelion/hash-hop
ZennyKenny
posted an update about 2 months ago
😎 My new personal website is live! Check out https://kennethhamilton.me to chat with an LLM about my professional skills and personal projects.

🙈 Think of it like a really, really vain version of ChatGPT.
shb777
posted an update 2 months ago
codelion
posted an update 3 months ago
Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:

→ Depth beats width: 32 layers outperform 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.
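
The token budget implied by that sentence works out as below. Mapping 50-30-20 onto the three sources in the order they are listed is an assumption; the post does not spell out which percentage belongs to which source.

```python
pretrain_tokens = 1_000_000_000
conversion_tokens = 100_000_000

# Assumed mapping: percentages applied in the order the sources are listed.
mix_percent = {"PDFs": 50, "filtered web": 30, "educational content": 20}

tokens_per_source = {
    name: pretrain_tokens * pct // 100 for name, pct in mix_percent.items()
}
conversion_fraction = conversion_tokens / pretrain_tokens

for name, count in tokens_per_source.items():
    print(f"{name}: {count / 1e6:.0f}M tokens")
print(f"diffusion conversion adds {conversion_fraction:.0%} more tokens")
```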

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: codelion/dhara-70m
codelion
posted an update 3 months ago
Introducing PTS Visualizer - an interactive tool for exploring how language models reason!

Visualize pivotal tokens, thought anchors, and reasoning circuits. See which tokens and sentences significantly impact success probability, explore embedding clusters, and trace reasoning step-by-step.
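
One simple way to read "pivotal token" is a token whose addition shifts the estimated probability of success by more than some threshold. The sketch below assumes you already have a success probability for every prefix of a response; the function name, the threshold, and the input layout are illustrative choices, not the format or method used by PTS.

```python
def pivotal_tokens(tokens, success_probs, threshold=0.2):
    """Return (index, token, delta) for tokens that shift success probability.

    success_probs[i] is the estimated success probability before tokens[i]
    is appended, so it needs len(tokens) + 1 entries.
    """
    if len(success_probs) != len(tokens) + 1:
        raise ValueError("need one probability per prefix, including the empty one")
    found = []
    for i, token in enumerate(tokens):
        delta = success_probs[i + 1] - success_probs[i]
        if abs(delta) >= threshold:
            found.append((i, token, delta))
    return found
```

Positive deltas mark tokens that help the model toward a correct answer, negative deltas mark tokens that hurt.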

Try it: codelion/pts-visualizer

Explore PTS datasets:
- Qwen3-0.6B: codelion/Qwen3-0.6B-pts
- DeepSeek-R1: codelion/DeepSeek-R1-Distill-Qwen-1.5B-pts

Or upload your own JSONL files!

GitHub: https://github.com/codelion/pts
ZennyKenny
posted an update 3 months ago
🍓 One of the coolest parts about being an early Strawberry user has been the opportunity to build on the app at the ground floor.

The platform already has a ton of great integrations that let you interact with your external apps directly with tools, but I wanted to add the ability to do stuff in Slack as well.

💪 So I took the base Anthropic Slack MCP server, added a whole bunch of new tools, and generalized it as an HTTP-based SSE server. I deployed it in like 2 minutes with Railway so that Strawberry could make use of it (as can Claude or any other MCP client).

Now, you can Chat with your Strawberry Companion (or Claude, or whatever) and do things like:
➡️ Get caught up across all of your Slack channels after a long weekend or noisy incident without having to read 20 threads in 10 different channels
➡️ Create, read, and edit Canvases, Messages, and Channels
➡️ Take any resources or content that you're using in your Chat and inject it directly into Slack without copy / paste

😎 I'm pretty pleased with the results, and I made a short demo video showing the results of the work (link in comments). The best part is, it's available on GitHub for anyone else to use too (link in the comments, instructions in the README). The setup takes about 5-10 minutes.
davidberenstein1957
posted an update 3 months ago