welcome

contents

i'm tokenbender. i love working with large language models.

background

formerly a technical lead at intel india, i led high-impact projects in the server division and built methods that saved significant capital and labour while we operated under a 5% in-office workforce cap during the covid lockdowns. i developed a deep interest in nlp and now focus on end-to-end llm pipelines: orchestration, dataset curation, filtering, reweighting, and multilingual alignment (indic languages).

this is how my journey has been so far in the post-chatgpt world.

key achievements

codecherrypop: built from llama 2 7b, the first useful small coding model to gain significant attention. linkedin announcement

chai ai success: held a top 5 model ranking for multiple months in the chai ai character roleplay hackathon, with satisfaction rates of roughly 78% and above.

evolvedseeker: upgraded the deepseek coder 1.3b base to create what became the best-performing local model in the 1b range for coding tasks. reddit discussion

pic series: secured a top 10 spot on the hugging face open llm leaderboard with my pic (partner-in-crime) series, combining function calling, character engagement, and general performance gains in a single model. model on huggingface

multilingual innovation: built the first model fine-tuned for both rag and generic chat, and pioneered this approach in the indic language space. navarna model

current work

my datasets and models have amassed several thousand downloads on hugging face, particularly in the coding category. check out my profile: tokenbender on huggingface

i've spent the last several months diving into multimodal rag (retrieval-augmented generation) for structured and unstructured documents, and into structured extraction for document visual qa by fine-tuning specialized vlms.

i microblog as @tokenbender on twitter—constantly dissecting the latest developments, limits, and advantages of current systems.

what to expect

i'll be writing about: