Hello there,
We are in the last leg of our grant (an LLM marketplace) and are exploring tiny language models to fine-tune for a niche. Below are the ones we are interested in:
SmolLM2-360M-Instruct (360M parameters)
Qwen2.5-Coder-0.5B-Instruct (0.5B parameters)
DistilGPT2 (82M parameters)
I know Qwen has already been worked on in the community, but is there any research on the other two? Or would you recommend another tiny LM to explore?
cc: @YashBit
Thank you!