Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine‑tuning? We ran head‑to‑head tests on Qwen3‑4B (10k+ high‑quality instruction rows) to find out.
Short story: Pure Muon converged fastest at the start, but its gradient‑norm spikes made training unstable. MuonClip (Kimi K2’s clipping) stabilizes long pretraining runs, yet in our small‑scale fine‑tune it underperformed, lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D layers + AdamW for 1D layers. It delivered the best balance of stability and final performance and even beat vanilla AdamW.
Takeaway: for small-scale fine-tuning, hybrid = practical and reliable.
Next Step: scale to larger models/datasets to see if Muon’s spikes become catastrophic or if clipping wins out.
Exciting news! Introducing super-fast AI video assistant, currently in beta. With a minimum latency of under 500ms and an average latency of just 600ms.
I am experimenting with Flux and trying to push it to its limits without training (as I am GPU-poor 😅). I found some flaws in the pipelines, which I resolved, and now I am able to generate an approx similar quality image as Flux Schnell 4 steps in just 1 step. Demo Link: KingNish/Realtime-FLUX
Introducing Voicee, A superfast voice fast assistant. KingNish/Voicee It achieved latency <500 ms. While its average latency is 700ms. It works best in Google Chrome. Please try and give your feedbacks. Thank you. 🤗
This feature enhances the capabilities of OpenGPT 4o, allowing it to fetch and integrate the latest information from the web directly into its responses. Try Now: KingNish/OpenGPT-4o
With WEB SEARCH, OpenGPT 4o becomes an even more versatile and dynamic AI, ready to assist with up-to-date data retrieval and analysis.
1. Chat with Google Agent - This includes three AI models that allow you to converse with an AI, which provides answers by searching Google. Demo Link: poscye/google-go
Yes, you can use them but... with limitations like You can't use DallE 😥, You can't make Custom GPTs And chat limit also😥. But... We already have an open-source alternative like Hugging Chat, where you can create your custom assistant, generate, edit images, without any chat limit.
Future Updates: 1. Web Search (Suggested by @GPT007 and @Saionton ) 2. Live Chat with Voice Chat 3. Model Choices (Suggested by @NotAiLOL ) 4. Multilingual Chats.
Suggest more features that should be added. 🤗 Thanks!
2. Phi 3 Mini Vision 128k: A 4.5 billion-parameter, instruction-tuned vision model that has outperformed models such as Llava3 and Claude 3, and is providing stiff competition to Gemini 1Pro Vision. microsoft/Phi-3-vision-128k-instruct
𝐒𝐮𝐦𝐦𝐚𝐫𝐲 𝐨𝐟 𝐀𝐫𝐭𝐢𝐜𝐥𝐞- 📝 # 𝐌𝐞𝐜𝐡𝐚𝐧𝐢𝐜𝐬 𝐨𝐟 𝐆𝐏𝐓-𝟒’𝐨’: GPT-4’o’ operates through three main components 🛠️
𝟏. 𝐒𝐮𝐩𝐞𝐫𝐂𝐡𝐚𝐭: Integrates image generation, QnA (image, document and video) for diverse interactions. 𝟐. 𝐕𝐨𝐢𝐜𝐞 𝐂𝐡𝐚𝐭: Merges TTS and STT for real-time, human-like audio responses, focusing on human interaction. 𝟑. 𝐕𝐢𝐝𝐞𝐨 𝐂𝐡𝐚𝐭: Utilizes Zero Shot Image Classification to enhance user interaction with visual information.
# 𝐌𝐞𝐭𝐡𝐨𝐝𝐬 𝐭𝐨 𝐂𝐫𝐞𝐚𝐭𝐞 𝐒𝐢𝐦𝐢𝐥𝐚𝐫 𝐀𝐈 🧠
𝟏. 𝐌𝐮𝐥𝐭𝐢𝐌𝐨𝐝𝐚𝐥𝐢𝐟𝐢𝐜𝐚𝐭𝐢𝐨𝐧: Combines multiple models for a powerful, multifunctional AI. 𝟐. 𝐃𝐮𝐜𝐭 𝐓𝐚𝐩𝐞 𝐌𝐞𝐭𝐡𝐨𝐝: Uses different models or APIs for specific tasks without additional training.
The article provides an in-depth exploration of GPT-4’o’, its functionalities, and methods to create similar AI models. It emphasizes the model’s language support and its innovative approach to human-AI interaction. 💡🌐