ZeroGPU Explorers

community

Recent Activity

gagan3012 authored a paper 15 days ago

From RAG to Agentic RAG for Faithful Islamic Question Answering

gagan3012 authored a paper 15 days ago

Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics

hiyouga authored a paper 27 days ago

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

seyf1elislam

posted an update 1 day ago

# 🚀 Run Qwen3-TTS on Colab GPU or Locally

Run **Qwen3-TTS (Text-to-Speech & Voice Cloning)** with minimal effort. This setup is based on the official HF Space.

### 🔗 Links
* **Official Space:** Qwen/Qwen3-TTS
* **GitHub Repo:** https://github.com/seyf1elislam/qwen-tts-webui-notebook
* **Colab:** https://github.com/seyf1elislam/qwen-tts-webui-notebook/blob/main/Qwen_TTS_(TTS_%26_Voice_Cloning)_Colab.ipynb

---

### 📓 Method 1: Google Colab (Fastest)
1. Open the Colab notebook: https://github.com/seyf1elislam/qwen-tts-webui-notebook/blob/main/Qwen_TTS_(TTS_%26_Voice_Cloning)_Colab.ipynb
2. Add your HF_TOKEN to Google Colab Secrets.
3. Ensure you are on a **T4 GPU** runtime.
4. Run all cells. Use the gradio.live link to open the UI.
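
Step 2 relies on Colab's Secrets panel. As a minimal sketch of how a notebook cell might read that secret, assuming it is stored under the name HF_TOKEN as in the steps above: `resolve_hf_token` is a hypothetical helper for illustration, not something taken from the linked notebook, and it falls back to an environment variable of the same name when run outside Colab.

```python
import os


def resolve_hf_token(env=os.environ):
    """Return the Hugging Face token, or None if none is configured.

    Tries Colab's Secrets store first (only importable inside Colab),
    then falls back to an HF_TOKEN entry in the given environment mapping.
    """
    try:
        from google.colab import userdata  # available only in Colab runtimes

        token = userdata.get("HF_TOKEN")
        if token:
            return token
    except Exception:
        # ImportError outside Colab, or the secret is missing / access not granted.
        pass
    return env.get("HF_TOKEN")
```

If the function returns None, the secret is probably missing or notebook access to it has not been enabled in the Secrets panel.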

---

### 💻 Method 2: Local Installation
Requires a GPU. Uses uv for faster setup.

```bash
# 1. Install uv & clone the Space
pip install uv
git clone https://huggingface.co/spaces/Qwen/Qwen3-TTS && cd Qwen3-TTS

# 2. Set up the environment
uv venv
source .venv/bin/activate  # on Windows: .venv\Scripts\activate
uv pip install -r requirements.txt

# 3. Authenticate & run
uvx hf auth login
python app.py
# UI available at: http://localhost:7860/
```
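
Once `python app.py` is running, a quick way to confirm the UI is reachable is to probe the local URL. The sketch below uses only the Python standard library. `ui_is_up` is a hypothetical helper name, and the default URL is the one given in the instructions above; adjust it if the app binds to a different port.

```python
import urllib.error
import urllib.request


def ui_is_up(url="http://localhost:7860/", timeout=2.0):
    """Return True if an HTTP server answers at `url` with a non-error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `ui_is_up()` in a second terminal or notebook returns False until the server has finished loading the model and started listening.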

wren93

authored 3 papers 8 days ago

Scaling Zero-Shot Reference-to-Video Generation

Paper • 2512.06905 • Published Dec 7, 2025 • 29

OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Paper • 2512.07802 • Published Dec 8, 2025 • 45

HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming

Paper • 2512.21338 • Published Dec 24, 2025 • 22

mmhamdy

posted an update 14 days ago

The new DeepSeek Engram paper is super fun! It also integrates mHC, and I suspect they're releasing all these papers to keep the V4 report to a reasonable length 😄

Here's a nice short summary from Gemini

ybelkada

authored a paper 19 days ago

Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers

Paper • 2601.04890 • Published 20 days ago • 41

Lin-Chen

authored a paper 21 days ago

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published 22 days ago • 46

Lin-Chen

submitted a paper to Daily Papers 21 days ago

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published 22 days ago • 46

MaziyarPanahi

posted an update 22 days ago

🎉 OpenMed 2025 Year in Review: 6 Months of Open Medical AI

I'm thrilled to share what the OpenMed community has accomplished since our July 2025 launch!

📊 The Numbers

29,700,000 downloads. Thank you! 🙏

- 481 total models (475 medical NER models + 6 fine-tuned LLMs)
- 475 medical NER models in the OpenMed organization
- 6 fine-tuned LLMs in openmed-community
- 551,800 PyPI downloads of the [openmed package](https://pypi.org/project/openmed/)
- 707 followers on Hugging Face (you!)
- 97 GitHub stars on the [toolkit repo](https://github.com/maziyarpanahi/openmed)

🏆 Top Models by Downloads

1. OpenMed/OpenMed-NER-PharmaDetect-SuperClinical-434M: 147,305 downloads
2. OpenMed/OpenMed-NER-ChemicalDetect-ElectraMed-33M: 126,785 downloads
3. OpenMed/OpenMed-NER-BloodCancerDetect-TinyMed-65M: 126,465 downloads
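
For readers who want to try one of these models, here is a rough sketch using the `transformers` token-classification pipeline. It assumes the checkpoints load through the standard pipeline API, which the post does not confirm. `load_ner` and `format_entities` are hypothetical helper names, and the entity labels a given model emits are not documented here, so inspect the output rather than relying on specific label names.

```python
def format_entities(entities, min_score=0.5):
    """Turn aggregated pipeline output into readable 'LABEL: text (score)' lines."""
    lines = []
    for ent in entities:
        if ent["score"] >= min_score:
            lines.append(f'{ent["entity_group"]}: {ent["word"]} ({ent["score"]:.2f})')
    return lines


def load_ner(model_id="OpenMed/OpenMed-NER-PharmaDetect-SuperClinical-434M"):
    """Build a token-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily; requires `pip install transformers`

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline("token-classification", model=model_id, aggregation_strategy="simple")
```

A typical call would be `format_entities(load_ner()("The patient was started on metformin."))`. The first call downloads the model weights, so expect a delay.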

🔬 Model Categories

Our 481 models cover comprehensive medical domains:

- Disease Detection (~50 variants)
- Pharmaceutical Detection (~50 variants)
- Oncology Detection (~50 variants)
- Genomics/DNA Detection (~80 variants)
- Chemical Detection (~50 variants)
- Species/Organism Detection (~60 variants)
- Protein Detection (~50 variants)
- Pathology Detection (~50 variants)
- Blood Cancer Detection (~30 variants)
- Anatomy Detection (~40 variants)
- Zero-Shot NER (GLiNER-based)

OpenMed
OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets (2508.01630)
https://huggingface.co/collections/OpenMed/medical-and-clinical-ner
https://huggingface.co/collections/OpenMed/zeroshot-medical-and-clinical-ner
OpenMed/Medical-Reasoning-SFT-GPT-OSS-120B

toshas

posted an update about 1 month ago

Introducing StereoSpace -- our new end-to-end method for turning photos into stereo images without explicit geometry or depth maps. This makes it especially robust with thin structures and transparencies. Try the demo below:

🌐 Project: prs-eth/stereospace_web
📕 Paper: StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space (2512.10959)
🐙 Code: https://github.com/prs-eth/stereospace
🤗 Demo: toshas/stereospace
🤗 Weights: prs-eth/stereospace-v1-0

By ETH Zürich ( @behretj , @Bingxin , @konradschindler ), University of Bologna ( @fabiotosi92 , @mpoggi ), HUAWEI Bayer Lab ( @toshas ).

toshas

authored a paper about 1 month ago

StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space

Paper • 2512.10959 • Published Dec 11, 2025 • 12

toshas

submitted a paper to Daily Papers about 2 months ago

StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space

Paper • 2512.10959 • Published Dec 11, 2025 • 12

toshas

posted an update about 2 months ago

Introducing 🇨🇭WindowSeat🇨🇭 –– our new method for removing reflections from photos taken through windows, on planes, in malls, offices, and other glass-filled environments.

Finetuning a foundation diffusion transformer for reflection removal quickly runs up against the limits of what existing datasets and techniques can offer. To fill that gap, we generate physically accurate examples in Blender that simulate realistic glass and reflection effects. This data enables strong performance on both established benchmarks and previously unseen images.

To make this practical, the open-source, Apache-2.0-licensed model builds on Qwen-Image-Edit-2509, a 20B image-editing diffusion transformer that runs on a single GPU and can be fine-tuned in about a day. WindowSeat keeps its use of the underlying DiT cleanly separated from the data and training recipe, allowing future advances in base models to be incorporated with minimal friction.

Try it out with your own photos in this interactive demo:
🤗 toshas/windowseat-reflection-removal

Other resources:
🌎 Website: huawei-bayerlab/windowseat-reflection-removal-web
🎓 Paper: Reflection Removal through Efficient Adaptation of Diffusion Transformers (2512.05000)
🤗 Model: huawei-bayerlab/windowseat-reflection-removal-v1-0
🐙 Code: https://github.com/huawei-bayerlab/windowseat-reflection-removal

Team: Daniyar Zakarin ( @daniyarzt )*, Thiemo Wandel ( @thiemo-wandel )*, Anton Obukhov ( @toshas ), Dengxin Dai.
*Work done during internships at HUAWEI Bayer Lab