| | --- |
| | license: mit |
| | language: |
| | - en |
| | tags: |
| | - RVC |
| | - voice-conversion |
| | - text-to-speech |
| | - voice-cloning |
| | - audio-generation |
| | datasets: |
| | - LJSpeech |
| | - VCTK |
| | metrics: |
| | - MOS (Mean Opinion Score) |
| | - PESQ |
| | - STOI |
| | base_model: |
| | - MangioRVC/Mangio-RVC-Huggingface |
| | pipeline_tag: audio-to-audio |
| | --- |
| | |
| | # LUNAR - High-Quality Female Voice RVC Model |
| |
|
| | LUNAR is a state-of-the-art **RVC (Retrieval-Based Voice Conversion) model** optimized for **female voice conversion**, with studio-grade audio quality at a **48 kHz sampling rate**. The model delivers natural-sounding voice transformations with minimal artifacts. |
| |
|
| | ## **Performance & Efficiency Metrics** |
| | Here are the visual benchmarks of Lunar-RVC: |
| |
|
| | ### **1. Training Loss Curve** |
| |  |
| |
|
| | ### **2. Validation Loss Curve** |
| |  |
| |
|
| | ### **3. Training vs Validation Loss** |
| |  |
| |
|
| | ### **4. Inference Speed Comparison** |
| |  |
| |
|
| | ### **5. Audio Quality Scores (MOS)** |
| |  |
| |
|
| | ### **6. GPU Memory Usage** |
| |  |
| |
|
| | ### **7. Dataset Duration Distribution** |
| |  |
| |
|
| | ### **8. Spectral Convergence** |
| |  |
| |
|
| | ### **9. Model Size Comparison** |
| |  |
| |
|
| | ### **10. Efficiency Radar Chart** |
| |  |
| |
|
| | --- |
| |
|
| | ## Key Features |
| |
|
| | - **High-Fidelity Conversion** - Produces natural, expressive female voices |
| | - **Real-Time Ready** - Optimized for low-latency inference (<20ms/frame) |
| | - **Pitch & Timbre Control** - Flexible voice modulation capabilities |
| | - **48kHz Studio Quality** - Professional-grade audio output |
| | - **Easy Integration** - Compatible with popular voice toolkits |
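
The latency figure in the list above is easier to reason about as a sample budget. The following is a minimal sketch of that arithmetic, assuming a "frame" is a fixed-length chunk of audio at the model's 48 kHz rate. The card does not define the frame length, so the 20 ms value and both helper names are illustrative only.

```python
SAMPLE_RATE_HZ = 48_000  # sample rate stated in the model card


def samples_per_frame(frame_ms: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples covered by one frame of the given length."""
    return round(sample_rate_hz * frame_ms / 1000)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 keeps up with real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # A 20 ms frame at 48 kHz covers 960 samples.
    print(samples_per_frame(20))
    # Hypothetical timing: 15 ms of compute for 20 ms of audio.
    print(real_time_factor(0.015, 0.020))
```

A real-time factor below 1.0 means each frame is processed faster than it plays back, which is the condition a streaming setup needs to hold on the target GPU.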
| |
|
| | ## Model Specifications |
| |
|
| | | Parameter | Value | |
| | |--------------------|---------------------| |
| | | Framework | RVC v2 | |
| | | Sample Rate | 48kHz | |
| | | Bit Depth | 16-bit | |
| | | Model Size | 1.8GB | |
| | | Training Duration | 150 epochs (~10h) | |
| | | VRAM Requirements | 4GB+ (inference) | |
| | | Supported Formats | WAV, MP3, FLAC | |
| | --- |
| |
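
The sample rate and bit depth in the specification table can be checked on a WAV file before it is used with the model. This is a rough sketch using only Python's standard `wave` module. The helper names are hypothetical, and the card does not say whether the inference script resamples inputs that differ from these values.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 48_000   # "Sample Rate" row of the table
EXPECTED_SAMPLE_WIDTH = 2   # 16-bit audio is 2 bytes per sample


def wav_matches_spec(path: str) -> bool:
    """Return True if the WAV file is 48 kHz with 16-bit samples."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
        )


def write_silence(path: str, rate_hz: int, seconds: float = 0.1) -> None:
    """Write a short mono 16-bit WAV of silence (used only for the demo)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * int(rate_hz * seconds))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.wav")
        write_silence(demo, 48_000)
        print(wav_matches_spec(demo))
```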
|
| | ## **Inference Guide** |
| | To use Lunar-RVC for inference: |
| |
|
| | ``` bash |
| | # Clone the repository (if the weights are stored with Git LFS, run `git lfs install` first) |
| | git clone https://huggingface.co/IssacMosesD/Lunar-RVC-Model |
| | # git clone names the directory after the repository |
| | cd Lunar-RVC-Model |
| | |
| | # Install dependencies |
| | pip install -r requirements.txt |
| | |
| | # Run inference |
| | python infer.py --input input.wav --output output.wav --model Lunar-RVC.pth |
| | ``` |
| | |
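
For more than one file, the inference command from this guide can be wrapped in a small script. The sketch below assumes `infer.py` accepts exactly the `--input`, `--output`, and `--model` flags shown in this guide and is run from the repository directory. The function names and the `dry_run` switch are hypothetical and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_infer_command(input_wav: Path, output_wav: Path, model: Path) -> List[str]:
    """Build the argument list for one infer.py call (flags taken from the guide)."""
    return [
        sys.executable, "infer.py",
        "--input", str(input_wav),
        "--output", str(output_wav),
        "--model", str(model),
    ]


def convert_folder(input_dir: Path, output_dir: Path, model: Path,
                   dry_run: bool = True) -> List[List[str]]:
    """Build (and, unless dry_run is set, run) one command per WAV file in input_dir."""
    output_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for wav in sorted(input_dir.glob("*.wav")):
        cmd = build_infer_command(wav, output_dir / wav.name, model)
        commands.append(cmd)
        if not dry_run:
            # check=True raises CalledProcessError if infer.py exits with an error
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in convert_folder(Path("inputs"), Path("outputs"), Path("Lunar-RVC.pth")):
        print(" ".join(cmd))
```

With `dry_run=True` the script only prints the commands it would run, which is a cheap way to confirm the paths before spending GPU time.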
| | ## Use Cases |
| |
|
| | - Voice Cloning - Convert your voice into a professional singing voice. |
| |
|
| | - Streaming - Real-time voice conversion for content creators. |
| |
|
| | - Dubbing - High-quality voice conversion for movies & animations. |
| |
|
| | - Music Production - Transform any vocal track into a new singer's voice. |
| |
|
| | ## System Requirements |
| |
|
| | - OS: Windows / Linux |
| |
|
| | - Python: 3.8+ |
| |
|
| | - GPU: NVIDIA (6GB VRAM minimum recommended) |
| |
|
| | - CUDA: 11.7+ |
| |
|
| | - Torch: 1.13.1+ |
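
The version requirements listed above can be checked from Python before running inference. This is a minimal sketch: the helper names are hypothetical, the version parsing is deliberately simple (it ignores local suffixes such as `+cu118`), and the PyTorch checks are skipped if `torch` is not installed.

```python
import re
import sys
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '2.1.0+cu118' into (2, 1, 0)."""
    core = text.split("+", 1)[0]  # drop a local suffix like '+cu118'
    return tuple(int(part) for part in re.findall(r"\d+", core))


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment() -> Dict[str, bool]:
    """Compare the local setup with the requirements listed in the card."""
    report = {"python_ok": sys.version_info >= (3, 8)}
    try:
        import torch
    except ImportError:
        report.update(torch_ok=False, cuda_available=False, cuda_ok=False)
    else:
        cuda_version = torch.version.cuda  # None on CPU-only builds
        report["torch_ok"] = meets_minimum(torch.__version__, "1.13.1")
        report["cuda_available"] = torch.cuda.is_available()
        report["cuda_ok"] = cuda_version is not None and meets_minimum(cuda_version, "11.7")
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```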
| |
|
| | ## Contact |
| |
|
| | - For support, queries, or collaboration: |
| |
|
| | - Name: [Issac Moses D](https://www.linkedin.com/in/issacmosesd/) |
| |
|
| | - Email: [issacmoses19082005@gmail.com](mailto:issacmoses19082005@gmail.com) |
| |
|
| | - Hugging Face: [IssacMosesD](https://huggingface.co/IssacMosesD) |
| |
|
| | - GitHub: [Issac Moses D](https://github.com/Issac-Moses) |
| |
|
| | ## Collaborator |
| |
|
| | - Name: [Dharani Karnan](https://www.linkedin.com/in/dharani-karnan-060040320/) |
| |
|
| | - Email: [dharanikarnan18@gmail.com](mailto:dharanikarnan18@gmail.com) |
| |
|
| | - Hugging Face: [Dharani K](https://huggingface.co/dharzz188) |
| |
|
| | - GitHub: [Dharani K](https://github.com/Issac-Moses) |