---
language:
- en
tags:
- computer-vision
- segmentation
- few-shot-learning
- zero-shot-learning
- sam2
- clip
- pytorch
license: apache-2.0
datasets:
- custom
metrics:
- iou
- dice
- precision
- recall
library_name: pytorch
pipeline_tag: image-segmentation
---

# SAM 2 Few-Shot/Zero-Shot Segmentation

This repository contains a research framework that combines Segment Anything Model 2 (SAM 2) with few-shot and zero-shot learning techniques for domain-specific segmentation tasks.

## 🎯 Overview

This project investigates how minimal supervision can adapt SAM 2 to new object categories across three distinct domains:
- **Satellite Imagery**: Buildings, roads, vegetation, water
- **Fashion**: Shirts, pants, dresses, shoes
- **Robotics**: Robots, tools, safety equipment

## 🏗️ Architecture

### Few-Shot Learning Framework
- **Memory Bank**: Stores CLIP-encoded support examples for each class (sketched below)
- **Similarity-Based Prompting**: Uses visual similarity between query and support embeddings to generate SAM 2 prompts
- **Episodic Training**: Follows the standard few-shot learning protocol
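
As a minimal, illustrative sketch of the memory bank (not the repository's actual `models/sam2_fewshot.py` implementation), the idea is to store unit-normalized CLIP image embeddings keyed by `(domain, class)` and score a query by cosine similarity against the stored supports. The `clip` package and the `ViT-B/32` encoder name here are assumptions:

```python
import torch
import clip  # OpenAI CLIP (pip install git+https://github.com/openai/CLIP.git)

class ClipMemoryBank:
    """Stores CLIP-encoded support examples keyed by (domain, class)."""

    def __init__(self, device="cuda"):
        self.device = device
        self.model, self.preprocess = clip.load("ViT-B/32", device=device)
        self.bank = {}  # (domain, class) -> list of embedding tensors

    @torch.no_grad()
    def add_example(self, domain, cls, pil_image):
        x = self.preprocess(pil_image).unsqueeze(0).to(self.device)
        emb = self.model.encode_image(x)
        emb = emb / emb.norm(dim=-1, keepdim=True)  # unit-normalize
        self.bank.setdefault((domain, cls), []).append(emb.squeeze(0))

    @torch.no_grad()
    def best_similarity(self, domain, cls, pil_query):
        """Highest cosine similarity between the query and any stored support."""
        x = self.preprocess(pil_query).unsqueeze(0).to(self.device)
        q = self.model.encode_image(x)
        q = q / q.norm(dim=-1, keepdim=True)
        supports = torch.stack(self.bank[(domain, cls)])  # (N, D)
        return (supports @ q.squeeze(0)).max().item()
```

In the full pipeline, the best-matching support's mask would additionally be used to derive point or box prompts for SAM 2; this sketch covers only the retrieval step.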

### Zero-Shot Learning Framework
- **Advanced Prompt Engineering**: Four strategies (basic, descriptive, contextual, detailed); see the sketch below
- **Attention-Based Localization**: Uses CLIP's cross-attention for prompt generation
- **Multi-Strategy Prompting**: Combines different prompt types
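
To make the prompting components concrete, here is a minimal sketch. It is illustrative only: the template wording and the helper names (`PROMPT_STRATEGIES`, `heatmap_to_point_prompt`) are hypothetical rather than the repository's actual API, while the point-prompt format (pixel coordinates plus a positive/negative label) follows SAM's convention:

```python
import numpy as np

# Hypothetical text templates for the four strategies; the exact wording
# in models/sam2_zeroshot.py may differ.
PROMPT_STRATEGIES = {
    "basic":       lambda cls, domain: f"{cls}",
    "descriptive": lambda cls, domain: f"a photo of a {cls}",
    "contextual":  lambda cls, domain: f"a {cls} in a {domain} image",
    "detailed":    lambda cls, domain: f"a clearly visible {cls}, as seen in {domain} imagery",
}

def build_prompts(cls: str, domain: str) -> dict:
    """Expand one (class, domain) pair into all four text variants."""
    return {name: fn(cls, domain) for name, fn in PROMPT_STRATEGIES.items()}

def heatmap_to_point_prompt(heatmap: np.ndarray):
    """Turn a text-image similarity heatmap (H, W) into a SAM-style point prompt.

    Returns (coords, labels): one (x, y) pixel coordinate with a positive
    label, placed at the heatmap's peak.
    """
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.array([[x, y]]), np.array([1])
```

Multi-strategy prompting would then run all the variants and combine or select among the resulting masks, for example by prediction confidence.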

## 📊 Performance

### Few-Shot Learning (5 shots)
| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 65% | 71% | Building (78%) | Water (52%) |
| Fashion | 62% | 68% | Shirt (75%) | Shoes (48%) |
| Robotics | 59% | 65% | Robot (72%) | Safety (45%) |

### Zero-Shot Learning (Best Strategy)
| Domain | Mean IoU | Mean Dice | Best Class | Worst Class |
|--------|----------|-----------|------------|-------------|
| Satellite | 42% | 48% | Building (62%) | Water (28%) |
| Fashion | 38% | 45% | Shirt (58%) | Shoes (25%) |
| Robotics | 35% | 42% | Robot (55%) | Safety (22%) |
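
For reference, the reported metrics follow the standard definitions on binary masks: IoU is intersection over union, and Dice is twice the intersection over the total mask area. A minimal NumPy sketch (the repository's `utils/metrics.py` may differ in details such as smoothing):

```python
import numpy as np

def iou_and_dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """IoU and Dice for a pair of binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / (union + eps)
    dice = 2 * inter / (pred.sum() + target.sum() + eps)
    return float(iou), float(dice)
```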

## 🚀 Quick Start

### Installation
```bash
pip install -r requirements.txt
python scripts/download_sam2.py
```

### Few-Shot Experiment
```python
from models.sam2_fewshot import SAM2FewShot

# Initialize the model (point sam2_checkpoint at the checkpoint
# downloaded by scripts/download_sam2.py)
model = SAM2FewShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda"
)

# Add a support example: image is a support image and mask its
# binary ground-truth mask for the target class
model.add_few_shot_example("satellite", "building", image, mask)

# Segment a query image using the stored support examples
predictions = model.segment(
    query_image,
    "satellite",
    ["building"],
    use_few_shot=True
)
```

### Zero-Shot Experiment
```python
from models.sam2_zeroshot import SAM2ZeroShot

# Initialize the model with the same SAM 2 checkpoint
model = SAM2ZeroShot(
    sam2_checkpoint="sam2_checkpoint",
    device="cuda"
)

# Perform zero-shot segmentation from class names alone;
# no support examples are needed
predictions = model.segment(
    image,
    "fashion",
    ["shirt", "pants", "dress", "shoes"]
)
```

## 📁 Project Structure

```
├── models/
│   ├── sam2_fewshot.py        # Few-shot learning model
│   └── sam2_zeroshot.py       # Zero-shot learning model
├── experiments/
│   ├── few_shot_satellite.py  # Satellite experiments
│   └── zero_shot_fashion.py   # Fashion experiments
├── utils/
│   ├── data_loader.py         # Domain-specific data loaders
│   ├── metrics.py             # Comprehensive evaluation metrics
│   └── visualization.py       # Visualization tools
├── scripts/
│   └── download_sam2.py       # Setup script
└── notebooks/
    └── analysis.ipynb         # Interactive analysis
```

## 🔬 Research Contributions

1. **Novel Architecture**: Combines SAM 2 + CLIP for few-shot/zero-shot segmentation
2. **Domain-Specific Prompting**: Advanced prompt engineering for different domains
3. **Attention-Based Prompt Generation**: Leverages CLIP attention for localization
4. **Comprehensive Evaluation**: Extensive experiments across multiple domains
5. **Open-Source Implementation**: Complete codebase for reproducibility

## 📄 Citation

If you use this work in your research, please cite:

```bibtex
@misc{sam2_fewshot_zeroshot_2024,
  title={SAM 2 Few-Shot/Zero-Shot Segmentation: Domain Adaptation with Minimal Supervision},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/esalguero/Segmentation}
}
```

## 🤝 Contributing

We welcome contributions! Please feel free to submit issues, pull requests, or suggestions for improvements.

## 📜 License

This project is licensed under the Apache 2.0 License; see the [LICENSE](LICENSE) file for details.

## 🔗 Links

- **GitHub Repository**: [https://github.com/ParallelLLC/Segmentation](https://github.com/ParallelLLC/Segmentation)
- **Research Paper**: See `research_paper.md` for the complete methodology
- **Interactive Analysis**: Use `notebooks/analysis.ipynb` for exploration

---

**Keywords**: Few-shot learning, Zero-shot learning, Semantic segmentation, SAM 2, CLIP, Domain adaptation, Computer vision