🐂 domestic-yak, a Macedonian LM (instruct version)
This repository contains the model from the paper "Towards Open Foundation Language Model and Corpus for Macedonian: A Low-Resource Language".
Code: https://github.com/LVSTCK
Model Summary
This is the instruct-tuned version of domestic-yak-8B. It has been fine-tuned specifically to improve instruction-following capabilities in Macedonian. It was fine-tuned on the sft-mk dataset for three epochs. Building on the foundation of domestic-yak-8B, this version is optimized for generating coherent, task-specific responses to user queries, making it ideal for chatbots, virtual assistants, and other interactive applications.
📊 Results
The table below compares the performance of our model, domestic-yak-8B-instruct, with four other models. Our model is on par with Llama 70B and even beats it on three of the benchmarks. It is also currently the best-performing model in the 8B parameter range.
The results were obtained using the macedonian-llm-eval benchmark.
[Image: benchmark results table]
🔑 Key Details
- Language: Macedonian (mk)
- Base Model: domestic-yak-8B
- Dataset: ~100k samples across multiple categories (question answering (QA), chat-like conversations, reasoning, essays, and code), consolidated from translated publicly available datasets and custom synthetic data. The dataset can be found here.
- Fine-tuning Objective: Supervised fine-tuning (SFT) on Macedonian-specific instruction-following data
Usage
The pipeline automatically calls apply_chat_template, which formats the input appropriately. The model was trained using the default Llama 3.1 chat format.
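import transformers
import torch

model_id = "LVSTCK/domestic-yak-8B-instruct"

# Build a text-generation pipeline; bfloat16 halves memory use compared to float32.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# System prompt (in Macedonian): "You are a virtual assistant who helps users in
# Macedonian. Answer questions in a clear, understandable and professional manner.
# Use correct grammar and try to make the answers as useful and relevant as possible."
# User prompt: "What is the highest peak in Macedonia?"
messages = [
    {"role": "system", "content": "Ти си виртуелен асистент кој помага на корисници на македонски јазик. Одговарај на прашања на јасен, разбирлив и професионален начин. Користи правилна граматика и обиди се одговорите да бидат што е можно покорисни и релевантни."},
    {"role": "user", "content": "Кој е највисок врв во Македонија?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=256,  # You can increase this
    temperature=0.1,
)

# The last entry of the returned conversation is the assistant's reply.
print(outputs[0]["generated_text"][-1])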
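To make the formatting step above more concrete, here is a minimal sketch of how a Llama 3 style chat layout turns a message list into a single prompt string. The helper `render_llama3_prompt` is hypothetical and written only for illustration. It assumes the standard Llama 3 special tokens, and the template actually shipped with this model's tokenizer may add extra fields, so treat it as an approximation rather than the exact format used in training.

```python
# Simplified illustration of a Llama 3 style chat layout.
# Assumption: the chat template bundled with the tokenizer may differ in
# details (for example, extra system-prompt fields), so this is only a sketch.
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into one prompt string."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, and an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header tells the model that it should answer next.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "Ти си виртуелен асистент."},
    {"role": "user", "content": "Кој е највисок врв во Македонија?"},
]
prompt = render_llama3_prompt(example_messages)
print(prompt)
```

In practice there is no need to build this string by hand. Calling `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` on the model's own tokenizer returns the exact prompt that the pipeline would use.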
📬 Contact
For inquiries, feedback, or contributions, please feel free to reach out to the core team:
Citation
@article{krsteski2025towards,
title={Towards Open Foundation Language Model and Corpus for Macedonian: A Low-Resource Language},
author={Krsteski, Stefan and Tashkovska, Matea and Sazdov, Borjan and Gjoreski, Hristijan and Gerazov, Branislav},
journal={arXiv preprint arXiv:2506.09560},
year={2025}
}
Model tree for LVSTCK/domestic-yak-8B-instruct
Base model
meta-llama/Llama-3.1-8B