Using HuggingFace Mirror to Download Models and Datasets Smoothly
Webpage Download Method
Search for the target model or dataset on the mirror site, open its page, and download the files you need from the "Files and versions" tab.
HuggingFace CLI Method
The huggingface-cli is Hugging Face’s official command-line tool with robust download capabilities.
1. Install Dependencies
pip install -U huggingface_hub
2. Configure Environment Variable
Linux/macOS Terminal (persist by adding to ~/.bash_profile or ~/.zshrc)
export HF_ENDPOINT="https://hf-mirror.com"
Windows PowerShell (persist via Environment Variables GUI)
$env:HF_ENDPOINT = "https://hf-mirror.com"
3. Download Operations
- Download a Model (e.g., mistralai/Mistral-7B-v0.3)
huggingface-cli download --resume-download mistralai/Mistral-7B-v0.3 --local-dir mistral-7b
- Download a Dataset (e.g., allenai/c4)
huggingface-cli download --repo-type dataset --resume-download allenai/c4 --local-dir c4-dataset
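On older huggingface_hub releases, add --local-dir-use-symlinks False to store real files instead of symbolic links, which gives a flat file structure. Recent releases deprecate this flag and already write real files to --local-dir by default.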
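The same downloads can also be driven from Python instead of the CLI. The following is a minimal sketch, assuming the snapshot_download function from huggingface_hub and its endpoint argument; the helper name download_via_mirror and the MIRROR constant are hypothetical names introduced here for illustration.

```python
MIRROR = "https://hf-mirror.com"  # mirror endpoint used throughout this guide


def download_via_mirror(repo_id, local_dir, repo_type="model", endpoint=MIRROR):
    """Download a full repository snapshot through the mirror.

    Returns whatever snapshot_download returns (the local folder path).
    """
    # Imported inside the function so this module loads even where
    # huggingface_hub is not installed yet.
    from huggingface_hub import snapshot_download

    # Passing endpoint explicitly routes this call through the mirror
    # without relying on HF_ENDPOINT being exported in the shell.
    return snapshot_download(
        repo_id=repo_id,
        repo_type=repo_type,  # "model" or "dataset"
        local_dir=local_dir,
        endpoint=endpoint,
    )
```

For example, download_via_mirror("mistralai/Mistral-7B-v0.3", "mistral-7b") would correspond to the model command above, and download_via_mirror("allenai/c4", "c4-dataset", repo_type="dataset") to the dataset command.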
HFD Tool Method
HFD is a download utility developed by the mirror site. It is built on git and aria2 and provides stable, resumable transfers.
1. Install HFD
wget https://hf-mirror.com/hfd/hfd.sh
chmod +x hfd.sh
2. Configure Environment Variable
Same as Step 2 of the HuggingFace CLI Method.
3. Download Operations
- Download a Model (e.g., meta-llama/Llama-2-7b-hf)
./hfd.sh meta-llama/Llama-2-7b-hf --tool aria2c -x 8
- Download a Dataset (e.g., squad)
./hfd.sh squad --dataset --tool aria2c -x 8
Non-Intrusive Environment Variable Method
Set the HF_ENDPOINT variable for a single command, immediately before running your Python script, to route all Hugging Face Hub API and download requests through the mirror without changing your shell configuration.
HF_ENDPOINT=https://hf-mirror.com python your_llm_training_script.py
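The inline VAR=value prefix shown above is POSIX shell syntax and is not available in PowerShell. As a portable alternative, the sketch below launches a script from Python with HF_ENDPOINT set only in the child process's environment, leaving the parent environment untouched. The helper name run_with_mirror and its capture parameter are hypothetical, introduced here for illustration.

```python
import os
import subprocess
import sys


def run_with_mirror(args, endpoint="https://hf-mirror.com", capture=False):
    """Run `python <args>` with HF_ENDPOINT set only for that child process."""
    # Copy the current environment and override one variable in the copy,
    # so the caller's own environment is not modified.
    child_env = {**os.environ, "HF_ENDPOINT": endpoint}
    return subprocess.run(
        [sys.executable, *args],
        env=child_env,
        check=True,              # raise CalledProcessError on a non-zero exit
        capture_output=capture,  # set True to collect stdout/stderr as text
        text=True,
    )
```

For example, run_with_mirror(["your_llm_training_script.py"]) would play the same role as the one-line command above.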