Yule
CLI

yule pull / list

Download models from HuggingFace and manage the local cache.

yule pull

Download a GGUF model from HuggingFace and store it in the local cache.

yule pull <model>

Arguments

Argument   Description
model      HuggingFace model reference (e.g. bartowski/Llama-3.2-1B-Instruct-GGUF)

Options

Flag       Default   Description
--verify   true      Compute Merkle root after download

Authentication

For gated models, set the HF_TOKEN environment variable:

export HF_TOKEN=hf_your_token_here
yule pull meta-llama/Llama-3.2-1B-Instruct-GGUF

Example

yule pull bartowski/Llama-3.2-1B-Instruct-GGUF
downloading: Llama-3.2-1B-Instruct-Q4_K_M.gguf
  [========================================] 100% (1.24 GB)

model:       bartowski/Llama-3.2-1B-Instruct-GGUF
file:        Llama-3.2-1B-Instruct-Q4_K_M.gguf
size:        1.24 GB
merkle root: a3f8c1e9d0...
path:        ~/.yule/models/bartowski/Llama-3.2-1B-Instruct-GGUF/Llama-3.2-1B-Instruct-Q4_K_M.gguf
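The merkle root in the output above is a content fingerprint of the downloaded file, computed when --verify is enabled. The chunk size and hashing scheme yule actually uses aren't documented here; the sketch below is an assumption — SHA-256 over 1 MiB chunks, with adjacent hashes paired upward until a single root remains — to illustrate the general technique.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # assumed 1 MiB chunks; yule's real scheme may differ


def merkle_root(path: str) -> str:
    """Hash a file chunk-by-chunk, then fold the hashes into a Merkle root."""
    level = []
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            level.append(hashlib.sha256(chunk).digest())
    if not level:  # empty file: root is the hash of the empty string
        return hashlib.sha256(b"").hexdigest()
    # Pair adjacent hashes until one remains; an odd node is carried up as-is.
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0].hex()
```

A root computed this way can be recomputed later and compared against the value recorded at pull time to detect corruption or tampering, without re-downloading the model.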

Cache Location

Models are stored at ~/.yule/models/{publisher}/{repo}/. Once pulled, yule run and yule serve can reference models by their registry name instead of a file path:

# these are equivalent after pulling
yule run bartowski/Llama-3.2-1B-Instruct-GGUF --prompt "Hello"
yule run ~/.yule/models/bartowski/Llama-3.2-1B-Instruct-GGUF/Llama-3.2-1B-Instruct-Q4_K_M.gguf --prompt "Hello"
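Resolving a registry name to a cached file follows directly from the layout above. A minimal sketch, assuming one .gguf file per repo directory (the `resolve_model` helper and the single-file assumption are mine, not part of yule):

```python
from pathlib import Path


def resolve_model(ref: str,
                  cache_root: Path = Path.home() / ".yule" / "models") -> Path:
    """Map a 'publisher/repo' reference to its cached GGUF file."""
    publisher, repo = ref.split("/", 1)
    repo_dir = cache_root / publisher / repo
    ggufs = sorted(repo_dir.glob("*.gguf"))
    if not ggufs:
        raise FileNotFoundError(f"no cached GGUF for {ref}; run: yule pull {ref}")
    return ggufs[0]  # assumes a single quantization per repo in the cache
```

If a repo held several quantizations, the resolver would need a tie-breaking rule (e.g. an explicit filename argument) rather than taking the first match.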

yule list

Show all models in the local cache.

yule list

Example

yule list
cached models:

  bartowski/Llama-3.2-1B-Instruct-GGUF
    file:     Llama-3.2-1B-Instruct-Q4_K_M.gguf
    size:     1.24 GB
    status:   verified
    merkle:   a3f8c1e9d0b2...

  TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF
    file:     tinyllama-1.1b-chat-v1.0.Q4_0.gguf
    size:     637.81 MB
    status:   verified
    merkle:   ffc7e1fd6016...

If no models are cached:

no cached models
  pull one with: yule pull bartowski/Llama-3.2-1B-Instruct-GGUF
