feat: granite4.1 #964
Changes from all commits
a2aac9d
f2b5640
fcd2459
b37f074
f00bc6f
fa95e50
f4c89cf
f81d6c8
eeb79cc
f8597fd
642339c
50c4b87
427ea3c
a502aea
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -128,4 +128,4 @@ uv run docs/examples/<folder>/<file>.py | |
|
|
||
| **Default backend:** `start_session()` with no arguments connects to a local | ||
| [Ollama](https://ollama.ai) instance running **IBM Granite 4 Micro** | ||
| (`granite4:micro`). Make sure Ollama is running before you execute any example. | ||
Contributor
Same branding mismatch.
| (`granite4.1:3b`). Make sure Ollama is running before you execute any example. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,7 +10,7 @@ description: "Run your first generative program in minutes." | |
| ## Hello world | ||
|
|
||
| By default, `start_session()` connects to Ollama and uses **IBM Granite 4 Micro** | ||
| (`granite4:micro`). Make sure Ollama is running before you run this: | ||
Contributor
Branding mismatch.
| (`granite4.1:3b`). Make sure Ollama is running before you run this: | ||
|
|
||
| ```python | ||
| import mellea | ||
|
|
@@ -191,7 +191,7 @@ HuggingFace, and WatsonX are also supported. See | |
|
|
||
| ## Troubleshooting | ||
|
|
||
| **`granite4:micro` not found** — run `ollama pull granite4:micro` before starting. | ||
| **`granite4.1:3b` not found** — run `ollama pull granite4.1:3b` before starting. | ||
|
|
||
| **Python 3.13 `outlines` install failure** — `outlines` requires a Rust compiler. | ||
| Either install [Rust](https://www.rust-lang.org/tools/install) or pin Python to 3.12. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,7 +13,7 @@ configure the backend when you create a session. | |
|
|
||
| ## Default backend | ||
|
|
||
| `start_session()` defaults to **Ollama** with **IBM Granite 4 Micro** (`granite4:micro`). | ||
| `start_session()` defaults to **Ollama** with **IBM Granite 4 Micro** (`granite4.1:3b`). | ||
| No API keys needed — just have Ollama running: | ||
Contributor
Same branding mismatch: points to
| ```python | ||
|
|
@@ -142,7 +142,7 @@ Run models locally using HuggingFace transformers: | |
| from mellea import MelleaSession | ||
| from mellea.backends.huggingface import LocalHFBackend | ||
|
|
||
| backend = LocalHFBackend(model_id="ibm-granite/granite-4.0-micro") | ||
| backend = LocalHFBackend(model_id="ibm-granite/granite-4.1-3b") | ||
| m = MelleaSession(backend=backend) | ||
| ``` | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -34,7 +34,7 @@ background service. | |
| ## Default setup | ||
|
|
||
| `start_session()` connects to Ollama on `localhost:11434` and uses | ||
| **IBM Granite 4 Micro** (`granite4:micro`) by default. On first run, Mellea | ||
| **IBM Granite 4 Micro** (`granite4.1:3b`) by default. On first run, Mellea | ||
| automatically pulls the model if it is not already downloaded: | ||
Contributor
Same branding mismatch.
| ```python | ||
|
|
@@ -47,7 +47,7 @@ print(str(email)) | |
| # Output will vary — LLM responses depend on model and temperature. | ||
| ``` | ||
|
|
||
| > **Note:** The first run pulls `granite4:micro` (~2 GB). Subsequent runs start | ||
| > **Note:** The first run pulls `granite4.1:3b` (~2 GB). Subsequent runs start | ||
| > immediately from the local cache. | ||
|
|
||
| ## Switching models | ||
|
|
@@ -75,7 +75,7 @@ m = start_session(model_id=model_ids.IBM_GRANITE_3_3_8B) | |
| Pull models before using them (or let Mellea pull on first use): | ||
|
|
||
| ```bash | ||
| ollama pull granite4:micro | ||
| ollama pull granite4.1:3b | ||
| ollama pull llama3.2:3b | ||
| ollama pull mistral:7b | ||
| ``` | ||
|
|
@@ -84,8 +84,8 @@ ollama pull mistral:7b | |
|
|
||
| | `model_ids` constant | Ollama name | Notes | | ||
| | -------------------- | ----------- | ----- | | ||
| | `IBM_GRANITE_4_MICRO_3B` | `granite4:micro` | Default. Fast, low memory (~2 GB). | | ||
| | `IBM_GRANITE_4_HYBRID_MICRO` | `granite4:micro-h` | Hybrid variant with extended thinking. | | ||
| | `IBM_GRANITE_4_1_3B` | `granite4.1:3b` | Default. Fast, low memory (~2 GB). | | ||
| | `IBM_GRANITE_4_1_8B` | `granite4.1:8b` | Higher quality, ~5 GB. | | ||
| | `IBM_GRANITE_3_3_8B` | `granite3.3:8b` | Higher quality, ~5 GB. | | ||
| | `IBM_GRANITE_3_3_VISION_2B` | `ibm/granite3.3-vision:2b` | Vision model for image inputs. | | ||
| | `META_LLAMA_3_2_3B` | `llama3.2:3b` | Compact Llama model. | | ||
|
|
@@ -131,7 +131,7 @@ from mellea.backends.ollama import OllamaModelBackend | |
|
|
||
| m = MelleaSession( | ||
| OllamaModelBackend( | ||
| model_id="granite4:micro", | ||
| model_id="granite4.1:3b", | ||
| base_url="http://my-gpu-server:11434", | ||
| ) | ||
| ) | ||
|
|
@@ -152,7 +152,7 @@ from mellea.backends.ollama import OllamaModelBackend | |
|
|
||
| m = MelleaSession( | ||
| OllamaModelBackend( | ||
| model_id=model_ids.IBM_GRANITE_4_MICRO_3B, | ||
| model_id=model_ids.IBM_GRANITE_4_1_3B, | ||
| model_options={ | ||
| ModelOption.TEMPERATURE: 0.1, | ||
| ModelOption.SEED: 42, | ||
|
|
@@ -193,7 +193,7 @@ print(str(response)) | |
| ``` | ||
|
|
||
| > **Backend note:** Vision requires a model that supports image inputs. The default | ||
| > `granite4:micro` is text-only. Pull a vision model explicitly before using images: | ||
| > `granite4.1:3b` is text-only. Pull a vision model explicitly before using images: | ||
| > `ollama pull ibm/granite3.3-vision:2b`. | ||
|
|
||
| ## Ollama's OpenAI-compatible endpoint | ||
|
|
@@ -236,7 +236,7 @@ let Mellea pull it automatically on first use. | |
|
|
||
| Ollama loads the model into memory on the first request. Subsequent requests in the | ||
| same session are much faster. On machines with less than 8 GB RAM, consider using | ||
| `granite4:micro` or `llama3.2:1b`. | ||
| `granite4.1:3b` or `llama3.2:1b`. | ||
|
|
||
| ### Intel Mac torch errors | ||
`granite4:micro-h` was dropped from the pull list. Confirm no `ollama`-marked integration tests still use `IBM_GRANITE_4_HYBRID_MICRO`, or they'll cold-start/fail in CI.
Yup, I think I confirmed that in a nightly run.
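For anyone who wants to repeat that check locally rather than wait for a nightly run, here is a minimal sketch of a reference scan. It is not part of this PR or of Mellea: the helper name `find_stale_references` and the default file suffixes are assumptions made for illustration, and only the two identifiers passed in the usage note come from the diff above.

```python
from pathlib import Path


def find_stale_references(root, needles, suffixes=(".py", ".md", ".mdx")):
    """Return (path, line_number, line) for each line under root that contains a needle.

    Hypothetical helper: plain substring matching, restricted to the given suffixes.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and file types we are not interested in.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_stale_references("test", ["IBM_GRANITE_4_HYBRID_MICRO", "granite4:micro-h"])` from the repository root (the `test` directory name is an assumption) would list any remaining uses. An empty list suggests nothing still depends on the hybrid model being pulled.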