# gemma3:4b - Ollama

![Google Gemma 3 logo](/assets/library/gemma3/b54bf767-f9c5-4284-b551-a49aebe3a3c2)

> This model requires Ollama 0.6 or later. [Download Ollama](https://ollama.dev.org.tw/download)
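
One way to check the version requirement above from a script is to ask the local server for its version and compare it to 0.6. This is a minimal sketch, assuming an Ollama server on the default address `http://localhost:11434`; the helper names (`parse_version`, `meets_minimum`, `installed_version`) are illustrative and not part of Ollama. Pre-release suffixes such as `-rc1` are simply ignored here, which is a simplification.

```python
import json
import urllib.request

MIN_VERSION = (0, 6)  # from the note above: Ollama 0.6 or later


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a string such as '0.6.2' or 'v0.6.0-rc1' into a tuple of ints."""
    # Drop a leading 'v' and any pre-release or build suffix (simplification).
    core = text.strip().lstrip("v").split("-")[0].split("+")[0]
    return tuple(int(part) for part in core.split("."))


def meets_minimum(version_text: str, minimum: tuple[int, ...] = MIN_VERSION) -> bool:
    """Return True if the given version is at least the required minimum."""
    return parse_version(version_text) >= minimum


def installed_version(host: str = "http://localhost:11434") -> str:
    """Ask a running Ollama server for its version (assumes the default host)."""
    with urllib.request.urlopen(f"{host}/api/version", timeout=5) as resp:
        return json.load(resp)["version"]
```

With a server running, `meets_minimum(installed_version())` reports whether the installed Ollama is recent enough for Gemma 3.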

Gemma is a lightweight family of models from Google built on Gemini technology. The Gemma 3 family is available in 1B, 4B, 12B, and 27B parameter sizes and supports over 140 languages. The 4B, 12B, and 27B models are multimodal, processing both text and images, with a 128K context window; the 1B model is text-only with a 32K context window. They excel at tasks such as question answering, summarization, and reasoning, and their compact design allows deployment on resource-limited devices.

## Models

### Text

**1B parameter model** (32k context window)

```
ollama run gemma3:1b
```
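
The command above opens an interactive session. For scripted use, the same model can be reached through Ollama's local REST API. The sketch below assumes a server on the default address `http://localhost:11434` and that `gemma3:1b` has already been pulled; the function names are illustrative. The optional `num_ctx` option requests a specific context length, since the server may otherwise apply a smaller default than the model's maximum.

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # assumption: default local address


def build_generate_request(prompt, model="gemma3:1b", num_ctx=None):
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if num_ctx is not None:
        # Ask the server for a specific context length, in tokens.
        payload["options"] = {"num_ctx": num_ctx}
    return payload


def generate(prompt, model="gemma3:1b", num_ctx=None, host=OLLAMA_HOST):
    """Send a single non-streaming prompt and return the generated text."""
    body = json.dumps(build_generate_request(prompt, model, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)["response"]
```

For example, `generate("Summarize the plot of Hamlet in two sentences.", num_ctx=32768)` would request the full window of the 1B model, assuming "32k" means 32,768 tokens. Larger context lengths use more memory.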

### Multimodal (Vision)

**4B parameter model** (128k context window)

```
ollama run gemma3:4b
```

**12B parameter model** (128k context window)

```
ollama run gemma3:12b
```

**27B parameter model** (128k context window)

```
ollama run gemma3:27b
```
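
The multimodal models accept images alongside the text prompt. Through the REST API, images are passed as base64-encoded strings in the `images` field of `/api/generate`. The following is a minimal sketch, assuming a local server on the default port and a pulled `gemma3:4b` model; `build_vision_request` and `describe_image` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_vision_request(prompt, image_bytes, model="gemma3:4b"):
    """Build a /api/generate payload carrying one base64-encoded image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"model": model, "prompt": prompt, "images": [encoded], "stream": False}


def describe_image(path, prompt="Describe this image.", model="gemma3:4b",
                   host="http://localhost:11434"):
    """Read an image file, send it with the prompt, and return the model's reply."""
    with open(path, "rb") as image_file:
        payload = build_vision_request(prompt, image_file.read(), model)
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as resp:
        return json.load(resp)["response"]
```

Swapping `model` for `gemma3:12b` or `gemma3:27b` targets the larger variants; the 1B model is text-only and is not expected to handle the `images` field.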

## Evaluation

![Chatbot Arena ELO Score](/assets/library/gemma3/89dc5a19-179e-4dd3-8e5d-12ad54973148)

### Benchmark Results

These models were evaluated against a large collection of datasets and
metrics covering different aspects of text generation:

#### Reasoning, logic and code capabilities

| Benchmark                      | Metric         | Gemma 3 PT 1B  | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:|
| [HellaSwag][hellaswag]         | 10-shot        |      62.3      |     77.2      |      84.2      |      85.6      |
| [BoolQ][boolq]                 | 0-shot         |      63.2      |     72.3      |      78.8      |      82.4      |
| [PIQA][piqa]                   | 0-shot         |      73.8      |     79.6      |      81.8      |      83.3      |
| [SocialIQA][socialiqa]         | 0-shot         |      48.9      |     51.9      |      53.4      |      54.9      |
| [TriviaQA][triviaqa]           | 5-shot         |      39.8      |     65.8      |      78.2      |      85.5      |
| [Natural Questions][naturalq]  | 5-shot         |      9.48      |     20.0      |      31.4      |      36.1      |
| [ARC-c][arc]                   | 25-shot        |      38.4      |     56.2      |      68.9      |      70.6      |
| [ARC-e][arc]                   | 0-shot         |      73.0      |     82.4      |      88.3      |      89.0      |
| [WinoGrande][winogrande]       | 5-shot         |      58.2      |     64.7      |      74.3      |      78.8      |
| [BIG-Bench Hard][bbh]          |                |      28.4      |     50.9      |      72.6      |      77.7      |
| [DROP][drop]                   | 3-shot, F1     |      42.4      |     60.1      |      72.2      |      77.2      |
| [AGIEval][agieval]             | 3-5-shot       |      22.2      |     42.1      |      57.4      |      66.2      |
| [MMLU][mmlu]                   | 5-shot, top-1  |      26.5      |     59.6      |      74.5      |      78.6      |
| [MATH][math]                   | 4-shot         |       --       |     24.2      |      43.3      |      50.0      |
| [GSM8K][gsm8k]                 | 5-shot, maj@1  |      1.36      |     38.4      |      71.0      |      82.6      |
| [GPQA][gpqa]                   |                |      9.38      |     15.0      |      25.4      |      24.3      |
| [MMLU][mmlu] (Pro)             | 5-shot         |      11.2      |     23.7      |      40.8      |      43.9      |
| [MBPP][mbpp]                   | 3-shot         |      9.80      |     46.0      |      60.4      |      65.6      |
| [HumanEval][humaneval]         | pass@1         |      6.10      |     36.0      |      45.7      |      48.8      |
| [MMLU][mmlu] (Pro COT)         | 5-shot         |      9.7       |      --       |       --       |       --       |

[hellaswag]: https://arxiv.org/abs/1905.07830
[boolq]: https://arxiv.org/abs/1905.10044
[piqa]: https://arxiv.org/abs/1911.11641
[socialiqa]: https://arxiv.org/abs/1904.09728
[triviaqa]: https://arxiv.org/abs/1705.03551
[naturalq]: https://github.com/google-research-datasets/natural-questions
[arc]: https://arxiv.org/abs/1911.01547
[winogrande]: https://arxiv.org/abs/1907.10641
[bbh]: https://paperswithcode.com/dataset/bbh
[drop]: https://arxiv.org/abs/1903.00161
[agieval]: https://arxiv.org/abs/2304.06364
[mmlu]: https://arxiv.org/abs/2009.03300
[math]: https://arxiv.org/abs/2103.03874
[gsm8k]: https://arxiv.org/abs/2110.14168
[gpqa]: https://arxiv.org/abs/2311.12022
[mbpp]: https://arxiv.org/abs/2108.07732
[humaneval]: https://arxiv.org/abs/2107.03374
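
The HumanEval row above is reported as pass@1. As a reminder of what that metric measures, the sketch below implements the unbiased pass@k estimator described in the HumanEval paper: given `n` sampled solutions per problem, of which `c` pass the unit tests, it estimates the probability that at least one of `k` samples is correct. This is only an illustration of the metric, not the evaluation harness used to produce the table, and the sampling settings behind the reported numbers are not stated here.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated, c: number that passed, k: budget of attempts.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k must include a pass.
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For `k = 1` this reduces to the fraction of passing samples, `c / n`. A benchmark score is then the mean of this value over all problems.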

#### Multilingual capabilities

| Benchmark                            | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:|
| [MGSM][mgsm]                         |      2.04     |      34.7     |      64.3      |      74.3      |
| [Global-MMLU-Lite][global-mmlu-lite] |      24.9     |      57.0     |      69.4      |      75.7      |
| [Belebele][belebele]                 |      26.6     |      59.4     |      78.0      |       --       |
| [WMT24++][wmt24pp] (ChrF)            |      36.7     |      48.4     |      53.9      |      55.7      |
| [FloRes][flores]                     |      29.5     |      39.2     |      46.0      |      48.8      |
| [XL-Sum][xlsum]                      |      4.82     |      8.55     |      12.2      |      14.9      |
| [XQuAD][xquad] (all)                 |      43.9     |      68.0     |      74.5      |      76.8      |

[mgsm]: https://arxiv.org/abs/2210.03057
[flores]: https://arxiv.org/abs/2106.03193
[belebele]: https://arxiv.org/abs/2308.16884
[xlsum]: https://arxiv.org/abs/2106.13822
[xquad]: https://arxiv.org/abs/1910.11856v3
[global-mmlu-lite]: https://huggingface.co/datasets/CohereForAI/Global-MMLU-Lite
[wmt24pp]: https://arxiv.org/abs/2502.12404v1

#### Multimodal capabilities

| Benchmark                      | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------ |:-------------:|:--------------:|:--------------:|
| [COCOcap][coco-cap]            |      102      |      111      |      116      |
| [DocVQA][docvqa] (val)         |      72.8     |      82.3     |      85.6     |
| [InfoVQA][info-vqa] (val)      |      44.1     |      54.8     |      59.4     |
| [MMMU][mmmu] (pt)              |      39.2     |      50.3     |      56.1     |
| [TextVQA][textvqa] (val)       |      58.9     |      66.5     |      68.6     |
| [RealWorldQA][realworldqa]     |      45.5     |      52.2     |      53.9     |
| [ReMI][remi]                   |      27.3     |      38.5     |      44.8     |
| [AI2D][ai2d]                   |      63.2     |      75.2     |      79.0     |
| [ChartQA][chartqa]             |      45.4     |      60.9     |      63.8     |
| [ChartQA][chartqa] (augmented) |      81.8     |      88.5     |      88.7     |
| [VQAv2][vqav2]                 |       --      |       --      |       --      |
| [BLINK][blinkvqa]              |      38.0     |      35.9     |      39.6     |
| [OKVQA][okvqa]                 |      51.0     |      58.7     |      60.2     |
| [TallyQA][tallyqa]             |      42.5     |      51.8     |      54.3     |
| [SpatialSense VQA][ss-vqa]     |      50.9     |      60.0     |      59.4     |
| [CountBenchQA][countbenchqa]   |      26.1     |      17.8     |      68.0     |

[coco-cap]: https://cocodataset.org/#home
[docvqa]: https://www.docvqa.org/
[info-vqa]: https://arxiv.org/abs/2104.12756
[mmmu]: https://arxiv.org/abs/2311.16502
[textvqa]: https://textvqa.org/
[realworldqa]: https://paperswithcode.com/dataset/realworldqa
[remi]: https://arxiv.org/html/2406.09175v1
[ai2d]: https://allenai.org/data/diagrams
[chartqa]: https://arxiv.org/abs/2203.10244
[vqav2]: https://visualqa.org/index.html
[blinkvqa]: https://arxiv.org/abs/2404.12390
[okvqa]: https://okvqa.allenai.org/
[tallyqa]: https://arxiv.org/abs/1810.12440
[ss-vqa]: https://arxiv.org/abs/1908.02660
[countbenchqa]: https://github.com/google-research/big_vision/blob/main/big_vision/datasets/countbenchqa/
