gemma3:27b - Ollama

![Google Gemma 3 logo](/assets/library/gemma3/b54bf767-f9c5-4284-b551-a49aebe3a3c2)

> This model requires Ollama 0.6 or later. [Download Ollama](https://ollama.dev.org.tw/download)

Gemma is a lightweight family of models from Google built on Gemini technology. The Gemma 3 models at 4B and above are multimodal, processing both text and images, and feature a 128K context window; the 1B model is text-only with a 32k context window. The family supports over 140 languages. Available in 1B, 4B, 12B, and 27B parameter sizes, the models excel in tasks like question answering, summarization, and reasoning, while their compact design allows deployment on resource-limited devices.

## Models

### Text

**1B parameter model** (32k context window)

```
ollama run gemma3:1b
```
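
Besides the interactive CLI, a locally running Ollama server can also be queried over HTTP. The following is a minimal sketch, assuming the server listens on Ollama's default local address (`http://localhost:11434`) and that `gemma3:1b` has already been pulled. The helper names `build_generate_payload` and `generate` are illustrative and not part of Ollama itself.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(prompt, model="gemma3:1b", num_ctx=None):
    """Build the JSON body for a non-streaming /api/generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if num_ctx is not None:
        # Optionally request a specific context length (in tokens).
        payload["options"] = {"num_ctx": num_ctx}
    return payload


def generate(prompt, model="gemma3:1b", num_ctx=None, url=OLLAMA_URL):
    """Send the prompt to the Ollama server and return the generated text."""
    body = json.dumps(build_generate_payload(prompt, model, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, a call such as `generate("Summarize what a context window is in one sentence.")` should return the model's reply as a string. Passing `num_ctx` is optional and is shown only as one way to request a context length up to what the chosen model supports.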

### Multimodal (Vision)

**4B parameter model** (128k context window)

```
ollama run gemma3:4b
```

**12B parameter model** (128k context window)

```
ollama run gemma3:12b
```

**27B parameter model** (128k context window)

```
ollama run gemma3:27b
```
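
The vision-capable sizes can take an image alongside the prompt. The sketch below shows one way to do this through Ollama's HTTP API, which accepts base64-encoded images in an `images` list. It assumes a server on the default local address (`http://localhost:11434`) and a pulled `gemma3:4b` model. The helper names `build_vision_payload` and `describe_image` are hypothetical, chosen for illustration.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_vision_payload(prompt, image_bytes, model="gemma3:4b"):
    """Build a non-streaming /api/generate body carrying one base64-encoded image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"model": model, "prompt": prompt, "images": [encoded], "stream": False}


def describe_image(path, prompt="Describe this image.", model="gemma3:4b",
                   url=OLLAMA_URL):
    """Read an image file, send it with the prompt, and return the model's reply."""
    with open(path, "rb") as image_file:
        payload = build_vision_payload(prompt, image_file.read(), model)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Swapping `model` for `gemma3:12b` or `gemma3:27b` targets the larger vision models. The 1B model is listed above as text-only, so it is not a suitable target for image input.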

## Evaluation

![Chatbot Arena ELO Score](/assets/library/gemma3/89dc5a19-179e-4dd3-8e5d-12ad54973148)

### Benchmark Results

These models were evaluated against a large collection of different datasets and
metrics to cover different aspects of text generation:

#### Reasoning, logic and code capabilities

| Benchmark                      | Metric         | Gemma 3 PT 1B  | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------ |----------------|:--------------:|:-------------:|:--------------:|:--------------:|
| [HellaSwag][hellaswag]         | 10-shot        |      62.3      |     77.2      |      84.2      |      85.6      |
| [BoolQ][boolq]                 | 0-shot         |      63.2      |     72.3      |      78.8      |      82.4      |
| [PIQA][piqa]                   | 0-shot         |      73.8      |     79.6      |      81.8      |      83.3      |
| [SocialIQA][socialiqa]         | 0-shot         |      48.9      |     51.9      |      53.4      |      54.9      |
| [TriviaQA][triviaqa]           | 5-shot         |      39.8      |     65.8      |      78.2      |      85.5      |
| [Natural Questions][naturalq]  | 5-shot         |      9.48      |     20.0      |      31.4      |      36.1      |
| [ARC-c][arc]                   | 25-shot        |      38.4      |     56.2      |      68.9      |      70.6      |
| [ARC-e][arc]                   | 0-shot         |      73.0      |     82.4      |      88.3      |      89.0      |
| [WinoGrande][winogrande]       | 5-shot         |      58.2      |     64.7      |      74.3      |      78.8      |
| [BIG-Bench Hard][bbh]          |                |      28.4      |     50.9      |      72.6      |      77.7      |
| [DROP][drop]                   | 3-shot, F1     |      42.4      |     60.1      |      72.2      |      77.2      |
| [AGIEval][agieval]             | 3-5-shot       |      22.2      |     42.1      |      57.4      |      66.2      |
| [MMLU][mmlu]                   | 5-shot, top-1  |      26.5      |     59.6      |      74.5      |      78.6      |
| [MATH][math]                   | 4-shot         |       --       |     24.2      |      43.3      |      50.0      |
| [GSM8K][gsm8k]                 | 5-shot, maj@1  |      1.36      |     38.4      |      71.0      |      82.6      |
| [GPQA][gpqa]                   |                |      9.38      |     15.0      |      25.4      |      24.3      |
| [MMLU][mmlu] (Pro)             | 5-shot         |      11.2      |     23.7      |      40.8      |      43.9      |
| [MBPP][mbpp]                   | 3-shot         |      9.80      |     46.0      |      60.4      |      65.6      |
| [HumanEval][humaneval]         | pass@1         |      6.10      |     36.0      |      45.7      |      48.8      |
| [MMLU][mmlu] (Pro COT)         | 5-shot         |      9.7       |     NaN       |      NaN       |      NaN       |

[hellaswag]: https://arxiv.org/abs/1905.07830
[boolq]: https://arxiv.org/abs/1905.10044
[piqa]: https://arxiv.org/abs/1911.11641
[socialiqa]: https://arxiv.org/abs/1904.09728
[triviaqa]: https://arxiv.org/abs/1705.03551
[naturalq]: https://github.com/google-research-datasets/natural-questions
[arc]: https://arxiv.org/abs/1911.01547
[winogrande]: https://arxiv.org/abs/1907.10641
[bbh]: https://paperswithcode.com/dataset/bbh
[drop]: https://arxiv.org/abs/1903.00161
[agieval]: https://arxiv.org/abs/2304.06364
[mmlu]: https://arxiv.org/abs/2009.03300
[math]: https://arxiv.org/abs/2103.03874
[gsm8k]: https://arxiv.org/abs/2110.14168
[gpqa]: https://arxiv.org/abs/2311.12022
[mbpp]: https://arxiv.org/abs/2108.07732
[humaneval]: https://arxiv.org/abs/2107.03374

#### Multilingual capabilities

| Benchmark                            | Gemma 3 PT 1B | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------------ |:-------------:|:-------------:|:--------------:|:--------------:|
| [MGSM][mgsm]                         |      2.04     |      34.7     |      64.3      |      74.3      |
| [Global-MMLU-Lite][global-mmlu-lite] |      24.9     |      57.0     |      69.4      |      75.7      |
| [Belebele][belebele]                 |      26.6     |      59.4     |      78.0      |       --       |
| [WMT24++][wmt24pp] (ChrF)            |      36.7     |      48.4     |      53.9      |      55.7      |
| [FloRes][flores]                     |      29.5     |      39.2     |      46.0      |      48.8      |
| [XL-Sum][xlsum]                      |      4.82     |      8.55     |      12.2      |      14.9      |
| [XQuAD][xquad] (all)                 |      43.9     |      68.0     |      74.5      |      76.8      |

[mgsm]: https://arxiv.org/abs/2210.03057
[flores]: https://arxiv.org/abs/2106.03193
[belebele]: https://arxiv.org/abs/2308.16884
[xlsum]: https://arxiv.org/abs/2106.13822
[xquad]: https://arxiv.org/abs/1910.11856v3
[global-mmlu-lite]: https://huggingface.co/datasets/CohereForAI/Global-MMLU-Lite
[wmt24pp]: https://arxiv.org/abs/2502.12404v1

#### Multimodal capabilities

| Benchmark                      | Gemma 3 PT 4B | Gemma 3 PT 12B | Gemma 3 PT 27B |
| ------------------------------ |:-------------:|:--------------:|:--------------:|
| [COCOcap][coco-cap]            |      102      |      111      |      116      |
| [DocVQA][docvqa] (val)         |      72.8     |      82.3     |      85.6     |
| [InfoVQA][info-vqa] (val)      |      44.1     |      54.8     |      59.4     |
| [MMMU][mmmu] (pt)              |      39.2     |      50.3     |      56.1     |
| [TextVQA][textvqa] (val)       |      58.9     |      66.5     |      68.6     |
| [RealWorldQA][realworldqa]     |      45.5     |      52.2     |      53.9     |
| [ReMI][remi]                   |      27.3     |      38.5     |      44.8     |
| [AI2D][ai2d]                   |      63.2     |      75.2     |      79.0     |
| [ChartQA][chartqa]             |      45.4     |      60.9     |      63.8     |
| [ChartQA][chartqa] (augmented) |      81.8     |      88.5     |      88.7     |
| [VQAv2][vqav2]                 |       --      |       --      |       --      |
| [BLINK][blinkvqa]              |      38.0     |      35.9     |      39.6     |
| [OKVQA][okvqa]                 |      51.0     |      58.7     |      60.2     |
| [TallyQA][tallyqa]             |      42.5     |      51.8     |      54.3     |
| [SpatialSense VQA][ss-vqa]     |      50.9     |      60.0     |      59.4     |
| [CountBenchQA][countbenchqa]   |      26.1     |      17.8     |      68.0     |

[coco-cap]: https://cocodataset.org/#home
[docvqa]: https://www.docvqa.org/
[info-vqa]: https://arxiv.org/abs/2104.12756
[mmmu]: https://arxiv.org/abs/2311.16502
[textvqa]: https://textvqa.org/
[realworldqa]: https://paperswithcode.com/dataset/realworldqa
[remi]: https://arxiv.org/html/2406.09175v1
[ai2d]: https://allenai.org/data/diagrams
[chartqa]: https://arxiv.org/abs/2203.10244
[vqav2]: https://visualqa.org/index.html
[blinkvqa]: https://arxiv.org/abs/2404.12390
[okvqa]: https://okvqa.allenai.org/
[tallyqa]: https://arxiv.org/abs/1810.12440
[ss-vqa]: https://arxiv.org/abs/1908.02660
[countbenchqa]: https://github.com/google-research/big_vision/blob/main/big_vision/datasets/countbenchqa/
