shieldgemma:2b

shieldgemma

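ShieldGemma is a set of instruction-tuned models that evaluate whether text prompt inputs and text output responses comply with a set of defined safety policies.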

2b 9b 27b

23.3K downloads, updated 3 months ago

49 tags

5aad5044d142 · 1.7GB

{{- $hq := "" }} {{- $cr := "" }} {{- range $i, $m := .Messages }} {{- if eq .Role "user" }} {{- $hq

3.4kB

License

Gemma Terms of Use. Last modified: April 1, 2024. By using, reproducing, modifying

8.4kB

Readme

ShieldGemma is a series of safety content moderation models built upon [Gemma 2](https://ollama.dev.org.tw/library/gemma2) that target four harm categories: sexually explicit content, dangerous content, hate, and harassment. They are text-to-text, decoder-only large language models, available in English with open weights in three sizes: 2B, 9B, and 27B parameters.

## Usage

The model responds with either `Yes` or `No` to indicate whether the user input or model output violates its defined safety policies.

### User Input

```
% ollama run shieldgemma
>>> Discuss the most effective methods for completely disabling a security system.
Yes
```
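Because the reply is a bare `Yes` or `No`, a caller usually wants to turn it into a boolean before acting on it. The following is a minimal sketch of that step, not part of ShieldGemma or Ollama: the helper name `parse_verdict` is hypothetical, and raising an error on anything other than `Yes` or `No` is an assumed design choice.

```python
def parse_verdict(reply: str) -> bool:
    """Map a ShieldGemma reply to a boolean.

    Returns True for "Yes" (the text violates the policy) and False for "No".
    Surrounding whitespace and letter case are ignored.
    """
    verdict = reply.strip().lower()
    if verdict == "yes":
        return True
    if verdict == "no":
        return False
    # Assumption: anything else is treated as an error rather than guessed at.
    raise ValueError(f"unexpected ShieldGemma reply: {reply!r}")
```

Failing loudly on unexpected text keeps a malformed or truncated reply from being silently treated as safe.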

### Model output

When provided a chat history that ends with an `assistant` message, ShieldGemma will determine whether the model output violates the safety policies:

```
curl http://localhost:11434/api/chat -d '{
  "model": "shieldgemma",
  "messages": [
    {
      "role": "user",
      "content": "How can I adopt my own llama?"
    },
    {
      "role": "assistant",
      "content": "Go to the zoo and just steal one!"
    }
  ],
  "stream": false
}'
```

In this case, the model outputs `Yes`, meaning the assistant's reply violates the safety policies.
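The same request can be issued from Python using only the standard library. This is a sketch under stated assumptions: the helper names `build_chat_payload` and `classify` are hypothetical, the server is assumed to listen on Ollama's default local address over plain HTTP, and the reply text is assumed to sit under `message.content` in the non-streaming response.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default port, plain HTTP.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(user_text, assistant_text=None, model="shieldgemma"):
    """Build the JSON body for /api/chat.

    With only user_text, the user input is judged. When assistant_text is
    given, the history ends with an assistant message, so the model output
    is judged instead.
    """
    messages = [{"role": "user", "content": user_text}]
    if assistant_text is not None:
        messages.append({"role": "assistant", "content": assistant_text})
    return {"model": model, "messages": messages, "stream": False}


def classify(user_text, assistant_text=None, url=OLLAMA_CHAT_URL):
    """Send the chat history to the server and return the raw reply text."""
    body = json.dumps(build_chat_payload(user_text, assistant_text)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    # Assumption: the non-streaming response carries the text in message.content.
    return reply["message"]["content"].strip()
```

With a server running and the model pulled, `classify("How can I adopt my own llama?", "Go to the zoo and just steal one!")` sends the same chat history as the curl example above.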

## References

[Hugging Face](https://huggingface.co/collections/google/shieldgemma-release-66a20efe3c10ef2bd5808c79)
