How to prompt Code Llama

Two weeks ago, Meta released the Code Llama model in three variants:

Instruct
Code completion
Python

This guide walks through how to structure prompts for the different variants and features of Code Llama.

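The examples below use the 7 billion parameter model with 4-bit quantization, but 13 billion and 34 billion parameter models are available as well. This guide uses the open-source Ollama project to download and prompt Code Llama, but the same prompts work with other model providers and runtimes.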
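Every prompt in this guide is sent as `ollama run <model> '<prompt>'`. For scripting, the same call can be wrapped in Python. This is a minimal sketch, assuming the `ollama` binary is installed and on the PATH and that it prints the reply to standard output; the helper names `build_ollama_command` and `run_prompt` are made up for this illustration.

```python
import subprocess


def build_ollama_command(model: str, prompt: str) -> list[str]:
    """Return the argument list for `ollama run <model> <prompt>`."""
    # Passing a list (not a shell string) avoids any quoting issues in the prompt.
    return ["ollama", "run", model, prompt]


def run_prompt(model: str, prompt: str) -> str:
    """Send one prompt to a local model and return the reply as text.

    Assumption: `ollama` is on the PATH and the model has been pulled
    (or can be pulled on first use).
    """
    completed = subprocess.run(
        build_ollama_command(model, prompt),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Calling `run_prompt("codellama:7b-instruct", "Write a python function to generate the nth fibonacci number.")` would then return the reply as a string.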

Instruct

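The instruct model is trained to output human-like answers to questions. This variant is the easiest to use and behaves closest to ChatGPT, answering with both natural language and code: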

Prompt

ollama run codellama:7b-instruct 'You are an expert programmer that writes simple, concise code and explanations. Write a python function to generate the nth fibonacci number.'

Response

Here is a Python function that generates the nth Fibonacci number:

def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n-1) + fib(n-2)

This function uses the recursive formula for the Fibonacci sequence, which is:

fib(n) = fib(n-1) + fib(n-2)

Code review

Code reviews can be time consuming for teams. Code Llama can help by acting as a second pair of eyes on a commit or pull request to catch simple bugs:

Prompt

ollama run codellama '
Where is the bug in this code?

def fib(n):
    if n <= 0:
        return n
    else:
        return fib(n-1) + fib(n-2)
'

Response

The bug in this code is that it does not handle the case where `n` is equal to 1.
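To see what the response above is pointing at, the function from the prompt can be run next to a corrected version. This is a small sketch; the names `fib_buggy` and `fib_fixed` are introduced here only for comparison.

```python
def fib_buggy(n):
    # Base case from the prompt: only n <= 0 stops the recursion,
    # so fib_buggy(1) falls through to fib_buggy(0) + fib_buggy(-1).
    if n <= 0:
        return n
    return fib_buggy(n - 1) + fib_buggy(n - 2)


def fib_fixed(n):
    # n <= 1 covers both base cases: fib(0) = 0 and fib(1) = 1.
    if n <= 1:
        return n
    return fib_fixed(n - 1) + fib_fixed(n - 2)


print(fib_buggy(1), fib_fixed(1))         # -1 1
print([fib_fixed(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Because the buggy base case returns `n` for negative inputs, `fib_buggy(1)` evaluates to `0 + (-1)`, which is why the missing `n == 1` case matters.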

Unit tests

Writing unit tests often requires a fair amount of boilerplate code. Code Llama can help:

Prompt

ollama run codellama "write a unit test for this function: $(cat fib.py)"

Response

Here's an example of how you could write a unit test for the `fib` function:

```
import unittest

class TestFib(unittest.TestCase):
    def test_fib(self):
        self.assertEqual(fib(0), 0)
        self.assertEqual(fib(1), 1)
        self.assertEqual(fib(2), 1)
        self.assertEqual(fib(3), 2)
        self.assertEqual(fib(4), 3)
        self.assertEqual(fib(5), 5)
        self.assertEqual(fib(6), 8)
        self.assertEqual(fib(7), 13)
        self.assertEqual(fib(8), 21)
        self.assertEqual(fib(9), 34)
if __name__ == '__main__':
    unittest.main()
```

This test case covers all the cases where the input is an integer between 0 and 9, inclusive.
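The generated test calls `fib` without importing or defining it, so it does not run on its own. Below is a self-contained variant, assuming `fib.py` holds the recursive function from the Instruct section (the guide does not show the file's contents). The `run_tests` helper is hypothetical and only there so the suite can be run from a script.

```python
import unittest


def fib(n):
    # Assumed contents of fib.py: the recursive definition shown earlier.
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


class TestFib(unittest.TestCase):
    def test_first_ten_values(self):
        expected = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
        for n, value in enumerate(expected):
            # subTest reports each failing input separately.
            with self.subTest(n=n):
                self.assertEqual(fib(n), value)


def run_tests() -> unittest.TestResult:
    """Load and run TestFib, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFib)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

If the function lives in a separate `fib.py`, replacing the inline definition with `from fib import fib` should be enough.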

Code completion

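The standard code completion model simply continues generating tokens based on the provided prompt. This variant works best when you only want code, for example having the model write a function after it sees a function signature or a comment: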

Prompt

ollama run codellama:7b-code '# A simple python function to remove whitespace from a string:'

Response

def remove_whitespace(s):
    return ''.join(s.split())
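A quick check of the completion above: `str.split()` with no arguments splits on any run of whitespace, so joining the pieces removes spaces, tabs, and newlines alike. The sample string is made up for illustration.

```python
def remove_whitespace(s):
    # split() with no separator breaks on any whitespace run and drops empty pieces.
    return ''.join(s.split())


print(remove_whitespace("  code \t llama\n"))  # codellama
```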

Infill

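Fill-in-the-middle (FIM), or more briefly infill, is a special prompt format supported by the code completion model that completes code between two already written blocks. This is especially effective for inserting code into an existing function, such as: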

def compute_gcd(x, y):

    <FILL>

    return result

With Code Llama, infill prompts require a special format that the model expects:

<PRE> {prefix} <SUF>{suffix} <MID>
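The template above can be filled in with a small helper. This is a minimal sketch: the function names and the `<FILL>` marker handling are assumptions made for illustration, and stripping the surrounding whitespace is a simplification that keeps the prompt on one line.

```python
def split_at_marker(code: str, marker: str = "<FILL>") -> tuple[str, str]:
    """Split source code into (prefix, suffix) around the first marker."""
    prefix, _, suffix = code.partition(marker)
    return prefix, suffix


def build_infill_prompt(prefix: str, suffix: str) -> str:
    """Fill the `<PRE> {prefix} <SUF>{suffix} <MID>` template."""
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"


code = "def compute_gcd(x, y):\n\n    <FILL>\n\n    return result"
prefix, suffix = split_at_marker(code)

# Whitespace is stripped here only to get a compact single-line prompt.
print(build_infill_prompt(prefix.strip(), suffix.strip()))
# <PRE> def compute_gcd(x, y): <SUF>return result <MID>
```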

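To use this with existing code, split the code from the example above into two parts: the prefix (everything before the gap) and the suffix (everything after it). For example, for the GCD example above: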

Prompt

ollama run codellama:7b-code '<PRE> def compute_gcd(x, y): <SUF>return result <MID>'

While results will vary, you should get something like this:

Response

  if x == y:
        return x

    if x > y:
        x = x - y
    else:
        y = y - x

    result = compute_gcd(x, y)

Note: the model may return <EOT> at the end of the result. This is a special token that marks the end of the response, similar to <PRE>, <SUF>, and <MID>.
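When the infilled text is pasted back into a file, the trailing `<EOT>` needs to be dropped first. The sketch below shows one way to do that and to stitch the pieces together. The helper names are hypothetical, and `raw_response` is a hand-written stand-in for a model reply, not actual model output.

```python
EOT = "<EOT>"


def clean_infill_response(response: str) -> str:
    """Remove a trailing <EOT> token (and surrounding whitespace), if present."""
    text = response.rstrip()
    if text.endswith(EOT):
        text = text[: -len(EOT)].rstrip()
    return text


def stitch(prefix: str, middle: str, suffix: str) -> str:
    """Join prefix, generated middle, and suffix, one part per line group."""
    return f"{prefix}\n{middle}\n{suffix}"


# Hand-written stand-in for a model reply (an assumption for this example).
raw_response = "    while y:\n        x, y = y, x % y\n    result = x <EOT>"

source = stitch(
    "def compute_gcd(x, y):",
    clean_infill_response(raw_response),
    "    return result",
)
print(source)
```

The suffix is passed with its original indentation so the stitched function stays valid Python.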

Python

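As a thank you to the community and tooling that made the model possible, the authors of Code Llama included a Python variant that is fine-tuned on 100B additional Python tokens, making it a good choice for machine learning tooling or any other Python code: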

Prompt

ollama run codellama:7b-python '
# django view for rendering the current day and time without a template
def current_datetime(request):'

Response

    now = datetime.now()
    html = "<html><body>It is now %s.</body></html>" % now
    return HttpResponse(html)
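The completion above relies on `datetime` and `HttpResponse` being imported, which neither the prompt nor the response shows. The sketch below isolates the string formatting in a plain function so it runs without Django installed. `render_current_datetime` is a hypothetical helper, and the Django wiring appears only as comments.

```python
from datetime import datetime


def render_current_datetime(now: datetime) -> str:
    """Build the same HTML body as the completion, for a given timestamp."""
    return "<html><body>It is now %s.</body></html>" % now


# In a Django project the view would wrap this string, for example:
#     from django.http import HttpResponse
#
#     def current_datetime(request):
#         return HttpResponse(render_current_datetime(datetime.now()))

print(render_current_datetime(datetime(2023, 9, 9, 12, 0, 0)))
# <html><body>It is now 2023-09-09 12:00:00.</body></html>
```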

Tools built on Code Llama

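Cody has an experimental version that uses Code Llama with infill support.
Continue supports Code Llama as a drop-in replacement for GPT-4.
Fine-tuned versions of Code Llama are available from the Phind and WizardLM teams.
Open Interpreter can use Code Llama to generate functions that are then run locally in the terminal.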

September 9, 2023