How I build Rails apps
VCR
I advocate for the use of VCR to test against AI APIs. Although it takes some effort to set up and get used to, it allows you to:
- Test your code against the actual APIs, which is slow and costs a little money, but gives you a snapshot of the real world.
- Just comment out your OpenAI API token to run your specs instantly against the recorded cassettes that exactly mirror the real API. This is free and as fast as a unit test.
- At any time, rerun all specs against the actual API and regenerate the responses that are automatically stored in your fixtures, to validate that your code will still work against the changing reality of the third-party API.
Installation
First, install VCR & WebMock.
# Gemfile
group :test do
  # Use VCR to record HTTP interactions and replay them.
  # https://github.com/vcr/vcr
  gem "vcr", "~> 6.2.0"
  # Use WebMock to stub HTTP requests.
  # https://github.com/bblimke/webmock
  gem "webmock", "~> 3.23.0"
end
Setup
Configure VCR to ALWAYS record (:all) cassettes if the OPENAI_ACCESS_TOKEN is present, and otherwise to record only when a cassette is not yet present (:new_episodes). Also redact your secret keys from the fixtures:
require "vcr"
require "dotenv/load"
VCR.configure do |c|
c.hook_into :webmock
c.cassette_library_dir = "spec/fixtures/cassettes"
# Record new episodes if the access token is present.
c.default_cassette_options = {
record: ENV.fetch("OPENAI_ACCESS_TOKEN", nil) ? :all : :new_episodes,
match_requests_on: [:method, :uri]
}
c.filter_sensitive_data("<OPENAI_ACCESS_TOKEN>") { ENV["OPENAI_ACCESS_TOKEN"] }
c.filter_sensitive_data("<OPENAI_ORGANIZATION_ID>") { ENV["OPENAI_ORGANIZATION_ID"] }
end
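These are only defaults; individual specs can override them per cassette when needed. For example, to force pure playback so a spec fails rather than silently hitting the API (a sketch, with an illustrative cassette name):

# :none raises if a request has no recorded match, instead of calling out.
VCR.use_cassette("chat_completion", record: :none) do
  # ...code that triggers the HTTP request...
end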
Config
When no actual token is present, a fallback dummy-token is needed in tests to prevent ruby-openai from raising an error. Requests will then hit the recorded VCR cassettes and get the recorded responses back.
# config/initializers/openai.rb
OpenAI.configure do |config|
  config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN", Rails.env.test? ? "dummy-token" : nil)
  config.organization_id = ENV.fetch("OPENAI_ORGANIZATION_ID", nil)
end
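With this in place, any client built by ruby-openai picks up the token automatically, real or dummy. A minimal sketch of the kind of call VCR will record (the model and prompt here are just placeholders):

client = OpenAI::Client.new
response = client.chat(
  parameters: {
    model: "gpt-4",
    messages: [{role: "user", content: "Hi there"}]
  }
)
response.dig("choices", 0, "message", "content") # => the assistant's reply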
Usage
This usage example comes from the AI Engine Starter Kit.
1 cassette per request
It's best to use 1 cassette for 1 API request, and not to reuse them in different tests unless you know exactly what you're doing. Give the cassettes explanatory names, and wrap the code that will trigger the API call in a VCR block like VCR.use_cassette("unique_cassette_name") do ... end, for example:
require "rails_helper"
RSpec.describe MessagesController, type: :request do
let(:current_user) { create(:user) }
before do
sign_in current_user
end
describe "POST /create" do
context "with valid parameters" do
context "with a chat" do
let(:chat) { current_user.chats.create }
let(:model) { AI::Engine::MODEL_OPTIONS.sample }
let(:valid_attributes) { {chat_id: chat.id, content: "Hi there", model: model} }
it "creates a new Message" do
# Sends the message history off to OpenAI and gets the response.
VCR.use_cassette("requests_chat_messages_create_and_run") do
expect {
post messages_url, as: :turbo_stream, params: {message: valid_attributes}
}.to change(chat.messages, :count).by(2)
end
expect(chat.messages.count).to eq(2)
response = chat.messages.last
expect(response.content).to be_present
expect(response.model).to eq(model)
expect(response.remote_id).to eq(nil)
end
end
end
end
end
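The first run against the live API records this interaction to spec/fixtures/cassettes/requests_chat_messages_create_and_run.yml, a YAML file named after the string passed to VCR.use_cassette.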
Streaming data can then be seen in the fixture file; here, 3 chunks build up the reply "Hello!":
data: {"id":"chatcmpl-9eewU9DFnKgTFs6eSjFgMye4q6RL0","object":"chat.completion.chunk","created":1719477530,"model":"gpt-4-0613","system_fingerprint":null,"choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}],"usage":null}
data: {"id":"chatcmpl-9eewU9DFnKgTFs6eSjFgMye4q6RL0","object":"chat.completion.chunk","created":1719477530,"model":"gpt-4-0613","system_fingerprint":null,"choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}],"usage":null}
data: {"id":"chatcmpl-9eewU9DFnKgTFs6eSjFgMye4q6RL0","object":"chat.completion.chunk","created":1719477530,"model":"gpt-4-0613","system_fingerprint":null,"choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}],"usage":null}
Expect non-determinism
I generally don't test the content of the AI response, since this naturally changes a lot given how LLMs work. Instead, just check that SOME text (or image, file, etc.) was returned. Specs like this are useful for checking the thing that usually breaks, i.e. your integration with the API (since it can be broken at either end).
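Concretely, that means preferring presence checks over exact matches, as in the spec above:

# Brittle: the exact wording will drift every time you re-record.
# expect(chat.messages.last.content).to eq("Hello!")

# Robust: just check that a response came back at all.
expect(chat.messages.last.content).to be_present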
Workflows
Initial recording of cassettes
- Go to the .env file and uncomment OPENAI_ACCESS_TOKEN=123abc... (sketched after this list)
- Run specs with bundle exec rspec, which will record new cassettes
- Go to the .env file and comment out # OPENAI_ACCESS_TOKEN=123abc...
- Run specs with bundle exec rspec, which will use the recorded cassettes
- Ensure the fixtures are committed to source control so they can be used by others and in Continuous Integration, and to spot any changes
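The .env toggle from these steps looks something like this (the token value is a placeholder):

# .env
# Commented out: specs replay the recorded cassettes.
# Uncomment to hit the real API and re-record:
# OPENAI_ACCESS_TOKEN=123abc...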
Debugging
- To debug, you can examine the fixtures to see exactly what the LLM API has returned, even when streaming
- Use source control to see exactly what's changed in the fixtures
- Delete a fixture and rerun the spec to get a refreshed API response
Resetting/cleaning up
- At any time you can delete all the fixtures in spec/fixtures/cassettes
- Go to the .env file and uncomment OPENAI_ACCESS_TOKEN=123abc...
- Run specs with bundle exec rspec, which will record new cassettes
Gotchas
Leaking Secrets
There's a full example of this in the AI Engine Starter Kit.
Ensure your OPENAI_ACCESS_TOKEN is correctly redacted from the fixtures when you first set it up. Redaction might not happen if VCR isn't set up correctly, or if the OPENAI_ACCESS_TOKEN ENV var is not available in spec_helper.rb for some reason. When set up correctly, references to the token in fixtures should look something like:
headers:
  Content-Type:
  - application/json
  Authorization:
  - Bearer <OPENAI_ACCESS_TOKEN>
  Openai-Organization:
  - '<OPENAI_ORGANIZATION_ID>'
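If you instead see the raw token in a cassette (e.g. a Bearer sk-... value in the Authorization header), the filter isn't being applied: delete the affected cassettes, fix the VCR configuration, re-record, and rotate the leaked key.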
Endless cassettes
Some slight changes to specs can cause VCR to append requests to cassette files instead of replacing their contents. These can stack up, but they shouldn't cause any problems, and the fix is simple: delete the cassette and rerun against the API to get a clean cassette that contains only one request and response.
Support
If you have any issues, please email me at [email protected] and I'll respond ASAP.