Navigation

Vectors with an ORM

(Updated )

Note

PlanetScale MySQL vectors is still in beta and is not intended for use in production workloads.

Many modern web development frameworks provide ORMs to make mapping data between your database and the objects in your application a seamless experience. Here, we show several examples of how you can use the VECTOR MySQL type with ORMs. These docs to not provide a comprehensive list of how to use vectors with all ORMs. Rather, small examples for several popular choices are provided (Drizzle, Prisma, Rails). If you use a different ORM and are having trouble getting it to work with vectors, please reach out.

Also note that for now, using PlanetScale MySQL vectors with these ORMs requires either a custom type or running raw MySQL queries.

Drizzle

Here, we'll show how you can create and insert rows into a table that has a VECTOR column using Drizzle. First, modify your schema.ts file to import the appropriate items and define two new types. The first type is for the distance functions, and the second is to represent a vector.

import { sql } from 'drizzle-orm'
import { mysqlTable, customType, text, MySqlColumn, bigint } from 'drizzle-orm/mysql-core'

export type DistanceFunction = 'COSINE' | 'DOT' | 'L2' | 'L2_SQUARED'

export const vector = customType<{
  data: ArrayBuffer
  config: { length: number }
  configRequired: true
  driverData: Buffer
}>({
  dataType(config) {
    return `VECTOR(${config.length})`
  },
  fromDriver(value) {
    return value.buffer as ArrayBuffer
  },
  toDriver(value) {
    return Buffer.from(value)
  }
})

With these in place, we can go ahead and declare our table, also in schema.ts.

export const product = mysqlTable('product', {
  id: bigint('id', { mode: 'number', unsigned: true }).autoincrement().primaryKey(),
  name: text('name'),
  description: text('description'),
  embedding: vector('embedding', { length: 384 })
})

This would be in addition to other schema declarations you need for your application.

We now have an object to model a table with a vector embedding column. Elsewhere in our application, we can perform typical operations like inserts, deletes, searches, etc. For example, to insert a row into this table, do the following:

const embedding = // generate an embedding with your preferred API
const serializedEmbedding = Buffer.from((new Float32Array(embedding)).buffer)
await db.insert(product).values({
  description: 'hi there',
  embedding: serializedEmbedding
})

We can also perform searches on the data:

const rankedItems = await db
  .select({ description: product.description })
  .from(product)
  .orderBy(`DISTANCE(TO_VECTOR(${embedding}), ${product.embedding}, L2_SQUARED)`)
  .limit(10)

If you have a large data set, you'll want to make sure you create an index on this vector column.

Prisma

Next, let's look at what it takes to get a table with a VECTOR column working with Prisma. As of this writing, Prisma does not support custom types.

Until Prisma provides support, you can still use vectors in a Prisma-powered application by using the Unsupported function in your schema.prisma and then use raw queries to perform vector searches. We can add a new table with a VECTOR(4) column by adding the following to our schema.prisma file.

model Product {
  id          BigInt     @id @default(autoincrement()) @db.UnsignedBigInt
  name        String?    @db.Text
  description String?    @db.Text
  embedding   Unsupported("vector(4)")?
}

When running prisma db push, it will create the table with the embedding column having type VECTOR(4), even though Prisma does not technically support vectors.

From here, you can use Prisma's queryRaw feature to run raw SQL queries. For example, to insert a row into this table.

const name = 'Pots'
const description = 'For cooking'
const embedding = '[0.5, 0.4, 0.3, 0.2]'
await prisma.$queryRaw(
  Prisma.sql`
    INSERT INTO
      Product (name, description, embedding)
      VALUES(${name}, ${description}, TO_VECTOR(${embedding}))`
)

And to retrieve results from this table:

const result = await prisma.$queryRaw(
  Prisma.sql`SELECT * FROM Product ORDER BY DISTANCE(TO_VECTOR(${query_vector}), embedding, 'l2_squared')`
)
console.log(result)

If you have a large data set, you'll want to make sure you create an index on this vector column.

Ruby on Rails

Lets look at how you can work with a vector column in a Ruby on Rails application. Say you have an existing object that models rows in a table, and you are using ActiveRecord to manage the mapping between your objects and your database. For example, an object representing a tweet in app/models/tweet.rb.

class Tweet < ApplicationRecord
  belongs_to :user
  has_one_attached :image
  default_scope -> { order(created_at: :desc) }
  validates :content, presence: true, length: { maximum: 140 }
  validates :user_id, presence: true
end

Since vectors are a new type for MySQL, we will add a custom migration to handle adding and dropping a VECTOR column to this table in the database. To do this, add a new migration in db/migrations like so:

class AddTweetEmbeddings < ActiveRecord::Migration[7.0]
  def up
    execute <<-SQL
      ALTER TABLE tweets ADD COLUMN embedding VECTOR(1536);
    SQL
    execute <<-SQL
      CREATE VECTOR INDEX idx_tweet_embeddings ON tweets(embedding) '{"type":"spann","distance":"cosine"}';
    SQL
  end

  def down
    execute <<-SQL
      ALTER TABLE tweets DROP COLUMN embedding;
    SQL
  end
end

You'll also need to make the appropriate adjustments in your project to ensure that this gets executed when you run rails db:migrate. With the schema updated appropriately, you can run raw SQL queries via ActiveRecord to insert rows with vectors and perform search. For example, to insert a new row with a vector, you can do something like this:

embedding = [1.0, 0.5, 0.25, 0.125].to_json
created_at = Time.now.to_s
updated_at = Time.now.to_s
content = 'A tweet!'
user_id =  100

sql = <<-SQL
  INSERT INTO tweets (user_id, content, content_embeddings, created_at, updated_at)
    VALUES (
      #{ActiveRecord::Base.connection.quote(user_id)},
      #{ActiveRecord::Base.connection.quote(content)},
      TO_VECTOR(#{ActiveRecord::Base.connection.quote(embedding)}),
      #{ActiveRecord::Base.connection.quote(created_at)},
      #{ActiveRecord::Base.connection.quote(updated_at)}
    )
SQL

ActiveRecord::Base.connection.execute(sql)

To perform a similarity search based on an input embedding, you can execute the following:

embedding = [1.0, 0.5, 0.25, 0.125]
sql = <<-SQL
  SELECT id, content, distance(content_embeddings, TO_VECTOR('#{embedding}'), 'COSINE') as d FROM tweets ORDER BY d DESC LIMIT 10
SQL

results = ActiveRecord::Base.connection.execute(sql)

Other ORMs

If you have requests for other ORMs you'd like to see documented for using with vectors, please reach out with your questions. On the other hand, if you get PlanetScale MySQL vectors working with an ORM not listed here and would like to share your technique, please do as well.

Need help?

Get help from the PlanetScale Support team, or join our GitHub discussion board to see how others are using PlanetScale.