embed(text)
The embed function generates an embedding for a given text input. An embedding is a numerical representation of the text that can be used in machine learning tasks such as measuring the similarity between texts.
Syntax
embed(text)
Parameters
Parameter | Description | Possible Values | Mandatory | Sample Value |
---|---|---|---|---|
text | A string of text for which to generate embeddings. | String | Yes | 'sample query text' |
Description
The embed function processes the input text and returns its embedding as a vector of floating-point numbers that represents the semantic meaning of the text.
Usage Example
Here is an example of how to use the embed function in a SQL agent definition:
CREATE Agent investor_guide(query String "Answer User's Questions") AS
WITH tbl AS (
    SELECT CAST(embed($query) AS `Array`(`Float32`)) AS query
)
SELECT
    p.id AS id,
    p.content AS content,
    cosineDistance(embeddings, query) AS cosineDistance,
    p.filename AS company
FROM
    pdf_embeddings AS pe
JOIN
    pdfs AS p ON p.id = pe.id
CROSS JOIN
    tbl
ORDER BY
    cosineDistance ASC
LIMIT 5;
Description of Example
- Embedding Generation: The embed function generates an embedding for the query parameter, which contains the user's query as a string.
- Intermediate Table (tbl): An intermediate table, tbl, holds the embedding of the query string as an array of Float32 values.
- Use of Embeddings: The embedding generated by the embed function is used to compute the cosine distance between the query and each document in the pdf_embeddings table. This helps in finding the most relevant documents for the user's query.
This example demonstrates how to use the embed function to convert a user's query into an embedding and use that embedding to run a similarity search against a collection of documents. The cosineDistance function scores each document against the query, and ordering by that score in ascending order with LIMIT 5 returns the five closest matches.
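
To make the ranking step concrete, the following Python sketch mirrors what the ORDER BY cosineDistance ASC LIMIT clause does, assuming cosine distance is defined as one minus cosine similarity. It illustrates the idea only and is not the platform's implementation: the helper names cosine_distance and top_k and the tiny 3-dimensional vectors are hypothetical, and real embeddings returned by embed would have many more dimensions.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=5):
    """Rank (doc_id, embedding) pairs by ascending cosine distance, keep k."""
    scored = [(doc_id, cosine_distance(emb, query_vec)) for doc_id, emb in docs]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]

# Hypothetical stand-ins for a query embedding and stored document embeddings.
query_vec = [1.0, 0.0, 0.0]
docs = [
    ("a", [0.0, 1.0, 0.0]),  # orthogonal to the query
    ("b", [2.0, 0.0, 0.0]),  # same direction as the query
    ("c", [1.0, 1.0, 0.0]),  # 45 degrees from the query
]

results = top_k(query_vec, docs, k=2)
for doc_id, distance in results:
    print(doc_id, round(distance, 4))
```

Because cosine distance depends only on the angle between vectors, document "b" ranks first even though its length differs from the query's.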