Skip to main content

CREATE MODEL

The CREATE MODEL statement is used to create a machine learning model with specific options. The statement includes various options that define the type of model, provider, model, prompt, and additional settings.

Syntax

CREATE MODEL [IF NOT EXISTS] model_name (
arg1 [COMMENT 'Argument 1 description'],
arg2 [COMMENT 'Argument 2 description']
) ENGINE = OpenAI(api_key = 'api_key', model_name = 'gpt-3.5-turbo', temperature = 0.3)
PROMPT prompt_name | <inline prompt definition>,
[TOOLS (tool1(arg1), tool2(arg2))]
[SETTINGS option1 = 'value', option2 = 'value']

Using a provider

CREATE MODEL [IF NOT EXISTS] model_name (
arg1 [COMMENT 'Argument 1 description'],
arg2 [COMMENT 'Argument 2 description']
) USING provider_name
PROMPT prompt_name
SETTINGS retries = 2;

Options

OptionDescriptionPossible ValuesMandatory
IF NOT EXISTSIf specified, the model will only be created if it does not already exist.N/AOptional
model_nameThe name of the model to be used.String (e.g., 'gpt-3.5-turbo')Yes
argsArguments to be passed to the model, specified as a list of arguments.List of stringsOptional
ENGINEThe LLM Provider Engine to be usedString (e.g., 'OpenAI')Yes
provider_nameThe name of the provider to be used.StringOptional
api_keyAn API key for accessing the provider's services. API key can be exported as ENV variableStringOptional
PROMPTThe prompt to be used with the model.StringYes
TOOLSList of tools to be used with the model.List of stringsOptional
SETTINGSAdditional settings for the model, specified as a list of options. For example, retries, it's the number of attempts to execute the model.List of key-value pairsOptional
model_typeIf specified as EMBEDDING, the type of the model will be embedding; and if it is specified as COMPLETIONS or left blank, the type will be completions model.EMBEDDING/COMPLETIONS/(nothing)Optional

Examples

Here are examples of a CREATE MODEL statement:

Simple Model with Explicit Prompt

CREATE PROMPT story_prompt (
system "You are a helpful assistant. Write a short story about {{topic}}"
);

CREATE MODEL IF NOT EXISTS story_writer (
topic COMMENT 'This is the first argument'
) ENGINE = OpenAI(api_key = 'sk-proj-xxxxxxxxxxxxxxxxxxx', model_name = 'gpt-3.5-turbo', temperature = 0.3)
PROMPT story_prompt
SETTINGS retries = 2;

SELECT * FROM langdb.models;

Here we defined prompt before the model. However, in the coming examples we will be using embedded prompts.

| id                  | name         | model_name    | api_key             | provider | prompt_name  | model_params        | execution_options   | input_args          | tools | created_at |
|---------------------|--------------|---------------|---------------------|----------|--------------|---------------------|---------------------|---------------------|-------|------------|
| 83e74fd0-c8c8-40e5- | story_writer | gpt-3.5-turbo | xxxxxxxxxxxxxxxxxxx | openai | story_prompt | {"temperature":0.3, | {"retries":2,"outpu | [{"name":"topic","d | [] | 2024-06-04 |
| 8b55-6827872de55d | | | | | | "top_k":null,"max_t | t_type":null} | escription":"This | | 09:31:34 |
| | | | | | | okens":null,"verbos | | is the first | | |
| | | | | | | e":null} | | argument"}] | | |

Execute this model with the following command

SELECT story_writer('a dragon and a knight');
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬───────┐
│ story_writer(topic: "a dragon and a knight") ┆ │
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╪═══════╡
│ In the land of Aragonia, a majestic dragon named Drake lived in a cave hidden deep in the enchanted forest. Drake was feared by the villagers for his fiery breath and powerful ┆ agent │
│ claws, but he was actually a gentle soul who enjoyed spending his days peacefully watching the world go by. ┆ │
│ ┆ │
│ One day, a brave young knight named Sir Andrew arrived in Aragonia, seeking to prove his worth by slaying the fearsome dragon. Armed with his sword and shield, Sir Andrew set ┆ │
│ off into the forest, determined to face Drake and earn the admiration of the kingdom. ┆ │
│ ┆ │
│ As Sir Andrew approached the dragon's cave, Drake could sense his presence and emerged to meet the knight. Instead of attacking, Drake greeted Sir Andrew with a friendly roar, ┆ │
│ surprising the knight with his gentle demeanor. ┆ │
│ ┆ │
│ Curious, Sir Andrew sheathed his sword and approached Drake cautiously. The dragon lowered his head and allowed Sir Andrew to pet him, showing the knight that he meant no harm. ┆ │
│ ┆ │
│ As they got to know each other, Sir Andrew learned that Drake was actually a guardian of the forest, protecting it from evil forces that sought to harm the land. Drake shared ┆ │
│ his wisdom with the knight, teaching him the importance of living in harmony with nature. ┆ │
│ ┆ │
│ Grateful for the dragon's guidance, Sir Andrew realized that slaying Drake would be a grave mistake. Instead, he pledged to become Drake's ally and help him in his quest to ┆ │
│ protect the forest. ┆ │
│ ┆ │
│ From that day on, Sir Andrew and Drake worked together to defend Aragonia, with the knight using his skills in combat and the dragon using his strength and magic powers. ┆ │
│ Together, they became a formidable team, earning the respect and admiration of the villagers. ┆ │
│ ┆ │
│ And so, the dragon and the knight proved that with understanding and friendship, even the fiercest of foes can become the best of friends. ┆ │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴───────┘

Model with a tool

CREATE VIEW IF NOT EXISTS semantics_tool() AS
SELECT content
FROM pretty_print(
(
SELECT *
FROM semantics(['langdb', 'default'])
)
);

CREATE MODEL IF NOT EXISTS generate_sql_tools (
question COMMENT 'This is the question'
) ENGINE = OpenAI(api_key = 'sk-proj-xxxxxxxxxxxxxxxxxxx', model_name = 'gpt-3.5-turbo', temperature = 0.3)
PROMPT (system "Use the semantics tool provided to get the table scehma. Your task is to generate a valid Clickhouse SQL query.
Follow these steps carefully:
1. Understand the Schema: Examine the schema details for each table. Identify the columns available in each table and their relationships, particularly focusing on common columns that can be used for JOIN operations.
2. Analyze the Query: Review the users question and determine the data requirements. Identify the tables that need to be queried and the columns necessary to answer the question effectively.
3. Construct the Query: Use Clickhouse SQL syntax to construct the query. Include the necessary columns in the SELECT clause, apply JOIN operations to combine tables, and use WHERE conditions for filtering data.
- Using Clickhouse SQL syntax to write query
- Always add prefix database before table. For example, with database: db1 and table: invoice. Query should be FROM db1.invoice.
- Use JOIN to combine data from different tables.
- Get only column you need by using SELECT clause. Only include columns that relevant to the question and valuable to natural response for user to understand. Aim to minimize the number of selected columns for optimization purposes.
- Ensure that the selected columns exist in the table from which they are being selected. DO NOT include any columns in the SELECT statement that are not explicitly listed under the tables schema.
- Aggregation can be done by using GROUP BY and aggregate functions.
- Filtering can be done by using WHERE clause.
- Sorting can be done by using ORDER BY clause. When sorting by a column that is not directly present in the queried table, make sure to join the relevant tables to access that column.
4. Generate the Query: Based on the analysis, generate a valid query that accurately responds to the users question. Ensure that the query is well-structured, efficient, and provides the necessary data to answer the query effectively.
5. Only respond with the SQL query that can be used without any additional information.

User Input: {{question}}")
TOOLS (semantics_tool);

select generate_sql_tools('how many models do i have')
generate_sql_tools(question: "how many models are present")
SELECT count(*) AS model_count FROM langdb.models;agent

Embedding Model

CREATE EMBEDDING MODEL IF NOT EXISTS custom_embed(
input COMMENT 'This is the input of the content for which the embeddings need to be created'
) ENGINE = OpenAI(api_key = 'sk-proj-xxxxxxxxxxxxxxxxxxx', model_name = 'gpt-3.5-turbo', temperature = 0.3, embedding_model='text-embedding-ada-002', encoding_format='float', dimensions=100);

select custom_embed("This is a sample text");

This is how you can create an embedding type model, which takes one input i.e. the content for which the embeddings need to be created. Unlike a completions model, there is no meaning of using a prompt, tools or settings. Moreover, the syntax is capable of inputting all the relevant embedding parameters used by that particular LLM provider, and hence it provides a custom embedding model for the user.