Working With LLMs
Introduction
Large Language Models (LLMs) are powerful deep learning models that have revolutionized the field of natural language processing (NLP). These models are pre-trained on vast amounts of text data, allowing them to understand and generate human-like language.
LLMs are trained using self-supervised learning techniques, where they learn patterns and relationships within the text data without explicit labeling. This allows them to capture the intricacies of language and generate coherent and contextually relevant text.
LLMs in LangDB
LangDB integrates seamlessly with Large Language Models (LLMs), enabling developers to interact with structured and unstructured data efficiently. It provides a unified interface for working with different LLMs, making it easy to compare their performance and select the most suitable model for a given task.
Comparing LLM Models
Let's take the example of extracting information from a resume and decide whether to use Claude 3.5 Sonnet or GPT-4o.
GPT-4o
CREATE MODEL IF NOT EXISTS extract_resume_open_ai (input)
USING openai(model_name='gpt-4o')
PROMPT extract_resume_pt
Claude 3.5 Sonnet
CREATE MODEL IF NOT EXISTS extract_resume_anthropic (input)
USING anthropic(model_name='claude-3-5-sonnet-20240620')
PROMPT extract_resume_pt
Resume PDF
LangDB provides functions to extract and load unstructured data into your database. Here, we will use extract_text.
SELECT content AS text FROM extract_text((
SELECT content AS file FROM load('s3://langdb-sample-data/ACCOUNTANT/12202337.pdf')
),
path => file, type => 'pdf')
INVESTMENT ACCOUNTANT
Career Focus
Accomplished and results oriented Investment professional with strong leadership and interpersonal skills who adds energy and value to an
organization's quest for excellence.
Summary of Skills
Internet and Microsoft Office - MS Word, MS Power Point, MS Excel, Pivot Tables, Spreadsheets,
Macros.
* Business Objects, Lombardi, Eagle Accounting System, PEGA, DRAS, Workbench.
Account reconciliations
Detail-oriented
Variance analysis
Detail-oriented
...
Querying the models
With LangDB, we can query models easily using SQL.
GPT-4o
SELECT * FROM extract_resume_open_ai((
SELECT content AS text FROM extract_text((
SELECT content AS file FROM load('s3://langdb-sample-data/ACCOUNTANT/12202337.pdf')
),
path => file, type => 'pdf')
), input => text)
{
"Personal_Information": {
"Full_Name": "Generated Name",
"Contact_Information": {
"Email": "generated.email@example.com",
"Phone_Number": "123-456-7890",
"Location": "Generated Location"
}
},
"Professional_Summary": "Accomplished and results-oriented Investment professional with strong leadership and interpersonal skills...",
"Work_Experience": [
{
"Company": "CompanyName",
"Job_Title": "Investment Accountant",
"Start_Date": "10/2012",
"End_Date": "11/2015",
"Key_Responsibilities": [
"Reconciled mutual fund accounts with the custody",
"Identified and resolved differences in Custody and Accounting Cash, Currency and Positions...",
// Additional responsibilities...
],
"Achievements": [
"Received a special achievement award at BNY Mellon Bank in Asset Servicing (Dec 2013)",
"Received a special achievement award at BNY Mellon Bank in Asset Servicing (Dec 2014)"
]
}
],
// Additional work experiences...
}
Claude 3.5 Sonnet
SELECT * FROM extract_resume_anthropic((
SELECT content AS text FROM extract_text((
SELECT content AS file FROM load('s3://langdb-sample-data/ACCOUNTANT/12202337.pdf')
),
path => file, type => 'pdf')
), input => text)
{
"Personal_Information": {
"Full_Name": "Not provided",
"Contact_Information": {
"Email": "example@email.com",
"Phone_Number": "+1 555-123-4567"
},
"Location": "City, State, USA"
},
"Professional_Summary": "Accomplished and results-oriented Investment professional with strong leadership and interpersonal skills. Experienced in investment accounting, AML compliance, and medical technology...",
"Work_Experience": [
{
"Company": "CompanyName",
"Job_Title": "Investment Accountant",
"Start_Date": "10/2012",
"End_Date": "11/2015",
"Key_Responsibilities": [
// List of responsibilities would go here
]
}
]
// Additional work experiences and sections...
}
Traces
Every query execution produces traces. These traces can be used to compare the time taken, tools called, and token usage. Here is the comparison between the OpenAI and Anthropic calls we made above.
Traces for OpenAI GPT-4o | Traces for Anthropic Claude 3.5 Sonnet
Supported Providers
- OpenAI
- Anthropic
- Gemini
- Cohere
- Meta
- Mistral
Using Providers
You can learn more about Providers and creating them in Concepts and SQL Statements.
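As a quick illustration, the same CREATE MODEL pattern shown earlier can be pointed at any of the supported providers. The sketch below is an assumption, not taken from this page: the provider function name (gemini) and the model identifier ('gemini-1.5-pro') are hypothetical placeholders, so check the Providers documentation for the exact names available in your LangDB setup.

```sql
-- Hedged sketch: reusing the CREATE MODEL syntax from above with a
-- different provider. Both the gemini() provider function and the
-- 'gemini-1.5-pro' model name are assumed, not confirmed by this page.
CREATE MODEL IF NOT EXISTS extract_resume_gemini (input)
USING gemini(model_name='gemini-1.5-pro')
PROMPT extract_resume_pt
```

Because every model exposes the same interface, the comparison queries above would only need the model name changed to benchmark a third provider.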