📄️ available_models()
The available_models() function lists all the providers and their models available in LangDB.
📄️ list_files(directory_url)
The list_files() SQL function lists the files in a specified directory URL and returns their names and sizes. Built using objectstore which supports passing any object store such
📄️ crawl(file_url, max_depth)
The crawl() function is a powerful SQL extension that allows you to perform web scraping operations directly within your database queries. This function crawls web pages starting from a given URL and returns the crawled data.
📄️ load(file_url)
The load(file_url) function converts a document or URL into an array of bytes.
📄️ extract_text(path)
The extract_text() function extracts text from various file types, with specific options available for PDF files.
📄️ extract_layout(path)
The extract_layout() function enables structured data extraction with layout information from a document.
📄️ chunk()
The chunk() function splits a given text into chunks of a specified size. It provides options for customizing the chunking process and returns the resulting chunks as rows in a table-like format.
📄️ pretty_print(query)
The pretty_print() function converts a SQL query into a human-readable format.
📄️ show_parsed_schemas(query)
The show_parsed_schemas() function provides a structured representation of a query's schema. It is particularly useful in structured data extraction: after extracting layout from a document, it lets you inspect the schema output of a sub-query or a complex SQL statement without executing it against the actual database.
📄️ transpose_parsed_tables(table_index, query)
The transpose_parsed_tables() function takes the information parsed by the extract_layout() function and extracts its tables as ClickHouse tables.
📄️ print_parsed_markdown(query)
The print_parsed_markdown() function prints parsed documents as Markdown.
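
To make the fixed-size chunking described under chunk() more concrete, here is a minimal Python sketch of the general idea. It is not LangDB's implementation: the name `chunk_text`, the `chunk_size` parameter, and the `(row, chunk)` output shape are assumptions chosen for illustration, and the real function's options may differ.

```python
def chunk_text(text: str, chunk_size: int) -> list[tuple[int, str]]:
    """Split text into consecutive pieces of at most chunk_size characters.

    Hypothetical helper: returns (row_index, chunk) pairs to mimic
    chunks being returned as rows of a table.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (row, text[start:start + chunk_size])
        for row, start in enumerate(range(0, len(text), chunk_size))
    ]


# Each chunk becomes one "row"; the last chunk may be shorter than chunk_size.
for row, piece in chunk_text("LangDB splits long text into rows.", 10):
    print(row, repr(piece))
```

Splitting on character count is the simplest possible strategy; splitting on sentences or tokens would need a different boundary rule.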