CAESURA: Language Models as Multi-Modal Query Planners

Summary: Introduces language-model-driven query planning, in which an LM translates natural-language queries into executable multi-modal query plans. Unlike the plans of traditional SQL planners, these plans can contain operators over arbitrary modalities (images, text, video). Presents CAESURA, a GPT-4-based prototype that demonstrates feasibility on two datasets, and proposes techniques to make LM planning more robust for end-to-end multi-modal query execution. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 505
Venue: CIDR
Year: 2024
Pagerank: 0.00014726927
Overall Rank: 997 | 93.08%
DOI: -

Authors

1. Matthias Urban
2. Carsten Binnig

Incoming Citations (Sorted by Pagerank)

Showing 25 of 25 citing papers.

Overall Rank	Citing Paper	Year	Venue	Pagerank
1,839	DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing	2025	VLDB	0.00010351287
1,866	ReAcTable: Enhancing ReAct for Table Question Answering	2024	VLDB	0.00010265592
2,013	Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing	2025	CIDR	9.7986166e-05
3,639	The Design of an LLM-powered Unstructured Analytics System	2025	CIDR	6.8886648e-05
5,149	Abacus: A Cost-Based Optimizer for Semantic Operator Systems	2026	VLDB	5.655398e-05
5,429	Logical and Physical Optimizations for SQL Query Execution over Large Language Models	2025	SIGMOD	5.511638e-05
5,669	Databases Unbound: Querying All of the World's Bytes with AI	2024	VLDB	5.3805024e-05
7,369	ELEET: Efficient Learned Query Execution over Text and Tables	2024	VLDB	4.7452331e-05
7,703	AOP: Automated and Interactive LLM Pipeline Orchestration for Answering Complex Queries	2025	CIDR	4.668568e-05
8,351	PalimpChat: Declarative and Interactive AI analytics	2025	SIGMOD	4.5340791e-05
8,479	Can Large Language Models Be Query Optimizer for Relational Databases?	2026	SIGMOD	4.4967983e-05
8,732	Unveiling Challenges for LLMs in Enterprise Data Engineering	2026	VLDB	4.4520434e-05
9,153	Unify: A System For Unstructured Data Analytics	2025	VLDB	4.380727e-05
9,728	Semantic Integrity Constraints: Declarative Guardrails for AI-Augmented Data Processing Systems	2025	VLDB	4.2901665e-05
9,971	KathDB: Explainable Multimodal Database Management System with Human-AI Collaboration	2026	CIDR	4.1905499e-05
9,989	Deep Research is the New Analytics System: Towards Building the Runtime for AI-Driven Analytics	2026	CIDR	4.1905499e-05
9,993	BridgeScope: A Universal Toolkit for Bridging Large Language Models and Databases	2026	CIDR	4.1905499e-05
10,064	Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees	2026	SIGMOD	4.1905499e-05
10,112	SEFRQO: A Self-Evolving Fine-Tuned RAG-Based Query Optimizer	2026	SIGMOD	4.1905499e-05
10,144	Beyond Relational: Semantic-Aware Multi-Modal Analytics with LLM-Native Query Optimization	2026	SIGMOD	4.1905499e-05
10,212	SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads	2026	SIGMOD	4.1905499e-05
10,215	Task Cascades for Efficient Unstructured Data Processing	2026	SIGMOD	4.1905499e-05
10,277	SemBench: A Benchmark for Semantic Query Processing Engines	2026	VLDB	4.1905499e-05
10,285	Relational Deep Dive: Error-Aware Queries Over Unstructured Data	2026	VLDB	4.1905499e-05
10,758	QUEST: Query Optimization in Unstructured Document Analysis	2025	VLDB	4.1905499e-05

Outgoing Citations (Sorted by Pagerank)

Showing 4 of 4 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Overall Rank	Cited Paper	Year	Venue	Pagerank
1	Access Path Selection in a Relational Database Management System	1979	SIGMOD	0.0040465394
1,505	Symphony: Towards Natural Language Query Answering over Multi-modal Data Lakes	2023	CIDR	0.00011601232
2,025	From Natural Language Processing to Neural Databases	2021	VLDB	9.7477788e-05
3,819	Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction	2022	VLDB	6.7267885e-05

Semantically Similar Papers

Overall Rank	Paper	Year	Venue	Pagerank
10,806	Smart SPARQL Advisor: Guiding Users in Query Formulation with Performance Prediction	2025	VLDB	4.1905499e-05
10,803	A Demonstration of QueryArtisan: Real-Time Data Lake Analysis via Dynamically Generated Data Manipulation Code	2025	VLDB	4.1905499e-05
1,505	Symphony: Towards Natural Language Query Answering over Multi-modal Data Lakes	2023	CIDR	0.00011601232
10,901	Welding Natural Language Queries to Analytics IRs with LLMs	2024	CIDR	4.1905499e-05
3,862	OpenSearch-SQL: Enhancing Text-to-SQL with Dynamic Few-shot and Consistency Alignment	2025	SIGMOD	6.68436e-05
8,479	Can Large Language Models Be Query Optimizer for Relational Databases?	2026	SIGMOD	4.4967983e-05
4,529	Hybrid Querying Over Relational Databases and Large Language Models	2025	CIDR	6.1057096e-05
4,735	AutoTQA: Towards Autonomous Tabular Question Answering through Multi-Agent Large Language Models	2024	VLDB	5.9538651e-05
9,454	An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models	2024	VLDB	4.3358002e-05
5,429	Logical and Physical Optimizations for SQL Query Execution over Large Language Models	2025	SIGMOD	5.511638e-05