Analyzing the executions

With logs of our saved runs in bao_run.txt and pg_run.txt, we can next analyze them. If you are using the VM, you can move these files to /vagrant/ to access them from outside the VM.

To analyze the logs, run the analyze_bao.ipynb Python notebook. Make sure bao_run.txt and pg_run.txt are in the same folder. You'll need Jupyter to run the notebook if you don't already have it.

The notebook will first generate a queries complete vs. time graph. Depending on your hardware, the results will vary. Below is what I saw on my hardware:

Queries completed over time

The graph shows the number of queries that have been completed since the start of the experiment, including the training time of Bao. Each red circle indicates a time when query execution was paused and the Bao model was trained. Obviously, when deployed, one does not have to pause query execution to retrain the Bao model, and this retraining can either be offloaded to another machine or done at a fixed time during the day, concurrent with query processing. On my hardware, Bao executed the example workload about twice as fast as the default PostgreSQL optimizer.

Where do Bao's gains come from? Finishing the workload faster could come from making every query a little faster, or making a few queries much faster. To examine this, we look at the CDF of query times, which is the next pair of graphs generated by the notebook.

CDF of query completion time

The left figure shows the CDF with a linear y-axis, whereas the right figure shows the same CDF with a log scale. We can see that the majority of Bao's gains come from decreasing latency "at the tail" of the distribution: the 90%, 95%, etc. Bao incurs a slightly longer longest-running query (best visible on the left plot), and increases the query time of the fastest queries slightly (best visible on the right plot).

Bao increases the latency of the fastest query by a small margin because of increased optimization time. This is because query planning with Bao requires executing the PostgreSQL planner several times, and running inference through a neural network. If you care about a particular query taking 0.01 seconds instead of 0.05 seconds, you can always disable Bao on a per-query basis.

What about query regressions? The next table shows the latency of each query when using the PostgreSQL optimizer, and the worst / best time achieved by Bao.

	PG	Bao worst	Bao best
Q
q1	275.415884	12.206382	6.005776
q2	71.049927	198.310226	9.242487
q3	10.982070	290.048801	10.805816
q4	26.890862	26.966064	1.527303
q5	9.692364	9.354480	1.350012
q6	21.741243	19.851484	7.341236
q7	51.935738	51.321676	7.288905
q8	28.725613	15.981973	5.995388
q9	15.645138	16.394102	7.327004
q10	11.720967	9.688347	7.373339
q11	15.163100	7.686548	5.853226
q12	12.934380	9.379889	4.565600
q13	18.687008	11.803825	3.417922
q14	11.100027	14.864732	7.060695
q15	9.641760	8.258874	4.153027
q16	5.312640	7.992982	1.221813
q17	6.404161	17.702658	5.868285
q18	11.912653	20.336241	6.772051
q19	9.943220	33.939818	10.330661
q20	0.143906	0.679753	0.344254
q21	1.022706	1.292618	0.921263
q22	16.113360	51.231996	8.196555
q23	12.050350	13.501636	7.194857
q24	0.025990	0.153196	0.100763
q25	3.906976	5.511800	2.178743
q26	10.439918	16.880665	7.772304
q27	0.759958	1.491062	0.461650
q28	1.784515	2.679448	1.671500
q29	0.279165	0.263327	0.113964
q30	6.967469	7.600260	5.197018
q31	1.877799	3.540210	1.459715
q32	0.981562	7.478595	0.731711
q33	2.215288	4.805330	1.749660
q34	5.175736	8.535833	2.692968
q35	6.402113	14.327323	5.302665
q36	11.992452	15.665057	8.864666
q37	12.208148	22.470210	9.716367
q38	13.334725	34.440121	10.314824
q39	8.051381	16.096334	8.621814
q40	14.709921	19.853354	11.598819

We can see immediately that Bao has huge gains on Q1, which takes almost 5 minutes for PostgreSQL to process. However, on Q2 and Q3, while Bao normally found a pretty good plan (the Bao best column), sometimes Bao picked a regressing plan that took significantly longer to execute than the PostgreSQL plan.

For some applications, the raw workload speedup achieved by Bao may be more important than any of these query regressions. For other applications, these query regressions may be a huge problem. Next, we discuss how to use Bao's exploration mode to avoid these regressions.

Bao for PostgreSQL

Analyzing the executions