Analyze the workload with exploration mode

We'll use the same notebook as before to analyze our new bao_with_regblock.txt results. In the second cell, change the line SHOW_RG = False to SHOW_RG = True so that the notebook plots both our previous run (without exploration mode) and our new run.

First, we'll look at queries completed vs. time.

[Figure: Queries vs. time]

The new green line shows the performance of Bao with our three test queries entered into exploration mode. In terms of overall workload performance, exploration mode doesn't help all that much: the workload finishes only a little bit faster.

Next, we'll look at the query latency CDFs.

[Figure: Query latency CDF]

The green line shows that tail latency has been significantly reduced. Almost all of that reduction comes from avoiding a few regressing query plans. We can verify this with the same table we looked at before:

Q            PG   Bao worst    Bao best  Bao + E worst  Bao + E best
q1   275.415884   12.206382    6.005776      12.455495      6.148398
q2    71.049927  198.310226    9.242487      72.339166     10.068161
q3    10.982070  290.048801   10.805816      14.475681      5.646478
q4    26.890862   26.966064    1.527303      26.190447      1.468367
q5     9.692364    9.354480    1.350012       6.007022      1.319892
q6    21.741243   19.851484    7.341236      24.368830      8.481136
q7    51.935738   51.321676    7.288905      53.320758      8.071399
q8    28.725613   15.981973    5.995388      25.003634      5.759860
q9    15.645138   16.394102    7.327004      15.451688      7.699232
q10   11.720967    9.688347    7.373339      23.513964      7.451328
q11   15.163100    7.686548    5.853226      14.888963     10.835752
q12   12.934380    9.379889    4.565600      16.264112      4.621375
q13   18.687008   11.803825    3.417922       6.843567      3.754824
q14   11.100027   14.864732    7.060695      15.598768      7.015126
q15    9.641760    8.258874    4.153027      14.213644      3.983182
q16    5.312640    7.992982    1.221813       6.572023      1.271791
q17    6.404161   17.702658    5.868285      18.423400      5.759903
q18   11.912653   20.336241    6.772051      21.853793      6.889149
q19    9.943220   33.939818   10.330661      21.690789     10.453242
q20    0.143906    0.679753    0.344254       0.460239      0.357460
q21    1.022706    1.292618    0.921263       1.653735      0.884557
q22   16.113360   51.231996    8.196555      51.013479      7.209244
q23   12.050350   13.501636    7.194857      12.140767      7.034203
q24    0.025990    0.153196    0.100763       0.181307      0.107129
q25    3.906976    5.511800    2.178743       3.643400      2.291787
q26   10.439918   16.880665    7.772304      12.206994      6.855079
q27    0.759958    1.491062    0.461650       1.744088      0.467688
q28    1.784515    2.679448    1.671500       2.584871      1.753601
q29    0.279165    0.263327    0.113964       0.330460      0.128979
q30    6.967469    7.600260    5.197018       9.345165      5.021910
q31    1.877799    3.540210    1.459715       3.384164      1.439300
q32    0.981562    7.478595    0.731711       1.171264      0.626129
q33    2.215288    4.805330    1.749660       4.287353      1.832341
q34    5.175736    8.535833    2.692968       8.329981      2.556046
q35    6.402113   14.327323    5.302665      10.737619      5.267236
q36   11.992452   15.665057    8.864666      30.359994     10.390017
q37   12.208148   22.470210    9.716367      20.191838     10.248038
q38   13.334725   34.440121   10.314824      17.423221     10.154855
q39    8.051381   16.096334    8.621814      19.279665      8.618636
q40   14.709921   19.853354   11.598819      73.711118     10.908722

The first column shows the latency of the query plans produced by the PostgreSQL optimizer. The next two columns show the worst and best latency of the query plans produced by the Bao optimizer. The final two columns show our new results: the worst and best latency of the query plans produced by the Bao optimizer with exploration mode.
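As a rough illustration of how worst and best columns like these can be derived, here is a minimal sketch. It assumes the results have already been parsed into (query, latency) pairs and that "worst" and "best" are the maximum and minimum latency observed for each query. The `runs` list and the `worst_and_best` helper are hypothetical names for this example and are not taken from the notebook.

```python
from collections import defaultdict

def worst_and_best(runs):
    """Return {query: (worst, best)} from an iterable of (query, latency) pairs."""
    by_query = defaultdict(list)
    for query, latency in runs:
        by_query[query].append(latency)
    # Worst case is the slowest observed execution, best case the fastest.
    return {q: (max(v), min(v)) for q, v in by_query.items()}

# Illustrative input only: just the two extreme Bao latencies for q2 and q3
# from the table; a real run would contain many more executions per query.
runs = [
    ("q2", 198.310226), ("q2", 9.242487),
    ("q3", 10.805816), ("q3", 290.048801),
]
summary = worst_and_best(runs)
for query, (worst, best) in sorted(summary.items()):
    print(f"{query}: worst={worst:.6f} best={best:.6f}")
```

The notebook itself may compute these columns differently (for example, with a pandas group-by), so treat this only as a description of the idea.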

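The large regressions on queries 2 and 3 are eliminated: the worst-case latency of q2 drops from about 198 to about 72, and that of q3 from about 290 to about 14.5, which puts both much closer to the PostgreSQL plans.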
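One way to quantify this is to express each worst-case latency as a multiple of the PostgreSQL latency for the same query. The sketch below does that for q2 and q3 using the values from the table above; the `slowdown` helper and the `worst_case` dictionary are names made up for this example.

```python
# (PG, Bao worst, Bao + E worst) latencies copied from the table above.
worst_case = {
    "q2": (71.049927, 198.310226, 72.339166),
    "q3": (10.982070, 290.048801, 14.475681),
}

def slowdown(latency, baseline):
    """Ratio of a latency to the PostgreSQL baseline (1.0 means no change)."""
    return latency / baseline

for query, (pg, bao, bao_e) in worst_case.items():
    print(
        f"{query}: Bao worst is {slowdown(bao, pg):.1f}x PG, "
        f"Bao + E worst is {slowdown(bao_e, pg):.1f}x PG"
    )
```

A ratio near 1.0 means the worst case is roughly on par with the PostgreSQL plan. The same check can be applied to any other row of the table to look for queries whose worst case grew under exploration mode.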