Oracle’s Parallel Query Option (PQO) is a fantastic tool. But much like any good tool, it can very easily be used in the wrong situation or simply abused. In those cases, PQO can actually make database performance much worse. Let’s examine some common misuses or misperceptions regarding optimal and efficient PQO usage.
- PQO makes sense universally on any SMP or multi-processor database server.
Not really. Why use PQO on a small dual-core server just because it’s available? And why use PQO on a big SMP server with lots of CPUs but a less robust IO subsystem – what advantage does that offer? And my favorite abuse: we have a 64-CPU server, so we’ll just define all important tables and indexes as parallel 64 (or some lesser number still significantly greater than 1). How many of those servers will be running many concurrent queries anyway? So why subdivide the work and force processes to share CPUs? We don’t drive the posted speed limit simply because we can when the weather makes that a questionable choice. We should use the same discretion when planning PQO usage.
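To make the abuse concrete, here is what the "parallel 64 everywhere" pattern looks like in DDL, along with the more cautious alternative. The table name is hypothetical; the syntax is standard Oracle:

```sql
-- The abuse described above: a blanket table-level degree of 64,
-- applied just because the server has 64 CPUs.
ALTER TABLE big_fact_table PARALLEL 64;

-- A more cautious default: keep the table serial and let individual
-- statements request parallelism explicitly (via hints) where it pays off.
ALTER TABLE big_fact_table NOPARALLEL;
```

A table-level PARALLEL setting silently parallelizes every full scan of that table for every session, which is exactly how CPUs end up oversubscribed under concurrent load.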
- PQO “by definition” will make any large operation faster when “parallelized”.
Right – and I have some wonderful “magic beans” to sell to anyone who blindly believes that theory. Oracle has to spawn additional parallel server processes (or reuse those still lying around from recently completed PQO operations). So the cost of spawning those processes, and their memory footprints, must be accounted for in the grand scheme of things. Plus, Oracle needs a “coordinator” process to coalesce all the results – and that too has a small cost. So factor all of this into your assumptions before using PQO. For those of us who are less scientific, think of PQO as a grocery store checkout line. If you only have a couple of items in your cart, will forcibly spreading them across multiple checkout lanes get you out the door any sooner? And even if you do have an overly full cart, will spreading your items across checkout lanes that are already busy be any better? There are obviously situations where parallelizing pays off and situations where it does not. We simply need to apply the same clear thinking when using PQO.
- When I say parallel N, Oracle spawns exactly and only N server processes.
OK – so now I also have clear rights to sell the “Brooklyn Bridge” to anyone who thinks this. Oracle parallel queries are much more complicated than that. For example, an ORDER BY or GROUP BY will double the number of non-coordinator processes required, because one set of server processes scans while a second set sorts or aggregates. So we are at two times N already. Then there are “init.ora” parameters that limit or govern PQO usage. So some queries might get the full requested degree, some might execute serially, and others may get fewer than N processes because of those configuration parameters. The only way to know for sure is to monitor it live.
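The governing parameters and the live-monitoring views I have in mind are sketched below; the parameter and view names are the standard ones from mainstream Oracle releases, but verify them against your version:

```sql
-- Instance parameters that govern how many parallel server processes
-- a query actually receives:
--   parallel_max_servers          -- hard ceiling on server processes instance-wide
--   parallel_min_servers          -- processes pre-spawned at instance startup
--   parallel_adaptive_multi_user  -- downgrades the degree under concurrent load
SELECT name, value
  FROM v$parameter
 WHERE name LIKE 'parallel%';

-- Watch actual usage live: how many parallel servers are busy, idle,
-- or have been started since instance startup.
SELECT statistic, value
  FROM v$pq_sysstat
 WHERE statistic IN ('Servers Busy', 'Servers Idle', 'Servers Started');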
- I have a super-fast IO subsystem and multiple top-end CPUs, so I can increase PQO usage a “wee bit” beyond my actual CPU count.
OK – I’m 100% guilty of doing this on occasion. In fact, I might even do it more often than not. But I am cognizant of the issues above and have planned accordingly – and I only do it in reasonable PQO scenarios where IO is clearly not my chief bottleneck. In those cases I will extend my PQO settings beyond the actual CPU count. I’ve even gone two to four times higher in the right circumstances, and it has made a clear and measurable difference. But be careful, because it often does not make sense. For an easy test, simply set PQO to degree four on your dual-core desktop or laptop and run several concurrent queries. They will not only run slower, but painfully so.
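A statement-level hint is the safest way to run that kind of experiment, since it raises the degree for one query without touching the table definition. Table and alias names here are hypothetical:

```sql
-- Push the degree past the table's default for one known IO-light query,
-- then time it against the serial run before trusting the higher degree.
SELECT /*+ PARALLEL(t, 8) */ COUNT(*)
  FROM big_fact_table t;

-- The serial baseline for comparison.
SELECT /*+ NO_PARALLEL(t) */ COUNT(*)
  FROM big_fact_table t;
```

If the hinted run is not measurably faster than the baseline under your real concurrency, the higher degree is not earning its keep.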
- I have a big multi-processor server, so full-table scans are always the way to go.
As silly as this may sound at first, I’ve seen it in many data warehousing scenarios. I think it happens because somewhere someone wrote that “parallel full-table scans are efficient.” But that author meant in clearly reasonable scenarios – not universally. Yet this remains one of the top data warehousing tuning mistakes I see. Think about a typical DW query – it may scan millions of rows in tables with many billions of rows. Why force a full-table scan when there is at least a ten-to-one, if not a hundred-to-one, reduction factor? That’s exactly what indexes are for – and index scans can be done in parallel too. I’ve seen improvements as large as 1000% from undoing this simple PQO misuse.
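For the record, directing the optimizer toward a parallel index scan looks something like this. The schema, index, and degree are purely illustrative, and note that parallel index range scans generally require a partitioned index, while index fast full scans parallelize more freely:

```sql
-- Favor a selective index, scanned in parallel, over a parallel
-- full-table scan of a billion-row table (names are hypothetical).
SELECT /*+ INDEX(s sale_date_idx) PARALLEL_INDEX(s, sale_date_idx, 8) */
       SUM(s.amount)
  FROM sales s
 WHERE s.sale_date >= DATE '2008-01-01';
```

When the predicate eliminates 90–99% of the rows, the index path reads a fraction of the blocks the parallel full scan would, which is where the large improvements mentioned above come from.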