Archives of the TeradataForum
Message Posted: Thu, 20 Sep 2001 @ 23:08:48 GMT
There are at least a couple of possibilities. Your work load appears to be skewed. The following is not an all-inclusive list, but here are things to look for.
1. Your data is not evenly distributed, so the amps holding the data work while the amps holding no data have no work, or some amps complete their work before the rest. Does the system appear to have an even work load at first, and then, as the work continues, some amps do nothing while others are still working (hot amps)? Check the work load distribution across the entire period of the work load, noting whether all amps are active in the beginning. You can check for even data distribution by determining the distribution of data for the associated tables:
select Vproc, CurrentPerm from DBC.TableSize where DatabaseName = 'yourDB' and TableName in (list the tables involved in the work load) order by 1,2;
The space usage should be similar across all amps. If not, you may need to rethink your primary index choices.
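As a rough yardstick, something like the following collapses the per-amp rows into one skew summary per table. This is only a sketch: 'yourDB' and the table list are placeholders, and it assumes a release with nullifzero available.

```sql
-- Per-table skew summary: SkewRatio near 1.0 means even distribution;
-- much larger than 1.0 means a few amps hold most of the data.
select  TableName,
        max(CurrentPerm) as MaxPerm,
        avg(CurrentPerm) as AvgPerm,
        max(CurrentPerm) / nullifzero(avg(CurrentPerm)) as SkewRatio
from    DBC.TableSize
where   DatabaseName = 'yourDB'
  and   TableName in ('table1','table2')   -- the tables in the work load
group by 1
order by SkewRatio desc;
```

The tables that sort to the top are the first candidates for a different primary index.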
2. Even if your data is evenly distributed in the tables, the tables being joined in the work load may not share primary indexes. This causes data to be redistributed across the system. Depending upon your join criteria and WHERE clause constraints, you may redistribute only a portion of the table(s), which pulls an uneven distribution of the values in one or more tables, and the spool then resides unevenly across the amps. Watch the distribution of spool across the run time of the job. The following can be run during the work load:
select Max(CurrentSpool), Min(CurrentSpool), Avg(CurrentSpool) from DBC.DiskSpace;
If you have cleared peak disk prior to the work load, you can use
select max(PeakSpool), Min(PeakSpool), Avg(PeakSpool) from DBC.DiskSpace;
This can be run after the job, as it checks the peak spool usage.
The Max and Min should be close to the Avg in either case. If not, you may be able to affect it by choosing a different primary index on the tables. If possible, you want joined tables to have the same PI so that joins are amp-local. This, obviously, is not always possible, and it is more likely that the pattern you are seeing is going to be the pattern for this query due to what data is being used.
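If you want to see which amps are actually holding the spool, rather than just the max/min/avg, a per-vproc roll-up along these lines should work. DBC.DiskSpace carries one row per database per vproc, so the sum collapses it to one row per amp:

```sql
-- Spool per amp (vproc); the heaviest amps sort to the top.
select  Vproc,
        sum(CurrentSpool) as SpoolOnAmp
from    DBC.DiskSpace
group by 1
order by SpoolOnAmp desc;
```

Run it a few times during the job; if the same handful of vprocs stay on top, those are your hot amps.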
3. Do you have statistics collected on indexes and columns used in joins and WHERE clause constraints?
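If not, collecting them is straightforward. The database, table, and column names below are placeholders; substitute the columns your joins and WHERE clauses actually use:

```sql
-- Collect statistics on a join column and on an index column (placeholder names):
collect statistics on yourDB.yourTable column (join_col);
collect statistics on yourDB.yourTable index (idx_col);

-- Verify what has been collected:
help statistics yourDB.yourTable;
```

Stale statistics can be nearly as bad as missing ones, so recollect after large loads.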
4. Are you familiar with the data being used to constrain the query? For example, are you selecting only rows in a table where a code column = 'a', and there are only 3 values in the column, 90% of which = 'a' and the other 10% = 'b' or 'c'? This table may be redistributed in spool unevenly depending upon what you use from the table.
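A quick demographics check on a constraining column will show this kind of lopsided value distribution (table and column names here are placeholders):

```sql
-- Value demographics for a candidate constraint column:
select  code_col,
        count(*) as RowsWithValue
from    yourDB.yourTable
group by 1
order by RowsWithValue desc;
```

If one value dominates and your query redistributes on (or selects mostly) that value, expect skewed spool no matter how evenly the base table is stored.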
5. You may also be dealing with only large tables, since you indicate that overall amp usage is low. Tables not cached will result in lots of disk activity relative to CPU as you scan the tables. Skewed data would explain the uneven distribution of disk and amp work. The work may also include little aggregation, thus not working the CPUs, but working the disks hard.
6. If you have multiple queries that use the same tables, you may want to verify that you have synch scan enabled, and attempt to schedule the work so that multiple queries use the same tables at the same time. They will then have a chance to be handled without rereading, thus reducing your disk work load.
NOTE: This is not the same as having the table held in cache. Large tables probably won't be cached, but they will be available for synchronized scanning. See the explain text.
Just a few thoughts. Tracking this stuff down is always a bit of a scavenger hunt, but very rewarding once it quits hurting (like the headache that goes away when you quit beating yourself in the head).