Archives of the TeradataForum

Message Posted: Tue, 27 Jan 2004 @ 18:34:20 GMT


     


Subj:   Re: arcmain throughput
 
From:   Dodgson, Mark

Christopher,

We're seeing similar throughput to you on MVS, approx. 15Gb/hour, writing (in the main) to VTS Magstar 3590 units, with some native 3590 drive access as well for larger files that aren't too efficient on VTS.
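To put that rate in perspective for the 800Gb table discussed in this thread, here is a rough back-of-envelope sketch. It assumes a single stream running at a steady 15Gb/hour, which is a simplification; the function name is just for illustration.

```python
# Rough archive-window estimate using the figures quoted in this thread:
# ~15 Gb/hour arcmain throughput and an 800 Gb table.
# Assumption: throughput is constant and there is a single archive stream.

def archive_hours(table_gb: float, throughput_gb_per_hour: float) -> float:
    """Hours needed to archive table_gb at a steady throughput."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return table_gb / throughput_gb_per_hour

hours = archive_hours(800, 15)
print(f"{hours:.1f} hours (~{hours / 24:.1f} days)")  # 53.3 hours (~2.2 days)
```

At that rate a full archive of the table would not fit in any normal batch window, which is why a journal-based approach is worth considering.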

As for maintaining your 800Gb table, archiving it via a standard all-AMP archive is not really a feasible option (as you've already identified). To get around this problem for DR purposes, you could take the following approach, as we have on our system (where TableA is the large table):-

i) introduce permanent after-image journals on TableA (source system), and implement housekeeping jobs to checkpoint/archive/delete the journal daily (or at a frequency consistent with the table refresh).

ii) perform one-off cluster archive from source server, of TableA.

iii) sequentially load (i.e. all-AMP Copy) these cluster arc files to your DR box. It doesn't matter that your DR cluster config is different to the source server. (n.b. It's essential that your DR TableA is built from a dictionary archive of source TableA). This may all take a little while to run (see iv), but this is a one-off load event. Code the arc copies with NO BUILD, and run the final build once all cluster loads are complete to construct the DR version of TableA.

iv) journal load (to your DR box) and rollforward of journals taken from (i) to bring DR data in-line with production. It will obviously take more than a day or two to get all of the 800Gb loaded and rolled-forward, so you may need to run two or three load/rolls in order to bring the DR table into synch with your source system.

v) implement production jobs for journal load and rollforward to DR server, of production journals (taken daily).


From this point on, both systems should be synchronised with ongoing synchronisation via the production jobs you've implemented in (v).
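The "two or three load/rolls" in step (iv) can be reasoned about with a small sketch. It assumes, purely for illustration, that one day of load/rollforward on the DR box applies a fixed number of days' worth of journal (`speedup`), and that you are caught up once the backlog is within what the daily job from step (v) can absorb (`target_days`). Both parameters are hypothetical and would need measuring on your own system.

```python
def catch_up_cycles(backlog_days: float, speedup: float, target_days: float = 1.0) -> int:
    """Count load/rollforward passes until the journal backlog is <= target_days.

    backlog_days: days of journal outstanding when the first pass starts (hypothetical).
    speedup: days of journal applied per day of load/rollforward (hypothetical, must be > 1).
    target_days: backlog the regular daily job can absorb (hypothetical, must be > 0).
    """
    if speedup <= 1:
        raise ValueError("rollforward must apply journals faster than they are produced")
    if target_days <= 0:
        raise ValueError("target_days must be positive")
    cycles = 0
    while backlog_days > target_days:
        # Applying the backlog takes backlog/speedup days, and the source
        # produces that many days of new journal in the meantime.
        backlog_days = backlog_days / speedup
        cycles += 1
    return cycles

# Example: 5 days of backlog after the initial load, journals applied twice
# as fast as they are produced.
print(catch_up_cycles(5, 2))  # 3
```

The closer the rollforward rate is to the rate at which journals are produced, the more passes it takes to converge.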

One of the major benefits here (apart from the obvious DR ones) is that you can now implement cluster archives from your DR version of TableA, as the data should be synchronised with the production version. This will free up time in your production batch window, and removes the archiving task from the production server. Bear in mind that the more journal files you retain, the less you depend on the full-table archive (keep it realistic though - it wouldn't be much fun trying to catch up a whole month's worth of journal files if your source TableA went bananas at the end of a month!)

Some of the downsides to consider :

i) journal processing overheads
Can be heavy with BTEQ processing - especially deletes and, to a lesser extent, updates. Multiloads are cool though - almost no noticeable difference.

ii) your DR data is read only!!
Obvious thing to say really (well, in hindsight!), but you can't do any inserts to your PJ maintained TableA on your DR server where you're adding a new primary index value. You may get away with non-PI column updates, but if you insert a row, then you run the risk of corrupting the uniqueness value per row-id (cheers to Jon Christie for finally getting to the bottom of this for us a little while ago). Rule of thumb - if your table is PJ maintained, then it's read-only. Better still, throw an access lock view against it and tie down access rights across your user community to make sure it stays that way.

iii) DDL changes in your application code.
If your application code on the source server includes ANY sort of DDL statements (including drop/create SIs, table renames, etc), then you'll need to run the same DDL change against the DR version of TableA (or better still, redesign the app to remove the DDL). Otherwise, you're going to have table version mismatches, and Teradata won't rollforward the source system rows into DR as it has no confidence that the row layout between the two is identical. Which is fair enough really.
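One way to find out whether downside (iii) applies to you is to scan the application scripts for DDL before going live with journals. The sketch below is a deliberately naive illustration: it splits on semicolons (so it will be fooled by semicolons inside string literals or comments), and the keyword list and sample statements are my own assumptions rather than a complete definition of Teradata DDL.

```python
import re

# Leading keywords treated as DDL for this sketch; extend to suit your site.
DDL_PATTERN = re.compile(r"^\s*(CREATE|DROP|ALTER|RENAME)\b", re.IGNORECASE)

def find_ddl(script: str) -> list[str]:
    """Return the statements in a semicolon-separated script that start with a DDL keyword."""
    statements = [s.strip() for s in script.split(";")]
    return [s for s in statements if s and DDL_PATTERN.match(s)]

# Illustrative script only; table, column and staging names are hypothetical.
sample = """
DROP INDEX (col_b) ON TableA;
INSERT INTO TableA SELECT * FROM StageA;
CREATE INDEX (col_b) ON TableA;
"""

for stmt in find_ddl(sample):
    print(stmt)
```

Anything this flags against TableA is a statement that would also have to be run on the DR copy, or designed out of the application.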


Obviously, you'd be very wise to implement source vs DR TableA checks, to ensure that your journal maintenance routines continue to 'do the business'. Ignorance can be bliss... but not with PJs unfortunately. They're a bit of a 'black-box' in that you can't easily interrogate the PJ table, so you'll need to keep an eye open to ensure that synchronicity is being maintained.
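A source vs DR check can be as simple as collecting the same handful of metrics from each system and comparing them. The sketch below only shows the comparison step; how the metrics are gathered (for example a row count and a column sum queried on each server at a quiet point) is left out, and the metric names and values are made up for illustration.

```python
def compare_snapshots(source: dict, dr: dict) -> list[str]:
    """Return a description of every metric that differs between source and DR."""
    problems = []
    for name in sorted(set(source) | set(dr)):
        if name not in source or name not in dr:
            problems.append(f"{name}: missing on one side")
        elif source[name] != dr[name]:
            problems.append(f"{name}: source={source[name]} dr={dr[name]}")
    return problems

# Hypothetical metrics for TableA taken from each system after the daily rollforward.
source_metrics = {"row_count": 1000, "sum_amount": 52500}
dr_metrics = {"row_count": 998, "sum_amount": 52500}

for problem in compare_snapshots(source_metrics, dr_metrics):
    print(problem)  # row_count: source=1000 dr=998
```

An empty result only tells you the chosen metrics agree, so pick ones that would actually move if a journal were missed.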

Hope this helps.


Cheers,

Mark



     
 
 
Copyright for the TeradataForum (TDATA-L), Manta BlueSky    
Copyright 2016 - All Rights Reserved    
Last Modified: 15 Jun 2023