IT professionals today may find themselves moving massive amounts of data more often than in days past. With cheaper servers, open source operating systems and open source databases, you may be asked, for example, to move from Oracle on SPARC processors running Solaris to PostgreSQL on Intel processors running Linux. With database virtualization you might be asked to migrate a database from “raw iron” (i.e. a dedicated physical server) to a hypervisor. And now with the cloud you could be asked to relocate a database from on-premises to the public cloud. The point is that today there are many reasons to move entire databases from source to target. Moreover, there can be reasons to move large portions of the data as well. You might be following good data life cycle management guidelines and moving older data to slower, less expensive secondary and tertiary storage (e.g. FLASH/SSD -> 15K SAS -> 7200RPM SATA). Likewise you might be using the Microsoft SQL Server “stretch database” concept to keep hot data on-premises and warm to cold data in a public cloud, regardless of whether your chosen database platform inherently offers that capability. For myriad reasons, you could be asked to move mountains of data.
So what tool should you use? If I were doing some landscaping on my yard and the goal was to selectively place some new dirt in key locations to level my yard, then I would use a shovel and wheelbarrow. If instead I needed to raise the entire yard by two inches due to erosion, then I would probably employ a dump truck and bulldozer. My point: always use the right tool for the right job. Moving data is no exception. I generally like to break data movement tools into two categories: highly selective, like a rifle, vs. scattered, like a shotgun. When I hunt bear I need bullets to penetrate the thick skin and do large amounts of internal damage, so I need a bolt action, high powered rifle. When I hunt pheasant, however, I need to be able to shoot them as they take flight, and accuracy cannot be counted upon, so I need a semi-automatic shotgun with a wide choke (i.e. spread of the pellets) and repeat fire in case I miss. I believe that the figure below very simply and effectively distinguishes their key difference.
Large data migration projects are no different. Common ETL (extract, transform and load) tools are far more like the rifle than the shotgun. They are better at per-table operations and complex, conditional logic. They are good at moving a table or a small group of related tables from source to target. But asking these tools, or those using them, to define projects that move hundreds or even thousands of tables is not just Herculean, but in most cases simply overwhelming. You need a data shotgun. What you need is Cirro’s Data Puppy. Data Puppy has been designed from the ground up to move large numbers of tables from source to target while using patented algorithms to handle any and all required implicit data conversions due to internal implementation differences across database platforms. Moreover, Data Puppy possesses a highly scalable, threaded processing design which enables it to leverage today’s computing power. Want to move 15,000 PeopleSoft tables from Oracle to PostgreSQL? You simply define a Data Puppy project and run it. Doing exactly that took me about 5-10 minutes to define and an hour or so to run. Choosing the right tool was critical in getting this task done on time and within budget.
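To give a feel for the kind of implicit conversion such a tool must automate, here is a minimal sketch of mapping a few common Oracle column types to rough PostgreSQL equivalents. This is purely illustrative and my own simplification, not Data Puppy’s actual algorithm: a real migration tool handles far more types, precision edge cases, character sets and vendor quirks.

```python
# Illustrative sketch only: a tiny Oracle-to-PostgreSQL column type mapper.
# Real migration tools cover many more types and edge cases than this.

def map_oracle_type(ora_type, precision=None, scale=None):
    """Return a rough PostgreSQL equivalent for an Oracle column type."""
    t = ora_type.upper()
    if t == "NUMBER":
        if precision is None:
            return "numeric"                  # unconstrained NUMBER
        if (scale or 0) == 0 and precision <= 9:
            return "integer"                  # small whole numbers
        if (scale or 0) == 0 and precision <= 18:
            return "bigint"
        return "numeric(%d,%d)" % (precision, scale or 0)
    simple = {
        "VARCHAR2": "varchar",
        "NVARCHAR2": "varchar",
        "CLOB": "text",
        "BLOB": "bytea",
        "DATE": "timestamp",  # Oracle DATE carries a time component
        "TIMESTAMP": "timestamp",
    }
    return simple.get(t, "text")  # fall back to text for unknown types

print(map_oracle_type("NUMBER", 5, 0))   # integer
print(map_oracle_type("VARCHAR2"))       # varchar
print(map_oracle_type("DATE"))           # timestamp
```

Even this toy version shows why the shotgun approach matters: the same mapping logic applies uniformly across thousands of columns, rather than being hand-coded table by table in an ETL project.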