Sometimes information systems professionals can draw very useful analogies from the hard sciences. For example, physics offers us one very useful concept: the states of matter. Figure 1 shows an excellent diagram that I copied from Wikipedia explaining the four states of matter.
Looking at these four states, I’m going to suggest that there are really just two fundamental states of matter: bound and liberated. Bound connotes that molecular cohesion binds the matter into a semi-solid to solid state, while liberated signifies that the matter overcomes such molecular cohesion to more freely “bounce around” and thus be “anti-solid”.
It’s my contention that data, much like matter and regardless of whether it lives in a database or some other store, has two basic states: inert (i.e. at rest) and mobile (i.e. in motion). This simple concept provides a logical point of reference when talking about data, and in particular about the tools used to work with it.
Within the physics of matter framework, if I tell you that the water is in its liquid state, then you instantly think of tools that might be useful for working with it in that state, such as a rag, sponge, bucket, mop or wet vac. If I then say we need a tool to convert the water from its liquid state to a gaseous state, again you instantly think of tools for that job, such as a candle, Bunsen burner, hot plate, microwave or boiler. Thus knowing the state helps us decide which tool to use.
The same kind of thought process can work for the states of data. If I say that the data is at rest, then tools such as an ER diagrammer or data modeler might be useful to examine the design or structure of that data. Likewise, if my goal is to put that data in motion, then a script, program, utility or software tool that reads the data from its current source and writes it to a new target springs to mind. This process is generally referred to as “data migration”. Cirro offers an extremely capable yet simple data migration tool called Data Puppy. It offers many other useful features, but its core function is to simply and quickly move lots of data from point A to point B, no matter what A and B are in terms of databases.
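To make “data in motion” concrete, here is a minimal sketch of a straight copy in Python. It is only an illustration of the idea and not how Data Puppy works internally. The `migrate_table` function, the `customers` table and the two in-memory SQLite databases are all assumptions I made up for the example, and the target table is assumed to already exist with the same columns as the source.

```python
import sqlite3


def migrate_table(source, target, table, batch_size=1000):
    """Copy every row of `table` from source to target unchanged.

    Hypothetical helper: assumes `table` is a trusted name and that the
    target already has a table with the same columns.
    """
    cur = source.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    placeholders = ", ".join("?" for _ in columns)
    insert_sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    )
    copied = 0
    while True:
        rows = cur.fetchmany(batch_size)  # read a batch from point A
        if not rows:
            break
        target.executemany(insert_sql, rows)  # write it to point B as-is
        copied += len(rows)
    target.commit()
    return copied


# Demo setup: two in-memory databases stand in for points A and B.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
schema = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"
source.execute(schema)
target.execute(schema)
source.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
)
source.commit()

moved = migrate_table(source, target, "customers")
```

Notice that no row is inspected or altered along the way. The rows are read in batches and written out exactly as they were found, which is what keeps this kind of movement cheap.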
Now we can further leverage the physical sciences example to distinguish between a data migration solution such as Data Puppy and an ETL (Extract, Transform and Load) tool. Returning to the physics of matter framework, if I now instead tell you that we desire to deconstruct the water into its base elements (hydrogen and oxygen), or to construct water from those same base elements, then we need radically different tools. We instinctively know that we cannot use a simple tool like a beaker or Bunsen burner for this task. We need more complex tools that cause more costly chemical reactions, such as an electrolysis tank to break water apart. That’s akin to an ETL tool for data, because those tools run expensive logic on every row of data to cleanse, separate or join it. Not only are they expensive in terms of execution, but they also generally carry a high cost in training and in the time needed to define tasks.
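For contrast, here is a minimal sketch of what “logic on every row” looks like in an ETL-style load, again in Python. The `raw_customers` and `customers` tables, the `cleanse` rules and the `etl_load` function are hypothetical names invented for this illustration and do not represent any particular ETL product.

```python
import sqlite3


def cleanse(row):
    """Transform step: tidy whitespace and casing, then split the name."""
    raw_id, raw_name = row
    name = " ".join(raw_name.split()).title()
    first, _, last = name.partition(" ")
    return (raw_id, first, last)


def etl_load(source, target):
    """Extract each row, transform it, and load the result into target."""
    loaded = 0
    for row in source.execute("SELECT id, full_name FROM raw_customers"):
        # Every single row pays the cost of the transform logic.
        target.execute(
            "INSERT INTO customers (id, first_name, last_name) "
            "VALUES (?, ?, ?)",
            cleanse(row),
        )
        loaded += 1
    target.commit()
    return loaded


# Demo setup: in-memory databases with deliberately messy source data.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE raw_customers (id INTEGER, full_name TEXT)")
target.execute(
    "CREATE TABLE customers (id INTEGER, first_name TEXT, last_name TEXT)"
)
source.executemany(
    "INSERT INTO raw_customers VALUES (?, ?)",
    [(1, "  ada   lovelace "), (2, "GRACE hopper")],
)
source.commit()

loaded = etl_load(source, target)
rows = target.execute("SELECT * FROM customers ORDER BY id").fetchall()
```

Even in this toy version, someone has to define the cleansing rules and the target layout before any data moves, and the rules then run once per row. That is the extra cost the electrolysis analogy is pointing at.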
So when you simply need to move data, think Data Puppy.