The Periodic Table Of Data – Pipe Dream or Possibility?

By Adrian Stokes

I work in a data warehouse in the financial sector, creating business and management intelligence solutions for mortgages and savings. As a tester, I don’t just want to help deliver the best value I can to my team and the customer. I want to be a facilitator of learning and deliver value to myself and my company. My job title is Quality Assurance Analyst, and while some have the perception that ‘QA’ equals ‘tester’, I try to go beyond that important aspect and add value by improving ‘how’ we work. So I’m always looking for opportunities to improve what we have.

Data can come in many forms, shapes and sizes. In Star Trek: The Next Generation, for instance, Data is a Lieutenant Commander in a brilliant sci-fi TV series and its later spin-off films. The idea of a sentient robot/android with a Pinocchio complex is an intriguing fantasy of a possible future. Today’s ‘Data’ coming out of software systems, like Data after his ‘emotion’ chip, can be so annoying! There’s loads of it, everywhere you go and everywhere you look, and every time you do something, more is generated. And that’s before you even think about all the logs that lie alongside the system! Unfortunately, our ‘data’ is not able to communicate with us as easily as the Commander, and we have to find our own ways of analysing it and finding its properties and values. With that in mind, I’ve been thinking of ways to improve how we categorise the data we have, in order to make it easier to use.

I have an idea and just maybe, it’s a possibility, not a pipe dream: It would be nice if someone could come up with a periodic table of data or perhaps a set of them, for different subject areas!

I was told in school (and it’s supported by Wikipedia (1) so it must be true…) that one of the main values of the periodic table was being able to predict the properties of an element based on its location in the table. Unfortunately, that’s where my chemistry knowledge stalls, but the principle of arranging your stakeholders, customers and users in a table still applies. Maybe it can also apply to your data! Imagine you have a new data source to bring into your warehouse. How cool would it be to look at its properties and easily understand how it aligns with your existing data? And to know quickly which areas you can enhance or expand into, before engaging in lots of analysis?

Figure 1 shows a very basic example for a mortgage data set that shows a few different areas that could be considered. I’ve colour coded the different areas like ‘people’, ‘payments’ etc and used shades for the variations. I appreciate this is crude and needs work but that’s where the testing community comes in. I’m sure there are many of you wiser than me who could improve this?

Figure 1. Periodic table of data – An idea by Ady Stokes

So now we have our premise, let’s take an example of how it would work in the real world. You have been asked to bring in a new data source from a large online retailer. This new data source could include customer details, contact numbers, tables for both time of day and contact information, how customers paid and whether they wanted standard or express delivery. There could be a history that offers a chance to make recommendations, or the customer may have defaulted on a payment or had many returns.

Figure 2. New online retailer data source

In this new example (see figure 2), you can see visually that types of payments align with your transactions and payments. Customer details can be expanded, and possibilities for using the new data alongside existing data, and the places where they already match, are quickly identified. Obviously this is just a small sample of the data you would see, but hopefully it is enough to get the concept across. A periodic table of your existing data, like the first example, could save days of analysis when identifying new possibilities for the new data shown above. Looking at the original and new data characteristics shows easily how these align, and from there analysis is guided.
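One way to picture this comparison outside of a diagram is a small script. What follows is only a minimal sketch under stated assumptions: the element names, the area names and the `compare_tables` helper are all hypothetical, invented for illustration, and are not taken from the figures. It assumes each data element has already been filed by hand under one area of the table, and simply reports which elements of the new source already match, which extend an existing area, and which open up a new area.

```python
from collections import defaultdict

# Hypothetical existing "periodic table": element name -> area.
# These names are illustrative only, not taken from the article's figures.
EXISTING_TABLE = {
    "customer_name": "people",
    "contact_number": "people",
    "monthly_payment": "payments",
    "payment_method": "payments",
    "arrears_flag": "payments",
    "payment_date": "time",
}

# Hypothetical table for the new online retailer source.
NEW_SOURCE = {
    "customer_name": "people",
    "contact_number": "people",
    "payment_method": "payments",
    "delivery_type": "delivery",
    "order_time_of_day": "time",
    "returns_count": "history",
}


def compare_tables(existing, new):
    """Group the new source's elements by area and by how they relate
    to the existing table: matches, extends, or new_area."""
    existing_areas = set(existing.values())
    report = defaultdict(lambda: {"matches": [], "extends": [], "new_area": []})
    for element, area in sorted(new.items()):
        if element in existing:
            kind = "matches"    # element is already in the existing table
        elif area in existing_areas:
            kind = "extends"    # new element in an area we already have
        else:
            kind = "new_area"   # area not present in the existing table
        report[area][kind].append(element)
    return dict(report)


if __name__ == "__main__":
    for area, groups in compare_tables(EXISTING_TABLE, NEW_SOURCE).items():
        print(area, groups)
```

The hard part, deciding which area each element belongs to, is still a human judgement in this sketch; the script only makes the overlap between two tables quick to see.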

For those familiar with the world of data warehouses, this could mean new data sets or extensions to dimensions or fact tables, simply by seeing what fits where. It could align with your customer dimension to increase your effectiveness in communications. It could help confirm you are ‘treating customers fairly (2)’ by trending contacts via phone, post, email and letter. It could show how best to align your product offerings with potential customers.

Another example we could use is a travel company. As well as the standard customer details, there could be favourite times and locations to consider, how many are in the party and what level of package they would prefer. The level of service they desire could link directly to the amount they pay, so it could align with your payments section to offer new ways of looking at customers. Perhaps those choosing half board would go all inclusive if given longer to pay?

Figure 3. New travel company data source

Again, you can see visually (figure 3) how the new data lines up with your existing customer profile. Or, if you are just starting to create your data table, an example of another could help you decide how you want to proceed and represent your data.

It’s like getting a new Lego set and working out if you can improve your existing models with it! Mapping the new data source against your existing table would expose what areas were affected and give a quick view of how the new data could enhance, improve or offer new opportunities for analysis. And once you understand the concept it should not take too long either!

A basic set of data could be common to most environments, with perhaps a few tweaks. Customers, payments, clients and elements of time, be it payment schedules or just dates, are familiar to nearly every project and product. Variations on that common or generic set, such as arrears or preferences, would then apply depending on what mattered to your business.

I believe there are all sorts of possibilities for the Periodic Table of Data’s use, depending on your environment. Any company that uses MI (Management Information) or data tracking to help make strategic decisions could make use of this. The main thing to remember is that this is an idea rather than a presentation of a technique or fully developed system. In offering it to the community, my main hope is that readers will pick up the ball and run with it! If you think you can use this, improve this or make it work in your environment, please do. I’d love to know how you get on. It may be I’ve gone into too much detail, or not enough? It may need the areas or ‘shades’ redefining for your world? As I said, I’m sure the testing universe will have its thoughts and opinions, and I’d love to hear them. My personal mail is adystokes@sky.com.

Perhaps one day the Periodic Table of Data will seek out new paths and new learning. It may boldly go where no visual aid has gone before! Or not? We will have to wait and see.

References