Let’s talk about mini-marts. When we build out BI,
traditionally, we work towards launching the data-mart; a complete and cohesively
built star or snowflake schema that allows us to create reporting and self-service
areas to launch to a complete business area.
This is a wonderful endeavour, and I need to ensure that
everyone knows this is not something I am against at all. What I have found
though, in my experience, is that when working with a small team, or starting a
BI program, it creates a far-reaching target to get to, before you can show
value to your consumers, in this case, your internal or external clients that
consume your data from the mart.
I need to also add the caveat right near the top here, that
if you and your team are not documenting what you are doing, even with
something as simple as a text document that highlights the purpose of every
action along the way, you are setting yourselves up for failure. Full
transparency within your team, is a must.
To build value rapidly, as we do in AGILE and in DevOps, and
to achieve the shortest turn-arounds for our clients, then, we deploy
mini-marts. We concentrate on the value that the business users identify as
being of the most import to them; the areas that they already use to describe
their business. We usually target areas that are being manually compiled by a
clerk, team lead, or manager, by collecting data from pre-built queries and
then cleansing it through a manual process, usually in Excel.
We identify all the areas and fields from the transactional
system, as well as any low-hanging fruit that is adjacent, and we concentrate
on building our mini-mart based on this; the customer identified value, and a
few extras. Using the 80/20 principle, we begin the process of designing our
mart from the front end backward.
Power BI On-premises is our current reporting server of
choice in our organization, so we usually begin with a mock up, consisting of a
bunch of fake data on a single spreadsheet that feeds the visuals. This is something
that can be achieved in half a day at most, and provides our users with
something they can see, and interact with to guide the design. This allows them
to make rapid choices about what really matters to them, long before we put in
any effort in creating the ETL for the initiative.
Once they are happy and have given us the feedback, we have
a much more solid design from our users that we can then use to design the mart.
This technique is guided through having the mock up, and a conversational
meeting that is led by the 7 question technique outlined in the model-storming
process from the “Agile
Data Warehouse” by Lawrence Corr.
This model-storming technique gives us the measures and dimensions and we can design
our star schema model, usually in a visual manner, using software from QUEST or
just in Visio.
Now we know what needs to be pulled, transformed, and
prepared, and the realm building work begins. Using SSIS we will pull the data
into our landing zone in the staging instance of SQL Server. This landing zone
is a partition specifically designed for us to pull raw data into and, if we
need to, add static tables such as translation values, grouping values, or
targets that are updated yearly. (Some of which we then go on to automate
through SharePoint forms for ease of data entry.)
We use T-SQL, Python and R to affect our stage 2 ETL, and transform
and align the base tables for our model. This is the point where we are creating
the star schema, and that is located in our clean staging area of the data
warehouse. This area also doubles as our future sandbox area, when we are ready
to engage with our organization’s citizen data scientists.
This clean staging area contains mostly transformed data, although
we do not concentrate on ensuring all the final measures are there, but that
our fact tables are well built and designed for the final model. We then pull
the tables for our mini mart into a SSAS tabular model, creating the final
measures, building all our joins, changing titles to plain language, and
ensuring that the mini-mart encompasses the areas we need in the value-add
reporting initiative for our business user.
We deploy the mini-mart to our warehouse, connect it up to
the mock-up report to display real data, and then launch in alpha stage to a
limited user group that allows us to get them involved in the final data
validation of all the business logic, built into all three stages of the ETL
process. (Next week we will discuss three phase ETL, and why it is my choice
for ETL design.)
As I have outlined in the previous
post, we now go through the iterative processes and we have created value
for our users in a very short amount of time. Additionally, we have created what
I would refer to as the embryo of the greater data mart; the seed that will allow
us to build outward, in an organic and client-driven way. (Much like a fractal)
When we get the next project, initiative, or request from
the same business area, we already have some semblance of an idea of where we
are starting. We may already have twenty, forty, or even eighty percent of the
data we need for the next project completed in the existing mart. Now we build
up on it with this next initiative, adding to our dimension tables, maybe
needing to re-create them slightly to conform them to the new project, but it
is not ground up again, and we are still building value for our consumers.
This is the basic concept of building the mini-mart, and why
I use them for rapid development, and immediate value-add to our users. Instead
of spending a year to three building a larger data mart, during the first year we
launched 4 primary mini-marts. I then went on to create a second generation of
2 of them, being able to provide over 15 value-adding reporting initiatives to my
internal clients. This process allowed our minuscule team to create labour time
savings from automating manual compilation and business rules into these mini-marts
that exceeded the amount of time we spent building them, hour for hour, by over
fifteen percent!
Cheers!
SQL Doch
Very nice blog and articles. I am realy very happy to visit your blog. Now I am found which I actually want. I check your blog everyday and try to learn something from your blog. Thank you and waiting for your new post. 迷你倉
ReplyDelete