A brilliant new feature is now available in Power BI – Split column into rows. To understand why we need it, you must go and read the article – Analyzing badly captured Survey data or feedback forms. This method used Power Query concepts of Split and Unpivot. Now these have been combined into a single, intelligent command called Split columns into rows. It sounds confusing at first. But soon you will realize that it is an amazing tool. Learn it just 4 minutes.
Raw data looks like this
And you get a report like this. No need to use formulas or do any manual work.
You must have the May 2017 update for Power BI Desktop installed.
Split column into columns is not useful
The data is already opened in Query editor. Right click on the Products column and choose Split column – By Delimiter. The default option is to split it into columns.
But we do not want more columns. Why not?
Because if the data is split into additional columns, I still cannot create a report for how many people are using each product. Remember our rules for Good Data? One rule is that one meaning should go into ONE column. The data we have contains Excel (all products actually) in more than one columns. This will make you break your head in creating some weird and complex formulas – which is a total waste of time.
Split column into rows
What we really want is that each product and user should have a new row. It may sound very complex but it requires just a few extra clicks.
In the Split dialog, choose Advanced and then choose Rows
Now the data looks like this. I agree that all other columns have to repeat themselves and it will increase the storage requirements to some extent. However, because there is duplication, compression takes care of the file size optimization.
The data does not look good but trust me, input data is not supposed to LOOK good. It is supposed to BEHAVE good – by allowing us to do instant analysis with wasting time in clean up.
Now we can create the visualization shown above very easily. I am sure you already have some data which has this type of delimited column. Try it out and celebrate the efficiency increase.
You will have seen many DAX ways of doing this using CROSSJOIN etc. But as this is usually a data clean-up activity, we should not create DAX overheads when Power Query manages it so well.
Multiple splits are dangerous
Now consider the same data with one more column where Split column into rows is applied.
The resulting data is difficult to analyze. Here neither COUNT nor COUNT (Distinct) will work properly.
I have not yet found a solution to this problem. I am researching it. In case you have any ideas, please post them as comments.