fbpx

Power Bi – Splitting column into rows

A brilliant new feature is now available in Power BI – Split column into rows. To understand why we need it, you must go and read the article  – Analyzing badly captured Survey data or feedback forms. This method used Power Query concepts of Split and Unpivot. Now these have been combined into a single, intelligent command called Split columns into rows.  It sounds confusing at first. But soon you will realize that it is an amazing tool.  Learn it just 4 minutes.

Raw data looks like this

Split column into rows - raw data

And you get a report like this. No need to use formulas or do any manual work.

Split column into rows - Final Charts

You must have the May 2017 update for Power BI Desktop installed.

Split column into columns is not useful

The data is already opened in Query editor. Right click on the Products column and choose Split column  – By Delimiter. The default option is to split it into columns.

But we do not want more columns. Why not?

Split column into more columns

Because if the data is split into additional columns, I still cannot create a report for how many people are using each product. Remember our rules for Good Data? One rule is that one meaning should go into ONE column. The data we have contains Excel (all products actually) in more than one columns. This will make you break your head in creating some weird and complex formulas – which is a total waste of time.

Split column into rows

What we really want is that each product and user should have a new row.  It may sound very complex but it requires just a few extra clicks.

In the Split dialog, choose Advanced and then choose Rows

Split column by Delimiter dialog

Now the data looks like this. I agree that all other columns have to repeat themselves and it will increase the storage requirements to some extent. However, because there is duplication, compression takes care of the file size optimization.

The data does not look good but trust me, input data is not supposed to LOOK good. It is supposed to BEHAVE good – by allowing us to do instant analysis with wasting time in clean up.

Good data - after Split column into rows

Now we can create the visualization shown above very easily. I am sure you already have some data which has this type of delimited column. Try it out and celebrate the efficiency increase.

You will have seen many DAX ways of doing this using CROSSJOIN etc. But as this is usually a data clean-up activity, we should not create DAX overheads when Power Query manages it so well.

Multiple splits are dangerous

Now consider the same data with one more column where Split column into rows is applied.

Two columns need split

The resulting data is difficult to analyze. Here neither COUNT nor COUNT (Distinct) will work properly.

Confusing data after two split to row operations

I have not yet found a solution to this problem. I am researching it. In case you have any ideas, please post them as comments.

Power BI logo

2 Responses

    1. Sure. Generally Power BI desktop gets updated much faster than Power Query that is built in to Excel. Furthermore, we need Office Insider, Fast track configuration for receiving updates to Office. In most companies, regular users do not have Fast deployment enabled.

Queries | Comments | Suggestions | Wish list