Tag Archives: data cleanup

Consolidating crosstab data

This is a continuation of the Data Cleanup series. One common form of bad data is the crosstab. Consolidating crosstab data by hand requires a lot of manual effort. Pivot – Multiple Consolidation Ranges is a fantastic Excel feature which can do this job in a snap. Here is an example of crosstab data…


Let us see how to get this done.
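
For readers who think in code, here is a minimal Python sketch of what "consolidating" a crosstab means: every cell becomes one record of row label, column label and value. This is not how Excel does it internally, and the sample table and the function name unpivot are made up for illustration.

```python
# Hypothetical crosstab: one row per product, one column per month.
crosstab = [
    ["Product", "Jan", "Feb", "Mar"],
    ["Apples", 10, 12, 9],
    ["Pears", 4, 0, 7],
]


def unpivot(table):
    """Turn a crosstab into a flat list of (row label, column label, value)."""
    header, *rows = table
    column_labels = header[1:]  # skip the corner cell ("Product")
    records = []
    for row in rows:
        row_label, *values = row
        for column_label, value in zip(column_labels, values):
            records.append((row_label, column_label, value))
    return records


flat = unpivot(crosstab)
for record in flat:
    print(record)
```

A flat list like this is the shape a Pivot Table expects as its source data.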


Data Clean-up: System Reports

One of the worst formats in which to receive raw data is the system report. These are reports you generate from business automation software. The Export to Excel / CSV option dumps the report into a file instead of sending it to a printer. But the report was originally designed for printing, so you waste a lot of time removing unwanted headers and footers.

This article shows a quick method, and more importantly, a concept which can be used in various clean-up scenarios as well.


Miraculous Automatic Data Cleanup in Excel

This is a continuation of the Data Cleanup series. In this article we discuss a newly introduced feature, “Flash Fill”, which is almost miraculous and difficult to believe. When I demonstrate this feature during my seminars, most people suspect that I am playing some kind of trick!

It opens up a new way of handling data cleaning tasks which would otherwise have been either very time consuming or impossible using existing methods.


Good Data vs. Bad Data: How to identify?

This article shows you simple and immediately usable methods of understanding whether the data you have is good or bad.

This article has intentionally been kept short and to the point so that it is practically usable as a set of best practices.

A more detailed version of this article is being written by an expert in the field, and will also be published as a guest post soon.

Analyzing badly captured survey data

All of us conduct surveys, get feedback forms or scoring sheets filled. When data is captured in Excel, we typically use Excel Pivot Tables to analyze it.

But what if the data is captured as a list of textual values like this? Many professional survey applications dump the data in this type of format.


Here is a very simple but extremely useful way of analyzing such badly captured data.


Excel data cleanup with Flash Fill


One of the participants (Mr. Amjad) in a seminar I conducted asked me this question. The scenario is common enough, but the solution often goes unnoticed: it is called Flash Fill. Instead, we struggle to solve the problem manually (and inefficiently).

There is data in one column which contains a name followed by an amount. We want to split the amount into its own column.


Doing it using a formula is tricky, because in some cases there are two spaces and in some cases only one. And if the data set is large, there may be cases with three spaces as well.
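
To show why the varying number of spaces is the crux, here is a small Python sketch (not an Excel formula) that splits each value at the last run of whitespace, however long that run is. The sample values and the function name split_name_amount are hypothetical, and the sketch assumes the amount is always a plain whole number at the end of the cell.

```python
def split_name_amount(cell):
    """Split 'name<spaces>amount' at the last run of whitespace."""
    # rsplit(None, 1) treats any run of whitespace as one separator
    # and splits only once, starting from the right.
    name, amount = cell.rsplit(None, 1)
    return name.strip(), int(amount)


# Hypothetical sample data with one, two and three separating spaces.
samples = ["Ravi 300", "Amit Shah  4500", "Meera Nair   12000"]
for cell in samples:
    print(split_name_amount(cell))
```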

Easy solution: Excel 2013 – Flash Fill

This is an amazing new feature. I will cover it in detail in a separate article. However, here is the way it works.

In column E, specify what you want. Just type the value. NO formula.


Then choose Data tab – Flash Fill


Do the same thing for the name in column F. That’s it. Excel has an amazing pattern matching engine which does the job for you.


What if you don’t have Excel 2013? Use Word!

You might think you will have to resort to complex formulas or VBA. Not really. We often get too focused on one product and forget that Office has other tools as well.

Remember. Excel is great with Numbers and Dates. But not text.

Who is the text expert? Word.

In fact, what Excel 2013 Flash Fill does is a form of pattern matching, similar in spirit to regular expressions. Word has had a comparable feature, wildcard search, for many years.

Let us see how to solve complex Excel data handling problems in Word.

  1. Copy the column data and paste it in Word
  2. It will be pasted as a table. But the table borders will not be visible.
  3. Click inside the pasted data. You will see the Table Tools tab
  4. Click the Layout tab and choose View Gridlines
    Now you can clearly see the table
  5. We want to select only the numbers
  6. Press Ctrl+H to open the Find – Replace dialog
  7. We don’t want to replace anything, so click the Find tab
  8. In the Find textbox, type this exactly as shown here: [0-9]{1,}
  9. Click the More>> button and choose the Use Wildcards option
  10. Now try Find Next multiple times. Notice that it selects the numbers one by one
  11. But we want to select all the numbers at once. So open the Find In dropdown and choose Main Document
  12. Now all the numbers will be selected. Close the Find dialog.
  13. Press Ctrl+X to cut the numbers and put them on the clipboard
  14. Only the names are left in the first column.
  15. Now add a new column.
  16. Select the second column
  17. Press Ctrl+V to paste the numbers
  18. Now select the whole table, copy it and paste it into Excel
  19. Job done!
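
For comparison, the same cut-and-paste idea can be sketched in Python. This is only an illustration of the steps above, not something Word or Excel runs. It assumes each cell contains exactly one number, and the sample cells and the function name cut_numbers are hypothetical.

```python
import re

# One or more digits in a row, the same idea as the Word search text.
DIGIT_RUN = re.compile(r"[0-9]{1,}")


def cut_numbers(cells):
    """Return (names, amounts): each number is cut out of its cell."""
    names, amounts = [], []
    for cell in cells:
        match = DIGIT_RUN.search(cell)
        amounts.append(match.group() if match else "")
        # What is left after the "cut" still has the separating spaces.
        names.append(DIGIT_RUN.sub("", cell))
    return names, amounts


cells = ["Amit Shah  4500", "Ravi 300", "Meera Nair   12000"]
names, amounts = cut_numbers(cells)

# Rough equivalent of Excel's TRIM: drop leading and trailing spaces
# and collapse repeated spaces between words.
trimmed = [" ".join(name.split()) for name in names]
print(trimmed)
print(amounts)
```

Just as in Word, the names keep their trailing spaces after the cut, which is why a trimming step follows.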

For the sake of completeness, the names column will have an extra space at the end. Add a new column and use the TRIM function to remove the trailing spaces if required.

Sample Word file for practice Download

How did this work?

The magic happened in that complex-looking search text: [0-9]{1,}


The Use Wildcards option means: do not search for what is typed literally; interpret it as a pattern.

[0-9] means search for any single digit in the given range, 0 to 9. On its own, that would select only one digit.

{1,} means that Word should search for one or more occurrences of the preceding item, here a digit. That is how it selects the entire number.
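
If you want to experiment with the two parts separately, this short Python sketch shows the difference. Python's regular expressions are not the same engine as Word's wildcards, but this particular pattern happens to be written the same way in both. The sample text is made up.

```python
import re

text = "Ravi 300"  # hypothetical cell value

# [0-9] alone matches one digit at a time.
single_digits = re.findall(r"[0-9]", text)

# Adding {1,} extends each match to the whole run of digits.
whole_numbers = re.findall(r"[0-9]{1,}", text)

print(single_digits)  # ['3', '0', '0']
print(whole_numbers)  # ['300']
```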

Want to know more?

Click in the Find textbox in Word, enable Wildcards and click the Special button.

You will see the entire syntax of what is possible. This is very powerful and easy to use once you understand the concept. Try it out.
