This tutorial walks through practical, real-life scenarios of using Power Query in Excel.
In the previous article, we laid the groundwork by exploring the basics of Excel Power Query. Now, it’s time to put it to use in real-world scenarios. Below, you will find a number of examples that will guide you through the effective applications of PQ in everyday situations.
The examples assume that you have already imported your source data to the Power Query Editor. If not, you can easily catch up by revisiting our previous tutorial that details how to get data into Power Query.
To make it easy for you to follow along, we've prepared a sample workbook that you can download at the end of this post. Let's start our data journey and see how Power Query works in action!
To remove leading and trailing spaces or non-printable characters, use the Trim and Clean commands: Trim strips whitespace from the start and end of each value, and Clean removes non-printable characters. Select the columns you want to clean up, go to the Transform tab > Text Column group, and click Format > Trim or Format > Clean.
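For reference, here is a minimal sketch of the M steps these two commands correspond to. The sample rows are made up for illustration, and the step names are only assumptions about what the editor might generate in your query.

```powerquery
let
    // Made-up sample rows; the column name "Full name" follows the formulas used in this tutorial.
    Source = #table(
        type table [#"Full name" = text],
        {{"  Anna Smith "}, {"Bill#(lf) Johnson"}}
    ),
    // Trim: strip whitespace from the start and end of each value.
    #"Trimmed Text" = Table.TransformColumns(Source, {{"Full name", Text.Trim, type text}}),
    // Clean: remove control (non-printable) characters such as line feeds.
    #"Cleaned Text" = Table.TransformColumns(#"Trimmed Text", {{"Full name", Text.Clean, type text}})
in
    #"Cleaned Text"
```

You can paste the sketch into a blank query (Advanced Editor) to experiment without touching your real data.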
To eliminate duplicate rows in your data, Power Query offers the Remove Duplicates function. Select the column(s) you want to check for duplicates, then go to the Home tab, and click on Remove Rows > Remove Duplicates.
Please keep in mind that the standard "Remove Duplicates" operation is case-sensitive: it removes only the rows whose values in the selected column(s) match exactly, including the letter case. For a case-insensitive deduplication, you need to modify the M code for the query. Here’s how:
In our case, after doing the standard remove duplicates operation, the Table.Distinct function looked like this:
=Table.Distinct(#"Cleaned Text", {"Full name"})
It successfully removed all the rows where the names in the Full name column were exactly the same, but it left some entries that differ only in letter case.
To fix this issue, you can add the Comparer.OrdinalIgnoreCase criterion to the function like this:
=Table.Distinct(#"Cleaned Text", {"Full name", Comparer.OrdinalIgnoreCase})
This will eliminate all the rows containing duplicates in the Full name column, ignoring the letter case.
Users with advanced Excel skills can do this operation in the Advanced Editor by changing the Removed Duplicates line to this format:
Table.Distinct(PreviousStep, {"ColumnName", Comparer.OrdinalIgnoreCase})
Note. Sometimes, you may have to look at more than one column to identify duplicate records. For instance, if a person has different name variations like "Johnson, Bill" and "Johnson, William", you can also check the Address column for duplicates.
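As a rough sketch of that idea, the query below deduplicates on the Address column instead of the name, ignoring letter case. The sample rows and the Address values are invented for illustration.

```powerquery
let
    // Made-up sample: the same person under two name variations at one address.
    Source = #table(
        type table [#"Full name" = text, Address = text],
        {
            {"Johnson, Bill", "12 Oak St, Springfield, USA"},
            {"Johnson, William", "12 OAK ST, Springfield, USA"},
            {"Smith, Anna", "5 Elm Rd, Leeds, UK"}
        }
    ),
    // Compare rows on Address only, ignoring letter case.
    #"Removed Duplicates" = Table.Distinct(Source, {"Address", Comparer.OrdinalIgnoreCase})
    // To compare on several columns at once, pass a list of {column, comparer} pairs, e.g.
    // {{"Full name", Comparer.OrdinalIgnoreCase}, {"Address", Comparer.OrdinalIgnoreCase}}.
in
    #"Removed Duplicates"
```

Whether the address alone is a safe key depends on your data; two different donors can share an address, so review the result before loading it.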
In case the imported data doesn't look quite right, you can easily convert it into the correct format.
In our sample dataset, the Registration Date column shows both date and time. To display only the date part of the values, you need to change the data type of the column from Date/Time to Date. This can be done in two ways:
Power Query applies the default date format from your locale (region settings). To display dates in a custom format, you can use the DateTime.ToText function. Here are the steps to follow:
The complete formula takes this form:
=DateTime.ToText([Registration date], "dd-MMM-yy")
The formula bar will show the complete formula in the M language, which will look something like this:
=Table.AddColumn(#"Previous step", "Date in custom format", each DateTime.ToText([Registration date], "dd-MMM-yy"))
By changing the data type, you can format the values in a more suitable way for your analysis. Similarly, you can change other columns to different data types, such as text, number, or currency, depending on your needs.
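In M, a data type change is typically recorded as a Table.TransformColumnTypes step. The sketch below assumes a Date/Time column named "Registration date" and a Donation column that was imported as text; the sample row is made up.

```powerquery
let
    // Made-up sample row: a Date/Time value and a number stored as text.
    Source = #table(
        type table [#"Registration date" = datetime, Donation = text],
        {{#datetime(2023, 5, 14, 9, 30, 0), "2500"}}
    ),
    // Convert Date/Time to Date (the time part is dropped) and text to number.
    #"Changed Type" = Table.TransformColumnTypes(
        Source,
        {{"Registration date", type date}, {"Donation", type number}}
    )
in
    #"Changed Type"
```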
To split a column into two or more columns by a certain delimiter, you can use the Split Column function. For example, to split the "Full Name" column into "First Name" and "Last Name", the steps are:
Tip. If you want to preserve the original column, duplicate it before splitting. To do this, right-click the column and choose Duplicate Column from the context menu. This will create a copy of the column with a "- Copy" suffix in its name. You can then split this column as described above.
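The duplicate-then-split approach can be sketched in M as follows. This assumes the first and last names are separated by a single space; the copy's name "Name to split" is a hypothetical name chosen for this example, and the sample rows are made up.

```powerquery
let
    // Made-up sample rows; a single space is assumed as the delimiter.
    Source = #table(type table [#"Full name" = text], {{"Anna Smith"}, {"Bill Johnson"}}),
    // Keep the original column by splitting a copy of it.
    #"Duplicated Column" = Table.DuplicateColumn(Source, "Full name", "Name to split"),
    // Split the copy at each space into two named columns.
    #"Split Column" = Table.SplitColumn(
        #"Duplicated Column",
        "Name to split",
        Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv),
        {"First Name", "Last Name"}
    )
in
    #"Split Column"
```

If your names use a different delimiter, such as a comma followed by a space, change the first argument of Splitter.SplitTextByDelimiter accordingly.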
If some column in your dataset contains lengthy multi-part strings, you may want to extract certain information into a new column.
For example, let’s see how to extract the country name from the Address column:
A new column with the extracted values will be added to the end of the table, and you can move it to any position you want by dragging the column header.
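One way to express this extraction in M is sketched below. It assumes that the country is the last part of the address and that the parts are separated by a comma and a space; the addresses themselves are invented.

```powerquery
let
    // Made-up addresses where the country is the last comma-separated part.
    Source = #table(
        type table [Address = text],
        {{"12 Oak St, Springfield, USA"}, {"5 Elm Rd, Leeds, UK"}}
    ),
    // Take the text after the last ", " (index 0 counted from the end of the string).
    #"Inserted Country" = Table.AddColumn(
        Source,
        "Country",
        each Text.AfterDelimiter([Address], ", ", {0, RelativePosition.FromEnd}),
        type text
    )
in
    #"Inserted Country"
```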
When dealing with inconsistent or incomplete data, the standard Split and Extract functions might not work as expected.
Picture a scenario where country names within the Address column are separated by various delimiters like commas, spaces, or vertical bars. In such cases, you can rely on Power Query to extract country names based on an example you provide. This is similar to how Excel's Flash Fill feature works.
Here's how to add a new column using an example from existing columns:
You will now have a new column that extracts the country names from the addresses.
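The formula that Power Query infers depends on the examples you type, so it cannot be reproduced here exactly. As a hand-written alternative, the sketch below splits each address on either a comma or a vertical bar and keeps the last part. It assumes those two delimiters only (splitting on spaces would break multi-word country names), and the sample addresses are made up.

```powerquery
let
    // Made-up addresses that mix comma and vertical-bar delimiters.
    Source = #table(
        type table [Address = text],
        {{"12 Oak St, Springfield, USA"}, {"5 Elm Rd | Leeds | UK"}}
    ),
    // Split on any of the listed characters, take the last piece, and trim stray spaces.
    #"Added Country" = Table.AddColumn(
        Source,
        "Country",
        each Text.Trim(List.Last(Text.SplitAny([Address], ",|"))),
        type text
    )
in
    #"Added Country"
```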
In Power Query, replacing missing values, often represented as null, is a straightforward process:
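Expressed as a query step, the replacement might look like the sketch below. It assumes that the nulls sit in the Donation column and that 0 is an acceptable substitute; both choices, like the sample rows, are assumptions made for illustration.

```powerquery
let
    // Made-up sample rows with one missing donation.
    Source = #table(
        type table [#"Full name" = text, Donation = nullable number],
        {{"Anna Smith", 1500}, {"Bill Johnson", null}}
    ),
    // Replace every null in the Donation column with 0.
    #"Replaced Value" = Table.ReplaceValue(Source, null, 0, Replacer.ReplaceValue, {"Donation"})
in
    #"Replaced Value"
```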
To add a new column based on a set of conditions that use existing columns, make use of the Add Conditional Column feature. For example, to add a column that assigns a donor level based on the donation amount, this is what you need to do:
| If Donation | is greater than or equal to | 4000 | then | Platinum |
| If Donation | is greater than or equal to | 3000 | then | Gold |
| If Donation | is greater than or equal to | 2000 | then | Silver |
| Else | Bronze |
This feature is similar to writing a nested IF statement in Excel, but it’s a lot easier and more convenient to use.
By default, the new conditional column will appear at the end of your dataset, and you can drag it to any position you want.
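Behind the scenes, a conditional column is an if ... then ... else expression in M. A minimal sketch of the donor level rules above, using made-up donation amounts and an assumed column name "Donor level":

```powerquery
let
    // Made-up sample donations covering each level.
    Source = #table(
        type table [#"Full name" = text, Donation = number],
        {{"Anna Smith", 4500}, {"Bill Johnson", 3200}, {"Cara Lee", 2000}, {"Dan Wu", 750}}
    ),
    // Conditions are checked top to bottom; the first match wins.
    #"Added Conditional Column" = Table.AddColumn(
        Source,
        "Donor level",
        each if [Donation] >= 4000 then "Platinum"
            else if [Donation] >= 3000 then "Gold"
            else if [Donation] >= 2000 then "Silver"
            else "Bronze",
        type text
    )
in
    #"Added Conditional Column"
```

Because the first matching condition wins, the thresholds must be listed from highest to lowest, just as in the dialog.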
Power Query makes it easy to fix errors in Excel without spending too much time on debugging formulas or VBA code. To eliminate errors in your dataset, follow these simple steps.
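In M, errors are usually handled in one of two ways: removing the affected rows or replacing the error values. The sketch below shows both, assuming the errors come from a failed text-to-number conversion in the Donation column; the sample rows and the replacement value 0 are made up.

```powerquery
let
    // Made-up sample where one donation is not a valid number.
    Source = #table(
        {"Full name", "Donation"},
        {{"Anna Smith", "1500"}, {"Bill Johnson", "n/a"}}
    ),
    // Converting "n/a" to a number produces an error in that cell.
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"Donation", type number}}),
    // Option 1: drop every row that has an error in the Donation column.
    #"Removed Errors" = Table.RemoveRowsWithErrors(#"Changed Type", {"Donation"}),
    // Option 2: keep the rows and substitute a value (0 here) for each error.
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type", {{"Donation", 0}})
in
    #"Removed Errors"
```

To try the second option, return #"Replaced Errors" after the in keyword instead.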
To summarize or aggregate data by specific groups, Power Query offers the powerful Group By function.
For example, to calculate the total donation amount by country and donor level, this is what you need to do:
As a result, a new table will be created displaying the grouped and aggregated data. If needed, you can sort the table by one or more columns: click the filter arrow next to the column name and choose Sort Ascending or Sort Descending.
In this example, we get a summary of total donation amounts based on both country and donor level.
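The grouping above can be sketched in M with Table.Group, followed by an optional sort. The column names "Country", "Donor level" and "Total donation" are assumptions for this example, and the sample rows are made up.

```powerquery
let
    // Made-up sample donations.
    Source = #table(
        type table [Country = text, #"Donor level" = text, Donation = number],
        {
            {"USA", "Gold", 3200},
            {"USA", "Gold", 3500},
            {"UK", "Silver", 2000}
        }
    ),
    // One output row per Country / Donor level pair, with the donations summed.
    #"Grouped Rows" = Table.Group(
        Source,
        {"Country", "Donor level"},
        {{"Total donation", each List.Sum([Donation]), type nullable number}}
    ),
    // Optional: show the largest totals first.
    #"Sorted Rows" = Table.Sort(#"Grouped Rows", {{"Total donation", Order.Descending}})
in
    #"Sorted Rows"
```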
Tip. After making the necessary changes in the Power Query Editor, don't forget to load your results into a worksheet.
That’s how to use Power Query in Excel. Now that you know the basics, go ahead and unlock more data transforming secrets to impress your boss, colleagues, and clients with your data mastery :-)
Using Excel Power Query - examples (.xlsx file)