“Group By” Is Pretty Useful. Here, I’ll Show You.

Group By might seem scary at first, but that’s because it is.

Everything is difficult before it becomes easy.

You will eventually make the leap, but for that to happen, you have to know what Group By is. You can’t leap without knowing what you are leaping into, or you will leap for an escape.

I think I’ve exhausted today’s leap quota.

Grouping Stuff Together

Say you have a collection of data, and you wish to create sub collections based on certain patterns present within the main collection.

Yes its madness, but sometimes it makes sense to do so.

Here is a nonsensical example that ought to make some sort of sense to you.

DtData

The First column contains “Items” and the Second column contains “Goods Sold Per Month”.

Since this is an introduction to Group By, we will go for something easy and just find the number of months a particular product was sold.

Confused?

I told you it was a nonsensical example.

If you look at the data, you might have noticed that Pringles appears twice.

That means, Pringles brought in sales for two months.

Cheetos and Lays appear once, which means they brought in sales only during the first month.

Are you getting an idea of what our output ought to look like?

It should be:

Time, Space and Reality. I wonder how many of you will get this.

Now that we have established our example, and the nonsensical outcome we wish to achieve, let’s get to the interesting part.

Let The Grouping Commence

There are two ways to LINQ things together.

You can use the query syntax, or lambda syntax, and we are going to explore query in this article. I’ve decided to divide that as well, because I am too lazy to write that right now you won’t be able to pay attention for that long.

Information overload is never a good thing, but its a great excuse to make for content creators like yours truly.

So lets explore our Query Syntax.

(From d In DtData.AsEnumerable
Group d By k = d(“Item”).ToString.Trim Into grp = Group
Let ra = New Object(){k, grp.Count}
Select DtResult.Rows.Add(ra)).CopyToDataTable

We will analyze the line of code highlighted above, as that is responsible for the shenanigans we are about to witness.

Group DataRows By {Key}

The d in

From d In DtData.AsEnumerable

picks up pieces from the DataTable we have minced into a collection of DataRows.

Those pieces are brought together by the Group By Method into separate collections based on the “Item” Column.

This is what our Grouped collections will look like:

Also, this is wrapped up inside of an anonymous function, who prefers to remain anonymous.

It’s a jumbled mess, but it tells us that our DataTable was minced and separated into separate portions. We can see the Pringles has two rows assigned to one portion, while Cheetos and Lays had one assigned to each.

Your favorite snacks have been wrapped up nicely.

And each Group has its own “Key”, or reference with which we can pull out the groups we desire from the jumbled up mess we pushed it into.

Or maybe we want some other info from the group, like how many elements are present in that particular group.

Where is the Key?

{Key} -> d(“Item”)

This tells us that we wish to group all elements based on the Item column. What that means is if there are duplicates present in our Item column, then all row items pertaining to that column gets wrapped up into a single group.

It’s not necessary that the rows in it’s entirety have to be duplicates for them to get grouped. The grouping is performed on the basis of the key we provide(Item Column), and the moment our complier encounters a duplicate item in that column, the entire DataRow is grouped along with the others.

Sure, they may be of different flavors, but they are all Pringles!

Refer To Me By My Alias

We use them whenever required, which happens to be a lot.
Take a look at the code given below:

(From d In DtData.AsEnumerable
Group d By k = d(“Item”).ToString.Trim Into grp = Group
Let ra = New Object(){?,?}
Select DtResult.Rows.Add(ra)).CopyToDataTable

When I first started learning Group By, the elements highlighted above kept giving me Brain damage.

Me: “Ok, we group our collection of DataRows by the mystery variable k and assign that jumbled up mess to the Item Column which makes absolutely no sense, and then into…wait, Into? Well, would you look at that, it’s going into another mystery variable…which is again assigned to a Group.

So we are grouping stuff, assigning them to mystery variables, wrapping them up into another mystery variable only to group it all over again.”

I should have stuck with Mechanical Engineering.

But that confusion is no more, since I know now, what I didn’t know back then.

I will save you from the misery of having to decipher them from scratch.

Group d By k = {Column}

When you group d By k, you are grouping d by what is being assigned to k.

Group d By d(“Item”).ToString.Trim

is the same as

Group d By k = d(“Item”).ToString

The obvious question then would be, “WHY?!”

The “k” will store our Key, that’s why.

Ohh!

Yeaah.

But…WHY?!

Lets look at the code once more.

(From d In DtData.AsEnumerable
Group d By k = d(“Item”).ToString.Trim Into grp = Group
Let ra = New Object(){?,?}
Select DtResult.Rows.Add(ra)).CopyToDataTable

And particularly, the First question mark in:

Let ra = New Object(){?,?}

If you are familiar with LINQ, you might assume that we could simply add:

Let ra = New Object(){d(“Item”).ToString,d("Items").Count}

And you would be wrong.

Once you group a collection, you cannot access the individual column values like you used to. They have been mashed up into separate portions, which is why you can’t retrieve them.

Instead, you have to use the Group Keys, which is why we added the Alias in grouping.

{k,?}

So part one is done.

Question Mark No.2

We set out on this journey to discover the number of times each element repeats.

This is where we achieve that.

grp.Count

It’s a simple bit of code, but you have to understand why it works.

Group blahblah By blah = blah(blah).Toblah Into grp = Group

We have used another alias here so that we may access the new grouped collection and get whatever we are interested in, from that entry point.

Without the alias, we are merely grouping data by the required key or keys(Yes, you can group on multiple columns as well, and we won’t go into any of that shenanigans today) and pushing it into an IGrouping which unifies everything into the jumbled up mess we witnessed earlier.

This is a slightly better representation of what is going on:

Good thing I know PPT.

They have been divided into segments, and thanks to that, we can find the count of each group simply by invoking the .Count Method.

Grp.Count
//I think we've met somewhere before...

Which brings us to the entire solution:

(From d In DtData.AsEnumerable
Group d By k = d(“Items”).ToString.Trim Into grp = Group
Let ra = New Object(){k, grp.Count}
Select DtResult.Rows.Add(ra)).CopyToDataTable

And there you have it.

Results are what matter.

In a coming article, we will attempt to develop a lambda syntax solution for this problem.

Leave a Comment