LINQuist: Select

With Where, we have learnt how to be selective about the data we want. It uses conditions to filter out data before boxing it up with the same paper wrapping it came in.

But Where is built to manipulate data, not the schema which houses it. You can’t filter a DataTable and compartmentalize it into a List without using something which was built to perform that sort of activity.

Select is that something we will use to perform that operations.

But Be Warned

Select is not as straightforward as Where, so you might want to set few minutes aside to read through this.

Download the workflow so that you can follow along.

Trust me, you won’t absorb this merely by reading it. I tried to learn using my all-seeing eyes, and all that did was invite a torrent of question marks into my head.

My all-seeing eyes were not all-retaining.

Is It Storytime?

Like I mentioned earlier, when I started exploring LINQ, I wasted too much time straining my eyes (hoping it will magically sprout a sharingan or two) in front of the laptop, instead of using my brain.

Your eyes are what collect information, and your brain is what processes them.

When you read something, you might think that you “get it”, when in fact you don’t.

Technical concepts only reveal their secrets when you put in the effort to understand them. Reading it is the first step, and exercising what you have read is what comes after that step.

You might understand how a particular concept works, but knowing where it ought to be applied is something that is hammered in through practice.

Let the Hammering Commence

The term “Projection” might not sound as intuitive, and that’s because it isn’t.

Let’s put it this way, you have a box of chocolates, and you want to select those chocolates which ooze with caramel when you bite into it.

Yum!

You can use the Where for this, but what if you wanted to separate them into a glass jar?

Where isn’t capable of creating that glass jar for you, but Select is.

Also remember, Select can’t filter data – all it does is wrap up data in a presentable manner, which is why the Select clause is usually preceded by a Where clause.

Let’s look at few examples.

We will be using the same dataset from earlier, so that it becomes easier for you to follow along.

We will focus on the “Genre” column and play around with it long enough to get a feel for the Select clause.

Again, since this is just an introduction, we won’t explore any mind-numbingly complex scenarios.

I’ll torture you with them later on.

Simple Select Operations

You can pick and choose columns from a DataTable and convert them into either a List/Array of Strings.

dt_sampleData.AsEnumerable().Select(Function(s) s("Genre").ToString).ToList()

What if I wanted more than one column?

You can either concatenate the values, or retrieve (Trauma alert) a List of Array of Strings

dt_sampleData.AsEnumerable().Select(Function(s) {s("Title").ToString, s("Genre").ToString}).ToList()

The parenthesis is important, as it encases our values into an array which our LINQ is comfortable working with.

Also, we can’t simply retain the values into a List of Strings, because although we are retrieving strings from multiple columns, those strings aren’t being concatenated but are retrieved

But none of this does us any good, since it returns EVERYTHING in that column(s).

No point in having everything,
if you can’t extract any value from it.

What if we were only interested in unique values? Is there an operation which will help us find distinct values? Could there be a Distinct clause which we could rely on?

You know the answer to that.

dt_sampleData.AsEnumerable().Select(Function(s) s("Genre").ToString).Distinct().ToList()

There might also be instances when you want the distinct values in a string, maybe you want to output that value into excel or a web application.

Its in times like those that the String’s Join operation shines.

String.Join(",",dt_sampleData.AsEnumerable().Select(Function(s) s("Genre").ToString).Distinct())

The Join operation requires two inputs – the first one tells the compiler by what character the values have to be separated by.

There are various types of Joins, which is why I specifically mentioned that the Join we are talking about belongs to the String Class.

If you don’t want the values to be separated, you can’t simply skip it – you have to provide an empty string or pass in String.Empty.

String.Join("",dt_sampleData.AsEnumerable().Select(Function(s) s("Genre").ToString).Distinct())
String.Join(String.Empty,dt_sampleData.AsEnumerable().Select(Function(s) s("Genre").ToString).Distinct())

Collecting Column Names

What, you thought Columns were safe from the Almighty Select clause?

dt_sampleData.Columns().Cast(Of DataColumn).Select(Function(s) s.ColumnName).ToList()

Before we can access the Column Names, we have to first cast the IEnumerable into DataColumn or else it will throw a tantrum.

You can piece them apart as well and retrieve a collection of column names – which can come in handy when dealing with complex scenarios which we won’t go into today.

If you are really curious, then check out my series of Intermediate Spells which you can find here.

Interesting stuff if I say so myself,
since I wrote all of them myself.

Being Selective About the Columns

No, this is not a continuation, but an introduction to another topic.

Remember, the Where is selective about the data it passes back into the DataTable, but it cannot be selective about the columns it puts back.

If you filter a DataTable consisting of ten columns, the Where clause will put those exact same ten columns back, but with equal or less row items.

It can never be more, because the Where clause shaves data, with the condition behaving as its razor.

If you want to trim off columns, then you have to use an equipment capable of cutting it down to size.

And you already know what that equipment is.

Here are are two ways you can use the Select clause to achieve the operation:

dt_sampleData.AsEnumerable().Select(Function(s) _
dt_genreFirstThreeColsResult.Clone().LoadDataRow({s(0),s(1), s(2)},False)) _
.CopyToDataTable()

And

dt_sampleData.AsEnumerable().Select(Function(s) _
dt_genreFirstThreeColsResult.Rows.Add(New Object(){s(0),s(1), s(2)})) _
.CopyToDataTable()

Query syntax also works:

(From row In dt_sampleData.AsEnumerable()
Select dt_genreFirstThreeColsResult.Clone().LoadDataRow({row(0),row(1), row(2)},False)) _
.CopyToDataTable()

And

(From row In dt_sampleData.AsEnumerable()
Select dt_genreFirstThreeColsResult.Rows.Add(New Object(){row(0),row(1), row(2)})) _
.CopyToDataTable()

What If?

Passing a condition into a Select clause won’t Where-ify it, because the Select doesn’t come with a Predicate.

Only the Where has a Predicate, which can be likened to a monster that gobbles up Boolean values like Scooby Snacks.

The Where requires those Scooby Snacks in order to operate, whereas the Select wouldn’t know what to do with it, which is why it brings it right back to us.

dt_sampleData.AsEnumerable().Select(Function(s) s("Genre").ToString.Length >5).ToArray()

I haven’t come across any situations, at least not yet, where this is useful, however it is an interesting observation.

And with that, we have covered most of the basic operations you will perform with the Select clause.

Leave a Comment