Tuesday 23 December 2014

Splitting Strings in PowerShell

I had a need to split some strings the other day, which turned out to be a little more complicated than I was expecting.  Take this string:

$string = "This is a very nice string"

Splitting this on, say, the letter 'v' is as easy as I thought it would be.

$string.Split("v")

This is a

ery nice string

Nice.  I had to split on a string though, such as "very".

$string.Split("very")

This is a



 nic
 st

ing

Uhm, what happened there?  Well, calling the Split() method as we have done actually passes an array of characters rather than a string.  This is an important distinction as we're now splitting on 'v' or 'e' or 'r' or 'y' instead of "very".  There is a simple alternative in the -split operator.
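To make that conversion visible, here is a minimal sketch that casts the separator to a character array by hand.  The cast is my addition: newer PowerShell versions built on .NET Core may bind a plain string argument to a string overload instead, so spelling out [char[]] keeps the behaviour the same wherever it runs.  The $parts variable is just an illustrative name.

```powershell
# Assumption: the same sample string used in this post.
$string = "This is a very nice string"

# Cast the separator to char[] so the split is explicitly on 'v', 'e', 'r' or 'y'.
$parts = $string.Split([char[]]"very")

# Wrap each element in quotes so the empty strings are visible.
$parts | ForEach-Object { "'$_'" }
```

The empty elements come from the separator characters sitting next to each other in "very".  The -split operator avoids the character-array conversion altogether: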

$string -split "very"

This is a

 nice string

That's the result I was hoping for!  Job done.

Not really.  .Split() was nagging away at me.  There must be a way to make that work?  Where do I find out if there is any way to call Split() with a string?  From MSDN.  MSDN is a great resource.  It's not the first place I would go to when I need a tutorial or I'm picking up a topic for the first time but if you have a good idea of what you're looking for and need a detailed reference guide, it's an excellent place.

Searching MSDN for "String.Split Method" gets me to the reference page for the method, which lists all of its overloads.


This shows that there is one overload that takes a single argument, and that argument must be a character array.  This explains the behaviour we saw above.  However, looking down that list, there are two overloads that both support string arrays as separators!  Exactly what I was looking for!

Split(String[], StringSplitOptions)
Split(String[], Int32, StringSplitOptions)
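The same list can also be pulled up without leaving the console.  As a rough sketch, a method referenced without parentheses exposes an OverloadDefinitions property; the variable names and the filter pattern below are my own, and the exact set of overloads depends on the .NET version underneath PowerShell.

```powershell
# List every overload of String.Split available in the current session.
$overloads = "".Split.OverloadDefinitions
$overloads

# Keep only the overloads whose first parameter is a string array.
$stringArrayOverloads = $overloads | Where-Object { $_ -match 'Split\(string\[\]' }
$stringArrayOverloads
```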

Both these overloads require the StringSplitOptions argument.  This has its own page on MSDN, but basically all this argument defines is whether or not we want blank entries returned in the array or left out.  The second overload also contains an Int32 argument which can be used to define the maximum number of substrings that are returned.  Note that the last substring returned holds whatever is left of the string, so a "1" there gives back the whole string unsplit.  If we only wanted the first substring, we could use a "2" and take the first element of the result.
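A quick sketch of how the Int32 argument behaves, splitting on a space to keep things readable.  The variable names are mine, and the [string[]] cast is there only to make the choice of overload explicit.

```powershell
# Assumption: the same sample string used in this post.
$string = "This is a very nice string"
$separator = [string[]]@(" ")

# A count of 1 returns a single element holding the entire string.
$whole = $string.Split($separator, 1, [System.StringSplitOptions]::None)

# A count of 2 returns the first substring plus the unsplit remainder.
$firstTwo = $string.Split($separator, 2, [System.StringSplitOptions]::None)
$first = $firstTwo[0]

"Count 1: '$($whole[0])'"
"Count 2: '$($firstTwo[0])' and '$($firstTwo[1])'"
```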

StringSplitOptions are defined as follows:

[System.StringSplitOptions]::None
[System.StringSplitOptions]::RemoveEmptyEntries

The first option returns array elements that include blank strings, the second one removes them.  For this example it doesn't matter which we use, as the split produces no blank entries, but we need to pass one to match the overloads that accept a string array.  Speaking of which, we can't just use an argument of "very" either.  You will see that there are also overloads for character arrays that take the StringSplitOptions argument.  This means that "very" will still be treated as a character array and we'll be back to square one.  We need to pass it as an array of strings.

@("very")

will do nicely.  In the end, this means we can do this:

$string.Split(@("very"), [System.StringSplitOptions]::None)

This is a

 nice string
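The two StringSplitOptions values only differ when the split actually produces blank entries, which the example above doesn't.  Here is a small sketch with a made-up string where the separator appears twice in a row; the sample text and variable names are purely illustrative.

```powershell
# Hypothetical sample: the separator "--" occurs back to back after "two".
$text = "one--two----three"
$separator = [string[]]@("--")

# None keeps the empty string found between the two adjacent separators.
$keepEmpty = $text.Split($separator, [System.StringSplitOptions]::None)

# RemoveEmptyEntries drops it.
$dropEmpty = $text.Split($separator, [System.StringSplitOptions]::RemoveEmptyEntries)

"None:               $($keepEmpty.Count) elements"
"RemoveEmptyEntries: $($dropEmpty.Count) elements"
```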

Yay!  Now, is that easier than using -split?  Well, no, it's not.  But I scratched an itch.
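One practical difference is worth a sketch, though.  The -split operator treats its delimiter as a regular expression, whereas .Split() with a string array matches the text literally.  The version string below is a made-up example, and the variable names are mine.

```powershell
# Hypothetical sample containing a character that is special in regular expressions.
$version = "1.2.3"

# '.' is a regex wildcard here, so every character is a delimiter
# and only empty strings come back.
$regexSplit = $version -split "."

# Escaping the delimiter makes -split match a literal dot.
$escaped = $version -split [regex]::Escape(".")

# The SimpleMatch option also turns off regex interpretation (0 means no limit).
$simple = $version -split ".", 0, "SimpleMatch"

# .Split() with a string array is literal to begin with.
$method = $version.Split([string[]]@("."), [System.StringSplitOptions]::None)

"Escaped:     $($escaped -join ', ')"
"SimpleMatch: $($simple -join ', ')"
".Split():    $($method -join ', ')"
```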


