One of the great things that was always a little bit fun (well, for some us… what passes for fun) is processing lists using Unix/Linux shell scripts and tools. When you are in practice, you can perform miracles; leap tall buildings in a single bound. Let’s see what we can do with a file using Powershell commands.
I received a file of “lapsed” donors. These are donors to our organization that gave to one of our campaigns in the past, but haven’t given recently. We know they were our friends in the past, and we think they still are, so we’d like to contact them for a one-time mailing, and/or add them to our master mailing lists.
The file is named “lapsed.csv”. The csv extension suggests this is a text file with a “comma separated values. And indeed, if I look at this file, at the powershell prompt, it is easily visible.
PS C:UsersLarrypowershell> cat lapsed.csv
This shows a comma delimited file with a header line:
Addressee,first,spouse,Organization,Street,City,State,ZIP
Joe Blow, Joe,,,123 W 57TH St Apt 123,New York,NY,10019
Jill Smith,Jill,Howard Services,123 Poor Farm Rd,Colchester,VT,05446
Our standard is:
Org, fname, lname, address1, address2, city, state, zip
So, among other things, we’re going to want to change the order of the information in the fields, as well as the field names.
First thing to do is to import the file into a single PowerShell variable, and then see what we’ve got.
PS>$lapsed= Import-CSV lapsed.csv
PS>$lapsed
Addressee : Joe Blow
first : Joe
spouse :
Organization : Paul C Bunn Elementary
Street : 123 W 57TH St Apt 123
City : New York
State : NY
ZIP : 10019
Addressee : Jill Smith
first : Jill
spouse :
Organization : Howard Services
Street : 123 Poor Farm Rd
City : Colchester
State : VT
ZIP : 05446
Using import-CSV, the file is converted into a series of custom objects with members that correspond to the existing field names …so further manipulations can be done using object manipulations, instead of just a bunch of searching and replacing. (Well that’s the theory anyway).
We can find out the number of records we have by looking at the count attribute.
PS>$lapsed.count
553
And we can see the “members” or field names of each record by using Get-Member
PS>$Lapsed |Get-Member
TypeName: System.Management.Automation.PSCustomObject
Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Addressee NoteProperty System.String Addressee=Joe Blow
City NoteProperty System.String City=New York
first NoteProperty System.String first=Joe
Organization NoteProperty System.String Organization=Paul C Bunn Elementary
spouse NoteProperty System.String spouse=
State NoteProperty System.String State=NY
Street NoteProperty System.String Street=123 W 57TH St Apt 123
ZIP NoteProperty System.String ZIP=10019
Ok…we knew the field names before by just looking at the raw .CSV file. But now we have seen how the Import-CSV command converts the .CSV file to an array of objects with a type of PSCustomObject. Since each address record is an object, the way we manipulate it is to use object methods.
1. Add a lname field for the last name
PS>$lapsed | Add-Member -Name “lname”
That takes the $lapsed table and pipes it to the Add-Member cmdlet. This adds the the member to EACH object in the table, rather than adding it to the table itself.
2. Add a fname and lname fields for the first and last name
PS>$lapsed | Add-Member -Name “fname” -MemberType NoteProperty -Value “”
PS>$lapsed | Add-Member -Name “lname” -MemberType NoteProperty -Value “”
Now the fields look like this:
Addressee : Joe Blow
first : Joe
spouse :
Organization : Paul C Bunn Elementary
Street : 123 W 57TH St Apt 123
City : New York
State : NY
ZIP :
lname : ""
fname : ""
3. Copy data from the first name and organization fields to their new fields.
PS>$lapsed | ForEach-Object ($_.fname) {$_.fname=$_.first}
PS>$lapsed | ForEach-Object ($_.org) {$_.org=$_.Organization}
This leaves us with a record looking like this.
Addressee : Joe Blow
first : Joe
spouse :
Organization : Paul C Bunn Elementary
Street : 123 W 57TH St Apt 123
City : New York
State : NY
ZIP : 10019
lname :
fname : Joe
org : Paul C Bunn Elementary
6. Having copied the data from the old fields to the new ones, we can delete the old fields.
There isn’t a cmdlet to remove an object member, so the you have to use a different nomenclature. (Note To Self… opportunity to make a custom cmdlet?)
PS> $lapsed | ForEach-Object ($_){$_.PsObject.Members.Remove(‘Organization’)}
PS>$lapsed | ForEach-Object ($_){$_.PsObject.Members.Remove(‘first’)}
PS>$lapsed | ForEach-Object ($_){$_.PsObject.Members.Remove(‘spouse’)}
Now a typical record is starting to look much more like what we want it to look like.
Addressee : Joe Blow
Street : 123 W 57TH St Apt 123
City : New York
State : NY
ZIP : 10019
lname :
fname : Joe
org : Paul C Bunn Elementary
7. We still need to pick out the last name from the Addressee field.
There might be a couple approaches to this using regular text search methods:
a. Given a string “Joe Blow”, we could find the first blank character, and then take anything to the right of if as our last name.
b. We could start at the right hand side and count backwards until we get to a space.
c. If there are word functions, we can choose the right-most word in the string.
d. Use the split function. This is what we’ll use.
PS> $lapsed | ForEach-Object ($_){$_.lname=$_.Addressee.split()[-1]}
8. Finally, we can eliminate the “Addressee” field
PS> $lapsed | ForEach-Object ($_){$_.PsObject.Members.Remove(‘Addressee’)}
Street : 123 W 57TH St Apt 123
City : New York
State : NY
ZIP : 10019
lname : Blow
fname : Joe
org : Paul C Bunn Elementary
9. Time to export back to a CSV file.
PS> $lapsed | Export-Csv -Confirm -Path “C:UsersLarryPowershellulapsed.csv” -NoTypeInformation
10. Almost done. The one frost is that the field order isn’t exactly as I’d like. This can be fixed with the Select-Object cmdlet.
PS > $lapsed | Select-Object -Property fname,lname,org,street,city,state,zip |
Export-CSV -Path “C:UsersLarryPowershellulapsed.csv” -NoTypeInformation
Recall when typing a string of commands with a pipeline, the pipe delimter will also act as a “newline”, so you can break the command up over the course of a couple of lines and have the full pipeline execute as one command.
Notes:
As they say on public television: “Many thanks to the following:”
“jrv” on Microsoft Technet
http://goo.gl/1Kyw07
“Root Loop” on StackOverflow
http://stackoverflow.com/questions/22029944/batch-or-powershell-how-to-get-the-last-word-from-string
More Info:
http://windowsitpro.com/powershell/csv-excel-or-sql-it-doesnt-matter-powershell
More about the split function in Powershell help
PS> Get-Help about_Split