How to Batch Extract Attachments from MSG Files using PowerShell

You can use a PowerShell script to batch extract attachments from MSG files provided you have Microsoft Outlook installed.

There are a number of utilities, many of them quite pricey, that let you batch extract attachments from email messages stored in MSG format. But if you want a quick and easy (and free) way to batch extract attachments from MSG files, you can try the PowerShell cmdlet below.

The only catch is that you must have Microsoft Outlook installed. I’ve tested this with Microsoft Outlook 2010, but it should work fine with Microsoft Outlook 2007 and 2013 too.

function Expand-MsgAttachment
{
    [CmdletBinding()]

    Param
    (
        [Parameter(ParameterSetName="Path", Position=0, Mandatory=$True)]
        [String]$Path,

        [Parameter(ParameterSetName="LiteralPath", Mandatory=$True)]
        [String]$LiteralPath,

        [Parameter(ParameterSetName="FileInfo", Mandatory=$True, ValueFromPipeline=$True)]
        [System.IO.FileInfo]$Item
    )

    Begin
    {
        # Load application
        Write-Verbose "Loading Microsoft Outlook..."
        $outlook = New-Object -ComObject Outlook.Application
    }

    Process
    {
        switch ($PSCmdlet.ParameterSetName)
        {
            "Path"        { $files = Get-ChildItem -Path $Path }
            "LiteralPath" { $files = Get-ChildItem -LiteralPath $LiteralPath }
            "FileInfo"    { $files = $Item }
        }

        $files | % {
            # Work out file names
            $msgFn = $_.FullName

            # Skip non-.msg files
            if ($msgFn -notlike "*.msg") {
                Write-Verbose "Skipping $_ (not an .msg file)..."
                return
            }

            # Extract attachments
            Write-Verbose "Extracting attachments from $_..."
            $msg = $outlook.CreateItemFromTemplate($msgFn)
            $msg.Attachments | % {
                # Work out attachment file name
                $attFn = $msgFn -replace '\.msg$', " - Attachment - $($_.FileName)"

                # Do not try to overwrite existing files
                if (Test-Path -LiteralPath $attFn) {
                    Write-Verbose "Skipping $($_.FileName) (file already exists)..."
                    return
                }

                # Save attachment
                Write-Verbose "Saving $($_.FileName)..."
                $_.SaveAsFile($attFn)

                # Output to pipeline
                Get-ChildItem -LiteralPath $attFn
            }
        }
    }

    End
    {
        Write-Verbose "Done."
    }
}

Installation

Save the code to a file called MsgUtility.psm1, and put it in its own folder called MsgUtility in your PowerShell modules folder.

On Windows Vista and newer, the PowerShell modules folder is typically found here:

C:\Users\__your_username__\Documents\WindowsPowerShell\Modules

On Windows XP, the PowerShell modules folder is typically found here:

C:\Documents and Settings\__your_username__\My Documents\WindowsPowerShell\Modules

If the PowerShell modules folder doesn’t exist, create it.

Usage

To extract attachments from all of the MSG files in the current directory, import the new module and run Expand-MsgAttachment as follows:

Import-Module MsgUtility
Expand-MsgAttachment *

And of course you can pipe input files or specify a single file or a wildcard pattern as you can with other PowerShell cmdlets:

Expand-MsgAttachment "Email from * to me dated *.msg"
Get-ChildItem -Recurse | Expand-MsgAttachment

Tags: MSG, attachments, PowerShell

How to Convert MSG Files to DOC Files using PowerShell

You can use a PowerShell script to batch convert emails in MSG format to DOC files provided you have Microsoft Word and Microsoft Outlook installed.

There are a number of utilities, many of them quite pricey, that let you convert email messages stored in MSG format to other file formats. But if you want a quick and easy (and free) way to convert MSG files to DOC files, you can try the PowerShell cmdlet below.

Since a large number of apps can handle DOC files, once the emails are in DOC format it is relatively simple to convert them to other file formats. For example, you could use Adobe Acrobat or other PDF software to convert the DOC files to PDFs.

The only catch is that you must have Microsoft Word and Microsoft Outlook installed. I’ve tested this with Microsoft Office 2010, but it should work fine with Microsoft Office 2007 and 2013 too.

function ConvertFrom-MsgToDoc
{
    [CmdletBinding()]

    Param
    (
        [Parameter(ParameterSetName="Path", Position=0, Mandatory=$True)]
        [String]$Path,

        [Parameter(ParameterSetName="LiteralPath", Mandatory=$True)]
        [String]$LiteralPath,

        [Parameter(ParameterSetName="FileInfo", Mandatory=$True, ValueFromPipeline=$True)]
        [System.IO.FileInfo]$Item
    )

    Begin
    {
        # OlSaveAsType constants
        $olTXT = 0
        $olRTF = 1
        $olTemplate = 2
        $olMSG = 3
        $olDoc = 4

        # WdPaperSize constants
        $wdPaper10x14 = 0
        $wdPaper11x17 = 1
        $wdPaperLetter = 2
        $wdPaperLetterSmall = 3
        $wdPaperLegal = 4
        $wdPaperExecutive = 5
        $wdPaperA3 = 6
        $wdPaperA4 = 7
        $wdPaperA4Small = 8
        $wdPaperA5 = 9
        $wdPaperB4 = 10
        $wdPaperB5 = 11
        $wdPaperCSheet = 12
        $wdPaperDSheet = 13
        $wdPaperESheet = 14
        $wdPaperFanfoldLegalGerman = 15
        $wdPaperFanfoldStdGerman = 16
        $wdPaperFanfoldUS = 17
        $wdPaperFolio = 18
        $wdPaperLedger = 19
        $wdPaperNote = 20
        $wdPaperQuarto = 21
        $wdPaperStatement = 22
        $wdPaperTabloid = 23
        $wdPaperEnvelope9 = 24
        $wdPaperEnvelope10 = 25
        $wdPaperEnvelope11 = 26
        $wdPaperEnvelope12 = 27
        $wdPaperEnvelope14 = 28
        $wdPaperEnvelopeB4 = 29
        $wdPaperEnvelopeB5 = 30
        $wdPaperEnvelopeB6 = 31
        $wdPaperEnvelopeC3 = 32
        $wdPaperEnvelopeC4 = 33
        $wdPaperEnvelopeC5 = 34
        $wdPaperEnvelopeC6 = 35
        $wdPaperEnvelopeC65 = 36
        $wdPaperEnvelopeDL = 37
        $wdPaperEnvelopeItaly = 38
        $wdPaperEnvelopeMonarch = 39
        $wdPaperEnvelopePersonal = 40
        $wdPaperCustom = 41

        # Load applications
        Write-Verbose "Loading Microsoft Outlook..."
        $outlook = New-Object -ComObject Outlook.Application

        Write-Verbose "Loading Microsoft Word..."
        $word = New-Object -ComObject Word.Application

        # Disable signature
        $signaturesPath = Join-Path $env:APPDATA "Microsoft\Signatures"
        Get-ChildItem $signaturesPath | % {
            Rename-Item $_.FullName ("_" + $_.Name)
        }
    }

    Process
    {
        switch ($PSCmdlet.ParameterSetName)
        {
            "Path"        { $files = Get-ChildItem -Path $Path }
            "LiteralPath" { $files = Get-ChildItem -LiteralPath $LiteralPath }
            "FileInfo"    { $files = $Item }
        }

        $files | % {
            # Work out file names
            $msgFn = $_.FullName
            $docFn = $msgFn -replace '\.msg$', '.doc'

            # Skip non-.msg files
            if ($msgFn -notlike "*.msg") {
                Write-Verbose "Skipping $_ (not an .msg file)..."
                return
            }

            # Do not try to overwrite existing files
            if (Test-Path -LiteralPath $docFn) {
                Write-Verbose "Skipping $_ (.doc already exists)..."
                return
            }

            # Extract message body
            Write-Verbose "Extracting message body from $_..."
            $msg = $outlook.CreateItemFromTemplate($msgFn)
            $msg.SaveAs($docFn, $olDoc)

            # Convert to A4
            Write-Verbose "Converting page size to A4..."
            $doc = $word.Documents.Add($docFn)
            $doc.PageSetup.PaperSize = $wdPaperA4
            $doc.SaveAs([ref]$docFn)
            $doc.Close()

            # Output to pipeline
            Get-ChildItem -LiteralPath $docFn
        }
    }

    End
    {
        # Enable signatures
        Get-ChildItem $signaturesPath | % {
            Rename-Item $_.FullName $_.Name.Substring(1)
        }

        # Quit Word so the background process does not linger
        $word.Quit()

        Write-Verbose "Done."
    }
}

Installation

Save the code to a file called MsgUtility.psm1, and put it in its own folder called MsgUtility in your PowerShell modules folder.

On Windows Vista and newer, the PowerShell modules folder is typically found here:

C:\Users\__your_username__\Documents\WindowsPowerShell\Modules

On Windows XP, the PowerShell modules folder is typically found here:

C:\Documents and Settings\__your_username__\My Documents\WindowsPowerShell\Modules

If the PowerShell modules folder doesn’t exist, create it.

Usage

To convert all of the MSG files in the current directory, import the new module and run ConvertFrom-MsgToDoc as follows:

Import-Module MsgUtility
ConvertFrom-MsgToDoc *

And of course you can pipe input files or specify a single file or a wildcard pattern as you can with other PowerShell cmdlets:

ConvertFrom-MsgToDoc "Email from * to me dated *.msg"
Get-ChildItem -Recurse | ConvertFrom-MsgToDoc

Tags: MSG, DOC, PowerShell

How to Convert EML Files to HTML using PowerShell

You can use a PowerShell script to batch convert emails in EML format to HTML files.

There are a number of utilities, many of them quite pricey, that let you convert email messages stored in EML format to other file formats. But if you want a quick and easy (and free) way to convert EML files to HTML files, you can try the PowerShell cmdlet below.

Since there are a larger number of apps that deal with HTML files, once the emails are in HTML format, it is a relatively simple task to convert them to other file formats. For example, you could use Adobe Acrobat or other PDF software to convert the HTML files to PDFs.

function ConvertFrom-EmlToHtml
{
    [CmdletBinding()]

    Param
    (
        [Parameter(ParameterSetName="Path", Position=0, Mandatory=$True)]
        [String]$Path,

        [Parameter(ParameterSetName="LiteralPath", Mandatory=$True)]
        [String]$LiteralPath,

        [Parameter(ParameterSetName="FileInfo", Mandatory=$True, ValueFromPipeline=$True)]
        [System.IO.FileInfo]$Item
    )

    Process
    {
        switch ($PSCmdlet.ParameterSetName)
        {
            "Path"        { $files = Get-ChildItem -Path $Path }
            "LiteralPath" { $files = Get-ChildItem -LiteralPath $LiteralPath }
            "FileInfo"    { $files = $Item }
        }

        $files | % {
            # Work out file names
            $emlFn  = $_.FullName
            $htmlFn = $emlFn -replace '\.eml$', '.html'

            # Skip non-.eml files
            if ($emlFn -notlike "*.eml") {
                Write-Verbose "Skipping $_ (not an .eml file)..."
                return
            }

            # Do not try to overwrite existing files
            if (Test-Path -LiteralPath $htmlFn) {
                Write-Verbose "Skipping $_ (.html already exists)..."
                return
            }

            # Read EML
            Write-Verbose "Reading $_..."
            $adoDbStream = New-Object -ComObject ADODB.Stream
            $adoDbStream.Open()
            $adoDbStream.LoadFromFile($emlFn)
            $cdoMessage = New-Object -ComObject CDO.Message
            $cdoMessage.DataSource.OpenObject($adoDbStream, "_Stream")

            # Generate HTML
            Write-Verbose "Generating HTML..."
            $html = "<!DOCTYPE html>`r`n"
            $html += "<html>`r`n"
            $html += "<head>`r`n"
            $html += "<meta charset=`"utf-8`">`r`n"
            $html += "<title>" + $cdoMessage.Subject + "</title>`r`n"
            $html += "</head>`r`n"
            $html += "<body style=`"font-family: sans-serif; font-size: 11pt`">`r`n"
            $html += "<div style=`"margin-bottom: 1em;`">`r`n"
            $html += "<strong>From: </strong>" + $cdoMessage.From + "<br>`r`n"
            $html += "<strong>Sent: </strong>" + $cdoMessage.SentOn + "<br>`r`n"
            $html += "<strong>To: </strong>" + $cdoMessage.To + "<br>`r`n"
            if ($cdoMessage.CC -ne "") {
                $html += "<strong>Cc: </strong>" + $cdoMessage.CC + "<br>`r`n"
            }
            if ($cdoMessage.BCC -ne "") {
                $html += "<strong>Bcc: </strong>" + $cdoMessage.BCC + "<br>`r`n"
            }
            $html += "<strong>Subject: </strong>" + $cdoMessage.Subject + "<br>`r`n"
            $html += "</div>`r`n"
            if ($cdoMessage.HTMLBody -ne "") {
                $html += "<div>`r`n"
                $html += $cdoMessage.HTMLBody + "`r`n"
                $html += "</div>`r`n"
            } else {
                $html += "<div><pre>"
                $html += $cdoMessage.TextBody
                $html += "</pre></div>`r`n"
            }
            $html += "</body>`r`n"
            $html += "</html>`r`n"

            # Write HTML
            Write-Verbose "Saving HTML..."
            # Write as UTF-8 to match the charset declared in the <meta> tag
            Set-Content -LiteralPath $htmlFn -Value $html -Encoding UTF8

            # Output to pipeline
            Get-ChildItem -LiteralPath $htmlFn
        }
    }

    End
    {
        Write-Verbose "Done."
    }
}

Installation

Save the code to a file called EmlUtility.psm1, and put it in its own folder called EmlUtility in your PowerShell modules folder.

On Windows Vista and newer, the PowerShell modules folder is typically found here:

C:\Users\__your_username__\Documents\WindowsPowerShell\Modules

On Windows XP, the PowerShell modules folder is typically found here:

C:\Documents and Settings\__your_username__\My Documents\WindowsPowerShell\Modules

If the PowerShell modules folder doesn’t exist, create it.

Usage

To convert all of the EML files in the current directory, import the new module and run ConvertFrom-EmlToHtml as follows:

Import-Module EmlUtility
ConvertFrom-EmlToHtml *

And of course you can pipe input files or specify a single file or a wildcard pattern as you can with other PowerShell cmdlets:

ConvertFrom-EmlToHtml "Email from * to me dated *.eml"
Get-ChildItem -Recurse | ConvertFrom-EmlToHtml

Tags: EML, HTML, PowerShell

Codehire Cup Solutions: Grand Final

A solution in Ruby to the Grand Final problem in the inaugural Codehire Cup held in September 2012 at the University of Adelaide.

In September 2012, Codehire held the inaugural Codehire Cup in Adelaide, featuring some relatively simple coding problems. Contestants could choose to use C#, Java, JavaScript, PHP, or Ruby to submit solutions through a web-based interface. The fastest contestant won.

The Problem

The Grand Final problem read:

John meets a girl in a bar. He doesn’t know her but she knows him. She says that they have exactly 1 single Facebook friend in common.

Write a program to help John narrow down the search for this girl. You have access to a friends graph in your input string.

You need to find all of the users that have exactly 1 friend in common with John.

Each friend is represented by a number with John’s number being 1. A connection (or friendship) is represented by two user numbers separated by a hyphen:

For example, users 1 and 2 are friends:

1-2

Use the input to find all users that have exactly 1 friend in common with John.

For example:

1-2,1-3,1-4,1-5,2-4,2-7,3-4,3-5,3-8,4-8,5-7,5-6,5-8,6-7,6-8,7-8

Should return the list of matching users in ascending order.

2,5,6

The Solution

The solution to this problem involves two simple steps. First, we parse the input string to get an array for each person containing that person’s friends. Then, we can compute the intersections of the array of John’s friends and each of the other arrays of friends using the built-in Ruby & operator, and select the people where the size of that intersection is 1.

map = Hash.new([])
input.scan /(\d+)-(\d+)/ do |a, b|
    map[a] += [b]
    map[b] += [a]
end

john = map.delete('1')
map.select! { |k, v| (v & john).size == 1 }
# Keys are strings, so sort numerically (otherwise "10" would sort before "2")
output << map.keys.sort_by(&:to_i).join(',')
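
The snippet above relies on the `input` and `output` variables supplied by the contest interface. As a rough, self-contained sketch of the same two steps, the logic can be wrapped in a method; the name `friends_with_one_in_common` is hypothetical, and user numbers are converted to integers here as an assumption so that the final sort is numeric.

```ruby
# Hypothetical helper (not part of the contest harness): returns the users
# who share exactly one friend with John (user 1), in ascending order,
# as a comma-separated string.
def friends_with_one_in_common(input)
  friends = Hash.new { |hash, key| hash[key] = [] }

  # Step 1: record every friendship in both directions.
  input.scan(/(\d+)-(\d+)/) do |a, b|
    friends[a.to_i] << b.to_i
    friends[b.to_i] << a.to_i
  end

  # Step 2: intersect John's friends with everyone else's friends.
  john = friends.delete(1) || []
  matches = friends.select { |_user, list| (list & john).size == 1 }
  matches.keys.sort.join(',')
end

puts friends_with_one_in_common('1-2,1-3,1-4,1-5,2-4,2-7,3-4,3-5,3-8,4-8,5-7,5-6,5-8,6-7,6-8,7-8')
# prints 2,5,6
```

Taking the string as an argument and returning the result makes the method easy to run against the sample input from the problem statement.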

Tags: Codehire Cup, coding contests

Codehire Cup Solutions: Cup Semi-final

A solution in Ruby to the Cup Semi-final problem in the inaugural Codehire Cup held in September 2012 at the University of Adelaide.

In September 2012, Codehire held the inaugural Codehire Cup in Adelaide, featuring some relatively simple coding problems. Contestants could choose to use C#, Java, JavaScript, PHP, or Ruby to submit solutions through a web-based interface. The fastest contestant won.

The Problem

The Cup Semi-final problem read:

Calculate the result of the expression.

The input string will be random but will only contain the numbers one through nine and the plus and minus operators. No operator precedence rules need be applied.

The input may include up to 10 operators.

Your result should simply be an Integer cast to a string.

Example:

five plus four plus six minus seven

Result:

8

The Solution

The problem is made trivial by the built-in Ruby eval method: we simply need to replace ‘plus’ with ‘+’, ‘minus’ with ‘-’, and each digit specified as a word with the same digit as a numeral.

nums = 'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine'

str = input.gsub('plus', '+').gsub('minus', '-')
nums.each_with_index do |s, i|
    str.gsub!(s, i.to_s)
end
output << eval(str).to_s
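
Because eval runs whatever string it is given, it may be worth checking the translated expression before evaluating it. The following is only a sketch under that assumption: the method name `evaluate_words` and the validation pattern are hypothetical additions, not part of the contest solution.

```ruby
WORDS = %w[zero one two three four five six seven eight nine].freeze

# Hypothetical helper: translates the words into an arithmetic expression,
# checks that only single digits joined by + or - remain, then evaluates it.
def evaluate_words(input)
  expr = input.gsub('plus', '+').gsub('minus', '-')
  WORDS.each_with_index { |word, digit| expr = expr.gsub(word, digit.to_s) }

  # Refuse anything that is not of the form "d op d op d ...".
  unless expr.match?(/\A\d(?:\s*[+-]\s*\d)*\z/)
    raise ArgumentError, "unexpected input: #{input.inspect}"
  end

  eval(expr).to_s
end

puts evaluate_words('five plus four plus six minus seven')
# prints 8
```

In a timed contest the extra check is probably unnecessary, but it keeps eval from ever seeing anything other than digits and the two operators.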

Another Solution

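If we didn’t have the built-in Ruby eval method, we’d have to be more clever. We can take the first number as the starting value, then add every number that follows ‘plus’ and subtract every number that follows ‘minus’. Since only addition and subtraction are involved, the order in which the terms are applied doesn’t matter.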

nums = 'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine'

res = nums.find_index input[/^[a-z]+/]
input.scan(/plus ([a-z]+)/)  { |m| res += nums.find_index m[0] }
input.scan(/minus ([a-z]+)/) { |m| res -= nums.find_index m[0] }
output << res.to_s
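
A variation on this idea is to walk the expression strictly from left to right, applying each operator to a running total. The sketch below assumes the input is always a number word followed by operator and number pairs separated by spaces; the method name `evaluate_left_to_right` is hypothetical.

```ruby
NUMS = %w[zero one two three four five six seven eight nine].freeze

# Hypothetical helper: evaluates "word op word op word ..." left to right.
def evaluate_left_to_right(input)
  first, *rest = input.split
  total = NUMS.index(first)

  # The remaining tokens come in (operator, number word) pairs.
  rest.each_slice(2) do |operator, word|
    value = NUMS.index(word)
    total = operator == 'plus' ? total + value : total - value
  end

  total.to_s
end

puts evaluate_left_to_right('five plus four plus six minus seven')
# prints 8
```

This version reads the input once instead of scanning it separately for each operator, which may be easier to extend if more operators were ever added.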

Tags: Codehire Cup, coding contests