Tuesday, October 15, 2013

PowerShell: Preventing Arrays from Acting like Scalars

PowerShell, like many scripting languages, plays fast and loose with types. This can be problematic with arrays, because PowerShell will silently treat a value the script expects to be an array as a scalar. The problem is exacerbated by PowerShell's use of the + operator to append items to an array: if the variable holds a scalar rather than an array, + means addition instead of append.

This write-up demonstrates how to ensure arrays behave as arrays and not scalars. It covers the simple case of arrays created programmatically and the more interesting cases of arrays created through de-serialization via cmdlets and arrays returned by functions.

Consider the script that a C++ or C# developer would likely write, in which the variable's type is supposedly specified by [Array] $BadArray:

[Array] $BadArray

$BadArray += 1
$BadArray += 2
$BadArray += 3

Write-Host "BadArray: $BadArray"

Note that the line "[Array] $BadArray" does not declare a variable of type Array. It is a cast expression applied to the variable's current (null) value, and the variable itself remains unassigned.

The previous code executes without an error, and one would expect the += operator to append the values 1, 2, and 3 as elements of the array. The output of this script should be an array with elements 1, 2, 3, but instead it is the scalar value 6 (the script prints "BadArray: 6"):


Clearly, the variable $BadArray was treated as a scalar integer as 1 + 2 + 3 = 6.
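The difference can be confirmed by checking the type of each variable after the += operations. The following is a minimal sketch; the variable names $Sum and $List are illustrative and are not part of the scripts in this write-up:

```powershell
# $Sum starts out as $null, which is what an unassigned variable holds.
$Sum = $null
$Sum += 1
$Sum += 2
$Sum += 3

# $List starts out as an empty array.
$List = @()
$List += 1
$List += 2
$List += 3

Write-Host "Sum is array:  $($Sum -is [array])"    # False
Write-Host "Sum value:     $Sum"                   # 6
Write-Host "List is array: $($List -is [array])"   # True
Write-Host "List count:    $($List.Count)"         # 3
```

Because $null += 1 produces the integer 1, every later += on $Sum is integer addition.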

The correct way to declare an array is to assign an empty array to the variable using @(), as is the case with the variable $GoodArray below:

$GoodArray = @()

$GoodArray += 1
$GoodArray += 2
$GoodArray += 3

Write-Host "GoodArray: $GoodArray"

The output generated is as expected, an array containing elements 1, 2, and 3 (the script prints "GoodArray: 1 2 3"):


To make things interesting, an array of two elements is exported to an XML file using cmdlet Export-Clixml and then re-imported as an array using cmdlet Import-Clixml. After the array is de-serialized an additional element is appended to the end of the array ($BadFromDiskArray += 3):


$ToDiskArray = @()
$ToDiskArray += 1
$ToDiskArray += 2
$SerializationFile = 'D:\Blog\SerializedData.xml'

$ToDiskArray | Export-Clixml $SerializationFile
$BadFromDiskArray = Import-Clixml $SerializationFile
$BadFromDiskArray += 3
Write-Host "BadFromDiskArray: $BadFromDiskArray"

The output from the previous script ("BadFromDiskArray: 1 2 3") shows the array being serialized and de-serialized correctly. This works because the array had more than one element when it was serialized:


The previous example behaved correctly. The twist occurs when the serialized array contains one element. The snippet below shows an array with one element being serialized and de-serialized. Once the array is de-serialized, two elements are appended to the array with the += operator (+= 2 and += 3):

$SerializationFile = 'D:\Blog\SerializedData.xml'

$ToDiskArray = @()
$ToDiskArray += 1
$ToDiskArray | Export-Clixml $SerializationFile
$BadFromDiskArray = Import-Clixml $SerializationFile
$BadFromDiskArray += 2
$BadFromDiskArray += 3
Write-Host "BadFromDiskArray: $BadFromDiskArray"

When the previous script is executed, the de-serialized value comes back as a scalar because the array had only one element, so the += operations add rather than append and the script prints "BadFromDiskArray: 6":

This is the worst kind of developer nightmare: the code works correctly some of the time and fails at other times. The solution is to never assign the return value of a cmdlet directly to a variable that is meant to hold an array. The following code is not correct:

$BadFromDiskArray = Import-Clixml $SerializationFile
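To see why this assignment is fragile, the type of the imported value can be inspected for a two-element and a one-element array. This is a sketch only; the temporary file name and the variables $Two and $One are assumptions made for the illustration:

```powershell
# Hypothetical scratch file in the temp directory.
$TempFile = Join-Path ([System.IO.Path]::GetTempPath()) 'ArrayTypeDemo.xml'

# Two elements: Import-Clixml emits two objects, which assignment collects into an array.
@(1, 2) | Export-Clixml $TempFile
$Two = Import-Clixml $TempFile

# One element: Import-Clixml emits a single object, which assignment stores as a scalar.
@(1) | Export-Clixml $TempFile
$One = Import-Clixml $TempFile

Remove-Item $TempFile

Write-Host "Two is array: $($Two -is [array])"   # True
Write-Host "One is array: $($One -is [array])"   # False
```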

The correct way to handle arrays is to always append the results of the cmdlet or function to an existing array using the += operator. When the left-hand side is an array and the right-hand side is a scalar, += appends the scalar to the array. When both sides are arrays, += appends the elements of the right-hand array to the end of the left-hand array.
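Both forms of += described above can be seen in a few lines. This is a small sketch with an illustrative variable name, $Items:

```powershell
$Items = @()

# Array on the left, scalar on the right: the scalar is appended.
$Items += 1

# Array on the left, array on the right: the right-hand elements are appended.
$Items += @(2, 3)

Write-Host "Items: $Items"                 # Items: 1 2 3
Write-Host "Count: $($Items.Count)"        # Count: 3
```

In both cases the result is still an array, which is why starting from @() is safe regardless of what the right-hand side turns out to be.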

The following snippet shows a single-element array being serialized and de-serialized. The de-serialized value may be an array or may be a scalar. Either way, the += operator on the Import-Clixml line appends the value or values returned by Import-Clixml to the array on the left:

$ToDiskArray = @()
$ToDiskArray += 1
$SerializationFile = 'D:\Blog\SerializedData.xml'
$ToDiskArray | Export-Clixml $SerializationFile

$GoodFromDiskArray = @()
$GoodFromDiskArray += Import-Clixml $SerializationFile
$GoodFromDiskArray += 2
$GoodFromDiskArray += 3
Write-Host "GoodFromDiskArray: $GoodFromDiskArray"

The results of this script show the variable behaving as an array and not as a scalar: the output lists all of the elements ("GoodFromDiskArray: 1 2 3").
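The append pattern gives an array whether the file holds one element or several. The following sketch checks both cases; the temporary file name and the variables $FromOne and $FromTwo are assumptions made for the illustration:

```powershell
# Hypothetical scratch file in the temp directory.
$TempFile = Join-Path ([System.IO.Path]::GetTempPath()) 'AppendDemo.xml'

# One element on disk, appended to an empty array.
@(1) | Export-Clixml $TempFile
$FromOne = @()
$FromOne += Import-Clixml $TempFile

# Two elements on disk, appended to an empty array.
@(1, 2) | Export-Clixml $TempFile
$FromTwo = @()
$FromTwo += Import-Clixml $TempFile

Remove-Item $TempFile

Write-Host "FromOne is array: $($FromOne -is [array]), count: $($FromOne.Count)"   # True, count: 1
Write-Host "FromTwo is array: $($FromTwo -is [array]), count: $($FromTwo.Count)"   # True, count: 2
```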


Using += applies to both cmdlets and functions. Consider the following function that returns an array of elements based on the input parameter $numberOfElements:

function CreateAnArray($numberOfElements)
{
    $ReturnValue = @()
    for ($i = 0; $i -lt $numberOfElements; $i++)
    {
        $ReturnValue += $i + 1
    }
    
    return $ReturnValue
}   

The previous function behaves correctly. The array is created as an empty array correctly ($ReturnValue = @()) and each element is appended to the array correctly using operator += ($ReturnValue += $i + 1).

The following code exhibits incorrect behavior because the return value of the function is assigned directly to the variable $EvilFromFunctionArray:

$EvilFromFunctionArray = CreateAnArray 1
$EvilFromFunctionArray += 2
$EvilFromFunctionArray += 3
Write-Host "EvilFromFunctionArray: $EvilFromFunctionArray"

The previous code will display a value of 6. PowerShell unrolls the single-element array returned by the function, so the assignment leaves $EvilFromFunctionArray holding a scalar rather than an array.
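The unrolling can be observed by checking what the function's return value looks like after assignment for different element counts. The sketch below repeats the CreateAnArray function so that it runs on its own; the variables $Zero, $OneElement, and $TwoElements are illustrative:

```powershell
function CreateAnArray($numberOfElements)
{
    $ReturnValue = @()
    for ($i = 0; $i -lt $numberOfElements; $i++)
    {
        $ReturnValue += $i + 1
    }

    return $ReturnValue
}

$Zero        = CreateAnArray 0   # nothing is emitted, so the variable is null
$OneElement  = CreateAnArray 1   # a single object is emitted, so the variable is a scalar
$TwoElements = CreateAnArray 2   # two objects are emitted, so the variable is an array

Write-Host "Zero is null:         $($null -eq $Zero)"             # True
Write-Host "OneElement is array:  $($OneElement -is [array])"     # False
Write-Host "TwoElements is array: $($TwoElements -is [array])"    # True
```

Only the two-element call yields an array on direct assignment, which matches the intermittent behavior seen with Import-Clixml.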

The following code exhibits correct behavior: the variable $GoodFromFunctionArray is created as an empty array, and the return value of the function is appended to it:

$GoodFromFunctionArray = @()
$GoodFromFunctionArray += CreateAnArray 1
$GoodFromFunctionArray += 2
$GoodFromFunctionArray += 3
Write-Host "GoodFromFunctionArray: $GoodFromFunctionArray"

The previous code will display the elements 1, 2, and 3 because the return value of the CreateAnArray function is appended to the array variable $GoodFromFunctionArray.

The lesson learned: with arrays, append, don't assign.
