One of the tasks that less experienced programmers find really difficult is to sort multidimensional arrays by one of the values in the child arrays. In today's tutorial I'm going to show you 3 ways of achieving that - you decide which ones seems easier for you and provides the best performance.
Let's assume a scenario where we have a huge array like the one below and we need to sort it by a value contained in the last child array - in our case the value of the 'weight' key.
<?php
$array = [
[
['name'=>'John B'],
['age'=>30],
['sizes'=>
[
'weight'=>80,
'height'=>120
]
]
],
[
['name'=>'Marie B'],
['age'=>31],
['sizes'=>
[
'weight'=>60,
'height'=>110
]
]
],
[
['name'=>'Carl M'],
['age'=>12],
['sizes'=>
[
'weight'=>70,
'height'=>100
]
]
],
[
['name'=>'Mike N'],
['age'=>19],
['sizes'=>
[
'weight'=>70,
'height'=>150
]
]
],
[
['name'=>'Nancy N'],
['age'=>15],
['sizes'=>
[
'weight'=>60,
'height'=>150
]
]
],
[
['name'=>'Cory X'],
['age'=>15],
['sizes'=>
[
'weight'=>44,
'height'=>150
]
]
]
];
?>
usort
and a simple callback function (Recommended).
This is a really simple method - usort automatically loops through each combination of two elements of the array, compares whatever you want it to compare in the callback function, and by returning -1 or 1 respectively, it assigns the appropriate index to the given element in the resulting array. In our case, if the 'weight' in $a
is smaller or equal than the 'weight' in $b
, the content of $a
will end up having a smaller index in the array to be ordered. Here's the code:
<?php
//Method1: sorting the array using the usort function and a "callback that you define"
function method1($a,$b)
{
return ($a[2]["sizes"]["weight"] <= $b[2]["sizes"]["weight"]) ? -1 : 1;
}
usort($array, "method1");
print_r($array);
?>
The bubble method is something people used to learn in computer science back in the 90s. It is known to be a slow method of reordering as it needs to loop through the array multiple times.
The bubble method checks each set of adjacent values and if the required condition is met it swaps their positions. This action is then performed again and again until no reordering is necessary in a full array traversal. In this example the array is traversed until all weights found at key $j are smaller than the weights found at $j+1. Have a look:
<?php
//Method 2: The bubble method
$j=0;
$flag = true;
$temp=0;
while ( $flag )
{
$flag = false;
for( $j=0; $j < count($array)-1; $j++)
{
if ( $array[$j][2]["sizes"]["weight"] > $array[$j+1][2]["sizes"]["weight"] )
{
$temp = $array[$j];
//swap the two between each other
$array[$j] = $array[$j+1];
$array[$j+1]=$temp;
$flag = true; //show that a swap occurred
}
}
}
print_r($array);
?>
Some programmers simply prefer writing their own code for such tasks so I thought I would give you an example of custom code for achieving the same result.
The way I designed it is pretty straightforward:
First create a temporary empty array, then loop through the array that needs to be ordered and replace its existing keys with the values of what you are ordering by. Please note that you also need to make sure that the new keys are unique so you don't lose any elements. In my case - I decided to use the old keys and append them to the new ones so the resulting keys are something like: "new_key" . "some_string" . "old_unique_key". Then I sort the temporary array by key using the built in ksort()
function and use array values to re-index the array numerically. Here's the code:
Please note that the code below will not work correctly in some cases. For example, if one 'weight' would be more than or equal to 100, then the ksort
function would fail to give the right results. You can use ksort($temp, SORT_NATURAL)
in such a case.
<?php
//Method3: DIY
$temp = [];
foreach ($array as $key => $value)
$temp[$value[2]["sizes"]["weight"] . "oldkey" . $key] = $value; //concatenate something unique to make sure two equal weights don't overwrite each other
ksort($temp); // or ksort($temp, SORT_NATURAL); see paragraph above to understand why
$array = array_values($temp);
unset($temp);
print_r($array);
?>
array_multisort()
way...
Another option for sorting multi-dimensional arrays is to use a built in function called array_multisort()
. While it can also be used to sort several independent arrays at once, its real power comes into play because it provides an off the shelf method to sort a multi-dimensional array by one or more "child" dimensions.
Combining array_multisort()
with array_map()
and its callback for obtaining an array of weights, allows for this complex sorting to happen in only one call.
<?php
array_multisort(array_map(function($element) {
return $element[2]['sizes']['weight'];
}, $array), SORT_ASC, $array);
Because this last method doesn't look as easy as the others... I will explain what it does:
array_map
applies a callback function on each element of the input, thus generating a numerically indexed array of weights.array_multisort
reorders the array of weights and it also applies it to the input array.Many of you have asked me which one of these methods is best. This led me to writing a quick and dirty performance and memory usage test with different data sizes. This should give you an idea on what to use depending on your server specs and array size.
The test code can be found on my git repo as a snippet.
count($array)=1000
& memory_limit=128M
As expected, the algorithm complexity doesn't really matter when dealing with small amounts of data. Both execution time and memory usage are very low. Even so, we can already observe that the bubble-sort method (Method 2)
is the slowest. This led me to introduce a time limit of 20 seconds after which, the script will just bail out of trying to resolve the sorting using that method. Another early conclusion is that Method 3
will use more memory because of the necessity to build the temporary array which is used for sorting.
count($array)=10.000
& memory_limit=128M
Stepping up the array size to 10.000 elements only confirms the initial findings. The bubble sort
method has already failed whereas method 3
in which we assign the sorting column as the key of a temporary array seems the fastest but uses the most memory.
count($array)=100.000
& memory_limit=128M
Pushing our test even further but keeping the maximum allocated memory to 128M makes the test fail which was to be expected so from this point on I decided to just give the script as much memory as it needed so I set memory_limit
at 2GB.
count($array)=100.000
& memory_limit=2G
This is where things start to get really interesting because the usort() (method1)
method and the array_map()/array_multisort() (method4)
combination are on par when it comes to memory usage but it looks like method4
is a bit faster than the first. Again, while using a bit more memory, the temporary array path seems to be the way to go.
count($array)=1.000.000
& memory_limit=2G
Pushing this to the limit provides us certainty that previous conclusions are accurate:
Method 3
(the technique in which we create a temporary array holding the values to be sorted) is the fastest but uses an average of 4%
more memory.
Method 4
(the use of array_multisort() in combination with array_map()) is the second fastest in fact almost twice as fast as the usort()
method, while using the same amount of memory.
Method 1
(usort() + callback function) is slow when it comes to dealing with large amounts of data on this scenario.
Method 2
(the bubble method) is clearly useful only when dealing with very small amounts of data as it has a worst case complexity of О(n2)
.
Because the test doesn't care about weights being unique, nor does it care about what happens when 2 weights are equal, in some cases the resulting arrays will not be equal.
After writing about the methods one could employ to sort a multidimensional array by a child key or value, coders have asked me about the possibility to restructure "flat" arrays into an array that has a hierarchical structure and then sort it using the methods described above.
Let's look at the example below in which a series of William Shakespeare plays is fetched from an external data source.
$books = [
["title" => "Romeo and Juliet", "genre" => "tragedy", "noofpages"=> "100"],
["title" => "Hamlet", "genre" => "tragedy", "noofpages"=> "200"],
["title" => "A Midsummer Night's Dream", "genre" => "comedy", "noofpages"=> "50"],
["title" => "Mcabeth", "genre" => "tragedy", "noofpages"=> "180"],
["title" => "Merchant of Venice", "genre" => "comedy", "noofpages"=> "40"],
];
The task is to group the books by genre and then sort each sub-array by the number of pages of each play. This would be achieved by looping through the array and pushing elements based on their genre into a new list. One important aspect is that you need to check if the sub array for a particular genre already exists and if it doesn't create an empty one prior to pushing an item to it.
$books_by_genre = [];
foreach($books as $key => $book) {
if(!isset($books_by_genre[$book['genre']])) {
$books_by_genre[$book['genre']] = [];
}
array_push($books_by_genre[$book['genre']], $book);
}
var_dump($books_by_genre);
Then, the solution is to loop through the newly generated array and sort it by the value of noofpages
.
foreach ($books_by_genre as $key => &$genre) {
usort($genre, function($a,$b){
return ($a["noofpages"] <= $b["noofpages"]) ? -1 : 1;
});
}
var_dump($books_by_genre);
Thanks for reading this. Hope you learned something.