Showing posts with label memory. Show all posts

Monday, August 8, 2011

understanding PHP memory management

Let's keep exploring how PHP manages memory. Consider this code and its output:

<?php
var_dump(memory_get_usage());
$name = 'henry';
var_dump(memory_get_usage());
unset($name);
var_dump(memory_get_usage());
?>

The output:

int(325496)
int(325672)
int(325532)

The first line of output tells us the memory usage starts at 325496 bytes. After we assign a value to the variable $name, the usage becomes 325672. The interesting part is the next statement, unset($name), which is supposed to release the memory taken by $name. But the output says the memory usage after unset($name) is 325532, and
325532 - 325496 = 36. We are still using 36 more bytes even though we unset($name)!

So the question is: where did the extra 36 bytes go? Does unset really release memory? Before we analyse this in detail, I hope you stand firm on this answer: yes, unset really can release memory, no doubt about that (although it also depends on zval.refcount, see this post: http://hengrui-li.blogspot.com/2011/08/php-copy-on-write-how-php-manages.html ).

Now let's see how PHP allocates memory. For a simple statement such as $name = 'henry', PHP does at least two things: 1. it allocates memory for the name of the variable, '$name', and saves it in a symbol table; 2. it allocates memory for the value of the variable, 'henry', and creates a zval to hold that value.

However, we must know this: when PHP asks the OS for memory, it doesn't ask only for what the current task needs. It asks for a large block (more than it needs at the moment) and then hands out a small part of that block for the current task. The benefit of asking for a large block up front is that when PHP needs more memory later, it doesn't have to go back to the OS, which avoids frequent system calls.

When we do unset($name), PHP releases the memory. But that doesn't mean PHP returns the memory to the OS. PHP keeps it as its own spare memory, which it can hand out again when something else needs it.

Let's check this code:

<?php
var_dump(memory_get_usage(true));
$name = 'henry';
var_dump(memory_get_usage(true));
unset($name);
var_dump(memory_get_usage(true));
?>

Note that memory_get_usage(true) returns the real size of the memory allocated from the system. Now the output:

int(524288)
int(524288)
int(524288)

This output tells us one thing: when we do $name = 'henry', PHP does NOT ask the OS for more memory.
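To watch the two counters move independently, a rough sketch like the following can help. The helper name snapshot() and the sizes are my own choices for illustration, and the exact numbers will differ between PHP versions and platforms.

```php
<?php
// Hypothetical helper: reads both counters at one point in time.
function snapshot()
{
    return array(
        'used' => memory_get_usage(),     // memory PHP has handed out to the script
        'real' => memory_get_usage(true), // memory PHP has obtained from the OS
    );
}

$before = snapshot();

// Allocate roughly 4 MB of string data in 1 KB pieces.
// The argument varies per iteration so each string is built at run time.
$chunks = array();
for ($i = 0; $i < 4096; $i++) {
    $chunks[] = str_repeat((string) ($i % 10), 1024);
}
$during = snapshot();

// Drop the only holder of all those strings.
unset($chunks);
$after = snapshot();

$rows = array('before' => $before, 'during' => $during, 'after' => $after);
foreach ($rows as $label => $s) {
    printf("%-6s used=%d real=%d\n", $label, $s['used'], $s['real']);
}
```

The 'used' figure should rise and fall with the script's own allocations, while 'real' moves in larger steps and may stay high after the unset.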

Now let's answer the question from the beginning: where did the extra 36 bytes go? Let's check this code:

<?php
var_dump("dumping");
var_dump(memory_get_usage());
$name = 'henry';
var_dump(memory_get_usage());
unset($name);
var_dump(memory_get_usage());
?>

The output:

string(7) "dumping"
int(326668)
int(326808)
int(326668)

This time, 326668 - 326668 = 0, which is what we expected. Memory allocation in PHP is quite implicit, and sometimes it is hard to guess where it happens. All we did was call var_dump("dumping") at the beginning, and everything looks normal now. This tells us the extra 36 bytes were taken by the output function on its first call. More precisely, the 36 bytes are taken by the header of the output.
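One way to keep such one-off costs out of a measurement is to produce some output first and only then compare readings. This is a minimal sketch: the function name measure_delta() is hypothetical, and it assumes PHP 5.3 or later because it uses a closure.

```php
<?php
// Hypothetical helper: runs $task and returns how many bytes
// memory_get_usage() changed by across the call.
function measure_delta($task)
{
    $before = memory_get_usage();
    $task();
    return memory_get_usage() - $before;
}

// Warm-up: trigger the first output so that its one-time
// allocation happens before we start measuring.
echo "warming up\n";

$delta = measure_delta(function () {
    $name = 'henry';
    unset($name);
});

echo "delta after assign + unset: {$delta} bytes\n";
```

The printed delta depends on the PHP version, so treat it as something to observe, not a fixed number.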

Sunday, August 7, 2011

PHP copy on write - how PHP manages variable memory (2)

Continuing from my last post, http://hengrui-li.blogspot.com/2011/08/php-copy-on-write-how-php-manages.html , let's check some interesting cases.

case 1:

$name   = 'henry';
$fname = 'henry';

How does PHP handle these two variables?

//the output is: name: (refcount=1, is_ref=0)='henry'
xdebug_debug_zval('name');

// the output is:  fname: (refcount=1, is_ref=0)='henry'
xdebug_debug_zval('fname');

From the output, we can see that PHP actually creates two zvals, one for each variable, even though the values are identical.

case 2:

$fname = $name = 'henry';

// the output is: name: (refcount=2, is_ref=0)='henry'
xdebug_debug_zval('name');

//the output is: fname: (refcount=2, is_ref=0)='henry'
xdebug_debug_zval('fname');

In this case, we find that PHP uses only one zval for both variables, which is more efficient.
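If xdebug is not installed, the built-in debug_zval_dump() gives a similar view of the two cases. This is only a sketch under some assumptions. The reported count can include an extra reference for the function call itself. I build the strings at run time with implode() because newer PHP versions treat literal strings specially. The variable names $first and $second are mine. The exact output differs between PHP versions, so no expected output is shown.

```php
<?php
// Case 1: two separate values that happen to have the same content.
$name  = implode('', array('hen', 'ry'));
$fname = implode('', array('hen', 'ry'));
debug_zval_dump($name);

// Case 2: chained assignment, so both names share a single value.
$first = $second = implode('', array('hen', 'ry'));
debug_zval_dump($first);
```

What is worth comparing is the refcount printed for case 1 against the one printed for case 2, not the absolute numbers.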

Thursday, August 4, 2011

PHP copy on write - how PHP manages variable memory

I've been asked a similar question a few times by a few developers, so I think it is better to write it down. Let's check the code:

// assume we have a large array
$largeArray = getLargeSizeArray();

function doTask(array $large)
{
    // do the task
}
doTask($largeArray);

The question goes like this: the argument is passed by value, which means a copy of $largeArray is made, and that will take a lot more memory. Is it better to pass the argument by reference: function doTask(array &$large)?

To this question, my answer is always 'No'. Well, to be honest, I simply don't want developers to think that passing by reference is good practice, even for the sake of memory. But the truth is, it depends on what 'the task' inside doTask() is. Most of the time, we can simply pass by value.

To get a solid understanding, we had better dig deeper. Let's see how the Zend Engine manages variables internally.

Zend uses a C struct, zval, to store the value of a variable:

typedef struct _zval_struct {
    zvalue_value value;
    zend_uint refcount;
    zend_uchar type;
    zend_uchar is_ref;
} zval;

'zvalue_value value' is where the value of the variable is stored. zvalue_value is a union:

typedef union _zvalue_value {
    long lval;
    double dval;
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;
    zend_object_value obj;
} zvalue_value;

As you can guess, zval.type stores the type of the variable. Zend uses zval.type together with zval.value to make PHP a weakly typed language, even though C itself is statically typed. Anyway, typing is not what we want to discuss here.

Take a simple statement: $name = 'henry'; The question is, how does Zend use a zval to store '$name', or, how does a zval know that it holds the value for '$name'? In zval, we can't find any field that stores the name. The answer is that PHP stores the name of a variable in a hash table called symbol_table, and there is a mapping from the variable name to the variable value (the zval).
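From userland we cannot look at symbol_table directly, but get_defined_vars() gives a rough analogy: it returns the names in the current scope mapped to their values. This sketch is only an illustration of that name-to-value mapping, not a view of the engine's internals, and the variable names $vars and $varsAfter are mine.

```php
<?php
$name = 'henry';

// get_defined_vars() returns name => value pairs for the current scope.
$vars = get_defined_vars();
var_dump(array_key_exists('name', $vars)); // bool(true): the name is registered
var_dump($vars['name']);                   // string(5) "henry": the value it maps to

unset($name);

// After unset() the name no longer appears in the current scope.
$varsAfter = get_defined_vars();
var_dump(array_key_exists('name', $varsAfter)); // bool(false)
```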

Now let's check this PHP code:

$name  = 'henry';
$fname = $name;
unset($name);

On the first line, PHP allocates 6 bytes of memory to store 'henry' (5 bytes) plus the terminating NUL byte \0 (1 byte).
On the second line, a new variable $fname is created, and the value of $name is "copied" to $fname.
On the third line, unset($name) tries to free the memory taken by $name.

This kind of code is quite common. If PHP allocated new memory for every variable assignment, then in this example it would need 12 bytes of string data for $name and $fname. We know we don't really need that much. We can simply make symbol_table's $fname refer to the same zval that $name refers to, and that is exactly what the Zend Engine does. Hmm, that sounds like we are not really copying, we are referring. So what happens when we do unset($name)? How does PHP know there is a $fname sharing the value of $name?

Time to have a look at "zend_uint refcount". Let's try this:

$name='henry';
xdebug_debug_zval('name');

The output is "name: (refcount=1, is_ref=0)='henry'"

And we can see that zval.refcount=1, which means there is one variable referring to this zval. Now do this:

$fname = $name;
xdebug_debug_zval('fname');

The output is "fname: (refcount=2, is_ref=0)='henry'"! Strange? Shouldn't it be refcount=1? Let's try:

xdebug_debug_zval('name');

We get the same output: "name: (refcount=2, is_ref=0)='henry'"!

So $fname and $name are actually referring to the same zval. By changing the value of zval.refcount, PHP knows that there are two variables referring to that zval. If we assign $name to more new variables, PHP will simply increase zval.refcount and will NOT allocate more memory for the value. OK, what happens if we unset($name)? You can guess! Right, PHP simply decreases zval.refcount. Let's do this:

unset($name);
xdebug_debug_zval('fname');

The output is "fname: (refcount=1, is_ref=0)='henry'". You know what? In this situation (two variables referring to the same zval), unset cannot free the memory. It only removes one name and decrements refcount; the value is freed only once refcount drops to 0.
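The effect is easier to see with a value big enough to stand out in memory_get_usage(). This is a rough sketch under my own assumptions: the 1 MB size is arbitrary, and the string is built at run time with str_repeat() so that it is an ordinary heap value.

```php
<?php
$size  = 1024 * 1024;
$name  = str_repeat('h', $size); // about 1 MB of string data
$fname = $name;                  // shares the same value, no copy is made

$before = memory_get_usage();

unset($name);                    // one name removed, value still held by $fname
$afterFirst = memory_get_usage();

unset($fname);                   // last holder removed, value can be freed
$afterSecond = memory_get_usage();

printf("freed by first unset:  %d bytes\n", $before - $afterFirst);
printf("freed by second unset: %d bytes\n", $afterFirst - $afterSecond);
```

The first unset should free next to nothing, and the second should free roughly the size of the string.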

Now, try this:

$name  = 'henry';
$fname = $name;
$name  = 'li';

Obviously, $fname is still 'henry'. But if $fname is referring to the same zval, its value should change to 'li' too, right? Well, PHP has a copy-on-write mechanism: when PHP is about to change a variable, it checks the zval.refcount first. If zval.refcount > 1, PHP creates a new zval, decreases the old zval.refcount by 1, and updates symbol_table so that $fname and $name refer to different zvals. So at this point PHP must allocate new memory. And at this point, if we unset($name), we really do free some memory.
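Copy on write can be observed from userland by measuring memory around an assignment and around the first write. This is a minimal sketch: the array size and the variable names are my own, and the byte counts vary by PHP version.

```php
<?php
// Build a reasonably large array so that a real copy is easy to notice.
$original = range(1, 100000);

$start = memory_get_usage();

// Plain assignment: both names should share the same underlying value.
$copy = $original;
$afterAssign = memory_get_usage();

// First write to $copy: this is where the separation (the real copy) happens.
$copy[0] = 'changed';
$afterWrite = memory_get_usage();

printf("assignment cost:  %d bytes\n", $afterAssign - $start);
printf("first write cost: %d bytes\n", $afterWrite - $afterAssign);

// $original is untouched by the write to $copy.
var_dump($original[0]); // int(1)
```

The assignment should cost close to nothing, while the first write should cost about as much as the array itself.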

Now we know that when PHP passes by value, or copies one variable to another, it is not "really copying". It makes both refer to the same zval to save memory. So, back to the question at the beginning:

// assume we have a large array
$largeArray = getLargeSizeArray();

function doTask(array $large)
{
    // do the task
}
doTask($largeArray);

Simply passing $largeArray by value into doTask() will not cost more memory. But, as I said, it really depends on what we do inside doTask(). If we change the value of the argument, then PHP has to spend more memory. Like this:

// assume we have a large array
$largeArray = getLargeSizeArray();

function doTask(array $large)
{
    $large[0] = 'xxxx';
}
doTask($largeArray);

We change the value of the argument, so PHP has to create a new zval to hold the modified copy.
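The difference between reading and writing the argument can be measured from inside the function. In this sketch, readOnly() and writeTo() are hypothetical stand-ins for doTask(), and range() stands in for getLargeSizeArray(); the numbers will vary by PHP version.

```php
<?php
// Only reads the argument; reports memory usage seen inside the call.
function readOnly(array $large)
{
    $count = count($large);
    return memory_get_usage();
}

// Writes to the argument, which forces a private copy of the array;
// reports memory usage seen inside the call.
function writeTo(array $large)
{
    $large[0] = 'xxxx';
    return memory_get_usage();
}

$largeArray = range(1, 100000); // stand-in for getLargeSizeArray()

$baseline    = memory_get_usage();
$insideRead  = readOnly($largeArray);
$insideWrite = writeTo($largeArray);

printf("extra memory while reading: %d bytes\n", $insideRead - $baseline);
printf("extra memory while writing: %d bytes\n", $insideWrite - $baseline);

// The caller's array is unchanged either way.
var_dump($largeArray[0]); // int(1)
```

Only the writing variant should show a large increase, and the caller's array stays unchanged in both cases.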

Alright, one last thing worth a brief mention: what is "zend_uchar is_ref"? I think you can easily guess now:

$name  = 'henry';
$fname = &$name;
xdebug_debug_zval('name');

The output is "name: (refcount=2, is_ref=1)='henry'". No need to explain more, right?
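For contrast with copy on write, here is a short sketch of what a reference does at the language level: a write through one name is visible through the other, and no separation happens.

```php
<?php
$name  = 'henry';
$fname = &$name;  // $fname is now an alias of $name, not a lazy copy

$name = 'li';     // no separation: both names see the new value
var_dump($fname); // string(2) "li"

unset($name);     // removes one name; the value survives through $fname
var_dump($fname); // string(2) "li"
```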

Friday, March 19, 2010

memory usage in PHP

An excellent blog post about PHP memory usage: http://blog.preinheimer.com/index.php?/archives/354-Memory-usage-in-PHP.html

Unlike C/C++ developers, most PHP developers never think about memory management. It is true that PHP frees the memory after a request is processed. However, if you use PHP to process critical tasks with large amounts of data, you had better read that post carefully and run its memory-usage-example.php yourself. You will see how big the difference is.