Henry @ Web Apps: August 2011

Wednesday, August 31, 2011

jQuery check if dom element exists

When we call the jQuery function $("#idName"), we always get a jQuery object back, even if no matching DOM element exists on the page. Sometimes we want to check whether the object returned by $("#idName") really wraps an existing DOM element. The easiest way is to check the object's length property:

var fakeDom = $("#fake");

if (fakeDom.length == 0) {

console.log('i am not an existing DOM element');

}

There is another way to do this though:

var fakeDom = $("#fake");

if (fakeDom[0] === undefined) {

console.log('i am not an existing DOM element');

}

Why do both checks work? The jQuery source code (v1.6.2) contains this:

// HANDLE: $("#id")

else {

elem = document.getElementById( match[2] );

// Check parentNode to catch when Blackberry 4.6 returns

// nodes that are no longer in the document #6963

if ( elem && elem.parentNode ) {

// Handle the case where IE and Opera return items

// by name instead of ID

if ( elem.id !== match[2] ) {

return rootjQuery.find( selector );

}

// Otherwise, we inject the element directly into the jQuery object

this.length = 1;

this[0] = elem;

}

this.context = document;

this.selector = selector;

return this;

}

As we can see, jQuery gets the DOM element with elem = document.getElementById( match[2] ). If the element is valid, jQuery sets the length property of the jQuery object to 1:

elem = document.getElementById( match[2] );

if ( elem && elem.parentNode ) {

...

this.length = 1;

this[0] = elem;

}

...

return this;

So if elem is not a valid DOM element, the if block is skipped and this.length keeps its default value of 0. The source also shows that a valid element is assigned to this[0]; when no element is found nothing is assigned, so this[0] is undefined. That is why checking this[0] === undefined works as well.
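To see both checks side by side without loading jQuery, here is a minimal sketch in plain JavaScript. The fakeQuery function and the pageElements lookup table are hypothetical stand-ins for $("#id") and document.getElementById; the only thing they borrow from the jQuery source above is the idea of an array-like object whose length starts at 0.

```javascript
// Hypothetical stand-in for the page: maps ids to "elements".
var pageElements = { header: { id: 'header' } };

// Hypothetical stand-in for $("#id"): it always returns an object and
// only fills in [0] and length when the element was found.
function fakeQuery(id) {
    var result = { length: 0 };
    var elem = pageElements.hasOwnProperty(id) ? pageElements[id] : null;
    if (elem) {
        result.length = 1;
        result[0] = elem;
    }
    return result;
}

// Both checks from the post, wrapped in one helper.
function exists(wrapped) {
    return wrapped.length > 0 && wrapped[0] !== undefined;
}

console.log(exists(fakeQuery('header'))); // true
console.log(exists(fakeQuery('fake')));   // false
```

With real jQuery the same helper could be called as exists($("#fake")), since the object jQuery returns has the same length and [0] properties.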

Thursday, August 25, 2011

Explicit is better than implicit

"Explicit is better than implicit" is one of the aphorisms in the Zen of Python. PHP, however, offers far too many ways to do things implicitly. One of them is variable variables:

$var = 'name';
$$var = 'henry';
echo $name;

The output is 'henry'. Maybe this code snippet is not that horrible. But sometimes it is not that simple.

I once tried to find out where a variable was defined. Say the variable is $branchName. It was used directly in a file like this: echo $branchName; and it worked, so it had to be defined or assigned somewhere else. Since the legacy code was written in procedural style, I naturally traced back through the files included by that file, expecting to find the definition in one of them. I could not find it. So I searched the whole application, and still found no place that defined $branchName.

The first possibility I came up with was register_globals, another notorious PHP feature. The application is so old that register_globals is turned on and used. Unfortunately, I could not find anything showing that $branchName came from register_globals either.

Then I thought it must be a variable variable, so I searched for the string '$$' in the included files. Nothing. I started to get a little confused, until I remembered that PHP lets us write a variable variable in two ways: $$var = 'henry' or ${$var} = 'henry'. Both work. So I searched for '${$' and finally found this code, complete with a funny comment from a previous developer:

for ($i = 0, $last = count($parameters); $i < $last; $i++) {
//Good ol' dollar-dollar string interpolation... um, yeah, "thanks".
${$parameters[$i]['name']} = $parameters[$i]['value'];
}

When I ran var_dump($parameters), I finally found 'branchName'. PHP lets us do things very implicitly and, even worse, it provides several different ways to do the same implicit thing, while Python preaches that there should be only one obvious way to do it. That is why, when I learned that "Ruby inherited the Perl philosophy of having more than one way to do the same thing", I immediately threw that language into my rubbish bin, never to touch it again unless I am forced to.

Wednesday, August 24, 2011

shift array elements to the right in PHP

Another algorithm question aimed at C/C++ developers. It is always interesting to see how to solve such questions in PHP.

The question is: implement a function that shifts the first $k elements of an array to the right end. The time complexity must be no worse than O(n).

For example, given an array $a = array('a', 'b', 'c', 'd', 'e') and a function rightShift($array, $k), calling rightShift($a, 2) must return array('c', 'd', 'e', 'a', 'b'): 'a' and 'b' have been shifted to the right end of the array.

It is quite simple in PHP, with PHP's array functions.

function rightShift($array, $k)

{

for($i=0; $i<$k; $i++) {

$temp = array_shift($array);

array_push($array, $temp);

}

return $array;

}

Even without PHP's array functions, it is still simple:

function rightShift($array, $k)

{

for($i=0; $i<$k; $i++) {

$temp = $array[$i];

unset($array[$i]);

$array[] = $temp;

}

return $array;

}

The function works and the loop runs $k times, so for $k <= count($array) it is O(n) and meets the requirement. (One detail: unset() does not renumber the keys, so the returned array has its values in the right order but its keys start at $k instead of 0.)

But rightShift has a flaw that is easy to overlook: we have not thought carefully about the value of $k. If $k <= 0, that is fine, the array stays the same. But what if $k > count($array)? We tend to assume $k < count($array), so the function does nothing special about it. In fact $k could be much bigger than count($array). The function still works in that case, but it is obviously not efficient.

So let's consider $k > count($array). In this example, if $k = 8, what should the result be? It should be array('d', 'e', 'a', 'b', 'c'). We do not really need to shift $k times; we only need to shift $k % count($array) times. So let's change our code:

function rightShift($array, $k)

{

$k = $k % count($array);

for($i=0; $i<$k; $i++) {

$temp = $array[$i];

unset($array[$i]);

$array[] = $temp;

}

return $array;

}

This introduces another issue: what if count($array) === 0? Then $k % count($array) is a modulo by zero. So we need to check for an empty array first:

function rightShift($array, $k)

{

$total = count($array);

if ($total === 0) {

return false;

}

$k = $k % $total;

for($i=0; $i<$k; $i++) {

$temp = $array[$i];

unset($array[$i]);

$array[] = $temp;

}

return $array;

}
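As a cross-check in JavaScript, the other language used on this blog, the same rotation can be sketched with two slices instead of a loop. This is a sketch under assumptions rather than a port of the PHP above: k is assumed to be a non-negative integer, and an empty array comes back as an empty array instead of false.

```javascript
// Move the first k elements to the right end of the array.
// Returns a new array; the input is left untouched.
function rightShift(items, k) {
    var total = items.length;
    if (total === 0) {
        return [];
    }
    // Shifting `total` times gives back the original order,
    // so only the remainder matters.
    var steps = k % total;
    return items.slice(steps).concat(items.slice(0, steps));
}

console.log(rightShift(['a', 'b', 'c', 'd', 'e'], 2).join(',')); // c,d,e,a,b
console.log(rightShift(['a', 'b', 'c', 'd', 'e'], 8).join(',')); // d,e,a,b,c
```

Each element is copied once, so the work stays proportional to the array length no matter how large k is.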

Tuesday, August 23, 2011

find the only duplicate number in an array

Another interesting algorithm question: we have an array $numbers with 1001 elements, all of them integers in the range [1, 1000], inclusive. We know that exactly one number appears twice in this array, and we need to find out which one. (We are not allowed to modify $numbers.)

So the array looks like $numbers = array(1, 2, 3, ..., 1000) with one extra element, count($numbers) is 1001, and we have to find the only number that appears twice. The code that generates $numbers is below:

<?php

$limit = 1000;

$numbers = array();

//randomly generate duplicate number

$duplicate = mt_rand(1, $limit);

//randomly generate the position of the duplicate number

$randomPosition = mt_rand(0, $limit-1);

for($i = 0; $i < $limit; $i++) {

$numbers[] = $i + 1;

if ($i === $randomPosition) {

$numbers[] = $duplicate;

}

}

To see what the $numbers array looks like, we can set $limit = 10 to make it easier to read.

I guess this is meant as an interview question for C/C++ developers. When it comes to PHP, well, PHP ships with a very rich set of array functions.

So first, let's see how quickly we can do it with the help of PHP's array functions.

$unique = array_unique($numbers);

$diff = array_diff_key($numbers, $unique);

// $diff is an array holding one element: the second occurrence of the duplicate

echo current($diff);

If we ignore what array_unique and array_diff_key do behind the scenes, then from the standpoint of our PHP code we need a constant number of steps (3) to get it done. So the time complexity looks like O(1), but that is misleading: both functions have to walk through the whole array internally.

What if we are not allowed to use any special PHP array functions?

1. Use a temporary storage

$temp = array();

for($i=0; $i<=$limit; $i++) {

$value = $numbers[$i];

if (isset($temp[$value])) {

//we find it!

echo $value;

break;

}

$temp[$value] = $value;

}

This way, the worst-case time complexity is O(n). But the worst-case space complexity is O(n) as well, which means this algorithm may use a lot of memory (we need another array as temporary storage).

2. Another idea: compute the sum of 1 to 1000, call it $s1; compute the sum of the numbers in the $numbers array, call it $s2; the duplicate number must be $s2 - $s1. Since 1 to 1000 is an arithmetic sequence (http://en.wikipedia.org/wiki/Arithmetic_series), there is a formula that gives its sum easily.

$s1 = ($limit + 1) * $limit / 2;

$s2 = 0;

for($i=0; $i<=$limit; $i++) {

$s2 += $numbers[$i];

}

echo $s2 - $s1;

For $s2, as I said, we are not allowed to use special PHP array functions, so array_sum() is off limits here.

With this approach the time complexity is always O(n), because we must add up every element, whereas the first approach can stop as soon as it sees the duplicate. So it is not as fast as the first one. However, the space complexity is O(1): we need almost no extra space to get the task done.
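For comparison, both approaches can be sketched in JavaScript, the other language used on this blog. The function names are made up for this illustration, and buildNumbers simply appends the duplicate at the end instead of inserting it at a random position, which does not change what either approach has to do.

```javascript
// Build a test array: 1..limit plus one extra copy of `duplicate`.
function buildNumbers(limit, duplicate) {
    var numbers = [];
    for (var i = 1; i <= limit; i++) {
        numbers.push(i);
    }
    numbers.push(duplicate);
    return numbers;
}

// Approach 1: remember every value seen so far.
// O(n) time in the worst case, O(n) extra space.
function findDuplicateWithStorage(numbers) {
    var seen = {};
    for (var i = 0; i < numbers.length; i++) {
        if (seen[numbers[i]] === true) {
            return numbers[i];
        }
        seen[numbers[i]] = true;
    }
    return null; // no duplicate found
}

// Approach 2: subtract the arithmetic-series sum of 1..limit.
// Always O(n) time, O(1) extra space.
function findDuplicateWithSum(numbers) {
    var limit = numbers.length - 1;
    var expected = (limit + 1) * limit / 2;
    var actual = 0;
    for (var i = 0; i < numbers.length; i++) {
        actual += numbers[i];
    }
    return actual - expected;
}

var numbers = buildNumbers(1000, 437);
console.log(findDuplicateWithStorage(numbers)); // 437
console.log(findDuplicateWithSum(numbers));     // 437
```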

Monday, August 22, 2011

efficient javascript dom access & dom manipulation

DOM access and DOM manipulation are the most common tasks in browser JavaScript. Naturally, they are also the most common bottleneck when it comes to JavaScript performance.

Generally speaking, DOM access and manipulation are expensive because the DOM is not part of the JavaScript engine. It is a separate component and does not belong to 'core JavaScript'. For browsers this separation makes sense: a JavaScript application may not need the DOM at all, while other languages may need to work with DOM documents.

Since DOM access and manipulation are expensive, the guideline is: reduce DOM access and manipulation to a minimum. Let's look at some code examples to understand this guideline:

<div id="content">

<script>

document.onmousedown = function(){

var i, LIMIT = 200, end=0, start = new Date().getTime();

for (i = 0; i < LIMIT; i++) {

document.getElementById("content").innerHTML += "<p>" + Math.random() + "</p>";

}

end = new Date().getTime();

console.log(end - start);

}

</script>

<p>Original Content</p>

</div>

It works, but it is not efficient: we access the DOM inside a loop. On my computer, the first mouse click takes more than 2500 ms, which is 2 to 3 seconds. Let's make it better:

<div id="content">

<script>

document.onmousedown = function(){

var i, LIMIT = 200, end=0, html='', start = new Date().getTime();

for (i = 0; i < LIMIT; i++) {

html += "<p>" + Math.random() + "</p>";

}

document.getElementById("content").innerHTML += html;

end = new Date().getTime();

console.log(end - start);

}

</script>

<p>Original Content</p>

</div>

We build the HTML content in a local variable first, which is pure core JavaScript with no DOM involved, and so avoid DOM access inside the loop. On my computer, the first mouse click now takes less than 40 ms.

But there is still room for improvement. Notice that we are using document.getElementById("content").innerHTML += html. That statement makes the browser serialize the existing content to a string, append to it, and parse the whole thing again. Since we are already touching the DOM, its native appendChild() method is more efficient than +=. However, appendChild() takes a DOM node, not an HTML string (innerHTML itself is not in the DOM standard at the time of writing, although most browsers support it). So let's change our code:

<div id="content">

<script>

document.onmousedown = function(){

var i, LIMIT = 200, end=0, html='', start = new Date().getTime();

for (i = 0; i < LIMIT; i++) {

html += "<p>" + Math.random() + "</p>";

}

var temp = document.createElement("div");

temp.innerHTML = html;

document.getElementById("content").appendChild(temp);

end = new Date().getTime();

console.log(end - start);

}

</script>

<p>Original Content</p>

</div>

This time, the first mouse click takes less than 10 ms on my computer. So first of all, avoid DOM access in a loop, and once we have a DOM element, prefer its native methods.

Second, and also quite obvious: save a DOM reference in a local variable and work with that variable. This is easy to understand:

Instead of doing

document.getElementById("content").appendChild(child1);

document.getElementById("content").appendChild(child2);

document.getElementById("content").appendChild(child3);

We should do

var e = document.getElementById("content");

e.appendChild(child1);

e.appendChild(child2);

e.appendChild(child3);
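The two guidelines can be combined. Below is a minimal sketch, assuming a browser-like document object is passed in. The helper name appendParagraphs is hypothetical, and DocumentFragment is a technique swapped in here rather than one used in the examples above: the new nodes are assembled off the page and attached to the live DOM in a single appendChild() call.

```javascript
// Append one <p> per text to the element with the given id.
// `doc` is expected to behave like the browser's `document`.
function appendParagraphs(doc, containerId, texts) {
    // Look the container up once and keep the reference.
    var container = doc.getElementById(containerId);

    // Build the new nodes inside a fragment that is not part of the page yet.
    var fragment = doc.createDocumentFragment();
    for (var i = 0; i < texts.length; i++) {
        var p = doc.createElement('p');
        p.appendChild(doc.createTextNode(texts[i]));
        fragment.appendChild(p);
    }

    // One insertion into the live DOM, however many paragraphs there are.
    container.appendChild(fragment);
    return container;
}

// In a browser this could be called as:
// appendParagraphs(document, 'content', ['first', 'second', 'third']);
```

Passing the document in as a parameter is only a design choice to keep the sketch independent of a global; inside a page the function could use document directly.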

web/user interface test with phpunit only

The best tool for testing a user interface is Selenium (http://hengrui-li.blogspot.com/2011/08/selenium-user-interface-test-with.html). However, when we cannot use Selenium, we can still test the user interface with PHPUnit alone by using DOMDocument and DOMXPath. Let's look at this code:

class SampleTest extends PHPUnit_Framework_TestCase
{
   /**
   * Prepares the environment before running a test.
   */
   protected function setUp()
   {
   parent::setUp();
   $_SERVER['REQUEST_METHOD'] = 'GET';
   $this->setDb();
   }

   /**
   * set up database
   * @throws Exception
   */
   protected function setDb()
   {
   shell_exec('mysql -uroot -proot test < ' . dirname(__FILE__) . '/../fixtures/test.sql');
   }

   public function testIndex()
   {
   ob_start();
   Sample::index();
   // ob_get_clean() returns the buffer contents and closes the buffer in one call
   $output = ob_get_clean();
   try {
   $dom = new DOMDocument();
   $dom->loadHTML($output);
   $xpath = new DOMXPath($dom);
   $this->assertEquals('Test Title', $this->getText($xpath, '//head/title'));
   $this->assertEquals('Test Header', $this->getText($xpath, '//div[@id="header"]/h1'));
   $this->assertEquals('Last Name', $this->getText($xpath, '//table/tr/th[1]'));
   $this->assertEquals('First Name', $this->getText($xpath, '//table/tr/th[2]'));
   $this->assertEquals("Henry", $this->getValue($xpath, '//input[@name="firstname"]'));
   } catch ( Exception $e ) {
   $this->fail('Not valid dom document - ' . $e->getMessage());
   }
   }

   private function getText(DOMXPath $xpath, $query)
   {
   if ($xpath->query($query)->length == 0) {
   throw new Exception('Text not found in query ' . $query);
   }
   return $xpath->query($query)->item(0)->nodeValue;
   }

   private function getValue(DOMXPath $xpath, $query)
   {
   return $xpath->query($query)->item(0)->getAttribute('value');
   }
}
This code captures the output of the index page:

ob_start();

Sample::index();

// ob_get_clean() returns the buffer contents and closes the buffer in one call

$output = ob_get_clean();

Then we create a DOM object and load the HTML output. For a legacy web page it is quite possible that the HTML output is not a valid document, so this test can also help us fix our HTML pages.

$dom = new DOMDocument();

$dom->loadHTML($output);

Then we create a DOMXPath object for $dom, so we can use DOMXPath's powerful query methods on our DOM document.

$xpath = new DOMXPath($dom);

The code below checks the actual content of the DOM document.

$this->assertEquals('Test Title', $this->getText($xpath, '//head/title'));

$this->assertEquals('Test Header', $this->getText($xpath, '//div[@id="header"]/h1'));

$this->assertEquals('Last Name', $this->getText($xpath, '//table/tr/th[1]'));

$this->assertEquals('First Name', $this->getText($xpath, '//table/tr/th[2]'));

$this->assertEquals("Henry", $this->getValue($xpath, '//input[@name="firstname"]'));

As we can see, testing a web interface this way is not very enjoyable, and it cannot properly test JavaScript's DOM manipulation. But it is better than nothing.

Thursday, August 18, 2011

avoid pass by reference in PHP

One of my posts (http://hengrui-li.blogspot.com/2011/08/php-copy-on-write-how-php-manages.html) discusses PHP's copy-on-write mechanism and explains why passing by value does not cost more memory in most cases.

Actually, passing by reference in PHP is considered bad practice, even from a performance perspective. Today I found these two very valuable posts about PHP's reference mechanism. They are definitely worth a read:

1. http://schlueters.de/blog/archives/125-Do-not-use-PHP-references.html
2. http://schlueters.de/blog/archives/141-References-and-foreach.html

Both posts are by the same author, and his English is much better than mine. To quote the summary of his post: "Do not use references for performance". He explains the reason very well.

We should try to avoid passing by reference in PHP. The reason is simple and common: when a system contains a lot of references, changing one value changes another, and it becomes hard to track what happened.

So passing by reference for performance is a no. What if we want to return multiple values from a function? We can do that in other ways: for example, we can return an array from the function, or pass a parameter object into it.

What if a function is already defined, we do not want to change its return type, and we do not want to build a parameter object? This is almost like saying 'I just want to use references'. Well, honestly, sometimes I use references as well, out of convenience and laziness. Whenever I am tempted to use a reference, I check that this prerequisite holds:
it is only in a private method of a class. That is, the method that takes a parameter by reference must be hidden inside the class and not exposed to others. It must be private (not protected, not public).

Even so, avoiding references remains the general rule, and we should respect it.

Wednesday, August 17, 2011

UTF-8, multibyte functions in php web application

Output text/string in UTF-8 encoding

There are several ways to tell the browser which encoding a page uses. One way is to specify a meta tag in the HTML:

<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

This approach is simple and easy, but it has some disadvantages. One issue is that a browser may have to restart parsing the document when it reaches the meta tag, because it may already have parsed part of the document with the wrong encoding. This can delay page rendering. Because of this, it is better to send the UTF-8 header from PHP.

Output UTF-8 header with PHP:

We use PHP's header function: header("Content-Type:text/html;charset=utf-8");

This method is safe enough to ensure the page is treated as UTF-8. However, it is obviously less convenient than the http-equiv meta tag, which can be written once in a layout template or header file and then included in other pages.

There is a third solution. Assuming we are using the Apache web server (I believe most PHP apps run on Apache), we can specify the charset in an .htaccess file.

Specify charset in .htaccess file:

AddDefaultCharset utf-8

This way, the web server configuration sends the regular HTTP header with charset utf-8.

PHP multibyte string functions

We know that PHP provides a set of multibyte string functions, which are prefixed with 'mb_'. There are some interesting things about them. Let's take mb_substr for example. I'm using PHP 5.3.3 with the default php.ini configuration. To do the test, nothing could be better than using my native language: Chinese.

<?php

header("Content-Type:text/html;charset=utf-8");

$str = '我爱编程';

echo substr($str, 2, 2), '<br>';

echo mb_substr($str, 2, 2), '<br>';

OK, let me explain. The first line, header("Content-Type:text/html;charset=utf-8");, simply sends a UTF-8 header so that the output is treated as UTF-8. The second line, $str = '我爱编程', is a string of 4 Chinese characters. In Chinese, 1 character is also 1 word, so the string is 4 words as well. Translated into English it is 'i love programming', which is 3 words, or 18 characters including spaces.

Now I want to return part of the Chinese string, starting from position 2 with a length of 2. The correct result should be 编程.

The third line: echo substr($str, 2, 2), '<br>'. Here we use the normal substr. As we can expect, the output is wrong, because substr counts bytes rather than characters. On my screen, the output is ��
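To see where the two broken characters come from, it helps to look at the raw UTF-8 bytes. The sketch below uses JavaScript (the other language on this blog) rather than PHP, and the helper names byteSlice and charSlice are made up for illustration. It assumes an environment that provides TextEncoder and TextDecoder, such as a modern browser or Node.js.

```javascript
// Take `length` bytes starting at byte offset `start` of the UTF-8 form
// of `text`, then decode whatever bytes come out.
function byteSlice(text, start, length) {
    var bytes = new TextEncoder().encode(text);
    return new TextDecoder('utf-8').decode(bytes.slice(start, start + length));
}

// Take `length` characters starting at character offset `start`.
function charSlice(text, start, length) {
    return Array.from(text).slice(start, start + length).join('');
}

var str = '我爱编程';

// Each of these characters takes 3 bytes in UTF-8.
console.log(new TextEncoder().encode(str).length); // 12

// Bytes 2 and 3 are the last byte of 我 and the first byte of 爱.
// Neither is a complete character, so each decodes to U+FFFD.
console.log(byteSlice(str, 2, 2) === '\uFFFD\uFFFD'); // true

// Counting characters instead of bytes gives the expected result.
console.log(charSlice(str, 2, 2)); // 编程
```

byteSlice plays the role of a byte-oriented function such as substr here, and charSlice the role of a character-aware one.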

Next, the last line: echo mb_substr($str, 2, 2), '<br>'. Now we use PHP's mb_substr and expect it to work properly. Does it? Unfortunately, it doesn't! The output is still ��. Let's check the PHP manual on mb_substr: "string mb_substr ( string $str , int $start [, int $length [, string $encoding ]] ). The encoding parameter is the character encoding. If it is omitted, the internal character encoding value will be used." So, based on the manual, we change our code to:

<?php

header("Content-Type:text/html;charset=utf-8");

$str = '我爱编程';

echo substr($str, 2, 2), '<br>';

echo mb_substr($str, 2, 2, 'UTF-8'), '<br>';

This time, it works! The output is 编程, as expected. So in this case we cannot simply call mb_substr and expect it to work; we still have to specify the encoding. If we do not, PHP uses the internal character encoding value. That raises another question: what is the internal character encoding value? To answer it, we must look at our PHP configuration file. Let's open php.ini and find the [mbstring] section.

[mbstring]

...

;mbstring.internal_encoding = EUC-JP

...

OK, that is quite clear now. Let's uncomment this line, change the value to UTF-8, and then restart the Apache server (don't forget this).

[mbstring]

...

mbstring.internal_encoding = UTF-8

...

Now, let's try this code again:

<?php

header("Content-Type:text/html;charset=utf-8");

$str = '我爱编程';

echo substr($str, 2, 2), '<br>';

echo mb_substr($str, 2, 2), '<br>';

Now it works: mb_substr uses the internal encoding value, and that value has been set to UTF-8.

One thing I do not like about PHP (and JavaScript) is that, for the same task, it always provides several different ways to do it. I used to call it too much flexibility (http://hengrui-li.blogspot.com/2011/04/too-much-language-flexibility-good-or.html). Recently I learned the core philosophy of Python: "There should be one - and preferably only one - obvious way to do it", and I realised why I do not think PHP is a great programming language (purely from a language perspective, not in terms of how it boosted the web and made web programming so easy).

So let's suppose we want to use only one function for the task "return part of a string". Obviously mb_substr is our choice, and replacing every substr in an old system with mb_substr is not hard. But what if we do not want to change our code? Or what if we simply think typing mb_substr is less efficient than typing substr?

Actually, mbstring supports overloading the existing string manipulation functions. If we enable overloading, then when we call substr(), PHP actually calls mb_substr() automatically. Let's see how to enable overloading in php.ini:

[mbstring]

...

; overload(replace) single byte functions by mbstring functions.

; mail(), ereg(), etc are overloaded by mb_send_mail(), mb_ereg(),

; etc. Possible values are 0,1,2,4 or combination of them.

; For example, 7 for overload everything.

; 0: No overload

; 1: Overload mail() function

; 2: Overload str*() functions

; 4: Overload ereg*() functions

; http://php.net/mbstring.func-overload

;mbstring.func_overload = 0

...

So we simply uncomment the last line and set the value to 7: mbstring.func_overload = 7, restart Apache, and try the code:

<?php

header("Content-Type:text/html;charset=utf-8");

$str = '我爱编程';

echo substr($str, 2, 2), '<br>';

echo mb_substr($str, 2, 2), '<br>';

Both functions now work fine! But overloading can cause issues. If we use the normal string functions to handle real binary data (meaning actual binary data, NOT text strings treated as binary), enabling overloading could break the binary handling code. Although I rarely see PHP code that needs to handle real binary data, the safest choice is simply to call the mb_ string functions explicitly. Also keep in mind this warning from the PHP manual: "It is not recommended to use the function overloading option in the per-directory context, because it's not confirmed yet to be stable enough in a production environment and may lead to undefined behaviour."

Tuesday, August 16, 2011

Selenium user interface test with PHPUnit

I used to use only Selenium IDE, a Firefox plugin, for a website's interface tests. However, if we want to integrate the user interface tests into our continuous integration system, we can use the Selenium RC server to automate them. Selenium runs all the tests directly in a browser, just as if a real user were browsing the website. PHPUnit provides the functions we need to talk to the Selenium RC server, so we can write user interface test cases just like ordinary PHPUnit unit test cases.

First we must download Selenium Server from http://seleniumhq.org/download/. My current version is the latest, 2.3.0. It is a single .jar file: selenium-server-standalone-2.3.0.jar. To start the server, we run this command:

java -jar /path/to/selenium-server-standalone-2.3.0.jar

That is it. We have set up our testing server.

Now let's write our first test case. We just want to browse http://localhost/ and check that it works! :D

TestLocalhost.php

<?php

class TestLocalhost extends PHPUnit_Extensions_SeleniumTestCase

{

protected function setUp()

{

$this->setBrowser("*firefox");

$this->setBrowserUrl("http://localhost/");

}

public function testLocalhost()

{

$this->open("/");

$this->verifyTextPresent("It works!");

}

}

And then we can just run the test with the command:

phpunit TestLocalhost.php

Let's check through this test case.

In the setUp method, we choose the browser we want to use: $this->setBrowser("*firefox") tells Selenium to use Firefox for testing. We can set up other browsers as long as they are installed on our system. Some of them are:

*firefox

*googlechrome

*iexplore

*safari

*opera

What if we want to run the tests on a series of browsers? We can declare a public static $browsers array in the test class:

class TestLocalhost extends PHPUnit_Extensions_SeleniumTestCase

{

public static $browsers = array(

array(

'name' => 'Firefox on Linux',

'browser' => '*firefox',

'host' => 'localhost',

'port' => 4444,

'timeout' => 30000,

),

array(

'name' => 'Chrome on Linux',

'browser' => '*googlechrome',

'host' => 'localhost',

'port' => 4444,

'timeout' => 30000,

),

);

protected function setUp()

{

$this->setBrowserUrl("http://localhost/");

}

public function testLocalhost()

{

$this->open("/");

$this->verifyTextPresent("It works!");

}

}

Now Selenium will run the tests in every browser declared in the static $browsers array.

$this->setBrowserUrl("http://localhost/"); sets the base URL of our web application.

We have one test case, testLocalhost(). $this->open("/") opens the root of our website first: http://localhost/. $this->verifyTextPresent("It works!") verifies that the text 'It works!' is present after we go to http://localhost/.

Now let's look at another example:

<?php

class TestWebApp extends PHPUnit_Extensions_SeleniumTestCase

{

protected function setUp()

{

$this->setBrowser("*chrome");

$this->setBrowserUrl("http://webapp/");

}

public function testLogin()

{

$this->open("/");

$this->type("id=txtUserId", "username");

$this->type("id=txtPassword", "wrongpassword");

$this->click("id=frmLoginButton");

$this->waitForPageToLoad("30000");

$this->verifyTextPresent("Username or Password incorrect");

$this->type("id=txtPassword", "correctpassword");

$this->click("id=frmLoginButton");

$this->waitForPageToLoad("30000");

$this->assertEquals("WebApp 3.3.0", $this->getTitle());

}

}

This simple test case tests the login of a web application. $this->type("id=txtUserId", "username") types the text 'username' into the text field with id=txtUserId, and the next call types 'wrongpassword' into the password field. $this->click("id=frmLoginButton") clicks the login button. $this->waitForPageToLoad("30000") is self-explanatory: it waits up to 30000 ms for the page to load. Since we entered a wrong password, we expect the text "Username or Password incorrect" to be displayed on the page, so we call $this->verifyTextPresent("Username or Password incorrect"). We then type the correct password, submit again, and assert that the page title is "WebApp 3.3.0".

Writing all these test cases by hand for a whole website is quite time consuming. We can use Selenium IDE to record our tests and export them in PHPUnit format, which makes life much easier.

Monday, August 15, 2011

MySql stored procedure commands

1. To show all stored procedures of a database:

SHOW PROCEDURE STATUS where DB = 'databasename';

2. Dump(export) all stored procedures of a database

mysqldump -uuser -ppassword --routines databasename > outputfile.sql

Sunday, August 14, 2011

an interesting SQL query question

A very interesting SQL statement question. We have a 'payment' table:

payment table:

year salary

2000 1000

2001 2000

2002 3000

2003 4000

The question is: Write a query on payment table so we get the result below:

Query Result:

year salary

2000 1000

2001 3000

2002 6000

2003 10000

We can see that each salary in the result is a running total: the sum of that year's salary and the salaries of all previous years. For example:

In Query Result, 2001's salary is 3000 = payment table's 2001 salary (2000) + payment table's 2000 salary (1000)

Answer 1:

select year, (select sum(b.salary) from payment b where b.year <= a.year) as salary from payment a;

To explain this one, it helps to imagine we have two tables, payment a and payment b:

payment a:

year salary

2000 1000

2001 2000

2002 3000

2003 4000

payment b:

year salary

2000 1000

2001 2000

2002 3000

2003 4000

First, MySQL retrieves every year from table a. For the second column, MySQL selects the rows from table b where b.year <= a.year, and then takes the SUM of the salary of those rows. The process is like:

For a.2000: b.2000 <= a.2000; SUM(b.1000)

For a.2001: b.2000 <= a.2001; b.2001 <= a.2001; SUM(b.1000,b.2000)

For a.2002: b.2000, b.2001, b.2002 <= a.2002; SUM(b.1000,b.2000,b.3000)

...

Finally, we can get the result:

year salary

2000 1000

2001 3000

2002 6000

2003 10000

Answer 2:

Answer 1 uses a subquery, which is straightforward and easy to understand. But in SQL a correlated subquery is often not the best-performing option. We can always write the easier subquery first and then, once we fully understand how the data should be retrieved, change it to use a JOIN. This is how to do it with a JOIN:

select a.year, sum(b.salary) salary from payment b join payment a on b.year <= a.year group by a.year;

Answer 3 (here is the really interesting answer):

If we still remember the math we learned at primary school (well, at least it is taught in primary school in China), we notice that the salary column in the payment table is an arithmetic sequence (http://en.wikipedia.org/wiki/Arithmetic_series)! For an arithmetic sequence we have the formula Sn = (A1 + An) * n / 2. Here A1 = 1000, An is the salary of the current row, and n = salary / 1000, so Sn = (1000 + salary) * salary / 2000. So we can use this query:

select year, (1000+salary)*salary/2000 from payment;

It works perfectly for this specific question, but only because the salaries happen to be 1000, 2000, 3000, and so on.

Thursday, August 11, 2011

how to reverse a string in php

This is just a simple question for fun. Well, it could be an interview question. Let's assume we are in an interview and are asked this question. The answers I can come up with are listed below:

$str = 'this is testing'; How do we reverse the string to 'gnitset si siht'?

1. strrev, as long as we know this function exists.

$str = 'this is testing';

echo strrev($str);

2. What if we don't know the strrev function? Then this is not bad either (I think):

$str = 'this is testing';

$strArray = str_split($str);

$reverseArray = array_reverse($strArray);

$reverseStr = implode($reverseArray);

3. If we cannot remember any of these functions, we have to solve the problem with our own algorithm.

$str = 'this is testing';

$length = strlen($str);

$reverseStr = '';

for($i=$length-1; $i>=0; $i--) {

$reverseStr .= $str[$i];

}

Oh yes, we can access the characters of a string with the array operator. That is not surprising if you know how C handles strings.

4. What if we can't even remember strlen()? Seriously? OK, still not a problem:

$str = 'this is testing';

$reverseStr = '';

$i = 0;

while(isset($str[$i])) {

$reverseStr = $str[$i] . $reverseStr;

$i++;

}

As long as we know that a PHP string can be indexed like an array (much as in C), and that isset() returns false for an offset past the end of the string, we can solve the problem.

Frankly speaking, if I were asked this question in an interview, I could probably only come up with answers 3 and 4, because I can't remember all those PHP functions either.
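To double-check the hand-written answers, we can wrap each one in a function and compare it with strrev. This is a minimal sketch; the function names are hypothetical and only exist for this example.

```php
<?php
// Answer 2: split into characters, reverse the array, glue it back together.
function reverseWithArrayFunctions($str)
{
    return implode(array_reverse(str_split($str)));
}

// Answer 3: walk the string backwards using its length.
function reverseWithLoop($str)
{
    $reverseStr = '';
    for ($i = strlen($str) - 1; $i >= 0; $i--) {
        $reverseStr .= $str[$i];
    }
    return $reverseStr;
}

// Answer 4: no strlen(); isset() is false once $i passes the last byte.
function reverseWithIsset($str)
{
    $reverseStr = '';
    $i = 0;
    while (isset($str[$i])) {
        $reverseStr = $str[$i] . $reverseStr;
        $i++;
    }
    return $reverseStr;
}

$str = 'this is testing';
$expected = strrev($str);

var_dump(reverseWithArrayFunctions($str) === $expected);
var_dump(reverseWithLoop($str) === $expected);
var_dump(reverseWithIsset($str) === $expected);
```

All of these, strrev included, work byte by byte, so they are only safe for single-byte text such as the ASCII string used here.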

5. What if we can't even remember the isset() function? You must be kidding.

6. What if we can't figure out our own algorithm? Hmm... that is not good. If an interviewer asks you this question, he probably expects you to come up with your own solution without using any special functions, to see your ability to analyse and solve problems (well, I do see interviews that just try to test candidates' memory). He may even tell you that you cannot use any of those special functions. Anyway, if I really could not figure out any implementation, I would probably just write down my ultimate answer/solution: Google

:D :D :D

Wednesday, August 10, 2011

php isset, is_null and ===null

This post is simply for fun.

What is the difference between PHP's isset and is_null? Let's see what the PHP manual states:

isset: Determine if a variable is set and is not NULL.

is_null: Finds whether the given variable is NULL.

It seems they serve exactly the same purpose, just in opposite ways:

$variable = null;

isset($variable) !== is_null($variable)

or

!(isset($variable)) === is_null($variable)

Is it that simple? Not really. One difference between isset and is_null is this: as its name suggests, isset can be used to check an undefined variable without PHP raising a NOTICE. But if we use is_null to check an undefined variable, a NOTICE will be raised.

//a NOTICE will be given by PHP

is_null($undefinedVar);

//no NOTICE

isset($undefinedVar);
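The difference can be observed without reading the error log by installing a temporary error handler. This is a sketch only; $messages and the other variable names are made up for the example, and the exact severity and wording of the message depend on the PHP version (a notice on older versions, a warning on PHP 8).

```php
<?php
// Collect every diagnostic PHP raises instead of printing it.
$messages = [];
set_error_handler(function ($errno, $errstr) use (&$messages) {
    $messages[] = $errstr;
    return true; // mark the error as handled
});

$setResult = isset($undefinedVar);    // false, and nothing is raised
$countAfterIsset = count($messages);

$nullResult = is_null($undefinedVar); // true, but PHP complains first
$countAfterIsNull = count($messages);

restore_error_handler();

var_dump($setResult, $nullResult, $countAfterIsset, $countAfterIsNull);
```

The two counters show that only the is_null call added a message to the list.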

In the PHP manual, there is a note for isset:

Note: Because this (isset) is a language construct and not a function, it cannot be called using variable functions

This tells us isset is a language construct like echo, for, and foreach. It is not a function. So here is another difference:

//this works

$func = 'is_null';

$func($variable);

//this doesn't work

$func = 'isset';

$func($variable);
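One way to see this from inside a script is function_exists, which only knows about real functions. If we do need something callable, a small closure can wrap isset. This is a sketch under that assumption; $isSet is a hypothetical name.

```php
<?php
// is_null is a real function; isset is a language construct.
var_dump(function_exists('is_null')); // bool(true)
var_dump(function_exists('isset'));   // bool(false)

// Hypothetical wrapper: a closure that can be passed around like a function.
$isSet = function ($value) {
    return isset($value);
};

$variable = null;
$name = 'henry';
var_dump($isSet($variable)); // bool(false)
var_dump($isSet($name));     // bool(true)
```

Note that the wrapper loses the main benefit of isset: passing an undefined variable to it raises the usual undefined-variable message, because the argument is evaluated before the closure runs.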

Also, is_null is a function, so it can take a function's return value as its argument, while isset cannot.

//this works

is_null(getVariable());

//this doesn't work

isset(getVariable());
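When we want an isset-style check on a return value, there are two simple ways around this limitation. The sketch below assumes a hypothetical getVariable() that returns null; the variable names are also made up.

```php
<?php
// Hypothetical function, used only for this example.
function getVariable()
{
    return null;
}

// Option 1: compare the return value with null directly.
$hasValue = getVariable() !== null;

// Option 2: store the result first; isset works on the variable.
$value = getVariable();
$valueIsSet = isset($value);

var_dump($hasValue, $valueIsSet); // both are false here
```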

Since isset is a language construct while is_null is a function, you can guess there is a performance difference between them. Using a language construct is more efficient than calling a function. To check the difference, I will use VLD to expose the opcodes (about VLD, see http://hengrui-li.blogspot.com/2011/07/review-php-opcode-with-vld.html):

test1.php

<?php

$name = null;

isset($name);

php -dvld.active=1 test1.php

line   #  *  op                      fetch  ext  return  operands
-----------------------------------------------------------------
   2   0  >  EXT_STMT
       1     ASSIGN                                      !0, null
   3   2     EXT_STMT
       3     ZEND_ISSET_ISEMPTY_VAR         5    ~1      !0
       4     FREE                                        ~1
   4   5  >  RETURN                                      1

As we can see, PHP compiles isset into a single opcode: ZEND_ISSET_ISEMPTY_VAR.

test2.php

<?php

$name = null;

is_null($name);

php -dvld.active=1 test2.php

line   #  *  op                      fetch  ext  return  operands
-----------------------------------------------------------------
   2   0  >  EXT_STMT
       1     ASSIGN                                      !0, null
   3   2     EXT_STMT
       3     EXT_FCALL_BEGIN
       4     SEND_VAR                                    !0
       5     DO_FCALL                       1            'is_null'
       6     EXT_FCALL_END
   4   7  >  RETURN                                      1

To call the is_null function, PHP needs these opcodes: EXT_FCALL_BEGIN, SEND_VAR, DO_FCALL, EXT_FCALL_END.

The result is quite obvious now: isset is more efficient. But sometimes checking for null simply makes more sense than checking isset, especially when we are checking a function's return value:

$name = $user->getName();

isset($name);

is_null($name);

For this example, we can still use isset($name) to check whether $name is null, but going by the meaning of the name 'isset', $name is obviously set here. What we really want to check is whether $name is null. In this situation (when we simply want to check whether a variable is null, without worrying about whether it is set), using $name === null is better than using is_null($name).

$name === null returns exactly the same result as is_null($name); however, $name === null is more efficient. It is almost as efficient as isset. Let's check the opcodes:

test3.php

<?php

$name = null;

$name === null;

php -dvld.active=1 test3.php

line   #  *  op                      fetch  ext  return  operands
-----------------------------------------------------------------
   2   0  >  EXT_STMT
       1     ASSIGN                                      !0, null
   3   2     EXT_STMT
       3     IS_IDENTICAL                        ~1      !0, null
       4     FREE                                        ~1
   4   5  >  RETURN                                      1

From a micro-optimization perspective: isset is better than === null, which is better than is_null.

I think the correct usage is: if we want to check whether a variable is "set" (or exists), use isset; if we want to check whether an existing variable's value is null, use === null.
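The claim that === null and is_null always agree is easy to check over a handful of values. A minimal sketch, with a sample list chosen only for illustration:

```php
<?php
// Sample values, including the "falsy" ones that == null would treat as null.
$samples = [null, 0, 0.0, '', '0', false, [], 'henry'];

$mismatches = 0;
foreach ($samples as $value) {
    // Count every value where the two checks disagree.
    if (($value === null) !== is_null($value)) {
        $mismatches++;
    }
}

var_dump($mismatches); // int(0)
```

Because the two checks are interchangeable in result, the choice between them comes down to readability and speed.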

The performance difference among them is a micro-optimization, so I had to run the comparison 5 million times to see the difference:

My testing code is quite simple:

<?php

$counter = 5000000;

$name = 'henry';

$start = microtime(true);

for($i=0; $i<$counter; $i++) {

isset($name);

}

$end = microtime(true);

echo $end - $start, "\n";
?>

I simply replace isset($name) with is_null($name) and $name === null and run the script for each. The results are:

isset: 1.62s

===null: 1.80s

is_null: 8.47s

It is a little beyond my expectation that is_null could be that much slower. Anyway, this is just a simple test for fun.
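For anyone who wants to repeat the test, the three loops can live in one script so they run under the same conditions. This is a sketch, not the script used for the numbers above: the $timings array and the smaller loop count are my own choices, and the results depend heavily on the PHP version and settings, so the gap may look very different on a newer PHP.

```php
<?php
$counter = 1000000; // smaller than the 5 million above so it finishes quickly
$name = 'henry';
$timings = [];

$start = microtime(true);
for ($i = 0; $i < $counter; $i++) {
    isset($name);
}
$timings['isset'] = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < $counter; $i++) {
    $name === null;
}
$timings['=== null'] = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < $counter; $i++) {
    is_null($name);
}
$timings['is_null'] = microtime(true) - $start;

// Print one line per check with the elapsed seconds.
foreach ($timings as $label => $seconds) {
    printf("%-9s %.4fs\n", $label, $seconds);
}
```

The checks are written inline on purpose: wrapping them in closures would add a function call to every iteration and hide the difference being measured.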