Sunday, July 31, 2011

I like Python philosophy

I like Python's philosophy. Especially this one, my favorite: "There should be one - and preferably only one - obvious way to do it".

I also like these: "Explicit is better than implicit", "Readability counts", "If the implementation is hard to explain, it's a bad idea".

More Python philosophy: http://c2.com/cgi/wiki?PythonPhilosophy

Wednesday, July 27, 2011

mysql run query from command line

I've often seen developers log in to mysql just to run a single, simple query. This is usually what they do:
mysql -uuser -ppassword db;
mysql> select * from users;

Instead, we can run a query straight from the command line with the -e option (long form: --execute=name), which executes the statement and quits. For example:

mysql -uuser -ppassword db -e "select * from users";

or

mysql -uuser -ppassword -e "select * from db.users";

Usually, when we want to install some open source PHP software, we need to create the database first. We can create a mysql database from the command line as well:

mysql -uuser -ppassword -e "create database db";

javascript prototype only used for properties inheritance

Check this code:

function Person(name) {
    this.name = name;
}
Person.prototype.type = 'human';
var instance = new Person('henry');
console.log(instance.type); // 'human'
console.log(Person.type);   // undefined

It may be confusing that Person.type is undefined while an instance of Person has the correct type. The explanation, quoted from http://www.mollypages.org/misc/js.mp, is:

"The prototype is only used for properties inherited by objects/instances created by that function. The function itself does not use the associated prototype."

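To see where each lookup actually goes, here is a minimal sketch that inspects both prototype chains with Object.getPrototypeOf. The variable names instanceProto and functionProto are only for illustration.

```javascript
function Person(name) {
  this.name = name;
}
Person.prototype.type = 'human';

const instance = new Person('henry');

// The instance's prototype chain starts at Person.prototype...
const instanceProto = Object.getPrototypeOf(instance);
// ...while the function's own chain starts at Function.prototype.
const functionProto = Object.getPrototypeOf(Person);

console.log(instanceProto === Person.prototype);   // where lookups on instance go
console.log(functionProto === Function.prototype); // where lookups on Person go
console.log('type' in instance, 'type' in Person);
```

Because Person.prototype is not part of the function's own chain, a property stored there is visible on instances only.
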
Tuesday, July 26, 2011

javascript 'this' keyword

Let's remember this rule: by default, javascript's 'this' always refers to the owner of the function or the object that has the function as a member. With this rule in mind, let's take a closer look at 'this'.

1. 'this' in a global function

var name = 'henry';
function sayHello() {
    console.log('hello ' + this.name);
}
sayHello();

In this example, what does 'this' refer to? Based on the rule, let's ask: who is the owner of the sayHello() function? The answer is window (assuming a browser environment), so 'this' refers to window. In fact, every so-called 'global' variable or function declared this way is a member of the window object. So the example simply logs 'hello henry'. And, as we probably know, inside sayHello() we could simply write console.log('hello ' + name);

2. 'this' in an event handler function

<html>
<head>
<script>
function alertName() {
    alert(this.value);
}
window.onload = function(){
    document.getElementById('name').onclick = alertName;
}
</script>
</head>
<body>
<input id="name" type="text"  name="name" value="my name is henry" />
</body>
</html>

Guess what? The code works perfectly fine. Back to the rule: 'this' always refers to the function's owner. In this case the owner of alertName() is window, so alert(this.value) should show 'undefined', right? But it does show the value of the input field. Let's dig deeper. In javascript, almost everything is an object. A function is an object too and can be assigned to a variable or a property. document.getElementById('name') returns the DOM object whose id is 'name'. If we break document.getElementById('name').onclick = alertName into two steps, they are:

var domObj = document.getElementById('name');
domObj.onclick = alertName;

So what we actually do is assign a reference to alertName to domObj's onclick property. When we click the input field, domObj.onclick is called. Who is the owner of onclick? The answer is domObj, so 'this' actually refers to domObj. To make this clearer, we can alert(domObj.onclick); the output is:

function alertName() {
  alert(this.value);
}

OK, so far so good. Now let's change the example to this:

<html>
<head>
<script>
function alertName() {
    alert(this.value);
}
</script>
</head>
<body>
<input id="name" type="text"  name="name" value="my name is henry" onclick="alertName()"/>
</body>
</html>

This time, it doesn't work. What is happening now? Let's alert(document.getElementById('name').onclick); and we see the output:

function onclick() {
     alertName();
}

See the difference? Now onclick is actually a function that invokes alertName(). So in this case, 'this' inside alertName is pointing to the window object.
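The two cases can be reproduced outside the browser with a plain object. This is only a sketch: fakeInput is a hypothetical stand-in for the input element, and readValue returns a marker string instead of alerting so that it behaves the same in strict and non-strict code.

```javascript
// Returns the value found on `this`, or a marker when `this` has no such property.
function readValue() {
  return (this && this.value !== undefined) ? this.value : '(no value on this)';
}

// Hypothetical stand-in for the <input> element.
const fakeInput = { value: 'my name is henry' };

// Case 1: the function itself becomes the handler, so the element owns it.
fakeInput.onclick = readValue;
const direct = fakeInput.onclick();

// Case 2: the handler is a wrapper that calls readValue() as a plain function,
// which is what the onclick="alertName()" attribute produces.
fakeInput.onclick = function () {
  return readValue();
};
const wrapped = fakeInput.onclick();

console.log(direct);
console.log(wrapped);
```

In the second case the wrapper is owned by the element, but the inner call has no owner, so 'this' falls back to the global object (or is undefined in strict mode).
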

We can actually change what 'this' refers to if we want. One example is to use 'apply' or 'call'. See http://hengrui-li.blogspot.com/2011/05/javascript-callback-function-scope.html
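As a small illustration of setting 'this' explicitly, the sketch below uses call and apply, plus bind, which the post does not mention but works along the same lines. The greet function and the two objects are made up for the example.

```javascript
function greet(greeting, punctuation) {
  return greeting + ' ' + this.name + punctuation;
}

const henry = { name: 'henry' };
const alice = { name: 'alice' };

// call: `this` first, then the arguments one by one.
const viaCall = greet.call(henry, 'hello', '!');

// apply: `this` first, then the arguments as an array.
const viaApply = greet.apply(alice, ['hi', '?']);

// bind: returns a new function whose `this` is fixed.
const greetHenry = greet.bind(henry);
const viaBind = greetHenry('hey', '.');

console.log(viaCall, viaApply, viaBind);
```
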

Monday, July 25, 2011

mysql innodb row level locking

We know that MySQL's InnoDB engine provides row-level locking, while MyISAM can only lock whole tables. But InnoDB's row-level locking may not behave the way you expect. InnoDB doesn't really lock rows of data; instead, it sets locks on every index record that is scanned while processing the SQL statement. This means InnoDB locks only the rows you ask for when the statement can use an index to find them. Otherwise it has to scan the whole table and locks every row it scans, which in practice behaves like a table-level lock. If we don't pay attention to this, we may end up with lots of lock conflicts in our application.

Let's do some simple tests.

mysql> create table no_index_lock_test(id int,name varchar(10)) engine=innodb;
Query OK, 0 rows affected (0.15 sec)
mysql> insert into no_index_lock_test values(1,'henry'),(2,'alice'),(3,'bob'),(4,'jack');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0

Let's start two sessions, session_1 and session_2, and run the following statements in this order.

session_1:
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from no_index_lock_test where id = 1 ;
+------+---------+
| id   | name    |
+------+---------+
| 1    | 'henry' |
+------+---------+
1 row in set (0.00 sec)

session_2:
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from no_index_lock_test where id = 2 ;
+------+---------+
| id   | name    |
+------+---------+
| 2    | 'alice' |
+------+---------+
1 row in set (0.00 sec)

session_1:
mysql> select * from no_index_lock_test where id = 1 for update;
+------+---------+
| id   | name    |
+------+---------+
| 1    | 'henry' |
+------+---------+
1 row in set (0.00 sec)

session_2:
mysql> select * from no_index_lock_test where id = 2 for update;

Will keep waiting...

This example shows that without a usable index InnoDB ends up locking the whole table. In session 1 it looks like we only lock one row (where id = 1 for update). But since there is no index on the id column, InnoDB scans and locks every row, so when session 2 tries to lock a different row (where id = 2 for update), it has to wait until session 1 releases its locks.

Now let's add an index on the id column.
mysql> create table index_lock_test(id int,name varchar(10)) engine=innodb;
Query OK, 0 rows affected (0.15 sec)
mysql> alter table index_lock_test add index id(id);
Query OK, 4 rows affected (0.24 sec)
mysql> insert into index_lock_test values(1,'henry'),(2,'alice'),(3,'bob'),(4,'jack');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
Again we run the statements in two sessions, in this order.

session_1:
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from index_lock_test where id = 1 ;
+------+---------+
| id   | name    |
+------+---------+
| 1    | 'henry' |
+------+---------+
1 row in set (0.00 sec)

session_2:
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from index_lock_test where id = 2 ;
+------+---------+
| id   | name    |
+------+---------+
| 2    | 'alice' |
+------+---------+
1 row in set (0.00 sec)

session_1:
mysql> select * from index_lock_test where id = 1 for update;
+------+---------+
| id   | name    |
+------+---------+
| 1    | 'henry' |
+------+---------+
1 row in set (0.00 sec)

session_2:
mysql> select * from index_lock_test where id = 2 for update;
+------+---------+
| id   | name    |
+------+---------+
| 2    | 'alice' |
+------+---------+
1 row in set (0.00 sec)

This time, InnoDB is using row level lock.
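
The difference between the two tests can be pictured with a toy model. This is only a sketch of the idea that locks follow the records a statement scans; it is not how InnoDB is implemented and it ignores gap and next-key locks. The function names scannedRowIds and locksConflict are invented for the illustration.

```javascript
const rows = [
  { id: 1, name: 'henry' },
  { id: 2, name: 'alice' },
  { id: 3, name: 'bob' },
  { id: 4, name: 'jack' }
];

// Which rows does "select ... where id = ? for update" touch (and therefore lock)?
// With an index on id, only the matching records are scanned.
// Without one, every row has to be scanned.
function scannedRowIds(table, id, hasIndexOnId) {
  const scanned = hasIndexOnId ? table.filter(function (r) { return r.id === id; }) : table;
  return new Set(scanned.map(function (r) { return r.id; }));
}

// Two sessions block each other when their locked row sets overlap.
function locksConflict(a, b) {
  for (const id of a) {
    if (b.has(id)) return true;
  }
  return false;
}

const noIndexConflict = locksConflict(scannedRowIds(rows, 1, false), scannedRowIds(rows, 2, false));
const indexConflict = locksConflict(scannedRowIds(rows, 1, true), scannedRowIds(rows, 2, true));

console.log(noIndexConflict, indexConflict);
```
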

Sunday, July 24, 2011

Quality Assurance Tools for PHP

1. PHPUnit
No need to introduce this one, I believe. It can also be integrated into a continuous integration tool.

I don't find this one very valuable.

3. phpcpd: PHP Copy-Paste-Detector (http://github.com/sebastianbergmann/phpcpd)
This one is interesting. Let's have a look at it.
To install:
sudo pear channel-discover pear.phpunit.de;
sudo pear install phpunit/phpcpd;

To check for copy & paste code:
phpcpd /var/www/project/application

The result looks like this:
phpcpd 1.3.2 by Sebastian Bergmann.

Found 1 exact clones with 10 duplicated lines in 2 files:

  - modules/user/login.php:28-38
    modules/user/edit.php:28-38

0.03% duplicated lines out of 35460 total lines of code.

Time: 2 seconds, Memory: 19.50Mb

Just remember that its findings don't always make sense. As a tool, it can only help you find possible copy & paste code. In the end it is up to you, the developer, to judge whether the duplication is a problem.

4. phpdcd: PHP Dead Code Detector (http://github.com/sebastianbergmann/phpdcd)
Many developers are very cautious when it comes to deleting code. In fact, they usually prefer not to delete anything, even when a function/method is no longer used. The good side of not deleting is peace of mind, because you know you never accidentally delete something that is still needed. The bad side is that the system grows rapidly and fills up with dead code.

This tool can help you detect dead code. BUT I have to warn you again that you should not rely blindly on its results. It cannot detect functions that are called implicitly/dynamically, a practice that should be avoided as much as possible. For example:
function test(){};
$fn = 'test';
$fn();
If your function is called in this way, phpdcd cannot correctly recognize it. However, you can still use the tool to help you find dead code. Just ensure you confirm the code is really dead before you delete it.

To install:
sudo pear channel-discover pear.phpunit.de;
sudo pear channel-discover components.ez.no
sudo pear install phpunit/phpdcd-beta

To use:
phpdcd /var/www/project/application

And you will get a list of functions that are considered 'dead'

5. PHP_Depend: pdepend (http://pdepend.org)
6. PHP Mess Detector (phpmd)
7. PHP_CodeSniffer (phpcs)
8. PHP_CodeBrowser (phpcb)

These tools can be integrated into the continuous integration tool

9. phpUnderControl (http://phpundercontrol.org/)
This is the most widely used continuous integration tool in the PHP world.

Hudson is another solution for continuous integration. phpUnderControl is based on CruiseControl; however, CruiseControl is outdated, and Hudson is more robust and easier to handle. Hudson has since been renamed to Jenkins.

Installing Jenkins on Ubuntu is easy:
wget -q -O - http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | sudo apt-key add -
sudo sh -c "echo 'deb http://pkg.jenkins-ci.org/debian binary/' > /etc/apt/sources.list.d/jenkins.list"
sudo apt-get update
sudo apt-get install jenkins

After the installation is done, you can try Jenkins locally at http://localhost:8080. It provides a very nice web interface for managing your projects. You can go to "Manage Jenkins" -> "Manage Plugins" to install the plugins you need.

Now you can read
and
and
to learn more about how to integrate PHP project with Jenkins

Thursday, July 21, 2011

review PHP opcode with VLD

First of all, this is simply for fun.

Now let's see what VLD is. Quoting from a book, VLD is "a plug-in for the Zend Engine that displays all Opcodes a script uses when executed. VLD allows us to look under the hood as to what each function is doing and which system calls it's making, and most importantly it allows us to compare what may seem like a similar PHP function on the surface, lighter vs. heavier function calls".

To install:
pecl install channel://pecl.php.net/vld-0.10.1
And then add one line to your php.ini:
extension=vld.so

To show the opcodes of a script:
php -dvld.active=1 script.php

A small example:
Let's create three files called echo1.php, echo2.php, echo3.php. They all do the same thing on the surface.
echo1.php:

<?php
$name = 'henry';
echo 'hello ' . $name;

Now let's run php -dvld.active=1 echo1.php. We see the opcodes:
line     #  *  op            fetch  ext  return  operands
---------------------------------------------------------
   2     0  >  EXT_STMT
         1     ASSIGN                            !0, 'henry'
   3     2     EXT_STMT
         3     CONCAT                    ~1      'hello+', !0
         4     ECHO                              ~1
   4     5  >  RETURN                            1

Pretty cool, isn't it? Now we can see that PHP does the ASSIGN of 'henry' first, then a CONCAT, and then ECHOes the result. The whole script compiles to six opcodes (numbered 0 to 5). Now, move on to echo2.php:
echo2.php:

<?php
$name = 'henry';
echo 'hello ' , $name;

The only difference is the echo statement, where we use a comma ',' instead of a period '.'. Let's see the opcodes:
line     #  *  op            fetch  ext  return  operands
---------------------------------------------------------
   2     0  >  EXT_STMT
         1     ASSIGN                            !0, 'henry'
   3     2     EXT_STMT
         3     ECHO                              'hello+'
         4     ECHO                              !0
   4     5  >  RETURN                            1

Aha, now you can understand why people say that using a comma in an echo statement is faster than using a period. If we use a period to concatenate the strings, PHP has to do a CONCAT and then an ECHO. In echo2.php, PHP simply performs two ECHO operations, and ECHO is cheaper than CONCAT. The more '.' we use, the more CONCAT operations we get.

Now, echo3.php:
<?php
$name = 'henry';
echo "hello $name";

The opcodes are:
line     #  *  op            fetch  ext  return  operands
---------------------------------------------------------
   2     0  >  EXT_STMT
         1     ASSIGN                            !0, 'henry'
   3     2     EXT_STMT
         3     ADD_STRING                ~1      'hello+'
         4     ADD_VAR                   ~1      ~1, !0
         5     ECHO                              ~1
   4     6  >  RETURN                            1

In echo3.php, we find that we need one more opcode than in the other two. But keep in mind that a few more operations don't necessarily mean the script takes more time. Anyway, I am not a fan of micro-optimization. In this example, if I see someone write code like echo1.php, I won't ask them to change it for the sake of optimization. But if "use a comma instead of a period in echo statements" is a coding convention, then I may ask them to change it to follow the convention.

Wednesday, July 20, 2011

learning jQuery source code (v1.6.2) - 2

Continuing from the last post (http://hengrui-li.blogspot.com/2011/07/learning-jquery-source-code-v162.html), we know that the jQuery object is actually generated by the code below:
var jQuery = function( selector, context ) {
     // The jQuery object is actually just the init constructor 'enhanced'
     return new jQuery.fn.init( selector, context, rootjQuery );
}
Today we will learn how the jQuery.fn.init function works. When I look at a function, I always look at its signature first, and then at what it returns. At this stage, I don't have to worry about its implementation details.

jQuery.fn = jQuery.prototype = {
    constructor: jQuery,
    init: function( selector, context, rootjQuery ) {
        ...
        return jQuery.makeArray( selector, this );
    },
    ...
}
Well, jQuery is an excellent js lib, but I still have to say I don't like functions without any documentation comment at the top.

We see the function takes three parameters. 'selector' probably is the easiest one to understand: when we call $("#id"), "#id" is the selector. Let's leave the other two parameters for now.

Next we look at what the function returns on its last line: "return jQuery.makeArray( selector, this );". Usually I look at the last return statement to get a first impression of what a function could return. However, it is also quite common for a function to have multiple return statements. I think it is good practice to avoid multiple return statements within one function, but sometimes they do make our life easier, so I can accept them. One thing I really insist on is that a function's return type must be consistent. If a function is supposed to return a boolean, it should only return true or false. It is bad if it returns true in one place, null in another, and an empty string '' in a third. In general, a function should return only one type of data. "return jQuery.makeArray( selector, this );" tells us the function could return an array (object). I don't know yet what jQuery.makeArray really does, but that doesn't matter as long as its name is not misleading.

Now let's look at the implementation details step by step.
code 1:
// Handle $(""), $(null), or $(undefined)
if ( !selector ) {
     return this;
}
The comment is quite clear here: if we call $("") or something similar, init simply returns this.

code 2:
// Handle $(DOMElement)
if ( selector.nodeType ) {
    this.context = this[0] = selector;
    this.length = 1;
    return this;
}

Again, the comment tells all. To see what it does, we can write some logs in the code, and try:
var dom = document.getElementById('id');
$(dom);

From this code we can see that jQuery makes this[0] refer to the DOM element we passed in.
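
To try the first two branches without a browser, here is a stripped-down sketch of the same pattern. The name mini and the object fakeElement are hypothetical; only the shape (a factory function, an init constructor, and init sharing the factory's prototype) follows what jQuery does.

```javascript
// `mini` is a made-up miniature, not jQuery itself.
function mini(selector) {
  // Like jQuery: the real constructor is init, so callers never need `new`.
  return new mini.fn.init(selector);
}

mini.fn = mini.prototype = {
  constructor: mini,
  length: 0,
  init: function (selector) {
    // Handle mini(""), mini(null), or mini(undefined)
    if (!selector) {
      return this;
    }
    // Handle mini(DOMElement): store the element at index 0, array-style
    if (selector.nodeType) {
      this.context = this[0] = selector;
      this.length = 1;
      return this;
    }
    return this;
  }
};

// Give init the same prototype, so objects built by init are `mini` objects.
mini.fn.init.prototype = mini.fn;

// Hypothetical stand-in for a DOM element (nodeType 1 means element node).
const fakeElement = { nodeType: 1, id: 'id' };
const wrappedElement = mini(fakeElement);
const emptyResult = mini('');

console.log(wrappedElement[0] === fakeElement, wrappedElement.length, emptyResult.length);
```
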

code 3:
// The body element only exists once, optimize finding it
if ( selector === "body" && !context && document.body ) {
    this.context = document;
    this[0] = document.body;
    this.selector = selector;
    this.length = 1;
    return this;
}
This simply optimizes the way to find body element.

code 4:
// Handle HTML strings
if ( typeof selector === "string" ) {
...
}
This is the most complex block because it handles the most common tasks, such as $("#id") and $('input[name="name"]'). We will look at it next time.

Tuesday, July 19, 2011

learning jQuery Source Code (v1.6.2)

We can learn something from the name jQuery. 'j' means javascript, and 'Query' tells us what this library does or what it is good at. An interesting question is why jQuery became so popular even though it came after other mature javascript libraries such as YUI and prototype.

To answer this question, let's think back to what we usually use javascript for in web development. Mostly, we do these things: 1. Find a DOM element using getElementById and, once we have it, get or set its value; 2. Set a DOM element's content by setting innerHTML; 3. Listen for DOM element events, such as 'click'; 4. Use AJAX to get data from the backend and update DOM elements' content; 5. Update DOM elements' CSS values.

So most of our tasks are related to DOM elements and these tasks can be divided into two groups: 1. Search/query DOM elements; 2. DOM elements manipulation.

For javascript gurus, writing 'document.getElementById' or 'document.getElementsByTagName' may not be a problem, and manipulating DOM elements is not hard either. But supporting different browsers such as IE and Mozilla can be a headache for every javascript developer. jQuery doesn't try to do everything. It simply makes DOM manipulation easier and cross-browser support less painful.

We can see that the whole jQuery code is wrapped in a self-executing function (or immediate function). The obvious benefit is that no global variables are left behind, so you don't have to worry that your library code (temporary variables) may accidentally pollute the global space. We can see that "window" is passed into this self-invoking function as a parameter. In a browser environment, window is the top-level global object, and every other global is actually a property of the window object.
code 1:
(function( window, undefined ) {
...
})(window);

Inside the immediate function, we find another immediate function. This function returns a value to the jQuery variable.
code 2:
var jQuery = (function() {
...
})();

We know that jQuery is the single object we can use straightaway (How about "$"? Well, $ is exactly the same jQuery object). We can see the last line in code 1:
(function( window, undefined ) {
...
// Expose jQuery to the global object
window.jQuery = window.$ = jQuery;
})(window);

This creates two global names, jQuery and $, which both refer to the same object. So once we include the jQuery library, we can use either jQuery or $ to invoke its methods.
Continue looking through code 2:
var jQuery = (function() {
    var jQuery = function( selector, context ) {
            return new jQuery.fn.init( selector, context, rootjQuery );
        },
    ...
})();

Don't get confused. The second "var jQuery" is simply another local variable; note that it is inside the immediate function's scope. If it makes things clearer, you can imagine the code as:
var jQuery = (function() {
    var localJquery = function( selector, context ) {
            ...
        },
    ...
})();
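
The whole wrapping pattern can be tried in a few lines. This is a sketch under assumptions: fakeWindow stands in for the browser's window so the code also runs outside a browser, and tiny and callCount are invented names.

```javascript
// Hypothetical stand-in for the browser's global `window` object.
const fakeWindow = {};

(function (window, undefined) {
  var tiny = (function () {
    // Local to the inner immediate function; invisible outside it.
    var tiny = function (selector) {
      return { selector: selector };
    };
    var callCount = 0; // a temporary helper variable that never leaks out
    return tiny;
  })();

  // Expose the library under two names, as jQuery does with jQuery and $.
  window.tiny = window.$ = tiny;
})(fakeWindow);

console.log(fakeWindow.tiny === fakeWindow.$);
console.log(typeof tiny, typeof callCount);
```

Only the two properties set on the passed-in object survive; everything declared with var inside the functions stays private.
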

Next time, we will look through this jQuery.fn.init( selector, context, rootjQuery ) function.

Monday, July 18, 2011

Node.js for beginners

For those who are new to server side javascript and node.js, here is an excellent tutorial: http://www.nodebeginner.org/

Read through it and you will understand how it works. I believe you will start to like it.

Thursday, July 14, 2011

zend framework _forward utility method

From the official document:"_forward($action, $controller = null, $module = null, array $params = null): perform another action. If called in preDispatch(), the currently requested action will be skipped in favor of the new one. Otherwise, after the current action is processed, the action requested in _forward() will be executed."

I mention this method here because it seems many people haven't noticed the last sentence: "Otherwise, after the current action is processed, the action requested in _forward() will be executed". And I have seen people write code like this in action methods several times:

if ($condition) {
    $this->_forward($action);
}
//still do some tasks here
$this->doSomething();

It is easy to assume that once we forward a request, the rest of the code won't get executed. Unfortunately, this is not necessarily true: _forward() only registers the next action, and the current action method keeps running until it returns, which makes it different from _redirect(). So design your code carefully to ensure that the rest of the action is skipped once you forward the request, for example by returning right after the _forward() call.

Wednesday, July 13, 2011

a question regarding PHP string & array

$data['first'] = 'Hello world';
$data['first']['second'] = 'Good morning';
//what is the result of following output?
echo $data['first'];

The result is 'Gello world'. Why is that?

First, when PHP has to use a non-numeric string as a number, it treats it as 0. Try this:

//print 0
echo (int)'second';
//print 2
echo (int)'2';

So, $data['first']['second'] is actually $data['first'][0]. 

Second, when we treat a string as an array (just like a char array in C) and access its elements by index, as in $data['first'][0], the expression refers to one single character, which is 'H' initially. Even if we try to assign 'Good morning' to $data['first'][0], PHP only takes the first character, which is 'G'. So $data['first'] becomes 'Gello world'.

Tuesday, July 12, 2011

PHP truncate words(shorten text string)

We want to truncate/shorten a string of text to a specific number of characters and add three dots (...) to the end. For example, we have a string "For he was looking forward to the city with foundations, whose architect and builder is God."; and we want to show it as "For he was looking forward to the city with foundations..."

This requirement can be made more generic: we want to truncate the text to at most $max characters and add $symbol to the end. For our example, $max = 60 and $symbol = '...'

Let's see how we can do it.

function truncate($text, $max, $symbol)
{
    //step 1: we get the part of the text with maximum $max number of characters
    $sub = substr($text, 0, $max);
    //step 2: we find the position of the last occurrence of space
    //   this is where we start to truncate the words
    $last = strrpos($sub, ' ');
    //step 3: get the sub string from beginning to the truncate position
    $sub = substr($sub, 0, $last);
    //step 4: pad the sub string with $symbol and return
    return $sub . $symbol;
}

We can try this function
$text = "For he was looking forward to the city with foundations, whose architect and builder is God.";
echo truncate($text, 28, '...');
The result is: For he was looking forward...

The truncate function serves well in most cases and is probably the most widely used approach, but it does have a minor flaw. If we try
$text = "For he was looking forward to the city with foundations, whose architect and builder is God.";
echo truncate($text, 60, '...');
The result is: For he was looking forward to the city with foundations,...

See the last comma before '...'? If you can accept that, then it is fine. But if we don't want to leave punctuation or other non-word character as the final character, we need to modify our truncate function a bit.

function truncate($text, $max, $symbol)
{
    //step 1: we get the part of the text with maximum $max number of characters
    $sub = substr($text, 0, $max);
    //step 2: we find the position of the last occurrence of space
    //   this is where we start to truncate the words
    $last = strrpos($sub, ' ');
    //step 3: get the sub string from beginning to the truncate position
    $sub = substr($sub, 0, $last);
    //step 4: remove a trailing non-word character, pad the sub string with $symbol and return
    $sub = preg_replace("/([^\w])$/", "", $sub);
    return $sub . $symbol;
}

The difference is $sub = preg_replace("/([^\w])$/", "", $sub); This will replace any non-word character([^\w]) at the end($) with empty string(""). Now let's try again:
$text = "For he was looking forward to the city with foundations, whose architect and builder is God.";
echo truncate($text, 60, '...');
The result is: For he was looking forward to the city with foundations...
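
For comparison, here is a JavaScript sketch of the same four steps. It is a port for illustration, not the PHP function above, and it adds three things that are assumptions of mine rather than part of the post: text that already fits within max is returned unchanged, text with no space in the first max characters is cut at max instead of becoming empty, and a whole run of trailing non-word characters is removed rather than just one. The name truncateWords is made up.

```javascript
function truncateWords(text, max, symbol) {
  // Added guard: nothing to truncate.
  if (text.length <= max) {
    return text;
  }
  // Step 1: take at most `max` characters.
  let sub = text.slice(0, max);
  // Step 2: find the last space; this is where we cut.
  const last = sub.lastIndexOf(' ');
  // Step 3: cut at the last space (added guard: keep `sub` if there is none).
  if (last > 0) {
    sub = sub.slice(0, last);
  }
  // Step 4: drop trailing non-word characters, then append the symbol.
  sub = sub.replace(/\W+$/, '');
  return sub + symbol;
}

const verse = 'For he was looking forward to the city with foundations, whose architect and builder is God.';

console.log(truncateWords(verse, 28, '...'));
console.log(truncateWords(verse, 60, '...'));
console.log(truncateWords('short text', 60, '...'));
```
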

Monday, July 11, 2011

scatter thought about programming language popularity

What is the most popular web programming language? For the backend, we have a lot of options such as python, c#, java etc. But no doubt, PHP is the answer to the question. For the frontend, we still have some options, but javascript obviously wins.

However, if we are only talking about languages, in my opinion, both PHP and javascript are awful. 

If you know how Brendan Eich created javascript, you will probably be disappointed in the language. This is what he said about it: "ok, back to JavaScript popularity. We know certain Ajax libraries are popular. Is JavaScript popular? It's hard to say. Some Ajax developers profess (and demonstrate) love for it. Yet many curse it, including me. I still think of it as a quickie love-child of C and Self. Dr. Johnson's words come to mind: 'the part that is good is not original, and the part that is original is not good.'"

As for PHP, even in PHP5 you can easily find lots of awful parts. A simple example:
class AwfulTest
{
    public function showOff()
    {
        echo __CLASS__;
    }
}

//this works!
AwfulTest::showOff();

This simple example shows a couple of awful parts of PHP:
1. An instance method can be called statically as a class method, which should not be allowed (I believe).
2. There is none of the warnings you might expect, even when error reporting is set to E_ALL. E_ALL doesn't really mean 'all': to see the message, you have to turn on E_STRICT explicitly.

However, being awful doesn't stop them from becoming the most popular programming languages. So what are the reasons for their popularity?

The first reason, obviously, is that they are easy to pick up. Compared with learning other languages like java or c++, the learning curve for PHP & javascript is gentle. I think this is the reason most people give for their popularity. But I think the following reasons contribute more.

Second, they only try to solve one problem, or I should say they only focus on one thing: web programming. PHP works at the backend and javascript at the frontend, but both are designed to solve web programming issues.

Third, the web itself. The web's popularity helped php & js become popular. If PHP and JS had focused on desktop programming, they would never have become as popular as they are today. Fortunately, they were designed to solve web programming issues.

Given these reasons, if we want to replace them with other 'better' languages, those languages must be: 1. Easy to learn; 2. Good at solving web programming issues. Also, the prerequisite is that the web must stay popular or become even more popular.

So what other languages may become more popular than PHP in web programming? Based on the above criteria, none. Other languages may be better in other aspects, but at the moment I can't find one language especially designed for web programming and also as easy as PHP. Python? Obviously not designed for web programming. Was Ruby designed for web programming from the beginning? I don't think so. They are all general-purpose and try to be able to do everything. PHP is declared to be general-purpose but, obviously, the web is what it focuses on. Is anyone trying to use PHP-GTK to create desktop applications?

On the frontend, we have flash/flex/actionscript, silverlight and javafx. They are designed for the frontend. But in my opinion, none of them was designed to solve the most common frontend issue: DOM manipulation. And none of them is as easy to learn as javascript (although they are not hard).

The interesting thing is, now I believe the language that can most likely replace PHP is javascript! 

PHP highlight search keywords

Sometimes we may wish to highlight the search keywords(search terms) in the searching result.

For example, there is a string "A cat is a mini tiger. Actually, cat and tiger belong to the same category". We may want to highlight all the 'cat' within this string, but we obviously don't want to highlight the 'cat' in 'category'.

Here is the result of our highlighting.

A <u>cat</u> is a mini tiger. Actually, <u>cat</u> and tiger belong to the same category

Here is how we can do it:

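$string = "A cat is a mini tiger. Actually, cat and tiger belong to the same category";
// preg_quote() escapes regex metacharacters, so a keyword like "c++" is matched literally
$match  = preg_quote("cat", "/");
$string = preg_replace("/(\b)($match)(\b)/i", "$1<u>$2</u>$3", $string);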

\b is word boundary. This will ensure we only match the 'cat' as a whole word. It can match "cat" or "cat's", but it won't match "category".

If we want to make the word italic instead, we can do:
$string = preg_replace("/(\b)($match)(\b)/i", "$1<i>$2</i>$3", $string);
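
The same idea can be sketched in JavaScript for highlighting on the client side. The helper names escapeRegExp and highlight are invented for this example, and the tag parameter is an assumption that lets one function cover both the underline and the italic case.

```javascript
// Escape regex metacharacters so the keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every whole-word, case-insensitive occurrence of `keyword` in <tag>...</tag>.
function highlight(text, keyword, tag) {
  const pattern = new RegExp('\\b(' + escapeRegExp(keyword) + ')\\b', 'gi');
  return text.replace(pattern, '<' + tag + '>$1</' + tag + '>');
}

const sentence = 'A cat is a mini tiger. Actually, cat and tiger belong to the same category';

console.log(highlight(sentence, 'cat', 'u'));
console.log(highlight(sentence, 'cat', 'i'));
```
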

Sunday, July 10, 2011

rewrite rule in a .htaccess file

I've been using Zend Framework for a while but never spent time figuring out how the rewrite rules work. I simply copied them from other places and, as long as they worked, I didn't look at them. But now let's have a careful look at them.

ErrorDocument 404 /index.php
DirectoryIndex index.php
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1

Let's check through the rules line by line.

ErrorDocument 404 /index.php tells Apache that the index.php file should deal with any 404 (file not found) errors.

DirectoryIndex index.php tells Apache that the default file in a directory is index.php, instead of something else such as index.html. So if a request doesn't specify a filename, index.php is called automatically.

RewriteEngine on: this one simply turns on the rewrite engine. You must ensure that Apache's rewrite module (mod_rewrite) is enabled.

RewriteCond %{REQUEST_FILENAME} !-f says: if the request is for a file that exists, don't apply the rewrite rule. Otherwise any requests for images, css and javascript files would be routed through index.php as well, which is not what we expect.

RewriteCond %{REQUEST_FILENAME} !-d says: if the request is for a directory that exists, don't apply the rewrite rule.

RewriteRule ^(.*)$ index.php/$1 is our final rewrite rule. The pattern ^(.*)$ captures the whole URL path (in a .htaccess file that is the path relative to the directory, without the leading slash and without the query string) and appends it to index.php (index.php/$1). The query string is passed through unchanged. So a URL like 'http://domain.com/user/login/?username=test&password=test' will be rewritten to 'http://domain.com/index.php/user/login/?username=test&password=test'.

Thursday, July 7, 2011

javascript semicolon insertion

I mentioned javascript semicolon insertion in my post: http://hengrui-li.blogspot.com/2011/04/too-much-language-flexibility-good-or.html. But I think this topic deserves more attention.

In javascript, semicolons are optional. We had better understand how javascript handles a statement without a semicolon at the end.

Usually, if javascript cannot parse the code without a semicolon, it will treat the line break as a semicolon. Let's check this code:

var name
name
=
'henry'

Javascript will interpret the code as:

var name;
name = 'henry';

Since javascript cannot parse the code 'var name name', it treats the line break after 'var name' as a semicolon, so the first line becomes 'var name;'. Javascript can parse the code "name = 'henry'", so it does not insert a semicolon after the second 'name'.

So the general rule is that javascript will treat a line break as a semicolon if the next non-space character cannot be parsed as a continuation of the current statement.

This awful rule can cause awful results, for example:

var a = 3, b=5;
var func = function(v)
{
    alert(v);
    return v+10;
};
var f = func
(a+b).toString();
console.log(f);

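Javascript interprets the code as: var a = 3, b=5; var func = function(v) { alert(v); return v+10;}; var f = func (a+b).toString(); console.log(f);
See the code var f = func (a+b).toString()? Because the next line starts with '(', func is called with a+b, and f ends up holding the string result of that call instead of the function itself. That is probably not what you want.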
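
To see the difference an explicit semicolon makes, here is a minimal sketch that can be run outside the browser. addTen is a hypothetical stand-in for func above (without the alert), and the two variable names are made up for the illustration.

```javascript
// Hypothetical stand-in for func from the post, without the alert().
var addTen = function (v) {
    return v + 10;
};
var a = 3, b = 5;

// No semicolon after addTen: the next line starts with "(", so the two
// lines are parsed as one statement, addTen(a + b).toString().
var withoutSemicolon = addTen
(a + b).toString();

// With an explicit semicolon these are two separate statements.
var withSemicolon = addTen;
(a + b).toString();

console.log(withoutSemicolon);      // "18", the result of the call
console.log(typeof withSemicolon);  // "function"
```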

That is not the end of the story yet. There are two exceptions to the general rule. The first is about 'return', 'break' and 'continue' statements. Javascript always interprets a line break directly after one of these keywords as a semicolon. So, if we have:

return 
{}

Javascript will parse it as return; {}; instead of return {};. The second exception is the ++/-- operators. For example:
x
++
y
Javascript reads the code as x; ++y; instead of x++; y;
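
Both exceptions can be checked with a small sketch. brokenReturn and fixedReturn are hypothetical function names, and the object literal { ok: true } is only there to make the effect visible.

```javascript
// A line break right after "return" ends the statement, so this function
// returns undefined. The next line is parsed as a separate block.
function brokenReturn() {
    return
    { ok: true };
}

// Keeping the opening brace on the same line as "return" returns the object.
function fixedReturn() {
    return {
        ok: true
    };
}

console.log(brokenReturn());   // undefined
console.log(fixedReturn().ok); // true

// The ++ on its own line binds to the following y, not to the preceding x.
var x = 1, y = 1;
x
++
y;
console.log(x, y);             // 1 2
```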

The best practice when writing javascript code is to always put a semicolon at the end of a statement. Don't rely on javascript's awful insertion mechanism.

Wednesday, July 6, 2011

php wrap lines

PHP's wordwrap() function can probably serve well in many cases. But this function will even put paragraphs into one line. We want to wrap our lines based on a given width, but we also want to make sure that each paragraph starts on a new line, and we might also want to add indentation.

The logic is below.

1. We break the text into paragraphs. This can be done by
$paragraphs = explode("\n", $text);

2. For each paragraph, we break it into words.
$words = explode(" ", $paragraph);
We also want to indent each new paragraph: str_repeat("&nbsp;", $indent);

3. For each word, we check if (the length of the current line + the word length) < $width. If yes, we put the word on the same line; if no, we put the word on the next line.

Based on this logic, we can write our own wrapping function:

function myWordwrap($text, $width, $indent, $break="\n")
{
   //initialize $wrappedText, which will store the wrapped text
   $wrappedText = "";
   
   //break the text into paragraphs
   $paragraphs = explode("\n", $text);
   foreach($paragraphs as $paragraph)
   {
      //if we do indent, prefix with space
      if ($indent > 0) {
         $wrappedText .= str_repeat("&nbsp;", $indent);
      }
      
      //break a paragraph into words
      $words      = explode(" ", $paragraph);
      $lineLength = $indent;
      foreach($words as $word)
      {
         //get the word length
         $wordLength = strlen($word);

         //the current line length cannot be larger than the given width
         if (($lineLength + $wordLength) < $width)
         {
            $wrappedText .= $word . ' ';
            $lineLength  += $wordLength + 1;
         } else {
            //we have to put the word in a new line
            //strip only the trailing space, so earlier line breaks survive
            $wrappedText  = rtrim($wrappedText, ' ');
            $wrappedText .= $break . $word . ' ';
            //update length (the word plus its trailing space)
            $lineLength = $wordLength + 1;
         }
      }

      //strip only the trailing space, so blank lines between paragraphs survive
      $wrappedText  = rtrim($wrappedText, ' ');
      //one paragraph ends, start a new line
      $wrappedText .= $break;
   }

   return $wrappedText;
}

PHP class visibility

PHP's visibility applies at the class level, not the instance level. An example explains this best:

class Message
{
        protected $msg;
        public function __construct($msg)
        {
                $this->msg = $msg;
        }
}

class MessageService extends Message
{
       protected $message;
       public function __construct(Message $message)
       {
              $this->message = $message;     
       }
       public function printMessage()
       {
               echo $this->message->msg;
       }
}

$m = new Message('hello world');
$service = new MessageService($m);
$service->printMessage();

$service->printMessage() looks like it should not work, because it is trying to access another instance's protected property. But, surprisingly, it works. This is because PHP's visibility applies at the class level instead of the instance level. So, as long as MessageService extends Message, an instance of the MessageService class is able to access the protected properties of an instance of the Message class.

If we break the inheritance between MessageService and Message by simply declaring:
class MessageService{...}

then the code will no longer work and we will get "Fatal error: Cannot access protected property Message::$msg"

Tuesday, July 5, 2011

phpsh -- A php interactive shell much better than php -a

You can learn all about phpsh at http://phpsh.org

To install it, simply run 'git clone git://github.com/facebook/phpsh.git', or download it directly from http://github.com/facebook/phpsh. Then enter your phpsh folder and run 'sudo python setup.py install'.

This shell is much better than PHP's built-in shell (php -a), with a lot of handy features like auto-complete, built-in docs, etc.

Monday, July 4, 2011

php support oci8 on windows

It is not enjoyable to use PHP + Oracle. But if we have to, we must set up our PHP to support oci8 (I haven't tried pdo_oci yet since it is highly experimental). I'm using PHP 5.2.14 and Oracle Database 10g on Windows. My web server is Apache 2.2.

At first I simply uncommented this line in php.ini, thinking it should do the work:

extension=php_oci8.dll

Unfortunately, after I restarted Apache, it still didn't work and I got this error message:
PHP Fatal error:  Uncaught exception 'Zend_Db_Adapter_Oracle_Exception' with message 'The OCI8 extension is required for this adapter but the extension is not loaded'

I found Oracle's official instructions at http://www.oracle.com/technetwork/articles/technote-php-instant-084410.html and started to follow them.

1. Download the "Instant Client Package - Basic" for Windows from the OTN Instant Client page. Because PHP is 32 bit, use the 32 bit version of Instant Client.

Unzip the Instant Client files to C:\instantclient_11_2

2. Edit the Windows PATH environment setting and add C:\instantclient_11_2

3. Uncomment extension=php_oci8.dll in php.ini

4. Restart Apache

However, it still didn't work!

Finally, I found someone with exactly the same problem as me. He solved it by copying these three files from instantclient_11_2 to the Apache/bin folder:

oraociei10.dll
orannzsbb10.dll
oci.dll

So I also copied them into Apache/bin and restarted Apache. This time it worked, finally. So my steps to enable PHP OCI8 on Windows are:

1. Download the "Instant Client Package - Basic" for Windows from the OTN Instant Client page. Because PHP is 32 bit, use the 32 bit version of Instant Client.

Unzip the Instant Client files to C:\instantclient_11_2

2. Edit the Windows PATH environment setting and add C:\instantclient_11_2

3. Copy oraociei10.dll, orannzsbb10.dll, oci.dll from C:\instantclient_11_2 to Apache/bin

4. Uncomment extension=php_oci8.dll in php.ini

5. Restart Apache

Sunday, July 3, 2011

how to check if a number is prime - a step by step way to problem analysis and solving

Suppose we don't know the most efficient algorithm to check whether a number is prime, and we are asked this question in an interview. How can we solve this problem?

Based on the definition of a prime number, we can simply do this (I ignore the special cases $n === 2 and $n === 1):

function isPrime($n)
{
    for($i=2; $i<$n; $i++) {
        if($n % $i === 0) {
            return false;
        }
    }
    return true;
}

Obviously, this is not efficient, but at least we can solve the problem and give the correct result. I think this is the first step to problem analysis and solving.

Now we can try to optimize the function. Suppose $n is not a prime number, which means it is the product of two numbers other than 1 and itself: $n = $x * $y. Since $x >= 2, we get $y = $n / $x <= $n / 2. So no divisor of $n other than $n itself can be larger than $n / 2, and we can change our function:

function isPrime($n)
{
    $end = $n / 2;
    for($i=2; $i<=$end; $i++) {
        if($n % $i === 0) {
            return false;
        }
    }
    return true;
}

Cool, it is obviously more efficient than the first one. Now let's see if we can improve a bit more. Think of $n = $x * $y again. We can always name the two factors so that $x >= $y. Then $n = $x * $y >= $y * $y, which means $y <= sqrt($n). So if $n has any divisor other than 1 and itself, it has one that is no larger than sqrt($n). We can change our function again:

function isPrime($n)
{
    $end = sqrt($n);
    for($i=2; $i<=$end; $i++) {
        if($n % $i === 0) {
            return false;
        }
    }
    return true;
}

I think this answer should be enough for interviewers who just want to see whether you can analyze and solve a problem logically, rather than test your math knowledge.
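
For completeness, here is a minimal sketch of the final version with the special cases filled in, written in javascript (the other language used on this blog) rather than PHP. Treating non-integers and everything below 2 as not prime, and looping while i * i <= n instead of calling sqrt, are my own choices and not part of the answer above.

```javascript
// Trial division up to the square root of n.
function isPrime(n) {
    // Assumption: non-integers and numbers below 2 are reported as not prime.
    if (!Number.isInteger(n) || n < 2) {
        return false;
    }
    // i * i <= n is the same bound as i <= sqrt(n), without floating point.
    for (var i = 2; i * i <= n; i++) {
        if (n % i === 0) {
            return false;
        }
    }
    return true;
}

console.log(isPrime(1));  // false
console.log(isPrime(2));  // true
console.log(isPrime(25)); // false
console.log(isPrime(97)); // true
```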