Wednesday, October 12, 2011

linux format xml command line

I simply use xmllint --format source.xml --output output.xml

php refactoring: Sprout Method

Sprout Method is a concept from the book 'Working Effectively with Legacy Code' by Michael Feathers. The method is particularly useful when working with legacy PHP code. I use it quite often, although I never knew the technique was called 'Sprout Method' until I read the book.

The basic idea of Sprout Method is, when we need to add a feature to a system and it can be formulated completely as new code, write the code in a new method. Call it from the places where the new functionality needs to be. Theoretically, the new code should be testable, although it may still be hard to get the calling points under test easily.

I've seen quite a lot of legacy PHP applications. One common feature of this legacy code is large, long procedures full of dependencies. Developers tend to keep injecting new logic into the procedure when they need to add or change something, because that seems to be the fastest and safest way to make a change. But, obviously, that is also why and how large, long procedures are created.

We should use Sprout Method whenever we can see the code that we are adding as a distinct piece of work, or when we can't get tests around a method yet. It is far preferable to adding code inline. The steps for using Sprout Method are:


1. Identify where we need to make our code change.

2. If the change can be formulated as a single sequence of statements in one place in a method, write down a call for a new method that will do the work involved.

3. Determine what local variables we need from the source method, and make them arguments to the call.

4. Determine whether the sprouted method will need to return values to the source method. If so, change the call so that its return value is assigned to a variable.

5. Develop the sprout method. If one method is too big, break it into several smaller methods until each method's logic is simple and clear enough that we can easily test it.
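
To make the steps concrete, here is a minimal sketch in PHP. Everything in it is hypothetical and only for illustration: legacyCheckout() stands in for a long legacy procedure, applyLoyaltyDiscount() is the sprouted function, and the 10-order threshold and 10% discount are made-up business rules.

```php
<?php
// Hypothetical legacy procedure: imagine it is long, untested,
// and full of dependencies.
function legacyCheckout(array $cart, array $customer): float
{
    $total = 0.0;
    foreach ($cart as $item) {
        $total += $item['price'] * $item['qty'];
    }
    // (many more lines of old logic would sit here in real legacy code)

    // Steps 1-4: the only change to the old code is this single call.
    // The local values the new code needs are passed as arguments,
    // and the return value is assigned back to a variable.
    $total = applyLoyaltyDiscount($total, $customer['orders']);

    return $total;
}

// Step 5: the sprouted function. It has its own local scope and does
// not depend on the legacy procedure, so it can be tested in isolation.
function applyLoyaltyDiscount(float $total, int $previousOrders): float
{
    // Assumed rule: customers with 10 or more previous orders get 10% off.
    if ($previousOrders >= 10) {
        return round($total * 0.9, 2);
    }
    return $total;
}
```

The old procedure gains exactly one line, while the new rule lives in a function we can call directly from a test without setting up the rest of the checkout.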

The advantage of using Sprout Method is quite obvious. When we use Sprout Method, we are clearly separating new code from old code. Even if we still can't get the old code under test immediately, we can at least see our changes separately and have a clean interface between the new code and the old code. A method or function creates its own local scope, so whatever we do inside it, we don't have to worry that our local variables and changes could get tangled up with the old procedure.

Everything has two sides. The disadvantage of Sprout Method is the source method might contain a lot of complicated code and a single sprout of a new method. Sometimes it isn't clear why only that work is happening someplace else, and it leaves the source method in an odd state. But at least that points to some additional work that we can do when we get the source class under test later.

We probably can't see the benefit of a Sprout Method immediately. In fact, we will probably never see it if we never have to come back and make changes. But as soon as we do need to come back and make changes, the good side of Sprout Method shows up immediately. A developer will feel much safer and more comfortable working with a Sprout Method than making changes in the long procedure.

Tuesday, October 4, 2011

Remove duplicate lines with uniq

sort file.txt | uniq

To list only the lines that appear exactly once: sort file.txt | uniq -u
To list only the lines that appear more than once (one copy of each): sort file.txt | uniq -d