Search This Blog

Tuesday, March 4, 2014

Memoization and Decorators

Definition of Memoization

The term "memoization" was introduced by Donald Michie in the year 1968. It's based on the Latin word memorandum, meaning "to be remembered". It's not a misspelling of the word memorization, though in a way it has something in common. Memoisation is a technique used in computing to speed up programs. This is accomplished by memorizing the calculation results of processed input such as the results of function calls. If the same input or a function call with the same parameters is used, the previously stored results can be used again and unnecessary calculation are avoided. In many cases a simple array is used for storing the results, but lots of other structures can be used as well, such as associative arrays, called hashes in Perl or dictionaries in Python.

Memoization can be explicitly programmed by the programmer, but some programming languages like Python provide mechanisms to automatically memoize functions.

Memoization with Function Decorators

In our previous chapter about recursive functions, we worked out an iterative and a recursive version to calculate the Fibonacci numbers. We have shown that a direct implementation of the mathematical definition into a recursive function like the following has an exponential runtime behaviour:

####################### Program
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

####################### End of Program

We also presented a way to improve the runtime behaviour of the recursive version by adding a dictionary to memorize previously calculated values of the function. This is an example of explicitly using the technque of memoization, but we didn't call it like this. The disadvantage of this method is, that the clarity and the beauty of the original recursive implementation is lost.

The "problem" is, that we changed the code of the recursive fib function. The following code doesn't change our fib function, so that its clarity and legibility isn't touched. To this purpose, we use function which we call memoize. memoize() takes a function as an argument. The function memoize uses a dictionary "memo" to store the function results. Though the variable "memo" as well as the function "f" are local to memoize, they captured by a closure through the helper function which is returned as a reference by memoize(). So, the call memoize(fib) returns a reference to the helper() which is doing what fib() would have done plus a wrapper which is saving the results, which are still not stored, in the memo dictionary and is preventing recalculation of results, which are already in memo.

####################### Program 
def memoize(f):
    memo = {}
    def helper(x):
        if x not in memo:           
            memo[x] = f(x)
        return memo[x]
    return helper
   

def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

fib = memoize(fib)

print(fib(40))
####################### End of Program

Now we come to what's called the decorator. Let's look at the line in our code were we assign memoize to fib:
fib = memoize(fib)
One says, that the fib function is decorated by the the memoize() function.


Memoize as a Class
We can encapsulate the caching of the results in a class as well, as you can see in the following example:

####################### Program 
class Memoize:
    def __init__(self, fn):
        self.fn = fn
        self.memo = {}
    def __call__(self, *args):
        if args not in self.memo:
                self.memo[args] = self.fn(*args)
        return self.memo[args]
####################### End of Program

As we are using a dictionary, we can't use mutable arguments, i.e. the arguments have to be immutable.

Decorators in Python
A decorator in Python is a callable Python object that is used to modify a function, method or class definition. The original object, the one which is going to be modified, is passed to a decorator as an argument. The decorator returns a modified object, e.g. a modified function, which is bound to the name used in the definition. Python decorators have a similar syntax than Java annotations. The decorator syntax in Python can be seen as pure syntactic sugar, using @ as the keyword.

Example: Using a Decorator for Memoization
We already had used a decorator without saying so. The memoize function in the example at the beginning of this chapter is a decorator. We used it to memorize the results from the fib function.

What we hadn't used was the special decorator syntax of Python, i.e. the at character "@"
Instead of writing the statement
fib = memoize(fib)
we can write
@memoize
But this line has to be directly in front of the decorated function, in our example fib():

####################### Program  
def memoize(f):
    memo = {}
    def helper(x):
        if x not in memo:           
            memo[x] = f(x)
        return memo[x]
    return helper
   
@memoize
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

#fib = memoize(fib)

print(fib(40))
####################### End of Program

Checking Arguments with a Decorator

In our chapter about recursive functions we introduced the factorial function. We wanted to keep the function as simple as possible and we didn't want to obscure the underlying idea, so we hadn't incorporated any argument checks. So, if somebody had called our function with a negative argument or with a float argument, our function would have got into an endless loop.

The following program uses a decorator function to ensure that the argument passed to the function factorial is a positive integer:

####################### Program  
def argument_test_natural_number(f):
    def helper(x):
        if type(x) == int and x > 0:
                return f(x)
        else:
                raise Exception("Argument is not an integer")
    return helper
   
@argument_test_natural_number
def factorial(n):
    if n == 1:
        return 1
    else:
        return n * factorial(n-1)

for i in range(1,10):
    print(i, factorial(i))

print(factorial(-1))
####################### End of Program

Monday, March 3, 2014

Usage of Maven for code build over using ant

A Simple Comparison

I'm only showing you this to illustrate the idea that, at the most basic level, Maven has built-in conventions.

In simple Ant example, you can see how you have to tell Ant exactly what to do. There is a compile goal which includes the javac task that compiles the source in the src/main/java directory to the target/classes directory. You have to tell Ant exactly where your source is, where you want the resulting bytecode to be stored, and how to package this all into a JAR file. While there are some recent developments that help make Ant less procedural, a developer's experience with Ant is in coding a procedural language written in XML.

Contrast the previous Ant example with a Maven example. In Maven, to create a JAR file from some Java source, all you need to do is create a simple pom.xml, place your source code in ${basedir}/src/main/java and then run mvn install from the command line. The example Maven pom.xml that achieves the same results.

That's all you need in your pom.xml. Running mvn install from the command line will process resources, compile source, execute unit tests, create a JAR, and install the JAR in a local repository for reuse in other projects. Without modification, you can run mvn site and then find an index.html file in target/site that contains links to JavaDoc and a few reports about your source code.
Admittedly, this is the simplest possible example project. A project which only contains source code and which produces a JAR. A project which follows Maven conventions and doesn't require any dependencies or customization. If we wanted to start customizing the behavior, our pom.xml is going to grow in size, and in the largest of projects you can see collections of very complex Maven POMs which contain a great deal of plugin customization and dependency declarations. But, even when your project's POM files become more substantial, they hold an entirely different kind of information from the build file of a similarly sized project using Ant. Maven POMs contain declarations: "This is a JAR project", and "The source code is in src/main/java". Ant build files contain explicit instructions: "This is project", "The source is in src/main/java", "Run javac against this directory", "Put the results in target/classses", "Create a JAR from the ....", etc. Where Ant had to be explicit about the process, there was something "built-in" to Maven that just knew where the source code was and how it should be processed.

High-level Comparison

The differences between Ant and Maven in this example? Ant...

    •    Ant doesn't have formal conventions like a common project directory structure, you have to tell Ant exactly where to find the source and where to put the output. Informal conventions have emerged over time, but they haven't been codified into the product.
    •    Ant is procedural, you have to tell Ant exactly what to do and when to do it. You had to tell it to compile, then copy, then compress.
    •    Ant doesn't have a lifecycle, you had to define goals and goal dependencies. You had to attach a sequence of tasks to each goal manually.

Where Maven...
    •    Maven has conventions, it already knew where your source code was because you followed the convention. It put the bytecode in target/classes, and it produced a JAR file in target.
    •    Maven is declarative. All you had to do was create a pom.xml file and put your source in the default directory. Maven took care of the rest.
    •    Maven has a lifecycle, which you invoked when you executed mvn install. This command told Maven to execute a series of sequence steps until it reached the lifecycle. As a side-effect of this journey through the lifecycle, Maven executed a number of default plugin goals which did things like compile and create a JAR.

My Profile

My photo
can be reached at 09916017317