Tuesday, January 1, 2013

Recursion

Recursion

Recursion

In the beginning there was the SUBROUTINE.

In the early days of programming there were no large standard libraries available. There were a small set of routines that gave the programmer access to operating system facilities like printers and screen output. In some languages these were built right into the language, much the way Java has some built in constructs for the String class.

In assembly language programming, a language to code directly in the codes the CPU (Central Processing Unit) understands, there is a 'JSR' statement. 'JSR' is a mneumonic that stands for Jump SubRoutine. In an assembler language program this means to jump to a 'label' placed in your code then, return when the 'RET' statement is hit. Return to where? To the statement right after the 'JSR'. It's much like when your playing a video game and your mom asks you to do the dishes or take out the garbage. You, as your very own CPU, pause the video game, go and accomplish the chore, then return right back at the video game picking up exactly where you left off. The problem for the assmebly language programmer is that these statements do not allow parameters. The programmer must take care of all of that themselves in the assembly language code

FORTRAN was one of the very first computer languages developed. It's short for Formula Translation. It's big claim to fame was to provide the user with variables and arithmetic operators built into the language. This became a main stay of Engineers and Scientists who wanted to solve difficult mathmatical problems. Fortran also provided a feature called SUBROUTINEs. It had a SUBROUTINE statement and better than it's assembly language counterpart it allowed you to define parameters and use them in your programs. In fact there is not much difference between the look of a FORTRAN SUBROUTINE and a Java method. Java is a little fancier there are 'public' and 'private' methods. There are 'static' methods. In the early days of FORTRAN you didn't have to worry about this every SUBROUTINE was 'static' SUBROUTINEs never return a value because FORTRAN had FUNCTION statements for that.

FORTRAN also had a CALL statement so you could execute your SUBROUTINE. This is why you may hear people talk about a 'Method Call' or an instructor may say "Don't for get to call your method". Computer Science has a very informal language when we talk about various topics, but we try to use different terminology when we can to better describe what we are doing. The example here is that I may tell you to 'Run' your method and you may understand what I mean. But more formally you 'Run' a program and the program 'Calls' your method.

A SUBROUTINE can call other SUBROUTINEs

The whole point of this stroll down memory lane is that once the syntax of the subroutine became establised, there was a natural inclination to allow a SUBROUTINE to call itself. We already had recursive definitions as part of mathematics. So now that we have computers that animate our mathematics for us, the push for recursive SUBROUTINEs was on.

It may seem that it should be self-evident that a SUBROUTINE can call itself but in the early design of computer languages it turns out you must do some special things under the covers to get this to happen. So newer languages that followed after FORTRAN made sure they had these capabilities. I won't bore you with a cronology of how recursion came into being. There were languages in existance at the time of FORTRAN that already had it. But suffice it to say that when Personal Computers came about and the C language (a predecessor to Java) dominated the programming of that platform, C had built in recursion. Lucky for you! because it's a pain to simulate recursion in a language that doesn't have it. Java as an ancestor of C also has recursion built in.

METHODS ARE JAVA SUBROUTINES

As languages change so does the nomenclature. SUBROUTINEs are now methods. Methods don't even have a METHOD statement like SUBROUTINES do. This is because after people became very familiar with FORTRAN they realized that a SUBROUTINE definition can be recognized by:

  • it's name
  • the parenthesis that follow
  • and the parameter definitions inside the parenthesis

So by the time Java comes around we expect you as a newbie just to pick up on this on your own. If you are an older programmer you know you got sick of writing SUBROUTINE for each and every SUBROUTINE definition. That's a large amount of typing saved (well kind of there are things like public and void but they are much shorter words. We even dropped a formal statement for calling the SUBROUTINE. Now you are just supposed to know that any name followed by parenthesis with comma separated parameters is a Method call. Methods have all the stuff of a subroutine but now are grouped into objects (ie. class definitions). They can be inherited and they can be overridden. These are certainly things that don't happen with SUBROUTINES. The name change is fitting since the old fashioned SUNROUTINES have more in common with "static" methods than with methods in general.

METHODS can call themselves

Recursion being well established in computer science methods can call themselves, it's built in to the language. The danger of course is that the programmer must make sure they have accounted for any and all 'base' cases, otherwise the code will run forever! Well, not forever you eventually run out of memory.

Recursive definitions

The first example is always factorial. Sorry about that. But it has a simply stated recursive definition so it is popular. Students usually focus on factorial being so trivial to implement that they feel recursion is senseless since you could do it easily with a loop. But like so many things in computer science the trivial case is foreshadowing of more complicated stuff yet to come.

Factorial

factorial (n) = 1 (for n = 1)

                   = factorial(n-1) * n (for n > 1)

Fibonacci numbers

Fib(n) = 0 (for n = 0)

           = 1 (for n = 1)

           = Fib(n-1) + Fib(n-2) (for n > 1)

This is a better example for recursion. It does not have a trivial loop alternative. That's not to say you can't create a loop that can do it. Every recursive implementation has a looping counterpart. It's just that it's not trivial like factorial is. The double calls to Fib above do us in for coming up with an easy looping code design.

This type of recursion is termed tree recursion by Abelson and Sussman in their classic work "Structure and Interpretation of Computer Programs"(known in computer science circles as just SICP). That's because if you were to draw out all the recursive calls at each step, you would form an inverted tree with the first call leading to 2 calls and those two calls lead to two calls and so on.

Greatest Common Divisors

Given 2 integers this is the greatest integer that divides both of them evenly. It is very helpful when you are trying to reduce fractions to lowest terms.

Gcd(a,b) = a (for b = 0)

             = Gcd (b, remainder(a,b)) (for b > 0)

If you think the above definition is some modern mathematic torture dreamed up to make your life a living hell. Abelson and Sussman point out that the above definition appears in Euclid's Elements Book 7 circa 300 BC. So the reality is that mathematicians have been developing torturous problems for their students from time immemorial.

Your Job

I leave it up to the student to implement these functions. If your interested there are more recursive functions defined in the SICP which is available online. There are also Scheme Programming Language examples of their implementation. That may assist you with the Java implementation or it may confuse you more.

Princeton University has an excellent collection of recursive functions at their introduction to computer science web site. Their site includes a large number of implementation examples. I would suggest you try to implement the examples above and then visit their site if you have any problems. http://introcs.cs.princeton.edu/java/23recursion/ The link to Princeton's course is here: http://introcs.cs.princeton.edu/java/home/ This includes a much broader perspective also going over some computer architecture which I believe is incredibly helpful when you are trying to make sense of computer science in general.

What No Code?

Of course there is code. It wouldn't be a comp sci article without code. Using our previously defined APString class I will implement a reverse method that uses recursion to reverse the elements of our string. Now, because we have no built in operators for the APString class (such as string concatenation), we can't do a simple recursive routine that's possible with the Java String class. Actually that's a good idea let's first look at using recursion with the Java String class.

A simple recursive methodology for Java Strings would be the following:

  1. take the first character of the String and store it as a string
  2. reverse the rest of the string after the first character is removed
  3. concatenate the new reversed remainder with the first character added to the end.

Step 2 has the recursion in it if I name the method 'revStr' the pseudo code will look like:

String firstch;
String restStr;

// ignore the base case for now
firstch = str.subString(1,2);
restStr = str.subString(2);
return revStr(restStr) + firstch;  // concatenate the first character onto the end of the string

Take some time to think of what the base case is for the code fragment above. I went ahead and implemented the whole code and placed it at the end of the article but given what I have done above you should be able to code the whole thing yourself. Go ahead and see if you can reverse a Java string recursively in your own Java IDE. I would create a separate class for this assignment and give it it's own main method. Then once you have it working compare it to the code at the end of the article.

APString reverse done recursively

This was not such an obvious recursion because there are no built in facilities for APString like the concatenation operator that the Java String has. We have not even implemented a concatenation method or a substring method. So you can see the time savings that Java provides and the difficulty we have trying to duplicate the same functionality in a class of our own. I will add those in a future article and try to implement reverse as we did for the Java String class.

Since we don't have the benefit of the methods described above I will take advantage of a Java method for arrays.

Arrays.copyOfRange(<array variable name>, start index, end index).

This is provided by the Java class library. It's kind of cheating because I don't have to make my own copy routine, which would probably loop through all the elements of the array. Since we have to loop why bother reversing the string recursively other than to show how you can think of problems recursively? Well, that's the point really. There are times when a looping construct is the best mechanism but there are times when recursion will help to simplify your code. Most of the time you will have to try both and see which one provides you with an elegant solution.

Design of the recursive algorithm

The APString class uses char arrays to implement it's String functionality. We will need a temporary char array to hold the reversed elements and to be our working storage as we change the order of the characters. We could do this in the char array we use for the object itself but then this would actually change our object rather than providing a new object with the desired results. This also means we need to return an object of APString as the final result. This creates a problem for recursion because I want to recurse on the char array and then return an APString when all is done.

  • Design Requirements
    1. return an object of APString
    2. create a recursive routine to recurse on char array
    3. use a temporary char array to be the working storage
    4. Don't let any of our methods change the underlying values of our original APString object

This means we will need 2 methods instead of one. One method to setup our working storage and return an APString value and another method to do the recursion on our working storage.

Recursive method and setup method

// reverse_array recursive helper procedure for reverser
// 
private void reverse_array(int startIdx, int endIdx, char ch[]) 
{   
    char tmpch;
    // base case 1: odd length we have reached the middle element
    if (startIdx == endIdx) 
        return;
    // base case 2: even length array the indexes are side by side
    if ((endIdx-startIdx) == 1) 
    {
        tmpch = ch[startIdx];
        ch[startIdx] = ch[endIdx];
        ch[endIdx] = tmpch;
        return;
    }
    // recursive case swap current indexes then recurse on the rest
    tmpch = ch[startIdx];
    ch[startIdx] = ch[endIdx];
    ch[endIdx] = tmpch;
    reverse_array(startIdx+1,endIdx-1,ch);        
}

 // The intial call to set up the recursive routine above with working storage array
public APString revStr() {
    char[] tmparray = Arrays.copyOfRange(chArray, 0, this.length());
    reverse_array(0,this.length()-1,tmparray);
    return new APString(tmparray);
}

The main method used to test the above code

public static void main(String[] args) {
    // Since we are trying to avoid the use of the built-in Java Strings
    // we are forced to define strings by initializing a char array
    // I think you can see why it's so nice to have Strings built into
    // Java. Imagine having to define every string like this:
    char[] a = {'T','h','i','s',' ','i','s',' ','a',' ','t','e','s','t','\0'};
    APString ap = new APString(a);
    // Using the String double quotes built in the following is an equivalent
    // statement:
    APString ap1 = new APString("This is a test".toCharArray());

    System.out.println(ap.toString()+ "   length = " + ap.length());
    System.out.println(ap1.toString()+ "   length = " + ap1.length());
    System.out.println(ap.reverse().toString());
    System.out.println(ap.revStr().toString());
    System.out.println("No change to original: " + ap.toString());
}

And the output from testing:

run:
This is a test   length = 14
This is a test   length = 14
Looping reverse: tset a si sihT
Recursive reverese: tset a si sihT
No change to original: This is a test
BUILD SUCCESSFUL (total time: 0 seconds)

The whole APString code with the recursive procedures built in

/*
 * APString class
 * The idea is to create a string class that uses arrays to hold the characters
 * This methodology is closer to how the actual hardware is organized.
 * The String class in Java has built in support. In fact any double quoted 
 * String is itself a String class object. "a string".toString(); is valid
 * Java code and all of the methods of the String class may be used above
 * in place of toString().
 * 
 * Java hides much of this organization because it is object oriented and 
 * there is a long history of String operations in computer science. So the
 * designers (rightly so) added String handling capabilities in the language 
 * itself. However for people trying to learn about computers for the first
 * time this shielding (known in computer science terms as abstracting) of the
 * lower level organization is a disservice. While Strings are well known the 
 * patterns of their processing are repeated enough for other things that a 
 * thorough understanding of the low level processing of Strings will only help
 * the new Programmer/Computer Scientist.
 * 
 * Indeed Donald Knuth in his classic treatise "The Art of Computer Programming"
 * goes so far as to invent a ficticious computer with a ficticious 
 * assembly language to bring to light the issues of computer architecture and
 * their effect on the design of computer programs.
 */
package apcompsci;

import java.util.Arrays;

/**
 *
 * @author nasty
 */
public class APString {    
    // It's not necessary to limit a char array like this. Java will 
    // dynamically create space as needed but in some computer languages
    // they must be allocated out ahead of time. So as a place holder we
    // define our initial array as 100 chars.
    char chArray[] = new char [100];
    int len;

    APString(char chArray[])
    {
        // Let's copy the char array given into the one we defined above as
        // part of the class
        int i = 0;
        for(i = 0; i < chArray.length; i++)
          this.chArray[i] = chArray[i];
        this.len = i;
        this.chArray[i]='\0';
        // The above '\0' is the ASCII null character. By setting the very
        // last character to this value it acts as a flag to let us know
        // we are at the end of our string. The length() method uses this to
        // calculate the length of the string
        // I also implemented a len field to store this in. It is left to 
        // the reader to reimplement this class with a length field and get
        // get rid of the null termination. This would save a lot of looping
        // over the string to get the string length
    }

// All classes defined in Java have a base toString but because we have
// defined some fields as part of our class we need to convert those to 
// Strings so we can print them out and we override the base toString features
//
// It's unfortunate but we must rely on the built-in String class because
// they are used by all the print procedures
@Override
    public String toString()
    {
        String s = "";

        for (int i = 0; i < this.len; i++)
            s = s + this.chArray[i];
        return s;
    }

    public int length()
    {
        int i = 0;
        for (i = 0; this.chArray[i] != '\0'; i++)
        {
            // Do nothing just let i keep count and keep looping until we hit
            // the null character which will break the test above
        }
        return i;
    }

    // reverse - create a new APString that reverses all the characters in 
    // the current class. Except for the null string we don't want that to 
    // be our first character. WHY?
    public APString reverse()
    {
        char tmp[] = new char[this.length() + 1];
        int j = 0;
        for (int i = this.length() - 1; i >= 0; i-- , j++)
        {
            tmp[j] = this.chArray[i];
        }
        tmp[j] = '\0';
        return new APString(tmp);
    }

    // reverse_array recursive helper procedure for reverser
    // 
    private void reverse_array(int startIdx, int endIdx, char ch[]) 
    {   
        char tmpch;
        // base case 1: odd length we have reached the middle element
        if (startIdx == endIdx) 
            return;
        // base case 2: even length array the indexes are side by side
        if ((endIdx-startIdx) == 1) 
        {
            tmpch = ch[startIdx];
            ch[startIdx] = ch[endIdx];
            ch[endIdx] = tmpch;
            return;
        }
        // recursive case swap current indexes then recurse on the rest
        tmpch = ch[startIdx];
        ch[startIdx] = ch[endIdx];
        ch[endIdx] = tmpch;
        reverse_array(startIdx+1,endIdx-1,ch);        
    }

     // The intial call to set up the recursive routine above with working storage array
    public APString revStr() {
        char[] tmparray = Arrays.copyOfRange(chArray, 0, this.length());
        reverse_array(0,this.length()-1,tmparray);
        return new APString(tmparray);
    }

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        // Since we are trying to avoid the use of the built-in Java Strings
        // we are forced to define strings by initializing a char array
        // I think you can see why it's so nice to have Strings built into
        // Java. Imagine having to define every string like this:
        char[] a = {'T','h','i','s',' ','i','s',' ','a',' ','t','e','s','t','\0'};
        APString ap = new APString(a);
        // Using the String double quotes built in the following is an equivalent
        // statement:
        APString ap1 = new APString("This is a test".toCharArray());

        System.out.println(ap.toString()+ "   length = " + ap.length());
        System.out.println(ap1.toString()+ "   length = " + ap1.length());
        System.out.println("Looping reverse: " + ap.reverse().toString());
        System.out.println("Recursive reverese: " + ap.revStr().toString());
        System.out.println("No change to original: " + ap.toString());
    }
}

Recursive reverse on the Java String class

package apcompsci;

/**
 *
 * @author Nasty Old Dog
 */
public class NODString {
    public String revStr(String s)
    {
        String firstch;
        String restStr;

        // base case s length is 1 just return s
        if (s.length() == 1)
            return s;
        // recursive case
        firstch = s.substring(0, 1);  // get the first character
        restStr = s.substring(1);     // get the rest of the string

        // reverse the string by moving the first char to the end and the
        // reverse of the rest of the string on the front.
        return revStr(restStr) + firstch;   
    }

    public static void main(String[]args)
    {
        String teststr = "reverse this";
        NODString nod = new NODString();
        System.out.println("The reverse of "+teststr+" is "+nod.revStr(teststr));
    }

}

The output from the above program is:

run:
The reverse of reverse this is siht esrever
BUILD SUCCESSFUL (total time: 1 second)

References

  1. Abelson, Harold, Gerald Jay. Sussman, and Julie Sussman. Structure and Interpretation of Computer Programs. Cambridge, MA: MIT, 1985. Print.
  2. "Welcome to the SICP Web Site." Welcome to the SICP Web Site. N.p., n.d. Web. 01 Jan. 2013. http://mitpress.mit.edu/sicp/.
  3. Sedgewick, Robert, and Kevin Wayne. "Introduction to Programming in Java." Introduction to Programming in Java. Addison-Wesley, n.d. Web. 01 Jan. 2013. http://introcs.cs.princeton.edu/java/home/.

Author: Nasty Old Dog