Home  |  FAQ  |  About  |  Contact  |  View Source   
 
SEARCH:
 
BROWSE:
    My Hood
Edit My Info
View Events
Read Tutorials
Training Modules
View Presentations
Download Tools
Scan News
Get Jobs
Message Forums
School Forums
Member Directory
   
CONTRIBUTE:
    Sign me up!
Post an Event
Submit Tutorials
Upload Tools
Link News
Post Jobs
   
   
Home >  Tutorials >  General ASP.NET >  Sorting Techniques in C#
Add to MyHood
   Sorting Techniques in C#   [ printer friendly ]
Stats
  Rating: 4.79 out of 5 by 52 users
  Submitted: 09/18/02
Larry Mak ()

 
Sorting Effectively in C#

This tutorial is a tutorial I've done previously in Java, but now rewritten in C#, as well as some common mistakes beginners (or rusty coders) make.

This tutorial is generally about sorting in Java – ways to do it, and how to do it correctly, quickly, and effectively. This tutorial will use one problem, and shows the many approaches to do so.

Here’s the problem:
Suppose you’re to sort a list of books, given by the number of pages, and the initial of the author. The style of input is on one line, a number between 0 – 999 denoting the number of pages, a space, then the author’s initials with 2 uppercase letters.

We are to sort them by the number of pages, and then by the first, then last initials of the author.

The input will be given by one string, with one number denoting the number of books. We’re to implement a “bookOrder” method that returns the sorted string.

Input:
15 AB 219 AE 518 SE 518 AB 17 CE, 5

Output:
15 AB 17 CE 219 AE 518 AB 518 SE


Method 1
Because in timed coding competitions, time is important, so one might write it very inefficently, using bubblesort:
class Sort{
    public string bookOrder( string books, int number ){
        int[] pages = new int[ number ];
        string[] title = new string[ number ];
        
        string[] values = books.Split( ' ' );

        // Parsing
        for ( int i = 0; i < number; i++ ){
            pages[i] = int.Parse( values[2*i] );
            title[i] = values[2*i+1];
        }
        
        // Bubblesort
        for ( int i = 0; i < number; i++ ){
            for ( int j = 0; j < number - 1; j++ ){
            // This does the actual comparison
            if ( pages[j] > pages[j+1] ||
            ( pages[j] == pages[j+1] && title[j].CompareTo( title[j+1]  ) > 0 ) ){
                    // Swapping
                    pages[j] ^= pages[j+1];
                    pages[j+1] ^= pages[j];
                    pages[j] ^= pages[j+1];
                    string temp = title[j];
                    title[j] = title[j+1];
                    title[j+1] = temp;
                }
            }
        }

        // Return it in the format required.
        string s = "";
        for ( int i = 0; i < number; i++ ){
            s += pages[i] + " " + title[i] + " ";
        }
        return( s.Trim() );
    }
}

The above method took about 5 minutes to write (even though I am a C# beginner, I rewrote it in 5 minutes. The advantages to this is that it is stable, and very simple to write (two for loops). However, at O( n^2 ), it's very inefficient. For an input of 100, it will run 10000 iterations, and at 1000, it will run 1 million iterations. So depending on your constraints, you should figure out if it'll run in the allocated time.

This code is, of course, very flexible, as you need to only change the if statement to sort it differently:
if ( pages[j] > pages[j+1] ||
( pages[j] == pages[j+1] && title[j].CompareTo( title[j+1]  ) > 0 ) ){


Let's say we want to first sort by initials and then the page number, we can easily modify it to:
if ( title[j].compareTo( title[j+1] ) > 0 ) || 
( title[j].compareTo( title[j+1] ) == 0 && pages[j] > pages[j+1] ) )


or changed the signs if we wanted to sort descendingly:
if ( pages[j] < pages[j+1] || 
( pages[j] == pages[j+1] && title[j].compareTo( title[j+1] ) > 0 ) )


Wrong Method
A very (relatively) common mistake that some people make is writing something akin to the following:
class Sort{
    public string bookOrder( string books, int number ){
        int[] pages = new int[ number ];
        string[] title = new string[ number ];
        
        string[] values = books.Split( ' ' );

        // Parsing
        for ( int i = 0; i < number; i++ ){
            pages[i] = int.Parse( values[2*i] );
            title[i] = values[2*i+1];
        }
        
        // Bubblesort
        for ( int i = 0; i < number; i++ ){
            for ( int j = 0; j < number - 1; j++ ){
            // This does the actual comparison
            if ( pages[j] > pages[i] ||
            ( pages[j] == pages[i] && title[j].CompareTo( title[i]  ) > 0 ) ){
                    // Swapping
                    pages[j] ^= pages[j+1];
                    pages[j+1] ^= pages[j];
                    pages[j] ^= pages[j+1];
                    string temp = title[j];
                    title[j] = title[j+1];
                    title[j+1] = temp;
                }
            }
        }

        // Return it in the format required.
        string s = "";
        for ( int i = 0; i < number; i++ ){
            s += pages[i] + " " + title[i] + " ";
        }
        return( s.Trim() );
    }
}

Notice the difference?

Well, the if statement (the most crucial part of this method, of course) is incorrect:
if ( pages[j] > pages[i] ||
( pages[j] == pages[i] && title[j].CompareTo( title[i]  ) > 0 ) ){

Well, it will do enough comparisons to put everything in sorted order, right? Well, this is somewhat true. However, this sort is not stable. A stable sort means that given inputs a1, a2, a3, ..., an if ax is equal to ay, ax is guarantee to be before ay, given that x is less than y. In plain english, if an input that comes earlier in the input sequence, it will remain before another input of the same value in the input sequence.

The way of doing it (comparing indices j and i instead of j and j+1) will yield unstable results when the input is negative. Although this doesn't affect our problem outcome in this case, there are times when it's important. (It won't affect this problem because if input a is equal to input b, then a is equal to b. This isn't always true.

For example, you have to sort a bunch of two digit numbers, by their digit sum (sum of their digits), then 15 and 24 will have the same value (but different input), so which one is in front will depend on their position on the input. Sometimes this is crucial!


Method 2
A better and more efficient way to take advantage of the C# library. This is by using the interface IComparable, and System.Array.
using System;
using System.Collections;
using System.Reflection;

class Sort{
    private class B : IComparable {
        // Fields to be sorted
        public int pages;
        public string title;

        // Initializer
        public B ( int p, string t ){
            title = t;
            pages = p;
        }

        // The CompareTo method
        public int CompareTo( object o ){
                B k = (B) o;
                if ( pages != k.pages ){
                    return pages - k.pages;
                }
                else 
                    return title.CompareTo( k.title );
        }
        
    }
    public string bookOrder( string books, int number ){
        B[] book = new B[ number ];
        string[] values = books.Split( ' ' );
        
        // Parsing
        for ( int i = 0; i < number; i++ ){        
            book[i] = new B(int.Parse( values[2*i] ), values[2*i+1] ); 
         }
    
        // Using Arrays.sort
        Array.Sort( book );
    
        // Return it in the format required.
        String s = "";
        for ( int i = 0; i < number; i++ ){
            s += (int)book[i].pages + " " + book[i].title + " ";
        }
        return s.Trim();
    }

}

This ensures that it runs in O(n * log n ) time, which means, for the case of 100, it will only run roughtly about 665 times - no way to compare this to the 10,000 iterations of the first method, and if we let the input run to 1000, it will only do about 99658 iterations, compared to the 1 million! Now, that's saving computation time.

Another advantage is that the code is clean and slick. You've reduced a sorting problem into a parsing problem - should be trivial.

Let's go over it a bit:
private class B : IComparable

This is a private helper class that implements the interface IComparable. This means we're required to write the CompareTo( object ) method, which would allow us to sort.
public int CompareTo( object o ){
    B k = (B) o;
    if ( pages != k.pages ){
        return pages - k.pages;
    }
    else 
            return title.CompareTo( k.title );
}

This is the required method of the helper class - the CompareTo( object ) method. The requirement of this method is that it returns a negative integer if this object is smaller, a positive integer if object o is smaller, and zero if they're the same. What we did above is that compare the pages, if they're not equal, then return the difference in pages, otherwise, use string's very own CompareTo( object ) to compare the two string.

Array.Sort( object[] ) 

This lets us sort anything, but it requires that all elements in the array be implementing the IComparable interface. That means if you're sorting just ints, or doubles, or strings, this is very handy and let you get the work done quicker. Why reinvent the wheel, after all?

The above way is not that hard to implement, and it doesn't take too long, but generally requires me to remember too much syntax.

This is, however, the preferred way to sort things in C#. Learning it and perhaps practice enough so you can use it on demand will help a lot - sorting problems are very common, after all.

To modify what to sort by, we only have to modify the CompareTo method. Very Object-Oriented Programming friendly!
public int CompareTo( object o ){
    B k = (B) o;
    if ( pages != k.pages ){
        return pages - k.pages;
    }
    else 
            return title.CompareTo( k.title );
}

for example, if we wanted to sort descendingly:
public int CompareTo( object o ){
    B k = (B) o;
    if ( pages != k.pages ){
        return k.pages - pages;
    }
    else 
            return title.CompareTo( k.title );
}

or by title first:
public int CompareTo( object o ){
    B k = (B) o;
    if ( title.ComprareTo( k.title ) != 0 )
        return title.CompareTo( k.title );
    else 
            return pages - k.pages;
}

or by title descendingly:
public int CompareTo( object o ){
    B k = (B) o;
    if ( title.ComprareTo( k.title ) != 0 )
        return -1 * title.CompareTo( k.title );
    else 
            return pages - k.pages;
}


Method 3
Since we already gone over the preferred way of sorting, why do we need another method? Well, perhaps you don't remember the syntax, or perhaps you want something that's slimmer (but only for specific cases), or something that you can code faster off you head, then I provide you with the alternative.

Well, Array.Sort( object ) can sort strings, so let's take advantage of this by presenting the data as a string so we can just call Array.Sort and we'll be done!

Well, since the number of pages is at most 3 digits, and the initials is at most 2 characters, then we can obviously format this string as a concatenation of this information - plus the original data so we know exactly what it is:
public string bookOrder( string books, int number ){
    string[] values = books.Split( ' ' );
    string[] book = new string[ number ];
    
    // Parsing
    for ( int i = 0; i < number; i++ ){        
    /**

    * Format is as following (exactly):
    *
    * DDDTTNNNNNN
    * Where DDD is the number of pages
    * TT is the initials
    * and NNNNNN is the original string, with a space, 
    * formatted for output.
    **/

        int pages = int.Parse( values[2*i] );
        string title = values[2*i+1];    
    // What is format( int )?  It is a helper method that pads a number 
    // with leading zeroes to ensure that it will use up all 5 characters.
        book[i] = format( pages ) + title + pages + " " + title;
    }
    
    // Using Arrays.sort
    Array.Sort( book );
   
    // Return it in the format required.
    String s = "";
    for ( int i = 0; i < number; i++ ){
        // Why 5?  The first 5 characters are throw away letters that were used by the sort.
        s += book[i].Substring(5) + " ";
    }
    return s.Trim();
}

// This pads the number of pages with leading zeroes.
string format ( int pages ){
    string temp = "0000000" + pages;
    // Why 3? We only need 3 digits!
    return temp.Substring( temp.Length - 3 );
}

The above works, is very short to write, and is pretty intuitive. In general, I recommend the second method, but usually I have to look up the syntax, and the above saves me some time, and time is a very important constraint sometimes.

To modify what to sort, we can simply modify the parsing line:
book[i] = format( pages ) + title + pages + " " + title;

for example, to by descending pages:
book[i] = format( 999 - pages ) + title + pages + " " + title;

or by initials:
book[i] = title + format( 999 - pages ) + pages + " " + title;

or to descending initials (somewhat long)
book[i] = ( (char) ('Z' - ( title[0] - 'A' )) ) + "" + 
  ( (char) ('Z' - ( title[1] - 'A' ) ) ) + format( 999 - pages ) + pages + " " + title; 

You just got to be creative. (You can easily extent this to for 3 or more initials, or to more than 3 digits as well.)

To test the above methods, we can add the following main method:
public static void Main(){
    string book = "15 AB 219 AE 518 SE 518 AB 17 CE";
    int number = 5;
    Book a = new Book();
    System.Console.WriteLine( a.bookOrder( book, number ) );
}


This was meant to be a quick tutorial to coders that already knows how to program, and to provide them alternative way of solving a problem quicker.

Links
Sorting in Java - my original tutorial here on devhood.
Devhood - Community with many resources.

Return to Browsing Tutorials

Email this Tutorial to a Friend

Rate this Content:  
low quality  1 2 3 4 5  high quality

Reader's Comments Post a Comment
 
very good tutorial larry. 5 *s
-- Justin Jones, September 21, 2002
 
Nice tutorial. Personally, I prefer the Array.Sort method, but that's me. Good overview of sorting in C#.
-- Robert Wlodarczyk, September 23, 2002
 
Ya, for general clean coding, using Array.Sort with a Comparable is definitely preferred, but the stuff after it is geared slightly toward coding competitions..

thanks for the comments.
-- Larry Mak, September 24, 2002
 
Good in depth look at sorting
-- Dan Cramer, October 10, 2002
 
I'm lost but its great ^_^
-- Brian Li, October 10, 2002
 
OH i get it now!
-- Brian Li, October 10, 2002
 
nice tutorial, larry. 5*.
-- Aaron Brethorst, October 21, 2002
 
Good introduction, 5*
-- Daniele Pagano, October 30, 2002
 
Good analysis of the common inefficiencies programmers often don't remember
-- Marcus Griep, November 10, 2002
 
oh dang. i was starting to laugh when you showed us the bubble sort, but then after that i realized it was only an intro... phEW!
-- John Woo, November 17, 2002
 
It was intended to show you what not to do. Besides, if you had only 5 minutes to do a sort in a time-intensive coding competition, and the input size is small enough, N^2 is not too bad. The point of showing the bubble sort is because most people get it wrong and ended up writing an instable bubble sort, which can cause some undesireable results if you're not too careful.
-- Larry Mak, November 20, 2002
 
Copyright © 2001 DevHood® All Rights Reserved