jSGB

2007/02/12

Genbuild released

Genbuild is a generic Apache Ant build script that uses "inheritance" to re-use tasks in a way that it is possible to do a complete build only by supplying dependencies on those tasks and overriding the appropriate properties in each sub-project.

Links to the project:
http://sourceforge.net/projects/genbuild
http://genbuild.sourceforge.net

# posted by SGBware @ 20:44 0 comments

Riddle


/**
 * This solves the riddle:<BR><BR>
 * 
 * If you have a 10 digit number using all the digits 0-9, find a number
 * where the leftmost value is divisible by 1, the two leftmost values are
 * divisible by 2, etc. The full number will eventually be divisible by 10.
 * 
 * The answer to this riddle is 3816547290.
 */
private static void solve() {
  for (int x0 = 0; x0 <= 9; x0++) {
    final long t0 = x0;
    for (int x1 = 0; x1 <= 9; x1++) {
      if (x1 == x0) continue;
      final long t1 = 10 * t0 + x1;
      if (t1 % 2 != 0) continue;
      for (int x2 = 0; x2 <= 9; x2++) {
        if (x2 == x1 || x2 == x0) continue;
        final long t2 = 10 * t1 + x2;
        if (t2 % 3 != 0) continue;
        for (int x3 = 0; x3 <= 9; x3++) {
          if (x3 == x2 || x3 == x1 || x3 == x0) continue;
          final long t3 = 10 * t2 + x3;
          if (t3 % 4 != 0) continue;
          for (int x4 = 0; x4 <= 9; x4++) {
            if (x4 == x3 || x4 == x2 || x4 == x1 || x4 == x0) continue;
            final long t4 = 10 * t3 + x4;
            if (t4 % 5 != 0) continue;
            for (int x5 = 0; x5 <= 9; x5++) {
              if (x5 == x4 || x5 == x3 || x5 == x2
               || x5 == x1 || x5 == x0) continue;
              final long t5 = 10 * t4 + x5;
              if (t5 % 6 != 0) continue;
              for (int x6 = 0; x6 <= 9; x6++) {
                if (x6 == x5 || x6 == x4 || x6 == x3
                 || x6 == x2 || x6 == x1 || x6 == x0) continue;
                final long t6 = 10 * t5 + x6;
                if (t6 % 7 != 0) continue;
                for (int x7 = 0; x7 <= 9; x7++) {
                  if (x7 == x6 || x7 == x5 || x7 == x4
                   || x7 == x3 || x7 == x2 || x7 == x1 || x7 == x0) continue;
                  final long t7 = 10 * t6 + x7;
                  if (t7 % 8 != 0) continue;
                  for (int x8 = 0; x8 <= 9; x8++) {
                    if (x8 == x7 || x8 == x6 || x8 == x5
                     || x8 == x4 || x8 == x3 || x8 == x2
                     || x8 == x1 || x8 == x0) continue;
                    final long t8 = 10 * t7 + x8;
                    if (t8 % 9 != 0) continue;
                    for (int x9 = 0; x9 <= 9; x9++) {
                      if (x9 == x8 || x9 == x7 || x9 == x6
                       || x9 == x5 || x9 == x4 || x9 == x3
                       || x9 == x2 || x9 == x1 || x9 == x0) continue;
                      final long t9 = 10 * t8 + x9;
                      if (t9 % 10 != 0) continue;
                      System.out.println(x0 + "" + x1 + "" + x2 + ""
                                       + x3 + "" + x4 + "" + x5 + ""
                                       + x6 + "" + x7 + "" + x8 + "" + x9);
                      return;
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

# posted by SGBware @ 20:25 0 comments

2006/11/23

Design Pattern: The enum logger

Recently I have come up with this design pattern that solves many of the problems relating to logging in both small and large applications. Some of the problems are related to:

Ability to change logger (implementation, e.g. Log4j)

Managing log statements in the application (e.g. knowing what is logged)

Managing log levels in the application (e.g. what is logged at what level)

Managing multiple loggers (different outputs, e.g. system log, service log, etc.)

My solution is to have the logger inside an enum, where the log statement is stored as a string. Additionally and optionally, the log levels and the logger itself can also be stored inside the enum, simplifying logging quite dramatically. Here is an example of an implementation that would be used for logging inside an RS232 component (serial communication).

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

@LogStatements
public enum RS232LogStatements {
  
  /** Debug freetext.<BR> %0 = Any debug string */
  RS232_DEBUG("%0", Level.DEBUG),
  /** Trace freetext.<BR> %0 = Any trace string */
  RS232_TRACE("%0", Level.TRACE),
  /** Port %0 in use<BR> %0 = Port name */
  RS232_001("Port %0 in use", Level.WARN),
  /**
   * Port %0 is currently owned by %1<BR> %0 = Port name
   * <BR> %1 = Port owner name
   */
  RS232_002("Port %0 is currently owned by %1", Level.INFO),
  /** Port %0 does not exist.<BR> %0 = Port name */
  RS232_003("Port %0 does not exist", Level.INFO);
  
  private static final Logger logger = Logger.getLogger("RS232");
  
  private final String logStatement;
  private final Level logLevel;
  
  private RS232LogStatements(final String logStatement,
                             final Level logLevel) {
    this.logStatement = logStatement;
    this.logLevel = logLevel;
  }
  
  public String toString() {
    return logStatement;
  }
  
  public void log(final String ... values) {
    if (logger.isEnabledFor(logLevel)) {
      logger.log(logLevel, parse(values));
    }
  }
  
  public void log(final Exception e) {
    if (logger.isEnabledFor(logLevel)) {
      logger.log(logLevel, "", e);
    }
  }
  
  public void log(final Exception e, final String ... values) {
    if (logger.isEnabledFor(logLevel)) {
      logger.log(logLevel, parse(values), e);
    }
  }
  
  private String parse(final String ... values) {
    final int size = values.length - 1;
    String returnString = logStatement;
    for (int i = size; i >= 0; i--) {
      returnString = returnString.replace("%" + i, values[i]);
    }
    return returnString;
  }
}

Now, it is possible to have one of these enum loggers as a system-wide logger, or it is possible to have one for each single component in the system (however "component" is defined in the system at hand). The design pattern allows for log levels to be changed without the need to dig deep into code to find exactly where the log statement was logged. You can easily change the type of logger as well by simply changing the enum, while the code in the system itself remains untouched. Also, the parameterised parsing actually allows for unlimited number of parameters being replaced in the log string, though more than three is fairly unlikely. Moreover, if something like having the name of the enum as a prefix to the log message is preferable, then this can very easily be achieved by changing the log method(s) in the enum itself. Additionally, when using the enum logger all that have to be done to use it is simply doing static imports. Here is an example of how it can be used:

import static ...RS232LogStatements.*;

  .
  .
  .

  private SerialPort openSerialPort(CommPortIdentifier comm) {
    try {
      final SerialPort serialPort = (SerialPort) comm.open(appName,
                                                           timeout);
      serialPort.setSerialPortParams(baudrate, databits,
                                     stopbits, parity);
      return serialPort;
    } catch (PortInUseException e) {
      RS232_001.log(e, comm.getName());
    } catch (UnsupportedCommOperationException e) {
      RS232_DEBUG.log(e);
    }
    return null;
  }

Some extra benefits that are not that obvious would be that when you hover the enum being used in an IDE, the JavaDoc neatly comes up and displays what is being logged and what the parameters are. What should also not be missed is the @LogStatements annotation attached to the enum, which could be used by a simple tool to gather all log statements in the system and use these to produce documentation. More importantly, if the enum implements an interface that defines the methods used for the logging, then they could all be populated from a single aspect using AOP. The main thing to consider is will the software ever log the exact same log statement more than once at different log levels? If so, the enum logger could still be used by having the same statements at different levels. Also it should be noticed that in this example there is only one "debug" element. This is mainly because debug messages normally do not need to be "managed" the way other log messages need to be, as the developers have access to the source code anyway. However, if it is required to have a clear separation between, say, system and service logging, then it could be specified at the enum level by having SERVICE_RS232_001 and SYSTEM_RS232_001 statements. The main positive thing with this design pattern is that when the log statements are defined, then you don't have to stop and carefully consider "is this debug, info, warn, or error level?", you simply just do enum.log();

# posted by SGBware @ 00:46 0 comments

2006/06/27

Odd and even numbers

I don't think you'll ever find a better way of detecting whether a number is odd or even. The binary AND operator is brilliant in this setting.


if ((number & 1) == 0); // even
if ((number & 1) == 1); // odd

of course, I believe in C you'd be able to do something like


if (number & 1); // true if odd and false if even

Link: http://www.rgagnon.com/javadetails/java-0488.html

# posted by SGBware @ 16:25 0 comments

2006/04/23

CSV Reader / Writer

A convenient tool for quickly reading and/or writing CSV data in Java, has been of great help at more than one occasion. Useful when you want to concentrate on the fun parts of programming.

# posted by SGBware @ 22:46 0 comments

2006/04/17

GSM 7-bit encoding and decoding

The GSM 7-bit encoding is a method for compressing data by using the leftover bit when representing the data in 8-bit octets. A simple explanation on how this bit-fiddeling works can be found here:
http://www.dreamfabric.com/sms/hello.html

Information on SMS and the PDU format can be found here:
http://www.dreamfabric.com/sms/

In this blog I will examine this seemlingly simple algorithm a bit closer and point out a few problem areas that can easily lead to early alopecia for implementors.

Examining the "hellohello" example:

The most natural way of solving the problem would be to reverse the the array of values, then chip off 7 bits at the time from the end of the array and working our way to the beginning. Now if we use binary strings on the form "01101001" it is fairly easy, we just use substring and iterate our way to the start of the string. This is ok for a simple algorithm that just needs to solve the problem or for learning. However, this is not very performant, so we'll use primitives instead.

Assume we have two helper arrays like this:


final int[] upperBits = {
  0xFE, // 0 = B7|B6|B5|B4|B3|B2|B1
  0xFC, // 1 = B7|B6|B5|B4|B3|B2
  0xF8, // 2 = B7|B6|B5|B4|B3
  0xF0, // 3 = B7|B6|B5|B4
  0xE0, // 4 = B7|B6|B5
  0xC0, // 5 = B7|B6
  0x80  // 6 = B7 
};

final int[] lowerBits = {


  0x01, // 0 =                   B0
  0x03, // 1 =                B1|B0
  0x07, // 2 =             B2|B1|B0
  0x0F, // 3 =          B3|B2|B1|B0
  0x1F, // 4 =       B4|B3|B2|B1|B0
  0x3F, // 5 =    B5|B4|B3|B2|B1|B0
  0x7F  // 6 = B6|B5|B4|B3|B2|B1|B0
};

The upper bits and the lower bits will be referred to as U and L respectively, where L3 means index 3 in lower bits yielding 0x0F.

The encoding of "hellohello" in the GSM 7bit default alphabet is:
E8329BFD4697D9EC37
and in binary:

00110111 11101100 11011001

10010111 01000110 11111101

10011011 00110010 11101000

Now we reverse the order of the values which gives us 37ECD99746FD9B32E8 and we make a note of which bits we need (from the helper arrays) in order to extract the last 7 bits.

00110111 11101100 11011001
10010111 01000110 11111101
10011011 00110010 1	1101000

If we continue in this manner we'll get something like this:

00	110111		1	1101100		1101100	1		100101	11		01000	110		1111	1101		1001	1011
00	110010		1	1101000

U6+L5

U5+L4

U4+L3

U3+L2

U2+L1

U1+L0

U6+L5

So it is quite simple and we could go about and implement an algorithm to decode GSM 7bit. However, it wouldn't be truly optimal as we do two operations, first we reverse the list and then we decode it. In general it is not really a performance issue, as the SMS limit is 140 bytes. But if one takes a closer look at the bit-fiddeling above, then it can easily be decoded without reversing the array. Repeating the process by starting from the beginning of the array yields this:

1	1101000		00	110010		100	11011		1111	1101		01000	110		100101	11		1101100	1
1	1101100		00	110111

U6+L5

U5+L4

U4+L3

U3+L2

U2+L1

U1+L0

U6+L5

The bit values that we need for our algorithm is evidently the same, which also is logical, but perhaps not easy to spot at first glance. With this information we can go on and implement our algorithm:


  /**
   * Decodes GSM 7-bit septets in 8-bit octets.
   *
   * @param ud The 8-bit octets representing 7-bit data
   * @param udl The message length, if known
   * @return The converted message
   */
  public static int[] decode7bit(int[] ud, int udl) {
    final int[] output = new int[udl];    

    int b = 6, p = 0;
    for (int i = 0; i < udl; i++) {
      switch (b) {
      case 7: // U0
        output[i] = (ud[p] & upperBits[0]) >> 1;
        break;        

      case 6: // L6
        output[i] = ud[p] & lowerBits[b];
        break;

      default: // The rest
        output[i] = ((ud[p] & lowerBits[b]) << (6 - b))
                  + ((ud[p-1] & upperBits[b+1]) >> (b+2));
        break;
      }

      if (--b == -1) b = 7;
      else p++;
    }

    return output;
  }

Ok, so far so good, but there is still a bit further to go.

Question: Do we really need to know the length of the message? Wouldn't it be much more convenient to calculate it by doing ud.length * 8 / 7?

Answer: We could calculate it, but it would be wrong 1 in 7 times, enough to make us think the algorithm can decode anything until that odd one comes in and spoils the evening. Why? The benefit of this algorithm is that you can store an extra character per 7 characters you use, i.e. you can store eight 7bit characters in seven bytes. Now, if your message is "message", how does your algorithm know that you are not using the last 7 bits with you have gained by using this encoding? Well, you could check if the final character is of value 0, and if so omit it, though then you'd probably have to recreate the array as well. In other words, it gets quite messy when you don't know the actual length of the message. This algorithm could perfectly be used to decode alphanummeric destination and originating addresses of the SMS, but is not quite good enough to fully decode all SMS messages.

What this algorithm is currently missing is fill bits support. The need for fill bits is actually an odd one. It has to do with that the length of the message represents the total number of 7bit segments. When user data header is present, which is a number of 8bit octets, a certain number of fill bits are needed to make the message itself start at a septet boundary. The fill bits will shift all bits in the message between 1 and 6 spaces, more than 6 fill bits does not make much sense and we'll not support that in this algorithm. The fill bits complicates the algorithm further as when using int or byte arrays each character can span across 2-3 values, and it may not be possible to create a simple algorithm using the helper arrays as in the previous example. Perhaps it is possible to generate helper arrays based on the number of fill bits, but it may just as well be more performant to just shift the bits first and then use our first algorithm. Here's my current solution of a GSM 7bit default alphabet decoding:


  /**
   * Decodes GSM 7-bit septets in 8-bit octets.
   *
   * @param ud The 8-bit octets representing 7-bit data
   * @param fillBits The number of fill bits, if any
   * @param udl The message length, if known
   * @return The converted message
   */
  public static int[] decode7bit(int[] ud, int udl,
                                 int fillBits) {
    final int[] upperBits = {
        0xFE, // 0 = B7|B6|B5|B4|B3|B2|B1
        0xFC, // 1 = B7|B6|B5|B4|B3|B2
        0xF8, // 2 = B7|B6|B5|B4|B3
        0xF0, // 3 = B7|B6|B5|B4
        0xE0, // 4 = B7|B6|B5
        0xC0, // 5 = B7|B6
        0x80  // 6 = B7 
    };    

    final int[] lowerBits = {
        0x01, // 0 =                   B0
        0x03, // 1 =                B1|B0
        0x07, // 2 =             B2|B1|B0
        0x0F, // 3 =          B3|B2|B1|B0
        0x1F, // 4 =       B4|B3|B2|B1|B0
        0x3F, // 5 =    B5|B4|B3|B2|B1|B0
        0x7F  // 6 = B6|B5|B4|B3|B2|B1|B0
    };    

    final int length = ud.length;
    if (fillBits != 0) {
      final int len = length - 1;
      final int cut = lowerBits[fillBits - 1];
      final int move = 8 - fillBits;
      for (int f = 0; f < len; f++) {
        ud[f] >>= fillBits;
        ud[f] |= (ud[f + 1] & cut) << move;
      }
      ud[len] >>= fillBits;
    }    

    if (udl < 0) udl = length * 8 / 7;
    final int[] output = new int[udl];    

    int b = 6, p = 0;
    for (int i = 0; i < udl; i++) {
      switch (b) {
      case 7: // U0
        output[i] = (ud[p] & upperBits[0]) >> 1;  
        break;        

      case 6: // L6
        output[i] = ud[p] & lowerBits[b];
        break;
        
      default: // The rest
        output[i] = ((ud[p] & lowerBits[b]) << (6 - b))
                  + ((ud[p-1] & upperBits[b+1]) >> (b+2));
        break;
      }

      if (--b == -1) b = 7;
      else p++;
    }

    return output;

}

What about encoding? Well, having done the decoding it should be a breeze to do the encoding, right? The pattern from the hellohello example should be quite obvious, shifting the bits from the current value rightwards while merging this with the bits from the next value shifted leftwards. The only thing to note here is the gap that appears between the two arrays (ud and output), which increases every 8 characters due to the compression. So at position 160 in the array ud, we will be in the position 140 in the output array, thus the limit of 140 bytes gives 160 7bit letters in the SMS message. Here's the algorithm:


  /**
   * Converts 7bit septets into octets. Make sure the user-data
   * has been converted into the GSM 7-bit default alphabet
   * before using this method.
   *
   * <BR><A HREF="http://www.dreamfabric.com/sms/hello.html">
   *              http://www.dreamfabric.com/sms/hello.html</A&gt
   *
   * @param ud The data to be transformed into 7bit septets
   * @param fillBits Fill bits in range 0 - 6
   * @return The encoded message
   */
  public static int[] encode7bit(int[] ud, int fillBits) {
    final int[] upperBits = {
        0xFE, // 0 = B7|B6|B5|B4|B3|B2|B1
        0xFC, // 1 = B7|B6|B5|B4|B3|B2
        0xF8, // 2 = B7|B6|B5|B4|B3
        0xF0, // 3 = B7|B6|B5|B4
        0xE0, // 4 = B7|B6|B5
        0xC0, // 5 = B7|B6
        0x80  // 6 = B7 
    };
    
    final int[] lowerBits = {
        0x01, // 0 =                   B0
        0x03, // 1 =                B1|B0
        0x07, // 2 =             B2|B1|B0
        0x0F, // 3 =          B3|B2|B1|B0
        0x1F, // 4 =       B4|B3|B2|B1|B0
        0x3F, // 5 =    B5|B4|B3|B2|B1|B0
        0x7F  // 6 = B6|B5|B4|B3|B2|B1|B0
    };

    final int udl = ud.length;
    final int ol = (udl * 7) / 8;
    final boolean rest = ol != (double)udl * 7 / 8;
    final boolean fill = fillBits != 0;
    int[] output = new int[ol + (fill ? 1 : 0)
                              + (rest ? 1 : 0)];

    int b = 0, p = 0;
    for (int i = 0; i < ol; i++) {
      output[i] = (ud[p] >> b)
                + ((ud[p + 1] & lowerBits[b]) << 7 - b);
      if (++b == 7) {
        b = 0;
        p++;
      }
      p++;
    }
    if (rest) output[ol] = ud[p] >> b;   

    if (fill) {
      final int cut = upperBits[7 - fillBits];
      final int move = 8 - fillBits;
      p = 0;
      for (int i = 0; i <= ol + 1; i++) {
        b = output[i] & cut;
        output[i] ^= b;
        output[i] <<= fillBits;
        if (p != 0) output[i] = p;
        p = b >> move;
      }
    }

    return output;
  }

A few more things to note here, the output array's length is defined by the number of bytes needed to store all characters plus an eventual rest and eventual fill bits. The rest is what's left over at the end of the array after our algorithm has finished, there will be a rest in 6 out of 7 times. Since the fill bits essentially shifts the whole array up to 6 bits, then the array will need the extra space. Keeping the helper arrays inside the method or static outside is up to the user, having them static is probably more efficient for small encoding/decoding sets such as SMS messages. When it comes to fill bits for encoding I still think it is easiest to do it in two steps as combining the encoding and shifting makes things very complicated. If someone solves this problem and would want to share it with me, I'd be very interested in comparing the complexity of the algorithms and performance.

# posted by SGBware @ 18:50 7 comments

jSGB

2007/02/12

Genbuild released

Riddle

2006/11/23

Design Pattern: The enum logger

2006/06/27

Odd and even numbers

2006/04/23

CSV Reader / Writer

2006/04/17

GSM 7-bit encoding and decoding

About Me

Links

Previous Posts

Archives