
April 28, 2018

Multiplying by Power of 10

Ran across a really clever algorithm for integer multiplication by a power of 10.

C source code:

static inline int multiplyByPowerOf10( int value, int power )
{
  value <<= power;
  while ( 0 != power )
  {
    value += value << 2;
    power -= 1;
  }

  return value;
}

The first thing to notice is that there isn't a single multiplication: the function is done entirely with shifts (multiplications by powers of two) and adds. The function does work, but what exactly is going on?

The algorithm takes advantage of the relationship 10^n = 2^n·5^n. The first left shift is the multiplication by 2^n and the following loop performs the multiplication by 5^n. The multiply by 2^n is fairly obvious, but the loop is more complicated. Start by noting that 5x = x + 4x, and 4x is just x shifted left by two (x << 2). The loop does this n times. Mathematically what we have is the following equivalence:

  x·5^n = x·(1 + 4)·(1 + 4)·…·(1 + 4), with the factor (1 + 4) appearing n times.

The while loop doesn't look like this as written, but keep in mind the following lines are all equivalent operations:

value += value << 2;
value += 4 * value;
value = value + 4 * value;
value = value * ( 1 + 4 );
value = value * 5;

Putting it all together, the function computes:

  multiplyByPowerOf10( x, n ) = (x·2^n)·5^n = x·(2·5)^n = x·10^n
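
As a quick sanity check, here is a minimal test sketch (hypothetical values, assuming the function above is in scope):

#include <assert.h>
#include <stdio.h>

// Exercise multiplyByPowerOf10 with a few known results.
int main( void )
{
  assert( multiplyByPowerOf10(   7, 0 ) ==       7 );
  assert( multiplyByPowerOf10(   7, 1 ) ==      70 );
  assert( multiplyByPowerOf10(   7, 3 ) ==    7000 );
  assert( multiplyByPowerOf10( 123, 4 ) == 1230000 );

  printf( "All checks passed.\n" );

  return 0;
}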

At the assembly level the function can be implemented in just a few instructions.

Intel x86 (32-bit)

  sal   eax, cl           # value <<= power
L4:
  lea   eax, [eax+eax*4]  # value = value + value * 4
  sub   ecx, 1            # power -= 1
  jne   L4

Atmel AVR (8-bit)

  mov r0,r22 # value <<= power
  rjmp 2f
  1:
  lsl r24
  rol r25
  2:
  dec r22
  brpl 1b
 
.L4:
 
  # value += value << 2;
  mov r18,r24
  mov r19,r25
  lsl r18
  rol r19
  lsl r18
  rol r19
  add r24,r18
  adc r25,r19
 
  subi r22,lo8(-(-1))    # power -= 1
  brne .L4

ARM (32-bit)

  lsl  r0, r0, r1
.L2:
  add  r0, r0, r0, lsl #2
  subs  r1, r1, #1
  bne  .L2

Atmel AVR32 (32-bit)

  lsl  r12, r12, r11
.L2:
  sub  r11, 1
  add  r12, r12, r12 << 2
  cp.w  r11, 0
  brne .L2

The 32-bit assembly is just a few instructions (assuming power is not zero). The 8-bit Atmel assembly is larger for a couple of reasons. First, because this is an 8-bit processor, 16-bit integer operations require two instructions for each shift and add. Second, the AVR instruction set can only shift one bit at a time. Although the ARM is a RISC processor and the Intel a CISC, the number of ARM and x86 instructions is identical. Unlike the ARM's subs instruction, the 32-bit AVR's sub instruction doesn't set flags, requiring an extra compare instruction. Otherwise all three 32-bit implementations are equivalent.

While this technique is clever, modern CPUs have very fast multiply instructions. The ARM requires 1 to 2 clocks for a multiply, and Intel's multiply takes about 2 operations when working on registers (the exact timing is more complex). Thus a loop that simply multiplies by 10 would be faster. Still, this algorithm is interesting and was fun to investigate.
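
For comparison, a minimal sketch of the straightforward version, which simply leans on the hardware multiplier, might look like this:

static inline int multiplyByPowerOf10_plain( int value, int power )
{
  // Multiply by 10 once for each power.
  while ( power > 0 )
  {
    value *= 10;
    power -= 1;
  }

  return value;
}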

February 01, 2018

Auto-Incrementing Build Number In MakeFile

Tracing binaries back to archived sources is a necessity for a programmer. When writing software there is generally a release process in which the binaries and accompanying source code are archived. Using version control it is fairly easy to tag a specific snapshot of the source code so it can be linked to (or included in) the resulting binary. However, this formal release system doesn't work as well during development. Releases are often lengthy processes not conducive to rapid development and debugging.

One system I have been using since the early 2000s is an automated build number. Each time the source code is compiled, a build number is incremented. As an embedded developer, the software I work on usually cannot be directly identified, but since the build number is increased on every compile it is guaranteed to be unique (even if two builds are otherwise identical). This build number is accessible from the outside world and thus pinpoints exactly what software a controller is running. In addition, I archived every binary ever produced so that if a problem was found on a unit under test I could duplicate the setup. That was important for a small company where software was constantly changing and tests were constantly being run. Devices in the field were better controlled, but development was not.

When I started using version control more, I kept the build number concept. With Subversion I was able to link in an ID related to the version in the repository. It required that software first be fully checked into version control before compiling binaries, but it worked well enough. There was no good way to extend this functionality to Git, so I fell back on the build number concept.

Recently I discovered what I think is the best incarnation of the build number. I tend to use makefiles for compiling, and this section of code takes care of build numbers.

BUILD_NUMBER_HEADER = buildNumber.h
BUILD_NUMBER_FILE = buildNumber.txt

…

# Update the build number file whenever any source file changes.
$(BUILD_NUMBER_FILE): $(cFiles) $(filter-out ./$(BUILD_NUMBER_HEADER), $(hFiles)) $(sFiles)
	@read lastNumber < $(BUILD_NUMBER_FILE);                               \
	newNumber=$$(($$lastNumber + 1));                                      \
	echo "$$newNumber" > $(BUILD_NUMBER_FILE)

# Create the build number header file.
# Increments the build number and places this in the header.
$(BUILD_NUMBER_HEADER): $(BUILD_NUMBER_FILE)
	@read buildNumber < $(BUILD_NUMBER_FILE);                                  \
	export BUILD_DATE=`stat --format=%y $(BUILD_NUMBER_FILE) | cut -c 1-23`;   \
	echo "#ifndef _BUILDNUMBER_H_"                   > $(BUILD_NUMBER_HEADER); \
	echo "#define _BUILDNUMBER_H_"                  >> $(BUILD_NUMBER_HEADER); \
	echo ""                                         >> $(BUILD_NUMBER_HEADER); \
	echo "#define BUILD_NUMBER  $$buildNumber"      >> $(BUILD_NUMBER_HEADER); \
	echo "#define BUILD_DATE    \"$$BUILD_DATE\""   >> $(BUILD_NUMBER_HEADER); \
	echo ""                                         >> $(BUILD_NUMBER_HEADER); \
	echo "#endif // _BUILDNUMBER_H_"                >> $(BUILD_NUMBER_HEADER); \
	echo ""                                         >> $(BUILD_NUMBER_HEADER);

There is a file called buildNumber.txt which contains a number that is incremented whenever the source code changes. A header file, buildNumber.h, is included somewhere in the project. It contains two defines: the current build number, and a time stamp of when the compilation took place. The header file is auto-generated. It can be left out of the main branch of the repository, but should be included in branches for tagged releases. The build number file must be checked into the repository for both trunk and branches.
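
As an illustration, a minimal sketch of how the generated defines might be consumed (the printVersion function and the use of printf are hypothetical; an embedded target might report over a serial port or a status register instead):

#include <stdio.h>
#include "buildNumber.h"

// Report which build is running using the auto-generated header.
void printVersion( void )
{
  printf( "Build %d, compiled %s\n", BUILD_NUMBER, BUILD_DATE );
}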

What is going on in the makefile is two rules. The first regenerates the build number text file. It depends on all the source code (less the build header), so should any source file change, the build number is incremented. The second rule is for the build header file, which must be regenerated if the build number text file changes. So any change to the source code causes a new build header file to be created. As long as the dependencies for the build header are properly handled, this system will always produce a unique build number when the software changes. However, unlike my previous system, the build number will not change if the software has not changed. This allows source code in a repository to be exactly reproduced. As long as source code was checked in, binaries can be traced back to source code and reproduced from source code.

Generally I use a universal Makefile for my projects. It simply compiles all C files and computes dependencies automatically. There are some configuration parameters, such as the name of the resulting executable, but for the most part the Makefile can be dropped into any new project and, with minimal changes, compile it.

November 16, 2017

Impulse Noise Filtering - Slew Limit Filter

In the past few articles in this series I have demonstrated using a median filter to mitigate impulse noise. The problem with a median filter is that, even using a small window size and taking advantage of the speed of an insertion sort on mostly sorted data, the filter is still significantly slower than a windowed average. There is another option. It does not filter as nicely as a median filter, but it does help to dampen impulse noise.

The slew rate is a measure of how quickly voltage changes over time. A windowed average acts as a limiter on slew rate, but usually not enough to attenuate impulse noise. However, one can directly limit the slew rate of a signal. The equation is quite simple:

  Δ_n = a_n - f_(n-1)
  f_n = f_(n-1) + sign(Δ_n)·min( |Δ_n|, s_max )

For each sample of the input (a_n), the change (Δ_n) of the filter output (f_n) from its previous value is limited to some maximum (s_max). If the magnitude of the change is less than the maximum, the filter output is the same as the input. If the magnitude exceeds the maximum, the filtered output changes by the maximum amount and no further.

Following the examples of the previous articles on this subject, here is an example of the filter output:

There is some distortion on the signal caused by the impulse noise but the filtered output is fairly effective at eliminating the large spikes. When used in combination with a first-order low-pass filter the input signal is fairly clear.

The reason this kind of filter is attractive is because of how simple it is to implement.

//-----------------------------------------------------------------------------
// Uses:
//   In-place filter of input buffer.
// Input:
//   data - Data to be filtered.
//   dataSize - Number of data points.
//   limit - Maximum change allowed.
// Output:
//   Filtered output is returned in 'data'.
// Author:
//   Andrew Que <https://www.DrQue.net/>
//-----------------------------------------------------------------------------
static inline void slewLimitFilter
(
  int * data,
  size_t dataSize,
  int limit
)
{
  for ( size_t index = 1; index < dataSize; index += 1 )
  {
    int delta = data[ index ] - data[ index - 1 ];
    if ( delta > limit )
      data[ index ] = data[ index - 1 ] + limit;
    else
    if ( delta < -limit )
      data[ index ] = data[ index - 1 ] - limit;
  }
}
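
As mentioned above, the slew limit filter works well ahead of a first-order low-pass filter. A minimal sketch of one way to combine them (the exponential low-pass and its divisor of 8 are arbitrary example choices, not from the measurements above):

static inline void slewLimitAndLowPass
(
  int * data,
  size_t dataSize,
  int limit
)
{
  // First knock down the impulse spikes.
  slewLimitFilter( data, dataSize, limit );

  // Then smooth what remains with a first-order low-pass:
  //   y[ n ] = y[ n - 1 ] + ( x[ n ] - y[ n - 1 ] ) / 8
  int smoothed = data[ 0 ];
  for ( size_t index = 1; index < dataSize; index += 1 )
  {
    smoothed += ( data[ index ] - smoothed ) / 8;
    data[ index ] = smoothed;
  }
}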

Just as with any filter, artifacts are introduced by the slew limit filter.

This sine wave has become a triangle wave because the slew limit is too low. This shape is the result of the filter attenuating the actual data signal. Here is another demonstration.

Here is a clean signal, but because the slew limit is too low the filter cannot keep up with the rate of change of the true signal.

The other problem with the slew limit filter is that it is not very strong.

Here the impulse noise has been doubled and the filter has produced a very messy result. So the filter is clearly not a replacement for a median filter. If you can afford the processing power, use the median filter. If you cannot, the slew limit filter may be a possibility. It depends on what the impulse noise looks like, and how much signal distortion can be tolerated.

November 06, 2017

Impulse Noise Filtering - Implementation of Median Filter

When implementing the median filter there are two options: off-line or real-time. When done off-line, the filter loops through each data point and calculates the median over a window of the data. Here is an example with a buffer size of 5 and a median window size of 3:

  55   42   1000   86   61     (window: 55, 42, 1000)
  55   42   1000   86   61     (window: 42, 1000, 86)
  55   42   1000   86   61     (window: 1000, 86, 61)

Taking the median of the window in each row, the filtered data would then be:

  55   86   86

Note how the spike of 1000 is filtered out. Note also that the resulting buffer is shorter: the filtered data is always shorter than the initial data by one less than the window size.
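
To make the off-line case concrete, here is a minimal sketch with the window size fixed at 3 (the general, windowed version appears later in this article):

#include <stddef.h>

// Median of three values using simple compare-and-swap steps.
static int medianOfThree( int a, int b, int c )
{
  if ( a > b ) { int swap = a; a = b; b = swap; }
  if ( b > c ) { int swap = b; b = c; c = swap; }
  if ( a > b ) { int swap = a; a = b; b = swap; }

  return b;
}

// Off-line median filter with a fixed window of 3.  The output is two
// samples shorter than the input.
static void medianFilter3( int const * input, int * output, size_t inputSize )
{
  for ( size_t index = 0; ( index + 2 ) < inputSize; index += 1 )
    output[ index ] = medianOfThree( input[ index ],
                                     input[ index + 1 ],
                                     input[ index + 2 ] );
}

Running this over the five-sample buffer above produces 55, 86, 86, matching the table.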

If running the filter in real-time, there needs to be a ring buffer that holds a window's worth of samples. The filtered result is the median of this buffer. New samples replace old samples and the filter continues. Starting this filter requires either loading some starting values or waiting until enough samples have been taken to fill the buffer.

If the following is the initial ring buffer at start:

  21   30   71   62   72

Then the first output is 62. Assume the current index into the buffer points at the second sample. When a new sample arrives, it will replace the second sample. For instance, if the new sample is 41, the ring buffer becomes:

  21   41   71   62   72

The median of this is again 62, which is the output of the filter. In this way the filter can be used in real-time.

Getting the filter started is up to the implementer. Most of the time simply using an initial buffer full of zeros works just fine. One can also delay producing filtered output until enough samples have been acquired. Alternatively, the entire buffer can be loaded with the first sample taken, or the window used for the calculation can gradually grow until it reaches the desired size. Which option to use depends on the application.

A basic implementation of the median calculation itself is simple: sort the data and choose the middle data point. However, sorting data is a time-consuming task and fast sorting algorithms don't have fixed run times. In addition, if one is using a ring buffer to hold samples, the sorting cannot be done in place.

The easiest sort to implement is a selection sort. It is simple and always takes the same number of steps for any window size. Selection sort is also the slowest sort, with a fixed complexity of O(n²). One bit of acceleration can be gained by sorting only half the data: since the median is the middle data point of a sorted set, the sort only needs to run far enough to place the middle element; everything past it can be ignored.
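
As a minimal sketch of that idea (assuming an arbitrary maximum window of 32 samples), a partial selection sort only orders the elements up to and including the middle position:

static inline int medianBySelection( int const * data, int windowSize )
{
  int sorted[ 32 ];  // Working copy; 32 is an arbitrary upper bound on windowSize.

  for ( int index = 0; index < windowSize; index += 1 )
    sorted[ index ] = data[ index ];

  // Selection sort, stopping once the middle position is settled.
  for ( int indexA = 0; indexA <= ( windowSize >> 1 ); indexA += 1 )
  {
    // Find the smallest remaining element...
    int lowest = indexA;
    for ( int indexB = indexA + 1; indexB < windowSize; indexB += 1 )
      if ( sorted[ indexB ] < sorted[ lowest ] )
        lowest = indexB;

    // ...and move it into place.
    int swap = sorted[ indexA ];
    sorted[ indexA ] = sorted[ lowest ];
    sorted[ lowest ] = swap;
  }

  return sorted[ windowSize >> 1 ];
}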

My co-worker Flux suggested I look into using an insertion sort for small filter window sizes. Worst case, an insertion sort has the same complexity, O(n²), as a selection sort. However, the speed improves the more sorted the data already is; if the data is already sorted, the algorithm runs in O(n). Since we are running a windowed filter, we only ever change one piece of data at a time, so the data remains mostly sorted. As a result the insertion sort runs closer to O(n), with a worst case around O(2n): once through the array, and once again if the last element needs to move to the first position. That allows this filter to still be useful in real-time applications, especially for small window sizes.

Here is a simple C implementation of a median filter useful for real-time applications.

//-----------------------------------------------------------------------------
// Uses:
//   Return the median of the data buffer.
// Input:
//   data - Data from which to select median.
//   indexes - Buffer to hold sort indexes (must contain 'windowSize' elements).
//   windowSize - Number of data points.
// Output:
//   Median of data points.
// Notes:
//   Designed for small data sizes.
//   The 'indexes' must be pre-initialized at least once.  After
//   initialization it can be used for all subsequent calls of the same
//   window size.
//       for ( int index = 0; index < windowSize; index += 1 )
//         indexes[ index ] = index;
// Author:
//   Andrew Que <https://www.DrQue.net/>
//-----------------------------------------------------------------------------
static inline int median
(
  int const * data,
  int * indexes,
  int windowSize
)
{
  // Insert sort the data using swap buffer.
  for ( int indexA = 1; indexA < windowSize; indexA += 1 )
  {
    int value = data[ indexes[ indexA ] ];
    int swapIndex = indexes[ indexA ];

    int indexB = indexA - 1;
    while ( ( indexB >= 0 )
         && ( data[ indexes[ indexB ] ] > value ) )
    {
      indexes[ indexB + 1 ] = indexes[ indexB ];
      indexB -= 1;
    }

    indexes[ indexB + 1 ] = swapIndex;
  }

  // Return median.
  return data[ indexes[ windowSize >> 1 ] ];
}

The code is designed for a ring buffer and requires a second buffer to hold indexes. One should replace int with whatever word size the application requires, but otherwise the code should be drop-in. The algorithm is fairly efficient up to window sizes of a few tens of samples. Beyond that, more efficient selection techniques should be used.
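
As a usage sketch (the wrapper functions and the window size of 5 are hypothetical), here is how the function might sit behind a ring buffer in a real-time loop, seeding the buffer with the first sample as described earlier:

enum { WINDOW_SIZE = 5 };

static int ringBuffer[ WINDOW_SIZE ];
static int sortIndexes[ WINDOW_SIZE ];
static int ringIndex = 0;

// Call once before filtering starts; seeds the buffer and index table.
void medianFilterSetup( int firstSample )
{
  for ( int index = 0; index < WINDOW_SIZE; index += 1 )
  {
    ringBuffer[ index ] = firstSample;
    sortIndexes[ index ] = index;
  }
}

// Call for each new sample; returns the current filtered value.
int medianFilterSample( int newSample )
{
  ringBuffer[ ringIndex ] = newSample;
  ringIndex = ( ringIndex + 1 ) % WINDOW_SIZE;

  return median( ringBuffer, sortIndexes, WINDOW_SIZE );
}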