Andrew Que Sites list Photos
Projects Contact
Main

July 01, 2017

Integer Sizes

I wrote a simple program to display the sizes of the standard types provided by C99. Below is a table of the result of compiling and running this program on 4 different processors, two Intel x86 based processors and two ARM processors. Two are 32-bit, and two are 64-bit.

  AMD 64 i686 Cortex-A53 Cortex-A8
Char 8-bits 8-bits 8-bits 8-bits
Short 16-bits 16-bits 16-bits 16-bits
Int 32-bits 32-bits 32-bits 32-bits
Long 32-bits 64-bits 64-bits 32-bits
Long long 64-bits 64-bits 64-bits 64-bits
Float 32-bits 32-bits 32-bits 32-bits
Double 64-bits 64-bits 64-bits 64-bits
Long double 96-bits 128-bits 128-bits 64-bits
Data pointer 32-bits 64-bits 64-bits 32-bits
Function pointer 32-bits 64-bits 64-bits 32-bits
         
int_least8_t 8-bits 8-bits 8-bits 8-bits
int_least16_t 16-bits 16-bits 16-bits 16-bits
int_least32_t 32-bits 32-bits 32-bits 32-bits
int_least64_t 64-bits 64-bits 64-bits 64-bits
         
int_fast8_t 8-bits 8-bits 8-bits 8-bits
int_fast16_t 16-bits 64-bits 64-bits 32-bits
int_fast32_t 32-bits 64-bits 64-bits 32-bits
int_fast64_t 64-bits 64-bits 64-bits 64-bits
intmax_t 64-bits 64-bits 64-bits 64-bits

The results are mostly expected. The two 64-bit system treat long and long long as 64-bit values where as the 32-bit system treat long as 32-bit and long long as 64-bit. Pointers on a 64-bit system are also 64-bits wide as one would expect. Both data pointers and function pointers are identical in size. This is almost always the case with von Neumann architecture. Harvard architecture, which separates program and data memories, could have differences in data and function pointer sizes. This is something we embedded programmers pay attention to, but few others need worry about.

The fast integer section is kind of interesting. Standard 32-bit Intel, each fast integer is the it’s size. On the rest of the system, faster integers are the word size of the CPU. The latter doesn’t strike me as strange, but the fact the two x86 based CPUs differ makes me wonder. Why does a 32-bit Intel machine consider a 16-bit integer to be faster than a 32-bit integer when the CPU prefers to deal with 32-bit values? From an assembly standpoint I know the answer as x86 assembly breaks up registers by size. For example, the AX register is 16 bits, and consists of the two registers AL and AH (low and high word respectively). The is the 32-bit extension of this register, EAX and the 64-bit extension RAX. Instructions can deal with various operations depending on what registers are used, so the CPU knows how many bits are being worked on. I presume the 64-bit x86 instruction set still allows the use of the 32/16/8 bit instructions. So why is the 64-bit machine using 64-bit words for all fast types? It could be that memory is always accessed 64-bits at a time. This means reads of 16-bit values actually need to read 64-bits and then chop off the 48-bits not used. But the 32-bit processors need to do the same thing. So I’m not sure.

The only other item of notice was the size of long double. On the 32-bit x86 machine this defaults to 96-bits. Modern x86 FPUs can deal with 128-bit floating-point, but by default it looks like the size is 96-bits. Both 64-bit machines have 128-bit floating point. However, the 32-bit ARM machine doesn’t have a floating point value larger then 64-bit. I wasn’t aware that was the case.

One thing that has always bothered me in the embedded world is when programmers use the primitive integers (like int and long) when they are expecting fixed bit sizes. Yes, if you know the platform you are working with you know the primitive sizes. However, if that code ever has to move to another platform, or the CPU changes, all your assumptions are going to change. I’ve been on porting projects where this fact has required a lot of changes. Customer wants to transition from a 32-bit system to 64-bit, and now everything needs to be modified. Had they used the intxx_t and uintxx_t types, little else would have to be modified. I know, I’ve ported code written correctly with relative ease. These types have been available since the C99 standard as in the C standard formalized in 1999.

Some companies opt to alias types so they have the correct sizes. A general header file defines things like uint32 which can be defined as long (on a 16-bit system) or int (on a 32-bit system) or uint32_t on any system with a C99 compiler. That allows the use of an older compiler (C89 compliant or some K&R variant). For companies where code can live and be maintained for decades (Aerospace being an example) I think this practice is a good one. However, for new code that doesn’t need to consider really old compilers, I prefer the C99 types.

June 30, 2017

8 Queens Puzzle Improvement

In doing my write-up for the 8 queens puzzle article, I thought of a further improvement. I was creating a mask of threatened squares and checking the current layout of the chess board with it. However it occurred to me that we didn't need to check threats by row because the way the moves were being done only allowed a single piece per row. Thus there would never be common-row threat.

Then I started thinking about columns. If we simply marked a column as used, any new piece could simply check to see if that column was in use already and know the new piece was creating a threat. But what about diagonals?

For this I started with a labeling all the possible diagonals.

  0 1 2 3 4 5 6 7
0 0 1 2 3 4 5 6 7
1 1 2 3 4 5 6 7 8
2 2 3 4 5 6 7 8 9
3 3 4 5 6 7 8 9 10
4 4 5 6 7 8 9 10 11
5 5 6 7 8 9 10 11 12
6 6 7 8 9 10 11 12 13
7 7 8 9 10 11 12 13 14

 

The pattern for this diagonal is simple: the diagonal occupied by any piece is the row number plus the column number.

  0 1 2 3 4 5 6 7
0 7 8 9 10 11 12 13 14
1 6 7 8 9 10 11 12 13
2 5 6 7 8 9 10 11 12
3 4 5 6 7 8 9 10 11
4 3 4 5 6 7 8 9 10
5 2 3 4 5 6 7 8 9
6 1 2 3 4 5 6 7 8
7 0 1 2 3 4 5 6 7

The opposite diagonal is similar, but it is column plus 7 subtract row.

Both diagonals just require 14-bits, and the column check 8-bits. So I made a quick modification to my program.

//=============================================================================
// Name: queens.c
// Uses: Print all possible combinations of placing 8 queens on chess board
//       without any queen threatening any other.
// Date: 2017-06-15
// Author: Andrew Que (https://www.DrQue.net/)
//
//                                Public Domain
//=============================================================================
#include <stdio.h>
#include <stdbool.h>
#include <inttypes.h>

// Rows/columns on a chess board.
enum { ROWS    = 8 };
enum { COLUMNS = 8 };

// The board state uses 3-bits/row.  This is because there are 8 columns and 2^3 = 8.
enum { COLUMN_BITS = 3 };

// There 3-bits/row times 8 rows for a total of 2^24 possible board states.
enum { STATES = 1 << ( COLUMN_BITS * ROWS ) };

// Mask used to select bits for current column.
enum { COLUMN_BIT_MASK = ~-( 1 << COLUMN_BITS ) };

//-----------------------------------------------------------------------------
// Uses:
//   Print an 8x8 grid representing the queen placements for the given state.
// Input:
//   state - State of chess board.
//-----------------------------------------------------------------------------
static void printState( uint_fast32_t state )
{
  for ( unsigned row = 0; row < ROWS; row += 1 )
  {
    // Column selected in this row.
    unsigned selectedColumn = ( state & COLUMN_BIT_MASK );
    state >>= COLUMN_BITS;

    char rowText[] = "........";
    rowText[ selectedColumn ] = 'x';
    puts( rowText );
  }

  putchar( '\n' );
}

//-----------------------------------------------------------------------------
// Uses:
//   Display all the possible combinations of placing 8 queens on a chess board.
// Output:
//   Always returns 0.
//-----------------------------------------------------------------------------
int main()
{
  // Number of functional board layouts.
  unsigned goodCount = 0;

  // For every possible board state that contains 8 queens...
  for ( uint_fast32_t state = 0; state < STATES; state += 1 )
  {
    // Keep track of the columns and diagonals in use.  There can only ever
    // be one piece per row so no need to check that.
    uint_fast16_t columnUse = 0;
    uint_fast16_t leftDiagonal = 0;
    uint_fast16_t rightDiagonal = 0;

    // True if the current state has threats.  Assumes none at start.
    bool isBad = false;

    // Current row (0-7).
    unsigned row = 0;

    // Sub-state is a copy of state, but shifted as column bits are used.
    uint_fast32_t subState = state;

    // Loop through all rows, adding queens until all are placed or any of the
    // queens threaten another.
    while ( ( row < ROWS )
       && ( ! isBad ) )
    {
      // Each row must have a queen, so we just need to calculate the
      // column the queen is located from the game state.
      unsigned column = subState & COLUMN_BIT_MASK;
      subState >>= COLUMN_BITS;

      // Check column and diagonals to see if this new piece creates a threat.
      // If so, this setup is bad.
      uint_fast16_t mask = ( 1 << column );
      isBad |= ( columnUse & mask ) > 0;
      isBad |= ( leftDiagonal & ( mask << row ) ) > 0;
      isBad |= ( rightDiagonal & ( mask << ( 7 - row ) ) ) > 0;

      // Mark the used column and diagonals.
      columnUse |= mask;
      leftDiagonal |= ( mask << row );
      rightDiagonal |= ( mask << ( 7 - row ) );

      // Next row...
      row += 1;
    }

    // If the loop finished and the state is not bad, this is one of the
    // possible layouts.
    if ( ! isBad )
    {
      if ( false )
        printState( state );
      else
        printf( "%08" PRIXFAST32 "\n", state );
      goodCount += 1;
    }
  }

  // Display the total number of possible configurations.
  printf( "Total: %u\n", goodCount );

  return 0;
}

The results produced by the program are identical. On the same computer as the first program I went from 350 ms to around 190 ms—a 54% speed improvement. Makes we wonder: what other improvements could be made?

Back in 1995 I wrote a program to brute force a solution to the 8 queens puzzle for the game The 7th Guest. It was the first time I recall using my programming skills to solve a real-world problem (although I had written a program to do my math homework prior to this). Prior to this all I had written were simply utilities and graphics demonstrations.

While driving on vacation I thought about this problem and wondered how I would solve it with the knowledge of programming I have today. Didn't take much thought before I decided I had to do an implementation.

The 8 queens puzzle is simple: positions 8 queens on a chess board such that no queen threatens any other. A queen in chess has the ability to move to any square in the same row, same column, or to any square along either diagonal. Thus any piece in this path is threatened.

A chess board contains 64 squares arranged in 8 rows and 8 columns. Since all I am trying to do is setup queens on the board I only need to know if a queen occupies a square or not. Thus I can represent the entire state of the board with a single 64-bit binary bitmap. The first bit is the first row, first column; second bit first row second column; 8th bit the second row, first columns; etc.

0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63

In order to see what squares a queen threatens I need to construct a bitmap of the threatened squares based on the queen's location. The column is easy. Start with the first column of bits all set. Now shift left by the number of columns the queen is located.

               
               
               
         
               
               
               
               

Rows are pretty much as simple. Start with the first row set and shift, but now shift by eight times the row location.

 
               
             
             
             
               
               
               
               

Diagonals are more complected, but one starts with a full diagonal across the board and can shift based on the rows and columns to put it in the correct position.

               
               
           
             
             
           
               
               

When you logically OR the results together you get a bitmap of what squares the queen threatens.

               
               
               
               
             
               
               
               

What's useful here is that if we have a bitmap of the current positions of queens on the chess board and we make a threaten bitmap, logically ANDing the results will show you what pieces are threaten.

               
               
               
             
               
             
               
               

Say we have the above board and want to place a queen in the third column on the second row.

               
             
               
             
               
             
               
               

Here we see the threaten mask. If we AND the threaten mask with the playing field we get cells in red, showing we have a threatened queen by adding the one at the new position. As a binary operation, any non-zero result means there are threatened cells.

Treating the playing field as a bitmap makes threat testing very fast. Now for testing possible setups. For this I will use something I figured out back in 1995. We have 8 queens and 8 row/columns. Thus every row and column must have a queen. I start with all the queens on the first row, test this scenario, and if it fails, move the piece in the first column down a row and repeat. When the first column reaches the end, I move it back to the top and move the piece in the second column down one. This process repeats for all pieces in the row.

The question I had for myself was how to more efficiently implement this. There are only 8 possible rows which is a nice power of two (23). I could use 3-bits for each row, concatenated one after the other for each column. There are 8 pieces, each with 8 possibilities, or 88 = 16,777,216 possible locations which requires 24-bits. So I could simply have an incrementing counter that I break up into row/column. This requires 8 break-up and place operations per increment. However, this is what we want to do. Each time we increment, we start with a blank playing field. We then place each pieces at the specified row/column and then check for threats. If there is a threat, we can stop checking and move to the next increment. If all 8 pieces are placed without a threat, we have a solution. This is what I have implemented.

//=============================================================================
// Name: queens.c
// Uses: Print all possible combinations of placing 8 queens on chess board
//       without any queen threatening any other.
// Date: 2017-06-14
// Author: Andrew Que (https://www.DrQue.net/)
//
//                                Public Domain
//=============================================================================
#include <stdio.h>
#include <stdbool.h>
#include <inttypes.h>

// Rows/columns on a chess board.
enum { ROWS    = 8 };
enum { COLUMNS = 8 };

// The board state uses 3-bits/row.  This is because there are 8 columns and 2^3 = 8.
enum { COLUMN_BITS = 3 };

// There 3-bits/row times 8 rows for a total of 2^24 possible board states.
enum { STATES = 1 << ( COLUMN_BITS * ROWS ) };

// Mask used to select bits for current column.
enum { COLUMN_BIT_MASK = ~-( 1 << COLUMN_BITS ) };

//-------------------------------------
// Base bit masks used for generating move mask.
//-------------------------------------

// Macro for setting the nth bit of a 64-bit constant.
#define BIT( n )  ( INT64_C( 1 ) << n )

// First entire first row set.
static uint64_t const ROW =
    BIT( 0 ) | BIT( 1 ) | BIT( 2 ) | BIT( 3 ) | BIT( 4 ) | BIT( 5 ) | BIT( 6 ) | BIT( 7 );

// The entire first column set.
static uint64_t const COLUMN =
    BIT( 0 ) | BIT( 8 ) | BIT( 16 ) | BIT( 24 ) | BIT( 32 ) | BIT( 40 ) | BIT( 48 ) | BIT( 56 );

// Top-left to bottom-right diagonal set.
static uint64_t const LEFT_DIAGONAL =
    BIT( 0 ) | BIT( 9 ) | BIT( 18 ) | BIT( 27 ) | BIT( 36 ) | BIT( 45 ) | BIT( 54 ) | BIT( 63 );

// Top-right to bottom-left diagonal set.
static uint64_t const RIGHT_DIAGONAL =
    BIT( 7 ) | BIT( 14 ) | BIT( 21 ) | BIT( 28 ) | BIT( 35 ) | BIT( 42 ) | BIT( 49 ) | BIT( 56 );

//-----------------------------------------------------------------------------
// Uses:
//   Set bit at row/column.
// Input:
//   row - Location's row (0-7).
//   column - Location's column (0-7).
// Returns:
//   Mask with set row/column.
//-----------------------------------------------------------------------------
static inline uint64_t set( unsigned row, unsigned column )
{
  return UINT64_C( 1 ) << ( column + ( row * COLUMNS ) );
}

//-----------------------------------------------------------------------------
// Uses:
//   Generate a mask for all the moves a queen can make from given position.
// Input:
//   row - Location's row (0-7).
//   column - Location's column (0-7).
// Returns:
//   Mask with all move locations set.
//-----------------------------------------------------------------------------
static inline uint64_t queenMask( unsigned row, unsigned column )
{
  // Diagonal mask from top-left to bottom-right.
  uint64_t leftDiagonal = LEFT_DIAGONAL;
  if ( row > column )
    leftDiagonal <<= ( row - column ) * COLUMNS;
  else
  if ( row < column )
    leftDiagonal >>= ( column - row ) * COLUMNS;

  // Diagonal mask from top-right to bottom-left.
  uint64_t rightDiagonal = RIGHT_DIAGONAL;
  unsigned oppositeColumn = COLUMNS - column - 1;

  if ( row > oppositeColumn )
    rightDiagonal <<= ( row - oppositeColumn ) * COLUMNS;
  else
  if ( row < oppositeColumn )
    rightDiagonal >>= ( oppositeColumn - row ) * COLUMNS;

  // Start with a mask of the row and column moves covered.
  uint64_t value = 0;
  value |= ROW << ( COLUMNS * row );
  value |= COLUMN << column;

  // Add in diagonals moves.
  value |= leftDiagonal;
  value |= rightDiagonal;

  return value;
}

//-----------------------------------------------------------------------------
// Uses:
//   Print an 8x8 mask representing the chess board.
// Input:
//   mask - Mask to be printed.
//-----------------------------------------------------------------------------
static void printMask( uint64_t mask )
{
  for ( unsigned row = 0; row < ROWS; row += 1 )
  {
    for ( unsigned column = 0; column < COLUMNS; column += 1 )
    {
      if ( BIT( 0 ) == ( mask & BIT( 0 ) ) )
        putchar( 'x' );
      else
        putchar( '.' );

      mask >>= 1;
    }

    putchar( '\n' );
  }

  putchar( '\n' );
}

//-----------------------------------------------------------------------------
// Uses:
//   Display all the possible combinations of placing 8 queens on a chess board.
// Output:
//   Always returns 0.
//-----------------------------------------------------------------------------
int main()
{
  // Number of functional board layouts.
  unsigned goodCount = 0;

  // For every possible board state that contains 8 queens...
  for ( uint_fast32_t state = 0; state < STATES; state += 1 )
  {
    // Initial field has no pieces on it.
    uint64_t field = 0;

    // Assume this is a valid arrangement.
    bool isGood = true;

    // Current row (0-7).
    unsigned row = 0;

    // Sub-state is a copy of state, but shifted as column bits are used.
    uint_fast32_t subState = state;

    // Loop through all rows, adding queens until all are placed or any of the
    // queens threaten another.
    while ( ( row < ROWS )
       && ( isGood ) )
    {
      // Each row must have a queen, so we just need to calculate the
      // column the queen is located from the game state.
      unsigned column = subState & COLUMN_BIT_MASK;
      subState >>= COLUMN_BITS;

      // Only check the field if there is at least one other piece on it.
      if ( field > 0 )
      {
        // Generate mask for position covered by this move.
        uint64_t mask = queenMask( row, column );

        // If the move mask intersects with the current playing field, this
        // setup is invalid.
        if ( 0 != ( field & mask ) )
          isGood = false;
      }

      // Add this move to the current field.
      field |= set( row, column );

      // Next row...
      row += 1;
    }

    // If the loop finished and the setup is still good, this is one of the
    // possible setups.
    if ( isGood )
    {
      printMask( field );
      goodCount += 1;
    }
  }

  // Display the total number of possible configurations.
  printf( "Total: %u\n", goodCount );

  return 0;
}

My old Pascal code took several seconds to find the first solution which takes around 1.3 million checks. On modern hardware this code will find all 96 solutions to the 8 queens puzzle in around 350 milliseconds, checking all 16.8 million possible combinations.

February 04, 2017

AQ Graph Release

   After a fair bit of cleanup work, my Javascript graphing library AQ Graph is ready for the world.
Graph demo   The design of AQ Graph began more than 6 months ago after an experiment at work with HTML 5 canvases.  I had started the basics for a simple line graph but ended up not needing any graphing functionality on the project I was using.  Then some months ago I started working on some FFT code where I needed a Javascript graphing library.  In the past I have used my PHP library X/Y Plot, but I wanted someone that ran client-side rather than server-side.  So I decided to look into what I was going to use for work but never pursued.  The results are a very fast X/Y graphing library that work with large data sets.  As part of a (as yet unreleased) project called WaveLab, I finished the base implementation of the graphing library.  This project required two kinds of graphs: an X/Y plot, and a bar graph.  They were quite messy so I decided that before finishing WaveLab, I would finish the graphing library.  The finishing process involved taking the project specialized library and making it general purpose. 
   I kept the high speed of the X/Y graph and worked on cleaning up the division and labeling function, making them common between the X/Y and bar graph types.  The bar graphs needed the most work.  While fairly basic in terms of options, they are fully functional.  I also made them capable of running as stand-alone and AMD libraries.

1 comment has been made.

From custom essay writing service (http://https://rospher.com/)

FL,USA

June 12, 2017 at 11:35 PM

Pleasant to visit your site page once more, it has been months for me. Well this article i have been sat tight for so long. I require this piece to finish my task in school, and it has same theme with your piece.