
June 17, 2012

The Kobold's Cave

Kobold's Cave

   It has been a long time since I updated any of the information on my work area, so I decided to make a page about my current setup in Madison called the Kobold's Cave.
   Pictured is my primary work area in the Kobold's Cave.  This is actually the first time I have used High Dynamic Range (HDR) imaging.  Since this work area has fairly extreme contrast, this shot was a good candidate.  The image is composed of 9 exposures, from -6 to +2 F-stops.

1 comment has been made.

From Pluvius

Platteville, WI

June 18, 2012 at 10:41 PM

Very nice shot! Did not realize HDR could be so useful.

June 15, 2012

Least-Square Regression Demo

The other day I received an e-mail with questions about my least-square regression PHP class. I haven't touched this implementation since I wrote it for an article on least-square regression in June of 2009, so it's time for a demo.

The original implementation used Cramer's Rule to solve the resulting system of equations. In May of 2011, I wrote an article about a faster method for doing this. So I decided to implement that method and then make a demo to show how the regression curve fits data.


This demo has a number of black points that can be moved around to form a polynomial curve. The thin red line in the center represents the true polynomial curve. The blue dots represent data points along the true curve with random error introduced. The scatter and concentration of the error can be controlled with the two sliders. The higher the concentration value, the closer the error will fall toward the curve. The scatter magnitude controls how much it is possible for the error to deviate from the true data. The data with the random errors is then used as input to the least-square regression function, and the output of that function is displayed in green. So the green curve should closely match the red curve.

What this simulation shows is the ability of the regression function to recover polynomial coefficients from a signal with a fairly low signal-to-noise ratio with pretty good accuracy. The function must assume the data is from a polynomial of a specific degree. The real-world applications are probably limited, but surely exist—especially with lower degree polynomials.

From experimentation, it seems that curves with higher curvature are reconstructed the best. That is, curves that change a lot do better than curves that are fairly flat.

The fit of the curve is being measured with the residual sum of squares. The lower this value, the closer the regression curve is to the actual curve, with zero being perfect. In this graph, values below 0.5 are pretty good fits, and values below 0.01 put the true curve (in red) right on top of the regression curve (in green).
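The regression class itself lives on its own page, so here is just a self-contained sketch of the underlying math in PHP (not the actual class, and the function names are mine). It builds the normal equations for a polynomial of a given degree, solves them with Gaussian elimination, and measures the fit with the residual sum of squares described above.

// Fit a polynomial of the given degree to ($x, $y) data by least-squares.
// Builds the normal equations and solves them with Gaussian elimination.
// (Sketch only--not the LeastSquareRegression class from the article.)
function polynomialFit( $x, $y, $degree )
{
  $n = $degree + 1;
  $count = count( $x );

  // Build the normal equation matrix and right-hand side.
  $matrix = array();
  $vector = array();
  for ( $row = 0; $row < $n; ++$row )
  {
    $vector[ $row ] = 0;
    for ( $col = 0; $col < $n; ++$col )
      $matrix[ $row ][ $col ] = 0;

    for ( $index = 0; $index < $count; ++$index )
    {
      $vector[ $row ] += $y[ $index ] * pow( $x[ $index ], $row );
      for ( $col = 0; $col < $n; ++$col )
        $matrix[ $row ][ $col ] += pow( $x[ $index ], $row + $col );
    }
  }

  // Forward elimination (no pivoting--fine for a demo, not for production).
  for ( $pivot = 0; $pivot < $n; ++$pivot )
    for ( $row = $pivot + 1; $row < $n; ++$row )
    {
      $factor = $matrix[ $row ][ $pivot ] / $matrix[ $pivot ][ $pivot ];
      for ( $col = $pivot; $col < $n; ++$col )
        $matrix[ $row ][ $col ] -= $factor * $matrix[ $pivot ][ $col ];
      $vector[ $row ] -= $factor * $vector[ $pivot ];
    }

  // Back substitution to recover the coefficients (lowest order first).
  $coefficients = array_fill( 0, $n, 0 );
  for ( $row = $n - 1; $row >= 0; --$row )
  {
    $sum = $vector[ $row ];
    for ( $col = $row + 1; $col < $n; ++$col )
      $sum -= $matrix[ $row ][ $col ] * $coefficients[ $col ];
    $coefficients[ $row ] = $sum / $matrix[ $row ][ $row ];
  }

  return $coefficients;
}

// Evaluate a polynomial (coefficients lowest order first) at $x.
function evaluatePolynomial( $coefficients, $x )
{
  $y = 0;
  foreach ( $coefficients as $power => $coefficient )
    $y += $coefficient * pow( $x, $power );
  return $y;
}

// Residual sum of squares between the true and fitted curves, sampled at
// the given x locations--the fit measure used in the demo above.
function residualSumOfSquares( $trueCoefficients, $fitCoefficients, $xValues )
{
  $sum = 0;
  foreach ( $xValues as $x )
    $sum += pow( evaluatePolynomial( $trueCoefficients, $x )
               - evaluatePolynomial( $fitCoefficients, $x ), 2 );
  return $sum;
}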

I updated the least-square regression PHP class page with the new version of the class, and added some documentation and examples. If one person found this class useful, maybe more people will as well.

A couple of weeks ago, I wrote about weighted random numbers. After implementation and some experimentation, I settled on a versatile function that incorporates all the features of the weighting system. Mathematically, it's a little ugly because there is an “if” statement and we end up with a piecewise function.

Where m is the minimum value, M is the maximum value, c is the center point (m ≤ c ≤ M), S is the concentration coefficient (useful range 1 ≤ S < ∞), and α and β are random numbers between 0 and 1. The core of this function is the weighting.

This has been scaled so the output is in a given range.

Here, the scaled output satisfies m ≤ w ≤ M, whereas before scaling 0 ≤ w ≤ 1. From here, the body of the function is split before and after the center point. For this we require a second random number, α. This value is used to determine if the value is to the left or right of center, and the min and max of the function are adjusted accordingly.
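Written out, and assuming the weighting w = β / ( β(S - 1) + 1 ) used in the implementation below, the piecewise function is roughly:

w(\beta, S) = \frac{\beta}{\beta (S - 1) + 1}

f(m, M, c, S, \alpha, \beta) =
\begin{cases}
c - w \, (c - m), & \alpha < \dfrac{c - m}{M - m} \\
c + w \, (M - c), & \text{otherwise}
\end{cases}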

Demo controls: Min (m), Max (M), Center (c), and Concentration (S).

The top graph shows the distribution of 1,000 samples, and the lower graph shows a histogram of the distribution. The average is calculated over all the samples. If the center value is half-way between min and max, the average should be the center value (or close to it). The center value is reflected by the highest peak in the histogram, which should always be close to the specified center.

There are some things you can do with this function that are not meaningful. Having a center value outside the min and max will still generate values, but they are probably not useful for anything.

You can also use a concentration coefficient greater than zero and less than one (0 < S < 1). This has the effect of pushing the concentration away from the center point and toward the min and max values—basically acting in reverse of the normal algorithm. This may be useful for generating a value that is usually either one value or another, with very little in between.

Here the min is 0, max is 100, center is 20, and the concentration coefficient is 0.1. Notice how the center point is the least populated area of the graph.

There are some ways to use this function to generate some of the other weighted functions. For example, let c = ½ (M – m) + m. This will make the function have equal distribution on both sides of the center point.

Here, the function C is a centered function, c is the center point, and s is the span that can be deviated from the center.

For a simple left or right weighted version of the function, simply set the center point to the min value (left weighted) or max value (right weighted).
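As a quick sketch of those special cases (the wrapper names are mine, built on the uneven() function listed further down):

// Centered: equal distribution on both sides of the center point.
// $center is the center point and $span is how far the result may deviate.
function centered( $center, $span, $concentration, $alpha, $beta )
{
  return uneven( $center - $span, $center + $span, $center,
                 $concentration, $alpha, $beta );
}

// Left weighted: concentration toward the minimum value.
function leftWeighted( $min, $max, $concentration, $alpha, $beta )
{
  return uneven( $min, $max, $min, $concentration, $alpha, $beta );
}

// Right weighted: concentration toward the maximum value.
function rightWeighted( $min, $max, $concentration, $alpha, $beta )
{
  return uneven( $min, $max, $max, $concentration, $alpha, $beta );
}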

Using a concentration coefficient of one (S = 1) results in uniform random data (assuming β is random). Small values of S are harder to notice in this demo, but become pronounced when more samples are used.

Here is an example with a center at 70, min of 0, max of 100, and a concentration coefficient of 2. At 1,000 samples it is not apparent there is any concentration, but at 100,000 samples it is easier to see. The higher sample count also makes the histogram more clear. Notice how the histogram falls to around 100 on both sides, but more rapidly to the right of center. This is necessary because of the uneven weight. So a 0 or a 100 are both equally likely (or unlikely, as the case may be), but 60 and 80, despite being equidistant from the center point, are not equally likely (60 is more likely than 80).

//----------------------------------------------------------------------------
// Return a weighted random number with an uneven distribution from center.
//   $min - Smallest possible value.
//   $max - Largest possible value.
//   $center - Location of highest concentration.
//   $concentration - How strongly to curve the number--the higher the value,
//     the stronger the curve tends toward center.
//   $alpha - Number between 0 and 1, generally random.
//   $beta - Number between 0 and 1, generally random.
//----------------------------------------------------------------------------
function uneven( $min, $max, $center, $concentration, $alpha, $beta )
{
  // Curve beta.
  $numerator   = $beta;
  $denominator = $beta * ( $concentration - 1 ) + 1;
  $result      = $numerator / $denominator;

  // Get center point.
  $centerDivide = ( $center - $min ) / ( $max - $min );

  // Figure out if this result is to the left or right of center.
  if ( $alpha < $centerDivide )
  {
    $result *= $center - $min;
    $result  = $center - $result;
  }
  else
  {
    $result *= $max - $center;
    $result += $center;
  }

  return $result;
}
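For example, a quick way to exercise the function, using PHP's mt_rand() for the two random inputs (the ranges here just mirror the examples above):

// Random numbers between 0 and 1 for alpha and beta.
$alpha = mt_rand() / mt_getrandmax();
$beta  = mt_rand() / mt_getrandmax();

// Concentrated around 70 in the range 0 to 100.
$value = uneven( 0, 100, 70, 2, $alpha, $beta );

// Inverse concentration: mostly near 0 or 100, rarely near 20.
$value = uneven( 0, 100, 20, 0.1,
                 mt_rand() / mt_getrandmax(),
                 mt_rand() / mt_getrandmax() );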

May 14, 2012

Weighted Random Number

I've written articles about weighted random numbers in the past, but today I ran into a use I've been meaning to explain for a long time.

For example, when rolling two dice, the most likely number to roll is 7. With 4 dice, it's 14. These are weighted rolls in the context of this article, as the likely outcomes are not evenly distributed, but tend toward some center point.
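For anyone who wants to verify those peaks, a small brute-force count works (a quick sketch, not from any of the articles):

// Count how many ways each total can be rolled with $dice dice of $sides sides.
function rollDistribution( $dice, $sides = 6 )
{
  $counts = array( 0 => 1 );
  for ( $die = 0; $die < $dice; ++$die )
  {
    $next = array();
    foreach ( $counts as $total => $ways )
      for ( $face = 1; $face <= $sides; ++$face )
      {
        if ( !isset( $next[ $total + $face ] ) )
          $next[ $total + $face ] = 0;
        $next[ $total + $face ] += $ways;
      }
    $counts = $next;
  }
  return $counts;
}

// Peak for two dice is 7 (6 of 36 ways); peak for four dice is 14 (146 of 1296).
$twoDice  = rollDistribution( 2 );
$fourDice = rollDistribution( 4 );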

One of the weighting algorithms I've written about in the past is Banded Inverse Root Nonuniform Scatter. This is the function:

Where α1 and α2 are random numbers between 0 and 1, and S is the “scatter coefficient”. The root of this function is the banding part.

This weights the roll toward 0. The larger the value of S, the stronger the pull toward 0. Using two of these functions together gives the function a peak centered at 0 that goes both positive and negative. Note that the last part of the function normalizes the output so it is between 0 and 1. The process will be explained in a bit, but this function will be called nb(S). So in parts, the full function is:

This function can be simplified if the square root is removed. The root makes the curve more gradual, but this isn't needed.

The trick to this function is the use of the -1, +1 in the denominator. This allows the scatter coefficient to have a defined range between negative infinity and positive infinity (i.e. -∞ < S < +∞), although the useful range is 0 ≤ S < +∞.

The normalized function looks like this:

Rebuilding the center-weighted function results in:

So g( α1, α2, S ) is our weighted function. α1 and α2 are random numbers between 0 and 1. S is the scatter coefficient, 0 ≤ S < ∞. The larger the value of S, the more weighted the output is toward 0.

The graph above shows the histogram of the distribution for various scatter values and illustrates how, as the scatter coefficient increases, the concentration toward the center increases. Note that this function does not create a bell curve (or normal distribution). Instead it has a sharp point at the center. This means that for larger values of S the likelihood of being away from the center point diminishes very rapidly—much more than it would with a function that has a normal distribution. So the function favors the center point more strongly than those producing a normal distribution.

Now for some of the function's versatility. The function is normally used to generate values over some range.

Here, M is a scale factor (magnitude) and c is an offset that allows the function to have a range such that (c − M) < v < (c + M). Now a function can be defined to return a value in a given range with some weight.

Where vmin < w( vmin, vmax, S ) < vmax. The floor function makes sure the values are integers, and can be omitted if real numbers are desired. The center point will always be halfway between vmin and vmax.

This function can be modified slightly to simulate a dice roll. Let n be the number of dice, and s be the number of sides on each die. Then vmin = n, vmax = n * s. The scatter coefficient (S) can be varied, but the distribution will not be identical to that of an actual dice roll.

Here the floor function is required. n < d( n, s, S ) < n*s.
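As a rough sketch, the uneven() function from the June 15th entry above can stand in for the center-weighted g( α1, α2, S ); it is not the same curve, but it concentrates toward the middle of the range in the same way:

// Simulated roll of $dice dice with $sides sides each, weighted toward the
// middle of the range.  Uses uneven() in place of g( alpha1, alpha2, S ),
// so the distribution only approximates the function described above.
function simulatedRoll( $dice, $sides, $scatter )
{
  $min    = $dice;
  $max    = $dice * $sides;
  $center = ( $min + $max ) / 2;

  $alpha = mt_rand() / mt_getrandmax();
  $beta  = mt_rand() / mt_getrandmax();

  return floor( uneven( $min, $max, $center, $scatter, $alpha, $beta ) );
}

// Rough equivalent of rolling five 6-sided dice with S = 3.
$roll = simulatedRoll( 5, 6, 3 );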

In this histogram, the difference in distribution can be seen between an actual dice roll (in this case, five 6-sided dice) and the simulated function d( n, s, S ) where S = 3. Note they both peak at the same location (between 17 and 18) with roughly the same likelihood for these numbers. However, the chances of rolling a 15 are greater with a true dice roll, and less with the simulation. Likewise, rolling an 8 is less likely with dice, and more likely simulated. Keep in mind that the simulated dice roll can do something an actual dice roll cannot: produce fractional results. If the floor function part of d( n, s, S ) is removed, any real number in the range can be returned. So while an exhaustive check of every dice roll is possible, checking every simulated roll is not. Thus, the graph above used one million samples to produce the simulated histogram.

There are some additional ways the function g( α1, α2, S ) can be used. If an uncentered value is desired, the random input can be fixed.

These histograms show the output of 10,000 samples of the function, where α is a random number (0 ≤ α ≤ 1). Note how in both cases, when S = 1 the distribution is uniform for all values. This is because when S = 1, the weighting function is doing nothing, and the random value α is being returned.

Boston

   Started getting SPAM to two e-mail addresses from the same group:  Dice Stars Casino.  They somehow got my last.fm e-mail address as well as my LinkedIn e-mail address.  I use a unique e-mail address for every online service so that when I get SPAM, I know the origin, and I can remove the address.  It's strange that the same group got two addresses and started using them within days of each other, but it seems even more strange that large sites like LinkedIn and last.fm both somehow gave up my address.  It's possible that somehow my e-mail address alias list was compromised, but that seems rather unlikely.  Time to keep my eyes open.

1 comment has been made.

From asdf

May 11, 2012 at 5:13 PM

What? Companies give private user data to shady information brokers? I'm shocked, simply shocked!

I was doing some work in Google Sketchup, and started experimenting with star polygons. I was drawing an 8-point star when it dawned on me that there was more than one configuration that could be used to draw such a star. After a little reading, I discovered the nomenclature on this topic. Using the Schläfli symbol, I discovered what I had normally been drawing when I made star polygons was of the form {p / ⌊p/2⌋−1}. The Schläfli symbol is of the form {p / q}, where p is the number of points in the star, and q is the number of points between connecting lines. For example, an octagon shape is {8 / 1} as it has 8 points, and each line is connected to the very next point. A pentagram is {5 / 2}, having 5 points, and each line is connected to the 2nd closest point to either side. What I had been drawing was a star with the distance between points always ⌊p/2⌋−1, or having the connecting points as far away from one another as possible for the star.

After learning this, I decided to create a little web application to demonstrate this.

 

Demo controls: points, points between connections, and line width.

The demo is done using Scalable Vector Graphics (SVG), with some Javascript used to manipulate the image. The math is quite simple. First we get the distance between vertices (points on the edge). The distance is in degrees (or radians). For example, an 8-point star has 360º / 8 = 45º between points. To draw an octagon, we simply start at 0º and draw a line to 45º, then 45º to 90º, etc. In order to turn these into coordinates, we need a distance from the center—the radius—which depends on the height and width of the image. The SVG image uses standard computer coordinates—that is, (0,0) is the upper-left part of the screen. To convert from the polar coordinates requires first knowing where the center of the view port is located. This is half the width and height of the view port. Polar coordinates typically start with 0º on the right side, but I wanted it like a clock—0º on the top. So the polar to screen conversions are as follows: x = centerX + radius * sin( angle ), y = centerY - radius * cos( angle ).

The only trick comes when drawing star figures. For example, a {6 / 2} star is actually two {3 / 1} stars (the notation is 2{3 / 1}), and not a single continuous path. For this case, we need only know how many smaller polygons the figure is made from, and draw each of them offset one point from the previous. For example, a {15 / 6} is the same as 3{5 / 2}. This means there are 3 pentagrams, each offset 24º (360º / 15 sides). So the first pentagram would be drawn with its tip at 0º, the second at 24º, etc.
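The demo's drawing code is Javascript, but the same math fits in a few lines of PHP. This sketch (the names are mine, not from the demo) turns {p / q} into SVG path data, including the compound case where the figure breaks into several smaller polygons:

// Build SVG path data for the star polygon {p / q} centered in a
// $width x $height view port.  Compound figures such as {6 / 2} are handled
// by drawing gcd( p, q ) sub-polygons, each offset one point from the last.
function starPath( $points, $skip, $width, $height )
{
  $centerX = $width  / 2;
  $centerY = $height / 2;
  $radius  = min( $centerX, $centerY );
  $step    = 2 * M_PI / $points;          // Angle between adjacent vertices.

  // Greatest common divisor tells how many separate sub-polygons exist.
  $a = $points;
  $b = $skip;
  while ( $b != 0 )
  {
    $remainder = $a % $b;
    $a = $b;
    $b = $remainder;
  }
  $figures = $a;
  $verticesPerFigure = $points / $figures;

  $path = "";
  for ( $figure = 0; $figure < $figures; ++$figure )
    for ( $vertex = 0; $vertex <= $verticesPerFigure; ++$vertex )
    {
      // 0 degrees at the top (like a clock), advancing $skip points per line.
      $angle = ( $figure + $vertex * $skip ) * $step;
      $x = $centerX + $radius * sin( $angle );
      $y = $centerY - $radius * cos( $angle );

      $path .= ( $vertex == 0 ? "M " : "L " )
             . round( $x, 2 ) . "," . round( $y, 2 ) . " ";
    }

  return $path;
}

// {15 / 6} comes out as three offset pentagrams, just as described above.
$pathData = starPath( 15, 6, 400, 400 );
echo '<svg xmlns="http://www.w3.org/2000/svg" width="400" height="400">'
   . '<path d="' . $pathData . '" fill="none" stroke="black"/></svg>';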

D.C.

Ubuntu 12.04 was released today. After it was, and I managed to get on their website, I started a torrent download of, well, all of them. If nothing else, it's a good test of our bandwidth. Our connection has been holding around 5.2 Mbit/sec, peaking out around 5.36 Mbit/sec. I don't even know what speeds our ISP says we should have, but it's nice to give them a workout from time to time.

Why do I need all the flavors of Ubuntu? Well, my main computer is a 64-bit machine, but I have several virtual machines set up, and several other computers that run Ubuntu as their primary OS. So having each of the types (desktop, server, and alternate, in both i386 and AMD64) will save me a step in the future. Otherwise I always find I need the version I don't have downloaded.

April 23, 2012

Booting Firefox full screen on system start

Lake Champlain, Vermont

The first question someone might ask is why one would want to have a Linux system that starts Firefox full-screen on boot. A web browser makes a good cross-platform human-machine interface. While there are still issues with web pages looking different in different browsers, these issues have become fewer as browser developers comply with standards. Javascript with AJAX and server-side scripting allow the developer to create rich applications. So designing a user-interface with a web browser is sometimes a great option. One example would be a controller for home automation. After some back-end scripting to handle the calls made to the server to change settings, making an interface is as simple as making a web page. A low-cost touch screen might serve this setup from multiple locations in a house. I've implemented an interface to a large engine controller using Firefox. So there are reasons one might want such a setup.

I want to share something I learned and had wanted to explore for a long time. What is the minimum required to have a Linux box boot into Firefox? This was a question I first asked sometime in 2005, when I chopped down a Debian install of Linux to make it small enough to fit on a compact flash card used in a single-board computer. I recall I did get the system to fit on a 512 MB compact flash, and the boot was optimized so that power-on to fully running took 30 seconds. The next part of the project would have been to get Firefox to start full screen on this setup, but that unfortunately never happened as the project was ended before we reached this phase.

I have much more experience with Linux as a text-based server than I do with it as a desktop. DrQue.net has been running on a Linux setup since 2003, but it wasn't until around 2008 that I was regularly using Ubuntu on my laptop. How X Windows (X11), the window manager that sits atop it, and the applications that run in the GUI fit together is still rather fuzzy to me. However, I knew that it all starts with X11—so I had to have that. But the window manager, user log-in screen, and all the other things I'm used to seeing on an Ubuntu desktop setup I was less clear about. Turns out for Firefox, you only need X11 and Firefox. No window manager, no log-in handler, nor anything else.

So after installing just the bare-bones system, you only need to fetch two additional packages. I added a third, unclutter, so I could turn the mouse cursor off after some timeout period (I'll address that later).

apt-get install xorg firefox unclutter --no-install-recommends

To start Firefox from the command prompt, first start X11, and then start Firefox:

startx & firefox --display=:0.0

Firefox needs to be told where the display is, but that's the only caveat. To get this to happen on system start up requires a couple of additions. First, add a line to /etc/rc.local

su <user name> -c startx

This will start X11 under some user name. By default, most users are not allowed to start X11. This can be changed by editing /etc/X11/Xwrapper.config and modifying:

allowed_users=anybody

This will let anyone start X11. There is a security risk here, and everyone recommends setting this value back, but my system is a single-user system with no keyboard/mouse. So I'm not going to be terribly concerned about it.

Now that X11 is set up to start when the system does, it's time to add Firefox to the mix. For this, create ~/.xinitrc with the line

firefox <url>

This will cause Firefox to start with X11 for the specified user, and go to some URL. My setup goes to a local web page so that Firefox begins to view some AJAX web page.

In order to get the setup to run full-screen, you will need a plug-in for Firefox called “autohide”. I e-mailed the developer of this plug-in to thank him for his work, and he stated that he no longer does anything with this project. So while it works now, it isn't going to be maintained. The developer also states it was only tested on a Windows-based machine, but it does work fine under Linux. What this plug-in adds is the command line option “-fullscreen”, which will cause Firefox to start full screen.

In my setup, I found that X11 places a resize triangle in the bottom right corner, even when Firefox is full-screen. Since I want nothing on the screen but the content of my web page, I was not pleased with this artifact. While I didn't find a way to remove it, I did find a workaround. On the command line, you can specify the height and width of the window. Making the height 24 pixels longer than the screen resolution places the triangle out of view.

The last item I had to change was the mouse cursor. My system doesn't normally have a mouse connected, so having the cursor on the screen is rather moot. The package “unclutter” can take care of this. One of the parameters is a delay for how long of a pause to allow before turning off the mouse cursor. Setting this delay time to 1 second makes the mouse go away quickly on start. So the ~/.xinitrc becomes something like this:

unclutter -idle 1 &
firefox -height 1048 -width 1280 -fullscreen 127.0.0.1

Where 1280x1024 is the screen resolution for my system (the height of 1048 is the extra 24 pixels used to push the resize triangle out of view).

The last item, and the one I found really frustrating to get functional, is disabling the screen saver. By default, after 10 minutes of no keyboard/mouse activity, X11 blanks the screen. Since there is no keyboard or mouse on my setup, this is an issue. I found no better way to stop the screen saver than by doing this in the X11 configuration. I created the file /etc/X11/xorg.conf and added the following lines:

Section "ServerLayout"
         Identifier "Default Layout"
         Option "BlankTime"   "0"
         Option "StandbyTime" "0"
         Option "SuspendTime" "0"
         Option "OffTime"     "0"
EndSection

Of all the methods to disable screen blanking I tried, this was the only one that worked. Other methods involved using the command xset, but they didn't work for me.

That's it, and I hope someone finds this useful.

April 22, 2012

Javascript web application with elected master

My project recently had a unique requirement. The setup is like this: one configuration script, and one or more views. The configuration script is used to adjust settings that multiple views receive, and this works through an AJAX setup. This is accomplished by a very simple server-side script that simply writes all the parameters it was passed into a file. The views all regularly load this file and adjust their content accordingly. So far, nothing too complicated—almost a text-book AJAX application.

Where it gets complicated is in the fact there are several timers, and the timer counts are javascript controlled. Feedback to the control script was needed so the timers could be monitored. That's easy until you have more than one view. Each view has its own set of timers, and they would all be fighting to write their data back to the server. This creates a mess of race conditions. What is needed is one dedicated time master view. This time master will be the only view to return its clock data to the server. All the other views will be passive observers, basing their clocks on this master clock.

So how is it decided which view is the time master? We could force the user to pick a master, but it would be better if this were automatically negotiated.

The first thing to think about is the configuration script. It needs to see that time data is being regularly updated from someone—anyone. The time data may not always be changing—clocks could be stopped. So each time the master view writes data back to the server, it also writes a unique time stamp. This time stamp is simply a random number. The configuration script is periodically reloading the time data, and so as long as this time stamp is changing, the configuration script knows the data is being modified by some remote view. If the time stamp is unchanging, either there is no time master, or there are no open views. Since at least one view is required to get time feedback, the script will assume that there is no time master assigned.

When it is determined there is no time master, the configuration script will signal that a time master is needed. This is accomplished by using the time master ID. This value will hold some identifying value of the current time master. When a new time master is needed, the value will be set to some token value (I used -1 for this). The view scripts will see the master ID as part of their periodically updated data, and when the value is seen to be the master-needed token, they will generate a master ID. This is just some unique value (again, a random number). Once generated, they will write their clock data back to the server, along with their master ID. Now there is a race condition, but one in which we don't care who wins. The server will prevent two scripts from writing data to a file at the same time. So at some point, the configuration script will load the clock data, and the last view to have written the clock data will have its data on the server.

The configuration script will see the change in time stamp, and a time master ID to be used. It will latch this ID and write its own data back to the server. As a time master is no longer needed, subsequent changes to the time master ID from other views will be ignored. Each of the views will then read the configuration data and see this new time master ID. Only one view has this ID, and it will continue to push its time data to the server. The remaining views will see that the time master ID does not match the one they generated, see the ID is no longer the master request token, and revert to being observers.

In this way, the views elect from among themselves which will be the time master. Should the master view be closed, updates to the time data will stop being made. The configuration script will see these changes are no longer taking place, and request a new master from the remaining pool of views.
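A rough PHP sketch of the configuration side's periodic check is below. The field names, token value, and function name are hypothetical, not the project's actual code:

// Token value used in the time master ID field to request a new master.
define( 'MASTER_NEEDED', -1 );

// Called each time the configuration script reloads the clock data file.
// $clockData holds the fields last written by a view, including 'stamp'
// (the random time stamp) and 'masterId' (the ID of the view that wrote it).
// $lastStamp and $currentMasterId persist between calls.
function checkTimeMaster( $clockData, &$lastStamp, &$currentMasterId )
{
  if ( $clockData[ 'stamp' ] != $lastStamp )
  {
    // Someone is writing clock data.  If a new master was requested, latch
    // the ID of whichever view won the race; later candidates are ignored.
    $lastStamp = $clockData[ 'stamp' ];
    if ( MASTER_NEEDED == $currentMasterId )
      $currentMasterId = $clockData[ 'masterId' ];
  }
  else
  {
    // Time stamp has not changed--either no master or no open views.
    // Request a new time master.  (A real implementation would likely
    // tolerate a few unchanged polls before giving up on the old master.)
    $currentMasterId = MASTER_NEEDED;
  }

  // The caller writes $currentMasterId back to the configuration data so
  // every view can see who the time master is.
  return $currentMasterId;
}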

One more mode of operation was added: a self-elected time master. This is set up for times when there is a “main view” that is always desired to be the time master. For this, another time master ID token is used (I used 1 for this value). When a self-elected master view comes online, it always writes its clock data. When the configuration script sees this self-elected token ID, it will switch the time master ID to this token value. The acting time master (if there was one) will yield and become an observer automatically, just as if its request for being time master was ignored.

This setup has some shortcomings. Two self-elected time masters will fight with one another and create the original problem. The loss of the time master will cause the clocks to lag, since the clock values are not updated until a new time master is elected. And