Saturday, September 19, 2015

Circle Fitting in Maxima

In previous posts, I've considered the problem of fitting a circle to 2D data points and made some remarks about 3D cases. Fitting problems such as this require:
  1. An equation form to fit to. 
  2. A function to quantify the error, which is to be minimized.
For a circle, the form of equation we want to fit to is
\[r^2=(x-a)^2 + (y-b)^2.\]
Maxima's lsquares package removes the burden of item number 2 for us. But for interest's sake, the error function we want to minimize can be given as
\[\mathrm{r}\left( a,b\right) =\frac{\sum_{i=1}^{n}\sqrt{{\left( {x}_{i}-a\right) }^{2}+{\left( {y}_{i}-b\right) }^{2}}}{n}\]
\[\mathrm{SSE}\left( a,b\right) =\sum_{i=1}^{n}{\left( \mathrm{r}\left( a,b\right) -\sqrt{{\left( {x}_{i}-a\right) }^{2}+{\left( {y}_{i}-b\right) }^{2}}\right) }^{2}.\]
Recalling that \(\sum_{i=1}^{n}(\bar{x}-x_i)^2 = \sum_{i=1}^{n}{x_i}^2 - (\sum_{i=1}^{n}{x_i})/n\), we see that
\[\mathrm{SSE}\left( a,b\right) =\sum_{i=1}^{n}\left({\left( {x}_{i}-a\right) }^{2}+{\left( {y}_{i}-b\right) }^{2}\right)-\frac{{\left(\sum_{i=1}^{n}\sqrt{{\left({x}_{i}-a\right) }^{2}+{\left( {y}_{i}-b\right) }^{2}}\right)}^{2}}{n}.\]


Given a set of points in matrix form (m), we need only to use the equation of a circle and indicate which are the data point variables and which are the "solve for" variables. In our problem,

lsquares_estimates(m, [x,y], (x-a)^2 + (y-b)^2 = r^2, [a,b,r]);

will suffice. The following function can be used to check the SSE.

sse(a, b, r, pts) := sum((r-sqrt((pts[i][1]-a)^2+(pts[i][2]-b)^2))^2,i,1,length(pts));

Testing for Reasonableness

It is reasonable to ask whether least squares (or other data fitting methods) will reliably produce good results with a particular quality of data for a particular application. This is true for any problem. What if the math works fine, but the likelihood of being able to "math filter" the  likely errors in the data to reach an accurate answer is not so high? In other words, how much inaccuracy in the collected data can you tolerate and still have a reasonable expectation of getting to the right answer? I'm not going to attempt to deal with probabilities (per se) or confidence intervals, but look at a simulation approach that may be able to give you an idea of the reasonableness of the circle fit based on an (intuitively) estimated amount of random error in your data.

Simulation of Circular Arc Data

We will consider the problem of data collected, using a total station, of a circular arc, with an "as-built" scenario particularly in view. As such, auto-generated sample points should simulate the expected behaviors of a rod/prism person stepping out regular sample locations from start to end of an ostensibly circular arc. This can be implemented as random normal perturbations of a circular arc.

One of the first pieces to a realistic perturbed circle is to be able to apply randomization that follows a normal distribution given an expected value and a standard deviation. Maxima has a function that behaves this way. random_normal() can be called with an expected value and standard deviation and optionally with a number of values to generate. Here is a sample use and output of the function, pictured using the histogram function:

    random_normal(50.0, 1.0, 1000),
    nclasses = 10,
    xrange = [40,60]);

Fig. 1. Normally distributed data
The standard deviation should be chosen such that roughly two-thirds of the time (≈68%), the sample value will be within one standard deviation of the mean; that is, on the interval \((\bar{x}-\sigma, \bar{x}+\sigma).\) 

Where the radius is concerned, we have the relatively predictable variations of the prism person's actions and the relatively unpredictable variations of the ostensible circular arc, which could turn out to not be an arc at all. In the absence of any sort of meaningful solution to this latter variation, we could possibly just bump up the standard deviation. But I'm not clear on whether this is a meaningful thing to do in assessing the acceptability of the data collection precision. The error should be "small" if the thing we measured conforms well to a circle. In order to check for whether this is "small" you need to produce a standard deviation from the SSE: \[\sigma_{SSE} = \sqrt{\frac{SSE}{n-1}}\] In the scenario we're dealing with here, someone would probably draw the computed arc in a CAD program and compare it to the measured data points and in an "arm-waving" sort of way decide whether it looks good or not.

Where the point sampling distances are concerned, we will assume that the rod person can step out distances (or angle sweeps) within some percentage of that intended. We will also make the pragmatic assumption that the prism person has decided a definite number of samples per given arc and will try to make them evenly spaced. The prism person will try to subdivide the remaining space at each sample point, and two thirds of the time, they will be within some percentage of the distance (or angle sweep) they intended.

Here is the perturbedArc() function with a sample usage:

Fig. 2 - This is a perturbed circular arc. Doesn't look very perturbed does it? If we bumped up the standard deviation of the radius enough, we'd get something more clearly perturbed.
To produce the matrix form of the data I use a simple function I created (see here) called MakeMatrix(). Beyond that, it is a simple as

m: MakeMatrix(pts)$
lsquares_estimates(m, [x,y], (x-a)^2 + (y-b)^2 = r^2, [a,b,r]);

Monday, September 7, 2015

Plotting a LWPOLYLINE in Maxima

In this post, I want to show how to produce a representation of LWPOLYLINEs in Maxima that is suitable for plotting to the plot2d function. The direct usefulness of such an operation is dubious. This is more in the way of a worked example to help build understanding of a few mathematical and computing tools.

The LWPOLYLINE is a 2d representation of polylines that represents each vertex as having x-value, y-value, and bulge value. The bulge value indicates the path of travel from the current point to the next one. Either it will go straight (bulge = 0), veer to the left and back as a CW arc (bulge < 0), or it will veer to the right and back as a CCW arc (bulge > 0).

So, it will
  • bulge left (< 0),
  • go straight (= 0), or
  • bulge right (> 0).
More details on the mathematics of the bulge factor can be found here and here.

The Pieces
First of all, we define our problem and break its solution up into pieces. A polyline is of such a nature that is does not make a good "\(f(x)\)" type function (plotting \(y\) versus \(x\)). However, it lends itself to parametric expression well enough. So, we will have an \(x(s)\) function and a \(y(s)\) function. Ideally, I would like to specify a length along the polyline (measured from the start) and get a point \((x(s), y(s))\). (If you have a surveying background, think \(s = \) station number.)

So, I need to have parametric equations for lines and for arcs that are parameterized for arc length and initial value of the parameter. I then need to stitch these together to make both an \(x\) and a \(y\) piecewise function.


Parametric equations for lines involve a direction and a starting point. Simply normalize the direction vector to parameterize for length. This looks like
\[(x(s), y(s)) = A + \frac{(B-A)}{||B-A||}s,\]
where \(A\) is the starting point and \(B\) is the final point. We will need to adjust the initial value to account for all of the length of the polyline prior to the initial point on this segment, say, \(s_i\) which changes our equations to
\[(x(s), y(s)) = A + \frac{(B-A)}{||B-A||}(s-s_i).\]


Parametric equations for arcs use a center, radius, initial angle, direction of rotation, and total angle subtended (some variation of feasible input parameters admitted). Parameterizing for arc length isn't as obvious as it was for lines, so let's start with a basic circle:
\[x(s) = c_x + r \cos{s}\]
\[y(s) = c_y + r \sin{s},\]
where \((c_x, c_y)\) is the center and \(r\) is the radius of the circle. For our reparameterization to work we require that radian measure be used (or else require degrees and do the conversion in the function, but why make work for ourselves and introduce inefficiency?). Recall that the definition of radians is the ratio of arc length to radius (\(\theta = s/r\)). So, our parameter becomes \(s/r\) and so we have

\[x(s) = c_x + r \cos{\left(\frac{s}{r}\right)}\]
\[y(s) = c_y + r \sin{\left(\frac{s}{r}\right)}.\]

To get the right starting point for our function we need the parameter inside the trig functions to be equal to the correct initial angle when the input value of \(s_i\) (length of polyline so up to the start of the arc) is given. We need, then, the initial angle, say \(\theta_i\), plus the angle past initial that the given \(s\) corresponds to. Thus,

\[x(s) = c_x + r \cos{\left(\theta_i + \sigma \frac{s-s_i}{r}\right)}\]
\[y(s) = c_y + r \sin{\left(\theta_i + \sigma \frac{s-s_i}{r}\right)},\]
where \(\sigma\) is 1 or -1 as per the sign of the bulge factor.


Making a piecewise function parameterized for length requires that we keep track of the length of the polyline up to the segment we are about to address. Implementing it in Maxima also requires a bit of acquaintance with the methods available to us for doing so. It is possible to use if-then-else logic to achieve this and obtain something that works fine for plotting purposes. As an example of this, consider a really simple example:

f(x) := if x < 1 then x^2 else x^3;

Fig 1. If-then-else logic in functions plots fine.
There's another, somewhat handier approach to this problem, namely, the characteristic function. The characteristic function function is a way of expressing conditions as multiplication by 1 or 0 in order to include and exclude the appropriate expression. charfun2(x,a,b) returns 1 if \(x\in [a,b)\), else 0. So, we can modify the above function to be as follows and obtain the same result. 

f(x) := charfun2(x,-inf,1)*x^2 + charfun2(x,1,inf)*x^3;

One nice thing about these expressions is that it is possible to create them separately and add them together later and get a relatively normal Maxima expression. (There are some hiccups if you try to differentiate or integrate them.) I can't comment on the performance of the characteristic function approach versus the if-then-else approach, but the characteristic function approach may be more manageable to produce.

The Program
There are a number of details that don't specifically relate to this problem that I will not go into detail about and other details I have given in the two previously mentioned posts on bulge factors. Here is the complete code for producing the parametric equations and plotting them followed by the result for a specific example polyline.