Dyno Plot smoothing for the mathmatically inclined.

Jul 6, 2005 | 12:24 PM

donour

Thread Starter

Evolved Member

iTrader: (6)

Joined: May 2004

Posts: 2,502

Likes: 1

From: Tennessee, USA

Dyno Plot smoothing for the mathematically inclined.

When building datalogging and engine management software, I came across the odd problem of how to properly smooth/interpolate/visually enhance dyno plots. Most of the software that does this kind of stuff is closed source, and its authors are not particularly interested in sharing their algorithms.

Being a trained mathematician and computer programmer myself, I decided to just come up with my own solution. I'm no expert in numerical analysis, but I think my findings are pretty interesting by themselves. For a while, I thought I had a pretty good solution using what I affectionately call my "box" method. Recently I've come across a much better smoothing filter that originates in a 1964 Analytical Chemistry paper by Savitzky and Golay.

Brief background: I first became aware of smoothing issues when I took a look at the dyno generation spreadsheet that dustin@vishnu created last spring. I noticed that he was smoothing the plots by keeping a running average (called a moving window in math literature). This has the undesirable effect of pushing the peaks left or right, depending on which side of the target point you place your window. In other words, you might have a power peak at 6500 rpm, but after smoothing it could appear at 6700 -- that's bad juju.
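To make the peak shift concrete, here is a minimal sketch of a one-sided (trailing) running average. The function name, the window size and the made-up parabolic "power curve" are assumptions for illustration only; this is not the spreadsheet's actual formula.

```python
def trailing_average(ys, window):
    """Average each point with the (window - 1) points before it."""
    out = []
    for i in range(len(ys)):
        start = max(0, i - window + 1)   # window is cut short at the start
        chunk = ys[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical symmetric peak centred on sample 10.
ys = [100 - (i - 10) ** 2 for i in range(21)]
smoothed = trailing_average(ys, window=5)

raw_peak = ys.index(max(ys))                  # sample 10
smooth_peak = smoothed.index(max(smoothed))   # sample 12
print(raw_peak, smooth_peak)
```

With a 5-point window placed entirely behind the target point, the peak lands two samples late, which is half the window width.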

The "box" method I came up with works as follows. You pick four points in your plot, then build the smallest box that contains them all. Then pick the center of the box as a new point in your plot. Iterate... This works fairly well, but can move your peaks up or down -- worse juju.
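One possible reading of the "box" method, as a rough sketch. It assumes the four points are consecutive and that the box slides along one sample at a time; the post does not say either way, so treat `box_smooth` and its `size` parameter as hypothetical.

```python
def box_smooth(points, size=4):
    """Replace each run of `size` consecutive (x, y) points with the
    centre of the smallest axis-aligned box that contains them."""
    out = []
    for i in range(len(points) - size + 1):
        group = points[i:i + size]
        xs = [p[0] for p in group]
        ys = [p[1] for p in group]
        out.append(((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2))
    return out

# A flat line with a single spike of height 10.
points = [(0, 0), (1, 10), (2, 0), (3, 0), (4, 0)]
print(box_smooth(points))
```

Because the box centre depends only on the extreme values, one outlier drags the output halfway toward itself (here every output y is 5.0), which fits the complaint about peaks moving up or down.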

Thirdly there is a method related to the running average where you place the window on _both_ sides of your target point. I call it Nearest Neighbor Averaging -- NNA.
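A minimal sketch of a centered average along these lines. The name `nna`, the `half_width` parameter and the choice to shrink the window at the ends (rather than pad with zeros) are assumptions, not necessarily what the Xede code does.

```python
def nna(ys, half_width):
    """Average each point with `half_width` neighbours on each side.
    Near the ends the window is shortened instead of padded."""
    n = len(ys)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        chunk = ys[lo:hi]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical symmetric peak centred on sample 10, height 100.
ys = [100 - (i - 10) ** 2 for i in range(21)]
smoothed = nna(ys, half_width=2)

print(smoothed.index(max(smoothed)), max(smoothed))
```

On this symmetric test curve the peak stays at sample 10, but its height drops from 100 to 98: a centered window fixes the sideways shift, not the flattening.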

Fourth is the Savitzky-Golay algorithm. I believe it's far superior. It correctly identifies peaks and dips, but filters out the noise. Its inner workings can be a bit complicated, but I can describe them if people are interested. It's also very fast.

Anyway, I've built all 4 of these and added them to my Xede software. As a first example, let's see how they do at approximating a noisy line.

Here is the NNA method:

Here's the "box" method:

Here's the Running Average method:

And finally here's Savitzky-Golay:

Although the endpoints are messy, notice how much better the curve features can be identified.

d

EDIT: doh, I mispelled mathematically.

Last edited by donour; Jul 6, 2005 at 12:58 PM. Reason: grammar

Jul 6, 2005 | 01:25 PM

Dustin@Vishnu

Former Sponsor

iTrader: (1)

Joined: Jun 2004

Posts: 1,114

Likes: 0

From: Livermore, Ca

Quote:

EDIT: doh, I mispelled mathematically.

GOD! Seriously, why are you so dumb Donour. tehehehe.

Keep up the good work man, IM me on AIM when you get a chance, I have an idea for a few of the things we were talking about.

BTW, we call it the SMART xede for a reason.

Jul 6, 2005 | 01:27 PM

MalibuJack

EvoM Guru

iTrader: (5)

Joined: Feb 2003

Posts: 10,572

Likes: 14

From: Royse City, TX

LOL Is that algorithm you found the "scatter averaging" method I had been researching?

FWIW, My spelling and grammar are pretty atrocious. Yet for some reason people seem to think I have a clue.. LOL

Last edited by MalibuJack; Jul 6, 2005 at 01:29 PM.

Jul 6, 2005 | 01:35 PM

MalibuJack

I also think that last algorithm is the same one used in graphics imaging for enhancing images and reducing abhorrent noise artifacts..

If you can e-mail me some info on that algorithm, I actually am writing a filter for photoshop that is supposed to do the same thing..

Last edited by MalibuJack; Jul 6, 2005 at 01:37 PM.

Jul 6, 2005 | 01:36 PM

donour

Quote:

Originally Posted by MalibuJack

LOL Is that algorithm you found the "scatter averaging" method I had been researching?

Doubtful, the Savitzky-Golay filter is pretty insane. I was able to gank a copy from my old Numerical Recipes book. All said and done, my smoothing library has 2000 lines of C just for that one method alone.

Numerical Recipes gives the method various names: Savitzky-Golay, least-squares interpolation, and (EDIT:) DISPO.

Using a running average, bias is introduced if the second derivative of your curve is nonzero. From Numerical Recipes, "The idea of Savitzky-Golay filtering is to find filter coefficients c_n that preserve higher moments." In other words, it does a moving window approximation, but instead of fitting a constant (averaging) within each window it fits a higher-order polynomial (quartics I believe).
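For anyone curious where the coefficients c_n come from, here is a small sketch. It fits a polynomial by least squares over the window offsets and reads off the weights that give the fitted value at the window center. It is not the Numerical Recipes routine: it solves the normal equations with a plain Gauss-Jordan elimination on exact fractions, and the names `solve` and `savgol_coeffs` are made up for this example.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def savgol_coeffs(half_width, order):
    """Weights that evaluate a least-squares polynomial of degree
    `order` at the centre of a (2 * half_width + 1)-point window."""
    offsets = range(-half_width, half_width + 1)
    # Normal equations A^T A, where A[i][j] = offset_i ** j.
    ata = [[Fraction(sum(t ** (j + k) for t in offsets))
            for k in range(order + 1)] for j in range(order + 1)]
    e0 = [Fraction(1)] + [Fraction(0)] * order
    x = solve(ata, e0)
    return [sum(x[j] * t ** j for j in range(order + 1)) for t in offsets]

# 5-point window, quadratic fit: the classic -3, 12, 17, 12, -3 over 35.
print(savgol_coeffs(2, 2))
```

The weights sum to 1 and, unlike a plain average, some of them are negative, which is what lets the filter keep the height of a curved peak.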

EDIT: Oops, forgot some things. SavGol (as I've taken to calling it) requires that the data points be evenly spaced. In addition, the Numerical Recipes code requires the data size to be a power of 2. So, I have to do some preprocessing to get the data into something that the filter can handle. It's a real pain.
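The even-spacing step could look something like the sketch below, which linearly interpolates the logged samples onto an evenly spaced grid. `resample_even` and the sample RPM values are hypothetical; the post does not say how the preprocessing is actually done. Picking `count` as a power of 2 would cover the second requirement as well.

```python
def resample_even(xs, ys, count):
    """Linearly interpolate (xs, ys) onto `count` evenly spaced x values.
    Assumes xs is strictly increasing and count >= 2."""
    step = (xs[-1] - xs[0]) / (count - 1)
    out_x, out_y = [], []
    j = 0
    for k in range(count):
        x = xs[0] + k * step
        # Advance to the input segment that contains x.
        while j < len(xs) - 2 and xs[j + 1] < x:
            j += 1
        t = (x - xs[j]) / (xs[j + 1] - xs[j])
        out_x.append(x)
        out_y.append(ys[j] + t * (ys[j + 1] - ys[j]))
    return out_x, out_y

# Unevenly spaced RPM samples (made-up numbers).
rpm = [3000, 3400, 4000, 5000]
hp = [100, 140, 200, 300]
even_rpm, even_hp = resample_even(rpm, hp, count=5)
print(even_rpm)
print(even_hp)
```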

The nice thing though is that the SavGol method only calculates the coefficients. It doesn't apply the filter. This means you can generate a filter and apply it to a bunch of plots very quickly. Note almost every dyno run will be 1024 samples since that is the closest power of 2. 512 samples is too few and it would take over twenty seconds to collect 2048 samples!
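Here is a sketch of the "generate once, apply to many plots" idea. It hardcodes the standard 5-point quadratic Savitzky-Golay weights and applies them by direct convolution rather than the FFT-based convolution in the Numerical Recipes code. As far as I can tell the power-of-2 restriction comes from the FFT step, so a direct loop like this one should not need it. `apply_filter` and the two test runs are made up for illustration.

```python
# Standard 5-point quadratic Savitzky-Golay weights, scaled by 35.
SG5 = [-3, 12, 17, 12, -3]
NORM = 35

def apply_filter(ys, weights, norm):
    """Direct convolution; the end points are left untouched."""
    half = len(weights) // 2
    out = list(ys)
    for i in range(half, len(ys) - half):
        window = ys[i - half:i + half + 1]
        out[i] = sum(w * y for w, y in zip(weights, window)) / norm
    return out

# The same coefficients are reused for two different hypothetical runs.
run_a = [100 - (i - 10) ** 2 for i in range(21)]   # curved peak
run_b = [50 + 3 * i for i in range(21)]            # straight line
smooth_a = apply_filter(run_a, SG5, NORM)
smooth_b = apply_filter(run_b, SG5, NORM)

print(max(smooth_a), smooth_a.index(max(smooth_a)))
```

Both noise-free curves come back unchanged, including the full 100 peak at sample 10, because a quadratic fit reproduces lines and parabolas exactly.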

d

Last edited by donour; Jul 6, 2005 at 01:42 PM.

Jul 6, 2005 | 01:42 PM

MalibuJack

Okay EEK! 2000 lines of code! I think I get what you're describing for that one.. you're discarding values which are way out of the range, but still allowing the weighting to correctly bias the curve without artificially smoothing it. But what you're describing might work really well in digital imaging.

Jul 6, 2005 | 01:48 PM

donour

Quote:

Originally Posted by MalibuJack

Okay EEK! 2000 lines of code!

Most of it is support code. The actual coefficient generation fits on a single page.

Let's put it this way. Numerical Recipes required:

1) their own vector management system
2) LU decomposition
3) FFTs
4) custom convolution function

Quote:

I think I get what you're describing for that one.. you're discarding values which are way out of the range, but still allowing the weighting to correctly bias the curve without artificially smoothing it.

Well sort of. I'll dig up the Numerical Recipes link and post it.

Voila:

http://www.ma.utexas.edu/documentati...fpdf/f14-8.pdf

Quote:

But what you're describing might work really well in digital imaging.

Other people seem to think so too, but you have to generate 2D coefficients:

http://research.microsoft.com/users/...Gol/SavGol.htm

Jul 7, 2005 | 08:39 AM

donour

Another example

My first example was only a noisy line (y=mx+b). While savgol worked best, all of the methods provided fairly good approximations. Now, let's try it with something that has nonzero second derivatives -- sinusoidal curves.

The following plots have three data sets. The first is the raw data. The second is the original curve used to generate the data. The third is the curve fit by the various methods.

First is NNA:

Next is the "box" method:

Thirdly we have the running average:

And finally Savitzky-Golay:

The first time I tested this code, I was amazed at how much better SavGol preserves the peak values and locations. Who's your daddy?

Jul 7, 2005 | 08:57 AM

MalibuJack

Thanks for the link, reading the PDF helped a lot. It also very clearly shows how its bias towards smoothing (losing data) on short-duration, high peaks can be used very effectively as an imaging noise filter. This will help my work a lot.. though it may require a bit of thought on how to preserve intentional contrast.

Jul 7, 2005 | 09:07 AM

#10

donour

Quote:

Originally Posted by MalibuJack

I'm glad you found it useful. At lunch, I'll try to generate an example using actual dyno data (probably the same example I've used for every other plot in the past two months).

I _still_ haven't had a chance to look at those UTEC logs that you sent me. It's down in the todo list somewhere. I'm afraid it's going to get pushed off again, as I'm going to have new car parts and a SMART package to play with in a week or two.

d

Jul 7, 2005 | 10:41 AM

#11

barney malone

Newbie

Joined: May 2005

Posts: 39

Likes: 2

From: california

Dude!
For the NNA, it looks like at the endpoints you are averaging in a bunch of fictitious 0.0's and that you also have some indexing problem that shifts everything off to the side.
See the attached for what a simple +/-14 (i.e. 29 point) NNA does.

Also, it looks like your Savitzky-Golay algo is somehow assuming that the endpoint values are 0.0, so it would do a bunch better by:
1) doing a simple pre-processing step that subtracts off a simple linear fit based only on some reasonable value for the endpoints (could even be just the 1st and last data points)
2) doing your Savitzky-Golay
3) adding back the same linear fit you subtracted off in step 1
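The three steps above can be sketched as follows. To keep the example short, the smoother is a deliberately naive zero-padded average standing in for any filter that treats missing end values as 0.0; `zero_padded_average` and `detrend_filter_retrend` are hypothetical names, and the straight-line test data is made up.

```python
def zero_padded_average(ys, half_width=2):
    """Centred average that pretends values beyond the ends are 0.0."""
    padded = [0.0] * half_width + list(ys) + [0.0] * half_width
    width = 2 * half_width + 1
    return [sum(padded[i:i + width]) / width for i in range(len(ys))]

def detrend_filter_retrend(ys, smooth):
    n = len(ys)
    # 1) subtract the straight line through the first and last points
    slope = (ys[-1] - ys[0]) / (n - 1)
    line = [ys[0] + slope * i for i in range(n)]
    residual = [y - l for y, l in zip(ys, line)]
    # 2) smooth the residual, whose end points are now zero
    filtered = smooth(residual)
    # 3) add the same line back
    return [f + l for f, l in zip(filtered, line)]

ys = [200 + 5 * i for i in range(11)]   # noise-free ramp, 200 to 250
plain = zero_padded_average(ys)
fixed = detrend_filter_retrend(ys, zero_padded_average)
print(plain[0], fixed[0])
```

Without the detrend step the first point is dragged from 200 down to 123 by the imaginary zeros; with it, the ramp comes back unchanged.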

Attached Thumbnails

Dyno Plot smoothing for the mathmatically inclined.-noisefilt.gif

Jul 7, 2005 | 11:18 AM

#12

donour

Quote:

Originally Posted by barney malone

One of my points is that other methods, that I've tried, shift the peaks around. My window size for that NNA is only 4 points. The reason you can find a "shift" which puts the interpolation back on the curve is because you know where the original location is. If you _know_ the underlying function, you can do a least squares fit or something and get a much better approximation. I have been unable to properly formulate the power/torque curve of a modern automobile -- at least in any kind of closed form.

Quote:

Also, it looks like your Savitzky-Golay algo is somehow assuming that the endpoint values are 0.0, so it would do a bunch better by:
1) doing a simple pre-processing step that subtracts off a simple linear fit based only on some reasonable value for the endpoints (could even be just the 1st and last data points)
2) doing your Savitzky-Golay
3) adding back the same linear fit you subtracted off in step 1

No, I'm just using a really big window so the approximation is terrible at the ends. There are several parameters to the savgol function and I've not properly selected which ones I want to use.

d

Jul 7, 2005 | 12:15 PM

#13

barney malone

Quote:

donour: One of my points is that other methods, that I've tried, shift the peaks around. My window size for that NNA is only 4 points. The reason you can find a "shift" which puts the interpolation back on the curve is because you know where the original location is. If you _know_ the underlying function, you can do a least squares fit or something and get a much better approximation. I have been unable to properly formulate the power/torque curve of a modern automobile -- at least in any kind of closed form.

Huh?
It is totally obvious from inspection that neither of your "NNA" plots is using 4 X_consecutive points or even +/-4 X_points. I would guess you are using more like +/-50 (or 101 total) points around each point and are shifting your results by that same 50 points.
Did you look at my plot? I did not use any a priori info. I just did a simple bonehead average of the 29 Y_values (14 on either side and from the "X_target" point itself) around the noisy data shown to get my "NNA" plot.
Maybe state again what the heck you are doing, because it does not seem to be

Quote:

donour: Thirdly there is a method related to the running average where you place the window on _both_ sides of your target point. I call it Nearest Neighbor Averaging -- NNA.

Jul 7, 2005 | 12:29 PM

#14

donour

Quote:

Originally Posted by barney malone

Actually the NNA plots _are_ using a 4 point window. The examples I've posted have the filters applied multiple times in order to exaggerate the effects. Each time you apply the smoothing, the plot shifts a little, but it's not a shift of 4, 2, or anything that I've found to be meaningful.
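A small sketch of why repeated passes of a short window can look like one wide, shifted window. A 4-point window cannot be centered on a sample, so this example assumes two points behind and one ahead of the target; that placement, the pass count and the function names are all assumptions, not the actual Xede settings.

```python
def window_average(ys, behind, ahead):
    """Average each point with `behind` earlier and `ahead` later points."""
    n = len(ys)
    out = []
    for i in range(n):
        chunk = ys[max(0, i - behind):min(n, i + ahead + 1)]
        out.append(sum(chunk) / len(chunk))
    return out

def repeat(ys, passes, behind=2, ahead=1):
    for _ in range(passes):
        ys = window_average(ys, behind, ahead)
    return ys

def centroid(ys):
    """Weighted mean position of the samples."""
    return sum(i * y for i, y in enumerate(ys)) / sum(ys)

# A single spike at sample 20 shows what the filter does to one point.
impulse = [0.0] * 41
impulse[20] = 1.0
once = repeat(impulse, 1)
many = repeat(impulse, 8)

print(centroid(once), centroid(many))
print(sum(1 for y in many if y > 0))
```

Under these assumptions each pass moves the spike's center by half a sample and widens it by three samples, so eight passes act like a 25-point window shifted by four samples. Since half a sample is not a whole number, that may be why the per-pass shift does not look like anything meaningful.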

Also, the examples I've posted so far have fairly mild noise. The curve shape is still very obvious. Try it with raw RPM sampled (EDIT: ) Dyno data.

d

ps - I appreciate your directed criticism though. I definitely don't think I have any perfect solutions.

Last edited by donour; Jul 7, 2005 at 12:53 PM.

Jul 7, 2005 | 01:04 PM

#15

donour

Here's a plot of raw dyno samples, which I've found to be quite a bit more difficult to fit correctly.

Here's a comparison of how NNA and savgol do. The first run (1) is NNA, the second (3) is savgol:

That was Dustin's car, which should have a very stock evo-like powerband. What I don't like about NNA is how resistant it is to sudden peaks or dips.

Perhaps there is a better window size to pick for this particular example. Still, a moving window average will have the tendency to push your points "up" in the direction that you iterate. Here, you can see that the hp level appears artificially high. It even appears to keep going after power is pulled (a little before 7k rpm if memory serves).

d

Last edited by donour; Jul 7, 2005 at 01:06 PM. Reason: grammar
