DMA logging now working - logging 25x faster than MUT

Old Dec 21, 2007 | 09:37 AM
  #1  
jcsbanks's Avatar
Thread Starter
Evolved Member
 
Joined: May 2006
Posts: 2,399
Likes: 6
From: UK
Running DMA at 62500 baud, I have 64 items (MUT requests 0x00-0x3F) being logged 1000 times in approx 15 seconds with the engine running smoothly and no errors. It reverts to MUT after the items have been zapped down the port. This is over 4200 items per second compared with a typical 150-180 on standard MUT. At 4000 RPM you can log all these items once every engine revolution. With MUT you could do the same only every 25 revolutions.
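
As a quick sanity check of the figures quoted above, the arithmetic can be sketched in a few lines. Only the numbers come from the post; the variable names are made up for illustration.

```python
# Figures quoted in the post; variable names are illustrative only.
items_per_pass = 64          # MUT requests 0x00-0x3F
passes = 1000
elapsed_s = 15.0             # "approx 15 seconds"

items_per_s = items_per_pass * passes / elapsed_s   # DMA throughput
passes_per_s = items_per_s / items_per_pass         # full 64-item sets per second
revs_per_s = 4000 / 60.0                            # engine revolutions per second at 4000 RPM

# Standard MUT is quoted at 150-180 items per second.
speedup_low = items_per_s / 180
speedup_high = items_per_s / 150

print(f"{items_per_s:.0f} items/s, {passes_per_s:.1f} sets/s, {revs_per_s:.1f} rev/s")
print(f"{speedup_low:.1f}x to {speedup_high:.1f}x faster than standard MUT")
```

The sets-per-second figure matches the revolutions per second at 4000 RPM, and the speedup range brackets the "25x" in the thread title.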

If we want to run a standard 15625 MUT baud rate, it still runs 8 times faster than standard MUT.

I need to rewrite my logger to get this logging nicely, and I also want to add a facility so that the PC can write blocks of data to the ECU's RAM for fast realtime mapping.

As well as fast logging, this development should make realtime mapping changes virtually instant (e.g. 0.1 sec for the ignition map) when you hit the button in the logger to transfer the updated map to the car whilst the engine is running.

Often I've found when writing loggers that the PC can't keep up with the ECU! In this case it will be less taxing on the PC as well since it is received in a buffer. The logging might slow down when we ask the PC to convert the raw data to real units and write it to a file in realtime, we'll see. It might be that we need a raw binary dump to file and interpret it once logging is finished.
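
The "raw binary dump, interpret later" idea could look roughly like the sketch below. It assumes each logged set is a fixed 64-byte frame (one byte per MUT request 0x00-0x3F); the function names and the file path argument are hypothetical.

```python
FRAME_SIZE = 64  # assumption: one byte for each of the 64 logged items


def dump_raw(chunks, path):
    """During logging: append whatever arrives from the port, with no conversion."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)


def read_frames(path, frame_size=FRAME_SIZE):
    """After logging: yield complete frames; a trailing partial frame is dropped."""
    with open(path, "rb") as f:
        while True:
            frame = f.read(frame_size)
            if len(frame) < frame_size:
                break
            yield frame
```

Chunks do not need to line up with frame boundaries while logging, because the frames are only cut out when the file is read back.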

Last edited by jcsbanks; Dec 21, 2007 at 09:40 AM.
Old Dec 21, 2007 | 10:05 AM
  #2  
donour's Avatar
Evolved Member
iTrader: (6)
 
Joined: May 2004
Posts: 2,502
Likes: 1
From: Tennessee, USA
Originally Posted by jcsbanks
Often I've found when writing loggers that the PC can't keep up with the ECU! In this case it will be less taxing on the PC as well since it is received in a buffer.
Seriously? Are you writing them in C?

Even the most modest laptop I've tried is 90+% idle with my MUT code. I haven't even looked at optimizing it, but I'm sure I could make it faster if I hand-tuned the I/O loop(s).

d

EDIT: On the topic of unit conversions, it is almost certainly a bad idea to do eval() statements on text strings in a tight loop. Especially if what you are doing is arithmetic. The RightWay(tm) is to compile the operation to machine code beforehand then just apply the generated function. Alternatively, you can do what I did in libmut. Just realize that all of your transforms are basically the same (Ax^3 + Bx^2 + Cx + D) and just save A, B, C, D for each logged parameter. Then just do the aforementioned arithmetic. I expect a good C implementation can do it in 50 clock cycles and easily process tens of millions of requests/sec.
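
A minimal sketch of the stored-coefficient approach described above, written in Python rather than C for brevity. This is not libmut's actual code; the class name, parameter names and scalings are hypothetical and only illustrate the idea of replacing eval() with four saved numbers per parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transform:
    """Coefficients of A*x^3 + B*x^2 + C*x + D for one logged parameter."""
    a: float = 0.0
    b: float = 0.0
    c: float = 1.0
    d: float = 0.0

    def apply(self, x):
        # Horner form: three multiplies and three adds, no string parsing.
        return ((self.a * x + self.b) * x + self.c) * x + self.d


# Hypothetical scalings, for illustration only.
TRANSFORMS = {
    "rpm": Transform(c=31.25),
    "coolant_temp": Transform(c=1.0, d=-40.0),
}


def convert(name, raw):
    """Convert one raw byte to real units using the saved coefficients."""
    return TRANSFORMS[name].apply(raw)
```

For example, `convert("rpm", 128)` evaluates to 4000.0 with the made-up scaling above.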

Last edited by donour; Dec 21, 2007 at 10:13 AM.
Old Dec 21, 2007 | 10:41 AM
  #3  
jcsbanks's Avatar
Thread Starter
Evolved Member
 
Joined: May 2006
Posts: 2,399
Likes: 6
From: UK
No, Visual Basic. Live graphs, displaying data in text boxes may be slow, but the real delay is often IO. Two examples - on my dual core laptop, FTDI virtual COM ports can't keep up even with MUT with simple serial IO read/writes in a loop that runs at full speed on a much slower Pocket PC native serial port. Also, FTDI calls on a Pocket PC seem to run at half the speed that they do on a PC.

I've sorted the PC performance issues by using FTDI calls rather than virtual COM ports, but I thought almost anything knocked up on a PC would fly and laugh at anything an ECU could throw at it.

However, with DMA on a 32MHz SH2 we could run the throughput of MUT using 1/50000th of the CPU time. It seems that the serial signal becomes unreliable over 62500 baud, so the PC should manage, but we might need to pay more attention to optimising the PC side with future developments?

My simplistic understanding (carried over from playing with the BLITTER on my Amiga 500) is that hardware-implemented features on very modest hardware can kick the *** of a faster setup. The DMA on channel 3 of our ECUs is really powerful: it looks after all the serial comms, setting and clearing all the flags, reading the MUT table, fetching the data using indirection and calling interrupts on termination, which is really very impressive to me. Implementing this on a chip of the same performance without DMA would really hog the CPU, as illustrated by even the multibyte MUT stuff I did only logging at 750 items per second, still 5 or 6 times slower than DMA.

Last edited by jcsbanks; Dec 21, 2007 at 10:48 AM.
Old Dec 21, 2007 | 10:52 AM
  #4  
donour's Avatar
Evolved Member
iTrader: (6)
 
Joined: May 2004
Posts: 2,502
Likes: 1
From: Tennessee, USA
Originally Posted by jcsbanks
No, Visual Basic. Live graphs, displaying data in text boxes may be slow, but the real delay is often IO. Two examples - on my dual core laptop, FTDI virtual COM ports can't keep up even with MUT with simple serial IO read/writes in a loop that runs at full speed on a much slower Pocket PC native serial port. Also, FTDI calls on a Pocket PC seem to run at half the speed that they do on a PC.

I've sorted the PC performance issues by using FTDI calls rather than virtual COM ports, but I thought almost anything knocked up on a PC would fly and laugh at anything an ECU could throw at it.

However, with DMA on a 32MHz SH2 we could run the throughput of MUT using 1/50000th of the CPU time. It seems that the serial signal becomes unreliable over 62500 baud, so the PC should manage, but we might need to pay more attention to optimising the PC side with future developments?
Honestly, I don't know exactly what the overhead of VB is going to be, but I would expect it to be substantial.

Are you doing all of the I/O processing in a single thread? If performance had ever become an issue I had planned to have one thread just grab every byte of data and drop it in a queue, then have another thread transform the data for display and writing to disk. One thing I do NOT do in MacLogger is update the display every data refresh. I update the display at 10 Hz -- locked. You can't see anything happening faster than that anyway, so why bother.
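
The two-thread arrangement described above can be sketched as follows. This is not MacLogger's code: `source` is a hypothetical stand-in for the serial port (any iterable of byte chunks), and plain lists stand in for the log file and the display.

```python
import queue
import threading
import time


def reader(source, out_q):
    """Producer: grab every chunk as it arrives and drop it in the queue."""
    for chunk in source:
        out_q.put(chunk)
    out_q.put(None)  # sentinel: no more data


def processor(in_q, log, display, display_hz=10.0):
    """Consumer: convert and log everything, but refresh the display at a capped rate."""
    min_gap = 1.0 / display_hz
    last_draw = float("-inf")
    while True:
        chunk = in_q.get()
        if chunk is None:
            break
        values = list(chunk)          # stand-in for the real unit conversion
        log.append(values)            # every sample is logged
        now = time.monotonic()
        if now - last_draw >= min_gap:
            display.append(values)    # only some samples are drawn
            last_draw = now


def run(source):
    q = queue.Queue()
    log, display = [], []
    threads = [
        threading.Thread(target=reader, args=(source, q)),
        threading.Thread(target=processor, args=(q, log, display)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, display
```

The reader never waits on conversion or drawing, so a slow display cannot cause bytes to be missed; the queue absorbs the bursts.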

d
Old Dec 21, 2007 | 11:00 AM
  #5  
jcsbanks's Avatar
Thread Starter
Evolved Member
 
Joined: May 2006
Posts: 2,399
Likes: 6
From: UK
Yes I was lazy with the display, I just wanted to do (not so) quick and dirty. I still don't like programming PCs, but do it of necessity, because for these developments the same person has to be doing the PC and the ECU programming otherwise nothing gets done.

When I was playing with VB to see where the delays were before realising it was the FTDI VCP read/writes, I did some testing to see how long it would take to do say 100000 instructions and it seemed reasonable for tight loops etc.

Yes, everything is in one thread, works OK for MUT, but it won't for MUT DMA because the display will hog it.
Old Dec 21, 2007 | 11:35 AM
  #6  
donour's Avatar
Evolved Member
iTrader: (6)
 
Joined: May 2004
Posts: 2,502
Likes: 1
From: Tennessee, USA
Originally Posted by jcsbanks
My simplistic understanding (carried over from playing with the BLITTER on my Amiga 500) is that hardware-implemented features on very modest hardware can kick the *** of a faster setup. The DMA on channel 3 of our ECUs is really powerful: it looks after all the serial comms, setting and clearing all the flags, reading the MUT table, fetching the data using indirection and calling interrupts on termination, which is really very impressive to me. Implementing this on a chip of the same performance without DMA would really hog the CPU, as illustrated by even the multibyte MUT stuff I did only logging at 750 items per second, still 5 or 6 times slower than DMA.

Your point is valid except that a modern x86 is vastly more powerful than that 32 MHz SH2. My day job is working on high-performance networking protocols. Think about your PC and TCP.

Your PC has no problem processing a gigabit/sec across a network and TCP/IP is doing _way_ more than MUT has to. A well-written library would laugh and point fingers at the puny 60 kbit/s coming across the serial port. Similarly, the disk is not going to have any problems writing your data since your additional formatting will probably only add up to about 300kb/s.

As they say in the scientific community, the display should be computed out-of-band. This is why I always write my interface last (and consequently why my GUIs are so fragile and nobody uses them :-p ).

d
Old Dec 21, 2007 | 12:12 PM
  #7  
ixbreaker's Avatar
Evolving Member
iTrader: (3)
 
Joined: Apr 2006
Posts: 146
Likes: 0
From: so cal
i'm a newb at programing...but this sounds exciting...does this basically mean we'll be able to log more data points with evoscan or mitsulogger?
Old Dec 21, 2007 | 12:32 PM
  #8  
donour's Avatar
Evolved Member
iTrader: (6)
 
Joined: May 2004
Posts: 2,502
Likes: 1
From: Tennessee, USA
Originally Posted by ixbreaker
i'm a newb at programing...but this sounds exciting...does this basically mean we'll be able to log more data points with evoscan or mitsulogger?
Maybe...eventually. It's a long time off for the casual user though.

d
Old Dec 22, 2007 | 08:37 PM
  #9  
codgi's Avatar
Evolved Member
Photogenic
Liked
Loved
Community Favorite
iTrader: (22)
 
Joined: Aug 2004
Posts: 2,493
Likes: 41
From: Atlanta, GA
donour is right in the end, JCS. I bet there's a good bit of optimization that can be done. Of course part of it is due to your choice of using a CLR language for this type of stuff (particularly VB, but when I spoke about this last I was jumped on).

When I get back I have it on my list of things to do, i.e. look at the bits of code you have been sharing out and see where you can hopefully make a bunch of optimizations. But you're going to have to make this multi-threaded if you want to be serious. Going a bit further would be to actually pull any of the code doing the raw communications into unmanaged DLLs.

You can still use managed code for your UI, but the less JITing the CLR has to do the better your performance is going to be.
Old Dec 23, 2007 | 02:03 AM
  #10  
jcsbanks's Avatar
Thread Starter
Evolved Member
 
Joined: May 2006
Posts: 2,399
Likes: 6
From: UK
It isn't choice, it is necessity to get things done within my limited programming experience. When I was a teenager I was not bad at programming the stuff around then as a hobbyist, but then I lost about 10 years of computer development when I trained to be a physician, which as you can imagine kept me busy. Now my career has matured I have more free time for my hobby again. When I picked it all up again in the last few years I found that modern microcontrollers were at the level of the computers I'd last seriously played with, probably why I like them LOL.

I am hoping as you say that a proper programmer might pick this up if I provide the working ideas.

My interest is purely in the ECU and coding the SH2 in assembly. I program the PC because I have to do so to test my ECU patches.

Presently I don't have any performance issues, I described a few above. It seems to be FTDI VCP or PPC FTDI drivers.

Can you explain in simple terms what I need to do to reduce the use of JIT? Can I assemble it to native x86 for example by changing the options in my project?

How do I make it multi-threaded?

The essence of what I am doing with DMA logging is just an extension of previous loggers except that packets will be going back and forth rather than single byte instructions. I've not yet decided on the neatest protocol, but a packet would have:

1 bit for read or write
1 bit for direct or indirect addressing
at least 10 bits for the length of the transfer in bytes
4 bytes for the address
Then the data.
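
One way the fields listed above could be laid out is sketched below. The exact bit positions are an assumption (the post says the protocol is not yet decided): here the two flags and a 14-bit length share one big-endian 16-bit word, followed by a big-endian 32-bit address. The function names are hypothetical.

```python
import struct

# Assumed layout: bit 15 = write, bit 14 = indirect, bits 13..0 = length in bytes,
# then a 4-byte address. Big-endian throughout, matching the SH2's byte order.
HEADER = struct.Struct(">HI")
MAX_LEN = 0x3FFF  # 14 bits, which satisfies "at least 10 bits"


def pack_packet(write, indirect, address, length, data=b""):
    """Build a request packet; data is only sent for writes."""
    if not 0 <= length <= MAX_LEN:
        raise ValueError("length does not fit in 14 bits")
    if write and len(data) != length:
        raise ValueError("write payload must match the declared length")
    word = (int(write) << 15) | (int(indirect) << 14) | length
    return HEADER.pack(word, address) + (data if write else b"")


def unpack_packet(packet):
    """Split a packet back into (write, indirect, address, length, data)."""
    word, address = HEADER.unpack_from(packet)
    write = bool(word & 0x8000)
    indirect = bool(word & 0x4000)
    length = word & MAX_LEN
    data = packet[HEADER.size:HEADER.size + length] if write else b""
    return write, indirect, address, length, data
```

With this layout the header is 6 bytes, so a read request for a whole map is a 6-byte packet and a write is 6 bytes plus the map data.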

For realtime mapping, the logger can effectively peek or poke ranges of memory that contain the fuel and timing maps. The user needs to see these on the PC as nice editable tables.
Old Dec 23, 2007 | 02:27 AM
  #11  
jcsbanks's Avatar
Thread Starter
Evolved Member
 
Joined: May 2006
Posts: 2,399
Likes: 6
From: UK
http://www.grimes.demon.co.uk/dotnet/man_unman.htm is interesting reading.

I think for DMA logging I should produce the absolute minimalist code I can to merely connect to the ECU and log to csv file, and lose all the display. We'll see if there are any performance issues, and if there are the code will be so simple that it will be obvious to those in the know how to streamline it/rewrite it. Does this sound reasonable?
Old Dec 23, 2007 | 08:21 PM
  #12  
codgi's Avatar
Evolved Member
Photogenic
Liked
Loved
Community Favorite
iTrader: (22)
 
Joined: Aug 2004
Posts: 2,493
Likes: 41
From: Atlanta, GA
Originally Posted by jcsbanks
It isn't choice, it is necessity to get things done within my limited programming experience.
Yes I remember when we spoke about this last, but the effort spent learning VB could have been spent messing with C# (so close in styles now in the latest incarnation and with a minor perf advantage...why not?). VB has always been, and probably always will be a "quick and dirty" language. Its main purpose is quick development and not necessarily performance and strength.

Can you explain in simple terms what I need to do to reduce the use of JIT? Can I assemble it to native x86 for example by changing the options in my project?
Not much you can do about it once the main amount of the code is written using managed code (what VB and others use). When you compile, your code is compiled into CLR byte code, which of course the computer can do nothing with until it is JIT'd by the framework at run time into good old machine code.

The way to cut down on this would be to write the majority of the libraries using unmanaged code (see flat C++ or similar) and store it in a DLL. Less work for the runtime to do, and you can call this code from managed code.

You would also get a minor perf advantage by having less byte code (less stuff to JIT) which would mainly come from cleaner more efficient code and of course using a language that was designed from the ground up to work with the framework (C#) as opposed to ported over and "made to work" (VB).

I doubt this is where the main bottleneck is though...it's more than likely just how the code itself is written...but just putting that out there for you to think on.

Also note that in the article above there are some minor discrepancies in the comparisons and in the compiler arguments used. Either way, I doubt "the edge" is where your code is slowing down, but it is there if you really wanted to go all out and see.

How do I make it multi-threaded?
Have a peek at the examples in this System.Threading MSDN article:

http://msdn2.microsoft.com/en-us/lib...ng.thread.aspx

Search around a bit for it too, because it's not something that I can explain easily in a thread. You could do this a few ways. The simplest way is to simply use two threads: one for the UI and one for the communication/logging.

The best way to do it would be to have a thread for each unique part of the problem, i.e. one thread for the UI, one for communication, one for logging and so on. If you really wanted to get fancy, you could break the tasks down as minuscule as possible and have a thread for each sub task. The whole point of this is that you don't tie up the machine waiting for one specific function to stop, i.e. waiting until one bit of communication is finished before logging, or waiting until the log is finished and being unable to re-paint the UI on the screen.

Getting them to play nicely together might be a chore, and you might introduce other bugs here as well. But if used properly, it could potentially give you the great balance of getting fast information out of the ECU while still being able to show it in realtime to the user, or log it to the HDD without causing other parts of the program to miss information.

Edit for some that might be following but don't know:

JIT: Just In Time. JIT runtimes compile intermediate code into machine executable code just as it is called, hence Just In Time.

Managed code. In a nutshell, garbage collection and other such memory housekeeping is taken care of by the runtime. CLR languages (C#, VB.net etc) are managed code. Unmanaged is the opposite...a good example of this is C++ or any other language where you basically have to take care of memory management on your own.

CLR. Common Language Runtime. The .Net framework or similar ports (Mono) which support JIT execution of multiple languages once they are in the correct IL (intermediate language) format.

Last edited by codgi; Dec 24, 2007 at 09:24 AM. Reason: Forgot the CLR definition :)
Old Dec 24, 2007 | 01:10 AM
  #13  
jcsbanks's Avatar
Thread Starter
Evolved Member
 
Joined: May 2006
Posts: 2,399
Likes: 6
From: UK
Thanks for all the input. Last night I got rid of loads of stuff out of the logger, and implemented a very simple xml read for configuration for logging (based on MalibuJack's stuff with permission), which I'll also use for realtime mapping for the variety of ECUs. I've got rid of the bitmap graphing, the multiple hard coded text boxes and the code to draw them is going, being replaced by a listview. All my evaluations could be done with a single multiplier except for the coolant and air temperatures, for which I already have a 16 item byte lookup table with linear interpolation (based on MMCd) which I think gives the most accurate results compared with polynomial equations which seem to produce errors with outlying data.
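
A lookup table with linear interpolation, as described above for the temperature sensors, could be sketched like this. The table values are invented (they are not the MMCd data), and spreading 16 entries evenly over the 0-255 byte range at 17 counts apart is an assumption about the layout.

```python
# Hypothetical 16-entry table of temperatures in degrees C, one entry per 17 raw counts
# (raw 0, 17, 34, ..., 255). The values are made up for illustration.
TABLE = [158, 125, 104, 90, 80, 71, 63, 56, 49, 42, 35, 28, 20, 10, -3, -30]
STEP = 17  # 15 intervals * 17 counts = 255, so the last entry sits exactly at raw 255


def lookup(raw):
    """Convert a raw sensor byte to a temperature by interpolating between table entries."""
    if not 0 <= raw <= 255:
        raise ValueError("raw value must be a byte")
    idx, frac = divmod(raw, STEP)
    if idx >= len(TABLE) - 1:
        return float(TABLE[-1])
    lo, hi = TABLE[idx], TABLE[idx + 1]
    return lo + (hi - lo) * frac / STEP
```

Because the result can never leave the range of the table, out-of-range readings cannot run away the way a fitted polynomial can at the extremes.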

I'm afraid I'm so far down the line with this I'm going to stick with VB and try to address performance issues as they arise. I did not have to learn BASIC, and had used something a little like VB before (Toolbook), but I did have to learn better about classes, objects, properties, methods etc. I am at the level of being able to "write" really very simple stuff indeed in C (hello world LOL) so that is going nowhere fast. I will think carefully about adding threading to the project, as presently I do call statements in logging that await their replies.

I am really pleased that I've got rid of the dreaded strings in the comms and logging, just uses values, although I will be more careful in using integers where I can rather than floating and unnecessary type conversions. So far I have also had massive improvements from moving from eVB (interpreted) to VB.net, and also from moving from VCP to FTDI device calls.

Perhaps I am overemphasising the performance problems as at present they are all overcome, but the times they've been hit they turn something that feels slick to use into something useless. It does seem odd that certain string manipulation instructions when playing with Mitsubishilogger can slow logging to a crawl without apparent explanation, that VCP can mean that a fast PC ends up half as fast as MUT whereas with FTDI calls it is twiddling its thumbs, and that a similar performance hit is experienced on a PPC going from a real serial port to FTDI calls. I don't want to junk the whole lot to use the apparent nirvana of unmanaged code just yet when I suspect that is not usually the cause of the performance issues; it is the stuff above or simply my bad (not so) quick and dirty code.

Last edited by jcsbanks; Dec 24, 2007 at 01:21 AM.
Old Dec 24, 2007 | 09:41 AM
  #14  
codgi's Avatar
Evolved Member
Photogenic
Liked
Loved
Community Favorite
iTrader: (22)
 
Joined: Aug 2004
Posts: 2,493
Likes: 41
From: Atlanta, GA
Originally Posted by jcsbanks
Thanks for all the input. Last night I got rid of loads of stuff out of the logger, and implemented a very simple xml read for configuration for logging (based on MalibuJack's stuff with permission), which I'll also use for realtime mapping for the variety of ECUs.
XML support is built into the framework, so the more of the base libraries you use and the less re-invention of the wheel you have to do will help along perf a good bit.

I will think carefully about adding threading to the project, as presently I do call statements in logging that await their replies.
You are making blocking calls, whereby the code can't continue until the calls come back. If they come back quickly it's negligible; if they take too long, this is where some of the bottleneck is. For any calls you really should be making non-blocking calls. Once again, if you want to really get fancy you can set up callbacks to take care of that. Or simply be cheap and add threading, which will more or less let you do it for free.

I am really pleased that I've got rid of the dreaded strings in the comms and logging, just uses values, although I will be more careful in using integers where I can rather than floating and unnecessary type conversions. So far I have also had massive improvements from moving from eVB (interpreted) to VB.net, and also from moving from VCP to FTDI device calls.
Strings in .net are immutable, so every manipulation allocates a new string and eats a good chunk of memory. Wherever possible you want to stay away from string manipulation; it's expensive! Or if you do need it, use the built-in libraries. A simple example is concatenation:

You can do this "string 1 " + value + " string 2"

or

String.Format("string 1 {0} string 2", value) (or something to that effect)

or

String.Concat("string 1 ", value, " string 2")

All of them will produce the final string "string 1 value string 2". But the last one will use the least resources to do so. Not a big thing if you do this once, but in a tight loop where you are doing this over and over again the last will be way better.

In either case I'll take a look and see if I can help...of course it depends on how things have been going at work since I have left.
Old Dec 24, 2007 | 12:07 PM
  #15  
jcsbanks's Avatar
Thread Starter
Evolved Member
 
Joined: May 2006
Posts: 2,399
Likes: 6
From: UK
With that in mind (strings), I'm at a crucial point where I have the xml stuff I need loaded into listviews.

The main issue now is that the code will work through the buffer received from the FTDI, and calculate each value and write it to a string to then write to a .csv. The plan was to read all the needed info to do this processing out of a listview, but it is effectively reading a load of strings and converting them to floating point to do the math. Is this a disaster? Should I precalculate all the needed values and store them in a floating point array instead?
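
The precalculation being asked about can be sketched as below: parse the configuration strings once, then run the per-sample loop on plain numeric arrays. The row layout (name, multiplier text, offset text) and the function names are assumptions for illustration, not the actual listview or xml format.

```python
import csv


def precompute(rows):
    """Parse the config strings once, before logging starts."""
    names = [name for name, _, _ in rows]
    scales = [float(scale) for _, scale, _ in rows]
    offsets = [float(offset) for _, _, offset in rows]
    return names, scales, offsets


def convert_buffer(buf, names, scales, offsets, out):
    """Walk the received buffer frame by frame and write converted values as csv."""
    n = len(names)
    writer = csv.writer(out)
    writer.writerow(names)
    # Only complete frames are converted; a trailing partial frame is ignored.
    for start in range(0, len(buf) - n + 1, n):
        frame = buf[start:start + n]
        writer.writerow([raw * s + o for raw, s, o in zip(frame, scales, offsets)])
```

The inner loop then touches no strings until the csv writer formats the finished row.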



All times are GMT -7.