Silntknight
I'm also curious about direct register access for pins (like Arduino's DDR and PORT registers). I know Maple doesn't support direct register access (http://forums.leaflabs.com/topic.php?id=268), but if I created methods for this type of access, would that increase performance?
poslathian talks about register access in the thread http://forums.leaflabs.com/topic.php?id=268
The Maple libraries hide direct register access, but as long as you are careful you can still use it. For example, use pinMode to get the pin into the right mode, then use direct register access to manipulate the pin.
Some of the registers are already named in Maple headers.
Direct access can be dramatically faster than going through some of the library functions.
digitalWrite and digitalRead each involve a function call, a check that the pin number is valid, and a conversion from the Maple pin number to the actual port and bit. None of that is very slow, but it is slower than going straight to the register.
For example:
void digitalWrite(uint8 pin, uint8 val) {
    if (pin >= NR_GPIO_PINS) {
        return;
    }
    gpio_write_bit(PIN_MAP[pin].port, PIN_MAP[pin].pin, val);
}
Then
static inline void gpio_write_bit(GPIO_Port *port, uint8 gpio_pin, uint8 val) {
    if (val) {
        port->BSRR = BIT(gpio_pin);
    } else {
        port->BRR = BIT(gpio_pin);
    }
}
Then
#define BIT(shift) (1UL << (shift))
Alternatively, if you know the address of the register, e.g.:
#define GPIOA       (0x40010800)
#define GPIOA_BSRR  (*(volatile uint32 *)(GPIOA + 0x10))
#define SPARK_PIN   (0b0000000001000000)   // bit 6 of the port

GPIOA_BSRR = SPARK_PIN;           // set pin on
GPIOA_BSRR = (SPARK_PIN << 16);   // set pin off
Each of these compiles down to little more than a single store instruction, which runs in a cycle or two.
The difference is even more significant if several pins on the same port are being changed.
(If I remember, I'll run this into an oscilloscope.)
I haven't had a look at the code produced by the compiler for digitalWrite, but this will be significantly faster.
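To make that concrete, here is a rough sketch of the pinMode-then-direct-register approach, driving two pins on the same port with single stores. The Maple pin numbers (11 and 12) and their GPIOA bit positions (PA7 and PA6) are my assumptions from the pin map, so check them against your board revision before trusting the sketch.

#define PORTA_BSRR  (*(volatile uint32 *)(0x40010800 + 0x10))   // GPIOA's BSRR register

#define PIN_X       (1UL << 6)   // PA6, assumed to be Maple pin 12
#define PIN_Y       (1UL << 7)   // PA7, assumed to be Maple pin 11

void setup() {
    // Let the library do the one-off configuration work...
    pinMode(12, OUTPUT);
    pinMode(11, OUTPUT);
}

void loop() {
    // ...then drive both pins with single stores: the low half of BSRR
    // sets bits, the high half resets them, so one write does both at once.
    PORTA_BSRR = PIN_X | (PIN_Y << 16);   // PA6 high, PA7 low
    PORTA_BSRR = PIN_Y | (PIN_X << 16);   // PA7 high, PA6 low
}

This is the sort of loop where the difference should show up clearly on a scope.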
The difference between direct access and analogRead is even bigger, because analogRead blocks while the value is being sampled and converted. I'd guess 90%+ of the time spent inside analogRead is just waiting, doing nothing.
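If you want that waiting time back, one option is a "start now, collect later" pattern using ADC1's registers directly. This is only a sketch: the register addresses and bit positions are taken from ST's STM32F103 reference manual, the pin and channel numbers are placeholders to look up in the pin map, and I'm assuming the Maple core has already clocked, powered on and calibrated ADC1 at startup.

#define ADC1_SR     (*(volatile uint32 *)(0x40012400 + 0x00))   // status register
#define ADC1_CR2    (*(volatile uint32 *)(0x40012400 + 0x08))   // control register 2
#define ADC1_SQR3   (*(volatile uint32 *)(0x40012400 + 0x34))   // regular sequence, first channel
#define ADC1_DR     (*(volatile uint32 *)(0x40012400 + 0x4C))   // data register

#define ADC_EOC     (1UL << 1)   // end-of-conversion flag in SR
// SWSTART + EXTTRIG + EXTSEL = software start, so writing this kicks off a conversion
#define ADC_START   ((1UL << 22) | (1UL << 20) | (0x7UL << 17))

#define MY_ADC_PIN      3   // placeholder Maple pin
#define MY_ADC_CHANNEL  8   // placeholder ADC channel for that pin -- look it up

void setup() {
    pinMode(MY_ADC_PIN, INPUT_ANALOG);
}

void loop() {
    ADC1_SQR3 = MY_ADC_CHANNEL;   // convert this channel next
    ADC1_CR2 |= ADC_START;        // start the conversion and carry on immediately

    // ...do something useful here while the ADC samples and converts...

    while (!(ADC1_SR & ADC_EOC))  // only wait if it hasn't already finished
        ;
    uint16 sample = ADC1_DR;      // reading DR also clears EOC
    SerialUSB.println(sample);
}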
As for writing assembler: I do know folks who are very good at it, but unless you *really* know what you are doing, don't use assembler.
It is quite possible that you will do something that prevents the compiler from optimising the code, so clumsy assembler may actually make things slower.
Machine code is just numbers, and is exactly the same stuff as the assembler generates. Writing machine code is just a very hard way to write the numeric values that assembler produces for you.
As josheeg says, if you really want to turn something into assembler, write C code, then look at the assembler produced by the compiler.
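For instance, with the usual ARM GCC toolchain (the file names, include paths and flags below are just placeholders for whatever your project uses):

arm-none-eabi-gcc -O2 -S -fverbose-asm mycode.c -o mycode.s    # stop after generating assembler
arm-none-eabi-objdump -d mycode.o > mycode.lst                 # or disassemble a compiled object file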
If the application you've been describing in the other threads is representative of what you want to do, then IMHO you'd get much better value from investigating how to use the on-board peripherals, e.g. timers.
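For example, once a timer is generating PWM, the pin keeps toggling with no CPU time spent at all. A minimal sketch using the Maple core's pwmWrite() (I'm assuming pin 9 is one of the PWM-capable pins; check the pin map):

void setup() {
    pinMode(9, PWM);      // hand the pin over to its timer
    pwmWrite(9, 32768);   // roughly 50% duty cycle; Maple PWM is 16-bit, 0-65535
}

void loop() {
    // Nothing to do here -- the timer keeps the waveform going by itself.
}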
I learned that writing lower-level code is better for performance, but to what extent is this true? How much of a gain can be expected if everything, as opposed to nothing, were written in low-level code?
I assume that by "better for performance" you mean faster? It is a good idea to be clear whether you mean speed, size, or some other performance quality.
A good software engineering rule of thumb is that more than 80% of the run time is spent in less than 20% of the code.
So rewriting the 80% of the code that only takes 20% of the run time is highly unlikely to show a useful improvement: even if that code runs 5x faster, its 20% shrinks to 4%, so the overall run time only drops to 84%, which was probably not worth the effort.
Further, there is a LOT of evidence that developers are not very successful at choosing the right part of the program to optimise before writing it, and we are not much better afterwards without careful measurement; lots of developers are quite bad at making those careful measurements.
Anyway, the only parts worth considering are the 20% that dominate the run time. It is often true that using a better algorithm makes more difference than hand-crafting assembler code, for example choosing a faster sorting algorithm over a slower one. Even trying different compiler optimisation flags may do the trick.