**The application note**

After reading Atmel's application notes on how to code efficiently in C for their devices, I started experimenting with my own example code.

**General purpose working registers in C**

As an addition to the application note above, I learned that I don't have to worry much about the general purpose working registers when programming ATtinys in C with avr-gcc and avrdude, because the compiler allocates those registers itself; see this link and this discussion. That is good news: I don't have to worry about my variables being stored inefficiently.

**Global variables and types**

The number of global variables and functions obviously has an effect on code size.

Avoid global variables wherever you can. Sometimes they are a necessary evil when using interrupts (especially in the I2C implementation I'm using), but keeping their number down and using the smallest types that fit the data will have a positive effect on code size.
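As an illustration of picking the smallest type for an interrupt-shared global, here is a minimal sketch. The names `data_ready`, `on_byte_received` and `take_data_ready` are hypothetical, and the interrupt handler is replaced by a plain function so the snippet also compiles on a PC.

```c
#include <stdint.h>

/* Hypothetical flag shared between an interrupt handler and the main code.
   volatile forces every access to go to memory; uint8_t keeps the flag to
   one byte, where an int would take two bytes on AVR. */
static volatile uint8_t data_ready = 0;

/* Stand-in for the body of an interrupt service routine. */
static void on_byte_received(void) {
    data_ready = 1;
}

/* Returns 1 if the flag was set since the last call, then clears it.
   On real hardware the read-and-clear is not atomic, so you may want to
   disable interrupts around it. */
static uint8_t take_data_ready(void) {
    uint8_t was_ready = data_ready;
    data_ready = 0;
    return was_ready;
}
```

A single one-byte flag like this is often enough to hand work from the interrupt to the main loop, which keeps the number of globals down.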

**Converting main as __C_task, C_task**

I'm using avr-gcc, which doesn't have this option. Essentially you're saving the return statement in your int main, which saved 12 bytes in my code. Luckily, if you omit the return statement from your normal

int main(void){}

avr-gcc only issues a warning, and it has no effect on how your code works, provided you never call main, which is usually the case (void main(void){} works, too).

**Floating point vs integer arithmetic**

**Let's go through a simple example: taking an average and rounding with integers**

This is not an optimisation in the strict sense, but the choice you make here could determine whether your code fits on an ATtiny at all, which is why I mention it. Floating point arithmetic is expensive on an ATtiny, and sometimes it isn't possible at all. Integer arithmetic is mostly fine, but it gets harder when you have to divide and round. If you're running into this problem for the first time, the explanation below should help.

Let's assume you have to take several sensor readings in a short period of time and calculate their average. This can be the case when your communication protocol is slow compared to the sampling rate, so you have to downsample but want to retain as much of the information as possible. The first thing you can do is average the readings. This can be tricky when you only have whole numbers to play with. I'm going to show you a simple case here. For further reading on moving averages and oversampling, check out this Atmel note.

What's the problem with rounding?

The problem is that integer division truncates. Let's assume you have four 1-bit readings, each either 1 or 0. Let's look at the averages in integer arithmetic. You can even try this in Python (use the // operator in Python 3).

Sample: 1 1 1 1, average 4 / 4 = 1.

Sample: 1 1 1 0, average 3 / 4 = 0.

Sample: 1 1 0 0, average 2 / 4 = 0.

Sample: 1 0 0 0, average 1 / 4 = 0.

Sample: 0 0 0 0, average 0 / 4 = 0.
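The same truncation happens in C. The sketch below reproduces the list above; the function name `average4_truncated` is hypothetical and only there for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: plain integer average of four readings.
   The division truncates, so with 1-bit readings any sum below 4 gives 0. */
static uint8_t average4_truncated(uint8_t r0, uint8_t r1, uint8_t r2, uint8_t r3) {
    /* A 16-bit sum cannot overflow for four 8-bit readings. */
    uint16_t sum = (uint16_t)(r0 + r1 + r2 + r3);
    return (uint8_t)(sum / 4);
}
```

Calling `average4_truncated(1, 1, 1, 0)` returns 0, even though three of the four readings were 1.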

Intuitively, if there are more 1s than 0s we should round up, since 1 is then the more accurate summary of the data. And by convention we also round up the tie, the 1 1 0 0 case.

To fix this, we want the result of the division to be at least 1 in the first 3 cases, so that it truncates to 1. Adding 0.5 to the result would do the trick if we had fractional numbers: 1 + 0.5 = 1.5 truncates to 1, 0.75 + 0.5 = 1.25 truncates to 1, 0.5 + 0.5 = 1 stays 1, 0.25 + 0.5 = 0.75 truncates to 0, and 0 + 0.5 = 0.5 truncates to 0. But we don't have fractional numbers. However, this is still a good starting point.

We can avoid the 1/2 in our example if we add 2 to each of our sums and then divide by 4.

Sample: 1 1 1 1, average (4 + 2) / 4 = 1.

Sample: 1 1 1 0, average (3 + 2) / 4 = 1.

Sample: 1 1 0 0, average (2 + 2) / 4 = 1.

Sample: 1 0 0 0, average (1 + 2) / 4 = 0.

Sample: 0 0 0 0, average (0 + 2) / 4 = 0.

Let's generalize this idea of a/b + 1/2. Let's multiply by b and then divide by b and do the algebra.

a/b + 1/2 = (a + b/2) / b.

So that's your formula. For non-negative integers, a/b rounded to the nearest integer is (a + b/2) / b. This formula is even used in the Linux kernel; see line 102 in /include/linux/kernel.h, or search for "DIV_ROUND_CLOSEST(x, divisor)". It is line 89 in the Raspberry Pi Linux kernel. You can see how negative (signed) numbers are handled in the next line there, too.

You can see that this is the case above, too: 4/4 + 1/2 = (4 + 2) / 4

All divisions are integer divisions (i.e. truncations). There is an overflow issue with this formula, so you'll have to make sure (a + b/2) can be represented in the integer type you use.
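Here is a minimal sketch of the formula in C for non-negative values. The function name `div_round_closest_u16` is hypothetical (it is not the kernel macro), and it assumes b is not zero. The intermediate sum is widened to 32 bits to sidestep the overflow issue.

```c
#include <stdint.h>

/* Hypothetical helper: a / b rounded to the nearest integer, ties rounded up.
   Assumes b != 0. The biased value a + b/2 is computed in 32 bits, so it
   cannot overflow for any pair of 16-bit inputs. */
static uint16_t div_round_closest_u16(uint16_t a, uint16_t b) {
    uint32_t biased = (uint32_t)a + (uint32_t)(b / 2);
    return (uint16_t)(biased / b);
}
```

The wider type costs code size on an 8-bit part, so if you know that a + b/2 always fits in 16 bits, the narrower type is probably the better choice.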

Now how do you divide integers more efficiently?

Use powers of 2 as the sample size. So don't take 5 readings, take 4 or 8. Don't take 10 readings, take 8 or 16.

To divide by a power of 2, shift the bits to the right. This is the same truncating integer division as above, so this version doesn't round (7 / 4 = 1, whereas the rounding version gives 2).

Divide 6 by 2: 6 >> 1 = 6 / 2^1 = 3.

Divide 6 by 4: 6 >> 2 = 6 / 2^2 = 1.
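A small sketch of the shift in C, using unsigned values only. The function name `div_pow2` is hypothetical, and the shift count is assumed to be smaller than 16.

```c
#include <stdint.h>

/* Hypothetical helper: divide an unsigned value by 2^shift by shifting right.
   For unsigned operands this gives the same truncated result as '/'.
   Assumes shift < 16. */
static uint16_t div_pow2(uint16_t value, uint8_t shift) {
    return (uint16_t)(value >> shift);
}
```

For example, `div_pow2(6, 2)` and `6 / 4` both give 1.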

To put everything together, how do you take the integer average of 4 readings on an ATtiny in C with rounding?

Let's suppose you have the sum.

average = (sum + 2) >> 2;

Just make sure (sum + 2) fits in the integer type you're using.
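Putting the sum and the rounding shift together, a minimal sketch could look like this. The function name `average4_rounded` is hypothetical, and the readings are assumed to be 10-bit values (0 to 1023), which is my assumption for the example and not something the formula requires.

```c
#include <stdint.h>

/* Hypothetical helper: rounded average of four readings.
   Assumes each reading is at most 1023, so the sum is at most 4092 and
   sum + 2 fits comfortably in a uint16_t. */
static uint16_t average4_rounded(const uint16_t readings[4]) {
    uint16_t sum = 0;
    for (uint8_t i = 0; i < 4; i++) {
        sum = (uint16_t)(sum + readings[i]);
    }
    /* Add half the divisor (4 / 2 = 2), then divide by 4 with a shift. */
    return (uint16_t)((sum + 2) >> 2);
}
```

With readings of 1, 1, 0 and 0 the sum is 2, and (2 + 2) >> 2 gives 1, matching the rounded table earlier in this post.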