
Getting your IoT development under control

By Flora McDonald

June 2018

The whole world seems to be going IoT crazy at the moment. There are Internet-connected fridges that tell me if I’ve run out of milk (because I’m too busy updating my status to open the door), Internet-connected toasters that let me set how brown my toast is via an app on my smartphone, and Internet-enabled doorbells that tell ne’er-do-wells that I’m not home right now and the nearest police are (according to their Twitter feed) at least 20 minutes away. All of these are making my life more fulfilled whilst quietly harvesting my data, but that’s a story for another day. Every company has a dream of what its future Utopia will be but, almost without exception, these visions rest on the shoulders of the humble Cortex-M family of microprocessors from ARM.

I love the Cortex-M; it’s a great device. ARM has spent billions getting it just right but many engineers don’t seem to know just what it’s capable of. For example, a lot of time and effort has gone into providing most members of the Cortex-M family with a very comprehensive set of on-chip tools to help you develop better software. Yet most people that I meet are still plugging away with printf() and prayer.

I’m going to use this post to take you on a whirlwind tour of how the Cortex-M can help you make safer, smarter devices in less time. Be aware, though, that not all members of the Cortex-M family support all of these options and, due to the customisable nature of ARM IP, not all vendors include all options in their devices.

The Cortex-M supports up to 256 interrupt priority levels, giving you fine control over which of the many I/O devices you connect gets serviced first. Smaller members of the family support fewer interrupt priority levels but all boast very low interrupt latency, typically between 12 and 16 clock cycles depending on the core. Multiple interrupts can share a priority, with a further sub-priority deciding which of them is serviced first when several are pending at once. All interrupts are pre-emptible by a higher priority interrupt, and interrupt service routines can be chained: a second will start immediately after a first has finished without having to return to the original code that was interrupted. This greatly reduces interrupt latency and can make a system much more responsive. Pushing the envelope still further, many multi-cycle instructions (PUSH, POP, etc.) are also interruptible.
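
To make the priority/sub-priority split concrete, here is a minimal sketch in C that runs on a desktop, not on a target. The names pack_priority and would_preempt are my own inventions for illustration, and the parameter values in the note below are assumptions. On a real part the number of implemented priority bits is fixed by the vendor, not every core implements sub-priorities, and CMSIS offers NVIC_EncodePriority() for the real thing. Treat this as a model of the idea, not as a driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack a pre-emption (group) priority and a sub-priority
 * into the 8-bit value held in an interrupt priority register.
 *   prio_bits: number of priority bits the device implements (1..8)
 *   sub_bits : how many of those bits are used as sub-priority (0..prio_bits)
 * The implemented bits sit at the top of the byte, so the field is shifted up. */
uint8_t pack_priority(unsigned prio_bits, unsigned sub_bits,
                      unsigned preempt, unsigned sub)
{
    assert(prio_bits >= 1u && prio_bits <= 8u);
    assert(sub_bits <= prio_bits);
    unsigned preempt_bits = prio_bits - sub_bits;
    assert(preempt < (1u << preempt_bits));
    assert(sub < (1u << sub_bits));

    unsigned field = (preempt << sub_bits) | sub;
    return (uint8_t)(field << (8u - prio_bits));
}

/* Hypothetical helper: would an interrupt with priority byte new_prio pre-empt
 * a handler running at running_prio? Only the group part is compared, and a
 * numerically lower value is the more urgent one. The sub-priority never
 * causes pre-emption; it only orders interrupts that are pending together. */
bool would_preempt(uint8_t new_prio, uint8_t running_prio,
                   unsigned prio_bits, unsigned sub_bits)
{
    assert(prio_bits >= 1u && prio_bits <= 8u);
    assert(sub_bits <= prio_bits);
    unsigned shift = (8u - prio_bits) + sub_bits;
    unsigned new_group = (unsigned)new_prio >> shift;
    unsigned running_group = (unsigned)running_prio >> shift;
    return new_group < running_group;
}
```

Assuming a device with 4 implemented bits split 2/2, pack_priority(4, 2, 1, 3) packs group 1, sub-priority 3 into one byte. An interrupt in group 0 would pre-empt it, whereas one in group 1 with a different sub-priority would simply wait its turn.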

The entire family has been on a crash diet and occupies a very small number of silicon gates: about 12K for the Cortex-M0. All of this lightweight computing means these cores consume very little power, and support for a number of sleep modes allows them to be very frugal indeed. If, like me, you have teenage children, you’ll be familiar with the incessant routine of following them around the house turning off lights and devices they’ve left powered on as they breeze from one room to the next, oblivious to their own carbon footprint. Many Cortex-M devices employ a similar ‘parent’: circuit blocks on the device can be clocked and powered independently, allowing further power savings as whole chunks of the device power down until needed again.

On the members of the family that support it, peripheral registers can be bit-banded. Gone are the days of Read-Modify-Write for setting bits in registers. Every bit in a bit-banded register is also mapped to a word of its own in a separate alias region, so a single write operation can clear or set any bit. The CMSIS (Cortex Microcontroller Software Interface Standard, an abstraction layer provided by ARM) provides some convenient macros for handling all of this. Not only is this faster than a traditional read-modify-write sequence, it also guards against a bit changing in the register between the read and the write back. These are very difficult bugs to find!
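
The address arithmetic behind bit-banding is simple enough to sketch. The snippet below is a host-side illustration in C that only computes the alias address. The function name bitband_alias and the macro names are my own, and the base addresses are the usual ones for the peripheral bit-band region on the Cortex-M3 and M4. Check your device’s reference manual before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Peripheral bit-band region and its alias, as used on Cortex-M3/M4 parts
 * that include the bit-banding option. The macro names are illustrative. */
#define PERIPH_BB_BASE   0x40000000u   /* start of the 1 MB bit-band region */
#define PERIPH_BB_ALIAS  0x42000000u   /* start of the 32 MB alias region   */
#define PERIPH_BB_SIZE   0x00100000u   /* 1 MB                              */

/* Hypothetical helper: return the alias word address that controls a single
 * bit of a register in the bit-band region. Each bit gets one 32-bit word in
 * the alias region, hence 32 alias bytes per register byte and 4 per bit. */
uint32_t bitband_alias(uint32_t reg_addr, unsigned bit)
{
    assert(reg_addr >= PERIPH_BB_BASE &&
           reg_addr < PERIPH_BB_BASE + PERIPH_BB_SIZE);
    assert(bit < 32u);

    uint32_t byte_offset = reg_addr - PERIPH_BB_BASE;
    return PERIPH_BB_ALIAS + byte_offset * 32u + bit * 4u;
}
```

On a target with bit-banding, writing 1 or 0 through a volatile uint32_t pointer to the returned address would set or clear that one bit in a single store. For a made-up register at 0x40010C0C, bit 5 ends up at alias address 0x42218194.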

A nice feature that should give you a warm fuzzy feeling is shadowed stack pointers. There are two of them: the Main Stack Pointer (MSP), which exception handlers always use and which privileged code such as a kernel typically keeps for itself, and the Process Stack Pointer (PSP), which is typically handed to non-privileged tasks. Only the main stack pointer is initialised at reset. This means that your RTOS kernel and exception handlers can use a stack that tasks/threads cannot corrupt, especially if you protect it using the MPU (Memory Protection Unit) that is available on some members of the family.
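
Which stack pointer thread code uses, and whether it runs privileged, is selected by two bits in the core’s CONTROL register. Here is a small host-side sketch in C of how those bits are interpreted. The function and macro names are hypothetical, and the smallest cores may not implement the privilege bit at all, so take it as an illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the CONTROL register (names here are illustrative). */
#define CTRL_NPRIV  (1u << 0)  /* 1 = thread mode runs unprivileged        */
#define CTRL_SPSEL  (1u << 1)  /* 1 = thread mode uses the process stack   */

bool thread_is_privileged(uint32_t control)
{
    return (control & CTRL_NPRIV) == 0u;
}

bool thread_uses_psp(uint32_t control)
{
    return (control & CTRL_SPSEL) != 0u;
}

/* Hypothetical helper: the CONTROL value a kernel might load before starting
 * a task on the process stack. Only privileged code can change CONTROL, so
 * the caller's current value is checked first. */
uint32_t control_for_task(uint32_t current_control, bool task_privileged)
{
    assert(thread_is_privileged(current_control));
    return CTRL_SPSEL | (task_privileged ? 0u : CTRL_NPRIV);
}
```

Out of reset CONTROL reads as zero: privileged, main stack. A kernel that wants its tasks fenced off would switch them to the value returned by control_for_task(0, false), leaving the main stack to itself and the exception handlers.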

I’ll briefly touch on the fault handling capabilities as space prevents me from doing them justice; they really could do with a post all by themselves. Runtime faults can be detected by the core and these can also be used to trigger debug events; handy if you have a debugger connected. The debug side of this magic is taken care of by the DEMCR (Debug Exception and Monitor Control Register). From a runtime point of view, error handlers are able to escalate faults to higher and higher-level fault handlers if they find that the problem can’t be dealt with locally. Think about the last time you got nowhere when dealing with a call centre and you had to escalate your problem to the supervisor! The supervisor (in this case your fault handler) has access to a lot of extra information about the nature of the fault and, often more importantly, where it occurred. This gives you a lot of flexibility to try to fix things at runtime before you push the big red panic button and re-boot the device.
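
As a taste of that extra information, here is a hedged sketch in C of what a fault handler might do with it. It runs on a desktop against plain numbers. On the larger cores the nature of the fault is reported in the Configurable Fault Status Register (CFSR), and the core stacks eight registers on exception entry, the seventh of which is the program counter at the time of the fault. The names decode_cfsr and stacked_pc are my own, and the table covers only a handful of the defined flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t mask;
    const char *name;
} fault_flag;

/* A subset of the CFSR flags, in ascending bit order. */
static const fault_flag cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL: instruction access violation" },
    { 1u << 1,  "DACCVIOL: data access violation" },
    { 1u << 8,  "IBUSERR: instruction bus error" },
    { 1u << 9,  "PRECISERR: precise data bus error" },
    { 1u << 10, "IMPRECISERR: imprecise data bus error" },
    { 1u << 16, "UNDEFINSTR: undefined instruction" },
    { 1u << 17, "INVSTATE: invalid execution state" },
    { 1u << 24, "UNALIGNED: unaligned access" },
    { 1u << 25, "DIVBYZERO: divide by zero" },
};

/* Hypothetical helper: collect the names of the flags set in a CFSR value.
 * Stores at most max_out names in out and returns how many were stored. */
size_t decode_cfsr(uint32_t cfsr, const char **out, size_t max_out)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; ++i) {
        if ((cfsr & cfsr_flags[i].mask) != 0u && n < max_out) {
            out[n++] = cfsr_flags[i].name;
        }
    }
    return n;
}

/* Hypothetical helper: the basic exception frame is R0, R1, R2, R3, R12, LR,
 * PC, xPSR, so the address where the fault occurred is word 6. */
uint32_t stacked_pc(const uint32_t *frame)
{
    assert(frame != NULL);
    return frame[6];
}
```

On a target you would read the real CFSR and pass the handler the stack pointer that was active when the fault hit. Here any 32-bit value and an eight-word array will do for experimenting.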

Another area that I will gloss over, because it’s worthy of two or three entries of its own, is the set of debug features packed into these tiny marvels. Anyone who has spent time writing embedded code knows the value of a good debugger. What they may not realise is that even the best debugger in the world is constrained by the data that the microprocessor is prepared to share with it. Most Cortex-M devices provide JTAG access and, where they do, they always also include the slimmed down SWD (Serial Wire Debug) interface (think JTAG but only using 2 pins instead of 5!), which is ideal for resource constrained or I/O heavy systems. In addition, some members also include a program flow trace which can be read out of a separate port between 1 and 4 bits wide. These pins are included on the debug header. Think of what you could achieve if you knew where your code had been and how long it had been there! This program flow trace allows you to verify code coverage metrics to prove your code meets its coverage criteria, verify your test cases and generate reports for any applicable safety standards.

There are two trace emitters that can be found on Cortex-M devices: the ETM (Embedded Trace Macrocell), which can be thought of as the aforementioned program flow trace and requires the extra pins to read it out; and the ITM, which works with another cool toy called the DWT (Data Watchpoint and Trace unit) and will output its data either via the trace port or via the Serial Wire Viewer (SWV), a single extra pin alongside SWD. The ITM/DWT combo provides some very comprehensive capabilities that put much larger devices to shame:


· Supports breakpoints on data accesses: break on a read or a write, optionally only when a particular value is involved.



· Emits a trace packet when a read or a write to a certain address occurs. This is an extension of the breakpoint logic in the point above but allows you to measure task/thread switches and interrogate data values being read from external devices non-intrusively.



· The address of the Program Counter can be emitted every <n> clock cycles where n is a power of 2 between 64 and 32768. Even without a full trace, this gives you an idea of where your program has been.



· A trace packet can be generated on each and every interrupt entry and exit. You can profile your interrupt service routines: how often do they get called, how are they nesting, how long do they take to execute.



· Software Generated Trace via ITM (Instrumentation Trace Macrocell). Developers can emit trace packets under software control to their debug tool. There are 32 channels available, which can be divided up into something meaningful for the application (one per task or subsystem, for example).



Here, I’ve only touched on some of the capabilities of these clever little cores. Hopefully, in future I can expand on some of the topics mentioned here. Let me know what you think.