PSoC 6 Deep Sleep Wakeup Time


In this article I build several different programs to measure the DeepSleep to Active wakeup time of the PSoC 6, which ranges from 15 µs to 120 µs.  I will discuss the data sheet maximum of 25 µs and the useful limit of about 60 µs.  I will also analyze the system issues that influence your wakeup time.

This article is part of the "PSoC 6 Low Power Techniques" Series which covers a range of tools you have to lower the power of your system.  The following articles are (or will be) available:

PSoC 6 Low Power
PSoC 6 & Using the MCWDT as a Deep Sleep Timer
PSoC 6 Deep Sleep Wakeup Time
PSoC 6 & FreeRTOS Tickless
Managing the PSoC 6 Clock Frequency
Using the PSoC 6 LDO and SIMO Buck Regulators for Low Power
Using the PSoC 6 Always on Backup Domain
PSoC 6 Turning off block of RAM

The Story

As I was working on implementing the FreeRTOS tickless mode I noticed (incorrectly) that the wakeup time from DeepSleep was around 5 ms.  When I saw this number I thought “wow, that is a long time”, then I looked at the data sheet and discovered that it was really “25 µs Guaranteed by design”.  Given that Cypress is very careful about our data sheet numbers I thought “wow… 25 µs is a long way from 5 ms.  Did we really make that bad an error in the data sheet?”

I decided to dig in to figure out what was really happening.  The real answer is that the wakeup time isn’t anywhere near 5ms. That turned out to be an IoT Expert Alan bug (shhhh don’t tell anyone), but it did turn into an interesting investigation of the PSoC 6 chip.

In order to measure the wakeup time I needed a way to measure the interval between a wakeup trigger and the system being awake.  The best way seemed to be to use one pin to trigger the wakeup and another pin to indicate that the system is awake, then measure the gap with an oscilloscope.  In the PSoC 6, in order to wake up the system you need to send an interrupt to the Wakeup Interrupt Controller, as shown in this picture from the TRM.  These interrupts also serve as interrupts for the ARM Cores.

The basic flow of all of these examples is:

  1. Enable interrupts on the input pin which is attached to SW2 aka P04
  2. Write a 0 to the P50/D0 output pin
  3. DeepSleep
  4. Write a 1 to the P50/D0 output pin
  5. Go back to 2

Here is a picture of my CY8CKIT-062-WiFi-BT development kit.  Notice that I soldered a wire to the Switch (SW2) which is connected to P04.  The green switch wire is barely attached because I didn’t want to delaminate the switch from the board.  The yellow wire is attached to P50/D0.

What follows is a discussion of 7 different configurations.  As much as possible I try to use the Cypress Hardware Abstraction Layer (HAL) but as I dig, I get down to writing registers directly.

  1. Basic Pin Event and HAL Write
  2. Register Custom ISR Instead of HAL ISR
  3. Disable ARM Interrupts (no ISR)
  4. Write the Output Pin Register Directly (no HAL)
  5. Try Different Clock Frequencies
  6. Modify the Cypress PDL Function Cy_SysPm_EnterDeepSleep
  7. Write the ARM DeepSleep Register Directly and Call __WFI

Basic Pin Event and HAL Write

I started with this very simple example.  The steps are:

  • Use the HAL to enable two output pins, one attached to the LED and one attached to the oscilloscope.
  • Use the HAL to configure the Switch as an input and then enable interrupts on that switch.

Go into the main loop and:

  • Write the LED to On (aka 0) to indicate DeepSleep
  • Write the D0 to 0
  • DeepSleep
  • Write the D0 to 1
  • Write the LED to Off (to indicate Active)
  • Do a little delay… then do it all over again

When I measure this on the oscilloscope I get 57 µs.

Register Custom ISR Instead of HAL ISR

Well, 57 µs is definitely more than 25 µs (though it is way way less than 5 ms).  I wondered why it isn’t meeting spec.  And I thought, maybe it is burning time in the HAL pin interrupt service routine.  So, I decided to attach my own ISR.

First there is an interrupt service routine function called “buttonHandler” which sets the D0 pin to 1, then clears the port interrupt.

In the main function, instead of enabling the event, I set up the interrupt directly by:

  • Configuring the Port for interrupts
  • Setting the edge to falling (because there is a resistive pullup on the switch)
  • Loading my own ISR
  • Enabling the interrupt

Then in the main loop I remove the D0 pin write to 1.

When I measure this, I find that it is 54 µs instead of 57 µs (notice I left the cursors from the previous measurement).

Disable ARM Interrupts (no ISR)

Then I think maybe the problem is the jump to the ISR.  Inside the ARM there are two enable controls over interrupts:

  1. In the NVIC
  2. A global ARM interrupt control

So, I use the cyhal_gpio_enable_event to enable the NVIC.  Then I use the CMSIS function __disable_irq to turn off the ARM interrupts.
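The two levels of control can be pictured with a small model.  This is only a sketch to show the logic, not code for the chip: the struct and function names are made up for illustration, and it ignores interrupt priorities.  It relies on the Cortex-M behavior that WFI returns when an enabled interrupt goes pending even while the global mask is set, which is what lets the pin wake the chip without ever running an ISR.

```c
#include <stdbool.h>

/* Simplified, hypothetical model of the two interrupt gates on a Cortex-M core. */
typedef struct {
    bool nvic_enabled;  /* per-interrupt enable bit in the NVIC */
    bool primask_set;   /* global mask, set by __disable_irq() */
    bool pending;       /* the pin interrupt has fired */
} irq_model_t;

/* WFI returns once an NVIC-enabled interrupt is pending,
 * regardless of the global mask. */
static bool wfi_wakes(const irq_model_t *m)
{
    return m->nvic_enabled && m->pending;
}

/* The handler is entered only if the global mask is clear as well. */
static bool isr_runs(const irq_model_t *m)
{
    return wfi_wakes(m) && !m->primask_set;
}
```

With the NVIC enabled and the global mask set, wfi_wakes() is true and isr_runs() is false, which is the configuration used in this experiment.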

Now it is 53 µs, or basically the same as before.  So this doesn’t explain the missing 30 µs (or whatever is required to get below the data sheet spec).

Write the Output Pin Register Directly (no HAL)

Then I think to myself, maybe the HAL functions are slow.  Look at the pin write function:

When you look at the function you see that it is really a MACRO for an inline call to the PDL function

The inline PDL function turns a pin number into a Port, Pin combination with a call to some other PDL functions.

Those functions basically lookup the bit mask and base address of the Port,Pin.  Plus they have some error checking.

Then they call this macro:

Which is just a direct register write.

So I change over my program to write directly to the port register and it makes ZERO difference.  On the range where I can see the pulse between the interrupt and the pin write, the difference is too small to register.
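To make the point concrete, here is a rough sketch of what a direct pin write boils down to.  It is not the PDL source: the function name is made up for illustration, and the register is passed in as a plain pointer so the logic can be tried on a PC.  On the PSoC 6 the pointer would be the output register of the port that holds D0.

```c
#include <stdint.h>

/* Hypothetical helper: drive one pin of a GPIO port by touching only its
 * bit in the port's output data register. */
static inline void port_write_pin(volatile uint32_t *out_reg,
                                  uint32_t pin, uint32_t value)
{
    uint32_t mask = 1u << pin;      /* one bit per pin in the port */

    if (value != 0u) {
        *out_reg |= mask;           /* drive the pin high */
    } else {
        *out_reg &= ~mask;          /* drive the pin low */
    }
}
```

A handful of instructions like these take well under a microsecond at 100 MHz, which fits with the measurement not moving.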

Try Different Clock Frequencies

The next thing that I wonder is if the CPU frequency matters.  On my board there are three possible sources for the CM4 clock:

  • The 8MHz IMO
  • The FLL (also known as CLK_PATH0)
  • The PLL (also known as CLK_PATH1)

To try the different possibilities, I start by selecting CLK_PATH1 (the PLL) as the source clock for CLK_HF0

Then configure the PLL to 100 MHz

Then I tell the “PATH_MUX1” to use the IMO

Then I start running tests.  Here is the table of results for a bunch of different combinations.

CPU Frequency   PLL       FLL       IMO
8 MHz           -         -         119 µs
12.5 MHz        102 µs    -         -
25 MHz          73 µs     102 µs    -
50 MHz          60 µs     102 µs    -
100 MHz         53 µs     102 µs    -
150 MHz         53 µs     -         -

OK.  The “wakeup” time seems to depend on the clock source and frequency.  First, notice that if you use the PLL, it typically takes 16 µs to lock … and it could take as much as 35 µs.  That explains part of the difference.

The FLL consistently takes 7.uS to lock.  In fact that is the main reason it exists on this chip.

We know that the FLL and PLL explain some of the difference in the startup time.  But where is the rest?
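A quick back-of-the-envelope check shows why the lock time alone is not the whole story.  This sketch assumes the 53 µs measured at 100 MHz was taken on the PLL and that the typical 16 µs lock time adds directly to the wakeup path.  Both are simplifications, and the function names are just for illustration.

```c
#include <stdint.h>

/* Values quoted in the text, in microseconds. */
#define MEASURED_WAKEUP_US   53u  /* measured wakeup at 100 MHz */
#define PLL_LOCK_TYPICAL_US  16u  /* typical PLL lock time */
#define DATASHEET_MAX_US     25u  /* data sheet wakeup maximum */

/* Wakeup time left over once the clock lock time is removed. */
static uint32_t remaining_after_lock_us(uint32_t measured_us, uint32_t lock_us)
{
    return (measured_us > lock_us) ? (measured_us - lock_us) : 0u;
}

/* How far a wakeup time sits above the data sheet maximum (0 if within it). */
static uint32_t excess_over_spec_us(uint32_t wakeup_us)
{
    return (wakeup_us > DATASHEET_MAX_US) ? (wakeup_us - DATASHEET_MAX_US) : 0u;
}
```

Under those assumptions, 53 µs minus 16 µs leaves 37 µs, which is still 12 µs over the 25 µs figure.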

Modify the Cypress PDL Function Cy_SysPm_EnterDeepSleep

The answer is that there are a bunch of things that happen AFTER the chip wakes up inside of Cy_SysPm_EnterDeepSleep.  These things are part of the housekeeping that it takes to make everything really work.

First, look at the cyhal_system_deepsleep() function, which is really just a #define for the PDL DeepSleep function.

If you dig through that function you will find yourself down in another function named “EnterDeepSleepRam”.  If you look on line 2965 you will find that the code sets the bit in the ARM System Control Register (SCR) which tells it to DeepSleep.  Then on line 2969 it executes the ARM assembly language instruction “WFI”, also known as Wait For Interrupt.  The WFI puts the CPU to sleep, or deep sleep, depending on a bit in the SCR.  On lines 2970, 2993 and 3030 you can see that I instrumented the code to toggle the D0 GPIO so I can measure time.

Here is the SCR documentation, where you can see the “SLEEPDEEP” bit, bit[2].

And later in the documentation the Wait For Interrupt (WFI) instruction.

When I ran the code I got:

  • From the falling edge of the switch input to the rising edge is 17.12 µs (deep sleep to first instruction on line 2970)
  • From the rising to falling edge is 6.25 µs (line 2970 to line 2993)
  • From the falling to rising edge is 30 µs (line 2993 to 3030)
  • From the rising to falling edge is 3.5 µs (line 3030 to the first line in the main function)
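Adding up the four segments is a useful sanity check.  The sketch below just sums the numbers from the list above.  The array and function names are illustrative.

```c
#include <stddef.h>

/* Segment durations read off the scope trace, in microseconds. */
static const double wake_segments_us[] = { 17.12, 6.25, 30.0, 3.5 };

/* Total time from the wakeup trigger back to the application code. */
static double total_wakeup_us(const double *segments, size_t count)
{
    double total = 0.0;

    for (size_t i = 0; i < count; i++) {
        total += segments[i];
    }
    return total;
}
```

The total comes to roughly 56.9 µs, in the same range as the 53 µs to 57 µs end-to-end numbers measured earlier.  Only the first segment, 17.12 µs, elapses before the first instruction runs, and that part is under the 25 µs data sheet figure.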

Here is the scope trace.

What does it all mean?  There are basically three things going on from the Wakeup until the application developer has control.

  1. Cypress implementation of workarounds for chip issues
  2. Synchronization between the two MCUs in the PSoC 6
  3. Unwinding the DeepSleep preparations (user callbacks)

Write the ARM DeepSleep Register Directly and Call __WFI

So this gives us a hint.  We could implement just the DeepSleep instructions.  If you did, the code would look like this:
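A minimal sketch of that idea is below.  On the chip the register would be the CMSIS SCB->SCR and the sleep instruction would be the CMSIS __WFI() intrinsic.  Here the register is passed in as a pointer so the bit manipulation can be compiled and checked off target.  The helper names and the mask macro are made up for illustration and are not taken from the PDL.

```c
#include <stdint.h>

/* SLEEPDEEP is bit[2] of the ARM System Control Register (SCR). */
#define SCR_SLEEPDEEP_BIT  (1u << 2)

/* Select deep sleep for the next WFI. */
static inline void select_deep_sleep(volatile uint32_t *scr)
{
    *scr |= SCR_SLEEPDEEP_BIT;
}

/* Select normal sleep for the next WFI. */
static inline void select_normal_sleep(volatile uint32_t *scr)
{
    *scr &= ~SCR_SLEEPDEEP_BIT;
}

/* Assumed usage on the target, with the CMSIS headers included:
 *     select_deep_sleep(&SCB->SCR);
 *     __WFI();    // the core stops here until the wakeup interrupt
 */
```

Note that this skips everything else Cy_SysPm_EnterDeepSleep does before and after the WFI.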

Well, there it is: 12 µs.  That is for sure below the data sheet limit of 25 µs.

But is it a good idea?  No, almost certainly not.  If you don’t call the Cypress functions you will:

  1. Not be protected from the dual core interactions
  2. Not call our functions to work around silicon bugs
  3. Potentially not manage the clocks correctly

So unless you have some really compelling reason, you should just use the Cypress functions and accept the 50ish µs to get back to Active.