Debugging ARM without a Debugger 2: Abort Handlers

This is my second post in the series Debugging ARM without a Debugger.

This is an excerpt from my debugging techniques document for Real-time Programming. These techniques are written in the context of writing a QNX-like real-time microkernel and a model train controller on an ARMv4 (ARM920T, Technologic TS-7200). The source code is located here. My teammate (Pavel Bakhilau) and I are the authors of the code.



It is useful to have a simple abort handler early on, before working on anything complex such as a context switch. The default abort handlers that come with the bootloader spew out minimal information for gdb if you are lucky, and often they just hang with no message. (In fact, I am now very grateful that I am able to see kernel panic messages at all when things are gravely wrong with my computer.) By installing an abort handler, you will be able to see what went wrong in case the asserts were not good enough to catch the problem earlier.

Installation

There are three interrupt vectors that need to be intercepted: undefined instruction (0x4), prefetch abort (0xc) and data abort (0x10). We can reuse one abort handler because the abort type can be read from the processor mode bits of the cpsr. The one exception is that instruction fetch aborts and data fetch aborts share the same processor mode. We can work around this by passing a flag to the C abort handler. The following is sample code:

// c prototype of the abort handler
void handle_abort(int fp, int dataabort);

// the abort handler in assembly that calls the C handler
.global asm_handle_dabort
asm_handle_dabort:
    mov r1, #1
    b abort

.global asm_handle_abort
asm_handle_abort:
    mov r1, #0
    abort:
    ldr sp, =0x2000000
    mov r0, fp
    bl handle_abort
    dead:
    b dead

Because ARM has a separate set of banked registers for abort modes, the stack pointer is uninitialized. Since I wanted to use a C handler to print out messages, I need to set up a stack. In this code, I manually set the stack pointer to be the end of the physical memory (our board had 32MB RAM in total so 0x2000000 is the end of the memory). For convenience, I also pass the current frame pointer in case I want to examine the stack of the abort-causing code.
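The sample above does not show how the three vectors get pointed at the handlers. Here is a minimal sketch of one way to do it, assuming the vector table is writable memory and the handlers are within range of an ARM branch instruction. The names encode_branch, branch_in_range and install_abort_vectors are hypothetical and are not taken from our source; a bootloader may already place an indirect jump in these slots, in which case a different installation method applies.

```c
#include <stdint.h>

/* Encode an ARM "B <to>" instruction located at address `from`.
   Condition AL (0xE), opcode 0b1010, then a signed 24-bit word offset.
   The offset is relative to `from` + 8 because of the pipeline. */
uint32_t encode_branch(uint32_t from, uint32_t to)
{
    uint32_t offset = (to - from - 8u) >> 2;  /* wraps for backward branches */
    return 0xEA000000u | (offset & 0x00FFFFFFu);
}

/* Returns 1 if `to` is word-aligned relative to `from` and within the
   roughly +/-32MB reach of a branch at `from`, 0 otherwise. */
int branch_in_range(uint32_t from, uint32_t to)
{
    int64_t delta = (int64_t)to - (int64_t)from - 8;
    return delta >= -(1 << 25) && delta < (1 << 25) && (delta & 3) == 0;
}

/* Write branch instructions into the three vector slots.
   `table` points at the vector table, which lives at address `table_addr`
   on the target (assumed to be 0 in this post). */
void install_abort_vectors(volatile uint32_t *table, uint32_t table_addr,
                           uint32_t abort_handler, uint32_t dabort_handler)
{
    table[0x04 / 4] = encode_branch(table_addr + 0x04, abort_handler);  /* undefined instruction */
    table[0x0c / 4] = encode_branch(table_addr + 0x0c, abort_handler);  /* prefetch abort */
    table[0x10 / 4] = encode_branch(table_addr + 0x10, dabort_handler); /* data abort */
}
```

On the target, the call would look something like install_abort_vectors((volatile uint32_t *)0x0, 0x0, (uint32_t)asm_handle_abort, (uint32_t)asm_handle_dabort). If the caches are enabled, the freshly written vectors may also need cache maintenance before they take effect.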

When dealing with register values directly in C, it is convenient to have the following macro to read register values:

#define READ_REGISTER(var) \
__asm volatile("mov %[" #var "], " #var "\n\t" : [var] "=r" (var))
// usage: int lr; READ_REGISTER(lr);
#define READ_CPSR(var) \
__asm volatile("mrs %[mode], cpsr" "\n\t" "and %[mode], %[mode], #0x1f" "\n\t" \
: [mode] "=r" (var))
// usage: int cpsr; READ_CPSR(cpsr);

In the C abort handler, by reading the cpsr, you should be able to figure out the current mode. Refer to ARM Reference Manual section A2.2.

The following is a brief summary of the abort environment and its interpretation. The precise information can be found in the reference manual chapter A2. You should read the manual to understand the process better.

An important thing to remember is that you should do your best to ensure that your abort handler does not cause another abort inside. Again, be very conservative when dereferencing pointers.

Interpretation

Read all the values from the registers first, and then print. Otherwise, there is a chance some registers might get overwritten.

cpsr

dataabort refers to the second parameter passed into the C abort handler.

The lower 5 bits of cpsr    Interpretation
0x13                        You are in svc mode. It probably means your abort handler caused another abort inside. Fix it.
0x17 (dataabort = 0)        Instruction fetch abort
0x17 (dataabort = 1)        Data fetch abort
0x1B                        Undefined instruction
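The table above can be turned into a small classifier. This is only a sketch; the enum and function names (abort_kind, classify_abort, abort_kind_name) are made up for illustration, and the cpsr value is assumed to have been read with something like READ_CPSR before anything else ran.

```c
#include <stdint.h>

/* Hypothetical classification of what brought us into the handler. */
enum abort_kind {
    ABORT_UNKNOWN,
    ABORT_NESTED_SVC,   /* 0x13: svc mode, the handler itself probably aborted */
    ABORT_PREFETCH,     /* 0x17 with dataabort == 0 */
    ABORT_DATA,         /* 0x17 with dataabort == 1 */
    ABORT_UNDEFINED     /* 0x1B */
};

/* Classify using the low 5 bits of the cpsr and the dataabort flag
   passed in from the assembly stub. */
enum abort_kind classify_abort(uint32_t cpsr, int dataabort)
{
    switch (cpsr & 0x1f) {
    case 0x13: return ABORT_NESTED_SVC;
    case 0x17: return dataabort ? ABORT_DATA : ABORT_PREFETCH;
    case 0x1b: return ABORT_UNDEFINED;
    default:   return ABORT_UNKNOWN;
    }
}

/* Human-readable name for printing from the handler. */
const char *abort_kind_name(enum abort_kind kind)
{
    switch (kind) {
    case ABORT_NESTED_SVC: return "svc mode (abort inside the abort handler?)";
    case ABORT_PREFETCH:   return "instruction fetch abort";
    case ABORT_DATA:       return "data fetch abort";
    case ABORT_UNDEFINED:  return "undefined instruction";
    default:               return "unknown mode";
    }
}
```

Returning an enum instead of printing directly keeps the decoding free of pointer dereferences, which fits the advice about keeping the handler conservative.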

lr

The link register normally contains the address of the instruction after the one that called the current function. In the abort handler, it is set relative to the instruction that caused the exception instead.

Current mode               Interpretation
Data fetch abort           The abort was caused by the instruction at lr - 8
Instruction fetch abort    The abort was caused by the instruction at lr - 4
Undefined instruction      The abort was caused by the instruction at lr - 4 (in ARM state, lr holds the address of the next instruction)
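As a sketch, the lr adjustment can be wrapped in one helper. The names abort_source and faulting_instruction are hypothetical. The offsets assume ARM-state (not Thumb) code and follow the exception entry description in the ARM Architecture Reference Manual, where lr for an undefined instruction holds the address of the instruction after the offending one.

```c
#include <stdint.h>

/* Hypothetical tag for the exception that was taken. */
enum abort_source {
    SRC_DATA_ABORT,
    SRC_PREFETCH_ABORT,
    SRC_UNDEFINED
};

/* Address of the instruction that caused the exception, given the banked lr.
   Assumes the faulting code was running in ARM state (4-byte instructions). */
uint32_t faulting_instruction(enum abort_source src, uint32_t lr)
{
    switch (src) {
    case SRC_DATA_ABORT:     return lr - 8;
    case SRC_PREFETCH_ABORT: return lr - 4;
    case SRC_UNDEFINED:      return lr - 4;
    }
    return lr; /* not reached for valid inputs */
}
```

Printing this address next to the raw lr makes it easy to look up the offending instruction in the disassembly or linker map.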

Fault type (in case of data/instr. fetch abort)

Read the fault type using the following code:

volatile unsigned int faulttype;
__asm volatile ("mrc p15, 0, %[ft], c5, c0, 0\n\t" : [ft] "=r" (faulttype));
faulttype &= 0xf;

Fault type value           Interpretation
(faulttype >> 0x2) == 0    misaligned memory access
0x5                        translation
0x8                        external abort on noncacheable
0x9                        domain
0xD                        permission
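A sketch of decoding the fault type into a printable string, following only the values listed in the table above. The function name fault_type_name is hypothetical, and any value the table does not list is reported as "other" rather than guessed at.

```c
#include <stdint.h>

/* Map the low 4 bits of the fault status value to a description.
   Only the encodings listed in the table above are handled. */
const char *fault_type_name(uint32_t fsr)
{
    uint32_t faulttype = fsr & 0xf;

    if ((faulttype >> 2) == 0)
        return "misaligned memory access";

    switch (faulttype) {
    case 0x5: return "translation";
    case 0x8: return "external abort on noncacheable";
    case 0x9: return "domain";
    case 0xd: return "permission";
    default:  return "other (check the reference manual)";
    }
}
```

In the handler, this would be called with the value read by the mrc instruction shown earlier, for example fault_type_name(faulttype).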

To see the big picture of how fault checking works (other than misaligned memory access), you are advised to read section 3.7 of the ARM920T Technical Reference Manual. In short, unless you are making use of memory protection, you will never get domain and permission faults.

Data fault address (only applicable to a data abort)

This is the address the code tried to access, which caused the data fetch abort. Read it using the following code:

volatile unsigned int datafaultaddr;
__asm volatile ("mrc p15, 0, %[dfa], c6, c0, 0\n\t" : [dfa] "=r" (datafaultaddr));

Our actual abort handling code is located here.

Summary

It is very convenient to have a bullet-proof abort handler. It really gives you a lot more information about the problem than a hang. As well, don’t forget that most DRAM content is not erased after a hard reset, so you can use RedBoot’s dump (x) command to examine the memory, if really needed. With some effort, one can also set up the MMU to implement a very simple write-protection of the code region. Such protection could be useful to prevent the most insidious kind of bugs from occurring (Luckily, we did not have to deal with such bugs). 

Categorized under: programming / cs452

Published: 2012-01-27T05:18:00.000
Last modified: 2019-02-18T19:56:27.167708
Permalink

Debugging ARM without a Debugger 1: Use of Asserts

This is my first post in the series Debugging ARM without a Debugger.

This is an excerpt from my debugging techniques document for Real-time Programming. These techniques are written in the context of writing a QNX-like real-time microkernel and a model train controller on an ARMv4 (ARM920T, Technologic TS-7200). The source code is located here. My teammate (Pavel Bakhilau) and I are the authors of the code.


Failing fast is an extremely useful property when programming in C. For example, problems with pointers are much easier to debug if you know exactly when an invalid pointer value is passed into a function. Here are a few tips for asserting effectively:

There is no such thing as too many asserts.

CPU power used for asserts will almost never cause a critical performance issue [in this course]. You can disable them when you know your code is perfect. Verify pointers at every pointer dereference.

Assert pointers more aggressively.

Do not just check for NULLs. We know more about the pointer addresses. We know that the pointer address is limited by the size of the memory. As well, from the linker script, we can even deduce more information. For example, we know that normally, we would not want to dereference anything below the address 0x218000 because that is where the kernel is loaded. Similarly, we can figure out what memory region is text and data.

Remove all uncertainties.

Turn off interrupts as soon as possible in the assert macro. When things go wrong, you want to stop the program execution (and the trains) right away. If you do not turn off interrupts, a context switch to another task might occur and you might never come back to stop and display what went wrong.

Print as much information as possible.

Make an assert macro that resembles printf and print as much contextual information as possible. When you have no debugger, rebooting and reproducing can be really time-consuming. 1.5 months is a very short time to build an operating system from scratch so use it wisely.

e.g. ASSERT(condition, "oops! var1:%d, var2:%x, var3:%s", var1, var2, var3);
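The more aggressive pointer checks suggested in the tips above can be sketched as a small predicate to use inside an assert. The name is_plausible_pointer is hypothetical. The lower bound is the 0x218000 kernel load address mentioned above; the upper bound assumes 32MB of RAM starting at address 0, so adjust both to match your linker script and board.

```c
#include <stdint.h>

#define KERNEL_LOAD_ADDR 0x218000u   /* nothing below this should be dereferenced */
#define RAM_END          0x2000000u  /* assumption: 32MB of RAM starting at 0 */

/* Returns 1 if an object of `size` bytes at `addr` lies entirely inside
   [KERNEL_LOAD_ADDR, RAM_END), 0 otherwise. A NULL pointer fails the
   lower-bound check, so no separate NULL test is needed. */
int is_plausible_pointer(uintptr_t addr, uintptr_t size)
{
    if (addr < KERNEL_LOAD_ADDR)
        return 0;
    if (addr >= RAM_END)
        return 0;
    if (size > RAM_END - addr)   /* object would run past the end of RAM */
        return 0;
    return 1;
}
```

With a printf-like assert macro, a call might look like ASSERT(is_plausible_pointer((uintptr_t)p, sizeof *p), "bad pointer p:%x", (unsigned int)p). Separate ranges for text and data, taken from the linker script, would make the check stricter still.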

Example

Here’s a short snippet of our ASSERT macro. It has evolved over 3 months and it looks really dirty, but it works. (source)

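/* Helper definitions this excerpt relies on but does not show (assumed here). */
typedef unsigned int uint;
#define STRINGIFY(x) #x
#define TOSTRING(x) STRINGIFY(x)

typedef uint volatile * volatile vmemptr;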

#define VMEM(x) (*(vmemptr)(x))
void bwprintf(int channel, char *fmt, ...);
#define READ_REGISTER(var) __asm volatile("mov %[" TOSTRING(var) "], " TOSTRING(var) "\n\t" : [var] "=r" (var))
#define READ_CPSR(var) __asm("mrs %[mode], cpsr" "\n\t" "and %[mode], %[mode], #0x1f" "\n\t" : [mode] "=r" (var))
void print_stack_trace(uint fp, int clearscreen);
void td_print_crash_dump();
int MyTid();

#if ASSERT_ENABLED
#define ASSERT(X, ...) { \
        if (!(X)) { \
                VMEM(VIC1 + INTENCLR_OFFSET) = ~0; /* turn off the vectored interrupt controllers */ \
                VMEM(VIC2 + INTENCLR_OFFSET) = ~0; \
                int cpsr; READ_CPSR(cpsr); \
                int inusermode = ((cpsr & 0x1f) == 0x10); int tid = inusermode ? MyTid() : -1; \
                bwprintf(0, "%c", 0x61); /* emergency shutdown of the train */ \
                int fp, lr, pc; READ_REGISTER(fp); READ_REGISTER(lr); READ_REGISTER(pc); \
                bwprintf(1, "\x1B[1;1H" "\x1B[1K"); \
                bwprintf(1, "assertion failed in file " __FILE__ " line:" TOSTRING(__LINE__) " lr: %x pc: %x, tid: %d" CRLF, lr, pc, tid); \
                bwprintf(1, "[%s] ", __func__); \
                bwprintf(1, __VA_ARGS__); \
                bwprintf(1, "\n"); /* if in usermode ask kernel for crashdump, otherwise print it directly */ \
                if (inusermode) { __asm("swi 12\n\t");} else { td_print_crash_dump(); } \
                bwprintf(1, "\x1B[1K"); \
                print_stack_trace(fp, 0); \
                die(); \
        } \
}
#else
#define ASSERT(X, ...)
#endif

That’s it for today.

Categorized under: programming / cs452

Published: 2012-01-25T05:04:00.000
Last modified: 2019-02-18T19:56:27.167708
Permalink

Working wpa_supplicant.conf configuration for the network uw-secure at UWaterloo for Xperia X10 (1.6)

While Sony Ericsson has promised to update the X10 to a moderately recent version (2.1) of the Android operating system by Q4 2010, those of us who are stuck with Android 1.6 cannot normally connect to most wireless networks that use WPA-EAP, including uw-secure at the University of Waterloo. Apparently, the reason is that while Android 1.6 does support WPA-EAP, there is no user interface (!) for editing these network configurations.

Fortunately, X10 (including X10a sold in Canada by Rogers) has been rooted very recently by the people at xda-developers.com. You can follow the guidelines here (For X10a users, it is important to install stuff in the post #5 as well).

After rooting the phone, you can edit the file wpa_supplicant.conf in the /data/misc/wifi directory. I made a copy before making changes, just in case. It is important that the owner and permissions of the file remain exactly the same (owner: system, group: wifi, permissions: 660).

Using your favourite method, append the following to the file:

network={
    ssid="uw-secure"
    scan_ssid=1
    proto=WPA
    key_mgmt=WPA-EAP
    eap=PEAP
    identity="UWDirID"
    password="UWDirPASSWORD"
    phase1="peaplabel=0"
    phase2="auth=MSCHAPV2"
}

I assembled the configuration from this post on the Arch Linux Forum by vogt. The two modifications I made are that I removed the line specifying ca_cert and added the line proto=WPA. For whatever reason, the phone will ignore the configuration if there is no proto=WPA line.

Categorized under: techtips / android

Published: 2010-07-09T02:45:00.000
Last modified: 2015-08-31T03:41:54.657861
Permalink

About Me

My name is Woongbin Kang.

I graduated from the University of Waterloo with a Bachelor of Computer Science with the Software Engineering Option and the Cognitive Science Option (here is the list of courses that I finished at the University of Waterloo during my undergrad career).

The views expressed on this site are my own and do not reflect those of my employer or its clients.

Contact

LinkedIn: profile
Twitter: @selasonic
Stack Exchange: profile for wbkang

Categorized under: uncategorized

Published: 2010-06-27T04:00:43.000
Last modified: 2017-09-27T11:07:35.001856
Permalink