Contact Info

(for those who care)

Instant Gratification   



Thu, 10 Mar 2011

Debugging PHP segfault backtraces with `gdb`

Hello, and welcome. I am going to assume you have gone through the normal PHP documentation about how to get core files, load them into gdb and run the bt (backtrace) command.

In my particular case at work, I ran into a PHP segfault in in the oci_execute function. It’s a C function / module for querying the oracle databases which evidently has some sort of crash bug in certain circumstances.

This was clear from the multiple backtraces we had captured, consistently OCIStmtExecute exposed in PHP via oci_execute().

(gdb) bt
#0  0x00000000 in ?? ()
#1  0xf6f83cd6 in ttcdrv () from /lib/libclntsh.so.10.1
#2  0xf6e25461 in nioqwa () from /lib/libclntsh.so.10.1
#3  0xf6c97032 in upirtrc () from /lib/libclntsh.so.10.1
#4  0xf6c2dce9 in kpurcsc () from /lib/libclntsh.so.10.1
#5  0xf6be9fba in kpuexecv8 () from /lib/libclntsh.so.10.1
#6  0xf6bec360 in kpuexec () from /lib/libclntsh.so.10.1
#7  0xf6c60b3a in OCIStmtExecute () from /lib/libclntsh.so.10.1
#8  0xf7609199 in php_oci_statement_execute
    (statement=0xf1446534, mode=138243868) at /...442
#9  0xf760fdc5 in zif_oci_execute (ht=1, return_value=0xf1325ab4,
    return_value_ptr=0x0, this_ptr=0x0, return_value_used=1) at /...1302
#10 0x081ab5c1 in zend_do_fcall_common_helper_SPEC
    (execute_data=0xffff60d0) at /...200
#11 0x081aad69 in execute (op_array=0xf61a00b8) at /...92
#12 0x081aafb5 in zend_do_fcall_common_helper_SPEC
    (execute_data=0xffff6500) at /...234
#13 0x081aad69 in execute (op_array=0xf7b9378c) at /...92
#14 0x081aafb5 in zend_do_fcall_common_helper_SPEC
    (execute_data=0xffff6650) at /...234
#15 0x081aad69 in execute (op_array=0xf7ba6a9c) at /...92
#16 0x081aafb5 in zend_do_fcall_common_helper_SPEC
    (execute_data=0xffff6cd0) at /...234
#17 0x081aad69 in execute (op_array=0xf5c2db80) at /...92
#18 0x081aafb5 in zend_do_fcall_common_helper_SPEC
    (execute_data=0xffff7000) at /...234
#19 0x081aad69 in execute (op_array=0xf5b69434) at /...92
#20 0x081aafb5 in zend_do_fcall_common_helper_SPEC
    (execute_data=0xffff99c0) at /...234
#21 0x081aad69 in execute (op_array=0xf5ad5684) at /...92
#22 0x081aafb5 in zend_do_fcall_common_helper_SPEC
    (execute_data=0xffff9d20) at /...234
#23 0x081aad69 in execute (op_array=0xf5ad46fc) at /...92
#24 0x081aafb5 in zend_do_fcall_common_helper_SPEC
    (execute_data=0xffffb090) at /...234
#25 0x081aad69 in execute (op_array=0xf7b72f1c) at /...92
#26 0x08191311 in zend_execute_scripts
    (type=8, retval=0x0, file_count=3) at /...1135
#27 0x08157b1a in php_execute_script (primary_file=0xffffd640) at /...2064
#28 0x0820e6b4 in main (argc=18, argv=0xffffd724) at /...1176

Looking at the PHP doc’s there were a few more items to try.

(gdb) print (char *)(executor_globals.function_state_ptr->function)->common.function_name
$1 = 0x218bac "oci_execute"

(gdb) print (char *)executor_globals.active_op_array->function_name
$2 = 0xb7beaa5c "getData"

(gdb) print (char *)executor_globals.active_op_array->filename
$3 = 0xb7b4e4c4 "DatabaseConnection.php"

Great… still oci_execute, and yep, it’s called via the database wrapper. But what SQL is being executed? What is the function that is calling the generic getData() function?

Obviously there are other “execute” items on the stack, but how to get to them, and what to do with them once you’re there?

With some help from the wonderful #gdb ambassador “xdje” on irc.freenode.net, I was able to get a crash course on using gdb in a slightly more advanced way for debugging PHP core files.

You already know bt which will print you out the backtrace. After that I started with help but I didn’t know how to describe what I was looking for so that was basically a dead end.

Through some googling, I learned a little about the frame #, up, and down commands which will move your context through the backtrace, but I couldn’t understand why even when I moved through the stack the print ... commands from the PHP web page always printed out the same data.

Then I learned about the info ... subsets. Ok, now we are getting somewhere. In the example above, I knew there was important stuff located in frames 13, 15, 17, etc. which I hoped contained the PHP function names that were responsible for each execute... line. From a “C” perspective, the function question is always execute() but I needed to know what was going through PHP’s tiny little brain.

help info, info locals looks promising.

(gdb) bt
....

(gdb) frame 13
#13 0x081aad69 in execute (op_array=0xb7bead08) at zend_vm_execute.h:92
92      in zend_vm_execute.h

(gdb) info frame
Stack level 13, frame at 0xbff17080:
 eip = 0x81aad69 in execute (zend_vm_execute.h:92); saved eip 0x1cbad4
 called by frame at 0xbff170d0, caller of frame at 0xbff16e10
 source language c.
 Arglist at 0xbff17078, args: op_array=0xb7bead08
 Locals at 0xbff17078, Previous frame's sp is 0xbff17080
 Saved registers:
  ebx at 0xbff17070, ebp at 0xbff17078, esi at 0xbff16e00,
  edi at 0xbff17074, eip at 0xbff1707c

(gdb) info locals
execute_data = {opline = 0xb7b55e88, function_state =
  {function_symbol_table = 0xb7150838, function = 0xa270c48,
  reserved = {0x8188816, 0xbff170c0, 0x0, 0xbff17068}}, 
  fbc = 0x0, op_array = 0xb7bead08, object = 0x0, Ts = 0xbff16e50,
  CVs = 0xbff16e20, original_in_execution = 1 '\001', symbol_table = 0xb714f98c, 
  prev_execute_data = 0xbff175e0, old_error_reporting = 0x0}

That looks promising, but I have no idea what I’m looking at or how to figure out what is hidden away in this core dump. At this point my IRC buddy on xdje jumped in to save the day.

Let’s go back to the original PHP print commands to understand them, but first a slight detour back to the “frame info” from above. It turns out that gdb accepts “expressions in the programming language of the core dump” (or something). It all of a sudden makes the following line very important:

(gdb) info frame
Stack level 13, frame at 0xbff17080:
 eip = 0x81aad69 in execute (zend_vm_execute.h:92); saved eip 0x1cbad4
 called by frame at 0xbff170d0, caller of frame at 0xbff16e10
 source language c.
 Arglist at 0xbff17078, args: op_array=0xb7bead08
 Locals at 0xbff17078, Previous frame's sp is 0xbff17080
 Saved registers:
  ebx at 0xbff17070, ebp at 0xbff17078, esi at 0xbff16e00,
  edi at 0xbff17074, eip at 0xbff1707c

…and helps to understand a bit the original print statement which is mostly c-style struct and pointer traversal.

(gdb) print (char *)executor_globals.active_op_array->function_name
      ^-- GDB command
            ^-- dereference / type
                     ^-- variable
                                     ^-- variable
                                                      ^-- variable

OK, so what is executor_globals? Turns out (don’t ask how just yet) that it is basically a global variable. So you go daisy-chaining down the line of function calls and data values and finally you can get to function_name. Dandy.

Now what else can we do with print? How can we print the function_name of the current position in the stack?

(gdb) info frame
Stack level 13, frame at 0xbff17080:
 eip = 0x81aad69 in execute (zend_vm_execute.h:92); saved eip 0x1cbad4
 called by frame at 0xbff170d0, caller of frame at 0xbff16e10
...this is important
 source language c. 
...aha, here is another op_array
 Arglist at 0xbff17078, args: op_array=0xb7bead08 
 Locals at 0xbff17078, Previous frame's sp is 0xbff17080
 Saved registers:
  ebx at 0xbff17070, ebp at 0xbff17078, esi at 0xbff16e00,
  edi at 0xbff17074, eip at 0xbff1707c

...hrm, not what I expected
(gdb) print 0xb7bead08->function_name 
Attempt to extract a component of a value that is not a structure pointer.

...maybe it needs that char * thing?
(gdb) print (char *)0xb7bead08->function_name 
Attempt to extract a component of a value that is not a structure pointer.

...oh, there are operator precedence rules at play here
(gdb) print ((char *)0xb7bead08)->function_name 
Attempt to extract a component of a value that is not a structure pointer.

...here is where IRC buddy saved the day
(gdb) info args 
op_array = (zend_op_array *) 0xb7bead08

...aha! you have to treat that memory address as a *type* and then you can get data off of it
(gdb) print ((zend_op_array *)0xb7bead08)->function_name 
$11 = 0xb7beaa5c "getData"

Ok, so now we’re getting somewhere. But guess what. There is an even simpler way of doing the above.

...d'oh! you can reference it just like a variable, nothing fancy required
(gdb) print op_array->function_name 
$11 = 0xb7beaa5c "getData"

…but how do I know that I can access function_name off of that op_array thing? More help from my IRC buddy exposes the ptype command:

...aha! this is like dir(...) in python
(gdb) ptype op_array 
type = struct _zend_op_array {
    zend_uchar type;
    char *function_name;
    zend_class_entry *scope;
    zend_uint fn_flags;
    union _zend_function *prototype;
    zend_uint num_args;
    zend_uint required_num_args;
    ...
} *

So now we’re really getting somewhere.

...backtrace
(gdb) bt 
...

...jump to where i want info
(gdb) frame 13 

...hrm, see the op_array variable
(gdb) info args  

...there's a function_name field, looks useful
(gdb) ptype op_array 

...this is what i'm looking for
(gdb) print op_array->function_name 
$13 = 0xb7beaa5c "getData"

...go to the next function call up the chain
(gdb) frame 15 

...print its name too
(gdb) print op_array->function_name 
$14 = 0xb7f6d248 "getMissingDates"

...i wonder if I can get the SQL for this as well
(gdb) frame 8 
#8  0x00210199 in php_oci_statement_execute (statement=0xb17ccfbc, mode=171218828)
    at oci8_statement.c:442
442     oci8_statement.c: No such file or directory.
        in oci8_statement.c

(gdb) info args
statement = (php_oci_statement *) 0xb17ccfbc
mode = 171218828

(gdb) ptype statement
type = struct {
    int id;
    int parent_stmtid;
    php_oci_connection *connection;
    sword errcode;
    OCIError *err;
    OCIStmt *stmt;
    char *last_query;
    long int last_query_len;
    HashTable *columns;
    HashTable *binds;
    HashTable *defines;
    int ncolumns;
    unsigned int executed : 1;
    unsigned int has_data : 1;
    ub2 stmttype;
} *
...hrm, last_query looks useful?
(gdb) print statement->last_query 
 print statement->last_query
$15 = 0xb1546574 "select ... from date_table where ..."

(gdb) print "bingo"
You can't do that without a process to debug.

Wow. Now that is pretty productive. I started with a PHP script that randomly segfaulted and dumped core. From the core file I can generate a “C” backtrace. Walking through the “execute” statements I can get at basically the “PHP” backtrace. And furthermore I can walk a bit more “forward” into “C-land” and get info from the C module itself (statement->last_query).

Messing with GDB was really quite rewarding and just so long as I post this guide to my blog, I won’t forget it next time I need to go debugging PHP segfaults.

21:55 CST | category / entries
permanent link | comments?

Like what you just read? Subscribe to a syndicated feed of my weblog, brought to you by the wonders of RSS.



Thanks for Visiting!