(by Disane)
Prologue
Hello and welcome to my brand new tutorial. I know you guys have been waiting for it to come out. As i mentioned on IRC, this tutorial is not talking about any practical hacking of the PS3, but what you can see here (and hopefully learn) can help you in achieving this goal. First of all, I'm assuming that you have some basic computer science knowledge and know a little bit of assembly. Knowing C is also very important in understanding the problem we're going to solve here. In order to try out the code and follow the tutorial, you are going to need a PS3 with Linux installed. You're also going to need the GNU tool-chain for the Cell Processor installed on your PS3 Linux. Without further a due let’s take a look at the documents we are going to need.
So, if you haven’t picked up PPU and SPU assembly yet, then you might want to check out the articles and tutorials I’m going to list here:
PPC32-64 assembly:
IBM Introduction to PowerPC assembly: http://www.ibm.com/developerworks/library/l-ppc/
MacTech PPC assembly introduction: http://www.mactech.com/articles/develop/issue_21/21balance.html
IBM’s article series (1-4) on PowerPC assembly: http://www.ibm.com/developerworks/views/linux/libraryview.jsp?topic_by=All+topics+and+related+products&sort_order=asc&lcl_sort_order=asc&search_by=Assembly+language+for+Power+Architecture&search_flag=true&type_by=Articles&show_abstract=true&sort_by=Relevance&end_no=100&show_all=false (recommended)
Beginners guide to PPC32 assembly: http://www.lightsoft.co.uk/Fantasm/Beginners/begin1.html (optional)
MacTech PPC Function Calls: http://www.mactech.com/articles/mactech/Vol.10/10.06/FunctionCalls/
PPC ABI: http://wall.riscom.net/books/proc/ppc/cwg/a_abi.html
IBM developers guide to the PPC architecture: http://www.ibm.com/developerworks/linux/library/l-powarch/?ca=dgr-lnxw961PowPCGuide
PPC compiler writers guide (the ultimate doc for people who want to reverse engineer PPC code): https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF7785256996007558C6/$file/cwg.pdf
Official PPC32-64 Docs we are going to use in this tutorial:
PPC32: https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF778525699600719DF2/$file/6xx_pem.pdf
PPC64: https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/F7E732FF811F783187256FDD004D3797/$file/pem_64bit_v3.0.2005jul15.pdf
Cell Processor SPU assembly:
IBM SPU assembly article series: http://www.ibm.com/developerworks/views/power/libraryview.jsp?search_by=programming+high-performance+applications+on+the+Cell+BE+processor (Trust me it’s better than reading boring documents
About the SPU ABI: https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/02E544E65760B0BF87257060006F8F20/$file/SPU_ABI-Specification_1.9.pdf (The second thing you want to read after the article series above.)
Cell BE Linux ABI: https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/44DA30A1555CBB73872570B20057D5C8/$file/CBE_LINUX_ABI_1.2.pdf
Cell Processor ABI: https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/1741C509C5F64B3300257460006FD68D/$file/CellBE_PXCell_Handbook_v1.11_12May08_pub.pdf
(This last document is actually a very interesting reading in general (I think this is one of the best manuals I’ve ever read. You are going to need to read the following section Objects, Executables, and SPE Loading starting on page 397.)
We are also going to use the following documentations when we are reversing SPU code:
SPU language specification: https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/EFA2B196893B550787257060006FC9FB/$file/SPU_Assembly_Language_Specification_1.7.pdf
SPU Instruction Set Architecture: https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/76CA6C7304210F3987257060006F2C44/$file/SPU_ISA_v1.2_27Jan2007_pub.pdf
FAQ
Another tutorial on reversing, why?
Well the answer is very simple. I think we haven't seen a tutorial on reversing SPU code yet. I'm also confident that there are not too many beginner tutorials on reversing PPU code as well. Plus we are doing everything on the PS3 Linux. I also have personal reasons for writing this tutorial. As I mentioned on Twitter, I'm researching methods on how to make use of the Hypervisor and the CellOS-lv2 dump. You can consider this tutorial as a warm up before you jump into the endless ocean of IBMs Hypervisor code.
Why do you reverse a program when you have its source code?
For learning purposes of course. You can only learn reversing if you view the assembly code of your own programs. So from now on you might want to check out the assembly code that your compiler generates.True reverse engineers are very experienced programmers who know how to debug their own applications and how to optimize them using their assembly knowledge.
Our Guinea Pig: The Good old Guessing Game
So now that we are through the introduction. Let’s get down and dirty. The first thing we need is an application we can reverse. Well, we could try and reverse existing applications compiled on our PPC64 Linux but that would take too much time and besides this is not a book. So we are going to start with something very basic. The application we are going to reverse today is the all times classic “Guessing Game” written in C. The idea is simple. The computer generates a number between 1 and 10 and the player has to guess which number has the computer generated. Simple, right? In C probably yes but in PPC and SPU assembly this could be a real challenge even for experienced programmers. Well we are not going to write the game in PPC and SPU assembly simply because of two reasons. The first one is that C is portable code. We can compile our C code to run on the PPU or on the SPU. Cool, what’s the other reason? Well the other reason is that this is a tutorial on reversing SPU and PPU code and not an assembly tutorial.
About the sample code I’m going to show you. We are not going to use classes or anything fancy this is a classic C (sequential programming 101) so no Object Oriented Programming introduced. No exceptions to invoke or anything fancy. We are going to use one single Game Loop to keep the game going until the player wins (yeah you can’t lose in this game, but you can modify it if you want, the possibilities are endless
Alright so let’s fire up Linux on our PS3 and open up the Terminal. I don’t know about you, but I’ve got Ubuntu 10.4 on my machine. Locate a nice spot like ~/Projects/Reversing/guessing_game/ (use cd <Dir_Name> command to get to you home directory then mkdir <Dir_Name> to create the directories you need) and here we can create a text document using gedit or nano (type in gedit or nano). Our C code is the following:
- Code: Select all
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
void initialize();
void player_guess();
/* the players guess */
int guess;
/* the number the player needs to guess*/
int game_number;
int main(int argc, char* argv)
{
initialize();
/* the game wont stop until the player finds the number */
do
{
player_guess();
}
while(guess != game_number);
/* congratulate the player */
printf("Oh goody, you guessed the number!\n");
printf("the number was indeed %d. \nCongratulations !!!\n", game_number);
/* end... */
return 0;
}
void initialize()
{
printf("Welcome to the Guessing game!\n");
/* initialize random seed: */
srand ( time(NULL) );
/* generate a game number */
game_number = rand() % 10 + 1;
printf("A random number was generated between 1 to 10 for you.\n");
}
void player_guess()
{
printf("Please place your bet: ");
scanf("%d", &guess);
/* the player missed the number, let's help her/him out a little bit */
if(guess > game_number)
{
printf("Nah, sorry you missed it! The number your looking for is actually SMALLER than %d.\n", guess);
}
else if(guess < game_number)
{
printf("Nah, sorry you missed it! The number your looking for is actually LARGER than %d.\n", guess);
}
}
Save the text as a C source code file. Let’s name it ppe_guessing_game.c .I think the sample code is well commented so I won’t be explaining what it does. I just assume you know C. (I think even beginner programmers should be able to understand the code.) Let’s compile this baby using gcc. The command looks as the following:
gcc ppe_guessing_game.c –o PPC32_guessing
That’s our PPU code. Nicely done so far! Now we are going to create another source code and name it spe_guessing_game.c. Our SPU code is a little bit different than the PPU code. Take a look at it first to see if you can spot the differences and then I will tell you why this code is different:
- Code: Select all
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
void initialize();
void player_guess();
/* variable to the number of the running SPE */
unsigned long long host;
/* the players guess */
int guess;
/* the number the player needs to guess*/
int game_number;
int main(unsigned long long spe, unsigned long long argp, unsigned long long envp)
{
host = spe;
initialize();
/* the game wont stop until the player finds the number */
do
{
player_guess();
}
while(guess != game_number);
/* congratulate the player */
printf("Oh goody, you guessed the number!\n");
printf("the number was indeed %d. \nCongratulations !!!\n", game_number);
/* end... */
return 0;
}
void initialize()
{
printf("Welcome to the Guessing game!\n");
printf("Our host for today is SPE number %d!\n", host);
/* initialize random seed: */
srand ( time(NULL) );
/* generate a game number */
game_number = rand() % 10 + 1;
printf("A random number was generated between 1 to 10 for you.\n");
}
void player_guess()
{
printf("Please place your bet: ");
scanf("%d", &guess);
/* the player missed the number, let's help her/him out a little bit */
if(guess > game_number)
{
printf("Nah, sorry you missed it! The number your looking for is actually SMALLER than %d.\n", guess);
}
else if(guess < game_number)
{
printf("Nah, sorry you missed it! The number your looking for is actually LARGER than %d.\n", guess);
}
}
As you can see the main() has changed a little bit, we have some interesting arguments here. Let’s see what they are:
unsigned long long spe: is the SPU’s ID which is a unique SPU task identifier assigned by the operating system. Which means the OS selects the SPE where the program get’s loaded into.
unsigned long long argp: is an effective address (EA) in the main storage to application parameters.
unsigned long long env: The environment pointer, which is an effective address (EA) in main storage to runtime environment information. I really don’t want to get into how the SPU code is getting loaded and all the other mysterious questions you might raise here. So please refer to this tutorial if you have questions about programming the SPU:
http://www.kernel.org/pub/linux/kernel/people/geoff/cell/ps3-linux-docs/ps3-linux-docs-08.06.09/CellProgrammingTutorial/BasicsOfSPEProgramming.html
Okay so the SPE version of the guessing game has some little extra. It stores the SPE number in a variable and displays the ID of the SPE when the game starts up. Other than that these two games are all the same. Compile the game with the following command in your Terminal:
spu-gcc spe_guessing_game.c –o SPU_guessing
We are done. Let’s check if the games are working fine. Type in:
./PPU_guessing
Test the game to see how it works then type in:
./SPU_guessing
Again, test the game then close it. If everything has gone well then we have our Binaries. Good work so far.
readelf and objdump
This section will talk about using readelf, objdump, spu-readelf and spu-objdump . If you haven’t used these tools before then you might want to take a look at the following tutorial on exploring ELF files with readelf and objdump:
http://www.linuxforums.org/articles/understanding-elf-using-readelf-and-objdump_125.html
I will only going to show you what kind of arguments we are going to use and tell a little bit about these tools.
readelf:
So let’s start with the readelf. In this tutorial we are going to use the readelf file to view ELF headers and to detect sections of the executable ELF binary we created when we compiled the simple game above this section. So for starters the readelf arguments in general look like this:
readelf –h <File_Name> - allows the user to view the header of the ELF binary.
readelf –S <File_Name> - read section headers
Navigate to the PPU_guessing binary using the Terminal and let’s type in readelf –h PPU_guessing to view the header of the PPU_guessing game:

Alright, so now we know that we are dealing with PPU code. We know that Data is Big Endian code and it’s an EXEC (Executable File) file. We know that there are 38 Sections (This could probably hold debug information as well). We also see where the entry point to the program is: 0x10000430.
Now let’s run readelf –S PPU_guessing .We get the following:

As you can see this has lots info we can use. Like what sections are executable/readable/writable codes (by looking at the Flags column). Also we have the sizes of each section. The ones that are zeroed out are probably not interesting anyways. We also have debug sections you can check out. There’s a major difference between debug ELF executables and non-debug ELF executables. There are less sections in the ELF file and also the executable code itself is shorter, since you don’t have to store every register on the stack to make the value available to read by the debugger (ppu/spu-gdb). The list your seeing now won’t be explained now. We will discuss it in the objdump section of this tutorial. We are actually going to use objdump to view what’s inside of these sections.
Now, try and use spu-readelf –h and spu-readelf –S on your SPU_guessing:


As you can see the spu-elf consists of fewer sections. This means fewer data and code to worry about. Ok, so we have a bunch of sections here but how do we check what’s inside in each section. That’s where the objdump comes into play.
objdump: objdump is the main tool we are going to use for reversing. You can look at it as a dump IDA disassembler but don’t be fooled. objdump can be still very powerful. First of all it allows us to view PPU and SPU code. Which IDA currently does not implement is the SPU disassemble and without the proper SPU processor plugin you can’t view/reverse SPU code using IDA. We are going to use objdump for two main things. The first one is viewing data and addresses. The second thing is viewing PPU and SPU code plus the address of each code segment.
If you look at the top readelf/spu-readelf –S listings. You should see that there are many sections to disassemble and view. Let’s take a look at the most important sections which we are going to touch in this tutorial.
PPU:
.init: Holds the process initialization code. Gets called before main().
.text: Executable section. This is where the program code goes. You might think that only main() and the other functions which are used by the application reside in this section. Well, the truth is that you get a lot more than that. More details in the “Reversing PowerPC 32 code” section of this tutorial.
.fini: Holds the process termination code. Gets executed when your program exits normally.
.rodata: A non writable section. Holds read only data which are strings, mostly. An ASCII code table could come in handy when dealing with .rodata.
.bss: Uninitialized or so called “null” data goes here. When the program executes this data gets initialized as 0.
.data: Initialized data.
.got: The Global Offset Table. Nearly everything gets redirected to this address. It’s very handy to know this address.
.plt: The Procedure Linkage Table. The base address for some libc function calls.
SPU:
note.spu_name: holds the name of the executable. For example: SPU_NAME.SPU_guessing
For more information on the ELF and it’s special sections check out: http://www.skyfree.org/linux/references/ELF_Format.pdf
Now, let’s check out some of the sections with objdump. Here’s some of the arguments you can give to objdump:
objdump –d <File_Name> ,disassembles the file completely
objdump –D <File_Name>, does the same as the previous flag
objdump –s <File_Name>,diplay content (hex and string)
objdump –S <File_Name>,mix disassembly with content
So let’s go ahead and display the .rodata section of the PPU_guessing using the command below:
objdump –s –j .rodata PPU_guessing

Here we can view the read-only data (like strings) which are located in our program. The first column shows the address while the other 4 next to this shows the data in hex (each row holds 16 bytes of data, remember that one character is 1 byte which can be represented with 2 hexadecimal digits).
Let’s check out what the following command gives us:
spu-objdump –s –j .rodata SPU_guessing

Now, you might think hex-editing is the same as hacking. Not at all dear Sir, you are wrong. Hacking is about reversing parts of the application which you are interested in or want to abuse (remove nag screens, remove trial counter, make the application fully functional if it’s a shareware). In order to do that, you’ll need to reverse engineer parts of the application and this is the part where assembly knowledge comes in handy. In the next section we are going to dive deep into reversing our PPU_guessing program.
Reversing PowerPC 32 code
Before we go into detail, there’s something I want to show you. If you look at the .text section using the command below:
objdump –D –j .text PPU_guessing
As I mentioned above the .text section has more on store for us than you would expect. Normally you are expecting to get main(), initialize(), player_guess() but you get things like:
_start: The first thing that gets executed by the Operating System. Also sets up the stack and It’s in fact a wrapper for __libc_start_main. Which is a library function inside libc. It’s interesting to note that we can actually create our own _start function enabling us to make the start code smaller.
_do_global_dtors_aux: The destructor function for this program.
call_do_global_dtors_aux
frame_dummy
call_fame_dummy
libc_csu_fini
libc_csu_init
_do_globabl_ctors_aux: The constructor function for this program.
call_do_global_ctors_aux
srand@plt: srand()
_gmon_start__@plt
__libc_start_main@plt: initializes the C library and calls main()
printf@plt: calls printf()
time@plt: calls time()
__isoc99_scanf@plt: calls scanf()
puts@plt: calls puts() although we know from the source code this wasn’t called specifically you’ll see how the compiler has swapped some of the printf()s to puts().
rand@plt: calls rand()
_glink
_glink_plt_resolver: resolves linking when the program executes.
It’s nice to know some of these (C runtime) functions as well, cause at least we know that they’re not interesting for us to reverse. The good thing is that we know which functions are being called from libc this can make our job a lot easier.
Check this out for more info on the above functions: http://www.mentalcases.net/texts/security/TextSectionStudy.txt
So now that we know what not to reverse there are only 3 functions left. Let’s view them:
- Code: Select all
1000059c <main>:
1000059c: 94 21 ff e0 stwu r1,-32(r1)
100005a0: 7c 08 02 a6 mflr r0
100005a4: 90 01 00 24 stw r0,36(r1)
100005a8: 93 e1 00 1c stw r31,28(r1)
100005ac: 7c 3f 0b 78 mr r31,r1
100005b0: 90 7f 00 0c stw r3,12(r31)
100005b4: 90 9f 00 08 stw r4,8(r31)
100005b8: 48 00 00 79 bl 10000630 <initialize>
100005bc: 48 00 01 0d bl 100006c8 <player_guess>
100005c0: 3c 00 10 01 lis r0,4097
100005c4: 7c 0b 03 78 mr r11,r0
100005c8: 81 2b 10 30 lwz r9,4144(r11)
100005cc: 3c 00 10 01 lis r0,4097
100005d0: 7c 0b 03 78 mr r11,r0
100005d4: 80 0b 10 2c lwz r0,4140(r11)
100005d8: 7f 89 00 00 cmpw cr7,r9,r0
100005dc: 40 9e ff e0 bne+ cr7,100005bc <main+0x20>
100005e0: 3c 00 10 00 lis r0,4096
100005e4: 30 60 0a 2c addic r3,r0,2604
100005e8: 48 00 03 79 bl 10000960 <puts@plt>
100005ec: 3c 00 10 00 lis r0,4096
100005f0: 31 20 0a 50 addic r9,r0,2640
100005f4: 3c 00 10 01 lis r0,4097
100005f8: 7c 0b 03 78 mr r11,r0
100005fc: 80 0b 10 2c lwz r0,4140(r11)
10000600: 7d 23 4b 78 mr r3,r9
10000604: 7c 04 03 78 mr r4,r0
10000608: 4c c6 31 82 crclr 4*cr1+eq
1000060c: 48 00 03 25 bl 10000930 <printf@plt>
10000610: 38 00 00 00 li r0,0
10000614: 7c 03 03 78 mr r3,r0
10000618: 39 7f 00 20 addi r11,r31,32
1000061c: 80 0b 00 04 lwz r0,4(r11)
10000620: 7c 08 03 a6 mtlr r0
10000624: 83 eb ff fc lwz r31,-4(r11)
10000628: 7d 61 5b 78 mr r1,r11
1000062c: 4e 80 00 20 blr
10000630 <initialize>:
10000630: 94 21 ff e0 stwu r1,-32(r1)
10000634: 7c 08 02 a6 mflr r0
10000638: 90 01 00 24 stw r0,36(r1)
1000063c: 93 e1 00 1c stw r31,28(r1)
10000640: 7c 3f 0b 78 mr r31,r1
10000644: 3c 00 10 00 lis r0,4096
10000648: 30 60 0a 80 addic r3,r0,2688
1000064c: 48 00 03 15 bl 10000960 <puts@plt>
10000650: 38 60 00 00 li r3,0
10000654: 48 00 02 ed bl 10000940 <time@plt>
10000658: 7c 60 1b 78 mr r0,r3
1000065c: 7c 03 03 78 mr r3,r0
10000660: 48 00 02 a1 bl 10000900 <srand@plt>
10000664: 48 00 03 0d bl 10000970 <rand@plt>
10000668: 7c 69 1b 78 mr r9,r3
1000066c: 3c 00 66 66 lis r0,26214
10000670: 60 00 66 67 ori r0,r0,26215
10000674: 7c 09 00 96 mulhw r0,r9,r0
10000678: 7c 0b 16 70 srawi r11,r0,2
1000067c: 7d 20 fe 70 srawi r0,r9,31
10000680: 7c 00 58 50 subf r0,r0,r11
10000684: 54 00 08 3c rlwinm r0,r0,1,0,30
10000688: 54 0b 10 3a rlwinm r11,r0,2,0,29
1000068c: 7c 00 5a 14 add r0,r0,r11
10000690: 7c 00 48 50 subf r0,r0,r9
10000694: 31 20 00 01 addic r9,r0,1
10000698: 3c 00 10 01 lis r0,4097
1000069c: 7c 0b 03 78 mr r11,r0
100006a0: 91 2b 10 2c stw r9,4140(r11)
100006a4: 3c 00 10 00 lis r0,4096
100006a8: 30 60 0a a0 addic r3,r0,2720
100006ac: 48 00 02 b5 bl 10000960 <puts@plt>
100006b0: 39 7f 00 20 addi r11,r31,32
100006b4: 80 0b 00 04 lwz r0,4(r11)
100006b8: 7c 08 03 a6 mtlr r0
100006bc: 83 eb ff fc lwz r31,-4(r11)
100006c0: 7d 61 5b 78 mr r1,r11
100006c4: 4e 80 00 20 blr
100006c8 <player_guess>:
100006c8: 94 21 ff e0 stwu r1,-32(r1)
100006cc: 7c 08 02 a6 mflr r0
100006d0: 90 01 00 24 stw r0,36(r1)
100006d4: 93 e1 00 1c stw r31,28(r1)
100006d8: 7c 3f 0b 78 mr r31,r1
100006dc: 3c 00 10 00 lis r0,4096
100006e0: 30 00 0a d8 addic r0,r0,2776
100006e4: 7c 03 03 78 mr r3,r0
100006e8: 4c c6 31 82 crclr 4*cr1+eq
100006ec: 48 00 02 45 bl 10000930 <printf@plt>
100006f0: 3c 00 10 00 lis r0,4096
100006f4: 30 00 0a f0 addic r0,r0,2800
100006f8: 7c 03 03 78 mr r3,r0
100006fc: 3c 00 10 01 lis r0,4097
10000700: 30 80 10 30 addic r4,r0,4144
10000704: 4c c6 31 82 crclr 4*cr1+eq
10000708: 48 00 02 49 bl 10000950 <__isoc99_scanf@plt>
1000070c: 3c 00 10 01 lis r0,4097
10000710: 7c 0b 03 78 mr r11,r0
10000714: 81 2b 10 30 lwz r9,4144(r11)
10000718: 3c 00 10 01 lis r0,4097
1000071c: 7c 0b 03 78 mr r11,r0
10000720: 80 0b 10 2c lwz r0,4140(r11)
10000724: 7f 89 00 00 cmpw cr7,r9,r0
10000728: 40 9d 00 2c ble- cr7,10000754 <player_guess+0x8c>
1000072c: 3c 00 10 00 lis r0,4096
10000730: 31 20 0a f4 addic r9,r0,2804
10000734: 3c 00 10 01 lis r0,4097
10000738: 7c 0b 03 78 mr r11,r0
1000073c: 80 0b 10 30 lwz r0,4144(r11)
10000740: 7d 23 4b 78 mr r3,r9
10000744: 7c 04 03 78 mr r4,r0
10000748: 4c c6 31 82 crclr 4*cr1+eq
1000074c: 48 00 01 e5 bl 10000930 <printf@plt>
10000750: 48 00 00 48 b 10000798 <player_guess+0xd0>
10000754: 3c 00 10 01 lis r0,4097
10000758: 7c 0b 03 78 mr r11,r0
1000075c: 81 2b 10 30 lwz r9,4144(r11)
10000760: 3c 00 10 01 lis r0,4097
10000764: 7c 0b 03 78 mr r11,r0
10000768: 80 0b 10 2c lwz r0,4140(r11)
1000076c: 7f 89 00 00 cmpw cr7,r9,r0
10000770: 40 9c 00 28 bge- cr7,10000798 <player_guess+0xd0>
10000774: 3c 00 10 00 lis r0,4096
10000778: 31 20 0b 48 addic r9,r0,2888
1000077c: 3c 00 10 01 lis r0,4097
10000780: 7c 0b 03 78 mr r11,r0
10000784: 80 0b 10 30 lwz r0,4144(r11)
10000788: 7d 23 4b 78 mr r3,r9
1000078c: 7c 04 03 78 mr r4,r0
10000790: 4c c6 31 82 crclr 4*cr1+eq
10000794: 48 00 01 9d bl 10000930 <printf@plt>
10000798: 39 7f 00 20 addi r11,r31,32
1000079c: 80 0b 00 04 lwz r0,4(r11)
100007a0: 7c 08 03 a6 mtlr r0
100007a4: 83 eb ff fc lwz r31,-4(r11)
100007a8: 7d 61 5b 78 mr r1,r11
100007ac: 4e 80 00 20 blr
Still a lot of code to look at but it’s doable
PPC ABI review: First of all we won’t be reviewing the whole ABI for that you might want to read the articles in the Prologue section of this tutorial.
r0 Volatile register used in function prologues (used to hold the LR)
r1 Stack frame pointer (SP)
r2 TOC pointer
r3 Volatile parameter and return value register (sometimes used as parameter 1)
r4-r10 Volatile registers used for function parameters 2 to 7
r11 Volatile register used in calls by pointer and as an
environment pointer for languages which require one
r12 Volatile register used for exception handling and glink code
r13 Reserved for use as system thread ID
r14-r31 Nonvolatile registers used for local variables
f0 Volatile scratch register
f1-f4 Volatile floating point parameter and return value registers
f5-f13 Volatile floating point parameter registers
f14-f31 Nonvolatile registers
LR Link register (volatile) you can’t use it as a general purpose register
use mflr (move from LR) and mtlr (move to LR) to access its contents
CTR Loop counter register (volatile)
XER Fixed point exception register (volatile)
FPSCR Floating point status and control register (volatile)
Condition Registers:
CR0-CR1 Volatile condition code register fields
CR2-CR4 Nonvolatile condition code register fields
CR5-CR7 Volatile condition code register fields
The PPU stack frame organization:

The information above comes from the http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.7.html
document.
Here’s a little bit of an explanation on what the Link Register is and what the Stack Pointer is:
Link Register(LR):
The LR is actually a special purpose register which holds the address to return to when a function call completes. In other words this register stores an address where the program counter will jump to when the execution of the function is complete. It’s always moved from the LR and stored on r0 when a new stack is being built.
Stack Pointer (stored on r1):
Stores an address, that points to the top of the current stack. It’s also important to note that not every stack has the same size. The size of the Stack depends on the local variables and the number of saved registers on the stack. But there is a minimum size which the Stack must have where it can store the LR and the SP. When your decrementing the stack you are “allocating” Stack space and when your incrementing it then you are actually “deleting” the Stack.
After this small review of the ABI let’s get back to our main() function and it’s PPU code:
Setting up the Stack: Prologue and Epilogue
In this section we are going to review how the stack is handled and how the ABI works. This will make our job easier to see what data is being stored on the stack. We are interested in Local Variables and looking at how the Stack is being setup, we can detect them.
Prologue Phase:
In the first few lines of our main function, we can see how the Stack is being setup. This is an important part which we need to investigate.
- Code: Select all
stwu r1,-32(r1)
Here we increment the Stack Pointer by -32, which means we move the pointer downwards by 32 Bytes and we store the stack pointer back in r1. This is why we call this instruction Store Word and Update. We also store the Stack Pointer (also known as the Back Chain Pointer, which is a Pointer to the previous Stack) on this address. This is how you allocate space on the stack. We always go from the higher address to the lower and we use a pointer to tell which Stack are we at. So now we store the “base” address of our new stack on r1 and now we can refer to r1 as the Stack Pointer. Remember that we said that the Back Chain is at SP+0 which is true. We are at that spot.
- Code: Select all
mflr r0
Here we Move the LR into r0. So we can save it:
- Code: Select all
stw r0,36(r1)
on SP+36 which is an interesting thing. Remember that we already decremented the SP to create our Stack. But now we are actually jumping back to our previous stack and place the LR on the “bottom” of the previous Stack.
- Code: Select all
mr r31,r1
Here we move the contents of r1 into r31 to create a frame pointer (a pointer to the top of the current stack).
- Code: Select all
stw r3,12(r31)
stw r4,8(r31)
These two instructions store argc and argv on the stack. Note that we are using the frame pointer as the base address.
Program Code:
- Code: Select all
bl 10000630 <initialize>
bl 100006c8 <player_guess>
Here we can see 2 Branch and Link instructions. Which means the instruction pointer (or program counter) jumps on to initialize() and saves the address of where it jumped from on to the LR and increments it by 4(Bytes) so when the blr (branch to the address in LR) is called from initialize() the program counter won’t jump back to the bl initialize() code rather on to the next instruction(bl player_guess). We won’t be following the program counter for now cause we haven’t really figured out what else does the main() function perform. So let’s move further:
So far we know that main() calls initialize() and player_guess() right after initialize() was called.
- Code: Select all
lis r0,4097
mr r11,r0
lwz r9,4144(r11)
lis r0,4097
mr r11,r0
lwz r0,4140(r11)
cmpw cr7,r9,r0
bne+ cr7,100005bc <main+0x20>
So what happens here? That’s the million dollar question. Actually this isn’t that hard to find out. As you can see the first instruction loads in 4097 into r0, but this time it shifts the value by 2 bytes. Let’s see how this looks like:
4097 (decimal) is equal to 0x1001 (hexa-decimal, just grab a calculator
0x10010000 in the next instruction we move this value into r11 from r0 and use this as a base to add 4144. Let’s see how we add the two addressed to create one:
- Code: Select all
0x10010000
+ 0x00001030
------------
0x10011030
We get this value in r9. What is this? Definitely not a value cause it’s too large to be some kind of value of a variable. But if you look at the address where this instruction is (100005c0) you should notice that the value we got also starts with 0x100… Hmmm are you thinking what I’m thinking? Yeah this is an address in our program. But the value is so large that it’s beyond the .text section. So where does this address point to? Let’s type in readelf –S PPU_guessing again to view the sizes and addresses of each section.
Just as I expected: our address is inside the .sbss. The .sbss is exactly 8 Bytes long. So there’s probably data inside. Let’s open up the .sbss using objdump -D -j .sbss PPU_guessing.
At the address 10011030 we have <guess> .Bingo! We can see that r9 stores the value of guess. I think we can move on to the next instruction. Again, we have these three instructions:
100005cc: 3c 00 10 01 lis r0,4097
100005d0: 7c 0b 03 78 mr r11,r0
100005d4: 80 0b 10 2c lwz r0,4140(r11)[/code]
I highlighted the most important part in these 3 instructions. Yes, you can view the address if you combine these two. So we get 0x1001102c inside r0. Which is an address in the .sbss to <game_number>. This means r0 stores the value of game_number. Great job so far. Now, we know that we are loading in 2 important values into r9 and r0. But here comes something even more important. These two values are being compared by the following instruction:
- Code: Select all
cmpw cr7,r9,r0
Basically what happens here is that. We compare these two “words”. If you look back at the previous instructions then you can see that these two values were loaded in as “words”, a word is exactly 4 Bytes long, remember that on 32 bit mode an address is 4 Bytes long, quite explains why you can only address 4 GBs of memory right?
- Code: Select all
bne+ cr7,100005bc <main+0x20>
Ok, so we are branching again. Our instruction is the Branch if not Equal (+ predict, that the branch is likely to be taken). Very interesting fact is that the machine knows that in most cases the player will be wrong
- Code: Select all
100005bc: 48 00 01 0d bl 100006c8 <player_guess>
Oh, yes! This means that if the two values are not equal then the Program Counter jumps back to player_guess. Which we can assume is the function which takes the guess value from the player. But we will see once we get there. We now know that main() has some very important role in this program. It loads in the values guess and game_number and compares them and branches according to the result. Let’s see what happens if the player was correct:
100005e0: 3c 00 10 00 lis r0,4096
100005e4: 30 60 0a 2c addic r3,r0,2604
100005e8: 48 00 03 79 bl 10000960 <puts@plt>
Again we can see Load Immediate Shifted and his buddy Add Immediate with Carry. But this time the situation is a bit different. The 0x10000a2c address gets loaded into r3, which makes me want to think that we are witnessing a function call. According to the ABI the r3 is the first argument that gets loaded when a function is being called. If you look at the third instruction, you can see that indeed we are branching and Linking into puts(). puts(), “WTF?” you might ask. Why is puts() being called and not printf(). I guess the answer is simple but you won’t like it. The compiler has decided to use puts() instead of printf() cause of one reason and that has something to do with the address we are loading into r3. Let’s check the address:
- Code: Select all
readelf –S PPU_guessing
Our address this time is inside .rodata according to the section addresses. If your still scratching your head how did I come to this conclusion then let’s check the section addresses together:
- Code: Select all
.fini 100009e0
.rodata 10000a18
.eh_frame_hdr 10000b9c
Ok, we have 0x10000a2c which is greater than 100009e0 (.fini ) and 10000a18 (.rodata) but not greater than 10000b9c(.eh_frame_hdr). Which makes me want to think that the address we are looking for is between .rodata and .eh_frame_hrd. So we open up .rodata (this is where read-only data is stored, makes sense, right?):
- Code: Select all
10000a28 00020001 4f682067 6f6f6479 2c20796f ....Oh goody, yo
10000a38 75206775 65737365 64207468 65206e75 u guessed the nu
10000a48 6d626572 21000000 74686520 6e756d62 mber!...the numb
Well, there’s no such address mentioned here. But we can guess that the address is somewhere here. So let’s look at 0x10000a28. We need to calculate where the data resides. Let’s subtract 0x10000a2c with 0x10000a28:
- Code: Select all
0x10000a2c
-0x10000a28
--------------
0x00000004
That means 4 Bytes. So our data starts from here:
00020001 4f682067 6f6f6479 2c20796f
Here we need to cut 4 bytes which means our data is the one in the green highlight and the red is the one we cut off (remember 1 Byte is two “digits” in Hex-Decimal). We can check if we are looking at the data we are searching for. We need an ASCII table to see that. Since we know that each character is two Hex digit. We can decode the characters from the hex-decimal code manually:
- Code: Select all
4f 68 20 67 6f 6f 64 79 2c 20 79 6f
-----------------------------------
O h g o o d y , y o
I think this is it we’ve found the data. Let’s see how it looks like:
10000a28 00020001 4f682067 6f6f6479 2c20796f ....Oh goody, yo
10000a38 75206775 65737365 64207468 65206e75 u guessed the nu
10000a48 6d626572 21000000 74686520 6e756d62 mber!...the numb
Our data is the one I highlighted with green. Okay nice so puts() writes out “Oh goody, you guessed the number!”.
Since there’s no data to display GCC thought that puts() is enough to do the job there’s no need for printf(). So how did I find out where’s the end of the string in .rodata? It’s simple if you’ve ever ran the program your reversing then it’s pretty easy to tell how strings are displayed. You can see them and write down the messages the program writes out. There’s another way to check this as you can see there’s a little bit of gap between each string. The gab consists of 0’s. Let’s move on to our next instruction series. This next one is also a function call. But this time we are calling printf(). This means we are going to display data as well as strings. Let’s see how this is preformed:
- Code: Select all
100005ec: 3c 00 10 00 lis r0,4096
100005f0: 31 20 0a 50 addic r9,r0,2640
100005f4: 3c 00 10 01 lis r0,4097
100005f8: 7c 0b 03 78 mr r11,r0
100005fc: 80 0b 10 2c lwz r0,4140(r11)
10000600: 7d 23 4b 78 mr r3,r9
10000604: 7c 04 03 78 mr r4,r0
10000608: 4c c6 31 82 crclr 4*cr1+eq
1000060c: 48 00 03 25 bl 10000930 <printf@plt>
Alright let’s check the addresses. 0x10000a50 points to “the number was indeed %d./nCongratulations!!!” where we can see data as well and this data is stored on 0x1001102c very familiar address. Isn’t this game_number? Indeed it is game_number. After loading the string and data into r9 and r0 we move the string from r9 into r3 and the data from r0 into r4. Then something interesting happens:
crclr 4*cr1+eq
crclr stands for Condition Register Clear. Basically what it does is simple. It xors the Condition Register with its own values which means it clears every bit to zero. But here we are only touching cr1 more specifically we are clearing the equal bits. Which is the bit field of cr1 that indicates if the result was equal or not. I have no idea why is GCC doing this :-/
Anyway, after that we Branch and Link into printf() with the two arguments to print out the string and the data together.
Epilogue Phase:
From now on we are moving into the Epilogue phase of the main() function. Let’s see how GCC handles that:
- Code: Select all
#zeroing out some of the registers
li r0,0
mr r3,r0
#Here we increment the Stack Pointer, thus getting rid of main()’s stack even its data
addi r11,r31,32
#Here we get back our previous LR (the address)
lwz r0,4(r11)
#and we move it into the actual LR
mtlr r0
#get the previous Frame Pointer (SP) from the main()’s stack, although we raised the stack #pointer already, the data wasn’t overridden.
lwz r31,-4(r11)
#and we store the SP in r1
mr r1,r11
#branch out of here, probably into the finalization and then into the program destructor
blr
Summary:
So let’s see what we know about main():
Stores argv and argc on the stack.
Calls initialize() and player_guess().
Loads in the game_number and guess and compares them if they’re equal if not then the Program Counter jumps back and calls player_guess() again.
Else it writes out the Contratulations message with the game_number and then the program ends.
This is vital info in reverse engineering this small program (not that you would need to reverse engineer something as simple as this xD).
So now that we know what main() does then let’s look at initialize() and player_guess(). I won’t break the code down like I previously did. I’m going to just comment it and when something special comes up, then I’ll make a more detailed explanation:
- Code: Select all
#initialize() starts here
<initialize>:
#Prologue
#Create stack, about 32 Bytes
stwu r1,-32(r1)
#move LR in to r0
mflr r0
#store LR to previous code into the previous stack
stw r0,36(r1)
#store old Frame Pointer on SP+28
stw r31,28(r1)
#move new Frame Pointer into r31
mr r31,r1
#Program Code:
#10000a80 points to the following string: “Welcome to the Guessing game!”
#this gets displayed by puts()
lis r0,4096
addic r3,r0,2688
bl 10000960 <puts@plt>
#Call time() with 0 as the first argument
li r3,0
bl 10000940 <time@plt>
#Load the result into r0, this is one worthless instruction here
#deleting it would make the program execute faster
mr r0,r3
#move result back into r3 again, worthless :-/ the result is moved into r3
#automatically and you don’t need to move around it between r0 and r3
mr r3,r0
#Call srand() with the result from time(), stored on r3
bl 10000900 <srand@plt>
#Call rand(), no arguments this time
bl 10000970 <rand@plt>
#move the result from r3 into r9
mr r9,r3
#this is what the MODULO operator generates
#no wonder why developers call it “expensive”, look at all these instructions:
lis r0,26214 #Load 0x6666 into r0
ori r0,r0,26215 #Or it with 0x6667 to generate 0x66666667
#Remember that r9 stores the random number
#mulhw stands for “Multiply High Word”,
#signed integers inside r9 and r0 form a 64 bit result
#the high word of this product is placed into r0, the result is a signed integer
mulhw r0,r9,r0
#I think I’ll leave the rest to you, it’s interesting to find out how the MODULO
#operator works in low level, but we are really not interested in it at the moment
#shift algebraic word by immediate
srawi r11,r0,2
srawi r0,r9,31
#subtract from, where we subtract from r11 by r0 and place the result into r0
#check out the 6xx_pem.pdf for more information
subf r0,r0,r11
rlwinm r0,r0,1,0,30
rlwinm r11,r0,2,0,29
add r0,r0,r11
subf r0,r0,r9
#The generated number resides in r9 we also add 1 to the random number to make
#it between 1 and 10, notice Add Immediate with Carry
addic r9,r0,1
Here’s a little help on recognizing MODULO in any future projects. I’ve managed to create a program which consists of the following code:
- Code: Select all
/*modulo.c*/
#include <stdio.h>
int main()
{
int operand_1 = 50;
int operand_2 = 100;
double result = 0;
result = operand_1 % operand_2;
return 0;
}
The example code is very simple. We are creating static data on the stack, 2 Integers and a double. The double stores the result. That means we will see some use of the float instructions and the floating point registers. Let’s look at the screenshot I made after disassembling the ELF with objdump:

Notice one important thing. We can see what the operands are and how the MODULO code uses them. This can be helpful in recognizing operands in our PPU_guessing example.
We can see the +1 immediate in the end and the result of rand() stored on r9. The only thing we can’t spot is the 10 immediate which we are “dividing” with
- Code: Select all
#Display information
#1001102c points to game_number
lis r0,4097
mr r11,r0
#But this time we are not loading the value but instead store r9 on the this address
#which means the randomly generated number gets stored inside game_number
stw r9,4140(r11)
#0x10000aa0 points to the following string:
#“A random number was generated between 1 to 10 for you.”
lis r0,4096
addic r3,r0,2720
#we write out the message using the above address using puts()
bl 10000960 <puts@plt>
#Increment the Frame Pointer to get out of this stack
addi r11,r31,32
#load back the old LR
lwz r0,4(r11)
#load r0 (old LR address) into the LR
mtlr r0
#load back the old Frame Pointer
lwz r31,-4(r11)
#move it into r1
mr r1,r11
#branch and link out of this stack and back into main()
blr
If you are asking “How the hell did you find out that those instructions were actually the MODULO operator?”. This is when experience starts to kick in. If you had checked out a C source code (compiled with GCC) that uses the MODULO, you would’ve known that those instructions are indeed “hiding” a MODULO operator. From now on always check the assembly source code your compiler is generating. Then try and understand it. It’s easier if you have the C source code. But without the source code you can only rely on your own knowledge.
Summary:
So here’s what we know about initialize():
Writes a welcome message.
Uses time(), srand(), rand() and the MODULO operator (% in C) to generate a random number between 1 to 10
Stores the random number as game_number (on game_number’s address)
Writes out a message that the number was generated between 1 to 10
Next up, we have the player_guess() function. What we already know about player_guess() is that it probably asks the player to input values. Let’s see how this is done:
- Code: Select all
#player_guess() starts here:
<player_guess>:
#Prologue:
#Set up stack, about 32 Bytes
stwu r1,-32(r1)
#move LR into r0
mflr r0
#store LR on the old stack (main())
stw r0,36(r1)
#store old Frame Pointer on SP+28
stw r31,28(r1)
#FP = SP
mr r31,r1
#0x10000Ad8 points to “Please place your bet: ”
lis r0,4096
addic r0,r0,2776
mr r3,r0
#clear cr1’s EQ bits
crclr 4*cr1+eq
#Call printf() to write the string we mentioned: “Please place your bet: ”
bl 10000930 <printf@plt>
#0x10000AF0 points to “%d”, integer data
lis r0,4096
addic r0,r0,2800
mr r3,r0
#0x10011030 points to a data in .sbss, this one is actually a data known as guess
lis r0,4097
addic r4,r0,4144
#clear EQ bits in cr1
crclr 4*cr1+eq
#so here the program reads the data from the user, using scanf()
bl 10000950 <__isoc99_scanf@plt>
#Loads in the data at address 0x10011030 (guess) into r9
lis r0,4097
mr r11,r0
lwz r9,4144(r11)
# r0 gets the data at address 0x1001102c (game_number)
lis r0,4097
mr r11,r0
lwz r0,4140(r11)
#compare words, r9 and r0 and store the result inside cr7, in other words
#compares game_number and guess
cmpw cr7,r9,r0
#Predicted Branch into address 0x10000754(check if greater than) if r9(guess) is less
#than r0(game_number)
#or equal to, predicted not to be taken, notice the ‘bne-‘
ble- cr7,10000754 <player_guess+0x8c>
#if r9(guess) is not less than or equal to r0(game_number),which could mean that it is
#greater than game_number, then we proceed in this direction:
lis r0,4096
#r9 stores 0x10000AF4 which points to this string:” Nah, sorry you missed it! The #number you are looking for is actually SMALLER than %d/n”
#Don’t be confused about the message, the program wants you to input a smaller
#number, cause the one you gave in is just too big.
addic r9,r0,2804
lis r0,4097
mr r11,r0
#r0 stores the address 0x10000030 which points to ‘guess’ data
lwz r0,4144(r11)
#move string(“Nah, sorry you missed it...”) into r3 (first argument) of printf()
mr r3,r9
#move data(guess) into r4 (second argument) of printf()
mr r4,r0
#clear EQ bits inside cr1, interesting the system seems to do this all the time
#when it calls printf(), probably uses this as a ‘No Operation’ ???
#we could actually check if printf() uses cr1’s EQ bits for something interesting
#just dig into printf()
crclr 4*cr1+eq
bl 10000930 <printf@plt>
#branch unconditionally into the address at 0x10000798 (which is the Epilogue)
#which means we destroy this stack and then recreate it once main() tells the machine
#to branch back to this function (player_guess())
b 10000798 <player_guess+0xd0>
10000754: lis r0,4097
mr r11,r0
#Load in 0x10011030 (guess) into r9
lwz r9,4144(r11)
lis r0,4097
mr r11,r0
#Load in 0x1001102c (game_number) into r0
lwz r0,4140(r11)
#Compare words r9 and r0 and store the result in cr7
cmpw cr7,r9,r0
#Predicted Branch if r9 (guess) is greater than or equal to r0
#(game_number), branch is predicted not to be taken, if taken then we
#branch to this address: 0x10000798 (Epilogue)
bge- cr7,10000798 <player_guess+0xd0>
#if r9(guess) is not greater than r0(game_number)
lis r0,4096
#Load address of string into r9:“Nah, sorry you missed it! The number your
#looking for
#is actually LARGER than %d”
addic r9,r0,2888
lis r0,4097
mr r11,r0
#Load address of data into r0: guess (data)
lwz r0,4144(r11)
#The string address goes into r3 (first argument)
mr r3,r9
#The data address goes into r4 (second argument)
mr r4,r0
#Again we clear the EQ bits inside cr1 cause we are calling printf()
crclr 4*cr1+eq
bl 10000930 <printf@plt>
0x10000798: #Epilogue:
#Increment the SP to delete the current Stack
addi r11,r31,32
#load in previous return address
lwz r0,4(r11)
#load previous return address into the LR
mtlr r0
#load previous SP into the Frame Pointer
lwz r31,-4(r11)
#pass SP to r1
mr r1,r11
#branch back to main()
blr
Summary:
So what do we know about player_guess?:
player_guess() asks the player to input an integer. It doesn’t check the size of the number! But it doesn’t matter cause, you can’t input anything larger than what an integer can hold. If you input a character or a string then the program goes into an infinite loop. So no character or string check as well. So the data is being read by scanf(%d).
After that the program compares game_number and guess. If guess is less than or equal to game_number then the program checks if guess is actually larger than or equal to game_number if the two numbers are equal then we branch and link out of this function. The code to check if game_number and guess are equal is in the main() function.
We can consider this as a big bottleneck to performance cause, the Stack has to be rebuilt each time the player inputs the wrong number. It would be better to not to use a function for this rather copy the code into main().
We’ve managed to find out how the program works exactly by looking at its PPU assembly code. The information we’ve gathered is enough to rebuild this program or to view its flaws. We can also rewrite the whole program in C if we really want to. Yeah I know that the MODULO section was pretty tricky but if you view the assembly code your compiler generates then you’ll be able to spot things like these easily. Keep viewing PPU code and you’ll be able to reverse anything you want.
Our next section will talk about reversing SPU code.


