Reverse Engineering Tools on Linux - Security & Cryptography

kirby
Long time nixers
This is a completely rewritten version of my original post now that I have more experience. I'm still a novice, but I've done some real malware analysis and some exercises and generally feel like the post should reflect my updated feelings.

Introduction

Reverse engineering is seen as a primarily Windows-based activity. Basically all malware samples you reverse will be Windows-based, and some of the best tools such as OllyDbg are Windows exclusive. This post aims to look at what tools and resources are available for Linux and to evaluate them.

Before that, however, I'd like to quickly mention you can get Windows virtual machine images for all manner of hypervisors free from Microsoft here. They're meant for testing your web apps on IE, but just boot one up, install all your tools and save a snapshot - what are they going to do? With that out of the way, let's look at some native tools.

IDA

IDA (the Interactive Disassembler) from Hex-Rays has its reputation as the best static analysis tool available for good reason: it's very good. It provides a very useful disassembly, graphing functions, comprehensive searching, imports and references to these imports, and much more. With this comes a ridiculous price tag which I'm sure puts it out of range of anyone here. Thankfully, Hex-Rays offers a free demo version. It is a tad limited in what it can disassemble, and you can't save. You can get around the latter issue with virtual machine snapshots if you're so inclined.

Documentation-wise, IDA's reputation means it has a strong user base and thus plenty of resources are available, including entire books. At first I didn't have a clue what to do, but after reading the dedicated chapter in Practical Malware Analysis, I picked it up no problem and now find it very intuitive. That said, Hex-Rays' own website seems a bit sparse, and a lot of the pages seem out of date. I haven't ever had to go there for technical help though.

angr

angr is a Python symbolic execution framework. Symbolic execution is a very interesting field and not one that any of the other tools here provide to my knowledge. The Wikipedia page likely explains it better than I can, but in essence it involves traversing a program and storing values as expressions of other values. This allows the user to perform constraint solving to obtain possible values for unknown variables.
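
To make "storing values as expressions of other values" a bit more concrete, here is a toy sketch in plain Python. It is not angr and not how angr works internally: the `Expr` class, the check `((key ^ 0x5A) + 7) & 0xFF` and the target `0x3C` are all made up for illustration, and the "solver" simply brute-forces a single byte where a real engine would hand the expression to a constraint solver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A node in a tiny expression tree: an operation plus its operands."""
    op: str
    args: tuple

    # Operators build a bigger expression instead of computing a number.
    def __xor__(self, other):
        return Expr("xor", (self, other))

    def __add__(self, other):
        return Expr("add", (self, other))

    def __and__(self, other):
        return Expr("and", (self, other))


def sym(name):
    """Create a symbolic (unknown) value with the given name."""
    return Expr("sym", (name,))


def evaluate(expr, env):
    """Evaluate an expression tree once concrete values are supplied."""
    if isinstance(expr, int):
        return expr
    if expr.op == "sym":
        return env[expr.args[0]]
    a, b = (evaluate(arg, env) for arg in expr.args)
    return {"xor": a ^ b, "add": a + b, "and": a & b}[expr.op]


def solve_byte(expr, name, target):
    """Brute-force every byte value that makes expr equal to target."""
    return [v for v in range(256) if evaluate(expr, {name: v}) == target]


# "Run" a made-up check on an unknown key byte; nothing is computed yet,
# the result is just an expression in terms of `key`.
key = sym("key")
check = ((key ^ 0x5A) + 7) & 0xFF

# Ask which key values make the check come out as 0x3C.
solutions = solve_byte(check, "key", 0x3C)
print([hex(v) for v in solutions])
```

The point is only that the program's arithmetic is recorded rather than executed, so the question "which input gets me here?" can be asked afterwards.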

As an example, say you're doing a CrackMe. Instead of reversing the entire algorithm, you could work out how the stack is set up and replicate this in angr. You could then point angr at a start address and tell it to reach a certain end address - the 'success' one. Once it gets there, you have the state of the program stored as a Python object and can tell angr to solve for the input that led to this state - the key. There are plenty of examples of exactly this.
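
A minimal sketch of that workflow might look something like the following. The function name `solve_crackme`, the binary path and both addresses are hypothetical placeholders, and for simplicity it starts from the program's entry point and assumes the key is read from stdin, rather than replicating the stack by hand at a chosen start address.

```python
def solve_crackme(binary_path, success_addr, failure_addr):
    """Ask angr for stdin contents that reach success_addr.

    All three arguments are placeholders you would take from your own
    disassembly; nothing here is specific to a real binary.
    """
    import angr  # third-party: pip install angr

    # Load the binary without pulling in shared libraries, to keep the
    # analysis smaller.
    project = angr.Project(binary_path, auto_load_libs=False)

    # Start from the program entry point with symbolic stdin.
    state = project.factory.entry_state()
    simgr = project.factory.simulation_manager(state)

    # Explore until a state reaches the 'success' address, discarding
    # any path that hits the 'failure' address.
    simgr.explore(find=success_addr, avoid=failure_addr)

    if not simgr.found:
        return None

    # Concretise what was fed to file descriptor 0 (stdin) on that path.
    return simgr.found[0].posix.dumps(0)
```

Usage would be along the lines of `solve_crackme("./crackme", 0x400800, 0x400820)`, with the addresses replaced by whatever your disassembler shows for the success and failure branches.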

This is a very powerful tool when used correctly. That's the catch though - learning angr is no simple task beyond the most basic of examples such as the one I described, and the angr documentation is very lacking at the moment. It is being worked on, however, and in the 5 weeks I spent with it, the documentation was actively updated and improved. Definitely worth a look.

gdb

Chances are you already have the GNU debugger installed, especially if you've ever written some C. It's quite a bare-bones debugger and contains everything you'd expect - breakpoints, memory dumps, register views etc. - but the reversing experience is very clunky and annoying to navigate in my opinion - you simply need to keep your eyes on more things than gdb is willing to give in a nice view at once. Its age does mean that any information is pretty quick and easy to find, which is good.
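
One way to take some of the pain out of repeatedly printing registers and memory is to let gdb run a fixed list of commands non-interactively. The sketch below only builds the command line; the helper name `gdb_batch_command`, the `./a.out` target and the choice of commands are my own assumptions, not anything from a particular workflow.

```python
import shlex


def gdb_batch_command(binary, breakpoint="main", stack_words=16):
    """Build a gdb invocation that stops at a breakpoint and dumps state."""
    commands = [
        f"break {breakpoint}",     # stop at the function of interest
        "run",                     # start the program
        "info registers",          # register view
        f"x/{stack_words}xg $sp",  # dump 8-byte words from the stack pointer
        "backtrace",               # where are we?
    ]
    argv = ["gdb", "-q", "-batch"]  # quiet, exit once commands finish
    for command in commands:
        argv += ["-ex", command]    # each -ex runs one gdb command
    argv.append(binary)
    return argv


argv = gdb_batch_command("./a.out")
print(shlex.join(argv))
```

The resulting list can be passed straight to `subprocess.run(argv)`, or the printed line pasted into a shell.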

gdb-peda

gdb can be extended with scripts, and peda is a Python script that aims to add more on top of the gdb base. I think it's become popular enough to come as default on Kali. This adds a few commands which prove useful in reverse engineering and exploit development, and it provides extra information such as register views and a printout of the stack by default. It also has colours.

That said, I don't really like gdb-peda. I personally feel as though it suffers many of the same problems as gdb, while also making the output cluttered without it being that useful (the stack printout doesn't show the entire stack of a function, for example). Still, I have a couple of coworkers who swear by it, so give it a try.

radare2

radare2 is a terminal-based tool that allows for both static and dynamic analysis (use the -d switch for the debugger! I've had to point this out to a couple of people). I personally really like it, and it's the best terminal option in my opinion.

When used statically, the 'analyse all' command (aa) can be used to give a text output not dissimilar to IDA's. From there on you can rename variables and functions to your heart's content. It even has ASCII graphs, though I personally found them a bit too awkward to use in the same manner as I would with a GUI.
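
As a rough sketch of a scripted static pass, the snippet below builds an r2 command line that runs 'aa', lists the functions it found and prints the disassembly of one of them. The helper name `r2_batch_command`, the `./a.out` target and the assumption that the binary has a `main` symbol are all illustrative.

```python
import shlex


def r2_batch_command(binary, function="main", debug=False):
    """Build an r2 invocation that analyses a binary and prints one function."""
    commands = [
        "aa",                 # analyse all
        "afl",                # list the functions that were found
        f"pdf @ {function}",  # print disassembly of the chosen function
    ]
    # -q: no prompt, quit after the -c commands have run
    argv = ["r2", "-q", "-c", "; ".join(commands)]
    if debug:
        argv.append("-d")     # open the binary in the debugger instead
    argv.append(binary)
    return argv


static_argv = r2_batch_command("./a.out")
print(shlex.join(static_argv))
```

For anything beyond one-off commands, the r2pipe Python bindings are probably the more comfortable route, but this keeps the example free of extra dependencies.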

The debugger provides pretty much all the options you could require, with a sensible syntax. Every function is documented within the program, and this help is easy to access as well. There's also the radare2 book in terms of documentation. Together these resources are very useful and have eventually answered pretty much every question I've had, but they are also basically the only documentation I can find. Googling questions rarely got me results.

edb

edb is a Qt4 (5?) app that very clearly takes a lot of inspiration from OllyDbg, right down to the keyboard shortcuts. Having used Olly all week I was going to write how edb didn't have as many features, but honestly after giving it a quick look the two seem incredibly similar. edb also comes with some decent plugins by default, such as a ROP tool. The creator himself says it is not a full release as the documentation is lacking, so keep that in mind. Otherwise this looks pretty good.

Additional Tools

Reverse engineering isn't just about reading assembly, and there are a few more tools available to Linux users that can be of use.

* 'strings' will dump all the strings in a program, which is useful for finding constants.
* 'strace' provides all the system calls a binary makes.
* 'xxd' can be used for hexdumps.
* Any good scripting language such as Python or Perl can be put to good use for printing binary constants, doing quick hex calculations in the terminal, etc.
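
As an example of the scripting point, here is a small Python sketch that approximates what 'strings' and 'xxd' do, plus a one-line hex calculation. The function names and the sample bytes are made up for illustration, and the real tools have more options (encodings, offsets, tab handling) than this covers.

```python
import re


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len long, like 'strings'."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def hexdump(data, width=16):
    """Return xxd-style lines: offset, hex bytes, printable characters."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        text = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        lines.append(f"{offset:08x}: {hex_part:<{width * 3 - 1}}  {text}")
    return lines


# Made-up bytes loosely resembling the start of an ELF file.
sample = (
    b"\x7fELF\x02\x01\x01\x00"
    + b"\x00" * 8
    + b"/lib64/ld-linux-x86-64.so.2\x00\x01\x02hi\x00"
)

print(extract_strings(sample))
print("\n".join(hexdump(sample)))

# Quick hex arithmetic in the same session.
print(hex(0x35 ^ 0x5A))
```

To run it against a real file, read the bytes with `open(path, "rb").read()` and pass them in place of `sample`.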

Practice

Reverse engineering is a hard, long and very thought-intensive process a lot of the time, so practice is always good. The RPISEC Modern Binary Exploitation course materials are free online, and provide a Linux VM with gdb-peda and radare2 to try out the challenges on. You could also get the files from GitHub and run them locally if you prefer other tools.

Further Reading

* RE Wiki
* Reverse Engineering for Beginners
* /r/ReverseEngineering
* /r/malware

Thanks for reading! Feel free to PM any questions, and give any suggestions.
venam
Administrators
Wonderful and complete post.

I only tried two in the list you gave: GDB and OllyDBG.

GDB is a pain to learn, I keep forgetting the commands and I rage because I'm not able to express myself correctly with it.
However GDB has very powerful features as shown in the nullprogram blog

Sadly I love OllyDBG and I keep running it on Linux to debug windows binaries.
There is no equivalent on Unix.
xero
Long time nixers
so much good info here. thanx kirby! i have a lot of new tools to look into now!
acg
Members
I've used edb and radare2, radare2 being pretty interesting.

Thanks for posting these tools, I'll check the others.
io86
Members
Very nice post, kirby.

Gdb always comes in handy, Olly is a must have on Windows, I guess. I like r2 more and more each day. As for edb, I have it installed, but I never really used it.

There is also the almighty IDA for Linux too. Qira looks nice, but I haven't tried it yet. For quickly visualizing a file I like Vix.

Some learning resources that might come in handy with these tools: RE Wiki, Malware Analysis - CSCI 4976 and of course the legendary Lena.
kirby
Long time nixers
Yeah, it's early days for me. Hopefully I can update this post in the near future with some more complex features.

(07-04-2016, 12:36 AM)venam Wrote: Sadly I love OllyDBG and I keep running it on Linux to debug windows binaries.
There is no equivalent on Unix.

(08-04-2016, 10:06 AM)io86 Wrote: Olly is a must have on Windows, I guess.

Yeah, I've been using more and more of it and it's a very nice tool. Massive shame Linux doesn't have it.

(08-04-2016, 10:06 AM)io86 Wrote: There is also the almighty IDA for Linux too.

Am I correct in thinking you have to pay for the Linux version? That's what it seemed like at a first glance to me. I installed the demo version on Windows last night and it's a great tool, very important to have and produced the 'nicest' assembly output of anything I tried.

EDIT: Didn't see that page and thought they'd specifically ported the IDA interface to Linux exactly as it appears in Windows. I can see why they went for console but it's a little disappointing.

EDIT EDIT: Looks like I'm wrong again, as of 6.0 IDA Linux does have a proper interface.

(08-04-2016, 10:06 AM)io86 Wrote: Some learning resources that might come in handy with these tools: RE Wiki, Malware Analysis - CSCI 4976 and of course the legendary Lena.

These look great, thanks a lot!
rain1
Members
Yeah I love the idea behind qira: each instruction run is like a git commit. It uses qemu to record all the information and then you can search through it.
The rr debugger works on the same sort of idea: http://rr-project.org/
The qira presentation he gave was great fun: https://www.youtube.com/watch?v=eGl6kpSajag

For a while I had been frustrated with gdb (crashing and/or the UI mangling itself) and started thinking about a scriptable debugger - it turns out plan9 has one: http://plan9.bell-labs.com/sys/doc/acidpaper.html
I still think this idea could be really profitable (especially compared to piping your commands into gdb..) but it's a lot of effort working with ptrace to actually implement one.
venam
Administrators
Just bumping this thread with this tool:

https://github.com/das-labor/panopticon
kirby
Long time nixers
(10-05-2016, 02:22 AM)venam Wrote: Just bumping this thread with this tool:

https://github.com/das-labor/panopticon

That looks pretty good. Never dealt with Rust before so I'll have to see how easy it is to build. Their webpage isn't hugely descriptive mind.

If anyone is looking for OllyDBG and anything else Windows offers, I've found you can just download one of these IE test VMs, set up an environment, and snapshot it. Seems to work pretty well so far for me? I'll update in the future if this backfires on me.
josuah
Long time nixers
It is a nice collection you have there! I'm adding one, but I do not know if it works well: I've never done any reverse engineering.

http://rr-project.org/
kirby
Long time nixers
Hey guys, I made a big update to the OP, completely rewritten. I thought I'd just save the original here for posterity.

So for the past few weeks I've been diving into the field of reverse engineering, both as a hobbyist and in preparation for an upcoming internship I have at a security firm. I've been mainly reading the book Reverse Engineering for Beginners. I hit a snag where I'd wanted to test an example given, but the example was given in OllyDBG, a Windows application. I discussed the situation with xero, and after concluding that Windows had all the good stuff (and with good reason), I decided to look into what was available on Linux and write a post in the style of venam.

My test is simple and entirely unscientific - to run this sample code and watch the memory to prove to myself why the program outputs what it does. The expected value is "a = 1, b = 2, c = 3", but this varies from system to system. This is not a particularly in-depth examination, but I felt it would give me a decent overview of regular usage and some way to compare between different applications.

Unless it wasn't already clear, I am a complete novice in this field. If I say anything incorrect, please feel free to comment on it. Forums are for discussion after all, and this has been as much a learning experience for me as I hope it is for you.

gdb

The first and most obvious is gdb. Installation is easy as it should be in every package manager ever, and the binary (at least on my system) comes in at about 6.2M. Accomplishing the task, while possible, was hampered by the lack of a window to view the memory at all times. Instead I had to print the memory at certain points in the program. This worked fine for my small example, but I feel like for larger projects it would get quite unmanageable having to juggle all the addresses. I'd probably need at least a pen and paper to jot things down. Nonetheless, gdb is useful for smaller projects, and can often be used as a backend for larger programs, so it's worthwhile at least tinkering with it a bit. I wanted to use gdb as a 'base' to compare against the other programs, so I made a painfully slow WebM to try and demonstrate.

radare2

Next on my list was radare. The site had a notice recommending radare2, so I used that. I had to compile this but the process was a simple './configure' followed by the usual 'make' / 'make install'. Final binary size came out as 96K. The website had screenshots of GUIs so I was very confused when I ran 'radare2 a.out' and was given a command line. I still don't really understand what's going on with any graphical radare interface, I think it's 'in development'. The application provides a shell like gdb and various 'visual' modes. I ended up flicking between both while trying to perform my analysis. The application was a tad confusing at first, but it is well documented both online and within the program itself, and this got me up to speed quite quickly. I ended up quite liking radare - it feels like a souped up gdb, and that's not a bad thing. I'm still not entirely sure how good a command-line interface would be for larger projects, but it's probably the best you can get, and it certainly tries to cram in as many features as it can. I'd recommend it.

edb

The first graphical application, Evan's Debugger, took me by surprise. It was a massive pain to install - it's a Qt4 app, and not in the Debian repos. This meant I had to install all the dependencies by hand, including a couple from unstable because the versions in stable were too old, though that's not really his fault I guess. What was his fault was the lack of any indication that I also had to install Qt5 libs. I'll be honest, I don't really know what's going on with Qt, but cmake complained when I didn't have them installed and worked once I did. The final binary was 4.1M.

Opening the application and running my test binary gave me an obnoxious black window, which further hampered my first impression. It turns out however this was their console output and it was dwm that was resizing the window. I imagine the default size is sane. This is one of the many features it borrows from OllyDBG and it's clear this program is trying to emulate it as closely as possible. Despite my lackluster first impression, it was actually very pleasant to use. It gives me all the features I would expect, and I was able to accomplish my specific task even easier with explicit 'goto RBP' instructions in the stack viewer. I did have to toggle the word length, but this was an easy task as well. There are a couple of design choices I don't fully understand - I don't see what 'analysing' a region does, for example - and documentation is very sparse. He does mention this is why it is still in version 0.x. I would highly recommend this tool if you're fine installing the Qt libraries that come with it.

ddd

I didn't spend very long on ddd - I'd already used it briefly once and discarded it for being useless, and the same holds true now. It featured some nice windows, but they didn't display anything you couldn't watch with 'gdb -tui', and at the end of the day I just ended up entering gdb commands anyway. It looks incredibly dated and there's basically no reason to use it over gdb, which it's just a frontend for anyway.

OllyDBG

As an aside, I installed OllyDBG on my Windows partition to give it a look. At first I found it quite hard to get to grips with, but I'm not really sure if that's the fault of OllyDBG or Windows' EXE format. Once I got to grips with it, it was the best application I used, and offered useful features such as text descriptions of what each memory address was. It's a shame it isn't on Linux, and it might be worth spinning up a Windows VM for if none of these other options do what you need. The experience also showed how much edb is 'inspired' by it, however, so that's the best replacement in my opinion.

Conclusion

Beyond these specific apps, there are a lot of command-line tools available in *nix that can prove useful in reverse engineering - things like 'strings' or 'strace'. For the kind of reverse engineering I was looking for, I feel edb was the best, with a shout out to radare for being a very full featured command-line alternative. I hope this has been a useful guide to anyone interested, and feel free to comment on any of the apps I've mentioned or suggest more below.
venam
Administrators
Now, this is quite a good descriptive post from someone who knows what he needs.
Thanks a lot for the update!
jkl
Long time nixers
(08-04-2016, 10:06 AM)io86 Wrote: Olly is a must have on Windows, I guess.

Actually, x64dbg is the better OllyDbg already.
xero
Long time nixers
great update kirby. now i have a whole new suite of tools to checkout ;D
thlst
Members
There is also lldb, it's kind of a replacement for gdb.

This is a quote from the page:

Quote: In order to achieve our goals we decided to start with a fresh architecture that would support modern multi-threaded programs, handle debugging symbols in an efficient manner, use compiler based code knowledge and have plug-in support for functionality and extensions. Additionally we want the debugger capabilities to be available to other analysis tools, be they scripts or compiled programs, without requiring them to be GPL.
venam
Administrators
Kirby can you do a review of PINCE.
It's a frontend to gdb.
kirby
Long time nixers
This is a long time coming, sorry.

angr

angr is a Python symbolic execution framework. Symbolic execution is a very interesting field and not one that any of the other tools here provide to my knowledge. The Wikipedia page likely explains it better than I can, but in essence it involves traversing a program and storing values as expressions of other values. This allows the user to perform constraint solving to obtain possible values for unknown variables.

As an example, say you're doing a CrackMe. Instead of reversing the entire algorithm, you could work out how the stack is set up and replicate this in angr. You could then point angr at a start address and tell it to reach a certain end address - the 'success' one. Once it gets there, you have the state of the program stored as a Python object and can tell angr to solve for the input that led to this state - the key. There are plenty of examples of exactly this.

This is a very powerful tool when used correctly. That's the catch though - learning angr is no simple task beyond the most basic of examples such as the one I described, and the angr documentation is very lacking at the moment. It is being worked on, however, and in the 5 weeks I spent with it, the documentation was actively updated and improved. Definitely worth a look.

-------

I will look at PINCE in a week or two, it tries to download a newer version of GDB and I'd rather not mess around with my Debian install just for that, and my internet is too bad for me to bother setting up a VM at the moment.
venam
Administrators
This is an excellent thread from 2016 about reverse engineering tooling on Linux.
Since then, many things have evolved. Linux is probably used much more in reverse engineering now.


The ghidra tool was released in 2019.
Many gdb extensions are getting popular and are very useful, such as pwngdb, gef, and peda.

Another big change today is that pre-made Docker boxes with a bunch of tools installed are getting popular, as they remove the hassle of setting up the environment yourself or downloading a VM ISO. Let's mention some: pwnbox, reverse-me, and androidre.

Have you ever done reverse engineering on Linux, or even just debugged a C-based program using gdb, or played some CTFs or wargames? Have you tried any of the above software, and what do you think of it?
z3bra
Grey Hair Nixers
I recently attempted some reverse engineering. Didn't finish the project though.

I own one of those TomTom sport watches, which tracks different metrics and exports it all to "the cloud" using a proprietary tool.
At some point, that tool stopped working on Linux and I couldn't see my cool graphs showing all the pain I go through doing sports.

I used mitmproxy to catch all the (encrypted!) traffic between the software and the cloud, and tried to figure out what was happening under the hood. That was a great experience, and I managed to upload some tracks extracted from my watch to the cloud. Extraction was done using an opensource CLI tool, so all was great.

I gave up on it when I realized that the software has a way to generate some kind of "token" that must be registered on the official website before synchronizing data. By that point I had managed to fix the software so it runs again on Linux, so I just use that because it's easier and gets the job done.

Was pretty fun though !
opFez
Members
(19-08-2020, 03:42 AM)venam Wrote: The ghidra tool was released in 2019.
Many gdb extensions are getting popular and are very useful, such as pwngdb, gef, and peda.

Wow, gotta check some of these out. Thanks for the links!
freem
Members
About the tools already mentioned, I used (long ago) ollydbg and IDA. Several others too: w32dsm (or something like that, a debugger, older than olly IIRC), softice/winice (ring 0 debuggers for windows up to XP I think. There was rr0d, Rasta Ring 0 Debugger that was doing the same for windows, linux and freebsd, but never tried it)... those are only for debugging.
Then there was a shitload of resource analyzers, tools to guess compiler or packer used on a specific binary... that I don't remember well.
And of course, hexadecimal editors. My favorite back then was, without a doubt, WinHex. I even used it to recover deleted files :)

I'm no longer in reverse engineering, since I now usually have the source code. Except when I want to have fun, but I rarely play with those things now.
Still, I use some tools that fill the holes on linux, because I write code and it's a need to have some insight.
Quality is usually lower, but hey... For winhex, there is wxhexeditor.
For debuggers, I usually use gdb, with the cgdb frontend. I tried the *cough* GUI mode of GDB: it sucks. Really. Maybe someday I'll write my own frontend, that's something I've had in mind for a long time, but then I think I'd try to learn to use LLDB first, since I'm pretty sold on most of the LLVM stuff: clang is so much better than gcc, libc++ just kills libstdc++, so maybe LLDB is better than GDB too.

Radare2 has been on my TODO list of stuff to try for a long time, too.
venam
Administrators
(21-08-2020, 05:24 AM)freem Wrote: For debuggers, I usually use gdb, with the cgdb frontend. I tried the *cough* GUI mode of GDB: it sucks.
There have been many attempts to make gui for GDB. This one comes to mind, and it isn't bad, though I tried it long enough.

As for hex editors, I keep finding myself struggling on Unix systems. I haven't found any decent one that handles all the features I want. When pasting hex they all crumble if there's a space or newline in the wrong place, when comparing sections of file they go crazy, none of them support good diff between hex values. It's lacking.
ckester
Members
I currently use the nemiver frontend to gdb. It's OK, but nothing to shout about.

I used cgdb and ddd in the past but (like me) they're both showing their age. Clunky.

Been meaning to try lldb and the rest of the clang/llvm suite but haven't gotten around to it yet.
jvarg
Members
(21-08-2020, 05:56 AM)venam Wrote:
(21-08-2020, 05:24 AM)freem Wrote: For debuggers, I usually use gdb, with the cgdb frontend. I tried the *cough* GUI mode of GDB: it sucks.
There have been many attempts to make gui for GDB. This one comes to mind, and it isn't bad, though I tried it long enough.

As for hex editors, I keep finding myself struggling on Unix systems. I haven't found any decent one that handles all the features I want. When pasting hex they all crumble if there's a space or newline in the wrong place, when comparing sections of file they go crazy, none of them support good diff between hex values. It's lacking.

For GDB, the best GUI in my opinion was insight [1]. It's even possible to use the usual suspects PEDA [3] and PWNDBG [4] within it, but it's quite outdated and hard to build :/.
An interesting debugger is 'rr', which supports back-in-time debugging [5].
My favorite hex editor is the 010 Editor, which supports templates, which is really helpful IMO [2].


[1] https://sourceware.org/insight
[2] https://www.sweetscape.com/010editor/
[3] https://github.com/longld/peda
[4] https://github.com/pwndbg/pwndbg
[5] https://rr-project.org/