The Art of Obfuscation



🚧This is still under construction. Please share any feedback you might have! (and yes, I used marquee)🚧

Throwback to kindergarten obfuscation

PoV: You’re 10 years old. Wearing a uniform too tight for you, trousers above your waist but not self-conscious enough to care, writing an exam with your Flora pencil. You don’t need the extra 5 marks from the Apsara pencil - you’re a first-bencher, you can’t get 105/100. But you might get a star sticker.
Mummy said don’t copy and don’t show anyone. Usually you’d let your friend copy from you, but you remember she didn’t give you the foreign biscuit “oreo” last week. What do you do when faced with this trauma?

You decide to be a “good” kid and make life difficult for your friend.

  • Write with a bad handwriting (there goes the 5 marks)
  • Answer questions in a jumbled order
  • Write a wrong answer, cross it out and write the right answer later

This is obfuscation: intentionally making data unintelligible and difficult to understand.

This is usually used in source code. And nerd goodie friends.

Big boy obfuscation

Now you’re all grown up and working in a tech company, but…some things never change. There are no exam sheets now, but you do have the design docs you need to write and your IDE. 😈 Here are some things you can do to trip people up:

1️⃣ Change file and folder names in Google Docs.
Eg: change "payslips_folder" name to "documentation_folder" (nobody reads the docs anyway), "Important meeting summaries" to "Recycle bin" (increases chances of it being read though). Or if you have one Workdrive folder, create 10 more dummy folders and rename them all to numbers. Anybody will give up after checking a few folders.

2️⃣ Running programs on unusual ports or URLs.
Eg: bvdagnscdbasc.netlify.app instead of todo-app-new.netlify.app, localhost:11263 instead of localhost:8000 etc.

3️⃣ In code, renaming variables to misleading or vague values.
Eg: username to u, userInput to str, login_id to num, accounts_extension_due to accsexdue.

4️⃣ Splitting values in code or using weird short forms so that it’s harder to search.
Eg: You can modify text such that it’s easy to read for people but won’t show up when they do a Ctrl+F search. str = 'default_password' could be str = 'def pwd' or str = 'de' + 'faultp' + 'ass'.concat('word') which makes it harder to search for but still works. Kind of like what we do to our passwords to feel secure - pick a common word, add an a few @$&s and a number. Reads like the original word but just so much securitty, amirite?

In all these examples, anybody with enough resources and time on their hands will still be able to figure it out.
People can open every Google Drive folder and check for files, they can try every URL combination, they can read the whole Google doc instead of searching for certain words.
We’re just making it harder for people trying to figure it out, hopefully discouraging people from putting in that effort.

This is called Security through obscurity; note that obfuscation compliments security by increasing the barrier for someone trying to understand and break into your software, but is not a replacement for security or encryption.
Encryption and other security measures are the lock on your door; prevents breaches. Obfuscation is adding a maze to get to your door hoping most people will skip your house and move on to easier targets.

Try some basic obfuscation to put in your Writer doc!

(unless you're in my team. don't do this guys, i like my ctrl+f. sfic)

Source code obfuscation

Most of the above examples are pretty simple; but obfuscation for computers happen on a whole other level.
Computers do not need any context and will just process whatever you give them. So when it comes to source code, it’s possible to transform it to what looks like gibberish to us but perfectly normal for computers.

a short example of Javascript code converted into gibberish using this tool - but it works, so it makes no difference to the computer!
Obfuscated code example

At a high level: we write code in plain english, which after some processes (like compiling) are turned into the 0s and 1s that the computer understands, called a binary format.
When apps are distributed in these binary or native formats it which takes a lot of effort to convert into something human readable to understand how the code actually works. The huge chunk of the discipline reverse engineering is dedicated to this.

Popular tools are Ghidra and IDA.

An example of obfuscated C++ code from this resource:
Obfuscated code example2


See also: Security through obscurity
Try some more complicated obfuscation online - Online JS obfuscator