A while ago I saw a bit of a joke post about “the absolute state of PHP” on my Facebook timeline. The post involved using the concept of variable variables to write virtually unreadable code.
How does this work?
In short, variable variables can be dereferenced an unlimited number of times. This means you can create a chain of variables that reference one another, and with a long enough chain you can represent any character you need. By daisy-chaining these references with the concatenation operator, you can create entire scripts.
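As a minimal sketch of the mechanism (the four-variable chain and the $word name are chosen purely for illustration), each extra $ follows one more link in the chain, so a single starting variable can reach every value in it:

```php
<?php
// Each variable holds the *name* of the next variable in the chain.
$a = 'b';
$b = 'c';
$c = 'd';
$d = 'e';

// $a    is the value of $a                      => 'b'
// $$$a  is the variable named by $$a, i.e. $c   => 'd'
// $$$$a is the variable named by $$$a, i.e. $d  => 'e'
$word = $a . $$$$a . $$$a; // concatenation spells b-e-d

echo $word, PHP_EOL;
```

The number of dollar signs is simply the distance along the chain, which is why the obfuscated output grows so quickly.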
I decided to put this to the test.
The Solution
Here’s an example of what you can do with this obfuscator:
<?php
extract(array_combine(range('a', 'y'), range('b', 'z')));
$z = 'A';
extract(array_combine(range('A', 'Y'), range('B', 'Z')));
$zero = '0';
$one = '1';
$two = '2';
$three = '3';
$four = '4';
$five = '5';
$six = '6';
$seven = '7';
$eight = '8';
$nine = '9';
$plus = '+';
$equals = '=';
$dolluh = 'a'; //...we couldn't use "dollar" because we have no reference to 'a' yet!
$space = ' ';
$semicolon = ';';
$underscore = '_';
$singlequote = '\'';
(($$a . $$$$$$$$$$$$$$$$$a . $$$$a . ${$$$a . $$$$$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$a . $$$$$$$a} . $$$$$$$$$$$$$$$$$$$a . $$$$a . ${$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$a . $$$a . $$$$a . $$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$a . $$a . $$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$a . $$$$a} . $$$$$a . $$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$a . $$a . $$$$$$$$$$$$$$$$$$$a . $$$$$$$$a . $$$$$$$$$$$$$$a . $$$$$$$$$$$$$a)("",($a . ${$$$a . $$$$$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$a . $$$$$$$a} . $$$$$$$$$$$$$$$$$$a . $$$$a . ${$$$$$$$$$$$$$$$$$$a . $$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$a} . ${$$$$$a . $$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$a} . ${$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$a . $$$a . $$$$a . $$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$a . $$a . $$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$a . $$$$a} . $$$a . $$$$a . $$a . $$$$$$$$$$$$$$a . $$$a . $$$$a)($$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$a . $a . $$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . ${$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$a . $$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$a} . $$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . ${$$$$$$$$$$$$$$a . $$$$$$$$$$$$$a . $$$$a} . 
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . ${$$$$$$$$$$$$$$a . $$$$$$$$$$$$$a . $$$$a} . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . ${$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$a . $$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$a} . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$a . ${$$$$a . $$$$$$$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$a . ${$$$a . $$$$$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$a . $$$$$$$a} . $$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$a})))();
An executable version can be found here; this example writes PHP IS A FLAWLESS LANGUAGE BTW to stdout.
How does the obfuscation work?
By abusing variable variables, as well as using ${$x} to reference the variable whose name is stored in $x, we can build entire scripts. If you unwrap the script, it actually looks like this:
// The heredoc body is base64-encoded in the real script; it is shown decoded here for ease of reading
(create_function("", base64_decode(<<<EOF
echo "PHP IS A FLAWLESS LANGUAGE BTW";
EOF
)))();
This creates an immediately executing function that will output PHP IS A FLAWLESS LANGUAGE BTW.
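To make the ${...} lookup concrete, here is a minimal sketch that reuses the alphabet chain and the $dolluh variable from the script above. The call to ceil and the $letterA and $result names are illustrative assumptions, not part of the original payload.

```php
<?php
// Alphabet chain from the script above: $a = 'b', $b = 'c', ... $y = 'z'.
extract(array_combine(range('a', 'y'), range('b', 'z')));
// No link in the chain has the value 'a', so it lives in a named variable.
$dolluh = 'a';

// ${ expr } reads the variable whose name is the value of expr.
// The expression spells d-o-l-l-u-h, so this reads $dolluh and yields 'a'.
$letterA = ${$$$a . $$$$$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$a . $$$$$$$$$$$$$$$$$$$$a . $$$$$$$a};

// A parenthesised concatenation can be called as a function name.
// This one spells c-e-i-l, so the line below calls ceil(1.2).
$result = ($$a . $$$$a . $$$$$$$$a . $$$$$$$$$$$a)(1.2);

var_dump($letterA, $result);
```

The full script does the same thing twice: once to spell create_function and once to spell base64_decode.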
Why not use eval?
In theory, eval() is the right pick. It’s what most other PHP obfuscators use under the hood. But there’s one major problem: eval is a language construct, not a function. This sounds like a technicality, but it means you cannot call eval through a string that holds its name, which is how every call in the obfuscated script is made. PHP, for all of the weird flexibility it allows, does not let you invoke language constructs as variable functions.
Since the goal was to obfuscate everything, that doesn’t work for us.
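The difference is easy to check. The following sketch (variable names are illustrative) calls a real function through a string, then shows that no function called eval exists for PHP to look up:

```php
<?php
// A function name held in a string works for real functions.
$decode = 'base64_decode';
$plain = $decode('aGVsbG8='); // base64 for "hello"
echo $plain, PHP_EOL;

// eval is a language construct, so there is no function of that name.
$construct = 'eval';
var_dump(function_exists($construct)); // bool(false)
var_dump(is_callable($construct));     // bool(false)

// Calling it through the string fails; PHP 7+ throws an Error.
$threw = false;
try {
    $construct('echo 1;');
} catch (Error $e) {
    $threw = true;
    echo get_class($e), ': ', $e->getMessage(), PHP_EOL;
}
```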
create_function was suggested to me by my friend & colleague Pete, after I discovered that you couldn’t use eval in this way. The only downside is that create_function has been deprecated since PHP 7.2 and was removed entirely in PHP 8.0, so the example only runs on PHP 7.x.
Why use base64_encode?
When you think about what this obfuscator is doing - character substitution - you can begin to understand why we used a known encoding. Base64, as its name suggests, has a limited alphabet of 64 characters (plus = for padding), all of which can be represented in plaintext. On top of that, the vast majority of characters in the base64 alphabet fall into consecutive runs ([a-z], for instance), so we can condense our variable generation a lot.
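The substitution step can be sketched roughly as follows. The obfuscator's own source is not shown in this post, so the function names and the NAMED_CHARACTERS map are hypothetical, derived only from the variable preamble in the example above. That preamble defines no variable for /, so this sketch throws if the encoded payload contains one.

```php
<?php
// Characters reached through a named variable, mirroring the preamble above.
const NAMED_CHARACTERS = [
    'a' => 'dolluh', '0' => 'zero', '1' => 'one', '2' => 'two',
    '3' => 'three', '4' => 'four', '5' => 'five', '6' => 'six',
    '7' => 'seven', '8' => 'eight', '9' => 'nine',
    '+' => 'plus', '=' => 'equals',
];

// Returns the PHP expression that evaluates to a single character.
function chainExpression(string $char): string
{
    $code = ord($char);
    if ($code >= ord('b') && $code <= ord('z')) {
        // 'b' is one $ away from a, 'z' is 25 away.
        return str_repeat('$', $code - ord('a')) . 'a';
    }
    if ($code >= ord('A') && $code <= ord('Z')) {
        // The chain continues through $z = 'A', so 'A' is 26 away.
        return str_repeat('$', 26 + $code - ord('A')) . 'a';
    }
    if (!array_key_exists($char, NAMED_CHARACTERS)) {
        throw new InvalidArgumentException("No variable defined for '$char'");
    }
    // Spell the variable's name with chain lookups and wrap it in ${ ... }.
    return '${' . spell(NAMED_CHARACTERS[$char]) . '}';
}

// Returns a concatenation expression that evaluates to $text.
function spell(string $text): string
{
    return implode(' . ', array_map('chainExpression', str_split($text)));
}

// Example: the expression for a tiny base64-encoded payload.
echo spell(base64_encode('echo 1;')), PHP_EOL;
```

Because every base64 character maps to either a run of dollar signs or one named lookup, the preamble stays small no matter how large the payload is.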
What is the use of this?
Code produced by this obfuscator can’t be statically analysed (eg: by an IDE) - chuck the example script in PHPStorm and try and tell me what the script is outputting! The upshot is that although your code can be unwrapped reasonably easily at runtime (eg: wrap it in var_dump), static analysers will have a lot of trouble doing the same.
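As a small illustration of that runtime unwrapping (the fragment and the $hidden name are made up for this sketch), dumping a concatenation instead of calling it reveals the string it spells:

```php
<?php
// Alphabet chain from the example: $a = 'b', $b = 'c', ... $y = 'z'.
extract(array_combine(range('a', 'y'), range('b', 'z')));

// An obfuscated fragment; reading it means counting dollar signs by hand.
$hidden = $$$a . $$$$a . $$a . $$$$$$$$$$$$$$a . $$$a . $$$$a;

// Letting PHP evaluate it and dumping the result gives the answer at once.
var_dump($hidden);
```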
This obviously has a lot of malware implications, but there are legitimate use cases for code obfuscators. I didn’t want to post something like this given the implications, but there is a very clear learning experience from all of this.
In Conclusion
There are a lot of ways to write code obfuscators for PHP, but most of them are easily statically analysed (& IDEs can help you make sense of them in rapid time). This form of obfuscation, despite needing an initial “code map”, makes code almost impossible to read without specialist tools or prior knowledge of the technique. There is no really clear way of detecting whether a script has been obfuscated this way, except maybe running something like grep -E '\${3,20}[a-zA-Z]' file.php (as a primitive example) to spot the mass of variable variables, which shouldn’t appear in legitimate code.
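The same primitive check can be sketched in PHP itself. The function name is hypothetical, and the pattern drops the upper bound of 20 used in the grep example so that a run of any length counts once:

```php
<?php
// Counts runs of three or more dollar signs followed by an identifier character.
function countDollarRuns(string $source): int
{
    // \$ matches a literal dollar sign; {3,} requires at least three in a row.
    $matches = preg_match_all('/\${3,}[a-zA-Z_]/', $source);
    return $matches === false ? 0 : $matches;
}

$suspicious = 'echo $$$$$$$a . $$$$b;';
$ordinary   = '$name = "x"; echo $$name;';

var_dump(countDollarRuns($suspicious)); // int(2)
var_dump(countDollarRuns($ordinary));   // int(0)
```

A non-zero count is only a hint worth a closer look, not proof of obfuscation.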