Introduction to Dharma - Part 2 - Making Dharma More User-Friendly using WebAssembly as a Case-Study

In the first part of our Dharma blogpost, we used Dharma to write grammar files to fuzz Adobe Acrobat's JavaScript APIs. Learning how to generate JavaScript code using Dharma opened a whole new area of research for us. In theory, we can target anything that uses JavaScript. According to the 2020 Stack Overflow Developer Survey, JavaScript sits comfortably in the #1 spot as the most commonly used language among the developers surveyed.

In this blogpost, we'll focus on fuzzing WebAssembly APIs in Chrome. To get started with WebAssembly, we read the documentation provided by MDN.

We'll start by walking through the basics and getting familiar with the idea of WebAssembly and how it works with browsers. WebAssembly addresses the performance limits of JavaScript by shipping pre-compiled binary code that the browser executes at near-native speed.

After we had the basic idea of WebAssembly and its uses, we started building some simple applications (Hello World!, a calculator, and so on). By doing that, we became more comfortable with WebAssembly's APIs, syntax, and semantics.

Now we can start thinking about fuzzing WebAssembly.

If we break a WebAssembly application down, we'll notice that it's made of three components:

  1. Pure JavaScript Code.

  2. WebAssembly APIs.

  3. WebAssembly Module.

Since we're trying to fuzz everything under the sun, we'll start with the first two components and then tackle the third one later.


JavaScript & WebAssembly API

This part contains a lot of JavaScript code. We need to pay attention to the syntax of the language or we'll end up with logical and syntax errors that are just a headache to deal with. The best way to minimize errors and easily generate syntactically (and hopefully logically) correct JavaScript code is to use a grammar-based text generation tool, such as Domato or Dharma.

To start, we went to MDN and pulled all the WebAssembly APIs. Then we built Dharma logic for each API. While doing so, we faced a lot of issues that could slow down or ruin our fuzzer. We'll go over these issues later in this post.

To instantiate a WebAssembly module, we use the WebAssembly.instantiate function, which takes either the binary code in a buffer or a pre-compiled WebAssembly module, and optionally an import object. Here's how it looks as JavaScript code:
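As a minimal sketch of such a call (not the snippet from our fuzzer), the example below instantiates a tiny hand-assembled module. The byte array, the exported function name main, and the empty importObject are assumptions made purely for illustration.

```javascript
// Hand-assembled module (illustrative): exports one function, "main",
// whose body is `i32.const 42`.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                   // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00,                   // version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                   // function section: one function of type 0
  0x07, 0x08, 0x01, 0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x00, // export section: "main" -> function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b,             // code section: i32.const 42; end
]);

// This module imports nothing, so an empty import object is enough.
const importObject = {};

// Passing bytes resolves to an object holding both the compiled module and its instance.
WebAssembly.instantiate(wasmBytes, importObject)
  .then(({ instance }) => console.log(instance.exports.main()))
  .catch((err) => console.error(err));
```

If the bytes are assembled as intended, this should log 42. Passing an already compiled WebAssembly.Module instead of the bytes makes the promise resolve to the instance directly.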

The process is simple: we try out the code, understand how it works, and then build Dharma logic for it. The same process applies to all the APIs. As a result, the function above can be translated to the following in Dharma:

The output should be similar to the following:

What we're trying to achieve is covering all possible arguments for that given function.
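To give a rough idea of what covering every argument means, here is a plain JavaScript stand-in for the grammar. It is not Dharma syntax, and the value pools (firstArgs, secondArgs) are hypothetical names we picked for this sketch. Dharma picks one alternative at random per expansion, while this sketch simply enumerates all of them.

```javascript
// Hypothetical value pools, each standing in for one grammar rule.
const firstArgs = ['wasmBytes', 'wasmModule', 'new ArrayBuffer(0)', 'undefined'];
// The empty string models the case where the optional argument is omitted.
const secondArgs = ['', '{}', 'importObject', 'null'];

// Build one call string for every combination of first and second argument.
function generateInstantiateCalls() {
  const calls = [];
  for (const first of firstArgs) {
    for (const second of secondArgs) {
      const args = second === '' ? first : `${first}, ${second}`;
      calls.push(`WebAssembly.instantiate(${args});`);
    }
  }
  return calls;
}

console.log(generateInstantiateCalls().join('\n'));
```

Invalid values such as undefined or null are included on purpose, since unexpected arguments are exactly what a fuzzer wants to throw at an API.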

On a side note: the complexity and length of the Dharma file have increased dramatically since we started working on this project, so for brevity we give code snippets rather than the whole file.

Coding Style

We followed a certain coding style while writing Dharma files for WebAssembly, for several reasons.

First, we wanted to differentiate our logic from Dharma's own. Dharma provides a common.dg file, which you can find at dharma/grammars/common.dg. This file contains helpful logic, such as digit, which gives you a number between 0 and 9, and short_int, which gives you a number between 0 and 65535. The file is useful but generic, and sometimes we need something more specific to our logic, so we ended up creating our own:

We also decided to go with a different naming convention so we can use the auto-complete feature of our text editor. Dharma uses snake_case for naming; we decided to go with camelCase instead.

Also, as part of our coding style, we use a prefix and postfix to annotate the logic. Take variables, for example: we start any variable with var, followed by the class or function name:

This makes the logic easier to use later and easier to understand in general.

We applied the same concept for parameters as well. We start with the function's name followed by Param as a naming convention:

Since we're mentioning parameters, let's go over an example of an idea we mentioned earlier. If a function has one or more optional parameters, we create a section for it to cover all the possibilities:

Also as part of our coding style, we used comments to divide the file into sections so we can group functions and reach a certain one easily:

That said, you can easily find a given function or parameter under its related section. This is a fairly good way to keep the file manageable. At a certain point you have to make a file for each section and group shared logic in a common file to eliminate duplication. Maybe we'll talk about this in another blog (maybe not xD).

Testing and Validation

After we finished the first version of our Dharma logic file, we ran it and noticed a lot of JavaScript errors: small mistakes that we all normally make, like forgetting a bracket or a comma. To solve these errors, we created a builder section where we build our logic:

We had to go through each line one by one to eliminate all the possible logical errors. We also created a wrapper function that wraps the code with try-catch blocks:

By doing so, we made it much easier to isolate and test the possible output.

While we were working on the Dharma logic file, we faced another issue. When your JavaScript imports something from a .wasm module (e.g. a table or a memory buffer), the module has to export it, and whatever the module imports, the JavaScript side has to provide. In brief, we built a lot of .wasm modules, each one exporting or importing what the generated JavaScript needs to test an API. An example of this logic:

For that to work, you need the following .wasm file:

So if JavaScript is looking for the main function, you should have a main function inside your .wasm module. Also, as we mentioned, there are many things to check, like imported/exported tables, imported/exported buffers, functions, and global variables. We had to combine many of them, but some, like tables, we couldn't: you can only have one in your program, either exported or imported. That said, we had to separate them into different modules and avoid some of them to reduce complexity.
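One way to check that the JavaScript side and a module agree on imports is to ask the module what it expects before instantiating it. The sketch below is an illustration under our own assumptions: the hand-assembled bytes, the env.mem import name, and the missingImports helper are hypothetical and not taken from our corpus.

```javascript
// Hand-assembled module (illustrative) that only imports a memory "mem" from "env".
const importingModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,       // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00,       // version 1
  0x02, 0x0c, 0x01,             // import section, 12 bytes, one import
  0x03, 0x65, 0x6e, 0x76,       // module name "env"
  0x03, 0x6d, 0x65, 0x6d,       // field name "mem"
  0x02, 0x00, 0x01,             // kind: memory, no maximum, minimum 1 page
]);

const importingModule = new WebAssembly.Module(importingModuleBytes);

// Each entry describes one expected import with module, name and kind fields.
const requiredImports = WebAssembly.Module.imports(importingModule);

// Hypothetical helper: list the imports that importObject fails to provide.
function missingImports(wasmModule, importObject) {
  return WebAssembly.Module.imports(wasmModule)
    .filter(({ module: ns, name }) => !(importObject[ns] && name in importObject[ns]))
    .map(({ module: ns, name }) => `${ns}.${name}`);
}

// Provide the memory the module asks for, then link it.
const envImports = { env: { mem: new WebAssembly.Memory({ initial: 1 }) } };
const linkedInstance = new WebAssembly.Instance(importingModule, envImports);

console.log(requiredImports, missingImports(importingModule, {}));
```

WebAssembly.Module.exports works the same way for the other direction, so the generated JavaScript can confirm that a function such as main is really exported before calling it.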

After finishing our first version, we went to the Chromium bug tracker, which turned out to be a great place to expand our logic with smarter, more complex tips and tricks. We used some of the snippets there as they are, and some with small modifications. It's also worth mentioning that when you search, you should apply a filter related to your area of interest. In our case, we looked into all bugs with the Type 'Bug-Security' and the component Blink>JavaScript>WebAssembly; you can use this line in the search bar.

While we were reading these issues on the bug tracker, we found this bug that our Dharma logic could have produced (if we had been a bit faster xD).

WebAssembly Module

Now that we're done fuzzing the first two components, we can move on to the last component of WebAssembly, which is the module.

Everything we did earlier was related to fuzzing the APIs and JavaScript's grammar, but we found two interesting functions used to compile a module and check its validity: compile and validate. Both functions receive the bytes of a .wasm module. The first compiles WebAssembly binary code into a WebAssembly module; the second returns whether the bytes from a .wasm module are valid (true) or not (false).
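The difference between the two functions can be seen with a pair of tiny inputs. This is only a sketch: validBytes is a header-only (empty) module and brokenBytes carries an unsupported version number, both chosen by us for illustration.

```javascript
// Header-only module: magic number plus version 1. An empty module is valid.
const validBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
// Same magic number but version 2, which engines reject.
const brokenBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x02, 0x00, 0x00, 0x00]);

// validate is synchronous and returns a boolean.
const validResult = WebAssembly.validate(validBytes);
const brokenResult = WebAssembly.validate(brokenBytes);
console.log(validResult, brokenResult);

// compile is asynchronous: it resolves to a WebAssembly.Module,
// or rejects with a WebAssembly.CompileError for invalid bytes.
WebAssembly.compile(validBytes)
  .then((compiled) => console.log(compiled instanceof WebAssembly.Module))
  .catch((err) => console.error(err));

WebAssembly.compile(brokenBytes)
  .then(() => console.log('unexpectedly compiled'))
  .catch((err) => console.log(err.name));
```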

For both compile and validate, we made a .wasm corpus (by building or collecting modules), then we used Radamsa to mutate the binary of these files before passing them to our two functions.

We improved the mutation by skipping the first part of the .wasm module, which contains the file header (magic number and version), and mutating only the bytes that follow, which hold the actual instructions.
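The header-skipping idea can be sketched in a few lines of JavaScript. Note that this uses a naive bit-flip mutator as a stand-in for Radamsa, and that makeRng, mutateBody, and the seed module are hypothetical names and data introduced only for this example.

```javascript
const HEADER_SIZE = 8; // 4-byte magic number + 4-byte version

// Seed module (illustrative): exports "main", whose body is `i32.const 42`.
const seedBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // header
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,                   // type section
  0x03, 0x02, 0x01, 0x00,                                     // function section
  0x07, 0x08, 0x01, 0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x00, // export section
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b,             // code section
]);

// Small seeded PRNG (linear congruential) so that runs are reproducible.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 0x100000000; // value in [0, 1)
  };
}

// Flip random bits after the header; the input array is left untouched.
function mutateBody(bytes, rng, flips = 4) {
  const out = Uint8Array.from(bytes);
  const bodyLength = out.length - HEADER_SIZE;
  if (bodyLength <= 0) return out;
  for (let i = 0; i < flips; i++) {
    const index = HEADER_SIZE + Math.floor(rng() * bodyLength);
    out[index] ^= 1 << Math.floor(rng() * 8);
  }
  return out;
}

const mutatedBytes = mutateBody(seedBytes, makeRng(1234));
const mutatedIsValid = WebAssembly.validate(mutatedBytes);
try {
  new WebAssembly.Module(mutatedBytes); // synchronous compile of the mutant
} catch (err) {
  console.log(err.name); // most mutants are expected to be rejected
}
console.log(mutatedIsValid);
```

Keeping the header intact means the engine gets past the magic number and version checks and spends its time in the section parsers and the validator, which is where the more interesting code paths are.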

Stay tuned for the final part of our Dharma blog series, where we implement more advanced grammar files. Happy Hunting!!
