
Reader Martin asks us for some help extracting embedded content from a submitted malicious document.

These are the streams inside the document, listed with oledump.py:

[Screenshot 20230329-205636: Extracting Multiple Streams From OLE Files, (Wed, Mar 29th)]

The streams to extract are those whose names include Package, CONTENTS, ...

This can be done with oledump as follows: oledump.py -s 6 -d sample.vir > extracted.vir

-s 6 selects stream 6, and -d produces a binary dump which is written to a file via file redirection (>).

This has to be repeated for every stream that could be interesting.
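As a rough illustration of that repetition, the sketch below builds one oledump.py command per stream index and can run them in a loop. It uses only the -s and -d options shown above. The output naming scheme (extracted-N.vir), the helper names, and the assumption that oledump.py can be launched directly from the PATH are mine, not part of oledump.py.

```python
import subprocess


def build_dump_commands(sample, stream_ids):
    """Return one (argv, output filename) pair per stream index."""
    commands = []
    for stream_id in stream_ids:
        argv = ["oledump.py", "-s", str(stream_id), "-d", sample]
        output = f"extracted-{stream_id}.vir"  # hypothetical naming scheme
        commands.append((argv, output))
    return commands


def run_dump_commands(commands):
    """Run each command and write its binary stdout to the output file."""
    for argv, output in commands:
        with open(output, "wb") as handle:
            # Assumes oledump.py is executable and on the PATH.
            subprocess.run(argv, stdout=handle, check=True)


if __name__ == "__main__":
    # The index 6 comes from the example above; take the others from the listing.
    for argv, output in build_dump_commands("sample.vir", [6]):
        print(" ".join(argv), ">", output)
```

Calling run_dump_commands() on the result performs the actual extraction, one file per stream.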

But I also have another method that involves fewer repeated commands.

First, we let oledump.py analyze the file, and produce JSON output. This JSON output contains all the streams (id, name and content) and can be consumed by other tools I make, like file-magic.py, a tool to identify files based on their content.
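To make the idea of "JSON output that contains all the streams" more concrete, here is a minimal sketch of how such a document could be consumed. The layout is an assumption for illustration only: a top-level "items" list whose entries carry "id", "name" and a base64-encoded "content". Check the real output of oledump.py before relying on these key names or on the encoding.

```python
import base64
import json


def load_items(json_text):
    """Decode items from JSON text into (id, name, content bytes) tuples."""
    document = json.loads(json_text)
    items = []
    for item in document["items"]:  # assumed key name
        content = base64.b64decode(item["content"])  # assumed base64 encoding
        items.append((item["id"], item["name"], content))
    return items


if __name__ == "__main__":
    # Hand-made example document, not real oledump.py output.
    example = json.dumps({
        "items": [
            {
                "id": 1,
                "name": "CONTENTS",
                "content": base64.b64encode(b"%PDF-1.7").decode("ascii"),
            },
        ]
    })
    for item_id, name, content in load_items(example):
        print(item_id, name, len(content))
```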

Like this:

[Screenshot 20230329-210251]

file-magic.py identified the content of each stream: data, Word, PDF, ...
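Identifying a file "based on its content" usually means looking at its leading bytes. The sketch below is not how file-magic.py is implemented; it only illustrates the principle with three well-known signatures (PDF, OLE compound file, ZIP) and falls back to "data" for everything else. The labels and the function name are my own.

```python
# (signature bytes, label) pairs; the labels are illustrative.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE compound file"),
    (b"PK\x03\x04", "ZIP archive"),
]


def identify(content):
    """Return a label for the first matching signature, else 'data'."""
    for signature, label in SIGNATURES:
        if content.startswith(signature):
            return label
    return "data"


if __name__ == "__main__":
    print(identify(b"%PDF-1.7\n"))
    print(identify(b"\x00\x01\x02"))
```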

We can now let file-magic.py produce JSON output that can then be filtered by another tool, myjson-filter.py:

[Screenshot 20230329-210653]

By default, myjson-filter.py produces JSON output (filtered), but with option -l (--list), we obtain a list of the items and can easily observe what the effect of our filtering is (for the moment, we have not yet filtered).

With option -t, we will filter by type (determined by file-magic.py). Option -t takes a regular expression that will be used to select types. Let's go with regular expression data:

[Screenshot 20230329-210946]

At first, what is identified as just data doesn't interest us. So we will reverse the selection (v) to select everything that isn't data, like this:

[Screenshot 20230329-211146]
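The selection logic can be sketched in a few lines: match a regular expression against each item's type, and optionally invert the result. This is an illustration, not the code of myjson-filter.py. The "type" key, the function name, and the choice of re.search (match anywhere in the type string) are assumptions.

```python
import re


def filter_by_type(items, pattern, invert=False):
    """Keep items whose type matches pattern, or the others when invert is set."""
    regex = re.compile(pattern)
    selected = []
    for item in items:
        matched = regex.search(item["type"]) is not None  # "type" key is assumed
        if matched != invert:
            selected.append(item)
    return selected


if __name__ == "__main__":
    # Hand-made items for illustration only.
    items = [
        {"name": "CONTENTS", "type": "PDF document"},
        {"name": "CompObj", "type": "data"},
    ]
    print([item["name"] for item in filter_by_type(items, "data")])
    print([item["name"] for item in filter_by_type(items, "data", invert=True)])
```

Note that an unanchored pattern such as data also matches any longer type string that merely contains the word; anchoring it (^data$) would restrict the match to the exact type.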

I just added a new option to my myjson-filter.py tool to easily write all selected items to disk as individual files: option -W (--write).

Option -W requires a value: vir, hash, hashvir or idvir. Value vir instructs my tool to create files with a filename that is the (cleaned) item name and with extension .vir. Like this:

[Screenshot 20230329-211813]
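What "a filename that is the (cleaned) item name with extension .vir" could look like in code is sketched below. The cleaning rule here (replace anything outside letters, digits, dot, underscore and hyphen with an underscore) is hypothetical; the actual rule used by myjson-filter.py may differ. OLE stream names can start with control characters, which is why some cleaning is needed at all.

```python
import os
import re
import tempfile


def clean_name(name):
    """Replace characters outside a conservative whitelist with underscores."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)  # hypothetical cleaning rule


def write_items(items, directory):
    """Write (name, content) pairs as <cleaned name>.vir and return the paths."""
    written = []
    for name, content in items:
        path = os.path.join(directory, clean_name(name) + ".vir")
        with open(path, "wb") as handle:
            handle.write(content)
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        for path in write_items([("\x01Ole10Native", b"example")], directory):
            print(os.path.basename(path))
```

In this sketch, two items whose cleaned names collide would overwrite each other; naming files by hash or by id instead of by name avoids that.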

So now we have written to disk all streams that were identified as something other than plain data.

If you don't find what you are looking for in these files, use -t data to write all data streams to disk as well, and look for it there.

For another example of my tools that support JSON, take a look at my blog post "Combining zipdump, file-magic And myjson-filter".

Didier Stevens
Senior handler
Microsoft MVP
blog.DidierStevens.com