Deciphering .grf files
August 10, 2012
In DirectShow we work with graphs of filters. We build them in tools like GraphEdit
or GraphEditPlus while experimenting, and then we build them in our own code. Some
parts of a graph can be built automatically by DirectShow's Intelligent Connect
procedure, which selects filters according to their priorities and their ability to
handle the given media types.
To see the details of a graph built by our code, we can save the graph to a file and then
open it in an editor. Loading a graph from a file is done by calling the
IPersistStream::Load() method, which performs all the loading logic and either succeeds
or fails; there's not much control over its actions. If the graph was created on a
different machine, or on the same machine but in different circumstances, and
it mentions some filters, devices or even files that are not available at the moment of
loading, then loading fails and the graph file is pretty much worthless.
Not anymore! Here is a small utility which reads a .grf file and translates it
to plain text containing most of the useful information. Now you can easily see
graph details (filters, connections, media types, including all basic info for video
and audio streams) even if you don't have all the mentioned filters and files.
grfdump.zip (134 KB)
It's a command line tool; you run it like
grfdump.exe file.grf > plain_text.txt
and get something like:
Filters:
1 File Source (Async.) 0000 E436EBB5-524F-11CE-9F53-0020AF0BA770 SOURCE: C:\video\Video22.MP4 filter data: 0 bytes.
2 Elecard MP4 Demultiplexer 9A79C4D0-84CC-46F3-824C-BC5793D5596C filter data: 99 bytes.
3 Elecard AVC Video Decoder 5C122C6D-8FCC-46F9-AAB7-DCFB0841E04D filter data: 532 bytes.
4 Elecard AAC Audio Decoder 109DF9EC-AEA3-47A3-97EA-DAAF57EC97F0 filter data: 180 bytes.
5 VDFilter 6CD44B99-8406-4E8B-A522-911FCFBEA2F2 filter data: 87 bytes.
6 Video Renderer 70E102B0-5556-11CE-97C0-00AA0055595A filter data: 0 bytes.
7 Default DirectSound Device 79376820-07D0-11CF-A24D-0020AFD79767 filter data: 40 bytes.
Connections:
File Source (Async.) 0000.Output --> Elecard MP4 Demultiplexer.Input
fixed size: 1, temporal: 0, sample size: 1
major type: e436eb83-524f-11ce-9f53-0020af0ba770 MEDIATYPE_Stream
sub type: 49952F4C-3EDC-4A9B-8906-1DE02A3D4BC2
format type: 00000000-0000-0000-0000-000000000000
format size: 0
Elecard MP4 Demultiplexer.H.264 Video (Annex B) --> Elecard AVC Video Decoder.In
fixed size: 0, temporal: 1, sample size: 58
major type: 73646976-0000-0010-8000-00AA00389B71 'vids' == MEDIATYPE_Video
sub type: 8D2D71CB-243F-45E3-B2D8-5FD7967EC09B
format type: E06D80E3-DB46-11CF-B4D1-00805F6CBBEA
format size: 194
rcSource: {left: 0, top: 0, right: 704, bottom: 572}
rcTarget: {left: 0, top: 0, right: 704, bottom: 572}
bitrate: 0 AvgTimePerFrame: 1199600
biSize: 0
biWidth: 704
biHeight: 572
biPlanes: 1
biBitCount: 24
biCompression: 875967080
biSizeImage: 0
biXPelsPerMeter: 0
biYPelsPerMeter: 0
biClrUsed: 0
biClrImportant: 0
Elecard MP4 Demultiplexer.AAC Audio --> Elecard AAC Audio Decoder.Input Pin
fixed size: 1, temporal: 0, sample size: 1
major type: 73647561-0000-0010-8000-00AA00389B71 'auds' == MEDIATYPE_Audio
sub type: 000000FF-0000-0010-8000-00AA00389B71
format type: 05589f81-c356-11ce-bf01-00aa0055595a FORMAT_WaveFormatEx
format size: 23
wFormatTag: 255
nChannels: 2
nSamplesPerSec: 48000
nAvgBytesPerSec: 24000
nBlockAlign: 4
wBitsPerSample: 16
cbSize: 5
...
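A note on reading the dump: biCompression is printed as a plain number, but it is a FOURCC packed into a little-endian 32-bit value. Here is a minimal sketch of turning it back into text. The helper name fourcc is made up for this illustration and is not part of grfdump.

```d
import std.stdio;

/// Unpacks a FOURCC stored as a little-endian 32-bit value into its 4 characters.
/// Hypothetical helper for illustration; not taken from grfdump.
string fourcc(uint v)
{
    char[4] c;
    foreach (i; 0 .. 4)
        c[i] = cast(char)((v >> (8 * i)) & 0xFF); // lowest byte is the first character
    return c[].idup;
}

void main()
{
    // biCompression value from the sample dump above
    writeln(fourcc(875967080)); // prints: h264
}
```

The sketch assumes all four bytes are printable ASCII, which holds for this value but is not guaranteed for every biCompression (for example, 0 means uncompressed RGB).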
A few thoughts arose while making this tool.
DirectShow Graph file format
The format is kind of described in MSDN.
It's a COM storage file containing a stream for which a grammar is given. The guy
who wrote this grammar for MSDN clearly didn't understand anything about grammars.
For example, here's a line from that "grammar":
<filter list> ::= [<filter> <b>] <filter list>
This line actually says that a filter list always consists of a filter list.
And it may contain some filters too, but not necessarily. ;)
What he probably meant is
<filter list> ::= | <filter> [<b> <filter list>]
(filter list is either empty or consists of a filter optionally followed by
a filter list)
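To see why the empty alternative matters, here is a toy recursive-descent reading of the corrected rule. The "filter" here is just a run of digits and the blank is a single space; both are simplifications made up for this sketch and have nothing to do with the real .grf syntax. A rule that always ends in another filter list, with no empty alternative, would give this function no place to stop.

```d
import std.ascii : isDigit;
import std.stdio;

/// Toy stand-in for <filter>: a run of digits (an assumption made for brevity).
bool filter(ref string s)
{
    size_t i = 0;
    while (i < s.length && isDigit(s[i])) i++;
    if (i == 0) return false;
    s = s[i .. $];
    return true;
}

/// Toy stand-in for <b>: exactly one space.
bool blank(ref string s)
{
    if (s.length == 0 || s[0] != ' ') return false;
    s = s[1 .. $];
    return true;
}

/// <filter list> ::= | <filter> [<b> <filter list>]
/// Returns the number of filters consumed from the front of `s`.
int filterList(ref string s)
{
    if (!filter(s)) return 0;   // empty alternative: the base case of the recursion
    if (!blank(s)) return 1;    // the optional tail is absent
    return 1 + filterList(s);   // <b> <filter list>
}

void main()
{
    string input = "1 22 333";
    writeln(filterList(input)); // prints: 3
}
```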
Not only is the form screwed up, the content of that grammar is not correct either!
It becomes obvious if you compare the connection description in the grammar with the
example below it (on that MSDN page). Some fields are missing, some are not in
the proper place. If you ever decide to implement parsing of this kind of file, follow
the example, not the grammar. But even this will not give you a correct implementation,
because after trying to open some real files you'll find that even the simplest
description, that of what constitutes a blank character, is wrong: there can also be a zero char.
The data is a mix of Unicode (UTF-16) text and binary data. The binary data can have an
odd size, so the Unicode strings are not always aligned at 2-byte offsets, and you can't
just treat the data as one big Unicode string and use simple string processing functions.
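One way to cope with the alignment problem is to never reinterpret the buffer as an array of 16-bit characters, and instead assemble every code unit from two bytes. Here is a minimal sketch of that idea, assuming little-endian UTF-16; readWString is a hypothetical helper, and grfdump's own reader may well do it differently.

```d
import std.stdio;

/// Reads `count` UTF-16LE code units starting at byte offset `pos`.
/// Each unit is built from two separate bytes, so `pos` may be odd.
/// Hypothetical helper for illustration only.
wstring readWString(const(ubyte)[] data, size_t pos, size_t count)
{
    auto result = new wchar[count];
    foreach (i; 0 .. count)
    {
        const lo = data[pos + 2 * i];
        const hi = data[pos + 2 * i + 1];
        result[i] = cast(wchar)(lo | (hi << 8));
    }
    return result.idup;
}

void main()
{
    // 3 bytes of binary payload followed by the UTF-16LE text "OK" (0x4F 'O', 0x4B 'K')
    ubyte[] data = [0xDE, 0xAD, 0xBE, 0x4F, 0x00, 0x4B, 0x00];
    writeln(readWString(data, 3, 2)); // prints: OK
}
```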
Luckily I made this tool in the D programming language, which allowed me to create a bunch
of simple parser combinators and primitive parsers, so the parsing code is pretty
concise and looks similar to the grammar. For example, this line from the grammar
<filter> ::= <n><b>"<name>"<b><class id><b>[<file>]<length><b1><filter data>
became this code in D:
int n, datalen;
wstring name, cls, fname;
auto r = st.b().num(n).b().p_name(name).b().p_clsid(cls).b();
r = r.opt!((x) { return x.p_file(fname); }).num(datalen).b1();
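To give an idea of how such chaining can work, here is a stripped-down sketch of the technique. This is not the code from grfdump. It works on plain ASCII strings rather than the mixed binary/UTF-16 stream, and State, quoted and the other names are invented for the example. Each primitive takes the parser state and returns a new one, so D's uniform function call syntax lets them be chained with dots, and a failed step simply makes the remaining steps no-ops.

```d
import std.ascii : isDigit;
import std.stdio;

/// Parser state: the remaining input and whether parsing has succeeded so far.
struct State
{
    string rest;
    bool ok = true;
}

/// Skips one or more spaces (a simplified <b>).
State b(State s)
{
    if (!s.ok) return s;
    size_t i = 0;
    while (i < s.rest.length && s.rest[i] == ' ') i++;
    return i > 0 ? State(s.rest[i .. $]) : State(s.rest, false);
}

/// Parses a decimal number into `n`.
State num(State s, ref int n)
{
    if (!s.ok) return s;
    size_t i = 0;
    int v = 0;
    while (i < s.rest.length && isDigit(s.rest[i]))
        v = v * 10 + (s.rest[i++] - '0');
    if (i == 0) return State(s.rest, false);
    n = v;
    return State(s.rest[i .. $]);
}

/// Parses a double-quoted string into `name`.
State quoted(State s, ref string name)
{
    if (!s.ok) return s;
    if (s.rest.length == 0 || s.rest[0] != '"') return State(s.rest, false);
    size_t i = 1;
    while (i < s.rest.length && s.rest[i] != '"') i++;
    if (i == s.rest.length) return State(s.rest, false);
    name = s.rest[1 .. i];
    return State(s.rest[i + 1 .. $]);
}

/// Combinator: tries `p`; if it fails, continues from the state before the attempt.
State opt(alias p)(State s)
{
    if (!s.ok) return s;
    auto r = p(s);
    return r.ok ? r : s;
}

void main()
{
    int n, len;
    string name, file;
    auto r = State(`7 "Video Renderer" 40`)
        .num(n).b().quoted(name).b()
        .opt!(x => x.quoted(file).b())
        .num(len);
    writeln(r.ok, " ", n, " ", name, " ", len); // prints: true 7 Video Renderer 40
}
```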
Structures, mixins and reflection
Without reflection, outputting the contents of structures is a tedious task. In D we can
enjoy compile-time reflection and write a simple generic function:
void printRecord(R)(R r)
{
    foreach(fld; __traits(allMembers, R))
        writeln(fld, ": ", __traits(getMember, r, fld));
}
When applied to different structures, it's automatically unrolled into different
functions writing the contents of those structures (including field names).
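One caveat: allMembers lists every member, including member functions, so the function above is best suited to plain data structures. A possible variant, sketched here as an assumption and not taken from the tool, walks only the data fields via .tupleof and returns the text instead of printing it. The names describe and Size are invented for the example.

```d
import std.conv : text;
import std.stdio;

/// Builds "name: value" lines for the data fields of any struct.
/// Uses .tupleof, so member functions are skipped.
string describe(R)(R r)
{
    string result;
    foreach (i, value; r.tupleof)
        result ~= text(__traits(identifier, R.tupleof[i]), ": ", value, "\n");
    return result;
}

/// Sample record invented for this example.
struct Size
{
    int width = 704;
    int height = 572;
    int area() { return width * height; } // a method, not a field: not listed
}

void main()
{
    write(describe(Size()));
    // prints:
    // width: 704
    // height: 572
}
```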
In DirectShow there are two very similar structures which are used very often:
VIDEOINFOHEADER and VIDEOINFOHEADER2. The latter contains all the fields of the former,
but with extra fields inserted before bmiHeader, so the layouts differ and one cannot
just cast one to the other and use the same code for both. In C++ this usually leads to
code duplication. In D we can use mixins: write the common code once and include it in
both structures.
mixin template print_vih() {
    void print()
    {
        write("rcSource: "); rcSource.print();
        write("rcTarget: "); rcTarget.print();
        writeln("bitrate: ", dwBitRate, " AvgTimePerFrame: ", AvgTimePerFrame);
        printRecord(bmiHeader);
    }
}
struct VIDEOINFOHEADER {
    RECT rcSource;          // The bit we really want to use
    RECT rcTarget;          // Where the video should go
    DWORD dwBitRate;        // Approximate bit data rate
    DWORD dwBitErrorRate;   // Bit error rate for this stream
    ulong AvgTimePerFrame;  // Average time per frame (100ns units)
    BITMAPINFOHEADER bmiHeader;
    mixin print_vih;
}
struct VIDEOINFOHEADER2 {
    RECT rcSource;
    RECT rcTarget;
    DWORD dwBitRate;
    DWORD dwBitErrorRate;
    ulong AvgTimePerFrame;
    DWORD dwInterlaceFlags;
    DWORD dwCopyProtectFlags;
    DWORD dwPictAspectRatioX;
    DWORD dwPictAspectRatioY;
    union {
        DWORD dwControlFlags;
        DWORD dwReserved1;
    }
    DWORD dwReserved2;
    BITMAPINFOHEADER bmiHeader;
    mixin print_vih;
}
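The structures above depend on Windows types, so here is a scaled-down version of the same idea that can be compiled on its own. InfoV1, InfoV2 and Summary are invented names. The point of the sketch is that the mixed-in code is compiled separately inside each structure, so it picks up each structure's own field offsets.

```d
import std.conv : text;
import std.stdio;

/// Shared code: compiled inside each struct that mixes it in, so `tail`
/// refers to that struct's own field at its own offset.
mixin template Summary()
{
    string summary() const
    {
        return text("head=", head, " tail=", tail);
    }
}

struct InfoV1
{
    uint head;
    uint tail;
    mixin Summary;
}

struct InfoV2
{
    uint head;
    uint extra;   // inserted field shifts `tail`, like the extra fields in VIDEOINFOHEADER2
    uint tail;
    mixin Summary;
}

// The common field sits at different offsets, which is why a cast is not enough.
static assert(InfoV1.tail.offsetof == 4);
static assert(InfoV2.tail.offsetof == 8);

void main()
{
    writeln(InfoV1(1, 2).summary());    // prints: head=1 tail=2
    writeln(InfoV2(1, 0, 2).summary()); // prints: head=1 tail=2
}
```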
Source code is included in the archive linked above. This is a very small and simple
tool, so I don't want to restrict its use with any license: you can do with it whatever
you wish.