Python parser for C, C++

Created Wednesday 15 March 2017

For many years I've been using pygccxml, a very nice Python wrapper around GCC-XML. It's a full-featured package that forms the basis of some well-used code-generation tools, such as Py++, which is from the same author.

Parsing C++ is fiddly, and few parsers have been written that aren't part of a compiler. You can find a good summary of the issues here.

I would suggest using PLY, the Python lex/yacc tool. There's a prebuilt C parser for it, and the parser itself is quite fast. Once you have the file parsed, finding all of the functions shouldn't be too hard.

pycparser is a parser for the C language, written in pure Python. It is a module designed to be easily integrated into applications that need to parse C source code.
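To make the "find all of the functions" idea concrete, here is a minimal sketch using pycparser on an in-memory snippet. The C source, the FuncDefCollector class and the list_function_definitions helper are made up for illustration. The snippet has no preprocessor directives, so no cpp step is needed here.

```python
from pycparser import c_parser, c_ast

# Illustrative C source; it contains no #include or #define,
# so it can be parsed without running the preprocessor first.
C_SOURCE = """
int add(int a, int b) { return a + b; }
static void log_msg(const char *msg) { }
int declared_only(int x);
"""


class FuncDefCollector(c_ast.NodeVisitor):
    """Collects the names of function definitions (not bare prototypes)."""

    def __init__(self):
        self.names = []

    def visit_FuncDef(self, node):
        # node.decl is the declaration part of the definition.
        self.names.append(node.decl.name)


def list_function_definitions(source):
    parser = c_parser.CParser()
    ast = parser.parse(source, filename="<string>")
    collector = FuncDefCollector()
    collector.visit(ast)
    return collector.names


function_names = list_function_definitions(C_SOURCE)
print(function_names)
```

The prototype declared_only is not reported because it is a plain declaration, not a FuncDef node. To pick up prototypes as well, a visit_Decl method that checks for a FuncDecl type would be one way to go.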

One of the most popular uses of pycparser is in the cffi library, which uses it to parse the declarations of C functions and types in order to auto-generate FFIs.
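As a rough illustration of what that looks like from the user's side, the sketch below hands a couple of C declarations to cffi, assuming cffi is installed. The point_t struct and the manhattan function are hypothetical names invented for this example; the function is only declared and never bound to a real library.

```python
from cffi import FFI

ffi = FFI()

# cdef() takes C declarations as text; cffi parses them to learn
# the types and function signatures. Both names are hypothetical.
ffi.cdef("""
    typedef struct { int x; int y; } point_t;
    int manhattan(const point_t *p);
""")

# Once the struct is declared, it can be allocated and used from Python.
point = ffi.new("point_t *", {"x": 3, "y": 4})
point.x += 1

coords = (point.x, point.y)
struct_size = ffi.sizeof("point_t")
print(coords, struct_size)
```

Calling manhattan would additionally need a library that actually provides it, loaded for instance with ffi.dlopen.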

In order to be compilable, C code must first be run through the C preprocessor, cpp. cpp handles preprocessing directives like #include and #define, removes comments, and does other minor tasks that prepare the C code for compilation.

For all but the most trivial snippets of C code, pycparser, like a C compiler, must receive preprocessed C code in order to function correctly. If you import the top-level parse_file function from the pycparser package and call it with use_cpp=True, it will run cpp for you, as long as cpp is in your PATH or you provide a path to it.

Note also that you can use gcc -E or clang -E instead of cpp.
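A small sketch of how this might look in practice: parse_file is called with use_cpp=True, and cpp_path and cpp_args select the preprocessor. The C file contents, the parse_with_preprocessor helper and the demo function are made up for illustration. The demo skips any preprocessor it cannot find in PATH, so the result may be empty on a machine without a C toolchain.

```python
import os
import shutil
import tempfile

from pycparser import c_ast, parse_file

# Illustrative source that needs preprocessing: it has a macro and a comment.
C_SOURCE = """\
#define LIMIT 10
/* macros and comments are handled by the preprocessor */
int clamp(int v) { return v > LIMIT ? LIMIT : v; }
"""


def parse_with_preprocessor(path, cpp_path="cpp", cpp_args=None):
    """Run the file through a preprocessor, then parse the result."""
    return parse_file(path, use_cpp=True, cpp_path=cpp_path,
                      cpp_args=cpp_args or [])


def demo():
    # Candidate preprocessors: plain cpp, or a compiler in -E mode.
    candidates = [("cpp", []), ("gcc", ["-E"]), ("clang", ["-E"])]
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "clamp.c")
        with open(path, "w") as f:
            f.write(C_SOURCE)
        for cpp_path, cpp_args in candidates:
            if shutil.which(cpp_path) is None:
                continue  # not installed here, skip it
            ast = parse_with_preprocessor(path, cpp_path, cpp_args)
            results[cpp_path] = [ext.decl.name for ext in ast.ext
                                 if isinstance(ext, c_ast.FuncDef)]
    return results


results = demo()
print(results)
```

Real code that includes standard headers usually needs extra cpp_args as well, such as -I pointing at a directory of stub headers, because system headers often contain compiler extensions that pycparser does not parse.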