Parsing EDIFACT directories
I am looking for the best method of parsing the actual EDIFACT directory files from the UNECE website.
I have managed to write one for the 12A directory in C#, but the older directories such as 96A/B (HTML) and 99A/B (plain text) are proving too difficult and time-consuming. I cannot see how to write a universal parser without coding version-specific rules, for example checking the file extension to decide which parser to use.
My question: is there any parsing library (.NET only) that lets me specify how certain files should be parsed or transformed into a different format?
To clarify, I am not looking to process actual EDIFACT data files, but the source directories themselves.
I found this project, which has all the directories in XML format (see the data directory): https://code.google.com/p/izi-sandbox/source/browse/trunk/php/php_edi/
I use it for a dumb interpreter based on my parser https://github.com/sabas/edifact
Check out Smooks. They have some code somewhere that parses all of these. I don't remember the exact location of the code, however.
This won't be pretty (and may not be free), but look at the Dictionary Viewer from Liaison. You can export the dictionary to HTML, parse the HTML into something you like and go from there.
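For the "parse the HTML" step, one possible approach is to reduce the HTML to plain text first, so that HTML and text directories can share a single downstream parser. The sketch below is only an illustration of that idea, written in Python with the standard library rather than .NET. The names `looks_like_html` and `to_plain_text` are made up for this example, the sniffing heuristic is an assumption, and nothing here has been checked against the real UNECE or Liaison files.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and drops all tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def looks_like_html(content: str) -> bool:
    # Hypothetical heuristic: sniff the start of the content instead of
    # trusting the file extension.
    head = content[:1024].lower()
    return "<html" in head or "<pre" in head or "<!doctype" in head


def to_plain_text(content: str) -> str:
    """Return plain text for either an HTML or a plain-text file."""
    if not looks_like_html(content):
        return content
    extractor = _TextExtractor()
    extractor.feed(content)
    extractor.close()  # flush any text still buffered by the parser
    return "".join(extractor.parts)
```

The result of `to_plain_text` could then be handed to whatever text parser already exists, which keeps the version-specific handling in one small normalisation step. Whether the text recovered from the HTML directories lines up closely enough with the plain-text ones is something that would need checking per directory.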