
Re: xml file conversion

 
Ken Feese
Occasional Advisor

xml file conversion

In OpenVMS, is there any way to convert XML to a flat file? Also, how do I convert a flat text file to XML? Can it be done in COBOL?
Antoniov.
Honored Contributor

Re: xml file conversion

Ken,
what do you mean when you say convert?
An XML file is a text file with XML directives that VMS doesn't recognize. Perhaps you need to copy an XML file to a text file on VMS, in which case SET FILE /ATTR may help you.
Please post more info so we can help.

@Antoniov
Antonio Maria Vigliotti
Ken Feese
Occasional Advisor

Re: xml file conversion

The file is in text:
jonesfrankjames1234sunnyvalest anywhere usa999991234apt7

and I need to convert it to something like
... etc.

Hope this helps
Hein van den Heuvel
Honored Contributor

Re: xml file conversion

It helps, but not enough. It would help more if you could also attach a txt file with two or three input records and sample output. No cut and paste for the input, but an ftp or some such transfer to avoid transformations.
The Forum software nukes spaces, so we cannot see whether the data is space or tab separated or column based.

Anyway... this is NOT a VMS problem, but a general purpose computing problem. The problem would be the same whether the input lives on a Windoze box, a Unix system or VMS.

I suspect there are tools out there that perform this conversion driven by a few rules.
If you have to write something from scratch then I would encourage you to investigate a PERL or AWK solution, but DCL can help also.

The biggest question to you is, how can you tell where the 'firstname' stops and the 'middlename' starts? Fixed columns? Tabs/Spaces?

Here is a DCL starting point assuming fixed length fields.

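$ OPEN INP raw-file.dat
$! Use CREATE (optionally with /FDL) to get a 'normal' file, not a DCL special.
$ CREATE xml-file.txt
$ OPEN/APPEND OUT xml-file.txt
$loop:
$ READ/END=done INP record
$! Build the XML line from fixed-width pieces of 'record',
$! for example with F$EXTRACT(start,length,record).
$ xml = ""
$ WRITE/SYMBOL OUT xml ! Optional /SYMBOL for long lines
$ GOTO loop
$done:
$ CLOSE INP
$ CLOSE OUT
$ EXIT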

Using PERL and assuming white-space separated input the solution could be:

perl -p xml-convert.pl < raw-file.dat > xml-file.txt

where xml-convert.pl looks something like

($last,$first,$mid,$addres)=split;
$_ = ""

The -p will make perl create a loop around the code, reading $_ from input to start, and printing the final $_ at the end of the loop.
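
For comparison, here is a minimal sketch of the same split-and-wrap idea in Python rather than Perl. The tag names (person, last, first, middle, address) are assumptions, since the sample output did not survive the forum software. Unlike the Perl split above, this version keeps everything after the third word as the address, and it escapes characters such as & and < that are not allowed raw in XML text.

```python
from xml.sax.saxutils import escape


def line_to_xml(line: str) -> str:
    """Turn one whitespace-separated record into one XML element.

    Field order and tag names are hypothetical, chosen only for illustration.
    """
    # Strip the newline, then split into at most four pieces so that the
    # remainder of the line (which may contain spaces) becomes the address.
    last, first, middle, address = line.strip().split(None, 3)
    return (
        "<person>"
        f"<last>{escape(last)}</last>"
        f"<first>{escape(first)}</first>"
        f"<middle>{escape(middle)}</middle>"
        f"<address>{escape(address)}</address>"
        "</person>"
    )
```

A line with fewer than four fields raises ValueError here, which is probably what you want while testing against real data.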

Hope this helps,
Hein.
Craig A Berry
Honored Contributor

Re: xml file conversion

Given your example, I take it that by a flat file you mean a sequential file with fixed-width, externally defined fields in each record. These external definitions may usefully be called metadata, "data that's about other data." With XML the metadata are inline with the content, so one of your conversion tasks will be to map the metadata from whatever format it's in (CDD, language structure statements, etc.) to the appropriate XML tags. The definition of "appropriate" varies widely; you may need a Document Type Definition (DTD), or not, depending on the goal of the conversion and the needs of the recipient. You may also need to perform some data conversions, such as if you have binary floating point data that you need to convert to text.
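
To make the metadata mapping concrete, here is a minimal Python sketch (Python rather than the COBOL or Perl discussed in this thread). The layout table stands in for the external record definition; its tag names, offsets and widths are hypothetical and would in practice come from the FD or CDD definition.

```python
from xml.sax.saxutils import escape

# Hypothetical record layout: (tag name, start column, width).
# These values are illustrative assumptions, not the poster's real layout.
LAYOUT = [
    ("last", 0, 10),
    ("first", 10, 10),
    ("middle", 20, 10),
    ("address", 30, 30),
]


def record_to_xml(record: str, layout=LAYOUT, root: str = "person") -> str:
    """Wrap each fixed-width field of a record in the tag named by the layout."""
    parts = []
    for tag, start, width in layout:
        # Slice the field out by column position and drop the padding.
        value = record[start:start + width].strip()
        parts.append(f"<{tag}>{escape(value)}</{tag}>")
    return f"<{root}>{''.join(parts)}</{root}>"
```

Keeping the layout in a table means the conversion code does not change when the record definition does.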

Here is a Perl module that writes XML-encoded output:

http://search.cpan.org/~josephw/XML-Writer-0.500/

I believe libxml2, one of the more popular XML parsers, includes a writer module, see:

http://www.xmlsoft.org/

libxml2 is written in C. No doubt you can call it from COBOL if you have sufficient understanding of the OpenVMS Calling Standard.

As far as parsing XML content, in addition to libxml2, there is expat:

http://sourceforge.net/projects/expat

and HP supplies parsers in Java and C++ that came out of the Apache project.
Hein van den Heuvel
Honored Contributor

Re: xml file conversion

Good ones Craig!
Also thanks for reminding me of the Original query.
I focussed on the intermediate clarification, but I overlooked half of the original question "can it be done in COBOL? "

Absolutely! Cobol could do it all, although it is not known for being very flexible in string handling. If need be you can always use Cobol to call RTL routines like STR$APPEND to create a long dynamic string.
But moreover, you probably have the record definition readily available in Cobol. So at the very least you can use Cobol to transform the raw data into a more readily parseable datastream, for example by using a character not normally found in the data as separator (~, tab, |, whatever), or a comma separated list of quoted fields.
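
As an illustration of that intermediate step, here is a small Python sketch (not COBOL) that turns fixed-width records into a comma separated list of quoted fields. The field widths are made-up assumptions; the csv module takes care of doubling any quote characters inside a field.

```python
import csv
import io

# Hypothetical field widths; the real ones come from the record definition.
WIDTHS = [10, 10, 10, 30]


def fixed_to_csv(records, widths=WIDTHS) -> str:
    """Convert fixed-width records to CSV text with every field quoted."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for record in records:
        fields = []
        pos = 0
        for width in widths:
            # Cut the next field out of the record and drop its padding.
            fields.append(record[pos:pos + width].strip())
            pos += width
        writer.writerow(fields)
    return buf.getvalue()
```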

If you consider perl for the job, and are not too familiar with it for now, then be sure to read up on pack/unpack (tricky at first, but great to deal with fixed length columns) and printf/sprintf.

Hein.
Ken Feese
Occasional Advisor

Re: xml file conversion

You guys are close but not on target....
Yes, the file is created based upon an extract spec that the client wants, and the fields, as any COBOL programmer knows, are delimited by the FD. But the client/vendor doesn't care about the FD; they just want it in XML format, which befuddles me: why can't they extract the data in the format that they want and massage it at will? I guess this is a management decision and outside the realm of the developer.
Craig A Berry
Honored Contributor

Re: xml file conversion

Ken,

How are we not on target? Based on your additional comments, it seems to me you have been given plenty to go on to write your own converter. XML is an excellent data interchange format, and I think your client/vendor is reasonable to ask for it.
Antoniov.
Honored Contributor

Re: xml file conversion

Ken,
as posted by Hein, your problem isn't VMS but how to convert a text file into an XML file.
You need a specific application.
About Cobol: on VMS you can use LINE SEQUENTIAL files to read and write text files; this is VMS specific and isn't ANSI standard.
If you know what you have to write into the XML file, you can build it using the STRING statement too.

@Antoniov
Antonio Maria Vigliotti
Willem Grooters
Honored Contributor

Re: xml file conversion

Ken,
I agree with the others that this is not VMS related. The very same problem will arise on Unix or Windows (or any OS).

First text to XML.
Main problem you'll face is the understanding of how the file is to be interpreted.

jonesfrankjames1234sunnyvalest anywhere usa999991234apt7

... etc.

Given the string "jonesfrankjames1234sunnyvalest", there is NO unique way to determine first, middle and last name, unless it is guaranteed that each of these is exactly 5 characters in size. I doubt this is true.
Once this can be determined, it won't be a problem at all to add the structure of the data - that's what XML is all about. The programming language is no issue - of course it can be done in COBOL, but the CXM package that comes on the e-business CD (requires C compiler) makes it even easier.

The other way around is more straightforward, but again you need to know about the structure of the data.
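
A minimal Python sketch of that reverse direction, using the standard library parser. The element names and field widths are assumptions chosen for illustration; each record element is flattened into one fixed-width line, with values padded with spaces or truncated to fit.

```python
import xml.etree.ElementTree as ET

# Hypothetical output layout: (child tag, field width).
LAYOUT = [("last", 10), ("first", 10), ("middle", 10), ("address", 30)]


def xml_to_records(xml_text: str, layout=LAYOUT, record_tag: str = "person"):
    """Return one fixed-width line per <record_tag> element in the document."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter(record_tag):
        fields = []
        for tag, width in layout:
            # A missing child element becomes an all-blank field.
            text = elem.findtext(tag, default="")
            fields.append(text.strip()[:width].ljust(width))
        lines.append("".join(fields))
    return lines
```

Truncating silently can lose data, so a real converter would probably report values that are longer than their field.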

Willem
Willem Grooters
OpenVMS Developer & System Manager