Small and Simple XML-Parser in REXX
Автор: Johannes Stelzer
Дата: 08.10.2008
Источник: Joshi's Blog
I've written a very small and simple XML-parser in Rexx.
The parser is based on the COBOL-way of parsing.
The parser queues XML-events (e.g. "START-OF-ELEMENT"), on the external queue. The calling procedure reads those events and processes them.
For example: xmltest.rexx
/* REXX */
parse arg infile
call 'parseXML' infile
do while queued() > 0
parse pull xmlevent
parse pull xmltext
say xmlevent : xmltext
end
exit 0
The parser doesn't check the XML on being well-formed or against some xsd or dtd.
And it doesn't know namepaces...
But the code can be easily adapted to run under zOS without any dependency.
That's why i wrote it.
The Code:
/* Rexx xml parser
Copyright (c) 2008 Johannes Stelzer
This software is provided 'as-is', without any express or implied warranty.
In no event will the authors be held liable for any damages arising from
the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it freely,
subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software in
a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
parse arg xmlfile
call throwEvent 'START-OF-DOCUMENT', xmlfile
do while lines(xmlfile) > 0
xmlline = linein(xmlfile)
call parseline xmlline
end
call throwEvent 'END-OF-DOCUMENT', xmlfile
exit 0
throwEvent: procedure
parse arg xmlevent, xmltext
queue xmlevent
queue xmltext
return
parseLine: procedure
parse arg xmlline
do while pos('>',xmlline) > 0
parse var xmlline vorher '<' token '>' +1 xmlline
token = '<' || token || '>'
if vorher <> '' then
call throwEvent 'CONTENT-CHARACTERS', parseEntities(vorher)
select
when index(token,'<?') > 0 then do
parse var token '<?' name '?>'
call throwEvent 'PROCESSING-INSTRUCTION', strip(name)
end
when index(token,'<!--') > 0 then do
parse var token '<!--' innen '-->'
call throwEvent 'COMMENT', space(innen)
end
when index(token,'</') > 0 then do
parse var token '</' innen '>'
call throwEvent 'END-OF-ELEMENT', space(innen)
end
when index(token,'/>') > 0 then do
parse var token '<' name attributes '/>'
call throwEvent 'START-OF-ELEMENT', strip(name)
call parseAttributes attributes
call throwEvent 'END-OF-ELEMENT', strip(name)
end
otherwise
parse var token '<' name attributes '>'
call throwEvent 'START-OF-ELEMENT', strip(name)
call parseAttributes attributes
end
end
if xmlline <> '' then
call throwEvent 'CONTENT-CHARACTERS', parseEntities(xmlline)
return
parseAttributes: procedure
parse arg attributes
attributes = space(attributes)
do while attributes <> ''
parse var attributes name '=' '"' val '"' attributes
call throwEvent 'ATTRIBUTE-NAME', strip(name)
call throwEvent 'ATTRIBUTE-VALUE', parseEntities(val)
end
return
parseEntities: procedure
parse arg text
out = ''
do while text <> ''
parse var text vor '&'+1 entity ';'+1 text
if entity <> '' then do
select
/* XML-entities */
when entity = 'lt' then entity = '<'
when entity = 'gt' then entity = '>'
when entity = 'amp' then entity = '&'
when entity = 'apos' then entity = "'"
when entity = 'quot' then entity = '"'
otherwise
entity = '&' || entity || ';'
end
end
out = out || vor || entity
end
return out