Smart CODE
Your on-line guide to the generated code
#include <SGML.h> |
SGML_t * scRegisterSGMLMimeType( mimetype, dtd) char * mimetype; char * dtd; |
SGML_t * scRegisterHTML( mimetype) char * mimetype; |
int scRegisterSGMLErrorHandler( handler) void (*handler)(); |
int scAddTagCallback( sgm, tagname, type, callback, data) SGML_t * sgm; char * tagname; int type; void (*callback)(); void * data; |
int scAddAttrCallback( sgm, tagname, attrname, callback, data) SGML_t * sgm; char * tagname; char * attrname; void (*callback)(); void * data; |
int scProcessSGML( sgm, istream) SGML_t * sgm; InputStream istream; |
environment variable DTDDIR |
library -lsgml in the lib directory of your distribution |
The SGML library provides a standard, upgradeable mechanism for filtering the input data, so that you don't need to spend time parsing HTML yourself. It is not an ad hoc parser: it is the reference parser provided by the SGML User Group, and it uses a standard HTML32 DTD. As the HTML standard moves on, you can simply upgrade the DTD.
What might you use it for?
The rest of this page contains reference documentation for this API, as well as a worked example.
SGML_t *
scRegisterSGMLMimeType( mimetype, dtd)
char * mimetype;
char * dtd;
This is used to associate a MIME type with an SGML DTD. The most common call will be:
SGML_t * sgm = scRegisterSGMLMimeType( "text/html", "HTML32.soc");
An alternative that you can use is:
SGML_t *
scRegisterHTML( mimetype)
char * mimetype;
which associates text/html with the HTML32 DTD
void
errorhandler( s)
char * s;
using:
int
scRegisterSGMLErrorHandler( handler)
void_f handler;
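As a minimal sketch of such a handler, the routine below writes each message to stderr and keeps a count, so that you can tell afterwards whether the parse was clean. The counter sgml_error_count and the "[sgml]" prefix are illustrative assumptions, not part of the library. The handler uses an ANSI prototype rather than the K&R form shown above so that it compiles on current compilers.

```c
#include <stdio.h>

/* Hypothetical counter, not part of the SGML library. */
static int sgml_error_count = 0;

/* Same shape as the documented handler: one message string in, nothing out. */
void errorhandler(char *s)
{
    sgml_error_count++;
    fprintf(stderr, "[sgml] %s\n", s ? s : "(no message)");
}

/* With the library linked in, registration would be:
 *     scRegisterSGMLErrorHandler(errorhandler);
 */
```

After scProcessSGML returns, a non-zero sgml_error_count would tell you that at least one error was reported.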
ON_ENTRY
ON_EXIT
ON_ATTR
to say when you want your routine to be called
int
mycallback( tag, attribute, type, call_data, client_data)
char * tag;
char * attribute;
int type;
void * call_data;
void * client_data;
int
scAddTagCallback( sgm, tagname, type, callback, data)
SGML_t * sgm; /* the parser handle returned by scRegisterSGMLMimeType */
char * tagname; /* eg "LI" "MENU" "A" */
int type; /* ON_ENTRY (for <MENU>), ON_EXIT (for </MENU>) or ON_ATTR */
void (*callback)(); /* your routine */
void * data; /* data you want passed into your routine */
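To illustrate how the data argument can carry state into your routine, here is a sketch of a callback that tracks how deeply MENU elements nest. It assumes the type argument of the callback carries the event it was registered for (ON_ENTRY or ON_EXIT), which this page does not state explicitly. The numeric values given to the ON_ constants are placeholders so that the sketch compiles without SGML.h, and menu_state_t and menucallback are hypothetical names.

```c
#include <stdio.h>

/* Placeholder values so the sketch compiles without <SGML.h>;
 * the real constants come from the library header. */
#ifndef ON_ENTRY
#define ON_ENTRY 1
#define ON_EXIT  2
#define ON_ATTR  3
#endif

/* Hypothetical client data: tracks how deeply <MENU> elements nest. */
typedef struct {
    int depth;      /* current nesting level */
    int max_depth;  /* deepest level seen so far */
} menu_state_t;

/* ASSUMPTION: type is the event (ON_ENTRY / ON_EXIT) being reported. */
int menucallback(char *tag, char *attribute, int type,
                 void *call_data, void *client_data)
{
    menu_state_t *st = (menu_state_t *) client_data;

    (void) tag; (void) attribute; (void) call_data;   /* unused here */

    if (type == ON_ENTRY) {
        st->depth++;
        if (st->depth > st->max_depth)
            st->max_depth = st->depth;
    } else if (type == ON_EXIT && st->depth > 0) {
        st->depth--;
    }
    return 0;
}

/* With the library linked in, registration would look like:
 *     static menu_state_t state;
 *     scAddTagCallback(sgm, "MENU", ON_ENTRY, menucallback, &state);
 *     scAddTagCallback(sgm, "MENU", ON_EXIT,  menucallback, &state);
 */
```

Registering the same routine for both events keeps the counting logic in one place; two separate routines would work equally well.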
int
scAddAttrCallback( sgm, tagname, attrname, callback, data)
SGML_t * sgm; /* the parser handle returned by scRegisterSGMLMimeType */
char * tagname; /* eg "A" or "MENU" */
char * attrname; /* eg "SRC" or "HREF" */
void (*callback)(); /* your routine */
void * data; /* data you want passed into your routine */
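An attribute callback is a natural place to collect values rather than print them. The sketch below stores each value it is handed in a fixed-size list passed in through data. It assumes that call_data points at the attribute value as a NUL-terminated string, which is how the worked example on this page treats it. linklist_t, collectlink, MAX_LINKS and MAX_URL are hypothetical names, and this page does not say how the library interprets the callback's return value.

```c
#include <assert.h>
#include <string.h>

#define MAX_LINKS 64
#define MAX_URL   256

/* Hypothetical client data: a fixed-size list of attribute values. */
typedef struct {
    int  count;
    char url[MAX_LINKS][MAX_URL];
} linklist_t;

int collectlink(char *tag, char *attribute, int type,
                void *call_data, void *client_data)
{
    linklist_t *list  = (linklist_t *) client_data;
    const char *value = (const char *) call_data;  /* ASSUMPTION: attribute value */

    (void) tag; (void) attribute; (void) type;     /* unused here */

    assert(list != NULL);
    if (value == NULL || list->count >= MAX_LINKS)
        return -1;                                 /* nothing stored */

    /* Copy with truncation; the slot is always NUL-terminated. */
    strncpy(list->url[list->count], value, MAX_URL - 1);
    list->url[list->count][MAX_URL - 1] = '\0';
    list->count++;
    return 0;
}

/* With the library linked in, registration would look like:
 *     static linklist_t links;
 *     scAddAttrCallback(sgm, "A", "HREF", collectlink, &links);
 */
```

A fixed-size list keeps the sketch free of memory management; a growable list would be the obvious next step for documents with many links.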
int
scProcessSGML( sgm, istream)
SGML_t * sgm; /* the parser handle returned by scRegisterSGMLMimeType */
InputStream istream; /* the input stream from the server */
to parse the document
int
processMyData ( sc_data_t * data )
{
group0_t * group = (group0_t*)data->group;
char * type = data->content_type; /* mime type */
InputStream i = (InputStream) data->data;
int len = data->content_length;
return 0;
}
Here is the same, filled out so that it parses the input stream if
it is HTML:
int
processMyData ( sc_data_t * data )
{
group0_t * group = (group0_t*)data->group;
char * type = data->content_type; /* mime type */
InputStream i = (InputStream) data->data;
int len = data->content_length;
SGML_t * sgm;
if ( strcmp( type, "text/html") != 0)
return -1;
sgm = scRegisterHTML( type); /* the parser object */
(void) scAddTagCallback( sgm, "A", ON_ENTRY, getanchor, "a-call");
(void) scAddAttrCallback( sgm, "A", "HREF", getlinkinfo, "href");
(void) scProcessSGML( sgm, i);
return 0;
}
This will call your getanchor and getlinkinfo routines as the links are
seen in the parsed input:
int
getanchor( tag, attr, type, call_data, client_data)
char * tag;
char * attr;
int type;
void * call_data;
void * client_data;
{
printf("anchor-start(%s)\n", (char *) client_data);
return 0;
}
int
getlinkinfo( tag, attr, type, call_data, client_data)
char * tag;
char * attr;
int type;
void * call_data;
void * client_data;
{
printf( "%s=%s\n", (char *) client_data, (char *) call_data);
return 0;
}
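The strcmp test in processMyData above only matches the exact string "text/html". MIME types are case-insensitive and a server may append parameters such as a charset, so you may prefer a more tolerant check. The helper below is a sketch of one; ishtmltype is a hypothetical name and is not part of the library.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when content_type names text/html,
 * ignoring case and any trailing parameters, else 0. */
int ishtmltype(const char *content_type)
{
    static const char want[] = "text/html";
    size_t i;

    if (content_type == NULL)
        return 0;

    /* Compare the prefix case-insensitively; a short input stops at its NUL. */
    for (i = 0; want[i] != '\0'; i++) {
        if (tolower((unsigned char) content_type[i]) != want[i])
            return 0;
    }

    /* Accept end of string, a parameter separator, or whitespace. */
    return content_type[i] == '\0' || content_type[i] == ';' ||
           isspace((unsigned char) content_type[i]);
}
```

In processMyData the test would then read: if ( !ishtmltype( type)) return -1;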
This is the tree data structure that is returned:
typedef struct snode_s {
char * s_tag;
sattribute_t * s_attributes;
int numchildren;
union {
struct snode_s * children;
sdata_t * data;
} body;
struct snode_s * s_stackprev;
struct snode_s * s_next;
stag_t * s_ref;
} DOCtree_t;
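This page does not say how the fields are linked together, so the following is only a sketch of a depth-first walk under stated assumptions: that body.children points at the first child, that siblings are chained through s_next, and that numchildren == 0 marks a leaf whose body holds data. If children turns out to be an array of numchildren nodes, the loop would need to index it instead. The opaque struct tags and the countnodes routine are hypothetical, introduced so that the sketch compiles without SGML.h.

```c
#include <stddef.h>

/* Opaque stand-ins so the sketch compiles without <SGML.h>. */
typedef struct sattribute_s sattribute_t;
typedef struct sdata_s      sdata_t;
typedef struct stag_s       stag_t;

typedef struct snode_s {
    char            *s_tag;
    sattribute_t    *s_attributes;
    int              numchildren;
    union {
        struct snode_s *children;
        sdata_t        *data;
    } body;
    struct snode_s  *s_stackprev;
    struct snode_s  *s_next;
    stag_t          *s_ref;
} DOCtree_t;

/* Count the nodes in a subtree, depth first.
 * ASSUMPTION: body.children is the first child, siblings are chained
 * through s_next, and numchildren == 0 marks a leaf holding body.data. */
int countnodes(const DOCtree_t *node)
{
    const DOCtree_t *child;
    int total = 1;                      /* this node */

    if (node == NULL)
        return 0;

    if (node->numchildren > 0) {
        for (child = node->body.children; child != NULL; child = child->s_next)
            total += countnodes(child);
    }
    return total;
}
```

The same loop shape would serve for printing tags or searching for a particular element, again only if the linkage assumptions hold for your version of the library.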
Here is an extract of the license for this software, from the SGML User Group. The full text is included with the sources.
Standard Generalized Markup Language Users' Group (SGMLUG) SGML Parser Materials 1. License SGMLUG hereby grants to any user: (1) an irrevocable royalty-free, worldwide, non-exclusive license to use, execute, reproduce, display, perform and distribute copies of, and to prepare derivative works based upon these materials; and (2) the right to authorize others to do any of the foregoing. [...] (d) SGMLUG has no knowledge of any conditions that would impair its right to license the SGML Parser Materials. Notwithstanding the foregoing, SGMLUG does not make any warranties or representations that the SGML Parser Materials are free of claims by third parties of patent, copyright infringement or the like, nor does SGMLUG assume any liability in respect of any such infringement of rights of third parties due to USER's operation under this license.