Package dap :: Package responses :: Package html :: Module nanosax :: Class nsparser

Class nsparser

object --+
         |
        nsparser

A very simple parser for XML.

It emits events as SAX does, but through a considerably simpler API.

Instance Methods
  __init__(self, handler)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
  parse(self, xml)
parse the xml using self.handler
  _handle_pis(self, xml)
handle processing instructions
  _parse_into_chunks(self, xml)
  _parse_start(self, lineno, data)
  _test(self, assertion, lineno, message)
raises an exception with message as text when assertion is false

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__


Class Variables
  TYPE_TEXT = 1
  TYPE_START = 2
  TYPE_END = 3
  TYPE_COMMENT = 4
  TYPE_CDATA = 5
  _reg_name = <_sre.SRE_Pattern object at 0x1728da0>
  _reg_start = <_sre.SRE_Pattern object at 0x16264f0>
  _reg_attr = <_sre.SRE_Pattern object at 0x15453e0>
  _reg_xml_decl = <_sre.SRE_Pattern object at 0x159bbd8>
  _reg_encoding = <_sre.SRE_Pattern object at 0x1705368>
  _reg_pi = <_sre.SRE_Pattern object at 0x1728e00>
  _reg_dtd_1 = <_sre.SRE_Pattern object at 0x152cb80>
  _reg_dtd_2 = <_sre.SRE_Pattern object at 0x163b110>

Properties

Inherited from object: __class__


Method Details

__init__(self, handler)
(Constructor)

x.__init__(...) initializes x; see x.__class__.__doc__ for signature
Overrides: object.__init__
(inherited documentation)

parse(self, xml)

parse the xml using self.handler

xml should be a unicode string, an ASCII string, or a string encoded in the character set named in its XML declaration.

_handle_pis(self, xml)

handle processing instructions

Handles the XML declaration where appropriate, and discards any processing instructions, document type declarations and similar constructs that the library cannot deal with.

Returns unicode. If the input string is not already unicode, it is converted using the charset named in the XML declaration, if there is one.
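
The behaviour described for _handle_pis can be sketched roughly as follows. This is a minimal Python 3 illustration, not the module's source: the function name `handle_pis_sketch`, the `default_encoding` parameter and the choice to compile the patterns without flags are all assumptions. Only the three pattern strings are taken from the class variables documented on this page. The Python 2 `unicode` type mentioned above corresponds to `str` here.

```python
import re

# Pattern strings copied from the documented class variables.
# Assumption: compiled without flags, since the docs do not show any.
REG_XML_DECL = re.compile(r'<\?xml.*?>')
REG_ENCODING = re.compile(r'encoding="([^"]+)"')
REG_PI = re.compile(r'<\?.*?>')


def handle_pis_sketch(xml, default_encoding="ascii"):
    """Return text with the XML declaration and other PIs removed.

    Hypothetical helper; name and default_encoding are illustrative only.
    """
    if isinstance(xml, bytes):
        # latin-1 maps every byte to one character, so this probe cannot
        # fail; it is used only to locate the declaration.
        probe = xml.decode("latin-1")
        decl = REG_XML_DECL.search(probe)
        enc = REG_ENCODING.search(decl.group(0)) if decl else None
        xml = xml.decode(enc.group(1) if enc else default_encoding)
    # The XML declaration is itself matched by the PI pattern.
    return REG_PI.sub("", xml)
```

Probing the raw bytes before decoding avoids guessing the encoding twice; whether the real method does the same is not stated in the documentation.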

_parse_into_chunks(self, xml)

_parse_start(self, lineno, data)

_test(self, assertion, lineno, message)

raises an exception with message as text when assertion is false
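
A guard of this kind might look like the sketch below. The exception class `ParseError`, the function name `check` and the inclusion of the line number in the message text are assumptions; the documentation only states that an exception carrying the message is raised when the assertion is false.

```python
class ParseError(Exception):
    """Hypothetical exception type; the real class is not named in the docs."""


def check(assertion, lineno, message):
    # Do nothing when the assertion holds; raise otherwise.
    # Assumption: the line number is prepended to the message.
    if not assertion:
        raise ParseError("line %d: %s" % (lineno, message))
```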

Class Variable Details

TYPE_TEXT

Value:
1                                                                     
      

TYPE_START

Value:
2                                                                     
      

TYPE_END

Value:
3                                                                     
      

TYPE_COMMENT

Value:
4                                                                     
      

TYPE_CDATA

Value:
5                                                                     
      

_reg_name

Value:
^[\w:-]+$                                                              
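
As an illustration of what this pattern accepts, the sketch below compiles the documented string and applies it to a few candidate names. The helper name `is_name` is hypothetical, and compiling without flags is an assumption.

```python
import re

# Pattern string copied from _reg_name; flags are an assumption (none).
REG_NAME = re.compile(r'^[\w:-]+$')


def is_name(text):
    """Return True if text consists only of word characters, ':' and '-'."""
    return REG_NAME.match(text) is not None
```

Note that the pattern is looser than the XML specification: a name such as `1abc`, which starts with a digit, is accepted.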
      

_reg_start

Value:
^([\w:-]+)(\s+(([\w:-]+)=(("(?=([^"]*)")[^"]*")|('(?=([^']*)')[^']*'))))*$
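
The sketch below shows how a pattern of this shape can validate the inside of a start tag and pull out the element name. It assumes the input is the tag content without the surrounding angle brackets, which the documentation does not confirm; the helper name `match_start` is hypothetical and the pattern is compiled without flags.

```python
import re

# Pattern string copied from _reg_start; flags are an assumption (none).
REG_START = re.compile(
    r"""^([\w:-]+)(\s+(([\w:-]+)=(("(?=([^"]*)")[^"]*")|('(?=([^']*)')[^']*'))))*$"""
)


def match_start(data):
    """Return the element name if data is a well-formed start tag body."""
    m = REG_START.match(data)
    # Group 1 is the element name; attribute values must be quoted.
    return m.group(1) if m else None
```

Because the attribute group is repeated, a successful match retains only the last attribute in its capture groups, which is presumably why a separate attribute pattern exists.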
      

_reg_attr

Value:
([\w:-]+)((="(?=([^">]*)"))|(='(?=([^'>]*)')))                         
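
One way to collect attributes with this pattern is sketched below. The helper name `attributes` and the use of `finditer` are assumptions about usage, not the module's code. The value is captured inside a lookahead, so each match consumes only the name, the equals sign and the opening quote.

```python
import re

# Pattern string copied from _reg_attr; flags are an assumption (none).
REG_ATTR = re.compile(r"""([\w:-]+)((="(?=([^">]*)"))|(='(?=([^'>]*)')))""")


def attributes(data):
    """Return a dict of attribute names to values found in data."""
    result = {}
    for m in REG_ATTR.finditer(data):
        # Group 4 holds a double-quoted value, group 6 a single-quoted one.
        value = m.group(4) if m.group(4) is not None else m.group(6)
        result[m.group(1)] = value
    return result
```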
      

_reg_xml_decl

Value:
<\?xml.*?>                                                             
      

_reg_encoding

Value:
encoding="([^"]+)"                                                     
      

_reg_pi

Value:
<\?.*?>                                                                
      

_reg_dtd_1

Value:
<!DOCTYPE\s+[\w:-]+\s+\[.*?\]>                                         
      

_reg_dtd_2

Value:
<!DOCTYPE\s+[\w:-]+\s+SYSTEM\s+.*?>
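
The two DOCTYPE patterns can be exercised as in the sketch below, which removes a document type declaration from a string. The helper name `strip_doctype` is hypothetical, and the patterns are compiled without flags because the documentation does not show any.

```python
import re

# Pattern strings copied from _reg_dtd_1 and _reg_dtd_2.
# Assumption: no flags, so '.' does not match newlines here.
REG_DTD_1 = re.compile(r'<!DOCTYPE\s+[\w:-]+\s+\[.*?\]>')
REG_DTD_2 = re.compile(r'<!DOCTYPE\s+[\w:-]+\s+SYSTEM\s+.*?>')


def strip_doctype(xml):
    """Remove an internal-subset or SYSTEM document type declaration."""
    xml = REG_DTD_1.sub("", xml)
    return REG_DTD_2.sub("", xml)
```

As written, neither pattern matches a `PUBLIC` declaration, and without a flag such as `re.DOTALL` an internal subset spanning several lines would not match either.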