C++ HTML Parser, HTML Scraper, Convert HTML to well-formed XML

Chilkat C++ HTML-to-XML / HTML Parser Library
* Read the README.html contained in each download.
C++ Linking Help
Download Chilkat C++ Libraries for VC++ 9.0 / Win32
Download Chilkat C++ Libraries for VC++ 8.0 / Win32
Download Chilkat C++ Libraries for VC++ 8.0 / x64
Download Chilkat C++ Libraries for VC++ 7.0
Download Chilkat C++ Libraries for VC++ 6.0
HTML to XML Conversion C++ Library.
The Chilkat HTML-to-XML library is designed to transform HTML into well-formed XML for parsing. In effect, it is an HTML parser / scraper. Once HTML is converted to XHTML (i.e. well-formed XML), the many existing XML parsing components and libraries can be used for HTML parsing and scraping.
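To make "well-formed" concrete: HTML allows void elements such as <br> and <img> to stand without a closing tag, while XML requires every element to be closed. The sketch below shows one small piece of such a conversion, rewriting void elements in self-closing form. The function name selfCloseVoidTags is hypothetical and is not part of the Chilkat API; the code assumes attribute values and comments contain no '>' character, which a real converter cannot assume.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper for illustration only; not the Chilkat implementation.
// Rewrites void HTML elements (<br>, <img ...>) as self-closing XML (<br />).
// Assumes no '>' appears inside attribute values or comments.
std::string selfCloseVoidTags(const std::string& html) {
    static const std::set<std::string> kVoid = {
        "br", "hr", "img", "input", "meta", "link"};
    std::string out;
    std::size_t i = 0;
    while (i < html.size()) {
        if (html[i] != '<') { out += html[i++]; continue; }
        std::size_t end = html.find('>', i);
        if (end == std::string::npos) {           // unterminated tag: copy the rest
            out.append(html, i, std::string::npos);
            break;
        }
        std::string tag = html.substr(i + 1, end - i - 1);  // text between < and >
        std::size_t n = 0;
        while (n < tag.size() && std::isalnum(static_cast<unsigned char>(tag[n]))) ++n;
        std::string name = tag.substr(0, n);      // empty for closing tags like </p>
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        bool alreadyClosed = !tag.empty() && tag.back() == '/';
        if (kVoid.count(name) != 0 && !alreadyClosed) {
            out += "<" + tag + " />";
        } else {
            out += "<" + tag + ">";
        }
        i = end + 1;
    }
    return out;
}
```

Given the input <p>a<br>b</p>, this sketch produces <p>a<br />b</p>, which an XML parser can then load. A full converter also has to balance unclosed tags, quote attribute values, and escape stray ampersands.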
File-to-file HTML to XML conversion.
Memory-to-memory HTML to XML conversion.
Convert character encoding during conversion process.
Flexibility in controlling how HTML entities are handled.
Automatically convert HTML entities to corresponding 8-bit characters.
Optionally drop all text formatting tags from the output.
Drop/undrop specific tags from the output.
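Two of the features above, converting HTML entities to 8-bit characters and dropping specific tags, can be illustrated with a short standalone sketch. The names decodeEntitiesLatin1 and dropTags are hypothetical and do not come from the Chilkat API. The entity table is a tiny sample, and the output encoding is assumed to be ISO-8859-1. Entities such as &amp;amp; and &amp;lt; are deliberately left alone, since decoding them would make the XML ill-formed.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <map>
#include <set>
#include <string>

// Hypothetical helper: replaces a few named entities and numeric references
// in the range 128..255 with the matching ISO-8859-1 byte. Anything else
// (including &amp; and &lt;) is copied through unchanged.
std::string decodeEntitiesLatin1(const std::string& in) {
    static const std::map<std::string, unsigned char> kNamed = {
        {"nbsp", 0xA0}, {"copy", 0xA9}, {"reg", 0xAE}, {"eacute", 0xE9}};
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        std::size_t semi = (in[i] == '&') ? in.find(';', i) : std::string::npos;
        if (semi == std::string::npos || semi - i > 10) { out += in[i++]; continue; }
        std::string body = in.substr(i + 1, semi - i - 1);
        int code = -1;
        if (body.size() > 1 && body[0] == '#') {
            bool hex = body[1] == 'x' || body[1] == 'X';
            const char* digits = body.c_str() + (hex ? 2 : 1);
            char* endp = nullptr;
            long v = std::strtol(digits, &endp, hex ? 16 : 10);
            if (endp != digits && *endp == '\0' && v >= 128 && v < 256) {
                code = static_cast<int>(v);
            }
        } else {
            auto it = kNamed.find(body);
            if (it != kNamed.end()) code = it->second;
        }
        if (code < 0) { out += in[i++]; continue; }  // unknown entity: keep as-is
        out += static_cast<char>(code);
        i = semi + 1;
    }
    return out;
}

// Hypothetical helper: removes the opening and closing tags whose lowercase
// names are in `drop`, keeping the text between them.
std::string dropTags(const std::string& html, const std::set<std::string>& drop) {
    std::string out;
    std::size_t i = 0;
    while (i < html.size()) {
        std::size_t end = (html[i] == '<') ? html.find('>', i) : std::string::npos;
        if (end == std::string::npos) { out += html[i++]; continue; }
        std::size_t n = i + 1;
        if (n < end && html[n] == '/') ++n;          // skip the slash of a closing tag
        std::size_t start = n;
        while (n < end && std::isalnum(static_cast<unsigned char>(html[n]))) ++n;
        std::string name = html.substr(start, n - start);
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        if (drop.count(name) == 0) out.append(html, i, end - i + 1);
        i = end + 1;
    }
    return out;
}
```

For example, calling dropTags on <p>Hello <b>bold</b></p> with the set {"b"} leaves <p>Hello bold</p>. Dropping all text formatting tags amounts to passing a fixed set such as b, i, u, font and so on; the exact set the library uses is not stated here.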
HTML / XML Examples
HTML Example #1
HTML Example #2
HTML Example #3