HTMLReader


A WHATWG-compliant HTML parser in Objective-C.

Overview

HTMLReader is an HTML parser written in Objective-C. It follows the WHATWG HTML specification and supports CSS selectors, so it parses markup the way modern web browsers do. Developers can use it to parse and manipulate HTML on iOS, macOS, tvOS, and watchOS. It is simple to install and well suited to scraping HTML content.

HTMLReader does not build on older, less flexible parsing libraries, so it supports modern markup and avoids problems common with legacy tools. It covers both extracting specific elements and restructuring HTML content.

Features

  • Cross-Platform Support: HTMLReader is compatible with iOS, macOS, tvOS, and watchOS, allowing for versatile application development across Apple's ecosystem.

  • Modern Browser Parsing: It accurately parses HTML in a way that mimics modern browsers, addressing the limitations of older libraries.

  • CSS Selector Integration: You can use CSS selectors to find and manipulate nodes in a parsed document, which makes it easier to work with HTML programmatically.

  • Simple Installation Options: HTMLReader can be integrated into your project via various methods, including CocoaPods, Carthage, and Swift Package Manager, providing flexibility in setup.

  • Lightweight Dependency: The library relies solely on Foundation, minimizing overhead and simplifying builds.

  • Tested with html5lib: HTMLReader is tested against the html5lib tokenization and tree construction test suites, which check its parsing behavior against the specification.

  • Rich Documentation: The tool comes with comprehensive examples and documentation to help users quickly understand and implement its features.
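
As a rough illustration of the CSS selector support listed above, the following Objective-C sketch parses a small fragment and logs the text of the first matching element. It assumes the library is imported as <HTMLReader/HTMLReader.h> and that HTMLDocument provides documentWithString: and firstNodeMatchingSelector:. The markup string is invented for the example, so treat this as a starting point rather than a definitive usage guide.

```objc
#import <Foundation/Foundation.h>
#import <HTMLReader/HTMLReader.h>  // assumed import path for the framework

int main(void) {
    @autoreleasepool {
        // Sample markup, made up for this illustration.
        NSString *markup = @"<p><b>Ahoy there sailor!</b></p>";

        // Parse the string into a document tree.
        HTMLDocument *document = [HTMLDocument documentWithString:markup];

        // Look up the first element that matches a CSS selector.
        HTMLElement *bold = [document firstNodeMatchingSelector:@"b"];

        // Log the text inside the matched element.
        NSLog(@"%@", bold.textContent);
    }
    return 0;
}
```

The same selector lookup can be pointed at any selector the library supports, for example a class or attribute selector, once a document has been parsed.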