Meeseeks


An Elixir library for parsing and extracting data from HTML and XML with CSS or XPath selectors.

Overview

Meeseeks parses HTML and XML documents and extracts data from them using CSS or XPath selectors. It pairs a browser-grade HTML5 parser with a permissive XML parser, so the same API covers both web pages and XML documents.

Beyond the built-in CSS and XPath selectors, Meeseeks supports custom selectors and provides extraction helpers for pulling attributes, text, and markup out of the nodes a query returns.
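A minimal sketch of that query workflow, assuming the script is run with `elixir` and can fetch the package from Hex. The HTML snippet and the variable names are invented for illustration; `Meeseeks.parse/1`, `Meeseeks.all/2`, `Meeseeks.text/1`, `Meeseeks.attr/2`, and the `css/1` macro come from the library.

```elixir
# Fetch Meeseeks for this script (no version pinned; pin one in a real project).
Mix.install([:meeseeks])

# css/1 is a macro, so the module must be imported before use.
import Meeseeks.CSS

# Hypothetical input document.
html = """
<ul id="links">
  <li><a href="/a">First</a></li>
  <li><a href="/b">Second</a></li>
</ul>
"""

# Parse once, then run as many queries as needed against the document.
document = Meeseeks.parse(html)

# Select every anchor under #links and extract its text and href.
links =
  for link <- Meeseeks.all(document, css("#links a")) do
    %{text: Meeseeks.text(link), href: Meeseeks.attr(link, "href")}
  end

IO.inspect(links)
```

Parsing up front is optional for a one-off query, since the selection functions also accept a raw source string, but reusing a parsed document avoids re-parsing on each query.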

Features

  • Friendly API: A small set of functions covers parsing, selecting, and extracting for both HTML and XML.

  • Browser-grade HTML5 parser: Handles modern HTML standards effectively, ensuring accurate parsing of web content.

  • Permissive XML parser: Parses XML leniently, so documents that a strict parser might reject can still be queried.

  • CSS and XPath selectors: Select nodes with either CSS or XPath, whichever suits the query.

  • Supports custom selectors: Customize the way you select nodes, tailoring queries to meet specific requirements.

  • Helpers for data extraction: Built-in functions like attr, html, and text make retrieving information straightforward and efficient.

  • Compatibility with Elixir and Erlang: Tested against a minimum of Elixir 1.16.0 and Erlang/OTP 26.0.

  • No Rust installation required: Ships pre-compiled NIFs via rustler_precompiled, so no local Rust toolchain is needed during setup.
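The XML parser, XPath selectors, and extraction helpers listed above can be combined roughly as follows. This is a sketch under assumptions: the XML catalog and variable names are made up for illustration, and the script assumes the package can be fetched from Hex. `Meeseeks.parse/2` with `:xml`, `Meeseeks.one/2`, `Meeseeks.html/1`, and the `xpath/1` macro are part of the library.

```elixir
# Fetch Meeseeks for this script (no version pinned; pin one in a real project).
Mix.install([:meeseeks])

# xpath/1 is a macro, so the module must be imported before use.
import Meeseeks.XPath

# Hypothetical XML input.
xml = """
<catalog>
  <book id="b1"><title>Dune</title></book>
  <book id="b2"><title>Emma</title></book>
</catalog>
"""

# The second argument selects the XML parser instead of the HTML5 parser.
document = Meeseeks.parse(xml, :xml)

# all/2 returns every match; text/1 extracts the text of each result.
titles =
  for title <- Meeseeks.all(document, xpath("//book/title")) do
    Meeseeks.text(title)
  end

# one/2 returns only the first match (or nil when nothing matches).
second = Meeseeks.one(document, xpath("//book[@id='b2']"))
second_id = Meeseeks.attr(second, "id")

# html/1 serializes the selected node back to markup.
second_markup = Meeseeks.html(second)

IO.inspect({titles, second_id, second_markup})
```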