FunpySpiderSearchEngine


Word2vec personalized search (results tailored to each user) + Scrapy 2.3.0 (data crawling) + ElasticSearch 7.9.1 (data storage and an external RESTful API) + Django 3.1.1 (search)

Overview

FunpySpiderSearchEngine combines Word2vec, Scrapy, ElasticSearch, and Django into a personalized search engine. Scrapy crawls the data, ElasticSearch stores it and exposes it through a RESTful API, and Django provides the search functionality that users interact with. The goal is search that stays relevant to each individual user rather than returning the same results to everyone.

Redis drives a real-time display of crawl activity, and word embeddings from Word2vec let the system go beyond exact matches: users find what they are looking for and are also shown contextually related suggestions.
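
The idea of suggesting contextually related terms from word embeddings can be sketched as a nearest-neighbour lookup by cosine similarity. This is only an illustration: the three-dimensional vectors below are made-up toy values standing in for trained Word2vec embeddings, and `cosine` and `related_terms` are hypothetical helpers, not functions from this project.

```python
import math

# Toy embeddings: hand-written stand-ins for the vectors a trained
# Word2vec model would provide (real ones usually have 100+ dimensions).
TOY_VECTORS = {
    "python": [0.9, 0.1, 0.0],
    "django": [0.8, 0.3, 0.1],
    "scrapy": [0.7, 0.2, 0.3],
    "coffee": [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def related_terms(word, vectors, top_n=2):
    """Return the top_n words most similar to `word`, best match first."""
    if word not in vectors:
        return []  # unknown word: nothing to suggest
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:top_n]]


print(related_terms("python", TOY_VECTORS))  # ['django', 'scrapy']
```

In a real deployment the lookup table would presumably come from a model trained on the crawled text or on users' search history, and the related terms could be shown as suggestions or used to widen the query.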

Features

  • Data Crawling with Scrapy: Seamlessly crawl data from various sources, such as Zhihu, to populate your database for extensive search capabilities.
  • ElasticSearch for Storage: Efficiently store and manage crawled data with ElasticSearch, providing a robust backend ready to support scalable search functionalities.
  • Full-text Search: Run full-text searches that return matching results with the query keywords highlighted.
  • Real-time Data Display with Redis: Utilize Redis to show the real-time count of crawled items across three different sites, improving transparency and user engagement.
  • Word2vec Integration: Enhance search relevancy through the use of Word2vec, allowing semantic understanding of queries and improving ranking based on historical user behavior.
  • Dynamic Scoring Mechanism: Apply custom scoring that doubles the score of results whose titles contain the query keywords, so those results rank higher.
  • User-friendly Setup: Clear instructions for installation and configuration make it easy for developers to set up ElasticSearch and Redis, getting the project up and running with minimal hassle.
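
The full-text search, keyword highlighting, and title-weighted scoring described above could be expressed in a single Elasticsearch request body. The sketch below is one plausible way to do it, not the project's actual query: the field names `title` and `content`, the highlight tags, and the helper `build_search_body` are all assumptions made for illustration.

```python
import json


def build_search_body(keywords, page=1, page_size=10):
    """Build an Elasticsearch search request body as a plain dict.

    Field names ("title", "content") are assumed, not taken from the project.
    """
    return {
        "query": {
            "multi_match": {
                "query": keywords,
                # "title^2" applies a boost of 2 to matches in the title field.
                "fields": ["title^2", "content"],
            }
        },
        # Pagination: skip the hits already shown on earlier pages.
        "from": (page - 1) * page_size,
        "size": page_size,
        "highlight": {
            # Wrap each matched keyword so the front end can style it.
            "pre_tags": ['<span class="keyWord">'],
            "post_tags": ["</span>"],
            "fields": {"title": {}, "content": {}},
        },
    }


# Print the body for the second page of results for "django".
print(json.dumps(build_search_body("django", page=2), indent=2))
```

A body like this can be sent to the `_search` endpoint of an index over the RESTful API, or passed to the `search` method of the official Python client. The index name depends on how the crawled data was stored.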
Django

Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. It follows the model-template-view (MTV) pattern, a close relative of model-view-controller (MVC), and provides an extensive set of built-in tools and conventions that streamline the creation of robust, scalable web applications.