Overview
This project is a two-part Python application that combines web scraping and web development using two frameworks: Scrapy and Flask. It scrapes movie information from the Internet Movie Database (IMDb), persists it in MongoDB, and presents it through a web interface. It is meant as a worked example for developers and movie enthusiasts who want to see how collected data can be stored and displayed end to end.
The first part of the project is a Scrapy spider that collects movie data such as names, ratings, and genres and stores it in a MongoDB database. The second part is a Flask web application that reads from that database and lets users browse and search the collected information.
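The hand-off from the spider to the database can be sketched as a Scrapy item pipeline. This is a minimal illustration, not the project's actual code: the class name `MongoMoviePipeline`, the helper `normalize_movie`, the field names (`name`, `rating`, `year`, `genres`), and the connection settings are all assumptions made for the example.

```python
def normalize_movie(raw):
    """Coerce scraped strings into typed fields (field names are assumed)."""
    genres = raw.get("genres") or []
    if isinstance(genres, str):
        # Accept a comma-separated string such as "Crime, Drama".
        genres = [g.strip() for g in genres.split(",") if g.strip()]
    return {
        "name": str(raw["name"]).strip(),
        "rating": float(raw["rating"]),
        "year": int(raw["year"]),
        "genres": list(genres),
    }


class MongoMoviePipeline:
    """Hypothetical Scrapy item pipeline that upserts movies into MongoDB."""

    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="imdb", collection_name="movies"):
        # URI, database, and collection names are placeholders.
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Imported here so the class can be loaded without the driver installed.
        import pymongo
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        doc = normalize_movie(dict(item))
        # Upsert on (name, year) so re-running the spider does not duplicate rows.
        self.collection.update_one(
            {"name": doc["name"], "year": doc["year"]},
            {"$set": doc},
            upsert=True,
        )
        return item
```

A pipeline like this would be enabled through the `ITEM_PIPELINES` setting in the Scrapy project; the Flask side can then read from the same collection.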
Features
- Web Scraping with Scrapy: A Scrapy spider fetches movie details from IMDb by parsing lists such as the Top 250 movies.
- MongoDB Integration: Scraped movie data is stored in and retrieved from MongoDB, which keeps larger datasets manageable.
- Flask Web Application: A lightweight web app built with Flask presents the scraped movie information, enabling easy access for users.
- User Search Functionality: Users can search for movies by name, rating, genre, or year, offering a flexible way to find specific movie details.
- Predefined Queries: A sidebar in the web application allows users to quickly access predefined queries for common movie categories.
- Local Development Setup: A Vagrant-based setup lets users launch the environment and run the application locally.
- Responsive Design: The web application renders the gathered data in a clear, visually appealing layout.
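As an illustration of the search feature, the sketch below turns optional search parameters into a MongoDB filter document. It is an assumption-laden example rather than the project's implementation: the function name `build_movie_query`, the field names, and the choice to treat the rating as a minimum threshold are all hypothetical.

```python
import re


def build_movie_query(name=None, min_rating=None, genre=None, year=None):
    """Build a MongoDB filter from optional search fields (names are assumed)."""
    query = {}
    if name:
        # Case-insensitive substring match; escape user input so it is
        # treated as literal text, not as a regular expression.
        query["name"] = {"$regex": re.escape(name), "$options": "i"}
    if min_rating is not None:
        # Assumption: a rating search means "at least this rating".
        query["rating"] = {"$gte": float(min_rating)}
    if genre:
        # Matching a scalar against an array field selects documents
        # whose array contains that value.
        query["genres"] = genre
    if year is not None:
        query["year"] = int(year)
    return query
```

In a Flask view, the resulting dictionary could be passed to a pymongo collection's `find()` method, with the arguments taken from the request's query string. A predefined sidebar query would then be one fixed set of arguments to the same function.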