Data Analyst: Pujan Vakharia
Client/Sponsor: Metabis
Purpose
Avocados are one of the health foods (or rather, health fruits) growing in popularity. With rising awareness of health-conscious consumption of food and beverages, avocados have become a preferred ingredient for many consumers over the last decade. To understand the future of avocados, this study examines whether the rise in avocado demand correlates significantly with the rise in health-conscious consumption. It also assesses the interplay of supply and demand to determine which of the two adjusted first to rising avocado popularity. The working assumption is that supply followed demand; we would like to confirm this through pricing trends and to see how future supply trends affect demand and price. The insights generated should help predict avocado consumption over the next 5-10 years.
Note: Data used for this project is publicly available here.
Scope / Major Project Activities
Activity | Description |
Acquire historical avocado pricing, sales and production data | Collect the data essential to understanding supply and demand (avocado prices, sales and production), preferably covering the past decade. |
Processing: Cleaning and consolidation | Preparing the dataset for analysis |
Analysis | Test hypotheses and perform exploratory data analysis to answer various open-ended questions and understand the world of avocados inside and out |
Test Correlation with Health Awareness | Find data on health awareness and compare it with avocado demand to see whether the two are correlated. |
Identify supply and demand trend relation and pricing | Compare sales and production data to see how they relate. |
Create prediction models | Test different models such as multiple linear regression, random forest, etc. |
Final report and predictions | Final report discussing the impact of the health-conscious consumption trend, supply and demand, etc., plus predictions for the next 5-10 years. |
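As a rough illustration of the correlation and supply-versus-demand activities above, the sketch below computes a Pearson correlation and then repeats it at several time lags to see whether supply tends to follow demand. It assumes two equally spaced, time-ordered series (for example monthly totals); the function names and the numbers are hypothetical and do not come from the project data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(demand, supply, lag):
    """Correlate demand at time t with supply at time t + lag.

    A positive lag means supply is compared with earlier demand
    (supply follows demand); a negative lag means supply leads.
    """
    if lag > 0:
        return pearson(demand[:-lag], supply[lag:])
    if lag < 0:
        return pearson(demand[-lag:], supply[:lag])
    return pearson(demand, supply)


def best_lag(demand, supply, max_lag):
    """Return the lag in [-max_lag, max_lag] with the highest correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: lagged_correlation(demand, supply, k))


if __name__ == "__main__":
    # Synthetic numbers for illustration only (not HAB data):
    # supply repeats demand two periods later.
    demand = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12]
    supply = [2, 2] + demand[:-2]
    print("best lag:", best_lag(demand, supply, max_lag=3))
```

A positive best lag would be consistent with the assumption that supply adjusted after demand, but it does not establish causation. Two series that both trend upward can also correlate strongly for unrelated reasons, so it may be safer to run the same check on period-over-period changes rather than raw levels.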
This Project Does Not Include
- Any item other than avocados.
- Data beyond that collected from the Hass Avocado Board (HAB). Historical production data depends on what is available on the HAB website or other public dataset platforms, and may not cover the whole decade.
- International predictions. Predictions cover only the USA, and only the country as a whole. Granular details such as region-level data may be used in predicting national demand, but sales predictions for specific regions are not provided.
Deliverables
Deliverable | Description/ Details |
Finalized dataset | The cleaned and processed dataset based on which the analysis is conducted. |
Initial Insights | Insights from the primary analysis, including exploratory data analysis and other general findings. |
Predictions | Prediction of avocado demand for the USA as a whole (5-10 year time period). |
Schedule Overview / Major Milestones
Milestone | Expected Completion Date | Description/Details |
All datasets procured | 4-Feb-2024 | Information to be collected from the Hass Avocado Board |
Datasets are cleaned and processed | 18-Feb-2024 | Datasets to be cleaned, aggregated and prepared for further analysis, with tests to ensure consistency |
Exploratory data analysis report generated | 3-Mar-2024 | Exploratory data analysis to be conducted and an initial insights report generated; includes research to identify key questions and further research to answer them qualitatively |
Prediction model finalized with 80%+ accuracy | 17-Mar-2024 | Various prediction models to be tried and tested against test data, with the best one finalized. (Kaggle tasks to be completed beforehand, if possible) |
Predictions and corresponding report generated | 31-Mar-2024 | Final report to be finalized, generated and forwarded for publishing, promotion, SEO procedures, continuous tracking and periodic improvement |
*Estimated date for completion
Estimated date for completion is 31-Mar-2024.
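The "80%+ accuracy" target in the milestones is not tied to a specific metric. One possible reading, sketched below, treats accuracy as 1 minus the mean absolute percentage error (MAPE) on the most recent part of the data, held out chronologically. This interpretation, the 20% test fraction and all function names are assumptions made for illustration, not part of the agreed scope.

```python
def chronological_split(series, test_fraction=0.2):
    """Split a time-ordered series so the test set is the most recent part."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    cut = int(round(len(series) * (1 - test_fraction)))
    return series[:cut], series[cut:]


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.1 = 10%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def meets_accuracy_target(actual, predicted, target=0.80):
    """Check an assumed accuracy definition: 1 - MAPE >= target."""
    return (1 - mape(actual, predicted)) >= target
```

Splitting by time rather than at random keeps future observations out of the training set, which matters when the model is meant to forecast 5-10 years ahead. Other metrics (for example R-squared or RMSE) would be equally defensible; the point is to fix one definition before models are compared.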
This scope-of-work layout is derived from the Scope-of-Work-Template provided by Google as part of the Google Data Analytics courses.