Bethuel Godstime
let's help you sell your business
01/07/2023
Data Analytics Challenge at Harvoxx Tech. Hub with The Sage, Sir Stewart Ezekiel Jnr.
Hi there...!
It is Day 8 of our challenge. Thankfully, we are edging closer to the end. You can drop a question in the comment box if you have one, or any thoughts about this analysis that are on your mind, whether corrections, suggestions, additions, or even another analysis you would like us to look at.
Today, we will seek to define some CORRELATIONS. I should have checked this earlier, but it is better late than never.
A correlation can be defined as an association existing between two entities; in this case, the discounted price and the actual price. However, correlation should not be mistaken for causation. Story for another time.
A correlation could be positive linear (a coefficient of +1 indicates a perfect positive linear correlation), negative linear (a coefficient of -1 indicates a perfect negative linear correlation), curvilinear, or absent altogether (where the coefficient is 0).
The attached graphs and tables help to confirm the association existing between these two from our dataset.
This means that as one rises, the other tends to rise as well.
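For anyone who wants to see what sits behind a correlation coefficient, here is a minimal sketch in Python. The analysis in this challenge was done in Excel, so Python is used only for illustration. The function name `pearson_r` and the prices are made up for the example; they are not rows from our dataset.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up prices for illustration only, not rows from the challenge dataset.
actual_price = [1000, 1500, 2000, 2500, 3000]
discounted_price = [800, 1100, 1700, 1900, 2600]

r = pearson_r(actual_price, discounted_price)
print(round(r, 3))
```

A value close to +1 means the two prices rise together; a value close to -1 means one falls as the other rises; a value near 0 means no linear association.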
Tomorrow, we will use Power BI to further analyze and finally visualize our dataset. Stay tuned. For now,
Good night, and happy new month again! Welcome to JULY...!!
01/07/2023
Data Analytics Challenge at Harvoxx Tech. Hub with The Sage, Sir Stewart Ezekiel Jnr.
It is Day 7
TOPIC: PIVOT TABLES
A PIVOT TABLE is a data analysis tool that can automatically summarize, sort, count, aggregate, and average your dataset in a spreadsheet, while displaying the summarized values in a new table. A pivot table acts as a sort of query against a source dataset, with the data source usually located in the same spreadsheet.
It has become one of Excel’s most powerful data analysis tools, with extensive usage by financial analysts around the world.
Thanks to these summarization and aggregation features, users can easily pick out major data points to help answer statistical or business questions from the spreadsheet. This makes pivot tables an important tool for institutions that need to analyze large amounts of data.
Pivot tables also come in handy in designing dashboards, as they allow you to summarize, analyze, explore, and present summary data in a dynamic way. Pivot Charts complement Pivot Tables by adding visualizations to the summary data in a Pivot Table. This way, you can easily see comparisons, patterns, and trends. This makes them an important tool for creating dashboards in Microsoft Excel and Microsoft Power BI.
With achievable project objectives in mind, I have generated a pivot table that will be used to design interactive dashboards, so stakeholders can clearly see the inferences drawn from our analysis.
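To show what a pivot table does under the hood, here is a rough sketch in Python. It is not the Excel pivot table built for this challenge. The helper `pivot_by`, the field names, and the rows are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows; the categories and prices are invented for illustration.
rows = [
    {"category": "Cables", "discounted_price": 300.0},
    {"category": "Cables", "discounted_price": 500.0},
    {"category": "Televisions", "discounted_price": 20000.0},
    {"category": "Televisions", "discounted_price": 30000.0},
    {"category": "Headphones", "discounted_price": 1500.0},
]

def pivot_by(rows, key, value):
    """Group rows by `key` and return count, sum and average of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        name: {"count": len(vals), "sum": sum(vals), "average": sum(vals) / len(vals)}
        for name, vals in sorted(groups.items())
    }

summary = pivot_by(rows, "category", "discounted_price")
for name, stats in summary.items():
    print(name, stats)
```

The grouping field plays the part of the pivot table's row labels, and the count, sum, and average are the kind of summaries you would pick in the Values area.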
Thank you, and happy new month of JULY.
30/06/2023
Data Analytics Challenge at Harvoxx Tech. Hub with The Sage, Sir Stewart Ezekiel Jnr
Day 6/10, Part 2
REGRESSION ANALYSIS
Welcome back! Earlier, we discussed the descriptive statistics of our dataset. Now, let's proceed to regression analysis.
Regression analysis can be defined as a statistical method used to estimate the relationship between a dependent variable and one or more independent variables.
This analysis can be used to assess the strength of the relationship between variables and to model the future relationship between them.
The method is usually shown on a graph and tests the relationship between a dependent variable and one or more independent variables. The dependent variable changes as the independent variables change, and regression analysis attempts to answer which factors matter most to that change.
Our analysis revealed that actual price explains 92.52% of the variation observed in discounted prices. The remaining variation is attributed to factors not covered in this analysis.
The standard error is high, at 1898.86, across 1465 observations. So even with a strong fit, predictions of discounted price from actual price will typically miss by a sizeable amount.
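For the curious, here is a minimal sketch of how these two figures (the share of variation explained and the standard error) come out of a one-variable regression. It is written in Python for illustration only, since the results above came from Excel. The function name and the prices are made up, and I am assuming the standard error quoted above is the standard error of the regression.

```python
from math import sqrt

def simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares give R-squared,
    # the share of variation in y explained by x.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot

    # Standard error of the regression: the typical size of a prediction miss,
    # using n - 2 degrees of freedom for a one-predictor model.
    std_error = sqrt(ss_res / (n - 2))
    return slope, intercept, r_squared, std_error

# Made-up prices for illustration only, not rows from the challenge dataset.
actual_price = [1000, 1500, 2000, 2500, 3000]
discounted_price = [800, 1100, 1700, 1900, 2600]

slope, intercept, r_squared, std_error = simple_linear_regression(
    actual_price, discounted_price
)
print(round(slope, 2), round(r_squared, 4), round(std_error, 2))
```

The standard error is in the same units as the discounted price, so it should be judged against the size of the prices themselves.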
Thank you
30/06/2023
Data Analytics Challenge at Harvoxx Tech. Hub with The Sage, Sir Stewart Ezekiel Jnr
Day 6/10
After a break of over a week, we're back and refreshed for the second half of our challenge. 💪 💪
Going forward, our analysis goes deeper each day into explaining our dataset.
Coming up, we shall perform DESCRIPTIVE ANALYSIS and REGRESSION ANALYSIS on our dataset. Both are major parts of EXPLORATORY DATA ANALYSIS (EDA).
Descriptive statistics describe, show, and summarize the basic features of a dataset found in a given study, presented in a summary that describes the data sample and its measurements. They help analysts understand the dataset better.
THE FOLLOWING ARE COMPONENTS OF DESCRIPTIVE STATISTICS/ANALYSIS
The MEAN. This is one of the most popular ways of determining averages. The mean is simply the AVERAGE of our dataset: the sum of all the values divided by how many values there are.
The MODE. This is the most frequently occurring number or input in the dataset. It is like the most common outcome from the population that gave the dataset.
The MEDIAN. This is simply the middle number in a dataset after the dataset has been arranged in ascending or descending order.
The RANGE. This tells us how far apart the two end values of our dataset are from each other. It is simply the difference between the highest value and the lowest value in the dataset.
The STANDARD DEVIATION. This describes how far our data points typically are from the mean. The more dispersed the data points, the larger the standard deviation.
The VARIANCE. We can get the variance by squaring the standard deviation. It reflects the degree of spread of the dataset.
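Here is a short sketch of the measures above using Python's built-in `statistics` module. It is only an illustration; the values in `prices` are invented and are not from our dataset.

```python
import statistics

# Invented sample values for illustration, not taken from the challenge dataset.
prices = [200, 350, 350, 500, 900, 1200, 4000]

mean = statistics.mean(prices)            # sum of values / number of values
median = statistics.median(prices)        # middle value once sorted
mode = statistics.mode(prices)            # most frequently occurring value
value_range = max(prices) - min(prices)   # highest value minus lowest value
std_dev = statistics.stdev(prices)        # sample standard deviation
variance = statistics.variance(prices)    # sample variance (std_dev squared)

print(round(mean, 2), median, mode, value_range)
print(round(std_dev, 2), round(variance, 2))
```

Notice how the single large value (4000) pulls the mean well above the median. That gap is one quick hint that outliers are present.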
These descriptive statistics can be useful for:
1) Highlighting potential relationships between variables,
2) Providing basic information about variables in a dataset, and
3) Spotting outliers or other odd observations that could need more research.
Basically, descriptive analysis offers us a summary of our dataset.
If you love numbers, exploratory data analysis will most likely tickle your fancy.
Soon, we will get to see the REGRESSION ANALYSIS result of our dataset.
Will write to you soon. Bye for now!
Going to prepare Part 2...
13/06/2023
Data Analytics Challenge at Harvoxx Tech. Hub with The Sage, Sir Stewart Ezekiel Jnr.
Day 5
Hurray!! We're five (5) days into our challenge.
Today, let's talk more about...
DATA CLEANING!!!
Usually, we see the reports, presentations, or dashboards a data analyst has presented, and because visuals are easy to relate to, we might want to say, "Oh, I can do that with this or that design tool."
Sure, anybody with a background in graphic design can re-create those visuals, but we data analysts know that a successful project is more than beautiful dashboards.
Data analysts do a whole lot before arriving at those compelling visuals. Checks are done to verify the accuracy, completeness, usefulness, and correctness of the data. By identifying and eliminating mistakes and inconsistencies in your data, the data cleaning process aids in the achievement of project objectives.
Data cleaning is simply the process of altering data to ensure it is accurate and correct.
A dataset is checked manually, as well as against various databases, to remove duplicate copies and to remove or amend incorrect details, such as physical addresses, out-of-date formats, blanks, inappropriate delimiters, and so much more.
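As a small illustration of two of these checks (duplicate copies and blanks), here is a sketch in Python. It is not the procedure used in this challenge, where the cleaning was done in Excel. The helper `clean_rows`, the field names, and the records are hypothetical.

```python
def clean_rows(rows, required_fields):
    """Drop exact duplicate rows and rows with a blank required field."""
    seen = set()
    cleaned = []
    for row in rows:
        # Trim stray whitespace so " B001 " and "B001" count as the same value.
        normalized = {
            k: v.strip() if isinstance(v, str) else v for k, v in row.items()
        }
        if any(normalized.get(f) in ("", None) for f in required_fields):
            continue  # incomplete record
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint in seen:
            continue  # duplicate copy
        seen.add(fingerprint)
        cleaned.append(normalized)
    return cleaned

# Hypothetical records, invented for illustration.
raw_rows = [
    {"product_id": "B001", "product_name": "USB cable"},
    {"product_id": " B001 ", "product_name": "USB cable"},   # duplicate with stray spaces
    {"product_id": "B002", "product_name": ""},              # blank name
    {"product_id": "B003", "product_name": "HDMI cable"},
]

cleaned = clean_rows(raw_rows, required_fields=["product_id", "product_name"])
print([row["product_id"] for row in cleaned])
```

In a real project you would usually review the dropped rows before discarding them, since a blank can sometimes be amended instead of removed.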
A clean and accurate data set will ensure communications are only sent to people that will have a genuine interest or benefit.
Data cleaning is one of those instances we cannot wish away. IT MUST BE DONE.
Data cleaning double-checks the integrity of your dataset before analysis starts, because a lot of factors introduce errors at the point of collecting data (e.g. entry errors), as well as during data transfer or when combining data from different sources.
SIGNIFICANCE OF DATA CLEANING
💡 Saves time and increases productivity.
Imagine a shop that, after a discount or clearance sale, goes on to restock the very goods it struggled to clear. It would be giving discounts perpetually.
After all, you would probably admit that if it stocked products customers needed, they would sell out without any need for discounts.
💡 Boost results and revenue.
Clean data makes for better results. These better results then translate to better insights and business recommendations, and greater ROI on marketing and communications campaigns.
Even after considering customer acquisition costs, businesses want to make a profit, stay profitable, and even scale quickly.
💡 Protect reputation.
People don’t want to receive information that has no relevance to them, or insights that will hurt their business instead of growing it. This also applies to the reputation of the team in charge of data analytics.
By maintaining accurate data you ensure that your communications only reach people that would benefit from them. Not only does this increase the likelihood of generating business, it also helps maintain your brand integrity and reputation.
Ensuring that my data is clean and accurate is a priority, because no argument against data cleaning will ever be valid. That is not the kind of Data Analyst we were trained to be.
Tomorrow, we will dive into EXPLORATORY DATA ANALYSIS, another interesting concept in Data Analytics.
11/06/2023
DAY 4/10 Data Analytics Challenge with Sir Stewart Ezekiel Jnr, the Scribe of Harvoxx Tech. Hub.
Salutations, brothers!
Even though he is not an indigene, the new cat seems to know more about our routes to and from the Harvoxx Tech. Hub than I do. That is just Nanna Onoriode Felix, the explorer, being Nanna.
Understanding the dataset we have before us is key. Only then will you be able to see what is not on the surface, and be able to make connections and inferences.
It was a tough task, but I was able to separate the quantitative variables from the qualitative ones. Now we have a table titled Table1.
I checked for blanks and dealt with instances of incomplete or invalid data, like one instance where a string was recorded as the rating for a product when the column was clearly meant for a range of numbers.
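The rating check described above can be sketched in Python like this. It is an illustration only, since the actual check was done in Excel. The function `parse_rating`, the sample values, and the assumed 1 to 5 rating scale are my own assumptions, not facts from the dataset.

```python
def parse_rating(raw, low=1.0, high=5.0):
    """Return the rating as a float, or None when it is not a valid number.

    The 1.0 to 5.0 bounds are an assumed rating scale.
    """
    try:
        value = float(str(raw).strip())
    except ValueError:
        return None  # text such as "|" or a blank cell
    return value if low <= value <= high else None

# Invented sample column for illustration.
ratings_raw = ["4.2", "3.9", "|", "", "4.5"]

parsed = [parse_rating(r) for r in ratings_raw]
bad_positions = [i for i, v in enumerate(parsed) if v is None]

print(parsed)
print(bad_positions)
```

Flagging the bad positions first, instead of deleting them straight away, leaves room to decide whether each one should be corrected, filled in, or dropped.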
No rush. We have come to one of the critical aspects of data analytics: data cleaning, which can take over half of the time allocated for a project.
We will pause here for now and continue later today.
Thank YOU, and God bless you. 🙌
Please provide feedback, questions, and your thoughts in the comment box.
11/06/2023
ROLES OF DATA ANALYTICS IN THE SALES INDUSTRY
Hello friends,
It is the third day of the challenge with the Scribe of Harvoxx, Sir Stewart Ezekiel Jnr, and my friends who also accepted the challenge. 😉
For today, I will be showing us key roles I could play as a Data Analyst in the sales industry.
Do well to check my earlier posts on my profile.
On Day 1, we were able to establish that the role of a data analyst is to interpret data and turn it into information which has the potential to improve a business.
These analyses are done using certain tools; MS Excel, SQL, and Power BI are some of them.
Because it is Day 3, let us see three (3) roles Data Analytics plays in the sales industry, especially at the electronics shop that generated the Amazon dataset we are working on.
1) Sales Prediction:
By collecting sales data, businesses can understand how much they sell, how and when their sales happen, and other complementary information.
With this sales data, we will be able to predict sales as well as customers' buying behavior. Knowing which products sold, when they sold, and in what quantity is important for a realistic sales forecast.
2) Acquisition and Retention of Customers:
At the core of every successful business is deep concern for its customers.
Data-driven insights help business owners understand their customers' needs. This enables the business to evolve its products to match those needs in this fast-paced world.
With the dataset provided, I can categorize the customers by purchasing frequency or other metrics. After that, we will devise means to increase sales: running campaigns to onboard new customers, creating incentives to retain existing ones, or both.
Customer relations is an important department in every business.
3) Identification of Best Selling Products and The Opposites:
I will check the trends existing among the products' sales from the dataset and tell you which products we could continue to have in our catalog or list after considering many important factors.
In a situation where a product's sales or profit margin is inconsistent, sales strategies can be adjusted accordingly.
That would be all for now, thank you, and see you tomorrow again. Bye 👋
10/06/2023
DAY 2/10 Data Analytics Challenge at Harvoxx Tech.Hub with The Sage, Sir Stewart Ezekiel Jnr
Yesterday, we were able to outline the objectives of our analysis.
For today, we shall peruse our dataset to see our KEY VARIABLES. 🔑
In the Excel workbook where the data is stored, we could see from yesterday that the records are laid out in rows and columns (of course, that is what Excel is known for), where each column records a distinct variable and each row a distinct observation.
For the records, our dataset consists of 1465 observations, also known as rows, and 16 variables, from columns A-P.
These variables include:
🔑PRODUCT ID. This is an alpha-numeric combination the store uses to identify its stocks.
🔑PRODUCT NAME. As the name implies, this is the name of the product being observed.
🗝️CATEGORY. This is effective for sorting the store's products into categories.
🔑DISCOUNTED PRICE. We are usually happy whenever we're offered a reduction in price, are we not?
🔑ACTUAL PRICE. This should be the original amount the Amazon store set for selling any particular product.
🔑DISCOUNT PERCENTAGE. This shows us how much discount we are getting on each product.
🗝️RATING. This shows the store collected customers' satisfaction levels for its products.
🗝️RATING COUNT. This tells us how many customers gave the particular product a rating.
🗝️ABOUT PRODUCT. Here, the store gave a description of their product.
🔑USER ID. This should be distinct for each customer.
🗝️USER NAME. This states a name corresponding to the User ID.
🗝️REVIEW ID. If a review is taken for the product from a user, its identifier is recorded in this field.
🗝️REVIEW TITLE. This field summarizes the review each product has.
🗝️IMG_LINK. A URL to the product image.
🗝️PRODUCT LINK. This is the link to the particular product in the store.
Product ID, product name, discounted price, actual price, discount percentage, and user ID are especially important to our analysis and must be carefully handled, as they will affect our results the most.
CAVEAT: Some data collected may not be necessary for analysis, but that does not mean they are useless, no. They help put the other data points in perspective.
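The three price variables listed above are tied together by simple arithmetic, which also gives us a handy consistency check during cleaning. Here is a minimal sketch in Python. The function name and the figures are made up, and I am assuming the discount percentage is measured against the actual price.

```python
def discount_percentage(actual_price, discounted_price):
    """Discount as a percentage of the actual (original) price."""
    if actual_price <= 0:
        raise ValueError("actual_price must be positive")
    return (actual_price - discounted_price) / actual_price * 100

# Made-up figures for illustration: a 1000 item selling at 750.
pct = discount_percentage(1000, 750)
print(pct)
```

If the value computed this way differs a lot from the discount percentage recorded in a row, that row deserves a second look.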
We have moved another step forward. Trust me to bring you more tomorrow. Thank YOU, MVPs
Provide feedback, questions, and your thoughts in the comments section.
Gracias!!!
DAY *1/10* Data Analytics Challenge at Harvoxx Tech.Hub with The Sage, Sir Stewart Ezekiel Jnr.
The dataset I will be working on for the challenge is the *amazon* sales record.
Now, I will start to explore this dataset as much as possible to get insights, trends, and patterns, as well as offer actionable advice at the end to the stakeholders at this shop. The dataset contains over 1400 entries, and it is nothing but excitement that I feel right now! 💪
I will be using Excel and Power BI for this project.
First, let us understand the "PROJECT'S OBJECTIVES".
The objectives of this project are the aims, goals, or purposes for which the analysis of the *amazon* sales data was started in the first place.
Below are a few of them:
1) Which users rank as the top 10 customers by purchase frequency, volume, and sales generated?
2) Considering the products with the largest and smallest discount sums, how profitable are they afterward, and what relationships exist within their sales trends?
3) How does discount percentage relate to product category?
4) What is the correlation between discounted price and rating, and between discounted price and actual price?
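To give a feel for how the first objective could be answered, here is a rough sketch in Python. The project itself uses Excel and Power BI, so this is only an illustration. It assumes one record per purchase in the form (user ID, amount), which may not match how the real dataset is laid out. The function `top_customers` and the records are hypothetical.

```python
from collections import Counter

# Hypothetical purchase records (user_id, amount), invented for illustration.
purchases = [
    ("U1", 500.0), ("U2", 1200.0), ("U1", 300.0),
    ("U3", 50.0), ("U1", 100.0), ("U2", 800.0),
]

def top_customers(purchases, n=10):
    """Rank customers by number of purchases and by total sales generated."""
    frequency = Counter(user for user, _ in purchases)
    sales = Counter()
    for user, amount in purchases:
        sales[user] += amount
    return frequency.most_common(n), sales.most_common(n)

by_frequency, by_sales = top_customers(purchases)
print(by_frequency)
print(by_sales)
```

Note that the two rankings can disagree: the most frequent buyer is not always the one who generates the most sales.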
Let's walk this path together, guys!!
In the end, we will be able to reach a decision for the business owners who collected that data for their store, based on what our data is going to tell us...
https://m.facebook.com/story.php?story_fbid=2455619787949945&id=100005058422926&mibextid=Nif5oz
1) Get a whole lot of knowledge before kick-starting.
2) Start anyways and learn on the go.
What's your pick and why?
28/07/2020
Do you know what the little extra effort could amount to?
Please observe this
👇👇👇