If You Listen Closely, Data Speaks Volumes 📊📢
Let’s also get investment smart with 'XIRR' and uncover 'Hindsight Bias'.
Hello, data-driven minds, welcome to the 13th edition of DataPulse Weekly!
Each newsletter promises a journey through the fascinating intersections of data, stories, and human experiences. Whether you're an analyst or simply curious about how data shapes our world, you're in the right place.
🎉 Update: We are now 300-strong! Thank you for your belief and support 🙏
If we've made even one person fall in love with data or helped someone feel more confident with it, we'll consider it a huge win.
We're excited to create even more valuable content in the future. Feel free to DM us if you have suggestions or just want to say hi.
Thank you again for being with us on this journey 🚀
Now, let’s dive straight into today’s Data Menu -
Today’s Data Menu
📊 Case Study: Intent Score Framework
💹 Metric: XIRR
🧠 Human Bias: Hindsight Bias
📊 Data Case Study: Intent Score Framework
If you listen closely, data speaks volumes—sometimes even revealing your customers' true intentions.
In today’s case study, our objective is to build a systematic framework to create a user intent score, allowing us to effectively target those with a high likelihood of making a purchase.
Consider an e-commerce company aiming to develop an intent score for each user.
There are generally four types of data available for an e-commerce user:
Demographic Data: Basic information about customers.
Examples: Age, gender, income, location, device.
Transactional Data: Details of a customer’s purchases.
Examples: Purchase history, order frequency, transaction amounts, and order dates.
Behavioral Data: Tracks customer interactions with your business.
Examples: Website visits, click-through rates, email opens.
Psychographic Data: Insights into customers’ attitudes and lifestyles.
Examples: Interests, values, lifestyle choices.
Step-by-Step Process to Build an Intent Score:
Create features: Aggregate all potential influencing features for each user over the last N days at daily, weekly, or monthly intervals.
Here, we have aggregated user activities for the last 30 days on a daily basis and created a flag to indicate whether the user made a purchase or not.
Our sample data might look like this:
We are showing just a few features for simplicity. In reality, there are many features you can create for each observation.
Notice how User ID 2 is repeated on multiple dates. The idea is to aggregate activities over the last 30 days for each user on each date, even if they repeat.
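A minimal pandas sketch of this rolling aggregation. The event log and its column names (`visits`, `clicks`, `purchased`) are illustrative assumptions, not data from the case study:

```python
import pandas as pd

# Hypothetical raw event log: one row per user per active day.
events = pd.DataFrame({
    "user_id": [1, 2, 2, 2],
    "date": pd.to_datetime(["2024-05-01", "2024-05-01",
                            "2024-05-10", "2024-05-20"]),
    "visits": [3, 1, 4, 2],
    "clicks": [5, 2, 6, 1],
    "purchased": [0, 0, 1, 0],  # flag: did the user buy on this date?
})

# For each user and each observation date, sum activity over the
# trailing 30 days. User 2 appears once per observation date.
events = events.sort_values(["user_id", "date"]).set_index("date")
agg = (
    events.groupby("user_id")[["visits", "clicks"]]
    .rolling("30D").sum()
    .reset_index()
)
agg.columns = ["user_id", "date", "visits_30d", "clicks_30d"]

# Attach the per-date purchase flag back to each observation.
agg = agg.merge(events.reset_index()[["user_id", "date", "purchased"]],
                on=["user_id", "date"])
print(agg)
```

For user 2 on 2024-05-20, the 30-day window picks up all three of that user's rows, so the aggregated visits are 1 + 4 + 2 = 7.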
Normalize Features: Scale all features to a 0-1 range. For example, if a user has visited the website 20 times in the last 30 days and the maximum number of visits by any user is 50, the normalized value would be 20 / 50 = 0.4.
Our normalized sample data will look like this:
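The divide-by-maximum scaling from the example can be done in one line. The feature table below is a made-up illustration:

```python
import pandas as pd

# Hypothetical feature table; values and column names are illustrative.
features = pd.DataFrame({
    "visits_30d": [20, 50, 10],
    "orders_30d": [2, 8, 1],
})

# Scale each feature column by its maximum so every value lands in [0, 1].
# e.g. 20 visits against a max of 50 normalizes to 20 / 50 = 0.4.
normalized = features / features.max()
print(normalized)
```

(An alternative is min-max scaling, `(x - min) / (max - min)`; dividing by the maximum matches the worked example above.)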
Identify Key Features: Use correlation analysis or machine learning models to identify the features that most strongly influence purchases.
A correlation matrix can help identify the key features.
Since we have only a few features, all of which have a correlation ≥ 0.5 with purchase, we will retain them all in the next step.
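A sketch of this filtering step, correlating each feature with the purchase flag and keeping those above the 0.5 threshold. The sample values are assumptions for illustration:

```python
import pandas as pd

# Hypothetical normalized features with the purchase flag.
df = pd.DataFrame({
    "recency":   [0.9, 0.2, 0.8, 0.1, 0.7],
    "frequency": [0.7, 0.1, 0.9, 0.2, 0.6],
    "purchased": [1,   0,   1,   0,   1],
})

# Correlate every feature with the purchase flag, then retain the
# features whose absolute correlation clears the 0.5 threshold.
corr = df.corr()["purchased"].drop("purchased")
selected = corr[corr.abs() >= 0.5].index.tolist()
print(corr)
print(selected)
```

In this toy sample both features correlate strongly with purchase, so both survive, mirroring the situation described above.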
Assign Initial Weights: Assign initial weights (from 0 to 1) to the features retained in the previous step.
Simply put, weights represent the strength of the relationship between a feature and the user’s intention to make a purchase.
Higher feature weight → Higher predictive power of the feature!
Our initial weights might look like this:
Simulate Weights: Iterate through combinations, adjusting weights incrementally to explore all possible options.
In the second row, we decreased the weight for the Recency feature and increased the weight for the Frequency feature. Ensure that the sum of the weights always equals 1.
It is important to try all possible combinations so that no promising weight combination is left unexplored. However, when the search space grows large, you will need some optimization to prune redundant weight combinations.
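One simple way to enumerate every weight combination on a fixed grid (step size 0.1 here, an assumption) while guaranteeing the weights sum to 1:

```python
from itertools import product

def weight_grid(n_features, step=0.1):
    """Enumerate all weight vectors on a fixed grid that sum to 1.

    Works in integer grid units internally to avoid float drift,
    keeping only combinations whose units sum to the full grid.
    """
    units = round(1 / step)
    combos = []
    for combo in product(range(units + 1), repeat=n_features):
        if sum(combo) == units:
            combos.append(tuple(c * step for c in combo))
    return combos

combos = weight_grid(3, step=0.1)
print(len(combos), "candidate weight vectors, e.g.", combos[0])
```

For 3 features at a 0.1 step this yields 66 candidates; the count grows combinatorially with more features, which is exactly why pruning becomes necessary.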
Calculate Intent Score: Calculate the intent score by multiplying the feature weights by their respective feature values and summing them up.
For each weight iteration:
Intent Score = SUM(Normalized value of feature X * Weight of feature X), where X represents each feature. Our data might look something like this:
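The score formula is a weighted sum, i.e. a dot product of each user's normalized feature vector with the candidate weights. Feature names and values below are illustrative assumptions:

```python
import pandas as pd

# Hypothetical normalized features and one candidate weight vector.
df = pd.DataFrame({
    "recency":   [0.9, 0.2, 0.5],
    "frequency": [0.7, 0.1, 0.4],
    "visits":    [0.4, 0.3, 0.8],
})
weights = {"recency": 0.5, "frequency": 0.3, "visits": 0.2}

# Intent Score = SUM(normalized feature value * feature weight).
df["intent_score"] = sum(df[f] * w for f, w in weights.items())
print(df)
```

For the first user: 0.9 * 0.5 + 0.7 * 0.3 + 0.4 * 0.2 = 0.74. Because the features are in [0, 1] and the weights sum to 1, the score is also bounded in [0, 1].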
Conversion Rate Against Intent Score Bucket: For each iteration:
Create Intent Score Buckets by grouping the intent score in 0.1 bucket size.
Sum the total users and purchased users in each bucket.
Calculate the conversion rate for each bucket by dividing the number of purchased users by the total number of users in that bucket.
Visualize the Conversion Rate vs Intent Score Bucket for all the iterations:
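The bucketing and conversion-rate steps can be sketched with `pd.cut` and a groupby. The scored sample below is a made-up illustration:

```python
import pandas as pd

# Hypothetical scored users with the purchase flag.
scored = pd.DataFrame({
    "intent_score": [0.05, 0.15, 0.18, 0.45, 0.72, 0.78, 0.95],
    "purchased":    [0,    0,    1,    0,    1,    1,    1],
})

# Group scores into 0.1-wide buckets (0-0.1, 0.1-0.2, ..., 0.9-1.0).
bins = [round(b * 0.1, 1) for b in range(11)]
scored["bucket"] = pd.cut(scored["intent_score"], bins=bins,
                          include_lowest=True)

# Conversion rate per bucket = purchasers / total users in the bucket.
rates = scored.groupby("bucket", observed=True).agg(
    users=("purchased", "size"),
    purchasers=("purchased", "sum"),
)
rates["conversion_rate"] = rates["purchasers"] / rates["users"]
print(rates)
```

For a good weight combination, `conversion_rate` should climb steadily as the bucket midpoint increases, which is what the slope comparison in the next step measures.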
Identify Optimal Weights: Find the iteration with the highest slope, indicating the best differentiation between purchased and not-purchased users.
In our case, Iteration X has the best slope.
Iteration X offers a more balanced approach to the intent score by combining multiple features at once. Therefore, we should use the weights from Iteration X to build our user intent score.
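One way to make "highest slope" concrete is to fit a least-squares line through (bucket midpoint, conversion rate) for each iteration and pick the steepest. The two iterations below are fabricated for illustration:

```python
import numpy as np

# Hypothetical conversion rates per 0.1 bucket (x = bucket midpoints)
# for two candidate weight iterations.
midpoints = np.array([0.05, 0.15, 0.25, 0.35, 0.45])
iterations = {
    "iter_1": np.array([0.10, 0.12, 0.15, 0.18, 0.20]),  # flat: weak separation
    "iter_2": np.array([0.02, 0.08, 0.20, 0.35, 0.55]),  # steep: strong separation
}

# Slope of the least-squares fit: conversion rate vs bucket midpoint.
slopes = {
    name: np.polyfit(midpoints, rates, 1)[0]
    for name, rates in iterations.items()
}
best = max(slopes, key=slopes.get)
print(slopes)
print("best iteration:", best)
```

The steeper line means low-score buckets convert rarely and high-score buckets convert often, which is exactly the differentiation the framework is after.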
Conclusion
Building an intent score using customer data involves identifying key features, simulating weights through iterations, and finding the iteration and corresponding weights having the highest slope.
The above step-by-step framework provides a good foundation to build a user intent score.
While many sophisticated machine learning models can yield better results by analyzing features more effectively, the beauty of the above framework lies in its simplicity and broad applicability for developing scores based on user attributes.
In the next section, we will explore XIRR, a financial metric to make you smart about your investment decisions.
💹 Metric of the Week: XIRR
XIRR (Extended Internal Rate of Return) is a financial metric used to calculate the annualized return of a series of cash flows occurring at irregular intervals.
Unlike the traditional IRR, which assumes equal time periods between cash flows, XIRR accounts for the actual dates of each cash flow.
This makes it especially useful for investments like mutual funds, where you might be making systematic investment plan (SIP) contributions at different times.
For instance, if you invest $500 monthly in a mutual fund over three years and the fund's value fluctuates, XIRR helps calculate the true annualized return by considering each contribution date and the final value.
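XIRR is the rate r that makes the net present value of all dated cash flows zero: SUM(cf_i / (1 + r)^(days_i / 365)) = 0. A small self-contained sketch, solved by bisection (the cash-flow dates and amounts are illustrative assumptions, not a real fund):

```python
from datetime import date

def xirr(cashflows, lo=-0.99, hi=10.0, tol=1e-7):
    """Annualized return for irregularly timed cash flows.

    cashflows: list of (date, amount); outflows (investments) negative,
    inflows (redemptions) positive. Solves
    sum(cf / (1 + r)**(days / 365)) = 0 by bisection.
    """
    t0 = min(d for d, _ in cashflows)

    def npv(rate):
        return sum(cf / (1 + rate) ** ((d - t0).days / 365)
                   for d, cf in cashflows)

    # npv decreases as the rate rises, so bisect on its sign.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Two $500 SIP contributions six months apart, then a final redemption.
flows = [
    (date(2023, 1, 1), -500),
    (date(2023, 7, 1), -500),
    (date(2024, 1, 1), 1100),
]
print(f"XIRR: {xirr(flows):.2%}")
```

Spreadsheets expose the same calculation as the `XIRR` function; the point of the sketch is that the second $500 was invested for only half the time, so a plain IRR over equal periods would misstate the annualized return.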
XIRR provides a clear, accurate measure of how well your investment is performing. By using XIRR, you can better compare the performance of different funds, make informed decisions, and optimize your investment strategy to maximize returns.
Since we are discussing investing, let's explore a well-known cognitive pitfall related to stock markets in the next section.
🧠 Human Bias: Hindsight Bias
Did you know there are more than 180 documented ways your brain can trick you? These tricks, called cognitive biases, can distort how we process information, think critically, and perceive reality—even changing how we see the world. In this section, we'll look at one of these biases and show how it pops up in everyday life.
A 1.5-month-long Indian general election concluded on June 4th with dramatic twists and turns. The Indian stock market reacted notably:
On June 3rd, exit polls predicted a decisive win for the BJP (the incumbent majority party in parliament), leading to a rise of more than 3% in the Sensex, an index of India's 30 largest stocks.
On June 4th, the actual results showed the BJP did not win a majority, causing the Sensex to plummet by more than 5%.
On June 7th, the NDA coalition government led by the BJP was formed, and the market recovered by 6% from its low on June 4th.
If you are an investment enthusiast, you might have heard these statements on the evening of each of these days:
June 3rd rise: “BJP is undoubtedly going to win with a large majority, so the market is going to surge before results.”
June 4th decline: “BJP was never going to win a majority. The market will crash further.”
June 7th recovery: “The NDA coalition led by BJP was definitely going to form the government. We should have invested more on June 4th during the market fall.”
These statements reflect a pattern of thinking, “I knew what the market would do,” or in other words, “I already knew it” or “It had to happen.”
This phenomenon is called hindsight bias—the tendency to see events as predictable after they have happened. Investors and analysts might convince themselves they had foreseen these market movements all along, even though the reality was far more uncertain.
Examples of Hindsight Bias:
Regretting not investing in a real estate property that spiked 200% over the last 1.5 years after a new government infrastructure proposal, convinced you "saw it coming."
Choosing a computer science major when the IT sector was booming, then, once the job market turns, insisting the downturn was obvious all along.
Hindsight bias can distort our perception of events, making us believe we could have predicted outcomes that were actually unpredictable. This can lead to overconfidence in future predictions and poor decision-making.
That wraps up our newsletter for today! We've broken down complex data concepts and will continue to do so in future editions. If you found this valuable, please consider subscribing and sharing it with just one person who might benefit—it motivates us to create more content like this.