Algorithms can handle vast datasets, including thousands of attributes, without succumbing to collinearity issues. In this case, I'm using the standard scaler. So, this is the correlation between income, total expenses, age, total accepted campaigns, and recency. Let's see if there's any duplicated data by looking at the customer ID column. As we can see, customers who don't have any teens at home contribute the highest expenses. Firstborn Personality Scale: This test was designed to produce the maximum possible difference between the scores of first-born (oldest) and later-born children. The fusion of a personality-based approach has primarily increased the attractiveness of recommender systems. We can observe many machine learning applications in day-to-day life, but one of the greatest applications of machine learning is to classify individuals based on their personality traits. Vocabulary IQ Test: A vocabulary test giving an IQ-score-like result. Each person on this planet is unique and carries a unique personality. The database is made by crowd-sourcing ratings of the characters, and the goal is to match people to characters they will agree are similar to them using techniques from recommendation engines. Based on the kid home graph, most customers don't have any kids.
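The standard scaler mentioned above rescales each feature to zero mean and unit variance, using the population standard deviation. The following is only a minimal sketch of that arithmetic with the Python standard library; the `standardize` helper and the income figures are hypothetical and are not part of the original walkthrough.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread, so every z-score is zero.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical income figures, for illustration only.
incomes = [20000.0, 40000.0, 60000.0, 80000.0]
z_scores = standardize(incomes)
print(z_scores)
```

After scaling, features measured in very different units (income in dollars, age in years) contribute on a comparable scale to distance-based methods such as K-means.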
The evidence indicates that people can mean several different things when they describe themselves as an introvert or extravert, so the trait of introversion-extraversion should actually be broken down into a couple of different, though related, traits. A special focus is given to the strengths, weaknesses and validity of the various systems. There are 2 features that need to be encoded. Finally, down to clustering. Data Science and Artificial Intelligence are revolutionizing the world through technical transformations. All 2240 IDs are unique, so there is no duplicated data.
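The duplicate check on the customer ID column can be sketched as follows. This is a small illustration under stated assumptions: the four-row `customers` table is hypothetical and merely stands in for the real 2240-row dataset.

```python
import pandas as pd

# Hypothetical miniature of the customer table (the real dataset has 2240 rows).
customers = pd.DataFrame({
    "ID": [5524, 2174, 4141, 6182],
    "Income": [58138.0, 46344.0, 71613.0, 26646.0],
})

# Count distinct IDs and the rows whose ID already appeared earlier.
n_unique_ids = customers["ID"].nunique()
n_duplicated = int(customers.duplicated(subset="ID").sum())
print(f"{n_unique_ids} unique IDs, {n_duplicated} duplicated rows")
```

If the number of unique IDs equals the number of rows, as it does for the dataset described above, no row needs to be dropped.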

I got this dataset from Kaggle, so I think anyone is allowed to use it. Now let's start by importing the necessary Python libraries and the dataset. Before moving further, let's have a look at the data by using the automatic EDA technique. Now I will create some new features in the dataset to define the customer personalities as a part of data preparation. Now I will remove the outliers and the missing values in the dataset. To take a look at the clustering of clients in the dataset, I'll define the segments of the clients. Based on the marital status graph, most customers are already married. Customer personality analysis helps a business to better understand its customers and makes it easier for them to modify products according to the specific needs, behaviours and concerns of different types of customers. The ancient practice of astrology connects the way a person is to their date of birth. First, let's handle the missing values. This will increase the success of your marketing campaigns. Now, time for encoding. Predict job performance and cultural fit by assessing personality traits and values.
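Handling the missing values can be sketched as a mean imputation, which is what the walkthrough's code applies to the Income column. The toy table below is hypothetical and only illustrates the mechanics.

```python
import pandas as pd

# Hypothetical income column with one missing entry.
toy = pd.DataFrame({"Income": [30000.0, None, 50000.0, 70000.0]})

missing_before = int(toy["Income"].isna().sum())

# Replace missing incomes with the mean of the observed incomes.
toy["Income"] = toy["Income"].fillna(toy["Income"].mean())

missing_after = int(toy["Income"].isna().sum())
print(missing_before, missing_after)
```

Mean imputation keeps every customer in the dataset at the cost of slightly shrinking the variance of the imputed column; dropping the affected rows is the usual alternative when only a few values are missing.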
Protestant Work Ethic Scale: There is a sociological theory that Northern European countries developed faster in the industrial revolution than southern ones because of the attitudes towards work promoted by Protestantism (versus Catholicism). For example, instead of spending money to market a new product to every customer in the company's database, a company can analyze which customer segment is most likely to buy the product and then market the product only to that particular segment. Customer personality analysis helps a business to modify its product based on its target customers from different types of customer segments. I also want to sum up the total expenses and total accepted campaigns for each customer. It seems there are some null values. Customers with an average income of around $69,500. The highest correlation is between income and total expenses, followed by total expenses and total accepted campaigns. Time for scaling. Based on the total expenses graph, wine has the highest sales amount. These tests range from very serious and widely used scientific instruments to popular psychology to self-produced quizzes. Of course, the first step is to import the libraries.
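Summing the expenses and accepted campaigns per customer, and then checking how the totals correlate with income, can be sketched like this. The column names follow the dataset, but the rows are hypothetical and only two columns of each group are included to keep the example short.

```python
import pandas as pd

# Hypothetical rows; the real dataset has more spending and campaign columns.
toy = pd.DataFrame({
    "MntWines": [100, 200, 300, 400],
    "MntFruits": [10, 20, 30, 40],
    "AcceptedCmp1": [0, 0, 1, 1],
    "Response": [0, 1, 0, 1],
    "Income": [20000, 40000, 60000, 80000],
})

spend_cols = ["MntWines", "MntFruits"]
campaign_cols = ["AcceptedCmp1", "Response"]

# Row-wise totals per customer.
toy["Total_Expenses"] = toy[spend_cols].sum(axis=1)
toy["Total_Acc_Cmp"] = toy[campaign_cols].sum(axis=1)

# Pearson correlation matrix of the features of interest.
corr = toy[["Income", "Total_Expenses", "Total_Acc_Cmp"]].corr()
print(corr)
```

The toy values are constructed so that spending rises linearly with income, so the correlation here is far stronger than anything the real data would show.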
Assess personality and values with one single questionnaire; a full range of reports to suit the customer's needs; competency reports that show likely fit with a number of the competencies required for various roles (managers, sales, teachers and customer service representatives); extensive cross-cultural adaptations across 12 countries; related services available on request (job profiling, specific norms, bespoke reports, etc.). Customers with an average total spend of approximately $1,252. Statistical "Which Character" Personality Quiz: This tool will compare your answers to a database of 2,000 fictional characters. The purpose of customer segmentation is to divide customers into groups in many different ways. Based on the education graph, most customers come from a graduate education background. Now let's start with the task of customer personality analysis with Python. OSPP Enneagram of Personality Scales: Customer Personality Analysis is a detailed analysis of a company's ideal customers. Here I will be using the Apriori algorithm for the task of customer personality analysis with Python. Feel free to ask your valuable questions in the comments section below.
In this case I'll try using K-means and agglomerative clustering. Let's see how many clusters work best using the elbow method. OK, 2 is the best number of clusters based on the elbow method. Now, let's look at the people's characteristics in each cluster. Let's see what the segmentation looks like. Let's see the optimum number of clusters based on the silhouette score. Customers can be grouped by their demographics, behavior, lifestyle, psychographics, value, etc. Personality traits are closely associated with an individual's behavior and preferences.
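The elbow method and the silhouette score described above can be sketched together. This is an illustration under stated assumptions rather than the article's exact procedure: the two synthetic blobs stand in for the scaled customer features, and the candidate range of 2 to 6 clusters is chosen arbitrarily.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(42)

# Two well-separated synthetic blobs stand in for the scaled customer features.
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

inertias = {}      # within-cluster sum of squares, used for the elbow plot
silhouettes = {}   # mean silhouette score, higher is better

# The silhouette score needs at least 2 clusters, so the search starts at k=2.
for k in range(2, 7):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    labels = model.fit_predict(X)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X, labels)

best_k = max(silhouettes, key=silhouettes.get)
print("best k by silhouette:", best_k)
```

The inertia always falls as clusters are added, so the elbow is read off by eye where the curve flattens, whereas the silhouette score gives a single number per candidate that can be compared directly.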
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    from datetime import datetime

    # Load the tab-separated marketing campaign dataset.
    data = pd.read_csv('https://raw.githubusercontent.com/andhikaw789/Customer-Personality-Analysis/main/marketing_campaign.csv', sep='\t')

    # Derived features; 2021 is the reference year.
    data['Age'] = 2021 - data['Year_Birth']
    data['Dt_Customer'] = pd.to_datetime(data['Dt_Customer'], format='%d-%m-%Y')
    date_now = datetime.strptime('Jan 1 2021', '%b %d %Y')
    # Use date_now instead of the current date so the result does not depend on when the code runs.
    data['Years_customer'] = date_now.year - data['Dt_Customer'].dt.year
    data['Total_Expenses'] = data['MntWines'] + data['MntFruits'] + data['MntMeatProducts'] + data['MntFishProducts'] + data['MntSweetProducts'] + data['MntGoldProds']
    data['Total_Acc_Cmp'] = data['AcceptedCmp1'] + data['AcceptedCmp2'] + data['AcceptedCmp3'] + data['AcceptedCmp4'] + data['AcceptedCmp5'] + data['Response']

    # Age distribution.
    plt.figure(figsize=(11,14), facecolor='lightyellow')
    data['Age'].value_counts().sort_index(ascending=False).plot(kind='barh')

    # Income distribution.
    plt.figure(figsize=(10,10), facecolor='lightyellow')
    ax = sns.histplot(data=data, x='Income', binwidth=10000, kde=True)

    # Customers per education level, with the count written inside each bar.
    plt.figure(figsize=(8, 9), facecolor='lightyellow')
    ax = sns.countplot(data=data, x='Education', saturation=1, alpha=0.9, palette='rocket', order=data['Education'].value_counts().index)
    for p in ax.patches:
        ax.annotate(f'\n{p.get_height()}', (p.get_x()+0.4, p.get_height()), ha='center', va='top', color='white', size=11)

    # Customers per marital status.
    plt.figure(figsize=(9, 8), facecolor='lightyellow')
    ax = sns.countplot(data=data, x='Marital_Status', saturation=1, alpha=0.9, palette='rocket', order=data['Marital_Status'].value_counts().index)
    for p in ax.patches:
        number = '{}'.format(int(p.get_height()))  # bar height as an integer label

Exposure Based Face Memory Test: A measure of face memory and face blindness. Reliable, backed by years of research and thousands of data points. The Enneagram has been promoted as a spiritual and self-help tool by many authors, and there exist several different popular tests of Enneagram type. The two features to encode are education and marital status. In this article, I'm going to introduce you to a data science project on customer personality analysis with Python. The availability of high-dimensional data in large amounts has paved the way for increasing marketing campaigns' effectiveness by targeting specific people. And a clash between personal and job or organisational values often leads to disillusionment in the role or with the organisation, a decrease in job satisfaction, disengagement and turnover. There is also a peer report version, which is even more advanced. It categorizes people into one of four temperaments, each of which is associated with specific neurochemicals. Here I will use this algorithm to identify the biggest customers of wines. So according to the output and the overall analysis conducted in this data science project on customer personality analysis with Python, we can conclude who the biggest customers of wines are. I hope you liked this article on Customer Personality Analysis with Python.
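Encoding the two categorical features, education and marital status, can be sketched as follows. The rows and the shortened `edu_order` list are hypothetical; the walkthrough's own code uses a longer category list that also includes '2n Cycle'.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Hypothetical rows with the two categorical features.
toy = pd.DataFrame({
    "Education": ["Basic", "PhD", "Graduation", "Master"],
    "Marital_Status": ["Married", "Single", "Together", "Married"],
})

# Education has a natural order, so each level maps to its rank in edu_order.
edu_order = ["Basic", "Graduation", "Master", "PhD"]
ordinal = OrdinalEncoder(categories=[edu_order])
toy["Education_enc"] = ordinal.fit_transform(toy[["Education"]]).ravel()

# Marital status has no order; LabelEncoder assigns integers alphabetically.
label = LabelEncoder()
toy["Marital_enc"] = label.fit_transform(toy["Marital_Status"])

print(toy)
```

Integer codes for an unordered feature such as marital status imply distances that do not really exist, so one-hot encoding may be worth considering before distance-based clustering.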
    # x_value and y_value are assumed to be set for each bar in a loop over the bars of the current plot.
    plt.annotate('{}'.format(x_value), (x_value, y_value), xytext=(-50, 0), textcoords='offset points', va='center', ha='left', color='white', fontsize=14, fontweight='semibold')

    # Number of purchases per channel.
    ax = data[['NumDealsPurchases', 'NumWebPurchases', 'NumCatalogPurchases', 'NumStorePurchases']].sum().sort_values(ascending=True).plot(kind='barh')
    plt.title('Purchases', pad=15, fontsize=18, fontweight='semibold')

    # Total expenses by education, marital status, kids at home and teens at home.
    ax = data[['Education','Total_Expenses']].groupby('Education').sum().sort_values(by='Total_Expenses', ascending=False).plot(kind='bar', figsize=(10,8), legend=None, color='blue')
    plt.title('Total Expenses by Education', pad=10, fontsize=15, fontweight='semibold')
    # (truncated annotation call) xytext=(0,9), textcoords='offset points', color='black', fontsize=13)
    ax = data[['Marital_Status','Total_Expenses']].groupby('Marital_Status').sum().sort_values(by='Total_Expenses', ascending=False).plot(kind='bar', color='blue', figsize=(10,9), legend=None)
    plt.title('Total Expenses by Marital Status', pad=10, fontsize=15, fontweight='semibold')
    ax = data[['Kidhome','Total_Expenses']].groupby('Kidhome').sum().sort_values(by='Total_Expenses', ascending=False).plot(kind='bar', color='blue', figsize=(9,9), legend=None)
    plt.title('Total Expenses by Kid Home', pad=10, fontsize=15, fontweight='semibold')
    # (truncated annotation call) xytext=(0,9), textcoords='offset points', color='black', fontsize=15)
    ax = data[['Teenhome','Total_Expenses']].groupby('Teenhome').sum().sort_values(by='Total_Expenses', ascending=False).plot(kind='bar', color='blue', figsize=(9,9), legend=None)
    plt.title('Total Expenses by Teen Home', pad=10, fontsize=15, fontweight='semibold')

    # Total accepted campaigns by the same four groupings.
    ax = data[['Education','Total_Acc_Cmp']].groupby('Education').sum().sort_values(by='Total_Acc_Cmp', ascending=False).plot(kind='bar', figsize=(10,8), legend=None, color='blue')
    plt.title('Total Acc Campaign by Education', pad=10, fontsize=15, fontweight='semibold')
    ax = data[['Marital_Status','Total_Acc_Cmp']].groupby('Marital_Status').sum().sort_values(by='Total_Acc_Cmp', ascending=False).plot(kind='bar', figsize=(13,9), legend=None, color='blue')
    plt.title('Total Acc Campaign by Marital Status', pad=10, fontsize=15, fontweight='semibold')
    ax = data[['Kidhome','Total_Acc_Cmp']].groupby('Kidhome').sum().sort_values(by='Total_Acc_Cmp', ascending=False).plot(kind='bar', figsize=(9,8), legend=None, color='blue')
    plt.title('Total Acc Campaign by Kid Home', pad=10, fontsize=15, fontweight='semibold')
    ax = data[['Teenhome','Total_Acc_Cmp']].groupby('Teenhome').sum().sort_values(by='Total_Acc_Cmp', ascending=False).plot(kind='bar', figsize=(9,8), legend=None, color='blue')
    plt.title('Total Acc Campaign by Teen Home', pad=10, fontsize=15, fontweight='semibold')

    # Correlation heatmap.
    sns.heatmap(data[['Income', 'Total_Expenses','Age', 'Total_Acc_Cmp', 'Recency']].corr(), annot=True)

    # Fill missing incomes with the column mean.
    data['Income'] = data['Income'].fillna(data['Income'].mean())

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import LabelEncoder, OrdinalEncoder, StandardScaler

    # data_prep is assumed to be a working copy of data.
    data_prep = data.copy()

    # Label-encode marital status (the fit step is assumed).
    lenc = LabelEncoder().fit(data_prep['Marital_Status'])
    data_prep['Marital_Status'] = lenc.transform(data_prep['Marital_Status'])

    # Ordinal-encode education in the order given by edu (the encoder setup is assumed).
    edu = ['Basic', 'Graduation', 'Master', '2n Cycle', 'PhD']
    ore = OrdinalEncoder(categories=[edu]).fit(data_prep[['Education']])
    data_prep['Education'] = ore.transform(data_prep[['Education']]).ravel()

    # Drop identifiers and columns already folded into the derived features.
    data_prep = data_prep.drop(['ID', 'Year_Birth', 'Dt_Customer', 'AcceptedCmp1', 'AcceptedCmp2', 'AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5','Response', 'Complain', 'Z_CostContact', 'Z_Revenue'], axis=1)

    # data_proc is assumed to be the encoded frame that goes into scaling.
    data_proc = data_prep.copy()
    scale_cols = ['Income', 'Kidhome', 'Teenhome', 'Recency', 'MntWines', 'MntFruits', 'MntMeatProducts', 'MntFishProducts', 'MntSweetProducts', 'MntGoldProds', 'NumDealsPurchases', 'NumWebPurchases', 'NumCatalogPurchases', 'NumStorePurchases', 'NumWebVisitsMonth', 'Age', 'Years_customer', 'Total_Expenses', 'Total_Acc_Cmp']
    std_scaler = np.array(data_proc[scale_cols]).reshape(-1, 19)
    scaler = StandardScaler().fit(std_scaler)  # fit step assumed
    data_proc[scale_cols] = scaler.transform(std_scaler)

    # Elbow search: i is the candidate number of clusters in the surrounding loop.
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)

    # Final model with the 2 clusters suggested by the elbow method.
    kmeans = KMeans(n_clusters=2, random_state=42)

    # Mean of each feature per segment; data_segment is assumed to carry a 'Segments' column of cluster labels.
    data_segment.groupby(['Segments']).mean(numeric_only=True)
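The final step, fitting K-means with 2 clusters and averaging each feature per segment, can be sketched end to end on a toy table. The six customers below are hypothetical, and attaching the labels to the unscaled frame is an assumption about how the segment table is built.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customers: three low earners/spenders and three high ones.
toy = pd.DataFrame({
    "Income": [20000, 22000, 25000, 80000, 85000, 90000],
    "Total_Expenses": [100, 150, 120, 1500, 1600, 1700],
})

# Cluster on the scaled features, but keep the original units for reporting.
scaled = StandardScaler().fit_transform(toy)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
toy["Segments"] = kmeans.fit_predict(scaled)

# Average profile of each segment in the original units.
profile = toy.groupby("Segments").mean(numeric_only=True)
print(profile)
```

Reporting the means in the original units rather than in scaled units keeps the segment descriptions readable, for example as an average income per segment.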