
For at least two decades, retail sales have been expanding into the online sphere. This lengthy period has allowed for the development of methods and ways to attract customers and maintain their loyalty to online stores.
E-commerce stores differ from typical retail stores in that customers are not anonymous. Retail stores try to identify customers through „loyalty cards,” enabling them to analyze customer behavior, preferences, or visit frequency. However, online stores have far greater capabilities to analyze customer behavior. A person logged into a sales platform is also monitored by systems supervising sales on that platform. Well-configured e-commerce systems track which areas of the site the customer visited, how long they viewed products, which products they displayed, and which products they compared. This way, an extensive historical knowledge base about each customer is built, allowing for a competitive advantage.
Recommendation Systems
Recommendation systems aim to increase the purchases made by an individual customer or maintain that customer’s loyalty over the long term. Typically, these two goals are pursued simultaneously. In the long run, it is considered reasonable to focus the recommendation system more on strengthening customer loyalty.
There are two main directions for creating recommendation systems. First, the similarity of products is analyzed, and based on this similarity, the next product is suggested to the customer. Secondly, customers are observed, common features are identified, and segments, groups, and clusters are created. As a result, specific products are suggested during purchase, and incentives are sent to customers in the form of SMS, emails, and notifications.
Both directions of analysis can be used simultaneously in recommendation systems. The recommendation methodology based on product similarities is useful when a customer takes the first step by selecting a specific product. Our system will then suggest images of similar products. If the customer does not take this step, a recommendation system based solely on product similarity will not work. To encourage the customer to take the first step, it is worth attracting them to the store, to specific product groups or services that can be provided through the website.
This article will cover how to create a simple recommendation system based on product similarity. The system draws knowledge about products from their descriptions and photos, based on which it will generate offers for customers. I will describe how to build such a recommendation system in Python.
Cosine Rule
The cosine rule in recommendation systems is mainly used in collaborative filtering, where product properties are compared. This method calculates the similarity between items based on user ratings or descriptions provided on the e-commerce platform. If we have two items, each represented by a vector of ratings from different users, the cosine of the angle between these vectors in multidimensional space is a measure of their similarity. The closer the cosine is to 1, the more similar the items are.
To illustrate this concept, let’s use an example. We have three cakes:
- Cake A has a diameter of 8 cm and a height of 8 cm.
Cake B has a diameter of 32 cm and a height of 10 cm.
Cake C has a diameter of 26 cm and a height of 15 cm.
Building a Simple Recommendation System for an E-Commerce Store
Place the cakes’ heights and diameters in a coordinate system. On the above chart, I have marked where each cake is located. The x-axis represents the height of the cake, and the y-axis represents its diameter. I drew lines from the points representing the cakes to the origin. The sharper the angle between these lines, the more similar the products are. The chart shows that Cake A differs significantly from Cake C, while the difference between Cake B and A is theoretically small, as the angle between the lines is relatively sharp.
Finding Similarities Through Product Descriptions
To create an effective recommendation system, customer behavior needs to be analyzed, and similarities among the sold products should be identified. If a customer is new, it is impossible to assign them to a specific segment. Such clusters are built based on customer behavior. Characteristics like gender, age, or residence location play a secondary role in cluster building. If a customer is new, a recommendation system based on cluster membership cannot be created. The system must instead rely on product similarity. This similarity can be determined based on photos or descriptions. Image-based product similarity can be determined by artificial intelligence through image-recognizing neural networks. A second method that can be used in parallel is identifying similarity based on product descriptions. These descriptions can be created by customers or store staff. Customers often leave comments on products when they are emotionally stirred, ranging from euphoria to anger. We could seek product similarity based on customer emotional engagement, though such actions would likely have limited effectiveness. For example, customers who viewed an emotionally charged cake might receive recommendations for other highly exciting cakes. The only reasonable solution for an effective recommendation system seems to be seeking product similarity based on descriptions made by store staff.
Example of a Recommendation System for a Confectionery Store
Imagine an e-commerce site selling a large selection of confectionery products. One category sold on this site is cakes. The store offers 3,500 different types of cakes. Usually, when a customer views a cake, the site suggests five other similar cakes below. The effectiveness of these suggestions directly impacts sales efficiency. The customer has access to many similar online cake stores. If the suggested cakes are irrelevant, the customer will simply leave the site and buy a cake elsewhere.
To find product similarities, we will use the cosine angle method described above. We will use a database of cakes that was directly used for product searches on the website. Below is a list of database fields. The fields are in English because the tool for creating recommendation systems does not support Polish.
Fields in the Cake Database
Gender (female, male): indicates gender; the customer might consider this when searching for a cake.
Type (birthday, communion, professional, anniversary): the type of cake is crucial to customers. Communion, wedding, birthday, or professional event cakes each have different forms.
Size (tiny, small, medium, large, huge): the size of the cake.
Color (black, white, yellow, green, brown, blue, lemon): the color of the cake.
Writings (string, no string): whether there is a message on the cake; for some customers, the message on the cake is significant.
Height (high, low): whether the cake is tall or short.
Figurine (flowers, animals, cars, none, people): cakes may feature figures like animals, flowers, cars, or a couple.
Medical (high-calorie, low-calorie, lactose intolerance, gluten intolerance, histamine intolerance): some customers need the cake to meet medical guidelines, such as low calories or accommodations for lactose or histamine intolerance.
Sport (none, ball, basketball, cycling, swimming): children’s cakes often have sports themes, which I included in my cake database.
This database was created in Python. Each person can create such a training database themselves. Below is the code I used to create this database.
The Index field represents the cake’s identification number. Cake names are not included in the database because this information is irrelevant to the recommendation system.
Now, I need to create a description field. Typically, stores have a description field containing product features, dimensions, and intended use. Creating such a field for 3,500 cakes would be very time-consuming. Alternatively, such a description field may not exist or may not be suitable for the recommendation system being created. To create an effective description field for the recommendation system, all cake features should be combined into a single string of words.
Next, tokenization must occur for the system to find similarities. Tokens are words that connect with other tokens. The system searches for words of the product indicated by the customer in the description field and looks for the most similar products in the description field in the 3,500-cake database. The system does this using the code below.
I love it when people come together and share ideas. Great website,
keep it up!