
Without Data, There’s No Recommendation System
To recommend something to someone, you must first know them. To know someone, you must have detailed information about them—often more than they know about themselves. When a customer completes multiple transactions with us, we get to know their behavioral patterns, their frequency of visiting stores, how long they hesitate before purchasing, how often they act on impulse, and when a discount offer might work best. All of these data points form the foundation of a recommendation system.
To build such a system, we must first learn how to collect data about our customers. This article is dedicated to exactly that.
Why Customer Data Is Crucial for Recommendation Systems
Personalized recommendations function much like a skilled salesperson in a store who can predict, based on collected information, what a customer might like. Recommendation algorithms, including Factorization Machines (FM) and DeepFM, learn from data: the more accurate and comprehensive the behavioral and preference data you collect, the better the system can tailor offers to individual customers.
Traditional methods of gathering opinions, such as surveys, fall under explicit feedback: directly asking customers what they like. Unfortunately, such declarations are often imprecise or infrequent. This is why modern systems tend to rely on implicit feedback: signals inferred from real user behavior rather than stated by the user. Such data is richer and often reflects preferences more honestly, though it requires interpretation. In short: instead of asking, observe.
Below, we outline exactly which data points are worth collecting and how to do it discreetly, without overwhelming customers with questionnaires.
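One simple way to "interpret" implicit feedback is to give each observed event type a weight and add the weights up per customer and product. The sketch below is only an illustration: the event names and the weight values are assumptions chosen for the example, not calibrated figures, and a real store would tune them against its own data.

```python
from collections import defaultdict

# Hypothetical weights: stronger signals of interest get larger values.
# These numbers are illustrative assumptions, not calibrated values.
EVENT_WEIGHTS = {
    "view": 1.0,
    "add_to_cart": 3.0,
    "purchase": 5.0,
}


def implicit_scores(events):
    """Sum event weights per (customer_id, product_id) pair.

    `events` is an iterable of (customer_id, product_id, event_type) tuples.
    Event types without a configured weight are skipped.
    """
    scores = defaultdict(float)
    for customer_id, product_id, event_type in events:
        weight = EVENT_WEIGHTS.get(event_type)
        if weight is None:
            continue
        scores[(customer_id, product_id)] += weight
    return dict(scores)


events = [
    ("c1", "meringue_cake", "view"),
    ("c1", "meringue_cake", "view"),
    ("c1", "meringue_cake", "add_to_cart"),
    ("c1", "cheesecake", "view"),
    ("c1", "cheesecake", "newsletter_open"),  # no weight configured, skipped
]
scores = implicit_scores(events)
```

The resulting scores can serve as a rough stand-in for ratings the customer never gave explicitly.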
Imagine you run an online bakery specializing in cakes, pies, and other pastries. You want to implement a modern recommendation system powered by algorithms such as Factorization Machines (FM) or DeepFM, so you can offer products aligned with each customer’s taste. For these tools to work effectively, they require rich data—covering preferences, purchase habits, and behavioral patterns.
This section provides a detailed analysis and a practical guide on how to discreetly collect such information without relying on low-reliability surveys.
What Customer Information Should You Collect?
To build an effective bakery recommendation system, focus on gathering behavioral data from your online store:
- Purchase History (Transactions): Record what the customer bought, when they bought it, how often they purchase, and how much they spend. This history reveals patterns, for example whether someone buys cakes monthly (perhaps for events) or mainly before holidays. Such data supports cross-selling and helps predict future needs.
- Browsed Products and On-Site Activity: Track which products and categories a customer views, how long they spend on them, and which pages they visit. This can reveal interests even without a purchase. For instance, frequent viewing of meringue cakes without buying signals an opportunity for targeted recommendations.
- Search Queries: If your store has a search bar, log the exact terms entered. These are direct indicators of intent (“gluten-free,” “sugar-free,” etc.). This data informs both recommendations and inventory planning.
- Cart Additions and Abandonments: Even abandoned carts indicate interest. For example, if a customer adds a chocolate cake but doesn’t purchase, you can send a reminder or offer a discount later.
- Basic Demographics and Contact Data: Collected during checkout (name, address, phone, email). While these details don’t directly reveal taste preferences, they can help with location-based offers and communication. Always ensure GDPR compliance.
- Inferred Preferences: Derived from behavior (purchase history, browsing patterns, cart additions). For example, a customer who repeatedly orders birthday cakes with “Happy Birthday” inscriptions is likely buying for birthdays; another filtering for vegan products probably follows a vegan diet.
These are primarily first-party data—collected directly via your store—making them the most valuable for your recommendation model.
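To make the data types listed above more tangible, here is a minimal sketch of how they might be stored as one event log and reduced to a per-customer summary. The `StoreEvent` fields, the event type names, and the `summarize_customer` helper are all hypothetical, introduced only for illustration; your store platform will have its own schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class StoreEvent:
    """One row of a hypothetical first-party event log."""
    customer_id: str
    event_type: str                  # "view", "search", "add_to_cart", "purchase"
    timestamp: datetime
    product_id: Optional[str] = None
    query: Optional[str] = None      # filled only for "search" events
    amount: float = 0.0              # order value, filled only for "purchase"


def summarize_customer(events, customer_id):
    """Reduce one customer's events to a few purchase-history features."""
    mine = [e for e in events if e.customer_id == customer_id]
    purchases = [e for e in mine if e.event_type == "purchase"]
    bought = {e.product_id for e in purchases if e.product_id}
    carted = {
        e.product_id
        for e in mine
        if e.event_type == "add_to_cart" and e.product_id
    }
    return {
        "orders": len(purchases),
        "total_spend": sum(e.amount for e in purchases),
        "last_purchase": max((e.timestamp for e in purchases), default=None),
        # Products added to the cart but never bought.
        "abandoned_products": sorted(carted - bought),
        "search_terms": [e.query for e in mine if e.event_type == "search"],
    }


log = [
    StoreEvent("c1", "search", datetime(2025, 9, 1, 10, 0), query="gluten-free"),
    StoreEvent("c1", "add_to_cart", datetime(2025, 9, 1, 10, 5),
               product_id="chocolate_cake"),
    StoreEvent("c1", "add_to_cart", datetime(2025, 9, 1, 10, 6),
               product_id="apple_pie"),
    StoreEvent("c1", "purchase", datetime(2025, 9, 1, 10, 9),
               product_id="apple_pie", amount=18.5),
]
summary = summarize_customer(log, "c1")
```

Keeping every interaction in a single log with a shared customer ID is a design choice that makes it easy to derive new features later without changing how data is collected.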
Methods of Collecting Customer Data
Collecting information should be seamless, integrated into store operations, and not require customers to fill in lengthy forms. Proven methods include:
- Website Analytics: Tools like Google Analytics 4 track visits, page views, clicks, and time on site. Combined with cookies, this allows you to identify returning users and their interests.
- E-commerce Event Tracking: Most platforms (PrestaShop, WooCommerce, Shopify) can track key actions such as product views, cart additions, checkout initiations, and purchases. These events reveal where customers hesitate and help train algorithms to identify product relationships (“Customers who viewed X often also bought Y”).
- User Behavior Profiling: Encourage account creation by offering benefits (order history, faster checkout, loyalty discounts). Logged-in behavior can be linked to a persistent profile, allowing for personalized recommendations and targeted offers.
- Heatmaps and Session Recordings: Tools like Hotjar or Crazy Egg show where users click, scroll, and pause, offering UX insights that can indirectly enhance recommendations.
- Traffic Source and Campaign Analysis: Knowing whether a customer came from a Facebook ad or a Google search for “sugar-free cake” allows tagging them for relevant offers.
- Loyalty Programs: Points, discounts, or perks for frequent customers encourage sign-ups, providing more structured behavioral data tied to a customer ID.
- Reviews and Social Media Insights: Even unstructured comments can reveal purchase intent or preferences (“Beautiful cake for my son’s first birthday” implies repeat needs).
- Aggregated Trends: Seasonal and contextual trends (e.g., higher cheesecake sales during holidays) can feed contextual features into the recommendation system.
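As one concrete example of the traffic-source method above, the sketch below reads campaign tags from a landing-page URL using Python's standard library. It assumes your ad links carry the common `utm_*` query parameters; the `campaign_tags` helper and the example URL are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def campaign_tags(landing_url):
    """Extract UTM campaign parameters from a landing-page URL."""
    params = parse_qs(urlparse(landing_url).query)
    keys = ("utm_source", "utm_medium", "utm_campaign", "utm_term")
    # parse_qs returns a list of values per key; keep the first one.
    return {k: params[k][0] for k in keys if k in params}


tags = campaign_tags(
    "https://bakery.example/cakes"
    "?utm_source=google&utm_medium=cpc&utm_term=sugar-free+cake"
)
```

Tags extracted this way can be stored next to the session or customer ID and later used as contextual features.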
Using the Collected Data in a Recommendation System
Once diverse customer and interaction data is collected, it can be used to:
- Build User and Product Feature Sets: Factorization Machines take a feature vector for each user-product interaction, typically combining user features (ID, segment, preferences, average spend, location) with product features (ID, category, flavor, price range). A richer, well-chosen feature set generally improves matching accuracy.
- Enable On-Site Personalization: Dynamic sections like “Recommended for You,” “Customers Also Bought,” “Recently Viewed,” or “Bestsellers in Your Area” enhance engagement and sales.
- Inform Marketing Decisions: Segment customers for targeted outreach (e.g., special offers for lapsed buyers, early birthday cake promotions for repeat birthday purchasers).
- Continuously Improve Models: Retrain periodically as preferences evolve, add new features when gaps are found, and validate performance through click-through rates (CTR) and conversions.
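To show how user and product features might reach a Factorization Machine, here is a minimal pure-Python sketch: categorical attributes are one-hot encoded into a sparse vector, which is then scored with the standard second-order FM formula. The function names, the feature names, and especially the weights are assumptions for illustration. The weights are hand-picked, not trained, and a real system would learn them from interaction data with a dedicated library.

```python
def one_hot_features(user, product, index):
    """Map categorical user/product attributes to a sparse {column: 1.0} vector.

    `index` assigns a column number to each "field=value" string and grows
    as new values appear. User and product field names must not collide.
    """
    x = {}
    for field, value in {**user, **product}.items():
        col = index.setdefault(f"{field}={value}", len(index))
        x[col] = 1.0
    return x


def fm_score(x, w0, w, V):
    """Second-order Factorization Machine score for a sparse vector x.

    w0: global bias; w[i]: linear weight of column i;
    V[i]: latent factor list of column i (all of the same length k).
    Pairwise term uses the identity
    sum_{i<j} <V_i, V_j> x_i x_j
        = 0.5 * sum_f ((sum_i V_if x_i)^2 - sum_i (V_if x_i)^2).
    """
    linear = sum(w[i] * xi for i, xi in x.items())
    k = len(V[0]) if V else 0
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * xi for i, xi in x.items())
        s_sq = sum((V[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


index = {}
x = one_hot_features(
    {"user_id": "c1", "segment": "vegan"},
    {"product_id": "p7", "category": "cake"},
    index,
)

# Hand-picked, untrained parameters, only to exercise the formula.
w0 = 0.1
w = [0.0, 0.2, 0.0, 0.3]
V = [[0.1, 0.0], [0.5, 0.2], [0.0, 0.1], [0.4, 0.3]]
score = fm_score(x, w0, w, V)
```

The pairwise term is what lets the model relate, say, a customer segment to a product category even when that exact customer has never bought that exact product.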
Summary
For an artisan running an online bakery, customer data becomes as essential an ingredient as quality flour or a trusted recipe. Collecting it doesn’t need to be difficult or invasive—most of it is already flowing through your store in the form of digital traces. Your task is to gather, structure, and use it effectively.
Combine your craft expertise with structured behavioral data, and you’ll spot patterns that allow you to anticipate customer needs—sometimes before they’re even aware of them.
As for the next step—while many small businesses rely on third-party analytics tools, building your own structured database from system logs provides full control and independence. These logs contain raw sequences of customer actions—purchases, hesitations, and decisions—which, once filtered and structured, become the foundation for your own recommendation engine. That will be the focus of my next article.
Wojciech Moszczyński
Graduate of the Department of Econometrics and Statistics at Nicolaus Copernicus University in Toruń. Specialist in econometrics, finance, data science, and management accounting. Focused on optimizing production and logistics processes. Active researcher in AI development and applications. Long-time promoter of machine learning and data science in business environments.