A taxonomy of personality traits and its application in recommendation systems for the sale of investment and luxury goods (Part 1)

October 2025 | volume 79

Why knowing the client’s profile increases the effectiveness of selling expensive solutions

In low-risk purchases (e.g., butter, pasta), habits and situational constraints decide the outcome; analysis of deep traits is often unnecessary. In investment choices, however — the purchase of a production line, a luxury car, or an expensive property — the client engages analytical processing, verifies risk, and expects trust and an adjustment of the communication style to their own preferences. A taxonomy of personality traits organizes these differences and makes it possible to understand how the client makes decisions (e.g., preference for innovation vs. predictability, sensitivity to uncertainty, need for data vs. relationships). This structure of traits explains why aligning the tone of conversation, argumentation, and evidence with the client’s psychological profile raises conversion and satisfaction. Getting to know the client’s unique structure of traits and using it for proper communication — not for manipulation — optimizes the sales process by reducing uncertainty and cognitive costs for the client. Effective optimization of sales conversations therefore requires a solid, evidence-based recognition of the client’s personality profile.

What is the Big-Five factor structure?

The “Big Five” model is today the most widespread taxonomy of personality traits in the psychology of individual differences. Its roots reach back to the so-called lexical hypothesis and to factor analyses conducted since the 1930s, and its contemporary form was consolidated in research from the 1960s through the 1990s (Tupes and Christal; Norman; Goldberg; Costa and McCrae). The starting point was the hypothesis that the most important differences between people are revealed in the language they use. Gordon Allport and Henry Odbert (1936) cataloged approximately 18,000 English words describing persons, creating the foundation for later factor analyses. Next, Ernest Tupes and Raymond Christal (1961; published 1992) demonstrated the replicability of five broad factors in trait ratings, which was confirmed by Warren Norman (1963). At the turn of the 1980s and 1990s, Lewis R. Goldberg consolidated the five-dimensional structure as the Big-Five factor structure, proposing an open descriptive taxonomy based on adjectives. In parallel, Paul Costa and Robert McCrae developed the NEO-PI-R/NEO-PI-3, questionnaire instruments measuring five domains and 30 facets, which became the gold standard of measurement. Contemporary reviews (John, Naumann, Soto) describe this paradigm as a taxonomy of traits with high communicative usefulness.

Thus, the Big Five model is not new; it is the outcome of decades of scientific research. The world of sales took an interest in this taxonomy of personality traits long ago, but for years there was no practical way to translate it into effective action. The situation changed with the appearance of new semantic-analysis systems based on artificial intelligence.

What is the OCEAN Big Five?

O-C-E-A-N is an acronym for the five main dimensions of personality in the Big Five model:

O – Openness to Experience: curiosity, imagination, readiness for novelties and ideas.
C – Conscientiousness: order, self-discipline, goal orientation, predictability.
E – Extraversion: social energy, assertiveness, need for contact and stimulation.
A – Agreeableness: empathy, cooperation, trust, mildness in disputes.
N – Neuroticism (emotional reactivity): tendency toward anxiety, tension, sensitivity to risk.


The OCEAN Big Five is useful for business because it organizes clients’ styles of information processing, tolerance of risk, and communication preferences; it captures variables that directly affect purchasing decisions and the dynamics of negotiations. In practice, it is not simple incentives such as discounts, add-ons, or bonuses that drive the purchase; the decisive role is played by the client’s assessment of the arguments. It is not about using the “right words,” but about what kind of evidence the client needs in order to deem the transaction trustworthy.

Cognitive fit

High Conscientiousness (C) prefers structuring, indicators, and procedures; the lack of hard data produces a sense of chaos, which can lead to a drop in trust and a hardening of the stance.

High Openness (O) expects novelty and wants answers to many questions. In such a situation, a routine offer may be assessed as lacking added value and as unattractive. Building an offer with personality traits in mind is very labor-intensive; therefore, it is possible to construct a RAG-type artificial-intelligence system to create individualized, psychologically tailored offers.

High Neuroticism (N), manifesting as reactivity to uncertainty, requires mechanisms of risk reduction in the form of guarantees and showing exit paths. Their absence will amplify loss aversion and lead to withdrawal from the transaction despite objectively good conditions.

High Agreeableness (A) is sensitive to the tone of the relationship. Confrontational communication conducted by the seller triggers defensive reflexes in the client and hardens the position.

High Extraversion (E) prefers quick, interactive contact and social proof; asynchronous and cool communication lowers the readiness to close.

To sum up — when the style of the message does not match the profile (e.g., an excess of detail for a client who does not expect it), the client’s cognitive cost rises. Saving energy is a natural evolutionary trait of humans: the client instinctively postpones the decision or outright refuses, because an alternative place to make the purchase requires less cognitive effort. In the future the client will return to the place where an earlier transaction was cognitively effortless.

Forming trust

Trust arises when the content, evidence, and form of communication are consistent with expectations resulting from the client’s individual OCEAN profile. Mismatch generates signals of relational risk (e.g., “the seller is not listening to me”), which results in a hard negotiating stance or breaking off talks.

Including OCEAN is not a soft add-on, but a way to minimize informational friction and perceived risk, and thus a factor increasing the probability of a decision, the speed of closing, and satisfaction. Not taking the profile into account increases the probability of defensive reactions, withdrawal, hardening of the stance, and additional retaliatory demands, even with objectively favorable parameters of the offer.

Scientific foundations of transcript analysis

Over the last decade, the theory of semantic analysis — so-called computational psychometrics of language — has developed rapidly. Research on large samples of social-media users has shown that open-vocabulary analysis and models based on n-grams, LDA topics, or embeddings predict Big Five scores better than traditional dictionaries — provided we have a sufficiently large text sample. Put simply, systems that identify the meaning, sentiment, and tendencies of whole utterances turned out to be much more effective than the previous approach, which relied on counting adjectives, verbs, and the frequencies of specific words and phrases without any deeper understanding of their meaning and context. A classic point of reference is the work of Schwartz et al. (PLOS ONE, 2013), conducted on over 75,000 people. The authors showed that language features correlate stably with Big Five questionnaires. Subsequent work (Park/Schwartz et al.) and numerous replications on blogs, tweets, and spontaneous speech confirm these relationships and indicate that a greater amount and richness of language significantly improve the accuracy of personality prediction.

Data requirements: how much text is needed?

The stability of linguistic indicators grows with the number of words. In the research literature it has become customary to use minimal samples on the order of 100–200 words, with the caveat that shorter samples are unstable; the more text, the better. Moreover, the studies cited report higher validity when the accumulated text reaches roughly 600–4,000 words per person, depending on the domain of the conversation.

In the practice of profiling clients based on phone conversations, a minimum of 300 words of accumulated text is recommended. A phone call differs from an ordinary conversation in which the interlocutors see each other; therefore, it requires a different normalization. It is possible to collect and combine several conversations. Combining several transcripts can also offset the effect of the person’s temporary mood. Big Five traits are theoretically stable; however, on every occasion the profile should be updated on the basis of new transcripts.
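As an illustration, here is a minimal sketch of this accumulation step, assuming transcripts arrive as (client_id, text) pairs; the 300-word threshold follows the recommendation above, and in a real system the pool would be refreshed as new calls arrive.

```python
# A minimal sketch: merge call transcripts per client and profile only
# clients whose accumulated sample passes the ~300-word threshold.
from collections import defaultdict

MIN_WORDS = 300  # minimum accumulated sample before profiling

def accumulate(transcripts):
    """Merge transcripts per client; combining calls also offsets one-off moods."""
    pool = defaultdict(list)
    for client_id, text in transcripts:
        pool[client_id].append(text)
    return {cid: " ".join(parts) for cid, parts in pool.items()}

def ready_for_profiling(pool):
    """Keep only clients with enough accumulated text."""
    return {cid: text for cid, text in pool.items()
            if len(text.split()) >= MIN_WORDS}
```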

Operationalization of the five dimensions in B2B/B2C sales of investment goods

In practice, Big Five traits do not say what the client will buy, but how they assess risk, data, innovations, and relationships. “Openness” favors a narrative about novelty, prototypes, and personalization; “Conscientiousness” — the need for hard indicators (SLA, TCO, audits); “Extraversion” — quick interactions and case studies with people; “Agreeableness” directs toward an empathetic tone and a tendency to avoid problems; “Neuroticism” seeks to reduce uncertainty through guarantees, transparency, and the possibility of potentially withdrawing from the transaction. These mappings are consistent with the descriptions of domains in the aforementioned NEO-PI-R by Paul Costa and Robert McCrae.

From an operational point of view this means two things: first, get to know the client (extract linguistic features from transcripts and other activities), then adjust the conversation script and materials (selection of arguments and tone) to the O-C-E-A-N profile. Such an approach simplifies the client’s decision, increases trust, and reduces informational friction, which is important especially when purchasing rare luxury or investment goods.

Methodological recommendations for the data analyst

On the data side, it is worth combining interpretable features (the old NLP approach, e.g., counters of key phrases for risk, timing, innovativeness) with semantic vectors (the new data-science approach based on the meaning of entire sentences and segments). Meta-analyses show that open models should not be fed extremely short samples; in practice one should strive to sum up several conversations per client and control the stability of the scale as the text length grows. Next, we build regressions for the five dimensions (with threshold calibration), validate “by person” (to avoid leakage), and report interpretability (features/fragments raising a given domain). The theoretical foundation — the lexical origin of traits and their five-dimensional structure — justifies that language in natural use provides diagnostic signals about the client’s decision preferences. Profiling is conducted in order to better match the offer, which leads to higher satisfaction with the transaction and a higher probability of transaction replication.
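A minimal sketch of such a pipeline is shown below, assuming accumulated transcripts, per-client identifiers, and questionnaire-based OCEAN scores as training targets. The key-phrase vocabulary is an illustrative assumption, and LSA over TF-IDF stands in for a proper sentence-embedding model so that the example stays self-contained.

```python
# Sketch: interpretable phrase counts + semantic vectors -> per-dimension
# ridge regression, validated with person-level splits to avoid leakage.
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge
from sklearn.model_selection import GroupKFold, cross_val_score

# Interpretable features: counts of hypothetical key phrases (risk, timing...).
risk_phrases = ["guarantee", "risk", "deadline", "certificate"]  # illustrative
interpretable = CountVectorizer(vocabulary=risk_phrases)

# Semantic stand-in: TF-IDF compressed to dense vectors (assumes a corpus
# large enough for 50 components; swap in real embeddings in production).
semantic = Pipeline([
    ("tfidf", TfidfVectorizer(min_df=2)),
    ("svd", TruncatedSVD(n_components=50, random_state=0)),
])

features = FeatureUnion([("phrases", interpretable), ("semantic", semantic)])

def score_dimension(texts, y_dim, groups):
    """Cross-validate one OCEAN dimension; `groups` holds client ids."""
    model = Pipeline([("features", features), ("reg", Ridge(alpha=1.0))])
    cv = GroupKFold(n_splits=5)  # no client in both train and test folds
    return cross_val_score(model, texts, y_dim, groups=groups, cv=cv,
                           scoring="r2")
```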

The Big Five is an old and well-confirmed taxonomy of traits which, thanks to the newest language-analysis methods based on artificial intelligence, has gained new, scalable applications in business. In the area of high-value sales, the key is first to get to know the client on the basis of their real language (minimum 300 words), and then to adjust communication to their O-C-E-A-N profile. Such automation does not replace the seller’s experience; rather, it strengthens it by providing objective guidelines for “how to talk” so as to speed up building trust and make it easier for the client to make a decision.

HEXACO as a development of the Big Five

HEXACO, a continuation and complement of the Big Five OCEAN paradigm in business applications, arose from the practice of using the Big Five in the sale of luxury goods. It was noticed that one further dimension plays an important role in high-value decisions: Honesty–Humility. According to the research cited below, a HEXACO assessment improves the effectiveness and alignment of communication in B2B/B2C sales of luxury and investment goods.

Genesis and the place of HEXACO with respect to the Big Five

HEXACO grows out of the same lexical hypothesis as the Big Five: key individual differences are encoded in natural language; their structure is revealed in factor analyses of adjectives. In the 2000s, Kibeom Lee and Michael C. Ashton, on the basis of cross-linguistic lexical analyses, proposed a six-dimensional model which — alongside Extraversion, Agreeableness, Conscientiousness, Openness, and (redefined) Emotionality — introduces a new dimension, Honesty–Humility.

Compared to the Big Five, HEXACO changes the definitions of some traits. Emotionality (E) in HEXACO includes fearfulness, sensitivity, and attachment, whereas the components of anger and irritability shift to Agreeableness (A) as gentleness and forgiveness; this is a change relative to Neuroticism and Agreeableness in OCEAN, supported by research on the effectiveness of communication. Corresponding traits across the two models correlate moderately to highly (highest for Extraversion, Conscientiousness, and Openness), which speaks for paradigmatic kinship but also for the added value of the six-factor model.

The contribution of HEXACO: the Honesty–Humility dimension and consequences for economic decisions

The biggest novelty of HEXACO is Honesty–Humility (H), which combines sincerity, a lack of tendency to manipulate, modesty, and low greed for material status or class membership. In numerous studies, H predicts a lower propensity for unethical behaviors such as fraud, corruption, and opportunism. People high in H show greater honesty in cooperative games and prefer fair treatment of others. In tests of comparative validity, HEXACO scales (especially H) have an advantage over OCEAN in predicting criteria related to honesty and cooperation. The mechanisms linking H with pro- and antisociality appear to include, among others, concern for justice and low acceptance of using others instrumentally.

For practice this means that in high-stakes negotiations (luxury goods, assembly lines), clients with low H — that is, low humility and honesty — more often produce hard purchasing behaviors: aggressive demands, lower sensitivity to ethical arguments, and higher tolerance for counterfeits. By contrast, clients high in H prefer transparent conditions and long-term partnership and are more resistant to short-term temptations such as discounts with costs hidden from the client. In consumer studies, clients with a high level of honesty and humility react negatively to counterfeit luxury goods.

HEXACO and high-value decisions

In the sale of luxury and investment goods, the key roles are played by the perception of risk, trust, and the adjustment of tone and evidence. The six-dimensional taxonomy allows these mechanisms to be modeled more precisely than OCEAN, because it separates two distinct sources of relational friction: exploitativeness (low H) and conflict/unforgiveness (low A). Preparing the conversation with traits A and H in mind helps prevent retaliatory and escalatory behaviors in the client. HEXACO has been analyzed in depth in behavioral games aimed at improving the stability of cooperation and at avoiding both “hard anchors” in negotiations and the tendency to escalate minor incidents into conflicts.

Additionally, the redefinition of Emotionality (E) as fearfulness/sensitivity (without the anger component) better models aversion to uncertainty and the need for safety mechanisms among some corporate clients: guarantees, reversible options, clear support procedures. In comparative studies these differences are well documented, both conceptually and psychometrically.

Frames of application in the sale of luxury and investment goods

As in OCEAN applications, the basis of HEXACO is the extraction of traits from language: from call-center transcripts, correspondence, meeting notes.

Example of creating an offer based on mapping the client’s profile (a code sketch of this mapping follows the list)

  • H (Honesty–Humility) high: maximum transparency, no “fine print”;

  • H low: tougher contract conditions, explicit consequences of violations, formal compliance procedures. Empirical research links low honesty with unethical decisions and a propensity for fraud. The client will accept this on the principle of “I would have written it that way too”;

  • E (Emotionality) high: first reduce uncertainty (guarantees, exit path), then complex comparisons;

  • E low: greater acceptance of “hard” data without a safety “cushion”;

  • A (Agreeableness) low: avoid a confrontational tone; design de-escalation and “cooling-off” procedures;

  • A high: expose “problem-free” operation, show a clean and easy exit path;

  • C (Conscientiousness) high: SLA, schedules, checklists, quality audits;

  • C low: visual summaries and short bullet points instead of long essays;

  • X (Extraversion) high: quick meetings, short quick calls, “on-the-go” interaction;

  • X low: instead of calling and “closing” in a meeting, send well-prepared materials for self-review: a PDF with the offer and case studies. Give time for the decision: propose a window for questions by email or short message, not an immediate call. Form and tone: clear, without fireworks, concrete and structured (headings, summaries). Helpful add-ons: checklists, a table comparing variants, a link to a knowledge base, and the possibility of asking questions in writing. What to avoid with low X: intrusive phone calls, improvised video calls, and “let’s talk now” — these can discourage such a client;

  • O (Openness) high: a story about innovation, personalization, and aesthetics, particularly for luxury goods. It is worth “selling the idea,” the process, and uniqueness. What to avoid with high Openness: clichés and banalities (“the best quality,” “we are the leader” without substance); a rigid price list without a story; excessive standardization (no personalization options, “silver/gold/platinum” packages without modification); overly technical jargon without the “why”; faux-premium (generative stock photos, pretended “limitation” without verifiable history); lack of aesthetics in the presentation.

  • O low: focus on proven solutions, stability, compliance/compatibility, standards and best practices, zero experiments, predictable effects, references/cases, a clear migration path. With low openness, avoid flaunting novelty, experiments without a guarantee that it works, “innovation for innovation’s sake,” quick scope changes, too many options/configurations, ambiguous roadmaps, chaotic brainstorming, aesthetic stories without specifics.
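Here is the promised code sketch of this mapping; the trait keys follow the list above, while the cut-off values and guideline wording are illustrative assumptions, not calibrated parameters.

```python
# Sketch: turn an estimated HEXACO profile (scores normalized to 0-1)
# into offer guidelines following the mapping above.
HIGH, LOW = 0.65, 0.35  # hypothetical cut-offs

GUIDELINES = {
    ("H", "high"): "maximum transparency, no fine print",
    ("H", "low"): "explicit contract conditions and compliance procedures",
    ("E", "high"): "reduce uncertainty first: guarantees, exit path",
    ("E", "low"): "lead with hard data and comparisons",
    ("A", "low"): "avoid confrontation; plan de-escalation steps",
    ("A", "high"): "stress problem-free operation and an easy exit path",
    ("C", "high"): "SLA, schedules, checklists, quality audits",
    ("C", "low"): "visual summaries and short bullet points",
    ("X", "high"): "quick, interactive meetings and calls",
    ("X", "low"): "well-prepared written materials, time to decide",
    ("O", "high"): "story of innovation, personalization, aesthetics",
    ("O", "low"): "proven solutions, standards, references",
}

def offer_guidelines(profile):
    """Return guidelines for traits with a clear high or low signal."""
    tips = []
    for trait, score in profile.items():
        if score >= HIGH:
            tips.append(GUIDELINES[(trait, "high")])
        elif score <= LOW:
            tips.append(GUIDELINES[(trait, "low")])
    return tips

print(offer_guidelines(
    {"H": 0.8, "E": 0.2, "X": 0.3, "A": 0.5, "C": 0.7, "O": 0.9}))
```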

Example of application: Purchase of luxury goods

High O and X (Openness and Extraversion) favor a narrative about uniqueness and social belonging. Typical examples include such notions as limited editions and other socio-cultural proofs of status and uniqueness. High H (Honesty–Humility) limits the propensity to purchase counterfeits, so confirmation and certification increase the perceived value of a luxury good for the client; low H requires stronger signals of enforcement of rights and guarantees of authenticity. Survey data and preregistered studies link H with a negative attitude toward luxury counterfeits.

In the transaction of purchasing an investment good

High C and E (sensitivity to risk) imply the need to stage the decision (POC, analysis, many meetings), hard KPI/SLA indicators, and reversible options — that is, the possibility of maneuvering during the purchasing process. The point is that two personality dimensions tell us what risk dominates in the relationship with the client. When Honesty–Humility (H) scores low, the risk of hard, opportunistic moves on the client’s side is greater, so we design the contract according to clear rules, introduce safeguards, milestone payments, zero ambiguities. When Agreeableness (A) scores low, clashes and escalations are more frequent; therefore we immediately foresee paths for extinguishing conflicts: a calm conversation mode, clearly described mediation and dispute-resolution procedures.

Choice of personality-trait taxonomy model

Not every sales goal requires HEXACO: in some fields OCEAN may turn out to be slightly more effective. Therefore, the choice of paradigm (OCEAN vs. HEXACO) should be empirical, through comparative validations on business criteria (conversion, length of sales cycle, stability of the relationship).

Summary

HEXACO constitutes a mature complement to the Big Five: it retains the strengths of the five-dimensional taxonomy and at the same time introduces the diagnostic dimension Honesty–Humility and more clearly separates the components of conflict and fearfulness, which is particularly important when purchasing luxury goods such as yachts, expensive cars, or residences. In the sale of luxury and investment goods, this translates into more accurate shaping of trust, risk control, and the tone of negotiations, which — combined with language diagnostics — makes it possible to operationalize fit to the client on solid empirical foundations.

Selected sources

Allport, G. W., H. S. Odbert. 1936. Trait-names: A psycho-lexical study. Psychological Monographs (the classic root of the lexical hypothesis and the list of traits).
Ashton, M. C., K. Lee. 2008. Prediction of Honesty–Humility-related criteria… Journal of Research in Personality (criterion validation vs. FFM).
Ashton, M. C., K. Lee, R. E. de Vries. 2014. The HEXACO Honesty–Humility, Agreeableness and Emotionality factors: A review… PSPR (synthetic review of the construct and validity).
Costa, P. T., R. R. McCrae. 1992. NEO-PI-R Manual (operationalization of five domains and 30 facets).
Goldberg, L. R. 1990; 1993. Historical and empirical articles grounding the Big Five structure.
HEXACO-PI-R: tool materials and scale descriptions (psychometrics, adaptations, short versions).
Hilbig, B. E., et al. 2013. It takes two: Honesty–Humility and Agreeableness… Personality and Individual Differences (selective prediction of non-exploitation and non-retaliation).
John, O. P., S. Srivastava. 1999/2008. Reviews and updates of the integrative Big Five taxonomy.
Lee, K., M. C. Ashton. 2008. The HEXACO Model of personality structure and the importance of the H factor. Social and Personality Psychology Compass (overview of the model and role of H).
LIWC Manuals. 2001–2022. Basics of dictionary-based language analysis and notes on minimal text length.
Norman, W. T. 1963. Toward an adequate taxonomy of personality attributes. (replications of the five factors).
Park/Schwartz et al. (review and applications of computational language methods to personality).
Pilch, I. 2023. Comparison of the Big Five and the HEXACO Models… Current Psychology (differences in A/E, correlations between models).
Pletzer, J. et al. 2019. A meta-analysis of the relations between personality and… Acta Psychologica (A/E rotations and their consequences).
Applied studies indicating the link between accuracy and sample length (e.g., recruitment/essays ~660 words vs. thousands of words in social media).
Reinhardt, N., et al. 2023. Honesty–Humility & attitudes toward counterfeit luxury. Behavioral Sciences of Terrorism and Political Aggression / PsyArXiv versions (attitude toward counterfeits).
Schwartz, H. A., et al. 2013. Personality, gender and age in the language of social media. PLOS ONE (evidence that open-vocabulary predicts the Big Five).
Tupes, E. C., R. E. Christal. 1961/1992. Recurrent personality factors based on trait ratings. Journal of Personality, 60, 225–251 (replication of the five factors).

Wojciech Moszczyński — graduate of the Department of Econometrics and Statistics of Nicolaus Copernicus University in Toruń; specialist in econometrics, finance, data science, and management accounting. He specializes in the optimization of production and logistics processes. He conducts research in the area of the development and application of artificial intelligence. For years he has been engaged in the popularization of machine learning and data science in business environments.

Building a Recommendation System (Part 1)

Without Data, There’s No Recommendation System

To recommend something to someone, you must first know them. To know someone, you must have detailed information about them—often more than they know about themselves. When a customer completes multiple transactions with us, we get to know their behavioral patterns, their frequency of visiting stores, how long they hesitate before purchasing, how often they act on impulse, and when a discount offer might work best. All of these data points form the foundation of a recommendation system.

To build such a system, we must first learn how to collect data about our customers. This article is dedicated to exactly that.


Why Customer Data Is Crucial for Recommendation Systems

Personalized recommendations function much like a skilled salesperson in a store—able to predict, based on collected information, what a customer might like. Recommendation algorithms (including FM and DeepFM) learn from data: the more accurate and comprehensive the behavioral and preference data you collect, the better the system can tailor offers to individual customers.

Traditional methods of gathering opinions (such as surveys) fall under explicit feedback—directly asking customers what they like. Unfortunately, such declarations are often imprecise or infrequent. This is why modern systems prefer to rely on implicit feedback—hidden data from real user behavior. Such data is richer and often reflects preferences more honestly, though it requires interpretation. In short: instead of asking—observe.
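As a small illustration of the “observe” principle, implicit events can be converted into graded preference weights for later model training; the event names and weights below are illustrative assumptions, not tuned values.

```python
# Sketch: aggregate implicit feedback events into preference weights.
EVENT_WEIGHTS = {"view_item": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

def implicit_scores(events):
    """Sum weighted (user, item, event) triples into preference scores."""
    scores = {}
    for user, item, event in events:
        key = (user, item)
        scores[key] = scores.get(key, 0.0) + EVENT_WEIGHTS.get(event, 0.0)
    return scores

print(implicit_scores([("u1", "cheesecake", "view_item"),
                       ("u1", "cheesecake", "add_to_cart")]))
# {('u1', 'cheesecake'): 4.0} -> stronger interest than a view alone
```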

Below, we outline exactly which data points are worth collecting and how to do it discreetly, without overwhelming customers with questionnaires.



Imagine you run an online bakery specializing in cakes, pies, and other pastries. You want to implement a modern recommendation system powered by algorithms such as Factorization Machines (FM) or DeepFM, so you can offer products aligned with each customer’s taste. For these tools to work effectively, they require rich data—covering preferences, purchase habits, and behavioral patterns.

This section provides a detailed analysis and a practical guide on how to discreetly collect such information without relying on low-reliability surveys.


What Customer Information Should You Collect?

To build an effective bakery recommendation system, focus on gathering behavioral data from your online store:

  • Purchase History (Transactions) – Record what the customer bought, when they bought it, how often they purchase, and how much they spend. This history reveals patterns—for example, whether someone buys cakes monthly (perhaps for events) or mainly before holidays. Such data supports cross-selling and helps predict future needs.

  • Browsed Products and On-Site Activity – Track which products and categories a customer views, how long they spend on them, and which pages they visit. This can reveal interests even without a purchase. For instance, frequent viewing of meringue cakes without buying signals an opportunity for targeted recommendations.

  • Search Queries – If your store has a search bar, log the exact terms entered. These are direct indicators of intent (“gluten-free,” “sugar-free,” etc.). This data informs both recommendations and inventory planning.

  • Cart Additions and Abandonments – Even abandoned carts indicate interest. For example, if a customer adds a chocolate cake but doesn’t purchase, you can send a reminder or offer a discount later.

  • Basic Demographics and Contact Data – Collected during checkout (name, address, phone, email). While they don’t directly reveal taste preferences, they can help with location-based offers and communication. Always ensure GDPR compliance.

  • Inferred Preferences – Derived from behavior (purchase history, browsing patterns, cart additions). For example, a customer who repeatedly orders birthday cakes with “Happy Birthday” inscriptions is likely buying for birthdays; another filtering for vegan products probably follows a vegan diet.

These are primarily first-party data—collected directly via your store—making them the most valuable for your recommendation model.


Methods of Collecting Customer Data

Collecting information should be seamless, integrated into store operations, and not require customers to fill in lengthy forms. Proven methods include:

  • Website Analytics – Tools like Google Analytics 4 track visits, page views, clicks, and time on site. Combined with cookies, this allows you to identify returning users and their interests.

  • E-commerce Event Tracking – Most platforms (PrestaShop, WooCommerce, Shopify) can track key actions such as product views, cart additions, checkout initiations, and purchases. These events reveal where customers hesitate and help train algorithms to identify product relationships (“Customers who viewed X often also bought Y”). A minimal event-logging sketch follows this list.

  • User Behavior Profiling – Encourage account creation by offering benefits (order history, faster checkout, loyalty discounts). Logged-in behavior can be linked to a persistent profile, allowing for personalized recommendations and targeted offers.

  • Heatmaps and Session Recordings – Tools like Hotjar or Crazy Egg show where users click, scroll, and pause, offering UX insights that can indirectly enhance recommendations.

  • Traffic Source and Campaign Analysis – Knowing whether a customer came from a Facebook ad or a Google search for “sugar-free cake” allows tagging them for relevant offers.

  • Loyalty Programs – Points, discounts, or perks for frequent customers encourage sign-ups, providing more structured behavioral data tied to a customer ID.

  • Reviews and Social Media Insights – Even unstructured comments can reveal purchase intent or preferences (“Beautiful cake for my son’s first birthday” implies repeat needs).

  • Aggregated Trends – Seasonal and contextual trends (e.g., higher cheesecake sales during holidays) can feed contextual features into the recommendation system.
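Here is the event-logging sketch promised above, assuming a Flask backend (Flask appears elsewhere in the author’s stack); the endpoint name and payload fields are illustrative, and a production system would write to a database or message queue rather than an in-memory list.

```python
# Sketch: a tiny event-collection endpoint for an online store.
from datetime import datetime, timezone
from flask import Flask, request, jsonify

app = Flask(__name__)
EVENTS = []  # stand-in for a database table or Kafka topic

@app.route("/track", methods=["POST"])
def track():
    payload = request.get_json(force=True)
    EVENTS.append({
        "user_id": payload.get("user_id"),   # cookie or account id
        "event": payload.get("event"),       # e.g. "view_item", "add_to_cart"
        "item_id": payload.get("item_id"),
        "ts": datetime.now(timezone.utc).isoformat(),
    })
    return jsonify({"ok": True})
```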


Using the Collected Data in a Recommendation System

Once diverse customer and interaction data is collected, it can be used to:

  • Build User and Product Feature Sets – Factorization Machines require user features (ID, segment, preferences, average spend, location) and product features (ID, category, flavor, price range). The richer the feature set, the better the matching accuracy. A sketch of the FM scoring rule follows this list.

  • Enable On-Site Personalization – Dynamic sections like “Recommended for You,” “Customers Also Bought,” “Recently Viewed,” or “Bestsellers in Your Area” enhance engagement and sales.

  • Inform Marketing Decisions – Segment customers for targeted outreach (e.g., special offers for lapsed buyers, early birthday cake promotions for repeat birthday purchasers).

  • Continuously Improve Models – Retrain periodically as preferences evolve, add new features when gaps are found, and validate performance through click-through rates (CTR) and conversions.
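Here is the promised sketch of how such feature sets feed the model: the Factorization Machine scoring rule (Rendle, 2010) in plain numpy, using the O(kn) reformulation of the pairwise term. The feature layout, with one-hot user and item positions, is an illustrative assumption.

```python
# Sketch: FM score = w0 + sum_i w_i*x_i + sum_{i<j} <v_i, v_j>*x_i*x_j.
import numpy as np

rng = np.random.default_rng(0)
n_features, k = 20, 4                     # toy sizes
w0 = 0.0
w = rng.normal(0, 0.1, n_features)        # linear weights
V = rng.normal(0, 0.1, (n_features, k))   # latent factor matrix

def fm_predict(x):
    """Score one interaction vector x of shape (n_features,)."""
    linear = w0 + w @ x
    xv = x @ V                    # per-factor sum of v_i * x_i
    x2v2 = (x ** 2) @ (V ** 2)    # per-factor sum of v_i^2 * x_i^2
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return linear + pairwise

# Example: user 3 interacts with item 12 (one-hot positions illustrative).
x = np.zeros(n_features)
x[3] = 1.0    # user one-hot block
x[12] = 1.0   # item one-hot block
print(fm_predict(x))
```

In training, w0, w, and V would be fitted on the weighted interactions collected above (e.g., by stochastic gradient descent), which is exactly where richer user and product features raise matching accuracy.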


Summary

For an artisan running an online bakery, customer data becomes as essential an ingredient as quality flour or a trusted recipe. Collecting it doesn’t need to be difficult or invasive—most of it is already flowing through your store in the form of digital traces. Your task is to gather, structure, and use it effectively.

Combine your craft expertise with structured behavioral data, and you’ll spot patterns that allow you to anticipate customer needs—sometimes before they’re even aware of them.

As for the next step—while many small businesses rely on third-party analytics tools, building your own structured database from system logs provides full control and independence. These logs contain raw sequences of customer actions—purchases, hesitations, and decisions—which, once filtered and structured, become the foundation for your own recommendation engine. That will be the focus of my next article.

Wojciech Moszczyński
Graduate of the Department of Econometrics and Statistics at Nicolaus Copernicus University in Toruń. Specialist in econometrics, finance, data science, and management accounting. Focused on optimizing production and logistics processes. Active researcher in AI development and applications. Long-time promoter of machine learning and data science in business environments.

My projects

PROJECT: INTELLIGENT BODY LEASING MANAGEMENT

When and where: The project was conducted in 2022-23 for a company specializing in IT staffing and body leasing.

Technologies used:

Python, SQL, Spark, Hadoop, Kafka, Flask, PostgreSQL, AWS

Description:

  1. The project aimed to develop a system for optimizing the management of body leasing processes to enhance resource allocation and efficiency.
  2. Advanced algorithms were utilized to forecast demand and match candidates with project requirements in real-time.
  3. The system significantly improved decision-making and reduced operational costs for the organization.

PROJECT: RECOMMENDATION SYSTEM FOR E-COMMERCE STORE

When and where: The project was conducted in 2022-23 for an online retail company.

Technologies used: Python, Spark, Flask, AWS, SQL, Hadoop, PostgreSQL, Kafka.

Description:

  1. The project focused on designing and implementing a recommendation system to enhance the customer shopping experience.
  2. It utilized collaborative filtering and content-based algorithms to suggest personalized products to users.
  3. The system increased user engagement, boosted sales, and improved customer satisfaction rates.

PROJECT: WAITING CART SYSTEM

When and where: The project was conducted in 2022-23 for an e-commerce company.

Technologies used: Python, Spark, Flask, AWS, SQL, Hadoop, PostgreSQL, Kafka.

 

Description:

  1. This project focused on designing a system to manage and optimize waiting carts for users on the platform.
  2. The system provided personalized recommendations and reminders to encourage customers to finalize their purchases.
  3. Its implementation increased conversion rates by addressing cart abandonment issues effectively.

PROJECT: INTELLIGENT CLOTHING SEARCH ENGINE

When and where: The project was conducted in 2023 for a fashion e-commerce company.

Technologies used: Python, Spark, Flask, AWS, SQL, Hadoop, PostgreSQL, Kafka.

 

Description:

  1. The project involved developing an intelligent search engine to enhance the user experience by allowing personalized and accurate clothing searches.
  2. The system used advanced filtering and recommendation algorithms to match customer preferences with available inventory.
  3. Its implementation improved search relevance, increased user engagement, and boosted sales.

PROJECT: ANALYSIS OF POPULATION BEHAVIOR

When and where: The project was conducted in 2021-22 for a government research institute.

Technologies used: Python, Spark, Flask, AWS, SQL, Hadoop, PostgreSQL, Kafka.

 

Description:

  1. The project focused on analyzing population behavior to identify trends and patterns using large datasets.
  2. It involved advanced data modeling and visualization to support decision-making in public policy and resource allocation.
  3. The findings provided actionable insights that helped optimize community support programs and improve service delivery.

PROJECT: OPTIMIZATION OF SULFUR FLOW IN GRUPA AZOTY

When and where: The project was conducted in 2017-2021 for Grupa Azoty, a leading chemical company in Poland.

Technologies used: Python, Spark, Flask, AWS, SQL, Hadoop, PostgreSQL, Kafka.

 

Description:

  1. The project aimed to optimize the sulfur flow process within the production facilities to reduce waste and improve efficiency.
  2. Advanced data analysis and simulation techniques were applied to model and enhance the flow dynamics.
  3. The results contributed to significant cost savings and a more sustainable production process.

PROJECT: DETECTING ANOMALIES IN CHEMICAL PLANT OPERATIONS

When and where: The project was conducted in 2017-2021 for a leading chemical manufacturing company.

Technologies used: Python, Spark, Flask, AWS, SQL, Hadoop, PostgreSQL, Kafka.

 

Description:

  1. This project focused on developing a system to detect anomalies in real-time during chemical plant operations, ensuring safety and efficiency.
  2. Advanced machine learning algorithms were employed to analyze sensor data and identify deviations from normal operating conditions.
  3. The system improved operational reliability by preventing potential failures and optimizing maintenance schedules.

PROJECT: GAS PRICE PREDICTION MODEL FOR A 14-DAY HORIZON

When and where: The project was conducted in 2017-2021 for a company in the energy sector.

Technologies used: Python, Spark, Flask, AWS, SQL, Hadoop, PostgreSQL, Kafka.

 

Description:

  1. The project involved developing a predictive model to forecast gas prices for a 14-day horizon using historical data and market trends.
  2. The model utilized machine learning techniques to provide accurate and actionable price predictions.
  3. Its implementation supported better decision-making in procurement and inventory management, reducing operational risks.

PROJECT: DETECTING ANOMALIES IN FUEL CONSUMPTION FOR SILVA

When and where: The project was conducted in 2014-2018 for Silva, a company specializing in logistics and transportation.

Technologies used: Python, Spark, Flask, AWS, SQL, Hadoop, PostgreSQL, Kafka.

 

Description:

  1. This project focused on developing a system to detect anomalies in fuel consumption across Silva’s fleet operations.
  2. The system used advanced analytics and machine learning to identify irregular fuel usage patterns and potential inefficiencies.
  3. The implementation helped reduce fuel costs and improved the company’s operational efficiency.

PROJECT: DETECTING FRAUD IN FINANCIAL TRANSACTIONS

When and where: The project was conducted in 2014-2018 for a financial services company.

Technologies used: Python, Spark, Flask, SQL, Hadoop, PostgreSQL, Kafka, on-premises cloud.

 

Description:

  1. The project involved building a system to detect fraudulent activities in financial transactions in real-time.
  2. Machine learning algorithms were employed to analyze transaction patterns and flag suspicious activities.
  3. The system improved fraud detection accuracy, reducing financial losses and enhancing customer trust.

PROJECT: VISUAL IDENTIFICATION OF WOOD QUALITY

When and where: The project was conducted in 2014-2018 for a forestry and wood production company.

Technologies used: Python, Spark, Flask, SQL, Hadoop, PostgreSQL, Kafka

Description:

  1. The project focused on developing a system to visually identify wood quality based on image analysis and machine learning techniques.
  2. The system utilized advanced algorithms to classify wood defects and grade the quality of timber in real-time.
  3. Its implementation improved production efficiency and ensured high standards of quality control.

PROJECT: ELIMINATION OF QUEUES AT THE ENTRY GATES FOR VEHICLES WITH TIMBER

When and where: The project was conducted in 2014-2018 for a timber production and logistics company.

Technologies used: Python, Spark, Flask, SQL, Hadoop, PostgreSQL, Kafka, on-premises cloud.

Description:

  1. The project aimed to streamline the entry process for vehicles transporting timber by eliminating queues at the entry gates.
  2. Advanced scheduling algorithms and real-time tracking were implemented to optimize vehicle flow and reduce waiting times.
  3. The solution significantly improved logistics efficiency and enhanced driver satisfaction.

PROJECT: OPTIMIZATION OF VEHICLE LOADING AND UNLOADING TIMES

When and where: The project was conducted in 2014-2018 for a logistics and supply chain company.

Technologies used: Python, Spark, Flask, SQL, Hadoop, PostgreSQL, Kafka, on-premises cloud.

Description:

  1. The project focused on optimizing loading and unloading times for vehicles to enhance operational efficiency and reduce delays.
  2. Data analytics and predictive modeling were used to identify bottlenecks and implement solutions for streamlined processes.
  3. The outcome resulted in significant time savings and improved resource utilization.

PROJECT: DETECTING FRAUD IN MASS TRANSACTIONS

When and where: The project was conducted in 2014-2018 for a financial services company.

Technologies used: Python, Spark, Flask, SQL, Hadoop, PostgreSQL, Kafka, on-premises cloud.

Description:

  1. This project focused on detecting fraudulent activities within mass transactions using advanced data analysis techniques.
  2. Machine learning models were developed to analyze transaction patterns, identify anomalies, and flag suspicious activities in real time.
  3. The solution enhanced fraud detection efficiency, reduced financial losses, and improved trust among clients.

PROJECT: ANALYSIS OF MARKETING STIMULI EFFECTIVENESS AT BANK PEKAO

When and where: The project was conducted in 2011-2018 for Bank Pekao.

Technologies used: Python, Spark, SQL, PostgreSQL.

Description:

  1. This project focused on analyzing the effectiveness of various marketing stimuli in driving customer engagement and product adoption.
  2. Statistical models and data analytics were applied to evaluate the impact of marketing strategies on customer behavior.
  3. The findings helped optimize future marketing campaigns and improve overall customer retention rates.

PROJECT: DELIVERIES TO THE BAKERY ON THE SECOND SHIFT

When and where: The project was conducted in 2011 for a bakery supply chain company.

Technologies used:

VBA, SQL, MySQL

 

Description:

  1. This project aimed to optimize deliveries to the bakery during the second shift to ensure timely supply and reduce delays.
  2. Advanced logistics planning and route optimization were employed to streamline the delivery process.
  3. The solution improved operational efficiency, reduced costs, and ensured fresh product availability for the bakery.

PROJECT: REPAIR OF THE AUTOMATIC ORDER SYSTEM

When and where: The project was conducted in 2015 for an e-commerce company.

Technologies used:

VBA, SQL, MySQL

 

Description:

  1. The project focused on repairing and optimizing the automatic order system to ensure its seamless functionality.
  2. Key issues, such as order processing delays and system crashes, were addressed through detailed diagnostics and code improvements.
  3. The repaired system enhanced order accuracy, reduced downtime, and improved overall customer satisfaction.

PROJECT: EXPIRED PRODUCTS NOTIFICATION SYSTEM

When and where: The project was conducted in 2015 for a retail company.

Technologies used:

VBA, SQL, MySQL

 

Description:

  1. The project involved developing a notification system to track and alert staff about products nearing expiration dates.
  2. The system utilized automated alerts and data analytics to ensure timely removal of expired goods from shelves.
  3. Its implementation reduced waste, improved inventory management, and enhanced customer satisfaction by maintaining product freshness.

PROJECT: AUTOMATIC ORDERING SYSTEM FOR SPECIAL PERIODS

When and where: The project was conducted in 2016 for a retail company.

Technologies used:

VBA, SQL, MySQL

 

Description:

  1. This project focused on designing an automatic ordering system tailored for special periods such as holidays or promotional campaigns.
  2. The system used predictive analytics and historical data to optimize inventory levels and prevent stock shortages or overstocking.
  3. Its implementation improved operational efficiency, reduced waste, and ensured product availability during high-demand periods.

PROJECT: OPTIMIZATION MODEL FOR PROMOTION SIZE

When and where: The project was conducted in 2016 for a retail company.

Technologies used:

VBA, SQL, MySQL

 

Description:

  1. The project aimed to develop a model to optimize the size and scope of promotions to maximize profitability and customer engagement.
  2. Advanced data analysis and machine learning algorithms were used to predict the effectiveness of different promotion strategies.
  3. The model provided actionable insights, leading to better allocation of promotional budgets and improved sales performance.

New professions that emerge with the development of artificial intelligence

It is worth mentioning who programmers are. A typical programmer, and I don’t want this to sound like a stereotype, is someone who is proficient in a specific programming language, executing precisely defined tasks related to writing code. Programmers are usually not very creative, which results from the fact that their tasks are often quite strictly defined by the client. They translate specific needs into programming code. It turns out that similar skills are possessed by the latest models of artificial intelligence. If a model is given a well-defined task, it will translate the described need into the selected programming language.

Can Artificial Intelligence Think Creatively?

Recently, I tested the latest GPT-4 chat model on creating solutions in operational research. Operational research is a field devoted to the mathematical optimization of processes. Using its methods requires logical, often unconventional thinking that goes beyond typical, routine approaches; it undoubtedly demands deep imagination and experience. I assigned the model several optimization tasks. Out of 10 assignments, nine were completed incorrectly. One was completed correctly because it was very conventional and belonged to the basic, classical scope of the field. Interestingly, each time the model performed its task, it boasted about the results. Unfortunately, the Python code it generated was unusable; it simply had errors. I pointed out the places where the errors were made, yet the model insisted it was correct. Moreover, it frequently made mistakes when creating objective functions, and the objective function is the foundation of any optimization process: it is hard to optimize anything when the goal is specified incorrectly. In summary, current artificial-intelligence models, including the GPT-4 chat model tested here, are likely capable of easily replacing a typical programmer performing repetitive work, most often involving writing simple code. However, my experience indicates that the models struggle with difficult logical tasks that require experience and unconventional thinking.
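For contrast, here is a minimal example of a correctly specified objective function in a classical linear-programming task, using scipy's linprog; the coefficients are illustrative. Since linprog minimizes, a profit-maximization goal is encoded by negating the objective coefficients.

```python
# Sketch: maximize profit 3*x1 + 5*x2 subject to two resource limits.
from scipy.optimize import linprog

c = [-3, -5]                 # negated: linprog minimizes c @ x
A_ub = [[1, 2], [3, 1]]      # resource usage per unit of x1, x2
b_ub = [14, 18]              # available resources
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)       # optimal plan and the profit achieved
```

A model that negates the wrong sign, or optimizes a quantity other than the stated goal, produces code that runs yet answers the wrong question; this is exactly the class of error described above.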

It is obvious that sooner or later, a new version of the AI model will emerge, which will be more effective at solving difficult logical tasks. The development of artificial intelligence is inevitable and unstoppable. Therefore, it is likely that next year, even more advanced and creative employees will be replaced by significantly more efficient mathematical algorithms.

Technological Development and Unemployment

The introduction of new technologies has always been associated with an improvement in the material well-being of the general population. Of course, this phenomenon was accompanied by certain groups losing their livelihoods in the short term, and certain traditions, reputations, knowledge, and skills were irretrievably lost. Carpet weavers ceased to be needed when textile machines were mass-produced; breeders of draft horses and producers of freight wagons lost their trade when the railway developed; assembly-line workers lost their jobs when car manufacturers introduced industrial robots en masse. Importantly, these changes were not accompanied by a significant increase in unemployment. There was simply a mass process of reskilling. With technological development, the volume of production and the availability of goods and services increased. Since these goods became cheap due to mass production, society could afford more. The increase in the wealth of citizens led to an increase in demand. More businesses were established, which meant more people found employment. These individuals could afford to purchase goods and services, directly driving the need to increase supply. This spiral of consumption and production is called economic growth.

We have a similar situation today. Currently, difficult-to-access intangible services such as valuations, expert opinions, designs, studies, and complex calculations or optimizations may become available to everyone in the near future. Experts who previously created such services will become unnecessary as artificial intelligence models will perform such services cheaply, quickly, and effectively.

The dynamic development of artificial intelligence observed over the past two years has shaken the market for IT experts. Many programmers, database-management experts, security specialists, and other professionals capable of navigating complex IT environments have lost their jobs. It has become clear that artificial-intelligence models can easily replace these experts, especially in programming and in overseeing processes. Indeed, artificial-intelligence models are capable of programming effectively; few futurists expected that, at such an early stage of its development, artificial intelligence would be able to write code independently. The models in question can create simple applications, thus performing tasks that were previously carried out by programmers.

New Professions Emerging with the Development of Artificial Intelligence

There will soon be electronic auditors analyzing accounting books, models quickly designing buildings and structures, and experts studying the causes of disasters. Their primary engine will be artificial intelligence algorithms. Having some experience with various technological revolutions that humanity has undergone and being aware that artificial intelligence will still require human assistance, we can attempt to define several professions that will undoubtedly emerge.

Let’s try to list the most obvious jobs associated with the handling of artificial intelligence, which will likely dominate the job market in the near future.

AI Entrepreneur

A year ago, I attended a clothing fair in Warsaw. There, I met a startup that created descriptions for the websites of stores selling clothing and footwear online. The service worked as follows: the e-commerce company sent the startup product photos and an outline of the main characteristics of those products, and the startup returned full product descriptions along with their on-page positioning, taking into account meta tags and all kinds of techniques for promoting products on web pages. I started talking to the startup’s representatives about the techniques and technological solutions they used. After a while, I learned that this work was in fact performed by a model they had trained, specifically GPT-4. This is an example of a new type of company that will appear en masse with the development of artificial-intelligence algorithms: companies that connect to algorithms that perform the work for them. Such companies have never existed before. They represent a bridge between advanced artificial-intelligence algorithms and businesses that do not want to engage directly with such models.

In this example, the new job is someone who fully utilizes the efficiency of artificial intelligence work and sells services that result from the work of this model.

Behavioral Analyst Supported by AI Models

Customers entering stores, people walking on the sidewalk, users visiting a website—each of these groups has a specific behavior pattern. Until now, analyzing customer behavior in specific places, based on the stimuli sent to them, has been a very difficult task. Until recently, such tasks were performed by data scientists who used Python libraries for these types of analyses. The work was extremely difficult, as it required programming knowledge and the tuning of neural networks; it was painstaking, and months of effort often did not yield the expected results. The development of new models will likely lead to a high level of standardization in behavioral assessment. Soon, models will appear that specialize in analyzing people’s behavior on the Internet, in buses, or at supermarket checkouts. To operate such complex analyses performed by advanced artificial-intelligence algorithms, behavioral specialists will be needed to draw appropriate conclusions and discuss the analysis results. For example, they will be able to explain to a supermarket owner how to arrange product shelves and the order of sales departments, which colors and lighting should dominate, and how to present products.

In summary, artificial intelligence is capable of creating highly advanced analyses that will meticulously present behavior models of specific customer groups. Unfortunately, the formulated conclusions will not be easy to interpret. These conclusions should be translated into real business by experts in customer behavior. The profession of behavioral analyst already exists, and there are also fields of study dedicated to this issue, such as at the Warsaw University of Technology.

AI Security Engineer

This is another significant profession that will emerge in the near future. It can be said that it will be a continuation of the IT security expert profession. This new profession will have a completely different area of work. Currently, cybersecurity experts mainly focus on detecting malicious software, analyzing email attachments and messages intended to create a vulnerability for the infiltration of malicious software aimed at stealing data or paralyzing the existing operating system. In the future, such work will be performed by autonomous artificial intelligence systems. They will do it better and more effectively. This will lead to a situation where these currently primitive methods of attacking operating systems will become completely ineffective. Hackers will then try other methods that are not related to the information system. They will seek access to the system through physical breach, meaning connecting directly to the system. Hackers will also increasingly attempt to manipulate people working in the targeted institution. Human creativity is immense, while artificial intelligence systems operate exclusively in an IT environment. A profession will emerge that will be a kind of detective constantly seeking ways to breach the system through various sophisticated methods that are impossible to monitor by artificial intelligence.

AI Personality Creator

For several months, I have been collaborating with various AI models. I noticed that some of them behave in a peculiar way, as if they had a different personality each time, every day. This is evidence of a certain immaturity of these systems. If we instruct a model to learn from us, it becomes more predictable. It retains a certain level of repeatability but becomes susceptible to our bad moods. What may seem funny is that I often have the impression that models can express something akin to disapproval; they can be spiteful or ironic. This may just be my subjective impression, but often, after I point out some flaw, the model seems to sulk, disconnect, or provide spiteful examples. Another example from my experience: if we instruct the model to learn our behaviors and we happen to have a bad day, talking to it in a terse and sluggish manner, the next day the model will talk to us as if we were still terse and sluggish, even though we want to engage dynamically. This is a typical example of learning based on the analysis of past behavior.

Artificial intelligence models are meant to imitate the behaviors of living people. Models need to learn about their clients and must do so skillfully. For a model, collaborating with people is very challenging, because human nature is very complicated in terms of reactions, behaviors, and overall communication. Therefore, the presence of psychologists and designers of artificial intelligence personalities will undoubtedly be necessary. Advanced models will need to be designed with significant involvement from psychology specialists.

AI Controller

In the future, AI controllers will observe robotic guides in museums or cybernetic car salespeople. They will analyze their behaviors, facial expressions, and customer reactions to them. Using artificial intelligence algorithms, and based on customer reactions, such a specialist will seek out weaknesses and flaws in robotic employees. They will also be able to steer the personalities of robots and control their behavior in the context of legal regulations and ethical considerations, including discrimination or a lack of empathy on the part of robots. Among the responsibilities of such a specialist may also be identifying and flagging deception by artificial intelligence. Unfortunately, artificial intelligence does cheat, and it does so relatively often. Currently, this is referred to as confabulation or hallucination: generating non-existent content. Undoubtedly, as algorithms advance, the phenomenon of cheating or psychological manipulation will likely become increasingly sophisticated and harder to detect. Specialists will be needed who can identify and eliminate such phenomena.

AI Artistic Director

Artificial intelligence has significantly disrupted the field of creativity. It has turned out that mathematical algorithms are capable of creating beautiful images, very interesting and deep abstractions, composing interiors, and building extraordinary artistic atmospheres. There will certainly be a need for someone who can fine-tune the machine’s sense of taste. The artistic director will be able to communicate with artificial intelligence models, design appropriate commands and instructions that will significantly improve aesthetics and guide the work of cybernetic artists towards specific needs expressed by clients.

AI Content Developer

In general, a developer is someone who can formulate queries and obtain answers from large data sources. Incidentally, developers are also referred to as programmers; however, let's stick to the first sense of the term. These individuals often use SQL, which allows them to extract very hard-to-obtain information from a source in a short time. AI content developers will play a similar role. They will know how to pose complex queries, called prompts, to artificial intelligence models. There is a wealth of interesting advice available on the internet about how to converse with artificial intelligence. For example, before we start asking questions, it is worth telling the model who we are and what our goals or intentions are. There is then a greater chance of receiving content that meets our needs. An expert who can communicate with artificial intelligence using queries formulated in natural language will be a very important and useful profession of the future.
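
To make this concrete, here is what such a context-setting prompt might look like when sent through an API rather than a chat window. This is a minimal sketch assuming the OpenAI Python SDK; the model name and the wording of both messages are illustrative placeholders, not a recommendation.

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        # First, tell the model who we are and what we intend to achieve...
        {"role": "system",
         "content": "You are a product copywriter for an online shoe store. "
                    "Write concise, SEO-friendly descriptions for non-expert buyers."},
        # ...then ask the actual question.
        {"role": "user",
         "content": "Write a 60-word description of trail running shoes "
                    "with a waterproof membrane and aggressive tread."},
    ],
)
print(response.choices[0].message.content)
```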

AI Implementer

Implementing artificial intelligence in a production plant or a food wholesale business may prove difficult for the people who work there. Artificial intelligence has many skills that could significantly improve process efficiency. Unfortunately, artificial intelligence will not arrive on its own, nor will the people working in these plants implement such advanced solutions by themselves. An AI implementer is someone who knows very well both the processes occurring in specific industries and the functionalities of artificial intelligence models. This person will be able to connect algorithms with needs.

AI Model Creator

This is a very elite profession that currently barely exists in Poland. Undoubtedly, small companies will soon emerge that independently create artificial intelligence models specialized in specific areas. Until now, such teams have rarely been seen, because tools for building such models did not exist in the technological environment. Artificial intelligence models can be constructed using other artificial intelligence models. Additionally, a wealth of components for building artificial intelligence models will soon be available. There will also be databases of raw material, meaning specially profiled data: for example, annotated photos created solely for building intelligent image recognition systems, an increasingly common way of organizing data so that it can serve as a knowledge source for training AI models. To better illustrate this issue, I will use a simple example.

When the first cars were created, standard bearings, hubs, exhaust systems, and batteries did not exist in the technical environment. Each of these components had to be built by the manufacturer itself. Because today one can buy any type of bearing, engine, or suspension system, building one's own vehicle is no longer a major challenge. The same goes for models. Currently, to build a model, one must first organize the data, create (often from scratch) the mathematical algorithms, and set up the environment in which the model will operate. With technological development, these types of model-building components will become increasingly accessible. We already have ready-made libraries containing code for constructing functional modules, and cloud environments in which we can install our solutions. However, such components are still both expensive and not very easy to use. With the technological development of artificial intelligence and the proliferation of algorithms, there will be mass production of new models, and with it, many highly specialized experts involved in creating such solutions will emerge.
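
To make the analogy concrete, the sketch below shows what assembling a model from ready-made components can look like today. It is a hypothetical example assuming PyTorch and torchvision: a network pretrained on generic photos gets a new output layer for a two-class task, while data loading and the training loop are omitted.

```python
import torch.nn as nn
from torchvision import models

# A ready-made "component": a ResNet-18 pretrained on ImageNet photos.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained layers; only the new head will be trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with one suited to our own two-class problem.
model.fc = nn.Linear(model.fc.in_features, 2)
```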

Summary

Undoubtedly, the professions mentioned here will sooner or later appear in the job market. Each of the listed roles will have its specializations. There will be developers specializing in legal issues, and developers who extract knowledge about inconsistencies in technological processes. There will also be intermediary professions combining many of the functions mentioned here: for example, experts in customer behavior combined with the specialization of art animators, or AI implementers who are also model creators. We can safely assume that all the new professions will exploit the new possibilities brought about by the development of artificial intelligence. Once again, people will have the opportunity to demonstrate creativity and adaptability in a new work environment.


Wojciech Moszczyński

Wojciech Moszczyński is a graduate of the Department of Econometrics and Statistics at Nicolaus Copernicus University in Toruń, specializing in econometrics, finance, data science, and management accounting. He specializes in optimizing production and logistics processes. He conducts research in the field of the development and application of artificial intelligence. He has been engaged in popularizing machine learning and data science in business environments for years.


Artykuł New professions that emerge with the development of artificial intelligence pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>
How Recurrent Neural Networks Work https://sigmaquality.pl/my-publications/how-recurrent-neural-networks-work/ Sun, 03 Nov 2024 08:09:12 +0000 https://sigmaquality.pl/?p=8421 Data Science in the Milling Industry An artificial neural network is a copy of the naturally existing neural network of the brain. It can [...]

Artykuł How Recurrent Neural Networks Work pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>



Data Science in the Milling Industry

An artificial neural network is modeled on the naturally occurring neural network of the brain. It can achieve a level of reasoning unattainable for the average person. However, a neural network is not the kind of intelligence we are accustomed to. It is an information system that, like a production machine, perfects itself in a narrow specialization, achieving very high efficiency.

An artificial neural network is built from many layers of neurons that communicate with one another. Neural networks learn through a process of recurrence, meaning the repeated performance of the same simple calculations, improving the accuracy of the estimations slightly each time.

How Does a Recurrent Neural Network Work?

Natural and artificial neurons function in a very similar way. They are a kind of relay box. Information flows into a neuron. Whether the neuron transmits this information further or retains it depends on the intensity of the incoming information. This intensity is determined by the weights assigned to the information. In biology, the weights assigned to stimuli are the intensities of electrical charges.

The functioning of a single neuron can be compared to the reactions of a sleeping cat. The cat may be sleeping on the carpet in a room. Various sounds reach it: the television, conversations among people, the noise of a dishwasher. Yet just a gentle scratching is enough for the cat to open its eyes wide and perk up its ears. A single neuron operates the same way: it reacts only to those stimuli that have significance for it. Training a neural network, in turn, is like teaching a cat to catch mice, a cat that for some reason did not inherit this skill from its ancestors.

Different signals are supplied to the neural network simultaneously. Often, large numbers and small fractions, as well as zero and one values, influence it at the same time. Before delivering information to the neurons, numbers must be standardized. Standardization, in its simplest form, involves processing numbers so that their distribution becomes a distribution with a mean of zero and a standard deviation of one. An artificial neural network accepts all standardized signals and initially assigns them random weights.
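
In its simplest form, standardization fits in a few lines. The sketch below uses NumPy on made-up numbers; in practice, a library routine such as scikit-learn's StandardScaler does the same job.

```python
import numpy as np

# Made-up input signals: large numbers, small fractions, zeros and ones.
X = np.array([[1200.0, 0.03, 1.0],
              [ 950.0, 0.07, 0.0],
              [1430.0, 0.02, 1.0]])

# Standardize each column: mean 0, standard deviation 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_std.mean(axis=0))  # approximately 0 for every column
print(X_std.std(axis=0))   # 1 for every column
```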

The learning of an artificial neural network consists of gradually changing the weights assigned to individual pieces of information. Gradually, the most important information is sharpened, while the least important information is dulled.

A sample neuron receives three signals: x1, x2, x3. Each of these pieces of information is assigned weights: w1, w2, and w3. In the diagram, I placed the Greek letter Σ, which represents a sum. The neuron sums all information x1, x2, x3 strengthened or weakened by weights w1, w2, w3, emitting a value Z at the output. This phenomenon can be described by the simple formula:

Z = w_1 \cdot x_1 + w_2 \cdot x_2 + w_3 \cdot x_3

The total excitation signal of the neuron Z, that is, the weighted sum of signals, travels to the activation function. The value Z is called the postsynaptic potential (PSP). The activation function can be any simple mathematical function with a single argument Z. It is assumed that the activation function should not be a linear function.

In the diagram, this function is expressed as: f(z)

Whether the activation function becomes excited depends on the intensity level of the total postsynaptic signal Z. Just as the cat reacts to the sound of a scratching mouse, the neural network learns to distinguish significant information from noise.
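
The behavior of such a single neuron can be sketched in a few lines of Python; the weights and input signals below are arbitrary illustrative numbers.

```python
import numpy as np

x = np.array([0.5, -1.2, 3.0])   # incoming signals x1, x2, x3
w = np.array([0.8,  0.1, 0.4])   # weights w1, w2, w3

Z = np.dot(w, x)                 # postsynaptic potential: w1*x1 + w2*x2 + w3*x3

def f(z):                        # a simple non-linear activation function (here: ReLU)
    return max(0.0, z)

output = f(Z)                    # the neuron "fires" only if Z is large enough
```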

How Does a Neural Network Learn?

The network initially accepts random weights. Then, using them, it processes information and checks it against the target value programmed in the loss function. Based on the results of the calculations, the algorithm adjusts the settings and recalculates everything anew. An artificial neural network repeats this same simple computational process hundreds of times, each time changing the level of the weights of the individual input variables. The network adjusts weights based on successive levels of the loss function. Each neuron receives sets of external variables from 1 to n and calculates an output value y.

Training a neural network is classified as supervised learning. Let’s assume we want to create a model for forecasting grain prices on a commodity exchange. We have information from the previous year about rainfall levels, temperature, direct payment amounts, corn prices, and a range of other data. Our output variable in the model is the price of the grain.

All the mentioned information flows into each neuron, and each neuron calculates the output value in the form of price y. Each input variable has its own set of weights for each layer. Supervised learning means that we train the model on historical data. We input historical input data into the model, and the model performs calculations of the output value and then checks whether the calculated theoretical value is close to the historical empirical value.
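
A minimal sketch of such a supervised model, assuming Keras and purely synthetic stand-ins for the rainfall, temperature, and price data, might look like this:

```python
import numpy as np
from tensorflow import keras

# Synthetic stand-ins for historical data: 4 input variables, grain price as output.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))          # rainfall, temperature, payments, corn price
y = X @ np.array([2.0, -1.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=500)

model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1),             # output: the forecast grain price
])
model.compile(optimizer="adam", loss="mse")   # loss: mean squared error

# Supervised learning: fit theoretical outputs to historical empirical values.
model.fit(X, y, epochs=50, verbose=0)
```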

The recurrent neural network repeats the computation loops hundreds of times. Each neuron in the layer performs similar calculations described by the following equation:

Z_i = a_i \cdot w_i + Z_T

where a_i is the activation of the neuron, calculated as:

a_i = f(Z_i)

Each time, the results are confronted with the activation function. The calculations are carried out as matrix operations.

What Role Does the Activation Function Play in Learning?

The activation function is a very important element of the neural network's learning process. It must be simple, because its simplicity greatly affects the speed of learning. Currently, in deep learning, the ReLU (Rectified Linear Unit) function is most commonly used. Slightly less frequently, sigmoid and tanh functions are used.

In the illustration below, we see the ReLU function. A neuron is activated when the postsynaptic potential reaches the value n. As can be seen, the value n is added artificially: the neuron activates after the value n on the X-axis is exceeded.
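
For reference, the three activation functions mentioned here are all one-liners in NumPy:

```python
import numpy as np

def relu(z):     # most common in deep learning
    return np.maximum(0.0, z)

def sigmoid(z):  # squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):     # squashes values into (-1, 1)
    return np.tanh(z)
```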

Loss Function

This is the primary source of feedback about progress in learning. With each iteration of the neural network, a calculation result is generated. Since we are conducting supervised learning, we know what the result y should be for each subsequent record of the input variables x1, x2, x3, … xn.

The neural network calculates theoretical results ŷ. It then compares them with the historical values y. The loss function is most often the sum of squared differences between the theoretical values ŷ and the empirical values y.

The purpose of the loss function is to indicate how much the theoretical results differ from the empirical results.

\text{loss} = \sum_{i} (y_i - \hat{y}_i)^2

After each of the hundreds of iterations of the neural network, an assessment appears in the form of the loss function's value. Based on this, the network adjusts the weights, striving to minimize the next value of the loss function. The network performs as many iterations as its programmer indicates. The greatest progress in learning occurs at the beginning of the training.

It’s like a musician practicing thousands of times a sonata on the violin, each time performing it better. In the end, they refine their piece, at which point progress is no longer noticeable to the untrained ear.

Gradient Descent Principle

Finally, it should be mentioned how the neural network learns based on the loss function. Weights are adjusted using the gradient descent principle, which is based on differential calculus and the derivative of the function.

Let's imagine a vast green valley. From one of the peaks surrounding the valley, we release a ball. The ball rolls down, bouncing off hills and irregularities, and stops in some depression without reaching the bottom of the valley. The ball can be knocked out of such a local minimum and continue to fall. Our goal is for the ball to roll to the bottom of the valley. However, this does not always succeed.

This is a way to imagine the minimization of error using the gradient descent method. With each iteration of the neural network, the partial derivative of the loss function is calculated for each parameter of the network. The derivative of a function tells us whether the function is increasing or decreasing. Thanks to this, the balls symbolizing the parameters of our neural network model "know" in which direction the bottom of the valley lies. This way, the network knows in which direction to minimize the deviations. After each iteration, the gradient indicates the direction of optimization for the individual weights.
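
The whole mechanism (loss, gradient, and weight update) can be condensed into a short sketch. The example below fits three weights of a linear model on synthetic data; the learning rate lr is the "length of the step" discussed in the next section.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # 100 observations, 3 inputs
w_true = np.array([1.5, -2.0, 0.7])
y = X @ w_true + rng.normal(scale=0.1, size=100)

w = np.zeros(3)                                # start from arbitrary weights
lr = 0.1                                       # learning rate (step length)
for _ in range(200):                           # iterations of the training loop
    y_hat = X @ w                              # theoretical values
    grad = 2 * X.T @ (y_hat - y) / len(y)      # partial derivatives of the loss
    w -= lr * grad                             # step in the downhill direction

print(w)                                       # close to w_true
```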

Learning Rate

The learning rate of the network is defined for all neurons as the length of the step they can take during each iteration. If these steps are small, learning may take a very long time. Worse still, the optimization may get stuck in local minima and lack the momentum to escape them.

Referring to our example of the green valley, our ball may fall into a hole and never reach the bottom of the valley. Kicking the ball too hard, on the other hand, may make it bounce repeatedly over the bottom of the valley. The learning process then becomes somewhat chaotic.

The General Form of a Multilayer Neural Network

The diagram below shows the theoretical appearance of a multilayer neural network. Four independent variables flow into the network from the left side, creating the input layer of neurons. Information flows into subsequent internal layers, each adjusting the importance of the information through the level of weights assigned to these pieces of information. The information reaches the output layer, where the theoretical results are verified against empirical (historical) values. The process then returns to the starting point. Utilizing the gradient descent method, the network adjusts weights in the subsequent layers to reduce the sum of squares of the model’s errors.

Source of the Diagram: Araujo Vinicius, Guimarães Augusto, Campos Souza Paulo, Rezende Thiago, Araujo Vanessa. “Using Resistin, Glucose, Age and BMI and Pruning Fuzzy Neural Network for the Construction of Expert Systems in the Prediction of Breast Cancer” Machine Learning and Knowledge Extraction (2019).

Application of Neural Networks

The best example of the application of neural networks is image recognition, a task that other classes of machine learning models handle poorly.

A network can learn to recognize bicycles in photos, even though it has no knowledge of bicycles, does not know their purpose, and how they are constructed. The network receives several hundred images of bicycles on the streets and photos of streets without bicycles. The photos with bicycles are labeled as one, and the photos without bicycles as zero.

Each photo consists of thousands of pixels; the network assigns a weight to each cluster of pixels, which it then adjusts. The network recognizes the shape of the wheel, the saddle, and the handlebars. It also identifies the characteristic positioning of people’s bodies on bicycles. By repeating the review of photos hundreds of times, it finds patterns and dependencies. Similarly, networks recognize customer behaviors in a store, finding patterns of behavior, identifying through movements whether a customer is indecisive or convinced.

The primary goal of creating artificial neural networks was to solve problems at the level of general abstraction, as the human brain would do. However, it turned out that it is more effective to use networks for specialized tasks. In such cases, neural networks significantly surpass human perception.

Training a high-class specialist takes a lot of resources and time. Human work is expensive and error-prone. Models based on neural networks can diagnose diseases and faults better than the human mind, utilizing enormous information resources from around the world. Such systems rarely make mistakes, their marginal cost of operation is low, they can work 24 hours a day, and they can be replicated.

Two hundred years ago, machines began to replace humans in tedious jobs, working faster, better, and more efficiently. Now we are witnessing machines starting to replace us in intellectual activities.

Wojciech Moszczyński

Artykuł How Recurrent Neural Networks Work pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>
The problem of food waste in the food industry https://sigmaquality.pl/my-publications/the-problem-of-food-waste-in-the-food-industry/ Sun, 03 Nov 2024 07:51:05 +0000 https://sigmaquality.pl/?p=8411   Food Waste Food waste constitutes a huge problem at various stages of production, distribution, and consumption of products. A significant portion of food is [...]

Artykuł The problem of food waste in the food industry pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>

Food Waste

Food waste constitutes a huge problem at various stages of production, distribution, and consumption of products. A significant portion of food is wasted before it reaches retail outlets. Sometimes goods are never shipped to sellers due to strict quality standards, appearance, or currently unfavorable market prices. Such items are usually designated for disposal. Some products are lost in the process of transport, transshipment, or improper storage in warehouses. A large portion of the effort associated with production, as well as the energy, raw materials, and natural resources used, is irreversibly wasted.

A significant amount of the food found in retail outlets is discarded after passing its expiration date. This is a serious issue because it results in an irrevocable loss of resources and human labor and, most importantly, it often leads to environmental damage. Most of the discarded food contributes to the greenhouse effect, as greenhouse gases such as methane are released during the decomposition of waste in landfills. Excessive agricultural production that does not meet market needs is associated with the overuse of nitrogen fertilizers, which leads to frequent contamination of groundwater and degradation of the seabed.

Dilemmas of Food Producers and Distributors

Market participants face significant questions. Discarding food after its expiration date by supermarkets is often more profitable than giving away this food for free shortly before its expiration date. A customer who receives a product for free or buys it at a fraction of its value will not be inclined to purchase that product again at the normal price in the future. If customers do not buy goods because substitutes are given for free, then the store’s sales of regular goods decline. This leads to an increase in food waste. Let’s assume that for some reason, supermarkets decide to forgo part of their revenue and donate some products nearing their expiration date for free. This will result in a significant reduction in the volume of orders sent to food producers. The decline in turnover for economic reasons is not good for either producers or distributors. Both parties, both production plants and stores, will strive to maximize the turnover of goods, despite the threat of potentially large-scale waste. To reverse this situation, the European Commission will likely have to implement intervention processes that will disturb free market principles to a greater or lesser extent.

This is the first article dedicated to the issue of environmental protection, recycling, and sustainable development in the food industry. Limiting negative climate changes, poverty, and environmental contamination are currently the most important goals of the European Commission. In the coming years, one of the industries most vulnerable to changes in European law will be those that focus on food producers, processors, and distributors.

As part of our plan, we will address selected aspects of building a sustainable food economy in subsequent publications in this series. Today, we will attempt to clarify the issue of food waste and the disposal of expired products and raw materials. We will discuss the disposal of food due to failure to meet stringent standards concerning, among other things, the appearance of products. We will also mention the problem of discarding consumable products in the tourism and restaurant sectors. In future publications, we will tackle the relatively low scale of secondary raw material use and the lack of a closed loop of raw materials in industrial and distribution processes. In our approach, we pay particular attention to the direction of change concerning the recycling of food packaging and the collection of organic secondary raw materials. We will also discuss the EU initiative on ESG (Environmental, Social, and Corporate Governance) and the closely related issue of responsibility for the natural environment and climate protection in the investment process and the related acquisition of funding sources. In subsequent publications, we will address widespread social awareness regarding the waste of natural resources, waste segregation, and the significance of social pressure on the creation of environmentally and climate-friendly European law.

Artykuł The problem of food waste in the food industry pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>
We can earn a lot by assigning customers to clusters on e-commerce platforms https://sigmaquality.pl/my-publications/we-can-earn-a-lot-by-assigning-customers-to-clusters-on-e-commerce-platforms/ https://sigmaquality.pl/my-publications/we-can-earn-a-lot-by-assigning-customers-to-clusters-on-e-commerce-platforms/#comments Sat, 02 Nov 2024 20:14:24 +0000 https://sigmaquality.pl/?p=8401 Clustering Customers on E-commerce Platforms Clustering customers on e-commerce platforms involves grouping them based on behaviors, preferences, and other shared characteristics. We look for [...]

Artykuł We can earn a lot by assigning customers to clusters on e-commerce platforms pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>



Clustering Customers on E-commerce Platforms

Clustering customers on e-commerce platforms involves grouping them based on behaviors, preferences, and other shared characteristics. We look for patterns of behavior, such as purchasing methods, amounts typically spent, or interests. By skillfully utilizing clustering, we can significantly increase the average size of purchases made by individual customers while simultaneously retaining customers in our store.

What are the key benefits of clustering?

Clustering allows for the creation of more personalized offers that are better tailored to the needs and preferences of different customer groups. For example, customers who frequently buy electronics may receive promotional offers for new gadgets, while book lovers might receive literary recommendations.

Understanding the preferred forms of communication for various customer groups allows for more effective outreach. For some customers, SMS messages are mere spam, while for others, they are a valuable source of information. Better alignment of offers with customer needs increases the likelihood of purchase, which translates to higher revenues.

A customer who finds what they are looking for quickly and effortlessly is satisfied. Every organism has evolved techniques for conserving its own energy and resources. When a customer quickly and effortlessly finds the item they are looking for, they will return, because such shopping proves economical in terms of energy: easy and accurate. Customers who feel that their needs are understood and met tend to return to the sales platform more often.

Clustering leads to increased efficiency in managing marketing campaigns. It enables more precise market segmentation and the creation of more effective marketing campaigns. For instance, promotions targeted specifically at customers who frequently purchase premium products may be more effective than general campaigns. This allows for better management of the marketing budget by allocating funds to campaigns aimed at the most promising segments or specific customer clusters. To sell well and enjoy customer trust, one must know their customers well.

Understanding Customers
Clustering allows for better analysis of customer behaviors, aiding in the identification of trends and purchasing patterns. Clustering involves finding similar customers and placing them in common sets. It is a method based on machine learning techniques, which is why it is often difficult to clearly understand the criteria for selecting customers into specific clusters. Assigning customers to clusters helps better understand which products are popular among designated groups. This way, one can better forecast future needs and adjust the assortment accordingly.

What are the most common techniques for utilizing clustering?

Knowing the price sensitivity of customers from a specific cluster allows us to offer them more expensive versions of the products they are currently purchasing (known as upselling). Knowing the preferences of customers from a particular cluster allows us to recommend related products (known as cross-selling). Both techniques significantly increase the volume of sales per customer.

Clustering helps understand which products are most frequently purchased by different customer groups, facilitating better inventory management by reducing costs associated with storing unpopular goods.

We Can Earn a Lot by Assigning Customers to Clusters on E-commerce Platforms

Thanks to the effective grouping of customers into clusters, it is possible to increase Customer Lifetime Value (CLV). A range of methods can be employed to encourage customers to remain in the store and to shield them from factors that push them toward leaving.

Carpet Bombing or Precision Ammunition?

Perhaps the worst thing a company in the e-commerce sector can do is bombard customers with an overwhelming amount of information. One of the companies for which I developed a recommendation system had a habit of bombarding customers with additional proposals, discounts, and all sorts of offers after a successful transaction. Marketers operated under the assumption that the more flyers and ads they sent, the greater the likelihood something would yield results. The effect was easy to predict.

Precisely targeted ads are more effective and cost less than mass campaigns that may not reach the right audience. Mass sending of more or less random offers leads to customer irritation and discouragement. On the other hand, customers who receive personalized recommendations and offers are more likely to make a purchase, which increases the conversion rate.

Clustering customers on e-commerce platforms is a strategic tool that helps better understand customers, personalize offers, optimize marketing campaigns, and manage the assortment, ultimately resulting in higher revenues and profits. An equally important benefit is the increase in customer retention in the store.

How to Start Grouping Customers into Clusters?

As I mentioned earlier, to effectively know your customers, you must assign them to clusters. Most people believe that assigning specific individuals to certain sets is grouping. Most will also say that grouping involves identifying certain characteristics of individual people and, based on one or more traits, assigning those individuals to specific sets, segments, or groups.

Yes, one can accept such a definition of grouping. So we can group our customers into women and men, and select age ranges: "very young," "young," "middle-aged," "older." We could then cross these categories and create groups of customers such as "young men," "older men," "middle-aged women." In this case, we have a customer population described by two traits: age category and gender. We can add another category, such as "profession," and, for example, "frequency of visits to the store." We would then see new groups, such as "middle-aged men, plumbers, visiting the store three times a month."

But what do we do if we have not four traits but 40 or even 140? In the age of computerization, customers are automatically described with dozens of different traits. These are fixed traits, such as place of residence, type of activity, or gender, as well as variable traits resulting from customer behavior in the online store: "decision class" (how quickly the customer makes decisions on the website) or shopping-efficiency class (whether a customer always buys or sometimes just browses the site without purchasing). There are many customer traits resulting from behavior. The most common analyses focus on the average spending level in the store, the frequency and regularity of visits, and the intervals between individual visits. All these behaviors are subject to classification. This means, for example, that a customer in decision class 1 almost always buys, while a customer in decision class 10 almost never buys. One can analyze an almost infinite number of behaviors. A precise analysis of customer behavior alone can easily yield several dozen traits.

Customers are clustered to identify individuals with common characteristics. Now let's imagine trying to group customers with filters based on hundreds of collected traits.

Grouping is Not Clustering

Clustering and grouping are terms often used interchangeably. However, they are very different methods of preliminary data analysis. Here’s how they differ.

Clustering is a machine learning technique that involves grouping similar objects into sets (clusters). The goal of clustering is to find structures in the data. It is an unsupervised method, meaning it learns without labeled examples or a predefined target variable.

Grouping is the process of organizing elements into groups based on certain criteria, such as gender, age, or interests. Grouping is a broad concept and can refer to various techniques, including clustering. In general, grouping is usually done with simple filters. In contrast, clustering is used for exploratory analyses, where the goal is to discover natural groups or structures in complex data. Typically, we do not understand at first what the individual identified groups mean. One could say that similar, or even identical, objects are being sought: the model tries to find structures in the data based on subtle and complicated similarities, and the similarity criteria may be too complex for a human to grasp. That is why clustering validation is applied.

In summary, clustering is most often used, for example, in customer segmentation, grouping genes with similar functions, or identifying subgroups within communities in social networks. In contrast, grouping is used, for instance, when organizing email messages into categories: spam vs. non-spam, simple groups like women – men, or market segments such as traditional channel – modern channel.

In conclusion, clustering is a specific grouping technique concerned with finding structures in data without supervision. Grouping is a broader concept that can include various techniques and methods, both supervised and unsupervised. Undoubtedly, simple grouping is not effective enough for customer analysis, and grouping based on many criteria is extremely difficult and impractical.
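
As a minimal illustration of the difference, a clustering algorithm such as k-means takes a table of customer traits and returns cluster labels without any predefined criteria. The sketch below uses scikit-learn on random placeholder data; the number of clusters is an arbitrary choice here.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder data: 1,000 customers described by 40 numeric traits.
rng = np.random.default_rng(7)
customers = rng.normal(size=(1000, 40))

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(customers)   # cluster number for every customer

print(np.bincount(labels))               # how many customers fell into each cluster
```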

PCA Method
To create a few traits from a cloud of dozens describing customers, which will allow for effective assignment to clusters, the PCA (Principal Component Analysis) method should be used.

PCA is a statistical technique used for dimensionality reduction and for simplifying the analysis of large sets of variables. A dataset often contains many variables, which can be challenging to analyze. The goal is to transform the information into principal components that contain most of the information. First, the data is standardized, meaning that each variable is rescaled to a mean of 0 and a standard deviation of 1. This is important because the PCA method is sensitive to the scale of variables. Next, a covariance matrix is created, which shows how each pair of variables is related; covariance measures how individual variables interact in terms of direction and strength. Of course, from the perspective of the business owner, the inner workings of the algorithm do not matter. What is essential to know is that, for example, from 150 customer traits the PCA method has extracted 4-5 principal components. The first component contains the most informative, most distinctive customer traits; the subsequent components contain less important information about customer specifics.

The principal components are independent of one another and contain most of the information from the original data. They can be used for further analysis, such as data visualization or modeling.

The PCA method helps identify hidden patterns and relationships between variables. It reduces noise and redundancy in the data, which can improve the performance of analytical models. Thus, PCA is a powerful tool in data analysis that helps simplify complex datasets while retaining as much relevant information as possible. It spares the analyst from making arbitrary decisions about which variables to eliminate from a vast cloud of information in order to make further analysis manageable.
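
In code, the whole procedure (standardize first, then reduce) is short. Here is a sketch assuming scikit-learn and a random placeholder matrix of customer traits:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
traits = rng.normal(size=(1000, 150))    # placeholder: 150 traits per customer

X_std = StandardScaler().fit_transform(traits)  # PCA is sensitive to scale

pca = PCA(n_components=5)                # compress 150 traits into 5 components
components = pca.fit_transform(X_std)

# Share of the original information captured by each principal component.
print(pca.explained_variance_ratio_)
```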

Clustering Community Data
A year ago, I published an article on the platform „Medium” titled „Segmentation of a Population Containing Very Many Features. PCA Analysis and Clustering by k-means Method,” describing how to cluster objects with a large number of features.

The analysis focused on the „Communities and Crime” database containing information about 1,994 communities in the United States. The database was created in the context of analyzing the occurrence of criminal events. The objective was to identify the characteristics influencing crime levels in these communities. Each of the 1,994 communities in the database had 124 traits. This situation is very similar to having a large number of customers on an e-commerce platform described by a vast number of traits.

The aim of the task was to find communities that are similar in terms of their characteristics. The goal was to group similar communities and assign them to specific clusters. This enabled the application of special treatment methods tailored to specific clusters. It also allowed for the use of effective benchmarking tools: comparing communities within clusters and finding niches and anomalies in certain areas. This method is very similar to clustering customer populations.

The detailed process in Python is described in the aforementioned article. The communities were assigned to seven clusters. To verify whether the clusters were indeed different from one another, it was necessary to compare them based on their traits.

Example of Clustering Quality Assessment

For example, we could create 4 groups of workers: “white plumbers,” “black plumbers,” “white taxi drivers,” “black taxi drivers.” We would then place these individuals in a table, where one column would indicate the profession and rows would indicate skin color, thereby creating very clear group divisions.

The same applies to assessing the quality of clustering. A scatter plot is created, and the clusters are analyzed concerning two traits.

In our case, we had over 100 traits describing the individual communities. Therefore, we used the previously described PCA method to consolidate these traits into a few, in our example 7, principal components. Thus, when comparing principal component 1 to principal component 2, we see distinct areas of color. Each dot represents one community. Colors indicate clusters.

The above plot shows that clusters 1, 2, and 3 differ significantly in position with respect to the two principal components. The problem arises with cluster 7, whose communities are assigned ambiguously: they intermingle with clusters 6 and 4. It is noteworthy that visibly the fewest communities are assigned to cluster 7, while the most are assigned to clusters 1, 6, and 2. In the PCA methodology, the first two components carry the most cognitive weight, as they contain the most information.

Subsequent plots compare the first PCA component with the next, less significant components. It appears that a high quality of community assignment to specific clusters has been maintained: the communities in the different clusters evidently differ from one another.
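
A sketch of this kind of visual check, reusing the components matrix from the PCA sketch above and clustering it with k-means, might look as follows; the choice of seven clusters mirrors the example in the article.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Cluster in the reduced PCA space, then inspect the first two components.
labels = KMeans(n_clusters=7, n_init=10, random_state=0).fit_predict(components)

plt.scatter(components[:, 0], components[:, 1], c=labels, s=8, cmap="tab10")
plt.xlabel("Principal component 1")
plt.ylabel("Principal component 2")
plt.title("Cluster separation in the space of the first two components")
plt.show()
```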

Summary

The worst thing a company operating in the e-commerce industry can do is bombard customers with random promotions, flyers, and information. We live in an age of information overload, and most of it is treated as waste. A customer who has to deal with removing unnecessary offers or ignoring irrelevant suggestions is unnecessarily burdened, which leads them to leave our store in search of a place where shopping takes less effort.

Conversely, a customer who receives accurate suggestions and offers will conserve their energy in searching for products, making them more willing to shop at our store.

One can therefore assume that an effective customer relationship through relevant suggestions and appropriate offers is key to retaining them in our store.

To create relevant offers, we must know the customer. However, it is inefficient to build an individual offer for each of them. Such techniques do exist, and everything depends on the scope of the offer being created, but in most cases offers are built for entire groups of customers. Customers in clusters share a high level of similarity, so a whole strategy for building relationships can be constructed for the individuals in a specific cluster. The foundation for such algorithms is the accurate assignment of customers to clusters. Since customers are described in databases by dozens of different traits and metrics, it is impossible to create such groups through filtering. Machine learning methodology comes to the rescue, effectively finding similar individuals.


Wojciech Moszczyński

Wojciech Moszczyński is a graduate of the Department of Econometrics and Statistics at Nicolaus Copernicus University in Toruń, specializing in econometrics, finance, data science, and management accounting. He specializes in optimizing production and logistics processes. He conducts research in the field of the development and application of artificial intelligence. He has been involved in popularizing machine learning and data science in business environments for years.

Artykuł We can earn a lot by assigning customers to clusters on e-commerce platforms pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>
https://sigmaquality.pl/my-publications/we-can-earn-a-lot-by-assigning-customers-to-clusters-on-e-commerce-platforms/feed/ 1
Artificial Intelligence in the Grain and Milling Industry? https://sigmaquality.pl/my-publications/artificial-intelligence-in-the-grain-and-milling-industry/ Sat, 02 Nov 2024 20:07:27 +0000 https://sigmaquality.pl/?p=8395 Artificial Intelligence The dynamic development of artificial intelligence and its proliferation among entrepreneurs offers an extraordinary opportunity to streamline production and logistics processes. This [...]

Artykuł Artificial Intelligence in the Grain and Milling Industry? pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>


Artificial Intelligence

The dynamic development of artificial intelligence and its proliferation among entrepreneurs offer an extraordinary opportunity to streamline production and logistics processes. This technology can also serve as a basis for gaining a competitive edge. To explore opportunities for surpassing competitors in the profitability and efficiency of production processes using artificial intelligence, one must first understand the main directions and possibilities this technology offers. In this paper, we will discuss how to harness artificial intelligence for intensive work within just one hour.

YES, in an hour you will have your own cybernetic employee who will do exactly what is needed here and now.

Can I be a pioneer of change?
A monthly subscription to the most common artificial intelligence model, Chat GPT 4, costs $20. Paid users of Chat GPT can use plugins and programs created by other users. Moreover, users can create their own plugins or define assistants. No programming skills are needed: programming assistants and plugins for Chat GPT 4 is done by writing instructions. The instructions take the form of sentences and can be written in Polish. In summary, everything is very easy and straightforward, requiring virtually no programming or analytical qualifications. So, let's proceed to the possible applications of artificial intelligence in the grain milling industry.

Let's start by defining the most important cyborg tools, which we can specify ourselves and which can greatly assist us in our work. These tools will be able to quickly search for the most important information. They will also draft content, analyze data, and compare different types of information.

Financial Assistant GPT
Only those who have purchased a subscription can create their own assistants in Chat GPT 4. We can design an assistant and attach to its memory all our notes, documents, and reports that we have written over the last 15 years. We can also attach contact lists—everything that will be necessary for its functioning.

Let’s assume we want to create an analytical assistant, a virtual person who will be an expert in analyzing reports. When programming such a virtual analyst, we can start with the following instruction: „I am the director of a grain elevator base, and I expect you to help me analyze the reports of grain intake and distribution from the elevators.”

Artificial Intelligence in the Grain and Milling Industry?

Next, we define what behavior we expect from our virtual analyst: „At first, ask me to send reports based on which you will conduct the analysis.” At this moment, when we activate the assistant, it will ask us for the current reports. We attach the reports in the window where we enter commands. The format of the reports can be anything; they can be documents in CSV, PDF, or Word format, spreadsheets, or even photos. In this case, we can send cash and inventory reports. Theoretically, the assistant can automatically retrieve such reports from the system, but that requires a bit of tinkering.

The next instruction should be: „Then ask me what format of reports you should prepare. Ask what you should analyze.” Here are two types of analyses. „Ask me to choose one of these analyses: 1. Daily report of intake and distribution of raw materials. Provide: how many tons of each type of grain were received, how much was paid for each type of grain, what is the account balance at the beginning and end of the day, and the inventory status at the beginning and end of the day. Also, provide the number of vehicles weighed on the scales. Place all information in a table and export it to MS Excel format. Respond in Polish. 2. Report on wholesale prices on the commodity exchange. Provide the prices of each type of grain at collection points on commodity exchanges and in sales, provide our prices, and compare them with the prices from the commodity exchange. Also provide the exchange rates for the euro and dollar. Place the data in a table. Save the table in PDF format.”

We save the instruction and close the assistant editor. What do we get? We can visit the Chat GPT website from both a computer and a phone and click on the analytical assistant. A polite voice asks us to provide the daily information on which it can perform the analysis. After receiving the daily reports, it asks which analysis we would like to hear. In the instructions, we defined two analyses. The assistant reports everything that has been defined for it, and at the end it saves all the information it conveyed in table form. (Source: Chat GPT 4 https://chat.openai.com/g/g-Hoc1c7dIy-sticker-wizard)

The behavior described above is predefined. However, we can also ask the analytical assistant to perform an analysis that we have not defined. We can tell it to analyze the number of weighings and the number of vehicles that entered the scale; we will then know how many vehicles were weighed multiple times. We can directly ask how many vehicles were weighed multiple times and to whom they belonged. The number of analysis variants, possible instructions, and interactions is unlimited, or rather limited only by the data we provide directly to the assistant. As for external information, we can place links to commodity exchange portals in its memory so that it can retrieve data independently.
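
For readers who do prefer the programmatic route, roughly the same assistant can also be defined through an API. The sketch below assumes the OpenAI Python SDK and its beta Assistants endpoint as available at the time of writing; the model name, the file handling, and the instructions are simplified placeholders, and the no-code editor described above achieves the same result.

```python
from openai import OpenAI

client = OpenAI()

# Upload a report the assistant should be able to search through.
report = client.files.create(file=open("daily_report.pdf", "rb"),
                             purpose="assistants")

assistant = client.beta.assistants.create(
    name="Grain elevator analyst",
    model="gpt-4o",  # placeholder model name
    instructions=(
        "I am the director of a grain elevator base. Help me analyze reports "
        "of grain intake and distribution. First ask me to send the current "
        "reports, then ask which of the defined analyses to prepare."
    ),
    tools=[{"type": "file_search"}],  # lets the assistant read uploaded files
)
```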

Strategic Assistant
The strategic assistant is another variant of the Chat GPT assistant. A strategy specialist is someone who, based on years of research, analysis, definitions, and all kinds of conditions that have been gathered in the past, creates certain probabilities, certain forecasts, based on which strategic decisions are made. What should we do to define this kind of virtual specialist? Above all, we should provide it with all our documents that are significant for us in creating strategies. We can upload many books in PDF format to its memory. We can provide it with financial data. In the case of this data, we should prefer documents in the form of management reports. Chat GPT still has a rather weak understanding of typical, extensive spreadsheets. Therefore, financial and quantitative data should take the form of documents rather than spreadsheets for the virtual assistant.

We program the strategic assistant similarly to the analytical one, but it is very important to include the following protocol in the instruction: „When creating strategies and answering any of my questions, you must use the resources that have been uploaded to your memory. Use all documents, PDFs, reports, and spreadsheets for the analyses that I request.” Of course, these sentences can be formulated differently. The important thing is to refer to the resources that have been uploaded to the memory of the strategic assistant. Without this information, the assistant will not use specialized knowledge that results from the experience of a particular enterprise or director. If we do not upload our own knowledge sources, the assistant will rely on general knowledge.

Human Resources Assistant
In a similar way, we can define assistants dealing with human resources. In this case, we should upload our own notes analyzing the labor code, our analyses, personnel records of individual employees, current reports related to working time, and so forth. The assistant can provide information on how many hours a given employee has worked, when they were last on sick leave, what their salary is, or how much they are entitled to as a bonus for that month. Besides simple information that the human resources assistant can provide, we can ask about individuals who always take sick leave before Easter or we can request information about which employees worked together in the same company in the past.

The assistant will analyze all the CVs of individual employees and generate a summary. A necessary condition is to upload all the needed information to the assistant’s memory and, as I mentioned, to indicate to the assistant that it should use the documents that have been uploaded to it.

In a similar way, one can create any assistant, a virtual employee who will work effectively and according to our expectations. Therefore, we can hire a technologist, a security specialist, a logistics expert, or a virtual warehouse assistant. Above all, we must convey our expectations and define what they must receive from us in order to conduct analyses, submit reports, clarify, or identify concerning phenomena. We can create an article editor assistant that will search for the most interesting information from publications available on the internet. We can define that this assistant must provide the five most important points from each encountered publication related to a given topic. The assistant can also search for information that may pose a threat to our company or industry. It is advisable to upload documents that will indicate to the assistant what it should understand as a threat. We can define a virtual psychologist who will use 40 psychological books uploaded to its memory in PDF files. We can also hire a virtual detective who will conduct investigations for us. We could even have our own doctor who could diagnose us based on the symptoms we provide.

With your own assistant, you can communicate using the computer keyboard. You can also use assistants with voice systems. While driving, you can activate your financial assistant and assign it analyses. If the assistant asks which analysis it should perform (at the beginning, we defined two types of analyses), we can directly say: „Do something different for me, tell me what the prices are for purchasing sugar beets on commodity exchanges.” The assistant will, of course, respond to all questions and send an accurate report of the conversation if we ask it to do so.

Assistant Visualization
Managers create virtual offices for themselves and fill them with robotic assistants. I once saw a visualization in which a manager, as an avatar controlled from the keyboard, walks around the screen and enters different rooms where assistants are seated. Visualizing assistants is a very nice solution, as it brings these robots closer to human form. The minimal way to visualize an assistant is to upload our own image of the assistant's face. A more advanced method is to use a plugin that designs the faces of the individual assistants for us. There are many such plugins; I use a plugin for Chat GPT 4 called Sticker Wizard.

This assistant is generally available in Chat GPT; it is one of dozens of plugins specializing in designing stickers.

Below, I present a few stickers that were created by this graphic assistant.

I would now like to introduce you to three of my assistants. The first is an accountant who answers difficult questions regarding expenses, types of costs, provides me with charts related to expenditures, and informs me about account balances, debts, and payment due dates.

This gentleman with a beard and mustache is the assistant who talks to me during long walks in the woods. While walking my dog, I often discuss current events with him. The assistant is designed to present me with the latest facts along with the economic phenomena currently taking place.

And this gentleman is the assistant who tells me about the history of the Industrial Revolution in Europe. He narrates facts related to the development or decline of entire industries or specific enterprises. During walks, the assistant informs me about the most important events related to industrial history and refers to many economic theories in his narratives.

Below, I present the prompt based on which the program generated the face of my industrial revolution historian. Notice that the command I issued differs slightly from the image generated by the model. Here is the prompt: „Make me a round sticker with the face of a man in a hat, about 50 years old, smiling, in Victorian style. Show this design now. Use colors: black, white, silver.”

Summary
Today, for $20 a month, practically anyone can have whatever employee they want. Of course, I am referring to mental workers; we can create psychologists, assistants telling jokes, or specialists in observing wild birds. The latter will generate images and recognize birds from the photos sent to it. Our virtual office can be filled with all kinds of experts. They can be virtual people who will, for example, proofread texts. Until recently, I considered my two-page CV to be a perfectly crafted masterpiece. The assistant specializing in creating CVs detected many stylistic errors, pointed out that using three types of fonts is unprofessional, stated that there shouldn't be so many colors, noted that the photo was inappropriate, and explained that the styles of the responsibility descriptions for the individual job positions do not align, suggesting that some of them were copied from external sources. This quick analysis of my CV showed that a hastily configured assistant can perform its work many times better than the average person, and even better than a specialist in a given field. Just look at the graphics this machine can generate.

Recently, someone asked me what they should do to implement artificial intelligence in their company. The above publication indicates how to do this within an hour. The information revolution, like any revolution, can be astonishing.


Wojciech Moszczyński

Wojciech Moszczyński is a graduate of the Department of Econometrics and Statistics at Nicolaus Copernicus University in Toruń, specializing in econometrics, finance, data science, and management accounting. He specializes in optimizing production and logistics processes, conducts research on the development and application of artificial intelligence, and has long been engaged in popularizing machine learning and data science in business environments.

Artykuł Artificial Intelligence in the Grain and Milling Industry? pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>
Recommendation systems for online stores https://sigmaquality.pl/my-publications/recommendation-systems-for-online-stores/ Sat, 02 Nov 2024 19:57:02 +0000 https://sigmaquality.pl/?p=8389 Artificial Intelligence Recommendation systems are used to optimize economic activity in the areas of sales and costs. In online sales, there are two entities subjected [...]

Artykuł Recommendation systems for online stores pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>



Artificial Intelligence

Recommendation systems are used to optimize economic activity in the areas of sales and costs. In online sales, recommendation systems address two entities: customers and products.

When discussing the creation of recommendations, we can also talk about building tools that assist in strategic decision-making. This involves analyzing sales channels and directions of development, and eliminating areas that burden the business. However, these types of strategic recommendations fall outside the topic I would like to discuss today.

Thus, the main focus of the basic recommendation system is the customer and the product.

What is the goal of recommendation systems for customers on e-commerce platforms?

Essentially, there are two goals that should be discussed separately.

First – it is about encouraging the customer to purchase more. Recommendation systems can significantly increase the volume of individual customers’ purchases in a simple and inexpensive way.

Second – it aims to increase customer retention rates. The goal here is to prevent customers from leaving the store. It is much easier to lose a customer than to gain a new one, which is why it is worth doing everything possible to retain existing customers.

Both of these goals, increasing purchase volume and retaining customers, are achieved through very different tools. I mentioned that these goals should be considered separately: if we focus on customer retention, we should not simultaneously aim to increase purchasing effectiveness.

On one hand, this is true, because such an approach follows the iron law of optimization: one can optimize only one thing at a time, not many things simultaneously. On the other hand, it contradicts intuition and experience. If recommendation systems have led a customer to purchase more than before, it is likely because the customer was very satisfied with the store. A satisfied customer will stay with the store longer. Of course, there is always a counterargument: customers can be drawn in by promotions and by things they do not need or want. Customers who are manipulated, whether consciously or not, usually lose trust and leave. This is a good argument that a recommendation system must operate on fair principles; only then will it positively influence customer retention alongside revenue optimization.

Recommendation Systems Must Predict the Future

The recommendation system is based on historical events. Its operation relies on a list of transactions made by customers in the past. However, the true goal of a recommendation system is to predict what a customer will do. Predicting the future is the foundation of optimization. This optimization pertains to both the current purchasing process and the future supply of products in the store.

By serving an individual e-commerce customer, the recommendation system can predict which products to suggest to increase the chances of additional sales. At the same time, the recommendation system can help forecast the demand for specific products in the future.

Effective forecasting of the future is therefore the foundation of an efficient recommendation system.

What is the goal of creating recommendation systems for products?

Here we must point out a certain inconsistency. A recommendation system, as the name implies, serves to recommend or suggest something. Can we suggest anything to products? Obviously not. However, it is possible to optimize the structure of these products: we can observe trends and processes using statistical tools. In this way, the product structure in the store can be optimized, sending the right assortment to places where it is in high demand and withdrawing it from places where it is not popular. With a recommendation system for products, it is possible to optimize the sales process, achieving higher profits or lower costs.

The primary entity to which the recommendation system is directed is the customer, as the customer makes decisions and is a somewhat unstable element that must be observed from a predictive perspective.

The recommendation system analyzes the collision of customer behaviors with a kind of “behavior” of products. Each customer has a unique behavior pattern that rarely changes. Products have periods of popularity, and they also have connections with each other. Customers themselves validate products and the relationships between them. The role of data science is to capture these links between customers and products.

The most important tool in recommendation systems is grouping entities into clusters. This applies to both customers and products. In clusters, similar objects are grouped based on certain features, and it is crucial to select those features that matter for the sales process. The above description may seem somewhat convoluted and unclear, so below I will try to clarify, step by step, how to create recommendation systems.

Internal Data Sources for E-commerce Recommendation Systems

The primary data source in e-commerce sales systems is the sales register, which includes customer logins. The purchase history of individual customers is tracked and modeled by data analysts.

Similar registers exist in retail sales, where customers use loyalty cards. According to research I conducted a year ago, about 70% of retail customers use loyalty cards.

Another source of data for e-commerce recommendation systems is static customer data. This data is collected when a customer account is created and is not present in the sales register.

Anonymous Customers

Some customers making purchases do not register with online stores. Sometimes customers do not disclose themselves in the system for various reasons.

Thus, there are transactions without customer ID information. However, these transactions contain a range of information that allows systems to easily match them to a customer ID: the credit card number, the phone number, the delivery address, and the unique address of the device from which the purchases are made. A customer identified by such metadata, even one who did not wish to reveal themselves, can be assigned to their transactions using simple applications operating within the e-commerce system.
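
To illustrate, here is a minimal pandas sketch of such matching; the column names (customer_id, phone) and data are purely hypothetical, and a real system would combine several metadata keys and fuzzier matching:

import pandas as pd

# Registered customers with the metadata they left behind (illustrative)
customers = pd.DataFrame({
    "customer_id": [101, 102],
    "phone": ["600111222", "600333444"],
})

# Anonymous transactions carrying the same metadata fields
anonymous = pd.DataFrame({
    "transaction_id": [1, 2, 3],
    "phone": ["600111222", "600999000", "600333444"],
})

# A left join on the shared metadata key assigns customer IDs where possible
matched = anonymous.merge(customers, on="phone", how="left")
print(matched)  # transaction 2 stays unmatched (customer_id is NaN)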

It should be noted that customers who see that they can be identified by intelligent recommendation systems, even though they did not disclose their data, may lose trust in the store, which could lead them to leave.

Multiple Accounts Customers

Multiple account situations occur when one customer account serves multiple people. For instance, a production company has one ID code in the online store, and purchases are made for all departments of the company. Hydraulic pipes, valves, mattresses, power tools, and paints are purchased. Worse still, purchases are often made by different people with varying behavior patterns. Private purchases can also be made from a corporate account.

In general, such accounts disrupt the efficiency of e-commerce systems. They should be eliminated, just like outlier values in modeling. Unfortunately, they are often high-turnover accounts. One way to deal with them is to encourage the company to split its purchases into several specialized accounts in exchange for discounts and promotions.

External Data Sources

Alongside the information stored in the online store’s databases, external data is also important. Unfortunately, customers do not make decisions solely based on what they see in the store. A large part of their decisions is based on external information. If someone previously saw dishwashing liquid for 5 PLN in another online store, they will not be tempted to buy dishwashing liquid for 8 PLN. A recommendation advertisement that appears at the bottom of the purchasing window with dishwashing liquid for 10 PLN is likely to discourage the customer rather than encourage further purchases.

The same customer will be tempted to buy dishwashing liquid for 8 PLN if they receive something in return. To propose an attractive offer, the system must be perfectly situationally aware. It must know the customer, but it also must know the market and the competition’s offer.

Information about prices and availability of competitive products should therefore be included as a variable in the recommendation models. Predictive models should also contain a multitude of regular external information, such as the day of the week, month, season, and many important situational variables such as weather forecasts and consumer optimism indices. The system should know the trends and fashions in the market. All this information is processed as variables in the realm of digital mathematical models.
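
As a sketch of what this means in practice, the snippet below joins hypothetical external variables (a competitor price, a consumer optimism index) and calendar features onto a sales table; all names and figures are illustrative:

import pandas as pd

sales = pd.DataFrame({
    "date": pd.to_datetime(["2024-03-01", "2024-03-02"]),
    "product": ["dish_liquid", "dish_liquid"],
    "units": [120, 95],
})

external = pd.DataFrame({
    "date": pd.to_datetime(["2024-03-01", "2024-03-02"]),
    "competitor_price": [5.0, 8.0],
    "consumer_optimism": [98.2, 97.5],
})

# Enrich internal sales with external and calendar variables for the models
features = sales.merge(external, on="date", how="left")
features["day_of_week"] = features["date"].dt.dayofweek
features["month"] = features["date"].dt.month
print(features)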

Clustering the Customer Population

Technically, it is possible to predict each customer’s behavior quite accurately, since buyers behave in a repetitive manner. However, no one tries to treat customers individually, as this would be inefficient. The goal of the clustering process, a more effective form of simple grouping, is to find customers with very similar behaviors. Each customer has individual habits and routines, yet their behavior resembles that of others. The system attempts to assign individuals to a cluster of people very similar in specific features. They become a kind of digital twins, acting very similarly and sharing the same phobias, preferences, purchasing frequencies, and sensitivity to advertisements. A population of 100,000 customers can thus be divided into 5,000 clusters. People assigned to specific clusters will be treated differently by the sales system, and a different form of incentive and persuasion will be applied to each group. Excellent results come from finding individuals within a cluster who deviate from the others and who, thanks to knowledge of the traits of the other individuals in the cluster, can easily be persuaded to make larger or more efficient purchases.
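
A minimal sketch of such clustering with scikit-learn, assuming three behavioral features per customer; the features, population size, and cluster count are all illustrative:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Stand-in behavioral features per customer, e.g. recency, frequency, monetary value
X = rng.normal(size=(1000, 3))

# Scale features so no single one dominates the distance metric
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=50, n_init=10, random_state=0).fit(X_scaled)

# Each customer is now assigned to a cluster of near "digital twins"
print(kmeans.labels_[:10])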

Transposition of Divisions

Assigning customers to clusters can be done according to various variables. Initially, an RFM approach (Recency, Frequency, Monetary) can be applied, where customers are grouped based on how recently, how often, and for how much they purchase. This provides a preliminary analysis of customers. Then the same customers can be grouped based on the assortment they choose, forming a first set of customer clusters. Now a transposition can be performed: the RFM grouping is overlaid on the assortment clusters, and the distribution of customers across the resulting combinations is analyzed. The transposition matrices created this way indicate market niches, anomalies, and areas where the sales process is unnaturally intense.
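
A minimal sketch of both steps over a toy transaction table: first RFM features are computed per customer, then a coarse RFM segment is crossed with a hypothetical assortment grouping to produce a small transposition matrix:

import pandas as pd

tx = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "date": pd.to_datetime(["2024-01-05", "2024-02-01", "2024-02-10",
                            "2024-02-20", "2024-01-15"]),
    "amount": [100.0, 40.0, 15.0, 25.0, 500.0],
})
now = pd.Timestamp("2024-03-01")

# Recency, Frequency, Monetary per customer
rfm = tx.groupby("customer_id").agg(
    recency=("date", lambda d: (now - d.max()).days),
    frequency=("date", "count"),
    monetary=("amount", "sum"),
)

# Cross a coarse RFM segment with an assortment grouping (both illustrative)
rfm["rfm_segment"] = pd.qcut(rfm["monetary"], 2, labels=["low", "high"])
assortment = pd.Series({1: "tools", 2: "tools", 3: "roofing"}, name="assortment")
print(pd.crosstab(rfm["rfm_segment"], assortment))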

Customers can be assigned to various sets of clusters, and within each of these sets, correlations or non-linear relationships can be analyzed against other simple divisions such as seasonality, purchase multiplicity, place of residence, gender, or day of the week. It quickly becomes clear that the possibilities for analyzing customer behavior are practically endless.

Similarly, products can also be grouped; they can be clustered based on sales frequency, turnover, or seasonality. Alongside simple groupings, complex partitioning of product populations can be created. Product and customer clusters can be matched, creating infinite patterns of behavior. The most important thing is to find the repeatability of customer behavior patterns in relation to related products. The ability to eliminate excessive information is crucial here.

Simple E-commerce System in Practice

Let’s assume a customer appears on the e-commerce platform. This customer is one of 140 customers grouped in one cluster, in which customers behave similarly. The customer has placed a hammer in their cart. The system immediately finds a second item related to it: a screwdriver. It turns out that in this cluster, customers who chose a screwdriver usually also bought a hammer, and if they chose a hammer, they also bought a screwdriver. Among the 140 customers, 38 made this choice in the past. The system estimates that the customer who selected a hammer on the e-commerce platform has a 24% chance of also buying the screwdriver, so the screwdriver is displayed as a recommendation.
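
A sketch of the underlying estimate, with made-up purchase flags for the cluster’s 140 customers; the counts are illustrative and not meant to reproduce the article’s exact figures:

import pandas as pd

# Purchase history flags for the 140 customers in the cluster (illustrative)
baskets = pd.DataFrame({
    "bought_hammer": [True] * 50 + [False] * 90,
    "bought_screwdriver": [True] * 38 + [False] * 12 + [True] * 5 + [False] * 85,
})

# Conditional frequency: among cluster members who bought a hammer,
# what share also bought a screwdriver?
both = (baskets["bought_hammer"] & baskets["bought_screwdriver"]).sum()
p = both / baskets["bought_hammer"].sum()
print(f"P(screwdriver | hammer) = {p:.2f}")  # here 38 / 50 = 0.76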

Recommendation Pairs

Products sold in pairs or trios can be identified without relying on customer clusters. If someone buys a dish sponge, they are likely also going to buy dishwashing liquid. For some reason, however, only some customers do this, while others buy sponges alone. This may be because the sponges are not intended for washing dishes but, for example, for cleaning car rims.

Or perhaps sponges and dishwashing liquid are bought together more often by women, while sponges alone are bought more often by men. This is a typical behavior pattern in which the customer’s membership in a specific group, or cluster, plays an important role. This simple example shows why recommendation systems should base product pairs on clusters. Thousands of such individual patterns exist; fortunately, tools search for and select them en masse.
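
A self-contained sketch of mining such pairs with the classic support/confidence/lift measures, over a few made-up baskets:

from collections import Counter
from itertools import combinations

# A few illustrative baskets from one customer cluster
baskets = [
    {"sponge", "dish_liquid"},
    {"sponge", "dish_liquid", "gloves"},
    {"sponge"},
    {"hammer", "screwdriver"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(combinations(sorted(basket), 2))

n = len(baskets)
for (a, b), count in pair_counts.items():
    support = count / n                       # share of baskets with both items
    confidence = count / item_counts[a]       # P(b | a)
    lift = confidence / (item_counts[b] / n)  # > 1 suggests a genuine association
    print(f"{a} + {b}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")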

Analysis of Incomplete Purchases in a Cluster

As mentioned earlier, customers assigned to the same cluster exhibit a specific set of behaviors. For example, let’s take a cluster of people who purchase roofing-related articles. The algorithms consider the frequency and value of purchased items and the type of customer (for example, small business). A substantial group of customers is classified as small contractor roofers. Now it is possible to analyze the completeness of their purchases. Some customers buy within certain categories offered by the store, such as insulation materials, gutters, tiles, bituminous coverings, and adhesives.

Let’s assume that the completeness analysis showed that 27% of the customers in this cluster buy in every category except adhesives; evidently they purchase adhesives elsewhere, perhaps on better terms.

There is nothing left but to offer an excellent adhesive deal exclusively to those 27% of customers who have not been buying adhesives in our store.
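
A pandas sketch of such a completeness check, with hypothetical categories and customers; it flags cluster members who buy in every category except adhesives:

import pandas as pd

# Category-level purchases of roofers in one cluster (illustrative)
purchases = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "category": ["tiles", "gutters", "adhesives",
                 "tiles", "gutters",
                 "tiles", "gutters", "adhesives"],
})

all_categories = {"tiles", "gutters", "adhesives"}
bought = purchases.groupby("customer_id")["category"].agg(set)

# Customers whose purchases are complete except for adhesives
target = all_categories - {"adhesives"}
incomplete = bought[bought.apply(lambda s: s == target)]
print(incomplete.index.tolist())  # -> [2]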

Analysis of Interrupted Habits

Grouping customers into purchasing clusters facilitates the detection of changes in their habits. Let’s assume we have a cluster of plumbers. Within this cluster, small hydrophore (booster) pumps were purchased seasonally. The seasonality arose because, in spring, it often turned out that small pumps installed in recreational gardens needed replacement. At some point, however, it was found that these pumps were no longer selling in the spring within the plumbers’ cluster.

If we had analyzed the level of pump sales without clusters, we might not have detected this anomaly, since the pumps are purchased by various customer groups and we are most interested in plumbers. Thanks to the detected anomaly, it is possible to investigate the situation and propose better conditions so that plumbers do not buy pumps from competitors. The recommendation system will not execute this action automatically, but it enables sales managers to take it.
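
A minimal sketch of detecting such an interrupted habit, comparing the current spring’s sales in the cluster against the historical seasonal norm; all numbers are illustrative:

import pandas as pd

# Spring pump sales within the plumbers' cluster, by year (illustrative)
sales = pd.Series([30, 32, 31, 2], index=[2021, 2022, 2023, 2024])

history = sales.loc[:2023]
threshold = history.mean() - 2 * history.std()  # a simple anomaly band

if sales.loc[2024] < threshold:
    print("Anomaly: spring pump sales collapsed in the plumbers' cluster")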

How to Retain Customers in the Store?

Until now, my considerations have focused on increasing the purchasing efficiency of customers. One customer purchasing a hammer may, thanks to a well-displayed advertisement, also buy a screwdriver. A woman buying dishwashing liquid receives recommendations for sponge purchases.

Thanks to these and thousands of similar, effectively exploited patterns, the overall average purchasing efficiency of customers can increase by 30-40%.

Customers will also feel an improvement in the shopping experience, because a well-fitted suggestion is usually well received.

However, an increase in the purchasing efficiency of customers may not compensate for a decline in sales caused by the departure of some customers. Customers usually leave for a reason, often a poorly tailored form of treatment or offers from other stores; the reason may also be unrelated to the store’s operations. To avoid losses resulting from our own store’s operations, another recommendation system must be created, focused solely on customer retention.

Here too, we must carry out clustering and divide customers into groups that consist of digital twins. Then it is necessary to initiate survival tests.

The survival test is a method developed long ago for medical facilities, aimed at statistically estimating patients’ survival times under the medical treatments currently applied. Various treatments were tested this way, and on that basis the tools calculated the probability of patients surviving into the future. Applying a treatment to a specific patient group predicted, for example, that they had an 80% probability of surviving a given period.

With this method, it is easy to determine the likelihood of retaining a customer over the upcoming years. Various incentives, such as loyalty cards, discounts, and periodic individual offers, can be added to the algorithm as variables, together with information on the effect of these techniques on customer retention. This relatively simple approach can drastically reduce the likelihood of customer departure. Interestingly, such a system can also serve the ongoing management of customers and can be a handy advisor for sales representatives talking to particular customers.
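
A minimal sketch with the lifelines library, which implements the Kaplan-Meier estimator commonly used for exactly this kind of retention question; the durations and churn flags are made up:

from lifelines import KaplanMeierFitter

# Months each customer has been observed, and whether they have already
# churned (1) or are still active (0); illustrative data
durations = [5, 8, 12, 12, 20, 24, 24, 30]
churned = [1, 1, 0, 1, 0, 1, 0, 0]

kmf = KaplanMeierFitter()
kmf.fit(durations, event_observed=churned)

# Estimated probability that a customer is still active after 12 months
print(kmf.predict(12))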

Analysis of Purchase Cancellations

An important signal preceding a customer’s departure from the e-commerce platform is the purchase cancellation rate. A customer may decide to leave the store because they have not found a suitable offer for an extended period, and such a decision will not be detected by the survival test.

To create a purchase cancellation rate, it is necessary to reconfigure the data collection system. As mentioned earlier, the recommendation system is built based on the sales register. The sales register consists of transactions that were concluded with the customer within a certain period and for certain products. If a customer enters the online store and browses different pages, this behavior will not be found in the sales register.

The only way to analyze this behavior is through the history of browsing the online store’s pages. This requires specialized tracking tools, but the investment pays off because it effectively reveals customer churn and purchasing determination.
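
A sketch of computing such a rate from clickstream events, with a purely hypothetical event schema:

import pandas as pd

# Events captured by the tracking tool (illustrative schema)
events = pd.DataFrame({
    "session_id": [1, 1, 2, 2, 3],
    "event": ["add_to_cart", "purchase", "add_to_cart", "exit", "add_to_cart"],
})

carts = set(events.loc[events["event"] == "add_to_cart", "session_id"])
purchases = set(events.loc[events["event"] == "purchase", "session_id"])

# Share of carts that never turned into a purchase
abandonment_rate = 1 - len(carts & purchases) / len(carts)
print(f"cart abandonment rate: {abandonment_rate:.0%}")  # 2 of 3 carts abandoned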

How to Build a Recommendation System?

The most crucial aspect in building a recommendation system is to identify the goal. Fortunately, most companies want the same thing: an increase in customer purchasing efficiency. At the same time, e-commerce stores most often want to identify the factors contributing to customer departures. Both of these goals can easily be reconciled using simple principles for building a recommendation system.

The most important elements are data, which must be complete, reliable, and comprehensive. It is impossible to create an effective recommendation system based on a small amount of data. A sales transaction record containing at least several hundred thousand operations should be linked to other databases obtained from external sources.

The next step is the proper use of data clustering methods. With many variables, the PCA algorithm (Principal Component Analysis) is often applied first. This is a dimensionality-reduction method that combines many features into 3 or 4 consolidated components; partial features are merged into main features based on the strength of the covariances among them.

Let’s imagine the information held about a customer and try to count how many features it can contain: residential address, number of transactions per month, number of different product categories, payment method, and dozens of other variables. With such data clouds, it is impossible to effectively assign a customer to a cluster. That is why the number of variables is reduced, for example by grouping them into a few consolidated variables with the PCA algorithm.
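
A minimal scikit-learn sketch of that reduction step, compressing a few dozen hypothetical customer features into four components before clustering:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Dozens of raw customer features (illustrative)
X = rng.normal(size=(1000, 40))

X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=4)
components = pca.fit_transform(X_scaled)

print(components.shape)               # (1000, 4) - ready for clustering
print(pca.explained_variance_ratio_)  # variance captured by each component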

Survival analysis, like the tools that increase customer purchasing efficiency, is a method that is simple to create and manage. At the same time, the methodology is considered tedious, because there is a vast number of interactions and external changes, which demands experience from the researcher. A very important skill is maintaining the correct research priorities: eliminating inefficient patterns and expanding effective ones. This requires high self-discipline and patience.

First Prototype, Then Implementation

In creating a recommendation system, a prototype is first built, usually written in Python and based on the language’s libraries. The prototype is tested and refined in a programming environment. Unfortunately, the prototype is not suitable for use in the sales process. This solution must be implemented in a production environment. This is the responsibility of the data engineer, who implements the program in a cloud environment or in a local environment that supports sales systems.

Batch and Streaming Systems

Recommendation systems can be based on historical records. The prototype is built on a certain history, and that history changes over time, so the system must be updated. Someone who previously bought yogurts suddenly begins buying kefirs. Someone who once bought a hammer and screwdriver is unlikely to repeat that exact purchase, but we know that such a customer will, after some time, want to buy a drill. These examples illustrate the need to update the recommendation system periodically. The more frequently this update occurs, the better; ideally, it is conducted continuously. It is therefore possible to build a recommendation system fed by a constant flow of information in the form of data streams.
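
One way to approximate such continuous updating, sketched with scikit-learn’s MiniBatchKMeans, which can absorb each new batch of data incrementally instead of retraining on the full history; features and sizes are illustrative:

import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
model = MiniBatchKMeans(n_clusters=20, random_state=0)

# Each arriving batch of transactions nudges the cluster centers,
# so the system tracks changing customer behavior over time
for _ in range(100):
    batch = rng.normal(size=(256, 3))  # stand-in feature vectors
    model.partial_fit(batch)

print(model.cluster_centers_.shape)  # (20, 3)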

Summary

Everyone knows that a well-formulated, tailor-made offer increases customer purchasing efficiency and additionally enhances customer loyalty to the store. The problem is that to tailor an offer for a customer, we must know a lot about them and use tools that will ensure this knowledge is utilized effectively.

To achieve this, the data, and the processes for processing and using it in analysis, must be organized perfectly.

The process is difficult but necessary, because if we do not do it, the competition will. A recommendation system on an e-commerce platform can be an excellent source of competitive advantage. It allows for greater sales efficiency without substantial spending on general advertising or excessive, broad promotions, thus securing the higher profits that ensure the sustainability and growing value of the e-commerce business.


Wojciech Moszczyński

Wojciech Moszczyński is a graduate of the Department of Econometrics and Statistics at Nicolaus Copernicus University in Toruń, specializing in econometrics, finance, data science, and management accounting. He specializes in optimizing production and logistics processes. He has been involved in researching the development and application of artificial intelligence for years. He has been engaged in popularizing machine learning and data science in business environments.

Artykuł Recommendation systems for online stores pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>
How to gather customer needs in a machine learning project? https://sigmaquality.pl/my-publications/how-to-gather-customer-needs-in-a-machine-learning-project/ Sat, 02 Nov 2024 08:27:29 +0000 https://sigmaquality.pl/?p=8383 For anyone starting a machine learning project, it’s clear that the most important step is to properly define business goals. We can compare goals to [...]

Artykuł How to gather customer needs in a machine learning project? pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>

For anyone starting a machine learning project, it’s clear that the most important step is to properly define business goals.

We can compare goals to the foundation of a building. What mistakes can be made when laying a foundation? The foundation might not align with the building’s layout. Imagine a situation where part of the building doesn’t rest on the foundation. Additionally, the density and reinforcement of the foundation material could be insufficient, risking future collapse. One thing is certain: once the foundation is laid and the house built upon it, changing that foundation without completely dismantling the house is impossible.

The same applies to projects. If we adopt incorrect goals and requirements and begin the project based on them, we won’t be able to change them mid-project. Faulty assumptions can lead to complete project failure. Only the client can provide the goals and needs, yet they may not always be competent or ready to define them.

The Client May Not Understand Their Business

It’s not ideal to assume that clients don’t understand their own processes. We remember the old adage: “the customer is always right.” However, the typical client might not always know where the problem lies. We often expect them to have a high-level understanding of everything happening in the company, along with a deep understanding of processes and all aspects of their business.

But the reality can be quite different. A particular client might be excellent at the core part of their business and only that part. For example, the owner may be “the best in the world” at winning construction bids, which is the primary source of the company’s success. Thanks to this market advantage, the owner can hire specialists to handle all essential supporting processes. This situation may lead to a complete lack of knowledge about other business areas.

It’s likely that the client contacted us not about the part of their business they excel in, but the part they don’t understand. They might be struggling in this area and feel helpless. So, we may be wrong in our expectations regarding the current awareness of the clients seeking our help. Naturally, clients feel obligated to show us their present problem. And in this case, the client is the one who should present the issue. But is this person the right one for that?

When a patient visits a doctor, they must share their symptoms and health issues, but no one expects them to make a medical diagnosis. When a customer brings a car to a mechanic, they describe the problem, but they don’t diagnose the cause. Unfortunately, in the business world, owners often point to the causes of their problems, leaving analysts hesitant to question their views.

Sometimes I feel data analysts or other experts in business problem-solving should also be psychologists. At the start of a project, we need to gather all the symptoms and allow the owner to share their hypothesis about the cause. This hypothesis can be valuable to us. However, like a doctor or mechanic, we must gather all available data and use our professional tools to identify the true problem. It’s great if our findings align with the owner’s hypotheses.

Sometimes the Owner is Subconsciously Embarrassed by Their Problem

In my work, I try to avoid delving into business psychology whenever possible. Unfortunately, this is rarely achievable, as this realm permeates everything in business. Businesses are run by people with their dilemmas and issues, often too proud or plagued by certain fears. Business is created by people; office furniture, machines, and vehicles don’t generate income on their own. Embarrassment and anxiety among business owners are more common than one might think, often tied to low self-esteem, shame from personal insecurities or incompetence, or even a subconscious need to impress.

As I mentioned, psychology isn’t my strong suit, but avoiding this aspect is a recipe for failure in gathering project requirements. The risk isn’t just about wasting time collecting irrelevant requirements or setting false goals. It’s also about maintaining good relations with the owner, which greatly impacts future collaboration.

Consider the story of a man who runs a business because he’s excellent at winning construction bids. Some residents and employees see his success and regard him as a business genius, overlooking his flaws. Despite his evident issues, his erratic behavior, his long working hours, and his drivers cheating on fuel, they still hold him in high esteem. This owner doesn’t know how to solve certain problems and is embarrassed by them. So, when external analysts arrive, he discusses a problem he wishes he had, not the things he finds shameful.

It May Sound Complex, But It’s Very Simple

The owner realizes that the problem may be simple for others, but it’s not for him. So, he creates a sort of mystery around it. It’s like visiting a psychiatrist due to an irrational fear of the neighbor’s cat but not knowing how to explain it without sounding ridiculous. This analogy doesn’t fully capture the complexity of the problem, but it gives some insight.

The Difference in Worldviews

This essay doesn’t aim to highlight the limitations or ignorance of business owners. Instead, it provides guidelines for building a solid foundation for a project. External analysts aim to address all potential problems. Their profession is to identify possible causes and solutions to issues that trouble companies. For this, developing effective situational awareness is essential.

Proper situational insight is crucial to understanding processes. Imagine someone struggling with logistics costs. Their business model shows that logistics expenses consume a significant portion of the production profit, resulting in company losses. Unfortunately, humans have inherent limitations shaped by evolution. Humans can’t consider more than six or seven factors simultaneously. In contrast, machine learning algorithms can handle large data dimensions, often comprising thousands of features and enriched by inter-temporal shifts. Humans can’t interpret thousands of interconnected business factor correlations. Naturally, human cognition is limited, but those who use AI’s potential start to think outside the box, forming visions that may confuse or even annoy others.

Sadly, a lack of abstract and unconventional thinking is common among managers focused solely on their own business. When I worked in large corporations, consultants or people with overly abstract thinking were often sidelined or ignored. In our earlier example, transport expenses consumed a large portion of production profit. The typical management response is to reduce overall costs and impose strict fuel monitoring, but the impact is limited: fuel consumption can’t fall below technical norms, and further cuts would mean skimping on truck insurance or laying off drivers.

Significant change can come from outsiders who aren’t entangled in daily operations. They may suggest optimal truck load capacity for this specific logistics process, drawing insights from operational research algorithms. They might even reveal that the current transport model isn’t economically justified, and alternative transportation solutions could be identified, new transport networks established, warehouse locations optimized, and optimal transport times and load sizes determined. It may turn out that transport in this company isn’t necessary at all. Perhaps focusing on profitable core activities and leaving logistics to clients would be more effective.

What is Optimization?

We are accustomed to using the term “optimization,” but I’m not sure if everyone fully understands it.

A process can be described as a configuration of two value streams: a set of input resources entering the process and the outputs at its end. Before explaining optimization, I want to clarify efficiency. Efficiency is the ratio of the value of a process’s outputs to the value of its inputs.
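
Written as a formula, under the usual textbook convention of outputs over inputs (the numbers in the example are invented):

\[
\text{efficiency} = \frac{\text{value of outputs}}{\text{value of inputs}}
\]

For example, if a process consumes inputs worth 80,000 PLN and produces outputs worth 100,000 PLN, its efficiency is 100,000 / 80,000 = 1.25; any change that raises this ratio in line with the stated goal is an optimization.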

Efficiency should align with intended goals. Suppose our goal is to reduce transport costs. To achieve this, we decide to buy new trucks. However, these new trucks need to be paid off. In the end, we have the same revenue from transport, but our costs have increased. The difference between old and new trucks isn’t significant, and the debt service cost has reduced logistics profitability. Despite the investments, the goal of increased efficiency and improved financial results wasn’t achieved. If the management had set a goal of environmental protection or accident risk reduction, acquiring new trucks might have been a good decision despite the higher maintenance costs.

So, we conclude that the initiative’s goal is crucial. Goal information is the most important insight we must obtain from the business owner during project initiation.

Summary

When building a house, creating a proper foundation is essential. To optimize and find the best business solution, it’s vital to understand the source of the problem and identify the true goal of the initiative. Failing at this stage will likely result in project failure.


Wojciech Moszczyński
Graduate of the Quantitative Methods Department at Nicolaus Copernicus University in Toruń, specializing in econometrics, data science, and management accounting. He focuses on optimizing production and logistics processes and conducts research in AI development and application. He has been dedicated for years to promoting machine learning and data science in business environments.

Artykuł How to gather customer needs in a machine learning project? pochodzi z serwisu THE DATA SCIENCE LIBRARY.

]]>