Azure Cognitive Services, or eight minutes to introduce true artificial intelligence to your business

Artificial Intelligence

The truth is that no robots will replace us or our employees in the near future. In the same way, video conferencing did not replace regular meetings for many decades. For at least fifteen years, I saw video conference equipment in large corporations, yet it went unused until the pandemic forced a change in thinking and horizons. Will robots and other thinking machines become widespread this time as a result of some demographic or military catastrophe? I would rather not expect that. Certainly, the entire modern West, as well as China and Russia, faces a huge demographic problem: fewer children are born than elderly people die. Undoubtedly, sooner or later, some forms of robotic workers will appear. There is a lot of talk about it, but established habits are difficult to break. So we will live as before, with very modern technology at hand that we could use but will not, because of time and money. Why change something that works? Why introduce something that might cause trouble?



At the beginning of the year, we once again look to the future with hope. Once again, we say that this will be the breakthrough year for introducing artificial intelligence into our bakeries and confectioneries. For the third year in a row, the new year is announced as a year of technological breakthrough. In business circles, conversations about artificial intelligence have looked more or less the same for three years. First, the owner says that artificial intelligence is very important and should be introduced. The rest of the group agrees, and sometimes someone adds that robots may soon replace us. Lately, someone often interjects: “Yes, ChatGPT found me a great cake recipe.” Beyond that, no one knows anything, and it does not look like that will change.


Common Sense and Limitations

A few years ago—here I mean a horizon of three to five years—the implementation of artificial intelligence solutions was quite an expensive and troublesome undertaking. First, a concept had to be created, which usually resulted from an attempt to solve some serious systemic problem. Let’s assume we had a problem with employees who kept going out for a cigarette. Twenty years ago, when I worked in an office, employees went out into the corridor to smoke. So, entry gates with cards were installed. Everyone had to check in. The problem was that the toilets were also in the corridor. If we really wanted to effectively monitor the behavior of all employees, we would have to create some kind of face or body reader. This reader would analyze people’s behaviors, in a simple version answering the question: are they going to the toilet or for a cigarette?

We reach an absurdity similar to Chinese surveillance. Let me remind you: in Chinese cities, hundreds of thousands of cameras have been installed that scan citizens’ faces and observe whether they cross at green lights, do not litter, and do not write on walls. It seems like an absurd solution, but on the other hand—is it really madness? Or maybe it is a form of control that contributes to the reduction of accidents, suicides, assaults, and ordinary vandalism at the cost of losing part of our freedom?

Could this be useful in our businesses? In order not to go too far, I will stop here. Once again, I will try to return to my thought about how technological projects were once carried out. So, to implement such a project with faces scanned by intrusive cameras, one would first have to hire several data analysts to collect faces—let’s say, the faces of 200 employees. Each employee would have 5–6 photos taken. Each of these photos would have to be precisely described and classified. Six photos for 200 employees means 1,200 photos to describe—that is, about two weeks of slave labor.

I refer here to my article in which I argued that artificial intelligence is built by an army of unqualified, poorly paid slaves (“Global Artificial Intelligence is Built by an Army of Slaves,” Przegląd Piekarski i Cukierniczy, 2024, no. 7). Then, a team of analysts would have to use the PyTorch library to build a neural network: another three weeks of work by highly paid experts. Next, data engineers would take the neural model from the analysts and deploy it in the cloud so that it could operate in a production environment. There would also be work for monitoring experts, who would have to integrate everything with the monitoring system so that it would work autonomously.

A project covering 200 people and lasting six months. Counting the market rate for each expert from the artificial intelligence team and the margin for the implementation company, the cost is about one million zlotys. A million zlotys spent to supervise two hundred employees, or five thousand per person, is absurd. So much money just to check who wastes twenty minutes a day smoking cigarettes? Complete madness and a waste of money.
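The arithmetic behind these figures is easy to check. The short sketch below only restates the numbers given above (200 employees, six photos each, about one million zlotys); the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Figures taken from the text; integer arithmetic only.
employees=200
photos_per_employee=6
project_cost_pln=1000000

# Number of photos that would have to be described by hand.
photos_total=$(( employees * photos_per_employee ))

# Cost of the project divided by the number of supervised employees.
cost_per_employee_pln=$(( project_cost_pln / employees ))

echo "Photos to describe: ${photos_total}"
echo "Cost per supervised employee: ${cost_per_employee_pln} PLN"
```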


W-MOSZCZYNSKI ppic 2-25

Technological Changes Few People Know About

So what has changed and where did this absurd idea of a face-scanning project come from? I observe the Chinese economy as the most digitally advanced economy in the world. I observe trends. One of the leading ones is digital surveillance systems. Everywhere you can find license plate readers, facial scanners, and body scanners. On the other hand, in Europe and the United States, a trend has clearly begun to dominate that aims to verify people’s authenticity. It is about terrorist threats, but also trolling and hate on the Internet.

In the past, people were verified using various types of documents. Fingerprints, however, are quite troublesome: people do not deliberately leave them everywhere. In summary, the face is the key to analysis and supervision. What is measurable is manageable. It is not about enslaving people, but primarily about analyzing behavior to improve safety and the quality of processes. It is possible to observe customers and employees, detect signs of fatigue in drivers, and detect behaviors of people planning to commit acts of terrorism or theft.


Companies Directly Transmitting Robots’ Thoughts from the Cloud

As I mentioned, creating a facial recognition system on your own is extremely time-consuming and expensive. Such an undertaking lasts many months and does not guarantee satisfactory results. This awareness also existed among the research teams of the largest IT corporations. So they created universal tools that can be used quickly and cheaply, for a small fee or for free.

In the article “Don’t Be Afraid, It’s Only Our Terminator at the Reception Desk: What Changes Will Artificial Intelligence Bring to the Milling Industry?” (Przegląd Zbożowo-Młynarski, 2024, no. 2), I mentioned that robotic torsos will eventually appear, performing small and monotonous office tasks such as receiving raw material deliveries, issuing transport documents, or selling bread in stores.

I cite this example because I pointed out there that each of these robotic torsos would be equipped with specialist knowledge of its tasks. This software would be provided by specialized companies directly transmitting robots’ thoughts from the cloud, however strange that may sound.

Similarly, certain IT solutions already exist. We have ready-made models that read car license plates, scan faces, recognize silhouettes, and distinguish bicycles from electric scooters. All this is within reach and free if the project is carried out on a small scale.


Eight Minutes and You Have It

Step one: create a Microsoft account (https://account.microsoft.com/). Step two: based on that account, go to the Azure site and create a free Azure account (https://azure.microsoft.com/). Don’t stress if the system asks for a bank card number. They won’t cheat you—it’s just a standard procedure in case you suddenly want to process huge amounts of data using advanced neural networks. The corporation would charge you around twenty dollars. The company simply wants some security when giving you huge, all-powerful tools, that’s all.

Step three: go to the “face scanning” resource. Find it in the cloud search: “Cognitive Services Face” or enter the address: https://portal.azure.com/#create/Microsoft.CognitiveServicesFace

Fill out the form and confirm. We enter the tool.

We can now find it in our cloud resources. I named mine “FaceScanner.”

Then you need to copy the access key (shown in the portal as KEY 1, called K1 below) and the so-called “EndPoint” address.

We take a test photo to check how it works.

Now we go to our computer, open the bash command console, and enter the following command:

curl -X POST "${AZURE_FACE_API_ENDPOINT}/face/v1.0/detect" \
-H "Content-Type: application/octet-stream" \
-H "Ocp-Apim-Subscription-Key: $AZURE_FACE_API_KEY" \
--data-binary "@/home/hdoop/Music/image1.png"

where:

  • AZURE_FACE_API_ENDPOINT is the copied value of the Endpoint

  • AZURE_FACE_API_KEY is the copied value of K1

  • /home/hdoop/Music/image1.png is the location of the image
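Before the command can run, both variables have to exist in the shell. A minimal sketch, assuming a bash shell: the values in the comments are placeholders, not real credentials, and detect_url is a hypothetical helper (not part of any Azure tool). It drops a trailing slash from the endpoint so the final URL does not contain a double slash.

```shell
#!/usr/bin/env bash
# Placeholders only: paste your own Endpoint and K1 values from the portal.
# export AZURE_FACE_API_ENDPOINT="https://<your-resource-name>.cognitiveservices.azure.com/"
# export AZURE_FACE_API_KEY="<your K1 value>"

# Hypothetical helper: prints the detect URL, removing one trailing slash
# from the endpoint so the path does not start with "//".
# Stops with an error message if the endpoint variable is not set.
detect_url() {
  local endpoint="${AZURE_FACE_API_ENDPOINT:?AZURE_FACE_API_ENDPOINT is not set}"
  printf '%s/face/v1.0/detect\n' "${endpoint%/}"
}
```

With the variables exported, the first line of the command could then read: curl -X POST "$(detect_url)" followed by the same headers and image path.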

We should receive a similar result:

[{"faceRectangle":{"top":194,"left":0,"width":479,"height":518}}]
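The response is a JSON array with one faceRectangle object per detected face. A rough sketch for counting the faces from the shell, assuming only grep, wc, and tr are available. The helper name count_faces is hypothetical, and it is a plain text match rather than a real JSON parser, so a dedicated tool such as jq would be more robust.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a detect response on standard input and prints
# how many times "faceRectangle" occurs, which equals the number of faces
# in a response shaped like the one above.
count_faces() {
  # grep exits with status 1 when nothing matches; "|| true" keeps the
  # pipeline usable in scripts that stop on errors.
  { grep -o '"faceRectangle"' || true; } | wc -l | tr -d ' '
}
```

For example, piping the output of the curl command into count_faces would print 1 for the result shown above.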

Now we will try to assign an ID to the face, that is, a digital name.
To do this, we append the query parameter returnFaceId=true to the URL from the previous command:

curl -X POST "${AZURE_FACE_API_ENDPOINT}/face/v1.0/detect?returnFaceId=true" \
-H "Content-Type: application/octet-stream" \
-H "Ocp-Apim-Subscription-Key: $AZURE_FACE_API_KEY" \
--data-binary "@/home/hdoop/Music/image1.png"

If the system refuses to assign a face ID (error 403), the faceId function has not yet been unlocked for your account.
To unlock it, you must submit a request for approval by Microsoft:

  1. Go to: https://aka.ms/facerecognition

  2. Fill out the form explaining how you intend to use identification/verification.

  3. After acceptance, the functionality will be unlocked.
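To see which error the service actually returned, curl can print the HTTP status code separately from the response body. A minimal sketch, assuming the two environment variables from the earlier command are set. The helper names detect_status and explain_status are hypothetical, and the hints for codes other than 403 reflect common HTTP usage rather than anything stated in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translates an HTTP status code into a short hint.
explain_status() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "check the subscription key and endpoint" ;;
    403) echo "feature not approved: request access from Microsoft" ;;
    429) echo "rate limit reached: wait and retry" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Hypothetical wrapper around the detect call from the text.
# $1 = path to the image, $2 = file for the response body (optional).
# -o saves the body to a file, -w '%{http_code}' prints only the status code.
detect_status() {
  curl -s -o "${2:-response.json}" -w '%{http_code}' -X POST \
    "${AZURE_FACE_API_ENDPOINT}/face/v1.0/detect?returnFaceId=true" \
    -H "Content-Type: application/octet-stream" \
    -H "Ocp-Apim-Subscription-Key: ${AZURE_FACE_API_KEY}" \
    --data-binary "@$1"
}
```

A possible call would be: explain_status "$(detect_status /home/hdoop/Music/image1.png)"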


How Much Does It Cost?

Since we are in the cloud and have already provided our bank card details, it is worth knowing whether we will be charged. You should choose the free plan for this service.
In the case of the free version, 30,000 face transactions per month are free.
After that, the cost is around $1 per 1,000 identified faces.
The exact price list can be found on the Azure website.
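A rough monthly estimate can be derived from these two figures. The sketch below assumes exactly the numbers quoted above (30,000 free transactions, then about $1 per 1,000); real pricing depends on the plan and region, and the function name is hypothetical. The result is in cents so that the calculation stays in integer arithmetic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimates the monthly cost in US cents for a given
# number of face transactions, using the figures quoted in the text.
estimate_monthly_cost_cents() {
  local transactions="$1"
  local free_limit=30000
  local billable=0
  if [ "$transactions" -gt "$free_limit" ]; then
    billable=$(( transactions - free_limit ))
  fi
  # About 1 USD per 1,000 transactions = 1 cent per 10 transactions
  # (integer division, so fractions of a cent are dropped).
  echo $(( billable / 10 ))
}
```

For instance, estimate_monthly_cost_cents 50000 prints 2000, that is, about twenty dollars for 20,000 transactions above the free limit.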

OK, we have it. We have launched the neural face scanner for our employees.

In this example, my face (after receiving Microsoft’s approval) was assigned a specific ID and thus received its identity.
You can do the same with your employees—each of them has a photo in the HR archive.
It is enough to upload these photos to the directory accessible to the scanner, in the same way as I did.

After about eight minutes, we have a system that has assigned IDs to employees’ faces.
Now, using the code above, we can show the computer other photos—the system will recognize each employee even when we would have trouble doing so ourselves.
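Recognizing a person in a new photo comes down to comparing two face IDs. The Face API has a verify operation for this (a POST to /face/v1.0/verify with a JSON body), which to my knowledge also falls under Microsoft's approval process. A minimal sketch under that assumption: verify_body and verify_faces are hypothetical helper names, and both face IDs are expected to come from earlier detect calls made with returnFaceId=true.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the JSON body for comparing two face IDs.
verify_body() {
  printf '{"faceId1":"%s","faceId2":"%s"}\n' "$1" "$2"
}

# Hypothetical wrapper: sends both IDs to the verify operation.
# $1 = face ID from the reference photo, $2 = face ID from the new photo.
# Uses the same endpoint and key variables as the detect command.
verify_faces() {
  curl -s -X POST "${AZURE_FACE_API_ENDPOINT}/face/v1.0/verify" \
    -H "Content-Type: application/json" \
    -H "Ocp-Apim-Subscription-Key: ${AZURE_FACE_API_KEY}" \
    -d "$(verify_body "$1" "$2")"
}
```

The answer should state whether the two faces are judged identical, together with a confidence value. As far as I know, the IDs returned by detect are temporary, so a permanent employee register would need the service's stored person lists rather than raw detect results.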

Systems that take photos of everyone walking in front of the camera are already available on the market—even our phones can do it.
In just one or two days, we have a system that monitors employees.
We can call it the introduction of artificial intelligence into a bakery.


How I Love Those Regulations…

Of course, in this barrel of honey, there is also a spoonful of tar.
Some thinkers at Microsoft concluded that facial scanning might violate citizens’ personal rights and infringe upon the right to a private image.

This is reminiscent of the so-called Red Flag Act, a regulation introduced in England in the 19th century.
Anyone who took a self-propelled vehicle onto the road was required to have a person walking in front of it with a red flag, warning other road users about the approaching danger.
This regulation effectively blocked the development of the automotive industry in Great Britain.

To use identification via the Azure face scanner, you must fill out a form explaining and describing how you intend to use this tool.
In this way, civil liberty and GDPR specialists will decide whether you should be granted the appropriate right.

The face scanner I presented today had another functionality that I found wonderful.
It recognized moods and estimated gender and age.

It is hard for me to imagine a more useful tool in a bakery store.
Let’s imagine customers entering our store.
No one pays attention anymore to omnipresent cameras.
Each customer would be photographed and recorded with the time and date of entry.
Dates and times could be correlated with the temperature outside, the day of the week, and the season of the year.
They could also be compared with lighting conditions inside the store, with the employees serving customers, and with the temperature inside.

Thanks to this, one could build a model of customer satisfaction—find correlations between temperature, interior design, and satisfaction.
We could create a model showing which people, of what age and gender, buy the most, what they buy, and how long they shop.
Customers’ faces, their moods, gender, and age are a goldmine of information when combined with external data.

It would be relatively easy to build an optimal store—with optimal assortment, lighting, temperature, and location.
Saleswomen could receive bonuses for improving customers’ moods.
The assortment could be adjusted for each day of the week.
Unfortunately, activists operating in the GDPR field and lawyers blocked this extremely valuable function, making it currently impossible to use such information.


Conclusion

The conclusion should be conciliatory, so I will try to make it so.
Creating a facial detection system has become possible and, one could say, relatively simple.
To launch such a system, we must cross another horizon of our prejudices and fears.

I remember how twelve years ago, one of the owners of a large bakery installed cameras in the production and warehouse area.
This met with enormous opposition from the staff.
Today, no one would pay attention to it.

Introducing a monitoring system would help reduce various pathologies such as theft in supermarkets (the system easily recognizes the movements of a person intending to steal something) or workplace accidents (the system can detect the behavior of intoxicated, sick, or tired people).

However, in the back of our minds, we still have the need to preserve individual freedom and the bad example of total surveillance introduced on Chinese streets.

I will not decide which is more important or in which direction we should develop.
My goal was to point out possibilities that already exist and that even a beginner IT specialist can implement relatively easily.
I leave this information at your disposal.