But what is it, and why is it so important? In the era of the “Internet of Things” in which we live today, everything produces data. For example, take a chair and fit it with a chip that collects data and transmits it over the network: that chair will start sending millions of data points. Think, too, of the black boxes installed in our cars, which call for assistance immediately when an accident happens, or earn us a discount on the insurance policy if we have driven carefully. In the case of the chair, the data sent is probably rubbish and will be of no use, but in the second case it can save our lives or save us a few hundred euros on the renewal of the insurance policy.
An enormous amount of data is collected every day. Everything we do, from buying something online to updating our profile on social media, produces data that can be collected and analyzed. The tsunami of data is overwhelming us.
So one might ask: is all this data that is collected and analyzed a great opportunity, or just a huge heap of digital garbage? Frankly, I think it is a great opportunity to explain very complex things and processes. Explaining them, however, requires a specific professional figure: the data scientist. The data scientist's task is to make the data tell a story that is not trivial, one that lets us discover something we did not know and were completely overlooking.
Consider one of the first practical cases of big data, studied in the nineties: an American supermarket chain used statistical tools to analyze what its customers bought, and noticed that diapers and beer often ended up in the same cart on Friday evenings. Analyzing this data in more depth, the company discovered that on Friday evenings wives sent their husbands to the supermarket to buy diapers and that the husbands, since they were already there, also bought a few packs of beer. The supermarket took advantage of this by moving the beer to the shelves next to the diapers, and as a result saw a significant increase in both diaper sales and beer sales.
This shows what can be found in the data. Not analyzing it would be a real waste.
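To make the diapers-and-beer idea concrete, here is a minimal sketch of how such a pattern could be spotted, assuming we have a list of shopping carts. The carts are invented for illustration and `pair_lift` is a hypothetical helper, not the method the supermarket actually used. It counts how often two products share a cart and compares that with how often they would do so if purchases were independent, a ratio usually called lift.

```python
from collections import Counter
from itertools import combinations

# Hypothetical Friday-evening carts, invented purely for illustration.
carts = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"bread", "milk"},
    {"diapers", "beer", "bread"},
    {"milk", "bread", "eggs"},
    {"beer", "chips"},
]


def pair_lift(carts):
    """Return {(a, b): lift} for every pair of items that shares a cart.

    lift = P(a and b) / (P(a) * P(b)), with probabilities estimated
    as simple frequencies over the carts. Pair keys are in alphabetical order.
    """
    n = len(carts)
    # How many carts contain each single item.
    item_counts = Counter(item for cart in carts for item in cart)
    # How many carts contain each pair of items.
    pair_counts = Counter(
        pair for cart in carts for pair in combinations(sorted(cart), 2)
    )
    return {
        (a, b): (count / n) / ((item_counts[a] / n) * (item_counts[b] / n))
        for (a, b), count in pair_counts.items()
    }


lifts = pair_lift(carts)
print(round(lifts[("beer", "diapers")], 2))  # 1.5
```

A lift above 1 means the two products appear together more often than chance alone would suggest. With these made-up carts, beer and diapers come out at 1.5, which is the kind of signal that would prompt a closer look.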
In reality, interpreting and analyzing data is part of a discipline with a history of more than two centuries: statistics. Until the end of the nineties, analysts had to ask themselves which data to collect and store, and how much of it, because doing so was very expensive. Now the situation has changed profoundly: a lot of data is already available for analysis. Think, for example, of banking transactions or online purchases. In this context, the data scientist becomes decisive for gaining a competitive advantage from the data.
Unfortunately, in Italy this is not yet widely understood: according to the latest research, only 3 out of 10 large companies have a data scientist on staff. I hope the trend changes quickly, because otherwise we will lose a great opportunity for growth.