This post is part of a series that covers Artificial Intelligence, aiming to introduce and exemplify the possibilities and options available, in addition to addressing the context and usability.
Introduction to Artificial Intelligence and Data Analytics
We are always generating data. Everything we do, the way we behave, our opinions, our preferences, and even the things we own can be treated as data that describes or defines something.
In the image above we see the same situation twice: people going from one place to another and doing something in the meantime.
Let’s imagine that in the second image people are accessing a newspaper app, which is completely possible.
They have their preferences and opinions. In the first image the relevant data is highlighted with a pen, and in the second image with a finger on a cell phone.
In the second image the man, sometimes without noticing, scrolls the cell phone screen directly to the type of news he wants to read, while the woman in the first image turns the pages until she reaches the page she likes the most.
In this case, they are reading the same type of data, news, at different times/years and with different content. So, considering that in the two images people are doing exactly the same thing, what would be the difference besides the time and content displayed?
The difference is that in the second image, in addition to generating data, we are collecting it, while in the first image that data was lost. At that moment, the woman was the only person who knew about her own preferences and behavior.
Generating data is nothing new, and we do it every second without noticing.
Collecting data, however, is relatively new and is expanding rapidly with the creation of new apps, websites, virtual assistants, and even wearable technology. In a cycle, these in turn allow the generation of new types of data that we would not generate without them.
One of the main differences between structured and unstructured data is how easily each can be analyzed. Structured data is generally easy to search and process. Unstructured data is much more difficult to search and analyze. Once collected, this data has to be processed to understand its applicability.
The lines between structured and unstructured data are often blurred because many datasets today have both structured and unstructured fields. For example, unstructured data like a photograph can still have structured components such as the date, image size, and resolution.
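To make the photograph example concrete, here is a minimal Python sketch. The `Photo` class, its field names, and the sample records are all hypothetical and only illustrate the idea: the date, dimensions, and file size are structured fields that can be filtered directly, while the pixel payload is an opaque blob that would need further processing before it could be analyzed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Photo:
    # Structured fields: fixed type and meaning, easy to search and sort.
    taken_on: date
    width_px: int
    height_px: int
    size_bytes: int
    # Unstructured payload: raw bytes with no schema describing the content.
    pixels: bytes


# Hypothetical sample records; the pixel payloads are placeholders.
photos = [
    Photo(date(2021, 3, 14), 1920, 1080, 2_400_000, b"\x00" * 16),
    Photo(date(2022, 7, 1), 640, 480, 310_000, b"\x01" * 16),
    Photo(date(2022, 11, 9), 4032, 3024, 5_100_000, b"\x02" * 16),
]


def taken_in_year(items, year):
    """Select photos using a structured field; no image analysis is needed."""
    return [p for p in items if p.taken_on.year == year]


recent = taken_in_year(photos, 2022)
print(len(recent))  # prints 2
```

Answering a question about what the photos actually show would require analyzing `pixels`, which is the part that is hard to search.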
Unstructured data is commonly estimated to represent almost 80% of all global data, the largest share by volume. Even so, given the need for innovation, understanding, and agility, many companies have chosen to store structured and semi-structured data whenever possible, to obtain new information and insights and therefore a competitive advantage.
Artificial Intelligence algorithms are one of the main allies in this process. Aiming to simulate human intelligence in machines, these models process the data and identify patterns and behavior, providing more information about the dataset and a better understanding of it.
While it sounds like a new concept, it’s not. The birth of Artificial Intelligence took place in the mid-1950s, with several scientists from different fields (mathematics, psychology, engineering, economics and political science) starting a discussion about the possibility of creating an “Artificial Brain”.
In 1950 Alan Turing published a landmark paper in which he speculated about the possibility of creating machines that think. He noted that “thinking” is difficult to define and devised his famous Turing Test.
At that time, many early AI programs used the same type of algorithm to achieve a goal, such as proving a theorem. We consider these programs Process-Driven because they were designed to perform a sequence of actions with a particular purpose, simulating human behavior in performing tasks.
Since then, the concepts and algorithms have continued to improve, and with that came the need for statistical models that do not have to be explicitly programmed: Data-Driven models built with Machine Learning algorithms such as Linear Regression, Logistic Regression, Decision Tree, Naive Bayes, SVM, K-Means, and Gradient Boosting.
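The contrast between a hand-written rule and a model learned from data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the data, and the rule itself are hypothetical, and the learner is plain ordinary least squares for a single input, the simplest form of Linear Regression.

```python
def rule_based_estimate(minutes):
    # Process-driven: the programmer writes the relationship by hand.
    return 2 * minutes + 1


def fit_line(xs, ys):
    """Data-driven: estimate slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: minutes spent in a news app vs. articles opened.
minutes = [1.0, 2.0, 3.0, 4.0, 5.0]
articles = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(minutes, articles)


def predict(x):
    # The coefficients come from the data, not from the source code.
    return slope * x + intercept


print(slope, intercept)  # prints 2.0 1.0
```

If the observations change, `fit_line` produces different coefficients without anyone editing the code, whereas `rule_based_estimate` has to be rewritten by hand.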
Years later the use of Deep Learning algorithms, a subset of Machine Learning, also became popular. It grew out of the idea that algorithms could imitate the functioning of the human brain, structured in layers to form an artificial neural network. The idea has been around since the beginning of AI itself, but it was hard to put into practice because it requires more computing power.
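The idea of structuring computation in layers can be illustrated with a tiny forward pass in Python. This is a sketch, not a trained model: the weights and biases are hand-picked hypothetical values, the sigmoid activation is an assumed choice, and real networks learn their weights from data and are far larger, which is where the computing power goes.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of the inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]

output = forward([1.0, 2.0], layers)
print(output)
```

Adding more entries to `layers` makes the network deeper, and the cost of each forward pass grows with the number of weights.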
One model is not necessarily better or worse than another, and they are not mutually exclusive: they can be, and often are, used together. The important thing is to take into account the use case, the available infrastructure, the training time, the processing capacity, and the desired input and output.
Elastic, for example, uses Machine Learning models to provide an integrated solution to support the analysis and understanding of data stored in Elasticsearch. For this reason, we are going to delve deeper into Machine Learning models and the available Elastic solutions.