Big Data is a relatively new term, but gathering and processing data is not new. We have been collecting data since the Stone Age, when we recorded it in cave paintings. When we digitized our history and put everything online, we reached a new level. Big Data is the next step in connecting our world.
Who doesn’t like to have exactly what they want before they even start looking? You tell a friend that you want to buy a new laptop, and an hour later you see an advertisement for laptop deals on your social media feed. This is happening right now thanks to Big Data.
What is Big Data?
Big Data is what we call a very large amount of data. It is the information that businesses collect every day about the people they interact with. By itself, all this data doesn’t make much sense; analysis is what gives it power. With analytics tools, we search for patterns and design algorithms to see things we would otherwise never have seen.
But not all data is “Big Data”. Data has to have certain characteristics, also known as the 7 Vs:
Volume
The massive amount of data is the most prominent characteristic of Big Data. All the digital processes around the world generate it. DOMO estimated that by 2020, 1.7 megabytes of data would be created every second for every person on Earth.
Velocity
The information generated every second of every day changes very fast. If I gave you a number, it would be out of date by the time you finished reading this article. This data is created, stored, and processed in real time.
Variety
The information comes from many sources, and it is so varied and complex that it is hard to keep track of. The data can be structured or unstructured. Structured data is information that is easy to read and query, like the rows of a database. Unstructured data is information like emails, images posted on our wall, the sequence of clicks on a webpage, or even whether we watch YouTube videos to the end.
Variability
Variability is different from variety: the data can have the same values while what lies behind them is different. For example, you can go on YouTube every day and make the same number of clicks, but if you look at this data more closely, the content of the videos is different.
Veracity
It is important to check how uncertain the data is. The quantity of the data is not the only thing that matters; its quality matters too. Companies have to know how accurate the data they are collecting is.
Value
Raw data alone doesn’t have any value. It gains value when you transform it into useful information, which becomes knowledge and later informs important actions and decisions. For example, if you have an algorithm that analyzes people’s likes on social media, you can guess the ethnic background of users in a certain area. That is useful knowledge to have when you are designing political propaganda.
Visualization
After the data is processed, it needs to be arranged in a way that shows us new information, such as predicting when a flu outbreak is going to occur by analyzing the Google searches in a certain area.
Nowadays, companies receive large amounts of data from many different sources. In this digital era, everything is connected and interacting in real time. Data could be the new currency of the future. It is up to companies to exploit this new era of information technology.
How does it work?
Technology growth is exponential. More and more people around the world are connected to the internet in one way or another. Our devices, like phones, smart TVs, tablets, and laptops, are sending information every second. Big Data engineers analyze massive data sets to see things we could never have seen otherwise. Companies use Machine Learning and Artificial Intelligence to process this data and produce new information.
We, as humans, learn through experience and repetition. In essence, we replicate the human learning process so that we can train machines to do certain tasks. This training is Machine Learning, which is a subset of Artificial Intelligence. Engineers design algorithms and statistical models that allow computers to perform a task and learn from the results. Computers learn to rely on patterns and inference.
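To make the idea of learning from examples a little more concrete, here is a minimal sketch in Python. It is only an illustration: the viewing numbers, the labels, and the function names are made up, and the technique (a nearest-centroid classifier) is just one of the simplest ways a program can learn a pattern from labeled examples.

```python
from math import dist
from statistics import mean

# Hypothetical labeled examples: (minutes watched, videos liked) -> label.
training = [
    ((2.0, 0.0), "casual"),
    ((3.0, 1.0), "casual"),
    ((4.0, 0.0), "casual"),
    ((25.0, 6.0), "engaged"),
    ((30.0, 8.0), "engaged"),
    ((28.0, 7.0), "engaged"),
]

def train(examples):
    """Learn one 'average example' (centroid) per label."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train(training)
print(predict(centroids, (5.0, 1.0)))
print(predict(centroids, (20.0, 5.0)))
```

The program is never told a rule like "more than ten minutes means engaged". It derives the pattern from the examples, which is the essence of what the paragraph above describes.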
We have more data than a human brain will ever be able to process. We train a computer to analyze all that data in seconds and find the patterns that produce the information we need. Previously, companies needed their own physical servers and premises to store and process this data. Now, engineers have created cloud data warehouses, where companies can store their data and query it for insights.
There are many ways to analyze Big Data. It can be a descriptive analysis, with charts, reports, and other visualizations. It can also be a predictive analysis: using Machine Learning and other Artificial Intelligence techniques, companies can predict events before they happen.
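The difference between descriptive and predictive analysis can be shown with a small Python sketch. The order counts below are invented, the function names are hypothetical, and the "prediction" is a plain least-squares trend line, far simpler than what a real analytics platform would use.

```python
from statistics import mean

# Hypothetical daily order counts for one week (made-up numbers).
daily_orders = [120, 125, 130, 135, 140, 145, 150]

def describe(values):
    """Descriptive analysis: summarize what has already happened."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}

def fit_line(values):
    """Fit a least-squares line through the points (0, v0), (1, v1), ..."""
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

def predict_next(values):
    """Predictive analysis: extend the fitted trend by one more day."""
    slope, intercept = fit_line(values)
    return slope * len(values) + intercept

summary = describe(daily_orders)
forecast = predict_next(daily_orders)
print(summary)
print(forecast)
```

The summary only looks backward, while the forecast uses the same data to say something about a day that has not happened yet.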
It can be a bit frightening to think about all the information companies can know about you. Businesses like Facebook use Big Data to create a picture of who you are. They look at what you like, your hobbies, your average hours of sleep, how much time you spend on your phone, and much more. You may think: why would they care what I’m doing? I’m not a famous person, right? Well, they do care, and they analyze so much data about you that they get to know you even better than you know yourself.
For example, YouTube uses your search history, watched videos, and liked videos to decide which new videos to suggest. But they go much further than that. They use information such as whether you watch a video to the end to estimate how impulsive you are.
The real cost of free apps…
The problem with Big Data is that most people don’t realize how much information they are giving up. We think these apps or services are free, and we use them to socialize with our friends or buy things. But the real cost is the data we are giving away. Governments around the world are trying to introduce new laws to improve consumer protection. However, a lot of these analytics tools can get around these laws. They can use a single category of data, like your purchase history, to guess your personality fairly accurately.
Like any new technology or invention, Big Data has an upside and a downside. This technology can do a great amount of good in fields like health and weather forecasting. They can use data to predict the next outbreak of a disease in Africa or how likely certain areas are to suffer a hurricane in the next year. But it can also do a lot of harm, like when companies sell their customers’ data to others without knowing how it will be used. The possibility of discrimination is one of the biggest worries. Imagine that you apply for a job and they refuse to interview you based on a profile of you they built through Big Data. All this is possible right now.
Which industries are using it?
With digitalization, more companies every day are using Big Data. Your phone sends your location every second. Your computer reports how many documents you saved that day, or how many emails you sent. The applications are endless.
A digital study by IDC showed that in 2010 the total amount of data stood at 1.2 zettabytes (1.2 trillion gigabytes). The same study projected that it would reach 40 zettabytes (40 trillion gigabytes) by 2020. It would take 3 million years to download all the information currently on the internet, and that assumes a download speed of 44 megabits per second.
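To make the unit arithmetic behind these figures concrete, here is a small Python sketch. It assumes decimal units (1 zettabyte = 10^21 bytes, 1 gigabyte = 10^9 bytes), and the function names are made up for this example. Note that the download time depends entirely on which size estimate you plug in: at 44 megabits per second, 40 zettabytes works out to hundreds of millions of years, so the 3-million-year figure presumably rests on a much smaller estimate of the internet’s size, roughly half a zettabyte.

```python
# Assumption: decimal (SI) units throughout.
BYTES_PER_ZETTABYTE = 10**21
BYTES_PER_GIGABYTE = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def zettabytes_to_gigabytes(size_zb):
    """Convert zettabytes to gigabytes."""
    return size_zb * BYTES_PER_ZETTABYTE / BYTES_PER_GIGABYTE

def download_years(size_zb, speed_mbps):
    """Years needed to download size_zb zettabytes at speed_mbps megabits per second."""
    bits = size_zb * BYTES_PER_ZETTABYTE * 8
    seconds = bits / (speed_mbps * 10**6)
    return seconds / SECONDS_PER_YEAR

def implied_size_zb(years, speed_mbps):
    """Data size (in zettabytes) that can be downloaded in the given time."""
    bits = years * SECONDS_PER_YEAR * speed_mbps * 10**6
    return bits / 8 / BYTES_PER_ZETTABYTE

print(zettabytes_to_gigabytes(1.2))   # about 1.2 trillion gigabytes
print(download_years(40, 44))         # on the order of hundreds of millions of years
print(implied_size_zb(3_000_000, 44)) # roughly half a zettabyte
```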
Some of the industries that are using it are:
- Banks: they use advanced analytics to predict and recognize fraud, and to improve the experience for their clients. With Big Data, they could know when you are looking for a new car and offer you credit before you even contact them.
- Education: by using Big Data, schools can identify students who are at risk and improve their system accordingly, designing an education system better suited to their students.
- Government: by using Big Data analysis, governments can build better systems to manage utilities, ease traffic jams, and prevent criminal activity.
- Health: this is one of the industries where real-time responses are necessary. With Big Data, doctors can even predict whether a patient is going to have an infection before any symptoms appear.
- Manufacturing: big companies are using this technique to reduce costs. By using Big Data, they can find the best way to run a process.
- Retail: by having a great amount of data about their clients, retailers know how to sell their products better. They design specific advertising for the groups they know will be more receptive than others.
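As a rough illustration of the fraud detection mentioned for banks in the list above, the following Python sketch flags a payment that is far outside a customer’s usual spending. Everything here is an assumption for the sake of the example: the amounts are invented, the function name is hypothetical, and a simple z-score threshold stands in for the much more sophisticated models banks actually use.

```python
from statistics import mean, stdev

def flag_unusual(history, new_amount, threshold=3.0):
    """Flag new_amount if it lies more than `threshold` standard
    deviations away from the mean of the customer's past payments."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past payments were identical; anything different stands out.
        return new_amount != mu
    return abs(new_amount - mu) / sigma > threshold

# Hypothetical card payments for one customer.
history = [42.0, 38.5, 51.0, 45.2, 40.3, 47.8]

print(flag_unusual(history, 46.0))
print(flag_unusual(history, 900.0))
```

A flagged payment is not proof of fraud; in this sketch it only means the amount is unusual enough to deserve a closer look.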
How are companies using big data?
The client is always right. This is one of the most important things to know when building a successful company. What better way to please your clients than knowing everything you can about them? Almost every company is using Big Data to improve the customer experience and make it more personalized.
Amazon
Amazon is famous for its customer service. It uses Big Data to create a profile of you from the information it gathers as you use the webpage or mobile app. In addition, Amazon uses information like your location and your purchase and search history to give you suggestions it knows you will like. The company uses a collaborative filtering engine that predicts a client’s tastes from data it has already gathered. Let’s say you bought a smart TV recently: Amazon will suggest a new home theater system. According to an article on Investopedia, this method generates 35% of the company’s sales per year.
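The idea behind collaborative filtering can be sketched in a few lines of Python. This is not Amazon’s actual engine, which the article does not describe; it is a toy user-based variant that assumes made-up shoppers and products, uses Jaccard similarity to compare purchase histories, and suggests items bought by the most similar shoppers.

```python
from collections import Counter

# Hypothetical purchase histories (invented users and products).
purchases = {
    "ana": {"smart tv", "home theater", "hdmi cable"},
    "ben": {"smart tv", "home theater"},
    "carla": {"smart tv", "hdmi cable", "soundbar"},
    "dan": {"laptop", "mouse"},
    "eva": {"smart tv"},
}

def jaccard(a, b):
    """Similarity of two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, purchases, top_n=2):
    """Suggest items the user has not bought, weighted by how similar
    the shoppers who bought them are to this user."""
    own = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        similarity = jaccard(own, items)
        if similarity == 0:
            continue  # nothing in common, so this shopper tells us nothing
        for item in items - own:
            scores[item] += similarity
    return [item for item, _ in scores.most_common(top_n)]

print(recommend("eva", purchases))
```

Here the shopper who has only bought a smart TV is matched with other smart TV buyers, so the home theater system ranks first, which mirrors the example in the paragraph above.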
Miniclip
In the game industry, it’s important to gain clients and keep them if you want to make a profit. Miniclip uses Big Data to get to know its clients and what they want. “We measure how we acquire them, which games they play, how much they pay us, how they engage us on social channels, and if and when they churn,” said Paul Bugryniec, Head of Business Intelligence at Miniclip.
Netflix
Netflix was created in 1997 and has used data analytics since the beginning. What better example than the Black Mirror movie Bandersnatch, an interactive movie that unfolds according to the choices of the viewer? This kind of audiovisual material allows the company to build data profiles and understand which plots work well with specific groups of viewers. According to Forbes, the service has 137 million subscribers and is on pace to do $15 billion in sales through fiscal 2018.
eHarmony
eHarmony is a dating site, and its matching is more accurate now than ever. It uses not only personality tests to match people but also information like sleep habits to find your perfect match. The matchmaking service now goes beyond traditional compatibility. Prateek Jain, eHarmony’s head of technology, said in an interview that the company generates behavioral data using Machine Learning (ML) models to offer more personalized recommendations to its users.
Snow Leopard Trust
It’s a non-profit that aims to protect snow leopards in their natural habitat. They are using Big Data analytics and AI to track the hotspots of these animals and protect them from poachers. They also partner with the Nature Conservation Foundation to help track and catch tiger poachers. In the last 4 years, India’s tiger population increased 33 percent to almost 3 thousand animals, according to CNN. All this was possible with Big Data and Artificial Intelligence.
What can we do?
Human judgment in decision-making must always take priority over Big Data. Sometimes the predictions or patterns may be wrong. Companies have to be smart in their decision-making to take the correct actions. We have to keep control of the technology and not let Big Data take control of our actions. We, as users, can at least be more conscious and pay attention to the data we are putting out there when using the internet.
Mechanical engineering student and experienced freelancer in various niches since 2016. She considers herself a tech geek and can write in both English and Spanish. Apart from writing, she dedicates her time to PowerPoint design, app development, and 3D modelling.