In the age of digitalization, more and more companies are facing huge amounts of
data that are generated very quickly by many different sources. We call this big
data. What can we do with big data, and what role does it play in the field of
data science? How can big data impact businesses, the economy and society?
Let’s try to explain some basic concepts and connect them into a meaningful,
perhaps even understandable, whole.
Data can be created by people or generated by machines or devices, such as
sensors gathering climate information, satellite imagery, digital pictures and
videos, purchase transaction records, GPS signals, and other information. The
main advantage of big data analytics is that it can reveal patterns and
connections between different sources and datasets, allowing for useful
insights and better decisions.
On its website, the European Commission cites healthcare, manufacturing, food security, intelligent transport systems, energy efficiency and urban planning as examples of areas where big data is used. Using big data in these areas ultimately allows for increased productivity and better services, which are a source of economic growth.
The future is in the data
Generating value at the different stages of the data value chain will be at the heart of
the future knowledge economy. Improved analytics and processing of data,
especially of big data, will enable the transformation of Europe’s service
industries by generating a wide range of innovative information products and
services.
According to the European Commission, analytics and processing of big data will increase
the productivity of all sectors of the economy through improved business
intelligence and enable more efficient solutions to many of the challenges that
face our societies. The Commission also expects improved research and faster
innovation, cost reductions through more personalised services, and increased
efficiency in the public sector.
Due to the exponential growth of the volume, variety and velocity of data, data
sets are becoming increasingly difficult to capture, manage and process with
conventional means. Getting value from the vast amounts of data that users
generate daily has become crucial for companies such as Google and Facebook.
In doing so, such companies benefit from real-time market data, as they make decisions
easier for other companies, which in turn can lead to higher revenues and lower
costs. Analysis of large volumes of data can provide detailed business
information on customer behaviour or consumer profiling.
Big data is essentially a special application of data science, which involves many specific domains and skills. The general definition is that data science encompasses all the ways in which information and knowledge are extracted from data.
As mentioned, data is everywhere and is found in huge and exponentially increasing
quantities. Data science as a whole reflects the ways in which data is
discovered, conditioned, extracted, compiled, processed, analysed, interpreted,
modelled, visualized, reported on, and presented regardless of the size of the
data being processed. Big data is therefore a special application of data
science.
Data science is a very complex and intertwined field, as it incorporates mathematics,
statistics, computer science and programming, statistical modelling, database
technologies, signal processing, data modelling, artificial intelligence and
machine learning, natural language processing, visualization, predictive analytics and
so on. It is applicable to all the areas we have mentioned for big data and many
more.
How is the
The life cycle of useful data, collected in various ways, usually includes its capture,
pre-processing, storage, retrieval, post-processing, analysis, visualization,
and so on. Once captured, data is usually referred to as being structured,
semi-structured, or unstructured. These distinctions are important because they
are directly related to the type of database and storage technologies required,
the software and methods used to query and process data, and the complexity of
dealing with the data.
Structured data refers to data that is defined by a structure or schema in a database or
spreadsheet. Unstructured data is data that is not defined by any schema,
model, or structure, and is not organized in a specific way. In other words,
it is just stored raw data. It follows naturally that semi-structured data
is a combination of the two.
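To make the three categories more concrete, here is a minimal Python sketch. All of the sample records, column names and the helper field_names are made up for illustration; they are assumptions, not data from any real system. It shows the same kind of customer information as a fixed-schema table, as JSON records whose fields vary, and as free text.

```python
import csv
import io
import json

# Structured: every row follows the same fixed schema (hypothetical columns).
structured_csv = "customer_id,city,amount\n1,Ljubljana,19.90\n2,Maribor,5.50\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: keys describe the values, but records may differ in shape.
semi_structured = json.loads(
    '[{"customer_id": 1, "tags": ["new"]},'
    ' {"customer_id": 2, "note": "called support", "tags": []}]'
)

# Unstructured: raw text with no schema; any structure has to be inferred.
unstructured = "Customer 2 called on Monday and asked about a late delivery."


def field_names(records):
    """Return the sorted set of keys used across a list of dict records."""
    return sorted({key for record in records for key in record})


structured_fields = field_names(rows)        # identical for every row
semi_fields = field_names(semi_structured)   # union of differing records
word_count = len(unstructured.split())       # the most we know without parsing
```

The structured rows can be queried by column straight away, the JSON records need a check for which fields are present, and the free text would need some form of text processing before it could be analysed at all.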
In order for data to be used in a meaningful way, it must first be captured, pre-processed and stored, experts say. Following this process, the data can be mined, processed, described, analysed, and used to build models that are both descriptive and predictive. Descriptive statistics is the application of statistics to a data set in order to describe and summarize the information that the data contains. It includes describing the data as well as other forms of analysis and visualization.
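As a small illustration of descriptive statistics, the following Python sketch summarizes a tiny data set with the standard library's statistics module. The daily_sales figures are invented for this example and stand in for any numeric column one might want to describe.

```python
import statistics

# Hypothetical daily sales counts for one week (illustrative values only).
daily_sales = [12, 15, 11, 19, 15, 22, 18]

# Describe the data set: its size, typical value and spread.
summary = {
    "count": len(daily_sales),
    "mean": statistics.mean(daily_sales),
    "median": statistics.median(daily_sales),
    "mode": statistics.mode(daily_sales),
    "stdev": statistics.stdev(daily_sales),  # sample standard deviation
    "min": min(daily_sales),
    "max": max(daily_sales),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Nothing here predicts anything; the summary only condenses what the collected values already contain.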
Inferential statistics and data modelling, on the other hand, are very powerful tools that
can be used to gain a deep understanding of the data and predict meaning and
results for conditions beyond those for which data has been collected. Using
certain techniques, models can be created, and decisions can be made
dynamically based on the data involved.
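One very simple way to predict a result for a condition that was never observed is to fit a straight line to the collected data and extend it. The sketch below does this with ordinary least squares in plain Python. The ad_spend and revenue numbers and the helper fit_line are hypothetical, and a straight line is only one of many possible models, chosen here for brevity.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: advertising spend and resulting revenue.
ad_spend = [1.0, 2.0, 3.0, 4.0]
revenue = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(ad_spend, revenue)

# Predict revenue for a spend level outside the observed range (1.0 to 4.0).
predicted = slope * 6.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

A prediction this far outside the observed range should be treated with caution, since the data say nothing about whether the linear trend continues there.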
What have we
We have never before collected as much varied data as we do today, nor have we needed
to handle it as quickly. The variety and amount of data that we collect through
different mechanisms is growing exponentially. This growth requires new
strategies and techniques to capture, store, process, analyse and visualize
data.
Data science is therefore an umbrella term that encompasses all of the techniques
and tools used during the life cycle stages of useful data. On the other hand,
big data typically refers to extremely large data sets that require specialized
and often innovative technologies and techniques in order to use the data.
Both of these fields are going to get bigger and become much more important with time.
The demand for qualified practitioners in both fields is growing rapidly and
they are becoming some of the hottest and most lucrative fields to work in.
Armed with at least a basic explanation of key concepts involved with data
science and big data, you may now be better able to understand some of the
other technologies we already have or are going to introduce.
Author: Rok Žontar
Keywords: Big Data, Data science, technology, digitization, European Commission.
This article is part of a joint project of the Wilfried Martens Centre for European
Studies and the Anton Korošec Institute (INAK) Following the path of
digitalization in Slovenia and Europe. This project receives funding from the
The information and views set out in this article are those of the author and do
not necessarily reflect the official opinion of the European Union
institutions/Wilfried Martens Centre for European Studies/ Anton Korošec
Institute. Organizations mentioned above assume no responsibility for facts or
opinions expressed in this article or any subsequent use of the information