Data analytics is the science of analyzing raw data to uncover patterns and extract valuable insights. Its aim is to apply those insights to inform larger organizational decisions and, in turn, achieve real-world goals. For this reason, it is also known as Business Intelligence.
Technically, data analysis is a process of cleaning, preparing, transforming, modeling, and processing data with the goal of discovering meaningful information, drawing informed conclusions, and supporting decision-making. Because of this technical nature, many data analytics techniques and processes have been automated with computer algorithms that prepare raw data for human consumption. Depending on the workflow stage and the analysis requirements, there are four main kinds of analytics that provide varying depths of analysis: Descriptive, Diagnostic, Predictive, and Prescriptive.
Data sets in analytics scenarios are very large and complex, so four types of data analytics have been defined to help uncover the different patterns and stories within the data. The first two types, descriptive analytics and diagnostic analytics, focus on past analysis and are the simplest to perform. The next two types, predictive analytics and prescriptive analytics, attempt to understand the future from the data. These forward-looking analyses tend to be more complex while potentially offering the most valuable insights.
Descriptive Analytics asks: what happened? Answers to these types of descriptive questions are found through routine data analytics and basic reports which are easily automated. These analyses can form baselines for business goals.
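A descriptive baseline of the kind described above can be sketched in a few lines. The daily sales figures here are hypothetical, purely for illustration:

```python
import statistics

# Hypothetical daily sales for one week (descriptive analytics: "what happened?")
daily_sales = [1200, 1350, 1100, 1500, 1250, 900, 1400]

# A basic, easily automated report: totals and central tendency.
report = {
    "total": sum(daily_sales),
    "mean": statistics.mean(daily_sales),
    "median": statistics.median(daily_sales),
    "best_day": max(daily_sales),
    "worst_day": min(daily_sales),
}

for metric, value in report.items():
    print(f"{metric}: {value}")
```

Routine summaries like this can serve as the baselines against which later business goals are measured.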
Diagnostic Analytics asks: why did it happen? Diagnostic techniques dig deeper into the content of the data, for specific periods of time, in order to answer 'why' type questions.
Predictive Analytics asks: what will happen? Building on the descriptions and diagnoses of the previous levels, Predictive Analytics involves methods such as regression analysis, multivariate statistics, pattern matching, predictive modeling, and forecasting to understand the future.
Prescriptive Analytics asks: what can we do? If there is a high degree of achievement at the previous levels of analysis, sophisticated techniques such as graph analysis, simulation, complex event processing, neural networks, recommendation engines, heuristics, and machine learning can be applied to the data to make better future decisions.
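As a minimal sketch of the predictive level, the snippet below fits an ordinary least-squares trend line to hypothetical weekly revenue figures and extrapolates one week ahead; real predictive modeling would use far richer data and methods:

```python
# Hypothetical weekly revenue (in $k); predictive analytics: "what will happen?"
weeks = [1, 2, 3, 4, 5]
revenue = [10.0, 10.8, 11.5, 12.1, 13.0]

n = len(weeks)
mean_x = sum(weeks) / n
mean_y = sum(revenue) / n

# Ordinary least squares: slope = covariance(x, y) / variance(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, revenue)) / sum(
    (x - mean_x) ** 2 for x in weeks
)
intercept = mean_y - slope * mean_x

forecast_week = 6
forecast = intercept + slope * forecast_week
print(f"Forecast for week {forecast_week}: {forecast:.2f}k")
```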
Data analytics applications vary by business and industry. The most common are web analytics, which draws conclusions about user behavior from website traffic, and financial data analytics, which produces the necessary reports on financial data sets. Technology trends also continue to push analytic capabilities into new areas such as edge computing, where automated remote analysis dramatically reduces latency and helps overcome the data glut of today's edge technologies.
Although there are many roles involving business data analytics (data suppliers, data consumers, and data preparers), the role of developing and engineering the data pipeline between suppliers and consumers belongs to data preparers. In this category, the specific roles that contribute to ensuring raw data becomes usable insight are:
Data Scientist — Data scientists deal with large data sets, in the Big Data realm, and dream up the models needed to solve real-world problems. This role devises new data sources, as well as theories around using new forms of data.
Data Analyst — Data analysts are the go-to data analytics person in a company, handling practical and necessary data science. Data analysts may work on business insights using various data products, or develop their own with data tools.
Data Engineer — Data engineers build the pipelines that refine raw data into new usable, valuable, and monetizable data.
Data Steward — Another role developing out of heightened requirements around data governance is the data steward, who is responsible for developing company data governance policies and ensuring compliance with them.
Data Curator — A developing role focused on enhancing final data assets, data curators begin with the needs of data consumers and optimize DataOps content accordingly, giving the business the context it needs.
High-speed processing has given sophisticated data analytics software the ability to analyze data in real time as well as to look back and analyze past performance. But these are not the same processes, and they should be understood to serve different purposes depending on the application.
Take for instance network monitoring analytics, where historic and real-time data are used in different ways. As traffic passes over a network, routers and switches can inspect data packets, identifying unwanted packets by comparing their signatures against a database of known threats. Likewise, enabled by automation, real-time intelligent network monitoring can reroute traffic, reconfigure settings, and even complete minor tasks that 'self-heal' the network. Analyzing data in real time requires speed and compute power sufficient to ingest large volumes of data at high velocity, often sacrificing deep analysis for speed and automation.
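The signature-matching step described above can be sketched as follows. The signatures and packet fields are hypothetical; real monitoring systems match far richer fingerprints at wire speed:

```python
# Hypothetical database of known threat signatures (e.g. payload hashes).
KNOWN_THREAT_SIGNATURES = {
    "deadbeef",
    "cafebabe",
}

def screen_packet(packet: dict) -> str:
    """Return an action for a packet: drop known threats, forward the rest."""
    if packet["signature"] in KNOWN_THREAT_SIGNATURES:
        return "drop"
    return "forward"

# Simulated packet stream with assumed fields.
packets = [
    {"src": "10.0.0.5", "signature": "cafebabe"},
    {"src": "10.0.0.9", "signature": "00ff00ff"},
]
for p in packets:
    print(p["src"], "->", screen_packet(p))
```

The set lookup is constant-time, which is why a comparison like this can run inline on live traffic, while anything deeper is deferred to offline analysis.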
However, in cases of network intrusion or criminal activity, which sometimes prompt an in-depth network forensics investigation, historic network data records can prove to be the only source of truth for analysts. Yet maintaining a source of truth has its challenges. A routine practice is to purge traffic logs older than a few weeks in order to mitigate the costs of data storage. Although a summary of network traffic may be kept, details will be lost, which could make deep investigations impossible.
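The retention trade-off above can be illustrated with a small sketch that aggregates per-host totals before purging raw records outside the retention window. The field names and the 14-day window are assumptions for illustration:

```python
from collections import defaultdict
from datetime import datetime, timedelta

RETENTION = timedelta(days=14)
now = datetime(2024, 6, 30)  # fixed "current" time for the example

# Hypothetical raw traffic log entries.
raw_logs = [
    {"ts": datetime(2024, 6, 1), "host": "10.0.0.5", "bytes": 4096},
    {"ts": datetime(2024, 6, 2), "host": "10.0.0.5", "bytes": 2048},
    {"ts": datetime(2024, 6, 29), "host": "10.0.0.9", "bytes": 1024},
]

# Roll old entries into per-host summaries; keep only recent detail.
summary = defaultdict(lambda: {"records": 0, "bytes": 0})
kept = []
for entry in raw_logs:
    if now - entry["ts"] > RETENTION:
        bucket = summary[entry["host"]]
        bucket["records"] += 1
        bucket["bytes"] += entry["bytes"]
    else:
        kept.append(entry)

print(dict(summary))          # per-host totals survive the purge
print(len(kept), "recent records retained")
```

Note what the summary cannot recover: the timestamps and packet-level detail of the purged entries are gone, which is exactly the limitation a forensics investigation runs into.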
Data analytics vendors abound providing businesses with ample solutions to solve their data analysis needs. There are stand-alone data tools, but analytics platforms offer businesses solutions with full capabilities to absorb, organize, discover, and analyze their data.
Some platforms require IT expertise to set up the analytics environment, connect data sources, and prepare data for usage; while others are user-friendly, designed with the non-expert in mind. These user-friendly platforms are known as self-service, and allow data consumers to prepare, model, and transform data as they need to make business decisions.
Data analytics and big data are terms that often appear together and can be confused to mean the same thing. Data analytics is about finding patterns within data, typically structured data, in sets significantly smaller than Big Data sets. Statistical analysis is its primary tool, and its purpose is usually business problem-oriented.
Big Data analytics, however, is characterized by a high variety of structured, semi-structured, and unstructured data, drawn from sources like social media, mobile, smart devices, text, voice, IoT sensors, and the web, and further by the high velocity and high volume at which its data pipelines ingest that data.
Though there is no official big data size, big data operations can be measured in the terabytes and petabytes for organizations like eBay and Walmart, and in the zettabytes for Google or Amazon. Once collected, data can reside in an unstructured form in data lakes available for processing by data preparers. After processing, the filtered and structured data is maintained in data warehouses to be used by data consumers.