Info about BI | DA | DS
General Info:
The technologies and techniques below are used in the data field to analyse any form of data and extract meaningful insights.
BI - Business Intelligence
DA - Data Analytics
DS - Data Science
Evolution of BI:
Data analysis has evolved through different stages, from Business Intelligence to Data Science.
The model we develop determines the level of intelligence with which we can analyse the data.
To excel in analytics, one needs the skills below.
- Knowledge of statistical tools like R, Python, SAS (must have)
- Analytics techniques using statistical tools (must have at least a few from the list below)
- Domain exposure (nice to have)
- Data visualization tools like Tableau, QlikView, etc. (nice to have)
Analytics Techniques:
- Advanced Statistics
- Data Mining
- Predictive Modeling
- Time Series Forecasting
- Machine Learning
- Optimization Techniques
Domain Exposure:
- Market and Retail Analytics
- Web and Social Media Analytics
- Finance and Risk Analytics
- Supply Chain and Logistics Analytics
Each business process or subject area in a business can be an area of analytics.
Eg: Financial Analytics, HR Analytics, Sales Analytics, Marketing Analytics, etc.
Analytics, in the broader sense, covers the concepts below:
1. Descriptive analytics - past - what happened, when, how, who, how many?
2. Diagnostic or Discovery analytics - past - why did it happen?
3. Predictive analytics - future - what will happen?
4. Prescriptive analytics - what action should we take? How can we make it happen?
Prescriptive analytics both predicts the outcome and prescribes the solution.
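To make the difference between looking at the past and looking at the future concrete, here is a minimal sketch in plain Python. It covers only the descriptive and predictive cases. The sales figures and variable names are hypothetical, and the straight-line trend is just one simple way to forecast, chosen for illustration.

```python
# Hypothetical monthly sales figures (illustrative data only).
sales = [100.0, 110.0, 120.0, 130.0]
months = list(range(1, len(sales) + 1))

# Descriptive analytics: summarise what already happened.
total_sales = sum(sales)
average_sales = total_sales / len(sales)

# Predictive analytics: fit a straight line y = slope * x + intercept
# by ordinary least squares, then extrapolate one month ahead.
mean_x = sum(months) / len(months)
mean_y = average_sales
covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales))
variance_x = sum((x - mean_x) ** 2 for x in months)
slope = covariance / variance_x
intercept = mean_y - slope * mean_x

next_month = months[-1] + 1
forecast = slope * next_month + intercept

print(f"Total so far: {total_sales}, monthly average: {average_sales}")
print(f"Forecast for month {next_month}: {forecast}")
```

A prescriptive step would go further and recommend an action based on the forecast, for example how much stock to order.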
Big data:
Big data is a problem statement characterised by four V's (Volume, Velocity, Variety, Veracity).
Hadoop is one solution for processing big data on distributed systems: HDFS + MapReduce (or Hive/Pig).
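The MapReduce idea can be sketched on a single machine in plain Python. This is not Hadoop's API. The function names and the sample documents are hypothetical, and the sketch only illustrates the map, shuffle and reduce steps that Hadoop would run in parallel across nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical input: each string stands in for a data block stored on a different node.
documents = ["big data is big", "data is everywhere"]

mapped = []
for doc in documents:  # a real cluster would map the blocks in parallel
    mapped.extend(map_phase(doc))

word_counts = reduce_phase(shuffle_phase(mapped))
print(word_counts)
```

Because each map call only sees its own document and each reduce call only sees one key, the work can be spread over many machines, which is what makes the model suitable for large volumes.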
Other big data solutions for real-time processing:
- NoSQL (Cassandra/MongoDB) hosted on any distributed system
- Spark SQL hosted on a distributed system such as AWS EMR
Business Intelligence: (SQL query based analysis)
Extract business insights from enterprise data to develop reports/dashboards for decision making.
It supports descriptive analytics.
Data source: Enterprise data from multiple data sources
To do:
- Develop a data relationship model (preferably a star schema) by applying SQL functions, joins and calculations to render data for reports/dashboards and ad hoc queries for decision making
- Develop BI reports with features like tables, graphs/charts, drill-down, prompts, etc. that cater to business requirements
Eg: Bank Performance, Sales Order, Marketing, etc.
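As a rough illustration of a star schema and a report query, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, columns and rows are all hypothetical. A real BI model would have many more dimensions and would normally live in a data warehouse rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table in the centre, surrounded by dimension tables (the "star").
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Loans'), (2, 'Cards');
INSERT INTO dim_date VALUES (1, 2023), (2, 2024);
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 50.0);
""")

# Report query: join the fact table to its dimensions and aggregate.
report = conn.execute("""
SELECT p.category, d.year, SUM(f.amount) AS total_amount
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date AS d ON d.date_id = f.date_id
GROUP BY p.category, d.year
ORDER BY p.category, d.year
""").fetchall()

for category, year, total_amount in report:
    print(category, year, total_amount)

conn.close()
```

Grouping by a coarser or finer dimension column (for example year versus month) is what a drill-down in a BI report does behind the scenes.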
Data Analytics: (Statistics based analysis)
Extract actionable insights from business data for forecasting and prediction.
The analysis is more specific and focused on the problem statement.
It covers predictive and prescriptive analysis.
Learning process: (uses historical data)
Extract the features from raw data
Develop algorithms based on machine learning techniques
Use analytical tools/programming to apply statistical methods based on those algorithms
The analytical program produces optimized parameters or values, called the model (the output of the algorithm) - this is training the model
Prediction process: (uses live data)
Apply the trained model to new data (test or live data)
Features or Independent variables: age, sex, complaints, experience, etc., derived from historical data
Target or Dependent variables: known responses related to the problem statement
Your algorithm tries to learn the relationship between the features and the target.
What is learnt? The parameters or values, which are called the model. Learning happens in an iterative way:
if the error is high, the algorithm compares its output with the target and updates the parameters until they are optimized; the final values are the model.
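The iterative loop described above can be sketched with gradient descent on a one-feature linear model. This is only one possible learning algorithm, picked here for illustration. The data, the learning rate, the number of iterations and the `predict` helper are all hypothetical choices.

```python
# Hypothetical historical data: one feature and a known target per record.
features = [1.0, 2.0, 3.0, 4.0]
targets = [3.0, 5.0, 7.0, 9.0]  # generated from target = 2 * feature + 1

# The parameters the algorithm learns; together they are "the model".
weight, bias = 0.0, 0.0
learning_rate = 0.05
n = len(features)

for _ in range(5000):
    # Compare the current predictions with the known targets.
    errors = [(weight * x + bias) - y for x, y in zip(features, targets)]
    # Gradients of the mean squared error with respect to each parameter.
    grad_weight = 2 * sum(e * x for e, x in zip(errors, features)) / n
    grad_bias = 2 * sum(errors) / n
    # Update the parameters in the direction that reduces the error.
    weight -= learning_rate * grad_weight
    bias -= learning_rate * grad_bias


def predict(x):
    """Prediction process: apply the trained parameters to new data."""
    return weight * x + bias


print(f"Learnt model: weight={weight:.4f}, bias={bias:.4f}")
print(f"Prediction for feature 5.0: {predict(5.0):.4f}")
```

On this toy data the parameters settle close to weight 2 and bias 1, the values used to generate the targets, so the learnt model recovers the underlying relationship.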
Data source: Business data from single or multiple data sources
To do: Develop a data relationship model by applying statistical and mathematical functions on raw data to find answers to predefined questions
Eg: Credit card fraud detection system
Business analytics is described more in functional terms, like People Analytics, Spending Analytics.
It communicates benefits such as cost impact and value impact to the customers.
Data science: (Data analytics + Data mining + AI/ML)
Data science is an umbrella term that covers BI and Data analytics; it is a multi-disciplinary field. It is also called advanced data analytics.
The analysis is more generic with respect to the problem statement.
Data source: Business data from single or multiple data sources, advanced devices (through ML), IoT
To do: Develop a model by applying statistical functions to create questions.
Programming is applied on the modeled data
Eg: ????
Artificial Intelligence:
AI branches and divisions:
Machine Learning:
Machine learning is an application of artificial intelligence (AI) that gives systems the ability to automatically learn and improve from experience without being explicitly programmed.
Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.
Machine learning (ML) is a category of algorithms that allows software applications to become more accurate in predicting outcomes without being explicitly programmed.
It can apply techniques and build a model with a reasonable amount of data.
ML algorithms are as below:
Supervised:
- Linear regression - Simple, Multiple
- Logistic regression
Unsupervised:
- Clustering - K-means, Hierarchical
- Dimensionality reduction
Decision trees
Neural networks
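As one example from the list, K-means clustering can be sketched in a few lines of plain Python. This is a simplified one-dimensional version with hand-picked starting centroids. The function name `kmeans_1d`, the data points and the fixed iteration count are assumptions made for illustration, and a real project would more likely use a library implementation.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Simplified K-means on 1-D data: alternate assignment and update steps."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: abs(point - centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled data with two obvious groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_centroids, final_clusters = kmeans_1d(points, centroids=[1.0, 12.0])

print("Centroids:", final_centroids)
print("Clusters:", final_clusters)
```

No target variable is supplied anywhere, which is what makes this unsupervised: the algorithm finds the groups from the features alone.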
Deep Learning:
Deep Learning is a more intensive form of ML. It requires a vast amount of data to apply techniques and build a model.
IOT: