Data Analysis Tools

As mentioned in my previous post, here I list the tools, blogs, forums and online courses that I have gathered over the past year. I found them necessary in my own journey, and I hope they will help my fellow data science aspirants.
 Skillset Required:
  • Knowledge of statistics: exploratory analysis, that is, an initial pass over the data to understand it and decide which techniques need to be applied. I feel this is a must-know subject.
  • Mathematics: basics of calculus, algebra, etc. for the mathematical formulation of the problem statement.
  • Understanding of machine learning algorithms for predictive modeling, recommendation engines, classification models, cluster analysis and social network analysis.
  • Data mining skills such as data cleaning/data munging and applying machine learning techniques to the data.
  • Visualization skills to display the results and to understand intermediate results while building data models.
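To make the exploratory analysis and data cleaning skills above a little more concrete, here is a minimal sketch in Python using only the standard library. The values in `raw_values` and the helper names `clean` and `summarize` are made up for illustration; a real project would more likely use R or a dedicated data analysis library.

```python
import statistics

# Hypothetical raw measurements; None and "" stand in for missing entries.
raw_values = [12.5, None, 14.0, "", 13.2, 55.0, 12.9, 13.4]


def clean(values):
    """Drop missing entries and convert the rest to float."""
    return [float(v) for v in values if v not in (None, "")]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


cleaned = clean(raw_values)
summary = summarize(cleaned)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this made-up data the mean sits well above the median, which hints at an outlier (the 55.0 reading). Spotting that kind of thing early is what helps decide which techniques to apply next.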
Tools required: 
Programming Languages: Proficiency in any two of the languages below is advisable:
  •  R 
  • Python 
  • Java – comes in handy when we work on Hadoop 
  • C, C++
Analysis tools: Since I use open source tools, I will confine the list to them:
  • RStudio
  • NLTK (Natural Language Toolkit)
  • RapidMiner
  • Weka
Important Point:
Most machine learning algorithms have already been implemented as packages in the above languages/tools. We just need to download them and make use of them.

Big data Tools: 
  • Hadoop setup from Cloudera/Hortonworks
  • MongoDB: NoSQL database
  • HBase, Pig, Hive
Visualization tools: 
Though I have not explored this area much, so far I am happy with the R packages for visualization.
  • Data exploration in R/Python 
A few books I have referred to:
  • The Elements of Statistical Learning - 2nd Edition 
  • Simon Sheather, A Modern Approach to Regression
  •  Data Mining 3rd Edition by Ian H. Witten, Eibe Frank, Mark A. Hall
Online Courses: 
Though a lot of courses are available online, I have stuck to the few sites below:
For Data Analysis, Stats, Maths:
For Big Data: 
Blogs and forums: 
Online forums are one place where I get a lot of information. In LinkedIn groups I could get answers to even my most trivial questions. You post a query and you get elaborate answers from everyone from research scholars to industry experts. I really love this place. Here are a few LinkedIn groups I follow:
Blogs I follow:
I will add more as I come across new tools. I hope this serves as a starting point for your journey. All the best, and Happy New Year. Please do add any new tools and technologies to the above list.
