Monday, April 17, 2017

Analyst and Data Scientist skill set in Digital Marketing. Applying tools and technologies to be successful. Part 2.



The business need for data analysis will only grow. In 2011, International Data Corporation (IDC) estimated the volume of the world's data at 1.8 zettabytes (1.8 trillion GB); two years later, in 2013, it had already reached 4.4 zettabytes. Last year, aggregate Internet traffic alone exceeded 1.1 zettabytes. More than 2.5 quintillion bytes of data are generated every day, and the total volume at least doubles each year.
Competent digital analytics is required today for every company working on the Internet, including those in marketing. The demand for data scientists is also growing. According to IDC calculations, US companies will need 181 thousand specialists in big data analysis by 2018. It is a rapidly growing professional field, where tools and technologies change quickly too. There are several main entry points into data science for digital analysts: Excel, SQL, SAS, R, and Python.
Excel
Pretty simple and familiar to everyone. The range of analytical problems it can solve is limited only by the user's knowledge, skills and imagination.
SQL
SQL is universal and therefore popular with analysts. It is used inside SAS, R, and Python alike. It lets you create databases, select data from tables according to specified conditions, and easily group the results.
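As a rough illustration of selecting rows by a condition and grouping them, here is a minimal sketch that runs SQL from Python through the standard `sqlite3` module. The `visits` table, its columns, and all of the numbers are hypothetical and invented only for this example.

```python
import sqlite3

# Hypothetical campaign data, invented purely for illustration:
# (channel, day, clicks, orders)
rows = [
    ("email", "2017-04-01", 120, 6),
    ("email", "2017-04-02", 150, 9),
    ("search", "2017-04-01", 300, 12),
    ("search", "2017-04-02", 280, 14),
    ("social", "2017-04-01", 90, 1),
]

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (channel TEXT, day TEXT, clicks INTEGER, orders INTEGER)"
)
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", rows)

# Select only the days with at least 100 clicks, then group by channel.
query = """
    SELECT channel, SUM(clicks) AS clicks, SUM(orders) AS orders
    FROM visits
    WHERE clicks >= 100
    GROUP BY channel
    ORDER BY channel
"""
summary = conn.execute(query).fetchall()
conn.close()

for channel, clicks, orders in summary:
    print(channel, clicks, orders)
```

The same `SELECT ... WHERE ... GROUP BY` statement would work largely unchanged when issued from SAS or R, which is part of why SQL is a convenient common ground.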
"Godfather" analysts: used in business since 1976. It is easier to learn than R and Python. It has detailed documentation and customer support, so many companies that have been on SAS for a long time rarely go over to something else. But younger organizations often use R or Python.
A SAS certificate is an advantage for job seekers. But training is expensive, and the SAS system is costly for companies. To be in demand in the labor market, a digital analyst is clearly better off learning R or Python.
R
R first appeared in 1993. At first it was used primarily in scientific and applied research, but it quickly gained momentum in business. The popularity of R continues to grow, challenging the almost 40-year-old monopoly of SAS. Analysts appreciate this language for its simplicity and functionality, and for its free tools. There are about 12 thousand statistical, graphical and analytical packages for R.
Python
Because Python is open source, it is a great free alternative to R. Its advantages are code readability and suitability for production environments. Many systems for processing large amounts of data use Python. There are also disadvantages: for example, R has more statistical packages. But Python has many libraries and modules that help an analyst grow into a data scientist.
There are also many different sets of libraries and statistical programs that, in different combinations, give freedom in choosing how to work with data and build forecasts.
The foundation of big data analytics is SQL and Excel. To move further within the profession, it is worth choosing one of the languages depending on your needs: R, Python or SAS. Ideally, use all three. For example:
 • SAS - to process big data sets;
 • Python - to create a machine learning model;
 • R or SAS Visual Analytics - to provide a graphical representation of the data.
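To give a feel for the Python item in the list above, here is a minimal sketch of the simplest kind of predictive model: a straight line fitted by ordinary least squares, written in plain Python. It stands in for a machine learning model only for illustration; a real project would more likely use a dedicated library. The `fit_line` helper and the spend and order figures are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical weekly ad spend (thousand USD) and resulting orders.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
orders = [12.0, 19.0, 31.0, 42.0, 48.0]

slope, intercept = fit_line(spend, orders)

# Use the fitted line to forecast orders for a spend of 6 thousand USD.
forecast = slope * 6.0 + intercept
print(f"slope={slope:.2f} intercept={intercept:.2f} forecast={forecast:.1f}")
```

The point of the sketch is the workflow rather than the algorithm: prepare the data, fit a model, then use it to forecast a value that has not been observed yet.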
The most important tools an analyst needs are an understanding of what to calculate and why, and the ability to express thoughts clearly. The skill of asking questions and formulating results correctly will save hours of data processing. The main thing to remember: as in any field, 80% of the result comes from formulating the problem correctly and 20% from known heuristics.


