DOT CLUB-IBS HYDERABAD

A resourceful destination for academicians, corporate professionals, researchers & tech enthusiasts

Wednesday, November 25, 2020

Hurdles in the Data Science Industry


“Data is the new oil” – Clive Humby


In today's world, the word ‘data’ carries enormous value. It is not only the plural form of datum but also one of the most valuable assets anyone can hold. From the current date and time at a place to stock market numbers, everything is data. Because data is available in such volume and comes from so many sources, we often call it Big Data. But the data that surrounds us is messy and unorganized. It needs to be analyzed and arranged in a way that the general public can easily understand, and that is how raw data is converted into information.

               


The information we get from collected observations needs to be analyzed further and converted into a form that can be used to predict future trends, perform certain operations, or help solve real-world problems. Data science, which works closely with machine learning, is the field in which data fed into a machine is analyzed and put through various calculations to find the trend in the growth or fall of a specific dataset. It mainly draws on statistics, mathematics and computer science to visualize data in a way that reveals the rise and fall of values and helps in predicting outcomes. It helps organizations identify the regions where their products sell the most, investigate which diseases have the greatest impact in which areas, identify the products that bring a firm the most profit, and so on.
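To make the idea of "finding the trend and predicting the outcome" concrete, here is a minimal sketch that fits a straight line to a short series by ordinary least squares and extends it one step ahead. The sales figures and the function names `fit_trend` and `predict` are made up for illustration; real projects would normally use a statistics library and a model suited to the data.

```python
from statistics import mean

def fit_trend(values):
    """Fit y = slope * x + intercept by least squares, with x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = range(n)
    x_mean = mean(xs)
    y_mean = mean(values)
    # Covariance of x and y, and variance of x (both unnormalized)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

def predict(slope, intercept, x):
    """Evaluate the fitted line at position x."""
    return slope * x + intercept

# Hypothetical monthly sales figures (illustrative data only)
monthly_sales = [100, 110, 120, 130]
slope, intercept = fit_trend(monthly_sales)
next_month = predict(slope, intercept, len(monthly_sales))
print(f"trend: {slope:+.1f} per month, next month estimate: {next_month:.1f}")
```

A positive slope indicates a rising trend and a negative slope a falling one. A straight line is only the simplest possible model, so the estimate is a rough guide rather than a forecast to rely on.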


So far, we have seen that data science is a very beneficial field today, and that implementing it can help firms grow better and faster. But like any other technology, this boon comes with hurdles that need to be dealt with before it can be implemented across sectors. One of the primary obstacles is the lack of people and infrastructure needed to maintain this technology. Implementing data science requires in-depth knowledge of computers and of the languages and software used for the purpose. It also requires people to understand how the businesses and services in the various sectors function. Training people in all of this is in itself a big task.

                                        
 

Another important factor to consider is that the quantity of data is increasing every day. It was reported in November 2017 that Google processed about 25 PB of data every day, and the figure has grown a lot since then. To process data at this scale, we need machines that are up to the task, and producing such powerful computers incurs huge costs. Even if such machines are built, a single machine would still take considerable time to process the information. This is where distributed frameworks come into the picture. In such a framework, multiple computers connected over a network each perform a specific part of the task.
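The split, process, and combine pattern behind such frameworks can be sketched in a few lines. This is only an illustration under a big assumption: the "workers" here are threads on one machine standing in for separate computers on a network, and the function names (`split_into_chunks`, `count_words`, `distributed_word_count`) are hypothetical. A real distributed framework also handles data transfer, scheduling and machine failures, none of which is shown.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(records, n_chunks):
    """Divide the records into roughly equal parts, one per worker."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]

def count_words(chunk):
    """The task each worker performs on its own share of the data."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts

def distributed_word_count(records, n_workers=3):
    """Send each chunk to a worker, then merge the partial results."""
    chunks = split_into_chunks(records, n_workers)
    # Threads stand in for networked machines in this sketch
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total

# Made-up sample records
records = ["data is the new oil", "data is everywhere", "big data"]
result = distributed_word_count(records)
print(result.most_common(2))
```

The point of the design is that no single worker ever needs to see the whole dataset: each one handles its own chunk, and only the small partial results are brought together at the end.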


It is important to have data for the analysis, but it is even more important to have the correct data at our disposal. Collecting and filtering data consumes a good amount of time, which delays the analysis and its results. In the process, we may also end up losing important information hidden in the dataset.
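A small example shows how filtering can quietly discard useful data. The records below are invented for illustration, and the helper names `drop_incomplete` and `column_values` are hypothetical. Dropping every row that has any missing field also throws away the valid values sitting in those rows.

```python
# Invented sample records; None marks a missing value
records = [
    {"region": "North", "units": 120, "price": 9.5},
    {"region": "South", "units": None, "price": 9.0},
    {"region": "East", "units": 80, "price": None},
    {"region": "West", "units": 95, "price": 10.0},
]

def drop_incomplete(rows):
    """Keep only rows in which every field is present."""
    return [r for r in rows if all(v is not None for v in r.values())]

def column_values(rows, field):
    """Collect the non-missing values of one field across all rows."""
    return [r[field] for r in rows if r.get(field) is not None]

# Strict filtering: the East row is dropped because its price is missing,
# so its perfectly valid unit count is lost as well.
complete_rows = drop_incomplete(records)
units_after_drop = column_values(complete_rows, "units")

# Per-field filtering keeps every unit count that was actually recorded.
units_per_field = column_values(records, "units")

print(len(units_after_drop), "values after strict filtering")
print(len(units_per_field), "values with per-field filtering")
```

Which approach is right depends on the analysis, but it is worth checking how much data a filtering step removes before trusting the results built on it.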


One more important obstacle on the road to implementing data science is problem identification. Identifying the right problem, one that can actually be resolved by analyzing the right data, is a very big hurdle that data scientists need to overcome. Problem identification alone can sometimes take days, and a great deal of effort goes into the process. This remains a major hindrance in the path of data science.


Data science, though a beneficial field for the present as well as the future of computation and analysis, cannot yet contribute fully to the betterment of mankind. For that to happen, the road to implementing data science needs to be cleared of these hurdles. People should be educated with proper resources so that programming and analysis skills are learnt thoroughly. The infrastructure needed to perform data analysis tasks quickly should be available to organizations, so that the programming and analytics skills learnt by employees can be put to efficient use. With data science implemented and used well, the analysis of data can see a new dawn, with much improved speed and accuracy.