
Data Wrangling in the Marketing Field

The next step in Data Science is Data Wrangling: the process of cleaning, filtering, and organizing raw data. Collected data is usually too large and too messy to use as is, so Data Wrangling is the vital step that prepares it for businesses to use.
For example, say a sports equipment company wants to create a targeted advertisement for a particular user. To figure out which sport that user likes best, the company would have to collect the user’s search history. In this case, let’s say the user really likes baseball. When the company looks through the search history, it will find an overload of entries, and most of them will have nothing to do with baseball or even with sports in general. There could be makeup tutorials, cooking recipes, searches for clothing, and so on. Most of this data is irrelevant to the company.
Therefore, after collecting the data, the company has to organize it. It could filter the data so that only sports-related searches remain. Because the filtered data is more manageable, the company can then determine which sport is searched most often and, from that, which sport the user likes best. Data Analysis is closely linked to this step, because finding the most-searched sport requires some analysis of the prepared data.
Even then, the data needs to be organized further. Baseball involves many different roles, such as pitcher, batter, outfielder, and umpire, and each one uses different equipment. After filtering down to the baseball data, the company would have to dig deeper to figure out which role this person fills. For the sake of the example, say the searches show that this user is an umpire. After all that organizing and cleaning, the company can finally place a targeted advertisement for umpire equipment on the sites the user visits.
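The filter-then-count steps in the baseball example can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the search history, the keyword lists, and the names `SPORT_KEYWORDS`, `ROLE_KEYWORDS`, and `sport_for` are all made up for the example, and a real company would need far more robust matching than simple keyword lookup.

```python
from collections import Counter

# Hypothetical keyword lists (an assumption for this sketch, not real company data).
SPORT_KEYWORDS = {
    "baseball": ["baseball", "umpire", "pitcher", "outfielder"],
    "soccer": ["soccer", "cleats", "goalkeeper"],
    "tennis": ["tennis", "racket"],
}
ROLE_KEYWORDS = ["pitcher", "batter", "outfielder", "umpire"]

# Made-up search history for one user.
search_history = [
    "easy pasta recipes",
    "umpire chest protector",
    "baseball umpire mask",
    "smoky eye makeup tutorial",
    "soccer cleats sale",
    "umpire indicator clicker",
    "winter jackets",
]


def sport_for(query):
    """Return the first sport whose keyword appears in the query, or None."""
    text = query.lower()
    for sport, keywords in SPORT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return sport
    return None


# Step 1 (wrangling): keep only the sports-related searches.
tagged = [(query, sport_for(query)) for query in search_history]
sports_searches = [(query, sport) for query, sport in tagged if sport is not None]

# Step 2 (analysis): find the sport that is searched most often.
sport_counts = Counter(sport for _, sport in sports_searches)
top_sport = sport_counts.most_common(1)[0][0]

# Step 3 (further wrangling): within that sport, find the most common role.
role_counts = Counter(
    role
    for query, sport in sports_searches
    if sport == top_sport
    for role in ROLE_KEYWORDS
    if role in query.lower()
)
top_role = role_counts.most_common(1)[0][0]

print(top_sport, top_role)
```

With this made-up history, the recipe, makeup, and clothing searches are dropped in the first step, and the remaining searches point to baseball and then to the umpire role.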
This was just a small example, but as it shows, the process is long and tedious. In real life it is even more difficult, considering that Google receives almost 63,000 searches per second across the globe. The world produces far more data than anyone can handle in raw form, and that is why Data Wrangling is so important: it organizes the data so that it can actually be used.

