Big Data is all the buzz today. There is no dearth of articles about Big Data, Data in the Cloud, Data Mining, Analytics, etc. But what exactly do these terms mean?
This series of articles will explain Big Data and how it applies to Broadband Service Providers. It will explain what providers can do with Big Data, why they should do it, and how Big Data techniques may be implemented to benefit their business.
The term Big Data refers in traditional IT to “a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.”[Wikipedia]
A more useful definition of Big Data for our purposes would be thus: a sufficiently large collection of data that can be analyzed to provide information that may not be obvious without such analysis. To be useful, data should have these three traits – it should be timely, relevant and accurate.
Analyzing this data to detect recurring patterns (referred to as “Data Mining”) often requires specialized tools to store and process the data. Data mining is mature enough that many open source and vendor-provided tools exist today. The reader may have come across Apache Hadoop, which is probably the most popular framework for running data-intensive applications on clusters of commodity hardware.
Data mining and analytics have given birth to a new breed of scientist: part computer scientist and part statistician, whom the reader may hear referred to as a “Data Scientist”. By this author’s assessment, Data Scientist is about to become a very hot profession, just as Computer Scientist was a hot profession a decade ago. The vast amounts of data being created today need to be analyzed and acted upon, so it should surprise no one if the next decade is the decade of the Data Scientist.
Types of Big Data for a Broadband Service Provider
Infrastructure data: Networks operated by Broadband Providers generate lots of data on a daily basis – some of it saved and a lot of it tossed away today. This includes infrastructure related data such as aggregated bandwidth usage information that could be used to answer questions like – what times of day (and which days of the week) do I see heavy usage on my smallest data pipes?
Such network data can be analyzed to detect patterns, which may help you understand where bandwidth is being wasted in your network, for instance. It may also be used to detect existing issues and to predict issues that are likely to happen – before they have even happened! This is appropriately called “predictive analytics”, and can be quite useful if done well.
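To make the usage question above concrete, here is a minimal Python sketch of one way to find the busiest day-and-hour periods per link. It is an illustration only: the sample records, the link name, the `busiest_periods` function and the 80% threshold are all assumptions made up for this example, not data or tooling from any real provider.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical samples: (ISO timestamp, link name, utilization as a fraction of capacity).
SAMPLES = [
    ("2024-03-04T20:00:00", "edge-link-1", 0.91),
    ("2024-03-04T03:00:00", "edge-link-1", 0.12),
    ("2024-03-05T20:00:00", "edge-link-1", 0.88),
    ("2024-03-11T20:00:00", "edge-link-1", 0.95),
    ("2024-03-11T03:00:00", "edge-link-1", 0.10),
    ("2024-03-09T14:00:00", "edge-link-1", 0.55),
]


def busiest_periods(samples, threshold=0.8):
    """Return (link, weekday, hour, mean utilization) rows at or above the threshold."""
    buckets = defaultdict(list)
    for timestamp, link, utilization in samples:
        moment = datetime.fromisoformat(timestamp)
        # Group every sample by link, day of the week and hour of the day.
        key = (link, DAY_NAMES[moment.weekday()], moment.hour)
        buckets[key].append(utilization)

    averages = [(link, day, hour, mean(values))
                for (link, day, hour), values in buckets.items()]
    busy = [row for row in averages if row[3] >= threshold]
    # Heaviest periods first.
    return sorted(busy, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for link, day, hour, average in busiest_periods(SAMPLES):
        print(f"{link}: {day} {hour:02d}:00 averages {average:.0%} utilization")
```

With real data the same grouping would typically run inside a database or a cluster framework rather than in a single script, but the idea is the same: bucket the samples by time period, then rank the buckets.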
Customer data: There is also customer-specific data available to the service provider, such as CPNI (Customer Proprietary Network Information), per-customer data usage information, etc. Some of this data may be useful to the provider if aggregated and anonymized, for example to detect patterns of customer purchase behavior across platforms, which could help drive marketing or set pricing for future features to be offered to customers.
Such data is useful for internal consumption by the operator, but other data may have intrinsic value that external agencies would be willing to pay for. Privacy concerns are paramount, of course, and therefore any data shared with anyone needs to be aggregated to protect individual consumers.
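As a rough illustration of what aggregating data before sharing it could look like, the Python sketch below averages usage per service area, drops the customer identifiers, and leaves out any area with too few customers to hide an individual. The records, the `aggregate_usage` function and the minimum group size of three are hypothetical choices for this example. Suppressing small groups is only one safeguard and is not, by itself, a complete anonymization scheme.

```python
from collections import defaultdict

# Hypothetical per-customer records: (customer_id, service_area, monthly_usage_gb).
RECORDS = [
    ("c001", "north", 210.0),
    ("c002", "north", 180.0),
    ("c003", "north", 240.0),
    ("c004", "south", 95.0),
    ("c005", "south", 130.0),
    ("c006", "east", 400.0),
]


def aggregate_usage(records, min_group_size=3):
    """Average usage per service area, omitting areas with too few customers."""
    groups = defaultdict(list)
    for _customer_id, area, usage_gb in records:
        # The customer identifier is deliberately not carried into the output.
        groups[area].append(usage_gb)

    report = {}
    for area, usages in groups.items():
        # Small groups are suppressed: an average over one or two
        # customers would reveal close to individual-level data.
        if len(usages) >= min_group_size:
            report[area] = {
                "customers": len(usages),
                "mean_usage_gb": sum(usages) / len(usages),
            }
    return report


if __name__ == "__main__":
    print(aggregate_usage(RECORDS))
```

In this sample only the "north" area is reported; "south" and "east" are withheld because they contain fewer than three customers.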
Customer-specific data may include telephone calling records, high-speed data usage information, or even detailed information on television viewing behavior! The “clickstream” information about which channels (or VOD titles) were watched, and for how long, could be quite useful in determining behavior when purchasing other services, for instance. If the provider intends to do its own advertising insertion, this data is a goldmine that can help target the advertising shown locally.
Social data: Independent Broadband Providers have ventured into the social world as well. If a customer (or a potential customer) connects with an operator by following its posts or accepting an “app” on a social networking site such as Facebook, the operator gets access to data about the customer that could be useful. It’s a great way of staying in touch with the service provider’s customers, and this is well understood.
What is not well understood is that social networks offer deep insight into audience demographics and behavior patterns beyond what the MSO (multiple-system operator) can get from data that is inside its own network and on its own equipment.
For instance, the number of likes on a Facebook page, the demographics of the followers, or the habits of those who accept an operator’s app, can offer huge insights that can be used to drive sales and customer satisfaction.
Social data is very useful data – Big Data – that must be treated with the care it deserves. Customers don’t share information with the express intent of having it used for marketing or sales purposes. It might be time to create a new maxim – Do unto others’ social data as you would have others do unto yours!
This post has been an introduction to Big Data for the Video provider. In the next article of this series, we will dig a little deeper into some of the uses of Big Data, and will look at some specific examples of how it may be used. Stay tuned for parts 2 and 3 of this series, where the author will explain how operators can act on and monetize Big Data!
Contact Kshitij at firstname.lastname@example.org