Since Big Data is such a “big” and diverse topic, there are plenty of assumptions, misunderstandings, and confusion surrounding the concept. Through discussions with colleagues, I’ve noticed a few recurring themes, uncertainties, and generalizations. I’ve also heard plenty of claims, some of which are true and some of which lack validity. After hearing from industry experts, I’ve compiled three Big Data myths, which we will bust to uncover some Big Data facts.
Myth One: Big Data is really “BIG.”
The claim that big data is really big implies that big data means a lot of information, more than most companies collect. In that sense, this myth is not true at all. At the latest Hadoop conference, big data was defined as any data that cannot fit into Excel, which in reality describes the amount of data most companies collect and store, especially given the rise of social media and the volumes of data collected from those sources.
“Social media releases floods of data that hold massive promises for business – both in terms of branding and opening up new channels to market, in which large populations of consumers are speaking. They’re [consumers] communicating who they are and what they do and do not like. Such a relentless flow of data provides extraordinary levels of feedback and an unrivaled chance for companies to listen to the voices of their customers and target audience(s), garner intelligence, and participate in a collaborative dialogue to boost competitive advantage and new opportunity,” explains Upasna Gautam of MagicLogix.com. [Stay tuned for another post about the link between social media and big data.] Most businesses use social media outlets for this type of insight, and the amount of data they collect and need to analyze is considered “big.”
So it isn’t just enterprises that are struggling to manage big data. The general rule: if you have multiple spreadsheets of data, you have big data on your hands. As Margaret Dawson, VP of Marketing for Symform, points out, “SMBs also struggle to keep up with skyrocketing data volumes. In fact, a recent data and backup trends survey of SMBs found that respondents average one terabyte to over 500 TBs (1 TB = 1000 GB) of data, with most forecasting data growth of 10-40% over the next year.” Wow.
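To put those survey numbers in perspective, here is a minimal Python sketch of the arithmetic. The function name `project_volume_tb` is hypothetical, and compounding the growth rate over more than one year is my own assumption; the survey quoted above only forecasts the next year.

```python
def project_volume_tb(current_tb: float, annual_growth: float, years: int = 1) -> float:
    """Project a data volume (in TB) forward at a fixed annual growth rate.

    Assumption: growth compounds yearly. The survey only speaks to one year,
    so anything beyond years=1 is an extrapolation.
    """
    return current_tb * (1 + annual_growth) ** years


# Endpoints quoted in the survey: 1 TB to 500 TB today, 10% to 40% growth.
for current_tb in (1, 500):
    low = project_volume_tb(current_tb, 0.10)
    high = project_volume_tb(current_tb, 0.40)
    print(f"{current_tb} TB today -> {low:.1f} to {high:.1f} TB in a year")
```

By this math, an SMB holding 500 TB today would be managing somewhere between roughly 550 and 700 TB a year from now.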
Myth Two: Big Data makes BETTER analytics.
Is bigger really better? Sometimes. But with big data, bigger is often simply bigger, and quantity does not always equal quality; the gap occurs in the analysis and translation of the data. It’s [almost] comparable to giving someone a hammer, nails, wood, and all the tools they need to build a house: without the blueprint, they don’t know what to do or where to start. This means we have to be sure we are collecting the data that is valuable for solving our most prominent problems, the ones that relate back to our KPIs and bottom lines, and following the right blueprint to get what we need.
Erin Bartolo, Data Science Program Manager at the School of Information Studies at Syracuse University, agrees and provides a strategy to ensure bigger data turns into something meaningful. “Entertain your inner skeptic by questioning everything from what data are meaningful to how you project your own biases on findings. Without objective, analytical skills, analytics merely backs up our own biases with data,” she advises. She explains that the “whys” and “hows” need to be infused into the data analysis to really find value. She says, “…increasing one’s awareness of data and appreciation of its objectivity reveals insights whether the data is stored in an Excel spreadsheet or in a massive data warehouse.”
So in this case, size doesn’t really matter, unless you need the size of the data to answer the questions that relate back to your ultimate goals.
Myth Three: You need a team of Hadoop engineers and on-premise analytics platforms to work with Big Data.
While merging data collected from various sources and analyzing it is quite a challenge (and Hadoop professionals could be an advantage), there are other solutions and platforms that can help transform unstructured data into structured data and merge it with business intelligence (BI) tools.
Werner Hopf, CEO of Dolphin, believes, “There are compelling [software] solutions to help companies meet those goals, achieve significant savings and performance improvements, and lay the foundation for leveraging SAP HANA – the vehicle for truly maximizing the potential benefits of big data – in the future.” These options can be cost effective and easily navigated by intelligent, but not expert, users.
The other idea, that analytics platforms must be housed on premise, is busted by Keith Metcalfe, Vice President of Sales and Marketing at WCI Consulting, who adds, “Integrating and cleansing data to a targeted place for reporting is a core concept behind any enterprise approach to analytics/business intelligence, and there is no technical reason why that target cannot reside in a hosted/SaaS environment. Cloud platforms and analytics tools are great applications for hosted (e.g. Amazon Web Services) or SaaS analytics platforms (e.g. SAP BusinessObjects BI OnDemand). Having said this, core to the topic of SaaS and hosted environments is that an organization sees value in replacing IT infrastructure, as this is where the financial return justifies the cost of investing in such environments.”
Two last pieces of big data management advice from Hopf: “From a data management perspective, making the most of the big data opportunity requires the adoption of two key strategies: 1) augmenting data archiving capabilities with nearline storage; and 2) re-architecting the business warehouse (BW) data model for lean, flexible, organized ‘views’ of information that serve up agile reporting without increasing administrative overhead.” Do this, add the human aspect, and solve big business problems using your big data.
If you have questions about these myths, feel free to reach out to one of the Top Big Data Twitter Influencers.