V’s of Big Data

Popat Avani
8 min read · Nov 18, 2020

What is data?

Data is a collection of characters, strings, symbols, etc., stored in digital format.

What is big data?

Big data is also data, but at a huge scale: large volumes of audio, video, animation, text, and document collections.

YB > ZB > EB > PB > TB > GB > MB > KB > B
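
The unit ladder above can be sketched in code. A minimal Python example, assuming decimal (SI) units where each step is a factor of 1,000:

```python
# A quick sketch of the byte-unit ladder, assuming decimal (SI)
# units where each step up is a factor of 1,000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def fmt_bytes(n: float) -> str:
    """Format a byte count using the largest unit that keeps the value >= 1."""
    i = 0
    while n >= 1000 and i < len(UNITS) - 1:
        n /= 1000
        i += 1
    return f"{n:g} {UNITS[i]}"

print(fmt_bytes(2_500))          # 2.5 KB
print(fmt_bytes(30 * 10**15))    # 30 PB
print(fmt_bytes(2.5 * 10**18))   # 2.5 EB
```

Note that storage vendors and operating systems sometimes use binary steps of 1,024 (KiB, MiB, ...) instead; the sketch above sticks to the decimal convention.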

History of V’s Big Data

The term “big data” refers to data that is so massive, fast, or complex that conventional methods make it difficult or impossible to process. The practice of accessing and storing vast volumes of information for analytics has been around a long time. But the concept of big data gained traction in the early 2000s, when industry analyst Doug Laney articulated the now-mainstream definition of big data as the three V’s.

Who is Generating Big Data?

· It is estimated that by 2020, 10.1 billion mobile devices will be in use. That is more than the entire population of the planet, and it does not even count laptops and desktops.

· Every day, we perform more than 1 billion Google searches.

· Every day, about 301 billion emails are sent.

· More than 231 million tweets are posted daily.

· Facebook stores, accesses, and analyses more than 30 petabytes (roughly 3 × 10^16 bytes) of user-generated data.

· On YouTube alone, around 250 hours of video are uploaded every minute.

Application of big data

Securities and Banking: By analysing transaction data, banks can keep track of fraud, such as suspicious credit-card activity.

Media and Entertainment: Amazon Prime is driven to provide a great customer experience by offering video, music, and Kindle books in a one-stop shop.

Education: Big data is used quite significantly in higher education.

Healthcare Providers: It can be used in the detection of cancer and many other diseases.

Characteristics of Big Data

1) Velocity

Velocity is probably the best-known characteristic of big data. It refers to the speed at which data is created, often in real time. Let’s check how much data is created every 60 seconds: 98,000 tweets are posted, 695,000 Facebook statuses are updated, 1,220,000 Google searches are made, and 1,275 TB of data is created. From these figures we can imagine just how ‘BIG’ the data gets.

2) Volume

Volume refers to the sheer amount of data being created. Most of this big data has been created only in the last five or six years, because we humans have connected everything to technology. Now that data is generated by machines, networks, and human interaction on systems such as social media, the volume of data to be analysed is massive.

For example: we create about 2.5 quintillion bytes of data every single day.
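
As a back-of-envelope check, a daily volume of 2.5 quintillion (2.5 × 10^18) bytes works out to an average rate of roughly 29 terabytes every second:

```python
# Back-of-envelope: 2.5 quintillion (2.5e18) bytes per day,
# averaged over 86,400 seconds, expressed in terabytes per second.
DAILY_BYTES = 2.5e18
SECONDS_PER_DAY = 24 * 60 * 60          # 86,400

bytes_per_second = DAILY_BYTES / SECONDS_PER_DAY
terabytes_per_second = bytes_per_second / 1e12

print(f"{terabytes_per_second:.1f} TB/s")   # ~28.9 TB/s on average
```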

3) Variety

Variety stands for all the different types of data: images, text, audio, and so on. A guiding principle of big data is: when you can, keep everything. Using all this variety of data, we can make more precise decisions.

For example: ten to fifteen years ago, data lived almost only in spreadsheets and databases, but nowadays it can be in the form of PDFs, video, photos, personal data, SMS, and more.
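
As a toy illustration of variety, incoming files might be routed into rough structural buckets by extension. The file names and bucket rules here are purely hypothetical:

```python
# A toy sketch of "variety": routing incoming files into rough
# structural buckets by file extension. Buckets and extensions
# here are purely illustrative.
BUCKETS = {
    "structured":      {".csv", ".xlsx", ".sql"},
    "semi-structured": {".json", ".xml", ".html"},
    "unstructured":    {".pdf", ".mp4", ".jpg", ".txt"},
}

def bucket_of(filename: str) -> str:
    """Classify a file by its extension; unknown extensions fall through."""
    ext = "." + filename.rsplit(".", 1)[-1].lower()
    for bucket, exts in BUCKETS.items():
        if ext in exts:
            return bucket
    return "unknown"

print(bucket_of("sales.csv"))     # structured
print(bucket_of("tweets.json"))   # semi-structured
print(bucket_of("lecture.mp4"))   # unstructured
```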

4) Variability

Variability can refer to the inconsistent speed at which big data is loaded into your database. Big data is also variable because of the multitude of data dimensions that result from multiple disparate data types and sources.

It also refers to the inconsistency the data can show at times, which hampers the ability to handle and manage the data effectively.

5) Veracity

Veracity, one of the less glamorous characteristics of big data, refers to the provenance and reliability of the data source, its context, and how meaningful it is to the analysis based on it. Veracity helps filter what is important from what is not, and in the end it generates a deeper understanding of the data and how to contextualise it in order to take action.

6) Visualization

Visualization plays an important role among all the V’s. It helps in reading processed data presented as graphs and charts, and we can make decisions and define rules with the help of these visuals.
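
As a minimal sketch of this idea, the approximate per-minute figures from the Velocity section can be rendered as a simple text bar chart, with no plotting library assumed:

```python
# Minimal text "bar chart" of approximate events per 60 seconds
# (figures taken from the Velocity section of this article).
per_minute = {
    "Tweets":            98_000,
    "Facebook statuses": 695_000,
    "Google searches":   1_220_000,
}

def bar_chart(data: dict, width: int = 40) -> list:
    """Scale each count against the largest value and render a row of '#'s."""
    peak = max(data.values())
    rows = []
    for label, count in data.items():
        bar = "#" * round(width * count / peak)
        rows.append(f"{label:<18}|{bar} {count:,}")
    return rows

for row in bar_chart(per_minute):
    print(row)
```

In practice a library such as matplotlib would draw real charts, but the principle is the same: raw counts become something the eye can compare at a glance.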

7) Value

Last, and the most important of all, is value. The other characteristics of big data are meaningless if you don’t derive business value from the data.

So all you have to do is analyse the data, understand your customers better, target them accordingly, optimise processes, and improve machine or business performance. You have to understand the potential, alongside the more challenging characteristics, before setting out on a big data strategy.

Research into the characteristics of big data suggests there are not just 7 V’s but more than 50. Some of them are mentioned here.

1.Vagueness: The meaning of found data is often very unclear, regardless of how much data is available.

2. Validity: Rigor in analysis (e.g., Target Shuffling) is essential for valid predictions.

3. Valor: In the face of big data, we must gamely tackle the big problems.

4. Value: Data science continues to provide ever-increasing value for users as more data becomes available and new techniques are developed.

5. Vane: Data science can aid decision making by pointing in the correct direction.

6. Vanilla: Even the simplest models, constructed with rigor, can provide value.

7. Vantage: Big data allows us a privileged view of complex systems.

8. Variability: Data science often models variable data sources. Models deployed into production can encounter especially wild data.

9. Variety: In data science, we work with many data formats (flat files, relational databases, graph networks) and varying levels of data completeness.

10. Varifocal: Big data and data science together allow us to see both the forest and the trees.

11. Varmint: As big data gets bigger, so can software bugs!

12. Varnish: How end-users interact with our work matters, and polish counts.

13. Vastness: With the advent of the Internet of Things (IoT), the “bigness” of big data is accelerating.

14. Vaticination: Predictive analytics provides the ability to forecast. (Of course, these forecasts can be more or less accurate depending on rigor and the complexity of the problem. The future is pesky and never conforms to our March Madness brackets.)

15. Vault: With many data science applications based on large and often sensitive data sets, data security is increasingly important.

16. Veer: With the rise of agile data science, we should be able to navigate the customer’s needs and change directions quickly when called upon.

17. Veil: Data science provides the capability to peer behind the curtain and examine the effects of latent variables in the data.

18. Velocity: Not only is the volume of data ever increasing, but the rate of data generation (from the Internet of Things, social media, etc.) is increasing as well.

19. Venue: Data science work takes place in different locations and under different arrangements: locally, on customer workstations, and in the cloud.

20. Veracity: Reproducibility is essential for accurate analysis.

21. Verdict: As an increasing number of people are affected by models’ decisions, Veracity and Validity become ever more important.

22. Versed: Data scientists often need to know a little about a great many things: mathematics, statistics, programming, databases, etc.

23. Version Control: You’re using it, right?

24. Vet: Data science allows us to vet our assumptions, augmenting intuition with evidence.

25. Vexed: Some of the excitement around data science is based on its potential to shed light on large, complicated problems.

26. Viability: It is difficult to build robust models, and it’s harder still to build systems that will be viable in production.

27. Vibrant: A thriving data science community is vital, and it provides insights, ideas, and support in all of our endeavors.

28. Victual: Big data — the food that fuels data science.

29. Viral: How does data spread among other users and applications?

30. Virtuosity: If data scientists need to know a little about many things, we should also grow to know a lot about one thing.

31. Viscosity: Related to Velocity; how difficult is the data to work with?

32. Visibility: Data science provides visibility into complex big data problems.

33. Visualization: Often the only way customers interact with models.

34. Vivify: Data science has the potential to animate all manner of decision making and business processes, from marketing to fraud detection.

35. Vocabulary: Data science provides a vocabulary for addressing a variety of problems. Different modeling approaches tackle different problem domains, and different validation techniques harden these approaches in different applications.

36. Vogue: “Machine Learning” becomes “Artificial Intelligence”, which becomes…?

37. Voice: Data science provides the ability to speak with knowledge (though not all knowledge, of course) on a diverse range of topics.

38. Volatility: Especially in production systems, one has to prepare for data volatility. Data that should “never” be missing suddenly disappears, numbers suddenly contain characters!

39. Volume: More people use data-collecting devices as more devices become internet-enabled. The volume of data is increasing at a staggering rate.

40. Voodoo: Data science and big data aren’t voodoo, but how can we convince potential customers of data science’s value to deliver results with real-world impact?

41. Voyage: May we always keep learning as we tackle the problems that data science provides.

42. Vulpine: Nate Silver would like you to be a fox, please.
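
Item 38 above (Volatility) warns that production data can suddenly go missing or change type. A minimal defensive-validation sketch, with purely hypothetical field names and rules, might look like:

```python
# A minimal sketch of defensive validation for volatile production
# data: fields that should "never" be missing sometimes are, and
# numbers sometimes arrive as text. Field names are illustrative.
def validate(record: dict) -> list:
    """Return a list of problems found in one incoming record."""
    problems = []
    if record.get("user_id") in (None, ""):
        problems.append("user_id missing")
    amount = record.get("amount")
    try:
        float(amount)
    except (TypeError, ValueError):
        problems.append(f"amount not numeric: {amount!r}")
    return problems

good = {"user_id": "u42", "amount": "19.99"}
bad  = {"user_id": "",    "amount": "19,99"}   # comma-decimal sneaks in

print(validate(good))   # []
print(validate(bad))    # flags both the empty id and the odd amount
```

Real pipelines typically push such checks into a schema layer, but the habit of assuming the data will misbehave is the point.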

Some researchers even suggest that by the end of 2021 there will be more than 100 V’s in data science.

Future Scope of Big Data

1. Tools for visual data exploration will grow 2.5 times faster than the rest of the Business Intelligence (BI) industry. Investing in this end-user self-service enabler will become a necessity for all businesses by 2018.

2. Spending on cloud-based Big Data and Analytics (BDA) solutions will grow three times faster than spending on on-premise solutions over the next five years. Hybrid on/off-premise deployments will become a requirement.

3. The shortage of qualified personnel will continue. In the U.S. alone, there will be 181,000 deep-analytics jobs in 2018, and five times as many positions requiring related data processing and interpretation skills.

4. By 2017, a unified data-platform architecture will become the foundation of BDA strategy, with unification taking place across information management, analysis, and search.

5. Growth in applications that combine traditional analytics with advanced, predictive technologies such as machine learning will accelerate in 2015. Such apps will grow 65% faster than apps without predictive functionality.

6. 70 percent of large companies already purchase external data, and by 2019, 100 percent will do so. In parallel, more organisations will begin to monetise their data by selling it or offering value-added content.

7. In 2015, the adoption of technologies to continuously analyse streams of events will accelerate as they are applied to Internet of Things (IoT) analytics, which is projected to grow at a 30 percent five-year compound annual growth rate (CAGR).

8. In response to the need for greater consistency in decision making and the retention of decision-process knowledge, decision-management systems will grow at a CAGR of 60 percent through 2019.

9. In 2015, rich-media (video, audio, image) analytics will at least triple and emerge as the key driver of investment in BDA technology.

10. Half of all customers will engage on a daily basis with services focused on cognitive computing by 2018.

Taking an average of the figures proposed by leading data-market analysts and research firms, it can be concluded that approximately 15% of all IT organisations will move to cloud-based service platforms, and this market is expected to grow by 35 percent between 2015 and 2021.
