Big Data, Crystal Balls and Looking Glasses: Reviewing 2016, predicting 2017


End-of-year reviews are boring -- and everyone does them. Predictions are boring -- and they are hard. Of course, this is different -- because big data.

How do big data people go about making end-of-year reviews and predictions? Using data is the obvious answer, but there are a few issues with that approach: there is no synthesis in data alone -- you have to find the story behind the data, pick an angle, and seek meaning. In addition, that approach does not account for subtle hints, industry knowledge, and big ideas.

To paraphrase Carl Sagan, "we wish to find the truth, no matter where it lies. But to find the truth we need imagination and data both. We will not be afraid to speculate, but we will be careful to distinguish speculation from fact." In this spirit, let's keep things equally opinionated and objective in 2017.

It's the end of Hadoop as we know it, and I feel fine

Hadoop turned 10 in 2016. It's come a long way from a pet project named after a toy elephant to the (metaphorical) stampeding beast now in most every CXO's name-dropping list. The latest Big Data maturity survey showed that 73 percent of respondents are now in production with Hadoop (vs. 65 percent last year). And yet we're here to tell you Hadoop as we know it is dead. And that's not even news.

Hadoop has been constantly evolving, expanding, and re-inventing itself throughout its lifetime. A massive ecosystem has been developing around the initial bare-bones offering, and today Hadoop is more of a platform than "just" a storage and compute framework. The introduction of YARN was a game changer, enabling Hadoop to become a Big Data OS and to break away from its batch-oriented MapReduce origins.

In 2016, data and stories from the trenches all pointed in the same direction: batch, MapReduce Hadoop is dead; long live real-time, Spark Hadoop. 25 percent of organizations are using Spark in production today, with an additional 33 percent using it in development, and all major Hadoop vendors are involved in it. Adding these up suggests that by the end of 2017 up to 50 percent of organizations could be using Spark in production.

But it's not necessarily a Spark or bust future: neither is Spark the only streaming game in town, nor is Hadoop the only Big Data platform. Alternatives do exist, and users may migrate or leapfrog to them skipping Spark or Hadoop altogether, the same way they are now migrating from or skipping MapReduce.
The Big Data landscape is host to a multitude of different approaches. But more and more it looks like everyone is adding everyone else's features. Convergence or me-too? Image: Martin Kleppmann.

Becoming all things to all men to save some
Spark can do both streaming and batch processing. And it can also do SQL, and graphs. And of course on Hadoop you can also do SQL and/or NoSQL in a number of other ways, utilizing a wide choice of tools. That's what being an ecosystem is all about, right? But then again, everyone seems to be at it these days.

NoSQL databases like Cassandra / DataStax Enterprise can now also do graph, in addition to key-value, tabular and document. What about the iconic NoSQL document store, MongoDB? Well, besides document, you can now also do SQL. Microsoft's SQL Server? Your average SQL server no more: it can run on Linux, and it supports R, in-memory processing and column store. MariaDB, the poor man's SQL server, also has its column store now.

Neo4j, the iconic graph store? It's going ACID. Google's BigQuery now supports standard SQL, joining Amazon Redshift, which has had it for a while as it's based on Postgres. Of course, analytics-oriented column stores have long supported SQL. And traditional relational DBs like Oracle and IBM have been adding features like in-memory processing and column store for a while as well. Key-stores do it, document-stores do it, graph-stores do it, even SQL incumbents do it.

The boundaries are blurring, as more and more data platforms try to be more things to more people. Doing most everything on the same platform is good for vendors that want to increase their retention and good for users who don't want to have to mix and match disparate platforms to get things done. But it's not a sheer land-ho of opportunity - threats lie ahead too. Most notably, vendor lock-in, half-baked features, and half-hearted users.
Some are trying to get the basics right, while others are after up-in-the-sky goals. Yet there's a place for everyone under Big Data. Image: Martin Kleppmann.

This article is from http://www.zdnet.com/article/big-data-crystal-balls-and-looking-glasses-reviewing-2016-predicting-2017/

Last edited: 2017.12.05 06:10:51
