Leveraging Alluxio with Spark SQL to Speed Up Ad-hoc Analysis

Background

At present, hundreds of terabytes of data are processed in the Momo big data cluster every day. However, most of that data is read from and written to disk repeatedly, which is inefficient. In order to speed up data processing and provide a better user experience, we investigated several options and found that Alluxio may fit our needs. Alluxio provides a unified, memory-speed distributed storage layer for various jobs, and I/O in memory is much faster than on hard disk. Hot data in Alluxio can be served at memory speed, just like a memory cache, so the more frequently data is read or written through Alluxio, the greater the benefit. In order to better understand the value Alluxio brings to our ad-hoc service, which uses Spark SQL as its execution engine, we designed a series of experiments combining Alluxio with Spark SQL.

Experiment Design

We made a few design decisions to take full advantage of Alluxio:

  • Firstly, we use a decoupled compute and storage architecture, because a mixed deployment would place a heavy I/O burden on Alluxio; therefore no DataNode is deployed alongside an Alluxio worker. Since the Alluxio cluster is decoupled from HDFS storage, it reads data from remote HDFS nodes on the first execution.
  • Secondly, in order to mimic the online environment, we use the YARN node label feature to carve an Alluxio cluster out of the production cluster, which means the Alluxio cluster shares the same NameNode and ResourceManager with the production cluster and may be affected by production load.
  • Thirdly, only one copy of the data is stored in Alluxio, so high availability cannot be guaranteed. What's more, persisting data to a second storage tier such as HDFS is inefficient and wastes space. Considering stability and efficiency, we chose to use Alluxio as a read-only cache in our experiment.
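The read-only, single-copy cache described above maps onto a handful of Alluxio settings. A minimal sketch of alluxio-site.properties for such a setup (property names are from the Alluxio 1.x documentation; the address and size values are illustrative, not our production values):

```properties
# Mount the production HDFS as the under store (read path for cache misses)
alluxio.underfs.address=hdfs://namenode:8020/

# A single memory tier on each worker; only one copy of each block is kept
alluxio.worker.memory.size=100GB

# Cache blocks in Alluxio on read; write job output through to the under store
alluxio.user.file.readtype.default=CACHE
alluxio.user.file.writetype.default=THROUGH
```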

The figure below shows the deployment of Alluxio cluster with production cluster.

Figure 1. Alluxio with Spark SQL Architecture

The experiment environment of the Alluxio cluster is the same as production except that no DataNode process runs on it, so there is a data transfer cost on the first run. Besides, we use Spark Thrift Server to provide the ad-hoc analysis service, and all SQL tests were submitted through Spark Thrift Server.
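Since everything runs behind Spark Thrift Server, a test query can be submitted over JDBC with beeline; a sketch (the host, port, and query below are illustrative):

```shell
beeline -u "jdbc:hive2://sts-host:10001" \
        -e "SELECT count(*) FROM some_table"
```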

Environment Preparation

Basically, we followed the official instructions in Running Apache Hive with Alluxio and Running Spark on Alluxio to deploy Alluxio with Spark SQL.
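The key integration step from those guides is putting the Alluxio client jar on the classpath of the Spark driver and executors, for example (the jar path below is illustrative and depends on where Alluxio is installed):

```shell
--conf spark.driver.extraClassPath=/opt/alluxio-1.6.1/client/spark/alluxio-1.6.1-spark-client.jar \
--conf spark.executor.extraClassPath=/opt/alluxio-1.6.1/client/spark/alluxio-1.6.1-spark-client.jar
```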


Here is the software and hardware environment on each node.

Name        Configuration
Software    Spark 2.2.1, Hadoop 2.6.0 (HDFS 2.6.0, YARN 2.8.3), Alluxio 1.6.1, Hive 0.14.0
Hardware    32 cores, 192 GB memory, 12 x 5.4 TB HDD

And the configuration of Spark Thrift Server is:

/opt/spark-2.2.1-bin-2.6.0/sbin/start-thriftserver.sh \
  --master yarn-client \
  --name adhoc_STS_Alluxio_test \
  --driver-memory 15g \
  --num-executors 132 \
  --executor-memory 10g \
  --executor-cores 3 \
  --conf spark.yarn.driver.memoryOverhead=768m \
  --conf spark.yarn.executor.memoryOverhead=1024m \
  --conf spark.sql.adaptive.enabled=true

Performance Test

1) Test background

In the production environment, we provide the ad-hoc service with Spark SQL, which offers higher performance and better usability than MR and Tez.

2) Test case

1) Small data test

First, we ran a job of approximately 5 minutes that reads no more than 10 GB of data. However, the average time was close to the time in the production environment, showing no obvious improvement. After searching for the reason, we found a thread on the Alluxio user list explaining that the test job should be I/O bound: because of the OS's buffer cache, and because Spark also temporarily caches the input data, running Spark on Alluxio made no difference from Spark alone in this test. The data size was simply too small to exercise Alluxio's caching ability.

Figure 2. Detailed Information of Test Job

The content marked in red shows that the job has only about 5 GB of input data, which is small enough for the OS's buffer cache to hold all of it. We therefore designed another test.

2) Large data test

To better evaluate the performance of Alluxio, we picked four different SQL queries from our online workload, with input data sizes ranging from 300 GB to 5.5 TB. Besides, we designed four test groups: Alluxio, Spark, Yarn, and Alluxio on Disk.

SQL No.   Input data size
SQL1      300 GB
SQL2      1 TB
SQL3      1.5 TB
SQL4      5.5 TB

We executed each SQL query 4 times and, to eliminate the caching deviation of the first run, calculated the average running time over the last 3 runs.
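The averaging rule above can be stated precisely with a small helper; a sketch in Python (the timings in the example are made-up numbers, not our measurements):

```python
def warm_average(times):
    """Average the runs after the first one.

    The first run is discarded because it populates the cache
    (HDFS -> Alluxio / OS buffer cache) and is not representative.
    """
    if len(times) < 2:
        raise ValueError("need at least one warm run after the cold run")
    warm = times[1:]
    return sum(warm) / len(warm)

# Four hypothetical runs of one SQL, in seconds: one cold run, three warm runs.
print(warm_average([320.0, 110.0, 100.0, 90.0]))  # -> 100.0
```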


Here is the explanation of the test groups.

Test Group       Comment
Alluxio          Spark on Alluxio, on the Alluxio cluster
Spark            Spark without Alluxio, on the Alluxio cluster
Yarn             Spark without Alluxio, on the production cluster
Alluxio on Disk  Spark on Alluxio with a single HDD tier, on the Alluxio cluster

Figure 3. Test Result

The chart below makes the results in figure 3 easier to compare.


Figure 4. Comparison of Test Results

Conclusion

From the experiment we can draw some interesting conclusions.

  1. The small and large data tests show that Alluxio is not suitable for small-data jobs on machines with large memory; it should be used in large-data scenarios.
  2. In figure 4, the longest time is usually the first run. In general, when data is fetched for the first time, reading through Alluxio may be slower than reading directly from HDFS, because the data travels from HDFS into Alluxio first and only then into the Spark process. In practice this depends on the characteristics of the SQL: in SQL 2, SQL 3, and SQL 4, the Alluxio group performs better than the Spark group.
  3. Reading data from cache is generally faster than reading from disk. However, comparing the Alluxio and Alluxio on Disk groups, the cache read speed is similar to the disk read speed in SQL 2 and SQL 3, so the performance improvement depends on the SQL workload.
  4. Generally speaking, Spark on Alluxio achieves a good speedup: 3x - 5x over Spark on the production cluster and 1.5x - 3x over Spark without Alluxio.

All in all, Alluxio has a clear benefit for our ad-hoc analysis service, and SQL queries with different characteristics see different degrees of acceleration. The reason our tests do not reach the up-to-10x improvement claimed on the official website may be that our test SQL was selected from online jobs, which are more complicated and contain plenty of compute as well as I/O cost.

What We Have Done

  1. The memory in the Alluxio cluster is limited. If every table in use were loaded into Alluxio, the first memory tier would fill up quickly and spill the excess data to the second tier, which would severely hurt Alluxio's performance. To prevent this, we developed a whitelist feature that decides which tables are loaded into Alluxio for caching.
  2. Since Alluxio keeps only one copy of the data in memory and does not guarantee high availability, we use it as a read-only cache in our scenario: we developed a feature that reads from Alluxio and writes directly to HDFS. When a query involves a table on the whitelist, the scheme of the table's path is transformed to the Alluxio scheme, so applications read data from Alluxio and write results to HDFS.
  3. Alluxio's official ad-hoc use case pairs it with Presto or Hive as the query engine, and that approach is widely used. In this article, we showed that Spark SQL can also serve ad-hoc queries on Alluxio. Compared to Presto, Spark has better fault tolerance while still keeping good performance, so Alluxio with Spark SQL is another good technical option for an ad-hoc service.
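The whitelist and scheme transformation in items 1 and 2 can be sketched together. The helper below is hypothetical: the table names, whitelist contents, and Alluxio master address are all illustrative, and this is not our actual implementation.

```python
# Illustrative values, not a real deployment.
ALLUXIO_MASTER = "alluxio://alluxio-master:19998"
WHITELIST = {"dw.user_profile", "dw.click_log"}

def rewrite_location(table_name, hdfs_location):
    """Return an alluxio:// read path for whitelisted tables.

    Non-whitelisted tables (and non-HDFS paths) are left unchanged,
    and writes always go to the original HDFS location.
    """
    if table_name in WHITELIST and hdfs_location.startswith("hdfs://"):
        # Drop the "hdfs://<namenode>" prefix and prepend the Alluxio master URI.
        path = hdfs_location.split("/", 3)[3]
        return f"{ALLUXIO_MASTER}/{path}"
    return hdfs_location

print(rewrite_location("dw.click_log",
                       "hdfs://nn1:8020/user/hive/warehouse/click_log"))
# -> alluxio://alluxio-master:19998/user/hive/warehouse/click_log
```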

Future Work

Since Alluxio speeds up our service remarkably, we would like to move more frameworks, such as Hive and Spark MLlib, onto Alluxio and make it the uniform data ingestion interface for the computing layer. Besides, we will invest more effort in security, stability, and job monitoring.

References

www.alluxio.org
Alluxio user mailing list
How Alluxio is Accelerating Apache Spark Workloads

About MOMO

MOMO Inc (Nasdaq: MOMO) is a leading mobile pan-entertainment social platform in China. It has reached approximately 100 million monthly active users and over 300 million users worldwide. Currently, the MOMO big data platform has over 25,000 cores and 80 TB of memory, supporting approximately 20,000 jobs every day. The MOMO data infrastructure team works on providing stable and efficient batch and streaming solutions to support services such as personalized recommendation, ads, and business data analysis. By using Alluxio, the efficiency of the ad-hoc service has improved significantly. This blog presents performance tests that evaluate how much benefit Alluxio contributes to our ad-hoc service, as well as a new scenario of using Spark SQL on Alluxio.
