Nick Dimiduk blog et al.

HBase: Where Online Meets Low Latency

HBaseCon was another fantastic conference this year! It’s a great resource for information about and around HBase, no matter where you are along your path. This year I presented a talk along with a colleague of mine, Nicolas Liochon of Scaled Risk fame. Our topic: HBase as an online, low-latency system.

HBase: Where Online Meets Low Latency

HBase is an online database, so response latency is critical. This talk will examine sources of latency in HBase, detailing steps along the read and write paths. We’ll examine the entire request lifecycle, from client to server and back again. We’ll also look at the different factors that impact latency, including GC, cache misses, and system failures. Finally, the talk will highlight some of the work done in 0.96+ to improve the reliability of HBase.

I really enjoyed preparing this material. Getting into the weeds with how the HBase read path works was very informative, and complemented the BlockCache work I’ve been doing lately. Plus, it’s a real joy to work closely with someone who really knows his stuff.

The slides are up, so go have a look!

Edit: the video is available now as well.

Edit: The talk has been accepted for Hadoop Summit as well. I’ll be presenting solo this time, so if I say something about the write path that doesn’t make sense, consult the video here from Nicolas’s version ;)


About the Author

Nick found Hadoop and HBase in 2008 when his nightly ETL jobs started taking 20+ hours to complete. Since then, he has applied these tools to projects over social media, social gaming, click-stream analysis, climatology, and geographic data. Nick also helped establish Seattle’s Scalability Meetup and tried his hand at entrepreneurship. He is an HBase committer and coauthored HBase in Action, the unofficial user's guide for HBase. His passion is scalable, online access to scientific data.