n10k.com: blog et al.

May 9, 2026 A System Implementer's Flywheel

Several months ago I took up a monster of a project I had no clear idea how to tackle. What was abundantly clear was that I would need to gain a broad-reaching understanding of company policies, internal systems, and open source technologies to succeed. I had one quarter to deliver meaningfully. In an act of desperation, I reached for my coding agent as a research assistant.

Mar 25, 2026 Releasing HBase on Linux with Podman

I have previously released artifacts for Apache HBase using macOS and Windows 11 + WSL2. Now I am running a native Linux installation, and so I again have some minor details to work through. This install is built on systemd, which is of minor concern. More interestingly, I decided to drop Docker and instead use Podman and crun as my interface over Linux containers.

Jul 11, 2014 HBase root dir on a Mac

While working on HBase bug fixes and feature development, it’s often quite convenient to test changes on a local-mode HBase. This is done by running HBase right out of your developer sandbox. Though a lot of HBase development happens on Macs these days, it’s a system designed first to run on Linux. That means there are a couple minor annoyances for non-Linux users. Let me show you how I work around one of them.

Jun 12, 2014 Greetings from Europe

Between HBaseCon and Hadoop Summit I took a short trip to Europe. I got to spend some more time working alongside Nicolas and meet some of the Scaled Risk crew. I also took a small holiday through the hillside in Romania! Along the way, I was invited to present for both the Paris HPC Meetup and the London HBase Meetup.

Jun 10, 2014 BlockCache 101: Lightning Talk Edition

Every year at Hadoop Summit there’s a little un-conference, called the Birds of a Feather Sessions, or BoF for short. These are topical meetups that take place after the conference proceedings and are open to non-attendees. This year I helped organize the HBase BoF, along with Subash D’Souza.

May 29, 2014 Latency talk at Hadoop Summit

The Latency Talk Nicolas and I gave at HBaseCon has been accepted for Hadoop Summit San Jose. If you missed us at HBaseCon, you get one more opportunity! We’re speaking on June 4th at 3:25p.

See you in June!

Edit: Unfortunately, Nicolas was unable to make it so I presented solo. I hope I did his section justice.

May 12, 2014 HBase: where online meets low latency

HBaseCon was another fantastic conference this year! It’s a great resource for information about and around HBase, no matter where you are along your path. This year I presented a talk along with a colleague of mine, Nicolas Liochon of Scaled Risk fame. Our topic: HBase as an online, low-latency system.

Mar 7, 2014 BlockCache Showdown

The HBase BlockCache is an important structure for enabling low latency reads. As of HBase 0.96.0, there are no fewer than three different BlockCache implementations to choose from. But how to know when to use one over the other? There’s a little bit of guidance floating around out there, but nothing concrete. It’s high time the HBase community changed that! I did some benchmarking of these implementations, and I’d like to share the results with you here.

Note that this is my second post on the BlockCache. In my previous post, I provide an overview of the BlockCache in general as well as brief details about each of the implementations. I’ll assume you’ve read that one already.

Feb 13, 2014 BlockCache 101

Edit: The sequel post, BlockCache Showdown, is now available!

HBase is a distributed database built around the core concepts of an ordered write log and a log-structured merge tree. As with any database, optimized I/O is a critical concern to HBase. When possible, the priority is to not perform any I/O at all. This means that memory utilization and caching structures are of utmost importance. To this end, HBase maintains two cache structures: the “memory store” and the “block cache”. Memory store, implemented as the MemStore, accumulates data edits as they’re received, buffering them in memory. The block cache, an implementation of the BlockCache interface, keeps data blocks resident in memory after they’re read.
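
To make the division of labor between the two caches concrete, here is a minimal toy sketch in Python. It is not HBase code: the `ToyStore` class, its method names, the `disk` dictionary standing in for files on disk, and the simple LRU eviction policy are all assumptions made for illustration. The real MemStore and BlockCache work on sorted cells and whole data blocks rather than single keys.

```python
from collections import OrderedDict


class ToyStore:
    """Toy model of a write buffer plus a read cache (hypothetical, not HBase's API)."""

    def __init__(self, block_cache_capacity=2):
        self.mem_store = {}                # recent edits, buffered in memory
        self.disk = {}                     # stand-in for flushed files on disk
        self.block_cache = OrderedDict()   # LRU cache of values read from "disk"
        self.capacity = block_cache_capacity
        self.disk_reads = 0                # counts simulated I/O operations

    def put(self, key, value):
        # Writes land in the memory store; no "disk" I/O happens here.
        self.mem_store[key] = value

    def flush(self):
        # Persist buffered edits, then drop any cached copies they supersede.
        for key, value in self.mem_store.items():
            self.disk[key] = value
            self.block_cache.pop(key, None)
        self.mem_store.clear()

    def get(self, key):
        # 1. Recent edits are served straight from the memory store.
        if key in self.mem_store:
            return self.mem_store[key]
        # 2. Previously read data is served from the block cache.
        if key in self.block_cache:
            self.block_cache.move_to_end(key)
            return self.block_cache[key]
        # 3. Otherwise pay for a "disk" read and remember the result.
        if key not in self.disk:
            return None
        self.disk_reads += 1
        value = self.disk[key]
        self.block_cache[key] = value
        if len(self.block_cache) > self.capacity:
            self.block_cache.popitem(last=False)  # evict least recently used
        return value


store = ToyStore()
store.put("row1", "a")
store.get("row1")   # served from the memory store, no I/O
store.flush()
store.get("row1")   # first read after the flush goes to "disk"
store.get("row1")   # repeat read is served from the block cache
print(store.disk_reads)
```

In this sketch only the first read after the flush touches the simulated disk; reads of fresh edits and repeat reads are both answered from memory, which is the "avoid I/O when possible" priority described above.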

Nov 15, 2013 HBase via Hive, Part 2

This is the second of two posts examining the use of Hive for interaction with HBase tables. This is a hands-on exploration so the first post isn’t required reading for consuming this one. Still, it might be good context.

“Nick!” you exclaim, “that first post had too many words and I don’t care about JIRA tickets. Show me how I use this thing!”

This post is exactly that: a concrete, end-to-end example of consuming HBase over Hive. The whole mess was tested to work on a tiny little 5-node cluster running HDP-1.3.2, which means Hive 0.11.0 and HBase 0.94.6.1.