Your Use of Double Check Pattern May Not Be That Great!

What is the Double Check pattern?

Synchronization in any programming language is considered to be expensive. The Double Check pattern is simply a way to try to avoid taking a lock by first testing for the existence of a resource without holding the lock and returning the resource directly if it exists. If it doesn’t exist, a lock is obtained and the check is done again; if the resource still doesn’t exist, it’s created and returned. In Java it would look something like this:

class DoubleCheck {
    private SomeResource myResource;

    public SomeResource getResource() {
        // First check, without holding the lock.
        if (myResource != null) {
            return myResource;
        }
        synchronized (this) {
            // Second check, now that we hold the lock.
            if (myResource != null) {
                return myResource;
            }
            myResource = new SomeResource();
            return myResource;
        }
    }
}


Why are two checks cheaper than one?

Locks can be expensive to manage. First of all, there’s the obvious situation where the lock is held by someone and your code has to wait. But even if no one is holding the lock, there could be some performance implications. Locks can be implemented in many different ways, but they all need some kind of so-called atomic instruction. These are machine code instructions that guarantee thread safety at the hardware level. To do that, the processor has to coordinate with all other cores and hardware threads, typically by taking exclusive ownership of the affected cache line or, on older hardware, by locking the memory bus for a brief moment. This also disturbs the hardware caches, which could further slow down execution. Experiments have shown that an atomic instruction can be up to 30 times slower than its non-atomic counterpart.

So checking something without holding a lock seems like a much quicker way. Now the lock only needs to be held while a new resource is created. Most of the time, you’d just get the resource back without acquiring a lock! Great, isn’t it?

A Blatantly Broken Example

Recently I saw some code in a widely used package that prompted me to alert the maintainer. Someone was trying to save a few nanoseconds using a Double Check pattern in a place where it definitely didn’t belong. Here’s the essence of that code:

import java.util.HashMap;
import java.util.Map;

class BrokenDontUse {
    private Map<String, SomeResource> aHashMap = new HashMap<>();

    public SomeResource getResource(String name) {
        // BROKEN: this unsynchronized get can run while another thread is inside put.
        SomeResource r = aHashMap.get(name);
        if (r != null) {
            return r;
        }
        synchronized (aHashMap) {
            r = aHashMap.get(name);
            if (r != null) {
                return r;
            }
            r = new SomeResource(name);
            aHashMap.put(name, r);
            return r;
        }
    }
}

It should be pretty obvious why this is broken. A HashMap is not thread safe, and the put operation may completely rearrange its internal structures. If a get happens to execute at the same time, chances are high that you’ll end up with strange results and very hard-to-find bugs.

In this case, you can still use the Double Check pattern. In fact, the code above only needs to change a single line to be safe. Instead of instantiating the Map as a HashMap, you could use a ConcurrentHashMap. This variant of a Map uses some clever tactics internally to make sure most accesses can take place without locks being acquired, while being guaranteed to be thread safe.
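Here is a minimal sketch of that one-line change. The class name ResourceCache and the nested stand-in for SomeResource are my assumptions for illustration, not the code from the package in question. The second method shows computeIfAbsent, which the JDK offers for the same get-or-create job without any hand-written locking.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ResourceCache {
    // Hypothetical stand-in for the article's SomeResource.
    static final class SomeResource {
        final String name;
        SomeResource(String name) { this.name = name; }
    }

    // The only change from the broken version: a thread-safe map.
    private final Map<String, SomeResource> resources = new ConcurrentHashMap<>();

    SomeResource getResource(String name) {
        // Fast path: ConcurrentHashMap.get is safe without an explicit lock.
        SomeResource r = resources.get(name);
        if (r != null) {
            return r;
        }
        synchronized (resources) {
            // Second check: another thread may have created it while we waited.
            r = resources.get(name);
            if (r != null) {
                return r;
            }
            r = new SomeResource(name);
            resources.put(name, r);
            return r;
        }
    }

    // Alternative: let the map do the get-or-create atomically.
    SomeResource getResourceViaMap(String name) {
        return resources.computeIfAbsent(name, SomeResource::new);
    }

    public static void main(String[] args) {
        ResourceCache cache = new ResourceCache();
        SomeResource first = cache.getResource("foo");
        // Both lookups should hand back the instance created by the first call.
        System.out.println(first == cache.getResource("foo"));
        System.out.println(first == cache.getResourceViaMap("foo"));
    }
}
```

If all you need is get-or-create, the computeIfAbsent variant is probably the one to reach for, since it leaves no fast path for a maintainer to break.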

This ensures that we won’t run into strange bugs stemming from race conditions in the HashMap. But is this code really thread safe? Well, it depends…

Why Double Check May Be a Bad Idea

Even if you avoid the obvious HashMap problem, there are still some potential issues here. Consider this code where we want to record when something was accessed for the first time. We could do that using a Double Check.

class QuestionableUseOfDoubleCheck {
    // 1 is used as the "not set yet" marker.
    private long firstAccess = 1;

    public long accessSomething() {
        // Unsynchronized read of a plain 64-bit field.
        if (this.firstAccess != 1) {
            return this.firstAccess;
        }
        synchronized (this) {
            if (this.firstAccess != 1) {
                return this.firstAccess;
            }
            this.firstAccess = System.currentTimeMillis();
            return this.firstAccess;
        }
    }
}


Looks pretty safe, doesn’t it? Well, on most modern processors this code IS safe. Storing and reading a 64-bit long should require a single instruction and a single access to memory across its 64-bit wide bus. But there’s no guarantee that’s the case. In fact, the Java Language Specification explicitly states that reads and writes of non-volatile long and double values are not guaranteed to be atomic. So what if someone tried to run your code on a 32-bit machine, like a Raspberry Pi? There’s a chance you’d see a timestamp where only half of the 64 bits had been updated!

Luckily, in Java there’s the volatile keyword. By declaring “firstAccess” volatile, you would guarantee that accesses to it are atomic. But guess what? Depending on your platform, you may now have introduced the need for an atomic instruction, which is what we tried to avoid in the first place!
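As a rough sketch of what the volatile version could look like: the class name FirstAccessRecorder and the choice of 0 as the "not set yet" marker are my assumptions for this example. The fast path copies the field into a local variable so it performs exactly one volatile read.

```java
class FirstAccessRecorder {
    // Assumption for this sketch: 0 never occurs as a real timestamp.
    private static final long UNSET = 0L;

    // volatile makes reads and writes of the long atomic and visible to other threads.
    private volatile long firstAccess = UNSET;

    long accessSomething() {
        long seen = firstAccess; // fast path: a single volatile read
        if (seen != UNSET) {
            return seen;
        }
        synchronized (this) {
            // Second check under the lock; only the first caller sets the value.
            if (firstAccess == UNSET) {
                firstAccess = System.currentTimeMillis();
            }
            return firstAccess;
        }
    }

    public static void main(String[] args) {
        FirstAccessRecorder recorder = new FirstAccessRecorder();
        long first = recorder.accessSomething();
        long second = recorder.accessSomething();
        // The second call should report the timestamp recorded by the first.
        System.out.println(first == second);
    }
}
```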

Your JVM is Probably Smarter than You!

As we have seen, there’s really no safe way of avoiding synchronization or atomic accesses. And when it comes to synchronization, you should understand that in most cases, it’s pretty fast. Most languages implement synchronization something like this (pseudocode):

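int waiters = 0;

void acquireLock() {
    // atomicIncrement returns the value of waiters just before the increment.
    if (atomicIncrement(waiters) == 0) {
        return; // fast path: nobody held the lock
    }
    callSlowAndPainfulLockingLogic();
}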


Do you see what’s going on here? It’s pretty close to a Double Check pattern, isn’t it? It tries the quick and easy way first, then takes the more arduous route if needed. The “atomicIncrement” pseudo function deserves some explanation. Most modern CPUs have an instruction for atomically incrementing a value and returning what it was just before (or in some cases just after) it was incremented. The “waiters” variable holds the number of waiting threads. If I increment it atomically and the number before it was incremented is zero, I can be sure of two things: no one was holding the lock when I tried to take it, and I now own it, since all other threads will see waiters > 0.

Yes, there’s still an atomic instruction here, but as we have shown above, you would need them anyway to implement a Double Check that’s truly safe.
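To make the fast path a bit more concrete, here is a small Java sketch built on AtomicInteger.getAndIncrement, which returns the value just before the increment. The class FastPathLock and its method names are made up for this example, and the slow path is left out entirely: a thread that loses the race simply backs out its increment and reports failure. Real lock implementations in the JVM are considerably more elaborate.

```java
import java.util.concurrent.atomic.AtomicInteger;

class FastPathLock {
    // Number of threads that currently hold or are trying to take the lock.
    private final AtomicInteger waiters = new AtomicInteger();

    // Fast path only: returns true if the lock was free and is now ours.
    boolean tryFastAcquire() {
        if (waiters.getAndIncrement() == 0) {
            return true; // the previous value was 0, so nobody held the lock
        }
        // Someone else got there first. Undo our increment; a real lock
        // would fall through to its slow path here instead.
        waiters.decrementAndGet();
        return false;
    }

    // Must only be called by the thread that acquired the lock.
    void release() {
        waiters.decrementAndGet();
    }

    public static void main(String[] args) {
        FastPathLock lock = new FastPathLock();
        System.out.println(lock.tryFastAcquire()); // lock is free
        System.out.println(lock.tryFastAcquire()); // lock is already held
        lock.release();
        System.out.println(lock.tryFastAcquire()); // free again
    }
}
```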

Empirical Testing

So how much does synchronization really affect performance? The answer is, as usual, “it depends”.

On my MacBook Pro with an i7 processor, a loop incrementing a single integer 100,000,000 times took 112ms without a synch inside the loop. With a synch inside the loop, it took 221ms. So roughly a 100% performance degradation. That seems bad. Yes, but this isn’t a very realistic use case. How often do you write code like this? Rarely. Also, if we look at the cost of each synchronization, the extra 109ms spread over 100,000,000 iterations works out to about 1 nanosecond! Yes, the impact could be higher on a very busy massively parallel machine, but it’s still fairly low for most operations.

Here’s the code:

public class Test {
    public static void main(String[] args) {
        int value = 0;

        // Baseline: plain increments.
        long now = System.currentTimeMillis();
        for (int i = 0; i < 1e8; i++) {
            value++;
        }
        System.out.println("Unsynched version took " + (System.currentTimeMillis() - now) + "ms");

        // Same loop with an uncontended lock around each increment.
        Object syncher = new Object();
        now = System.currentTimeMillis();
        for (int i = 0; i < 1e8; i++) {
            synchronized (syncher) {
                value++;
            }
        }
        System.out.println("Synched version took " + (System.currentTimeMillis() - now) + "ms");
    }
}


A more realistic example may be to access a HashMap 10,000,000 times. The unsynched version takes 40ms and the synched version takes 50ms. Still a difference, but there’s a very limited number of applications where such a difference would have any meaningful impact.

Again, here’s my example code in Java:

import java.util.HashMap;

public class HashTest {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put("foo", "bar");

        // Baseline: plain lookups.
        long now = System.currentTimeMillis();
        for (int i = 0; i < 1e7; i++) {
            map.get("foo");
        }
        System.out.println("Unsynched version took " + (System.currentTimeMillis() - now) + "ms");

        // Same lookups with an uncontended lock around each one.
        Object syncher = new Object();
        now = System.currentTimeMillis();
        for (int i = 0; i < 1e7; i++) {
            synchronized (syncher) {
                map.get("foo");
            }
        }
        System.out.println("Synched version took " + (System.currentTimeMillis() - now) + "ms");
    }
}
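Both programs above time a cold JVM with millisecond resolution, so treat the numbers as ballpark figures. A slightly more careful sketch, assuming a made-up helper called nanosPerOp, runs each variant once as a warm-up and then reports the average cost per operation using System.nanoTime. For serious measurements a dedicated harness such as JMH is the usual choice.

```java
class SyncTiming {
    private static int counter;
    private static final Object LOCK = new Object();

    // Average nanoseconds per call of the given task over the given number of runs.
    static double nanosPerOp(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        Runnable plain = () -> counter++;
        Runnable locked = () -> {
            synchronized (LOCK) {
                counter++;
            }
        };
        int n = 10_000_000;

        // Warm-up runs, so the JIT has had a chance to compile both variants.
        nanosPerOp(plain, n);
        nanosPerOp(locked, n);

        System.out.println("Unsynched: " + nanosPerOp(plain, n) + " ns/op");
        System.out.println("Synched:   " + nanosPerOp(locked, n) + " ns/op");
        // Print the counter so the increments cannot be optimized away as dead code.
        System.out.println("counter = " + counter);
    }
}
```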


When Does Double Check Make Sense?

So far, this article reads like I’m bashing the Double Check pattern. In fact, that’s not at all what I’m trying to do. What I’m worried about are all the improper uses of Double Check that I’ve seen and how they could introduce some very subtle and hard-to-find bugs. It also makes the code more complex and a bit harder to maintain. But it does have its virtues.

So by all means, use Double Check, but use it with caution and only when it makes sense!

Here are some basic rules.

Use Double Check when the lock contention in the fast path is likely

If your code is called millions of times per second, there’s a high likelihood that threads will be stuck waiting on a lock for no reason. If performance is an issue, you may consider implementing a “fast path” that doesn’t require locking.

Your fast path MUST use atomic accesses only!

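In Java, with the exception of object reference assignment and assignment of 32-bit (or smaller) datatypes, nothing is atomic. Atomicity alone isn’t enough either: without volatile, another thread isn’t guaranteed to see the new value, and it may even see a reference to an object whose fields aren’t fully initialized yet. So you need to take care that your fast path takes the appropriate precautions to make sure all accesses are atomic and visible. The volatile keyword and the java.util.concurrent.atomic package are very useful here.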

Also keep in mind that even if you make atomic accesses, you’re typically only allowed one such access in your fast path. If you check more than one value, your code is not atomic anymore and may very well end up in a race condition with the slow path.
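One way to stay within the "single atomic access" rule when you really need two values is to bundle them into an immutable object and publish that object through one reference. The sketch below does that with an AtomicReference. The names EndpointHolder and Endpoint, and the host and port values, are all hypothetical and only there to make the example runnable.

```java
import java.util.concurrent.atomic.AtomicReference;

class EndpointHolder {
    // Immutable pair: both values are always published together.
    static final class Endpoint {
        final String host;
        final int port;
        Endpoint(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    private final AtomicReference<Endpoint> current = new AtomicReference<>();

    Endpoint get() {
        Endpoint snapshot = current.get(); // the single atomic access on the fast path
        if (snapshot != null) {
            return snapshot; // host and port are guaranteed to belong together
        }
        synchronized (this) {
            // Second check under the lock.
            snapshot = current.get();
            if (snapshot == null) {
                snapshot = new Endpoint("localhost", 8080); // hypothetical values
                current.set(snapshot);
            }
            return snapshot;
        }
    }

    public static void main(String[] args) {
        EndpointHolder holder = new EndpointHolder();
        Endpoint e = holder.get();
        System.out.println(e.host + ":" + e.port);
    }
}
```

Had host and port been two separate volatile fields, the fast path could have read a new host together with an old port. With one reference to one immutable object, that can’t happen.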

Consider using a read/write lock!

Sometimes it’s not possible to make the fast path fully atomic. Does that mean that all hope is lost for Double Check? Not necessarily! You can use something like a java.util.concurrent.locks.ReadWriteLock. These are what’s known as asymmetric locks, which allow multiple readers, but only one writer. Once the writer acquires a lock, it also blocks all readers. I’m planning to write an article about this in the near future, but in basic terms, you would essentially wrap a read lock around your fast path and a write lock around the slow path.
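Until that article exists, here is a minimal sketch of the idea using ReentrantReadWriteLock. The class ReadWriteCache and the way it fabricates values are assumptions for the example. Note that ReentrantReadWriteLock can’t upgrade a read lock to a write lock, so the read lock is released before the write lock is taken, and the check is repeated under the write lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteCache {
    // A plain HashMap is fine here because every access happens under the lock.
    private final Map<String, String> values = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    String get(String key) {
        // Fast path: many readers may hold the read lock at the same time.
        lock.readLock().lock();
        try {
            String value = values.get(key);
            if (value != null) {
                return value;
            }
        } finally {
            lock.readLock().unlock();
        }

        // Slow path: the write lock excludes both readers and other writers.
        lock.writeLock().lock();
        try {
            // Second check: another writer may have been here before us.
            String value = values.get(key);
            if (value == null) {
                value = "value-for-" + key; // hypothetical expensive creation
                values.put(key, value);
            }
            return value;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteCache cache = new ReadWriteCache();
        System.out.println(cache.get("a"));
        System.out.println(cache.get("a"));
    }
}
```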

Document, document and document! Did I mention “document”?

When you’re implementing a Double Check pattern, add code comments clearly stating what you’re doing. Some maintainer may poke around in the code without understanding the requirements for the fast path to be atomic and someone may have to spend days or weeks chasing strange bugs!

The catch-all: Use only when needed!

I recently reviewed some code where a Double Check pattern was used in a function that was called maybe ten times during application startup. To make matters worse, the code had a subtle bug in it. So the developer shaved maybe a couple of microseconds off the application startup time at the expense of code complexity that caused a bug. So don’t bother using this pattern unless you expect your code to be called very frequently and where lock contention could have a meaningful impact on performance!

Conclusion

The Double Check pattern can be a life saver when you are under heavy performance requirements with code that’s called millions or billions of times. But there are many pitfalls and I’ve seen a fair number of bugs caused by programmers who don’t fully understand the semantics of the pattern. So use it with care and only when needed!

Follow me on Twitter @prydin!

 

 

 

Bosphorus: A portal framework for vRealize Automation

I’m back!

I know it’s been a while since I posted anything here. I’ve been pretty busy helping some large financial customers, being a part of the CTO Ambassador team at VMware and preparing for VMworld. But now I’m back, and boy do I have some exciting things to show you!

One of the things I’ve been working on is a project that provides a framework for those who want to build their own portal in front of vRealize Automation. The framework also comes with a mobile-friendly reference implementation using jQuery Mobile.

This article is just a teaser. I’m planning to talk a lot more in detail about this and how to use the vRA API for building custom portals and other cool things. Below is an excerpt from the description on github. For a full description, along with code and installation instructions, check out my github page here: https://github.com/njswede/bosphorus

Background

This project is aimed at providing a custom portal framework for vRealize Automation (vRA) along with a reference implementation. It is intended for advanced users/developers of vRealize Automation who need to provide an alternate User Interface or who need to integrate vRA into a custom portal.

Bosphorus is written in Java using Spring MVC, Spring Boot, Thymeleaf and jQuery. The reference implementation uses jQuery Mobile as a UI framework. The UI framework can very easily be swapped out for another jQuery-based framework.

The choice of Java/Spring/Thymeleaf/jQuery was deliberate, as it seems to be a combination that’s very commonly used for Enterprise portals at the time of writing.

Why the name?

I wanted a name that was related to the concept of a portal. If you paid attention during geography class, you know that the Bosphorus Strait, located in Turkey, is the portal between the Mediterranean Sea and the Black Sea. Plus it sounds cool.

Design goals

  • Allow web coders to develop portals with no or little knowledge of vRA
  • Implement on a robust platform that’s likely to be used in an enterprise setting.
  • Easy to install.
  • Extremely small footprint.
  • Extremely fast startup time.
  • Avoid cross-platform AJAX issues.

Features

Bosphorus was designed to have a very small footprint and to start and run very fast. At the same time, Bosphorus offers many advanced features such as live updates using long-polling and lazy-loading UI-snippets using AJAX.

Known bugs and limitations

  • Currently only works for the default tenant.
  • Only supports day 2 operations for which there is no form.
  • Displays some day 2 operations that won’t work outside the native vRA portal (such as Connect via RDP).
  • Only allows you to edit basic machine parameters when requesting catalog items. Networks, software components, etc. will be created using their default values.
  • In the Requests section, live update doesn’t work for some day 2 operations.

Future updates

I’m currently running Bosphorus as a side project, so updates may be sporadic. However, here is a list of updates I’m likely to post in the somewhat near future:

  • Support for tenants other than the default one.
  • More robust live update code.
  • Support for “skins” and “themes”.
  • Basic support for approvals (e.g. an “inbox” where you can approve or reject requests)

Screenshots

[Screenshots of the Bosphorus reference implementation]

Glucose Monitoring – A vRealize Operations Adapter using the SDK

Background

I have a close friend who is unfortunate enough to suffer from Type 1 Diabetes. As you may know, this deadly disease can only be managed by constantly monitoring your blood glucose and injecting insulin. Fortunately, there’s a silver lining and it’s called “cool gadgets”. Many diabetes sufferers are now wearing continuous glucose monitors, which electronically monitor the patient’s glucose level and display the numbers and trends.

But it doesn’t stop there. The Dexcom glucose monitor allows the metrics to be sent to the cloud in real time for remote monitoring and analysis. Of course it didn’t take long before some geeky diabetics figured out how to harvest the data and build open-source tools around it. They founded the Nightscout Foundation and continue to build useful and life-saving tools for managing diabetes on an open-source basis. You should check them out! They do amazing work!

When I heard about this, I was thinking that this is just time series data with trends and patterns that’s accessible through a simple REST-API. Then I thought of vR Ops, which is a tool that can analyze such data and I figured it would be fun to write an adapter that pulls in blood glucose data into vR Ops. So the next time I had a break between two meetings, I went to work!

The most interesting thing about this project is, in my opinion, how it leverages the dynamic thresholds to determine the normal range for blood glucose levels over the course of a day. It serves as a great demo of how the algorithms in vRealize Operations can adapt to virtually anything!

graph

vRealize Operations Java API

Since I have a background as a Java developer, I realized that using the Java API for vRealize Operations would be the easiest way to go. The Java API is essentially a wrapper around the REST API that allows you to interact with vRealize Operations through regular Java classes. If you are a Java programmer and want to interact with vRealize Operations, this is the way to go!

The Nightscout API

This API is dead simple. All you have to do is to issue a GET against http://{host}/api/v1/entries/current to get the latest sample. The sample is then returned as a tab-separated string containing timestamp, glucose reading and a trend indicator. For this project, we’re just using the timestamp and the glucose reading.
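To give an idea of what that looks like in code, here is a rough Java sketch that fetches the current entry and picks out the two values we care about. It is not the code from the adapter. The class name NightscoutSample is made up, and the field layout it assumes (epoch timestamp in milliseconds first, glucose reading second) is an assumption you should check against what your Nightscout instance actually returns.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class NightscoutSample {
    final long timestamp;
    final double glucose;

    NightscoutSample(long timestamp, double glucose) {
        this.timestamp = timestamp;
        this.glucose = glucose;
    }

    // Assumed layout: timestamp <TAB> glucose <TAB> trend. Verify before relying on it.
    static NightscoutSample parse(String line) {
        String[] fields = line.trim().split("\t");
        if (fields.length < 2) {
            throw new IllegalArgumentException("Unexpected sample: " + line);
        }
        return new NightscoutSample(Long.parseLong(fields[0]), Double.parseDouble(fields[1]));
    }

    // Issues the GET described above and returns the first line of the response.
    static String fetchCurrent(String host) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://" + host + "/api/v1/entries/current").toURL().openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            return in.readLine();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Made-up sample line in the assumed layout; no network access needed.
        NightscoutSample sample = parse("1436557278000\t123\tFlat");
        System.out.println(sample.timestamp + " -> " + sample.glucose);
    }
}
```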

Running the code

We could have created some kind of intelligent scheduler trying to synchronize data collection with the timestamps of returned data to minimize lag, but in this case, I just went with a simple Java program that’s kicked off by a cron-job set to execute every five minutes.

The Java program takes the following parameters: A username and password for vRealize Operations, a URL for Nightscout and the name of the patient we’re collecting data for.

A walkthrough of the code

The run() method

The code is fairly simple, but it illustrates a few key concepts of the Java API for vRealize Operations. The bulk of the work happens in the run() method.

[Screenshot: source code of the run() method]

This function is called once the parameters are extracted and parsed. First, it opens an HTTP connection to Nightscout and issues a GET. The result is then parsed, simply by splitting up the tab-separated string.

Once that’s done, we call vRealize Operations to look up an object representing the patient we’re monitoring. We do that by looking up an object of the type “Human” with the patient name supplied on the command line. If we don’t find the patient record, we create it.

Finally, we pack the data samples into an array and call “addStats” to send the new sample to vR Ops.

The findResourceByName() method

This method is really just a wrapper around the API call for looking up an object by name.

[Screenshot: source code of the findResourceByName() method]

The first three lines deal with the resource type. When this is first executed, the resource type “Human” isn’t going to exist so we have to create it.

Once we have the resource kind identifier, we can go ahead and call the “getResources” API method. This will return a list of resource identifiers, but since we know there should only be one, we can return the first (and only) object we find.

The createHuman() method

If the object representing the patient doesn’t exist, we need to create it.

[Screenshot: source code of the createHuman() method]

Again, this is just a prettier version of the API method for creating a resource. All it does is to fill out a structure with the resource kind identifiers, adapter kind and adapter name. The “resourceKey” is the unique identifier for this object and in our case consists of the patient name.

The addStats() method

This method performs the final task of actually sending the metrics to vRealize Operations.

[Screenshot: source code of the addStats() method]

As usual, this method does little more than just assembling a structure containing the necessary values for the API call. Notice that we could send multiple timestamps and values if we wanted. In this case, however, we’re just dealing with the latest value from Nightscout, so we’re only supplying a single timestamp and value.

The end result

[Screenshot: the glucose graph with its dynamic thresholds in vRealize Operations]

The end result is actually quite fascinating and a great testament to the dynamic thresholds in vRealize Operations. Most people experience a spike in blood glucose level after a meal and a lower blood glucose level during the night. For a person with diabetes, these swings are typically more pronounced. If you look at the dynamic thresholds, you’ll see a lower and more narrow range during the night. After lunch and dinner, the range is wider and spikes are more common. This is clearly visible from the dynamic thresholds (the gray area) behind the graph.

This particular person usually has a low-carb breakfast, so you don’t normally see a spike. However, this day, he had pancakes for breakfast and the blood glucose spiked significantly outside of the normal range. How is that for a demonstration of the dynamic threshold feature in vRealize Operations?

Dashboards

No metric collector is complete without some great dashboards. Here are some examples of dashboards we built for the glucose data.

[Dashboard screenshot: Real time view]

[Dashboard screenshot: Long term view]

Conclusion

Although this is a somewhat exotic example of API use, it clearly illustrates the simplicity of use and the power of the automatically calculated dynamic thresholds.

Please consider supporting the research for a cure to Type 1 diabetes. JDRF is an excellent organization to support!

http://jdrf.org/get-involved/ways-to-donate/

The Nightscout Foundation also could use your donation to further promote and develop the open source technology for helping diabetics.

http://www.nightscoutfoundation.org/product/donate-2/

Downloads

The complete source code can be downloaded here.

Source code

Disclaimer

This project is intended as an example of what is possible using glucose monitoring technology and vRealize Operations. It is NOT intended to diagnose or cure any disease or condition. Type 1 Diabetes is a serious condition and should ONLY be managed using methods approved by the appropriate government bodies and recommended by your healthcare professional!