
Wednesday, 12 February 2014

The McLaren P1 - The Nimble Storage of the Motor Industry


I’m quite a fan of the UK BBC TV series Top Gear. I’m not a massive “petrol head”, yet I enjoy the show's car-related features, the celebrity banter and the frequent humiliation of James May.

So I sat down on Sunday night to watch the latest episode and the main feature for the show was based on McLaren’s new road car - called the P1. 

Beautiful, isn't it!
If you haven’t seen this episode yet I implore you to watch it on BBC iPlayer (UK) or make sure you see it on YouTube here: (international) https://www.youtube.com/watch?v=xbW65hJHiTE or here: (UK) https://www.youtube.com/watch?v=nVJNpwTYhR0.

Jeremy Clarkson was the lucky sod assigned (or, more likely, who jumped at) the chance to review the P1 on the mean streets of Belgium. And he couldn’t stop cooing about it! I’ve littered this blog post with his quotes from the show…
#moodypressphoto
"Now, what is the P1" I hear you ask, "and what has it got to do with enterprise storage?!"
Well - the McLaren P1 is a HYBRID car, powered by electric and petrol (or gas if you’re American), yet it produces 903 horsepower, achieves 0-60mph in under 2.8 seconds and hits 0-120mph in 6.8 seconds. In fact it reaches 120mph in the same time it takes a Golf GTI to reach 60mph! It can also cruise around in electric-only mode should you wish, or happily sit on a motorway at 70-80mph in hybrid mode and be economical. But when performance is required, it can adjust on the fly and deliver raw power wherever it's needed.
"Oh, they should have called this the widowmaker!"
This is not the standard type of hybrid car you and I know, e.g. a Toyota Prius (yuk!). This is a redesigned-from-the-ground-up “hypercar” that uses electric technology to ENHANCE the performance of the internal combustion engine, whilst using active aerodynamics to manage airflow and downforce at various speeds… in other words, using new technology to enhance well-known, legacy engineering to produce something phenomenal.
“This is partly because it’s made of stuff from the future, and partly because it’s clever; it adapts and moves around on demand to suit its environment”.
Of course the key to all of this is “redesign”. If McLaren (or indeed Ford, VW, GM etc.) tried to do this to any of the road cars they sell today it would frankly fail, mostly because a platform designed around the internal combustion engine alone cannot harness the great things about electric in the best way possible. This is what we in vendor land call a “bolt-on solution”: someone offering a hot piece of tech on top of their legacy systems in order to try and capture some of the market.

An example of "bolt-on" solutions to legacy technologies
This whole analogy is a very familiar story, one we at Nimble Storage have been telling for the last 3-4 years about enterprise storage. Current storage manufacturers are pushing wares built on legacy technology - disk, RAID sets, single-core CPUs and single-threaded OSes - with a high service cost to put it all together. These guys are the storage equivalent of selling the internal combustion engine: a few tweaks can squeeze out a bit more efficiency, but ultimately there are engineering challenges that cannot be overcome without a full redesign.

The analogy of "sticking a rocket on a beetle" somehow feels appropriate here!
It’s also possible to go the storage equivalent of fully electric with flash-only arrays, which promise massive reductions in power consumption and ridiculous increases in performance. However, those guys share the challenges of electric cars: typically early-generation, immature systems lacking the enterprise software features we expect of today’s gear, at a relatively high cost to procure.

Everyone has different needs, of course, but I know I could not invest in an electric car to do the 15,000 miles a year I do today in my current diesel motor... although I may in 10-15 years, when it makes more sense and becomes more cost-effective. Even then I wouldn't be willing to compromise on features I deem mandatory, such as iPhone integration, sat nav, heated seats, cruise control and so on.

Full electric certainly has its use cases, but they are very much a niche 1% of the market (mostly adopted by inner-city taxi drivers, I've observed - but I digress), which is why most manufacturers are not diving head first into releasing a full fleet of electric-only models. The same could be argued for flash-only systems against 99% of enterprise data storage requirements (regardless of what the very well-put marketing says!).

Nimble converges tier 0 & 1 primary storage and backup/snapshot storage in one, with no gotchas.
Which leads us back to hybrid offerings, of both cars and storage. These are offered by the legacy vendors as a bolt-on to their current platforms (think GM/IBM, Ford/Dell, VW/HP), or by companies such as McLaren/Nimble Storage as a redesigned, ground-up solution (with a brand-new file system in our case!). That redesign allows us to use the cool things around today (e.g. MLC flash, lots of DRAM & NVRAM and multi-core Intel CPUs) to enhance our internal combustion engine (RAID-6 7.2K NL-SAS), with some redesigned secret sauce called CASL to get the best of both worlds in a way our competitors cannot achieve - meaning you can potter about at 30mph, cruise at 70mph, but if big demanding workloads come along at 200mph we can easily deliver and cope with the requirements.

CASL - Cache Accelerated Sequential Layout - the secret sauce behind the groundbreaking tech.
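To make the hybrid idea a little more concrete, here's a toy sketch of the general concept behind a cache-accelerated sequential layout: coalesce incoming random writes and flush them to disk as large sequential stripes, while serving hot reads from a flash cache. This is purely my own illustration of the principle - NOT Nimble's actual CASL implementation:

```python
# Toy model of a hybrid write/read path: random writes are buffered and
# flushed as one sequential stripe; hot blocks are read back from flash.
# Illustrative only - NOT Nimble's actual CASL code.
from collections import OrderedDict

STRIPE_BLOCKS = 8  # blocks buffered before a full sequential stripe write

class ToyHybridArray:
    def __init__(self, flash_capacity=4):
        self.write_buffer = {}      # NVRAM/DRAM-style write buffer
        self.disk = {}              # data laid out in sequential stripes
        self.flash = OrderedDict()  # LRU flash read cache
        self.flash_capacity = flash_capacity

    def write(self, block, data):
        self.write_buffer[block] = data  # random write lands in fast buffer
        if len(self.write_buffer) >= STRIPE_BLOCKS:
            # one large sequential write replaces many small random ones
            self.disk.update(self.write_buffer)
            self.write_buffer.clear()

    def read(self, block):
        if block in self.write_buffer:   # newest data still in the buffer
            return self.write_buffer[block]
        if block in self.flash:          # cache hit: flash-speed read
            self.flash.move_to_end(block)
            return self.flash[block]
        data = self.disk[block]          # cache miss: read from 7.2K disk
        self.flash[block] = data         # promote the hot block into flash
        if len(self.flash) > self.flash_capacity:
            self.flash.popitem(last=False)  # evict least recently used
        return data
```

The point of the sketch: the slow NL-SAS disks only ever see large sequential writes (which they're good at), while the flash layer absorbs the random reads (which it's good at) - the "best of both worlds" described above.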
“…and what I find hysterical is that McLaren has taken this hybrid technology which is designed to reduce the impact of the internal combustion engine…. and is using it to INCREASE the impact. It’s like weaponising a wind farm”.
One area where the comparison with the P1 breaks down is features and functionality. The P1 was designed to be incredibly light and agile - Clarkson even stated it “was designed to be as fat as Iggy Pop” - which makes it economical and fast.

With Nimble we’ve managed to optimise ourselves to be as mind-blowingly fast as we need to be (60,000 IOPS is normally fast enough for most enterprise environments), but we’ve also built a robust enterprise feature set of data protection and application integration that most of the traditional vendors would be proud to have. All wrapped up in our big-data analytics engine called Infosight, for true cloud-based, proactive monitoring and support.

“For years, cars have all been basically the same, but this really isn’t. It’s a game changer; genuinely a new chapter in the history of motoring”.

Could not have put it better myself.

Howto: Nimble Storage CS210 Live SSD Cache Upgrade

One Nimble Storage upgrade feature you may have missed recently is the ability to add bigger SSDs to our starter array (the CS210). Originally this system shipped with 8x 1TB 7.2K drives and 2x 80GB SSDs, primarily designed for the SMB market (or ROBO/DR/UAT environments) - and it could not be upgraded.

Using our Infosight deep data-analytics engine, we’ve detected that quite a few CS210 systems are being used for more than they were originally scoped for - some of my customers are running VDI, SQL, Exchange and Oracle on these modest arrays; the list goes on. Check out the screenshot below from one customer’s Infosight to see what I mean…

Infosight screenshot showing SSD cache saturation >200%
So, starting from Nimble OS 1.4.8 and 2.0.5, we now fully support upgrading the SSDs live to two bigger configurations (the cache arithmetic is sketched below):
  • Adding two more 80GB SSDs, for four 80GB SSDs in total and 320GB of cache (known as the CS210-X2)
  • Adding two 160GB SSDs and then replacing the two original 80GB SSDs with 160GB models, for four 160GB SSDs in total and 640GB of cache (known as the CS210-X4)
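For clarity, here's the raw cache arithmetic behind each configuration - a quick sanity-check sketch of my own (raw SSD capacity; usable cache is a little lower, hence the ~600GB figure the GUI reports for the -X4):

```python
# Raw cache capacity for each CS210 SSD configuration (illustrative only).
configs = {
    "CS210":    [80, 80],              # original shipping configuration
    "CS210-X2": [80, 80, 80, 80],      # two more 80GB SSDs added
    "CS210-X4": [160, 160, 160, 160],  # all four SSD bays on 160GB drives
}
for model, ssds in configs.items():
    print(f"{model}: {len(ssds)} SSDs = {sum(ssds)}GB raw cache")
# CS210:    2 SSDs = 160GB raw cache
# CS210-X2: 4 SSDs = 320GB raw cache
# CS210-X4: 4 SSDs = 640GB raw cache
```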

I recently upgraded my >2 year old CS210 to the new -X4 configuration.

Notice the 2x80GB SSDs in the middle of the array (yes, it is sat on carpet!)
Here are the steps I took:

1. Check out the “CS210 Cache Upgrade Guide” on Infosight (available here - needs an Infosight login).

2. For both the -X2 and the -X4 upgrades, the new drives must be fitted in drive bays 7 and 10 only, starting with bay 7.
2x 160GB SSDs installed alongside the old 80GB SSDs.
Notice the model is still CS210, with two RED X's - this is because it currently has an invalid SSD configuration.
3. Next I had to remove the two 80GB SSDs and replace them with new 160GB SSDs. This is an online process: it needs no downtime for the array or your servers, nor any RAID rebuild time (there is no RAID on the SSDs) - although in a production environment you may want to space out removing the two drives, as the array has to re-cache any data that was on a removed drive. I started with bay 8, then finished with bay 9.
All four SSDs now added!
4. That’s it - my whole upgrade was complete. You can now see I have all four new SSDs (denoted by the new green Nimble Storage logo) alongside my old blue blanking plates & 7.2K drives!
Also note the GUI has updated automatically from CS210 to CS210-X4, with a total cache of approximately 600GB usable.


This upgrade is available NOW from Nimble Storage. If you’re not sure whether you need an upgrade, head to Infosight! It’s an amazing tool that gives you great information about many things - in this case it pinpoints CPU and cache usage and the potential benefits of an upgrade.

Thursday, 3 October 2013

Roll up, roll up! Nimble Storage & VMworld Europe - Oct 14 - 17.

It's about THAT time of the year again; VMworld Europe is only days away! (12 days exactly as I write this).


VMworld has always been a very important conference for me personally and professionally, and is one that I always try to attend. This year's event will be my SEVENTH show in four years (I have been lucky enough to be posted across to the US version a few times too!) and it's great to see the European show getting stronger every year. I believe this year will be the strongest one yet.

There are many things I enjoy about the conference - from meeting customers new and old, to catching up with colleagues from years gone by (it's always amusing to see who works for which vendor/partner each year). I'll also be spending some of my time in the actual conference centre this year, as I have the chance to attend more seminars and breakout sessions than in previous years. A key point I've always believed in is to never stop learning and expanding your knowledge, and VMworld is certainly the place to do that! On the flip side, I'm not looking forward to the dull pain in my feet come Thursday afternoon!

For Nimble Storage it's quite possibly the most important show of the year, as we have the opportunity to meet with current and new customers in a friendly atmosphere over three days. Virtualisation and/or VDI as a platform is possibly the number one driver behind a customer's decision to implement Nimble Storage, so we aim to give VMworld the respect and attention it deserves!

So, what are we planning for this year's show?


Firstly, my colleague and all-round genius Devin Hamilton and I will be hosting and presenting a customer panel on Disaster Recovery strategy, experiences and lessons for virtualised environments. This is session BCO5431, taking place on Tuesday October 15th at 11am local time. The direct link is here.

We have a pretty cool booth organised (number E513) and will be manning it with all sorts of super-smart techies. We'll be running on-demand demos of the following for your pleasure:

    • Introduction to Nimble Storage
    • Nimble Storage OS 2.0
    • VMware vSphere Integration & Automated Connection Manager (very cool!)
    • Microsoft Windows Connection Manager (also very cool!)
    • Microsoft SQL & Exchange integration & recovery
    • Nimble Storage Infosight (Support based on Data Analytics!)



We will be raffling A WEEKEND EXPERIENCE WITH A PORSCHE 911S!!! Yes, you read that right, we will be giving away a Porsche 911S to you, for free, for the whole weekend! How great is that? (I'm a little annoyed that employees can't enter...)


If you don't win the first prize, we'll also be raffling off other cool goodies such as Beats By Dre headphones. 

You've got to be "in it to win it" as we say in the UK - and you can enter ahead of time here, or enter at the booth.

Nimble Storage are also sponsoring the vRockstar party, which is taking place at the Hard Rock Cafe on the Sunday night before the show. Attendance is free (and so is the drink!). You can sign up here if you're in town - we'll be pleased to see you!

I'll be at the show all week from Monday through Thursday - so if you have 5 minutes please feel free to drop by booth E513 (or say hello to me in a session I may be attending - I'm going to learn just like you guys are!).

See you there!

Nick

Monday, 30 September 2013

Using VMware I/O Analyser To Test Storage Performance

Today I've been experimenting with the VMware I/O Analyser, a useful tool for driving storage performance testing from a fairly even baseline.

The tool is essentially an Ubuntu VM with IOMeter and a web front end (which is OK, but doesn't help much), wrapped up in a nice OVF package.

The first step is to download the package from the VMware Flings website (http://labs.vmware.com/flings/io-analyzer).

Next we need to push the OVF into the environment. As this is the controller VM, we do not want to monitor its backend performance stats, so it should NOT be deployed on the datastore we wish to test. Instead, deploy the VM onto a local datastore or another shared volume. This is because the workload being generated may saturate the disks, SAN controllers or network, which could affect the controller VM itself, skewing the test and giving an unfair result.

In my case "Nimble-IOAnalyser" was my local datastore, and VMFS-01 was my datastore to be tested:


Select a datastore which is NOT to be tested by the I/O Analyser for your OVF deployment.
VMware's best practice for deploying virtual machines is to provision the disks as "Thick Provision Eager Zeroed", so go ahead and use that.




Once the VM has been deployed, it's key to take a look at its settings and the disks it has created. You'll notice the test disk (Hard Disk 2) has been created with a size of only 100MB, which means 100% of the testing will reside in memory or in controller cache. This is something we need to avoid, as it does not provide a fair, real-world result. This disk is also provisioned onto the wrong datastore by default.


The default testing drive needs to be deleted (Hard Disk 2)
Delete this disk and create a new one of at least 100GB. Place it on the datastore whose performance you wish to test, and again use "Thick Provision Eager Zeroed".
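If you'd rather script this step than click through the GUI, a pyVmomi sketch along the following lines could work. This is my own untested illustration; the vCenter hostname, credentials, VM name and datastore name are all hypothetical placeholders:

```python
# Hedged sketch: attach a new 100GB eager-zeroed disk to the I/O Analyzer
# VM on the datastore under test. Names and credentials are hypothetical.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; verify certs in production
si = SmartConnect(host="vcenter.lab.local", user="administrator",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

def find_by_name(vimtype, name):
    """Return the first inventory object of the given type with this name."""
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vimtype], True)
    return next(obj for obj in view.view if obj.name == name)

vm = find_by_name(vim.VirtualMachine, "IOAnalyzer")
controller = next(d for d in vm.config.hardware.device
                  if isinstance(d, vim.vm.device.VirtualSCSIController))

disk = vim.vm.device.VirtualDisk()
disk.backing = vim.vm.device.VirtualDisk.FlatVer2BackingInfo()
disk.backing.diskMode = "persistent"
disk.backing.thinProvisioned = False
disk.backing.eagerlyScrub = True          # "Thick Provision Eager Zeroed"
disk.backing.fileName = "[VMFS-01]"       # datastore under test
disk.capacityInKB = 100 * 1024 * 1024     # 100GB test disk
disk.controllerKey = controller.key
disk.unitNumber = 1                       # assumes SCSI 0:1 is free
disk.key = -101                           # temporary negative key

disk_spec = vim.vm.device.VirtualDeviceSpec()
disk_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
disk_spec.fileOperation = vim.vm.device.VirtualDeviceSpec.FileOperation.create
disk_spec.device = disk

vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[disk_spec]))
Disconnect(si)
```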



Once deployed, power on the VM. You should be greeted with the familiar VMware appliance screen. Change the timezone to whatever is relevant to you (GMT for me). You could also change the network settings here, but I left mine on DHCP as it's not really needed. Log in to the VM with the default credentials of root/vmware to continue.

Note: we're not going to use the web client provided with this tool; whilst it's OK, it doesn't let you change any of the default values in the I/O testing phase of the workload.


Once logged in you'll see an Ubuntu desktop with a terminal window open. Right-click on the desktop and open another (Terminals->xterm).

In the first terminal, you want to type "/usr/bin/dynamo". This starts the backend IOMeter worker and thread engine.

In the second window, type "wine /usr/bin/Iometer.exe". This opens the IOMeter application, which should tie into the dynamo engine you just started.

Note: ignore the "Fail to open kstat..." message underneath; as long as you started dynamo before IOMeter, it'll be OK.


From here onwards it's down to how you use IOMeter. There are lots of incorrect ways to use this tool that won't give the results you expect, so here are my settings:

  • Two workers, both mapped to the new 100GB volume (sdb).
  • Maximum disk size of 200,000,000 sectors (which translates to roughly 95GB).
  • 32 outstanding I/Os per target (if this is left at 1, the test will not adequately drive the storage array!).


  • A new workload policy created for 4K blocks: 100% random, 100% write. (100% write should ALWAYS be the very first workload run against the volume; otherwise there is no data to actually read back, and the results will not reflect the real world.)
  • Align the I/Os to the block size of your volume to remove disk misalignment performance discrepancies. In my case it's 4K.
  • Assign this workload to each of your workers to ensure you're consistent with your tests.
Some people may ask why 100% random: if you wish to test the IOPS performance of your storage, random I/O is the key to generating those statistics. If you wish to test network throughput (rather than IOPS), sequential I/O should be used instead. You should NOT mix these workloads together, as you will get inconsistent and inconclusive results for both disk and network stats.
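To put some rough numbers behind this, here's a quick back-of-the-envelope sketch of my own (assuming the common 512-byte sector size):

```python
# Why "200,000,000 sectors" ~= 95GB, and why block size determines whether
# you are testing IOPS or bandwidth. Illustrative arithmetic only.
sectors = 200_000_000
print(f"Test region: {sectors * 512 / 1024**3:.1f} GiB")  # ~95.4 GiB

# throughput = IOPS x block size, so small random blocks stress the
# array's ability to service operations, while large sequential blocks
# saturate the network/SAN links long before any IOPS limit is reached.
def throughput_mb_s(iops, block_kib):
    return iops * block_kib / 1024

print(throughput_mb_s(60_000, 4))   # 60K IOPS at 4K  ~= 234 MB/s
print(throughput_mb_s(4_000, 256))  # 4K IOPS at 256K ~= 1000 MB/s
```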


  • Before running the test, ensure you set a ramp-up period (I use 15 seconds) and a standard run time (I use 10 minutes). Ensure all your workers are selected.
  • Click the Green Flag to start the test!
I hope the above has been useful to you. Please feel free to run this test and let me know your stats in the comments box on this page, along with the make/model of your storage array - it would make a fun survey!

PS - if you wish to see my results:







Wednesday, 25 September 2013

Nimble Storage 2.0 Part 1 - Scale-To-Fit

This is the first in a series of blog posts focusing on the new Nimble OS 2.0 which has been made available in Release Candidate status to our customer & partner base.

Nimble Storage has hit a milestone in its product development and feature set with version 2.0 of its award-winning operating system, based on the CASL file system (Cache Accelerated Sequential Layout). 2.0 introduces some nice features which I'll cover over a series of blog posts in the following weeks.

Scale-To-Fit - Nimble OS 1.4

Approximately 12-14 months ago Nimble launched the concept of "Scale-To-Fit" storage technology. The idea was that traditional storage systems had limited options for expansion once the customer had deployed the environment, typically:
  1. Add lots more slow disk for capacity or fast disk for performance (very common, not much risk involved).
  2. Forklift-replace the controller heads with new-generation controllers to allow further disk expansion (very expensive, high risk due to potential data downtime or loss, lots of Professional Services involved, rarely done).


Scale-To-Fit was designed to break this mould and give the customer a choice in how to upgrade their investment as their environment evolves. This was released in firmware version 1.4 of the Nimble OS, and allowed any Nimble customer to do the following:


  1. Add additional shelves of high-capacity drives for extra storage capacity.
  2. Add bigger SSD drives into the array to allow for bigger working sets for random-read cache.
  3. Upgrade Nimble controllers from CS200 to CS400 series to add more IOPS in the array.

All of the above upgrades were (and still are) designed to be performed online, without any maintenance windows or data downtime - and they work without any hiccups, which is very impressive (especially numbers 2 & 3 - see here for how our very first production customer upgraded their 3-year-old system this year: http://www.nimblestorage.com/blog/customers/non-disruptive-upgrade-to-nimble-storages-first-deployed-array/).

So what does 2.0 bring to Scale-To-Fit?

In Nimble OS 2.0 we complete the Scale-to-Fit strategy by bringing Scale-Out technology into the Nimble family!



Scale-Out is a technology mostly championed by EqualLogic (now Dell, my former employer) and Lefthand Networks (now HP). It is designed to allow customers to place multiple arrays into a "group", so that volumes and/or datasets can be distributed, balanced and "live migrated" across different storage platforms without downtime.

The downside to the two legacy technologies mentioned above was that scale-out is/was their only way of scaling. If a customer needed just more capacity, they had to buy a whole new array with controllers, networking and rack space just for a few additional TBs of space - and these boxes were typically more expensive than the initial purchase due to less aggressive discounts (the age-old sales ploy of discounting high to win the business, then discounting low to increase margin - I've seen it time and time again). One of my old EQL customers ended up with TWENTY-THREE 3u or 5u arrays in their production site!

What Nimble offers is a full suite of scaling technologies without any gotchas: if a customer just needs capacity, they can buy additional shelves; if they want to scale out and group the performance and capacity of their arrays together, they can do that too! All of this can be done live, without any downtime or professional services work. Nice!

The beauty of scale-out is that it does not lock customers into generations of gear; older arrays can be grouped with newer, faster generations for seamless migration of data before the older array is evacuated and repurposed for UAT, Disaster Recovery or other uses (as one example).

On first release we are supporting up to FOUR Nimble arrays in a scale-out group. These can be of any capacity or generation, with any size of SSD, HDD or controller!

Note: This does NOT mean we are clustering arrays together - we do not need a special backend cluster fabric to handle array data, as you may have seen with other vendors' implementations of scale-out or cluster-mode. We also do not require downtime or any swing kit to move data off an array to enable this new feature.

The new "Arrays/Groups" section of the Nimble GUI
Scale-Out is the last piece of the Nimble Scale-to-Fit strategy. A customer who started with a single 3u, 500W array can now add more capacity (an additional 96TB usable using 3TB drives), add bigger MLC SSDs for larger cache working sets (up to 2.4TB usable), add bigger controllers to take the array from 20K to 70K IOPS, and now add additional Nimble arrays for a single management point and yet more performance, capacity and scale!
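As a thought experiment, here's a toy model of the general scale-out concept: arrays of mixed generations join a group that pools capacity and places volumes on the least-utilised member. This is purely conceptual, to illustrate the idea - it is not how Nimble OS actually implements scale-out, and the array names and sizes are made up:

```python
# Conceptual sketch of scale-out grouping (NOT Nimble's implementation).
class Array:
    def __init__(self, name, capacity_tb, used_tb=0.0):
        self.name = name
        self.capacity_tb = capacity_tb
        self.used_tb = used_tb

    def utilisation(self):
        return self.used_tb / self.capacity_tb

class Group:
    """Up to four arrays of any generation pooled under one management point."""
    def __init__(self, arrays):
        assert len(arrays) <= 4, "first release supports up to four arrays"
        self.arrays = arrays
        self.volumes = {}  # volume name -> array currently holding it

    def place_volume(self, name, size_tb):
        # place new volumes on the least-utilised member of the group
        target = min(self.arrays, key=Array.utilisation)
        target.used_tb += size_tb
        self.volumes[name] = target
        return target.name

# An older array grouped with a newer, bigger one - no cluster fabric needed:
group = Group([Array("cs210-old", 8, used_tb=6),
               Array("cs460-new", 36, used_tb=9)])
print(group.place_volume("sql-data", 2))  # -> cs460-new (least utilised)
```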


Nimble OS 2.0 is currently at the Release Candidate stage and is available for customers to upgrade to via Support. The code and technology are fully production-ready and have been through extensive beta and QA testing. The upgrade process is the same as for previous updates: a software package is downloaded from Nimble Support and applied to the array controllers one at a time (so no downtime!). If you'd like to be a part of the RC rollout, please contact Nimble Support for more information.


Alongside Scale-Out we are launching some new tools for VMware and Microsoft Windows platforms to simplify the overall integration of these solutions; stay tuned for my blog posts on these features!