Scaling up or scaling out - AWS choices

Amazon Web Services offers servers ranging from tiny (t2.nano) to massive (m3.2xlarge), covering whatever power your application may require. Upgrading your server along this scale is usually called scaling up, whereas adding servers to a cluster for your application to sit on is referred to as scaling out.

What are the differences between scaling up and scaling out?

Scaling up or vertical scaling

This is where you upgrade the CPU, or add processors and memory, in a single server. The general idea is that more power equals better performance.

  • Pros
    • Lower licensing costs
    • Easier to configure
    • Quicker upgrades on virtual hardware with less downtime

  • Cons
    • Price, as larger servers cost more
    • Greater exposure to failure, with all your eggs in one basket
    • Limit to how big you can grow

Scaling out or horizontal scaling

Having a cluster of servers working together means you can add (or remove) servers as the need changes.

  • Pros
    • Flexible
    • Redundancy and more resiliency to failures
    • Easy to upgrade and add extra or remove surplus capacity

  • Cons
    • Higher licence fees (under per-server licensing models)
    • More complicated to manage
    • Data consistency may need consideration

Embracing cloud services and the phoenix approach can offer further savings to your business by scaling dynamically throughout the week or even day based on the demands on your servers.
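As a rough illustration of scaling dynamically with demand, here is a minimal sketch of how a desired server count might be derived from load. The function name, the load figures and the per-server capacity are hypothetical, chosen only for illustration; this is not an AWS API, and real auto scaling policies are configured in AWS itself.

```python
import math


def desired_servers(current_load, capacity_per_server, min_servers=1, max_servers=10):
    """Return how many servers are needed for the load, within fixed bounds.

    All parameters are illustrative assumptions:
    current_load        - e.g. requests per second right now
    capacity_per_server - load one server can comfortably handle
    min_servers         - floor kept for redundancy
    max_servers         - ceiling kept for cost control
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Round up so a partially loaded server still counts as a whole server.
    needed = math.ceil(current_load / capacity_per_server)
    # Clamp the result between the floor and the ceiling.
    return max(min_servers, min(max_servers, needed))


if __name__ == "__main__":
    # Hypothetical load samples across a day (requests per second).
    for load in (40, 450, 1200, 90):
        print(load, desired_servers(load, capacity_per_server=100, min_servers=2))
```

The floor keeps some redundancy in place during quiet periods, and the ceiling caps the bill during spikes.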

Either way can work for your application as it grows. You are best placed to see what works best, depending on the stage your application is in, but be aware that scaling up has inherent limitations that will bite you sooner or later.


Heartbeat networks on AWS

What happens when you want a private non-routable network on AWS, such as you would use for a cluster heartbeat or, in this case, a Vertica spread network?

If, like me, you use DHCP when attaching an ENI to your EC2 instance, you will find that when your Linux distribution (CentOS or Red Hat) executes ifup-eth1, the default DHCP options mean the default route will be set to this network.

The second network card in this configuration is on its own private non-routable network, and with AWS there is no way to remove the default gateway for this network. The best I could find was an article stating you could use an ACL (access control list) to restrict the network to that subnet. This is of little use if your default gateway has been set to a router that blocks all traffic.

There is an answer, however. The DHCP client looks for config files and, if they are not found, uses defaults.

The fix for my private non-routable network was to create a file that looks like this:

request subnet-mask, broadcast-address, time-offset,

This little file tells the client to request only the basic information from the DHCP server. If we don't ask for the router, then we aren't going to get it.
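To make the idea concrete, here is a minimal sketch that builds such a request statement and refuses to include the router option. The helper name and the option list are illustrative assumptions, and the sketch only produces the text; where the file lives is not shown here. Note that a complete dhclient.conf statement ends with a semicolon.

```python
# Options to request from the DHCP server. "routers" is deliberately absent,
# so the client never learns a default gateway for this interface.
BASIC_OPTIONS = ["subnet-mask", "broadcast-address", "time-offset"]


def build_request_statement(options):
    """Return a dhclient.conf 'request' statement for the given options.

    Hypothetical helper for illustration only.
    """
    if "routers" in options:
        raise ValueError("requesting 'routers' would bring the default gateway back")
    # dhclient.conf statements are comma-separated and end with a semicolon.
    return "request " + ", ".join(options) + ";\n"


if __name__ == "__main__":
    print(build_request_statement(BASIC_OPTIONS), end="")
```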



Cost of Technical Debt

Technical debt is created when the 'proper way' of doing things is not followed. To put it another way: get it working and we'll fix it up later.

Does that 'later' ever come along? I know that in the race to deliver value these small items can be overlooked, especially if the choice is between 'fix debt' and 'ship new feature'.

Unfortunately, these small debts can add up until the balance tips and technical debt starts dragging your team down.

This can look like:

  1. more support calls coming into the helpdesk
  2. more support work impacting the call-out rota
  3. longer delivery times when adding new features, due to workarounds for that bug that will be fixed in the next version
  4. bad publicity due to slow bug fixing
  5. staff turnover due to quality team members losing faith in the team

The first step in managing this technical debt is to make it visible. Add it to the issues list and backlog, and help the product owner understand how much pain it is creating during the development process.

Once it is visible, it can then be prioritised or perhaps the decision can be made to live with it. At least it is a conscious choice.

This series on the topic by the sharp folks over at 18F draws a nice, clear picture of what it is and why you need to do something about it.


Vertica and Puppet

For the last couple of weeks I have been working on a Puppet manifest that builds Vertica nodes on AWS and makes them more resilient.

Using automation tools has allowed the fast prototyping and development of many software solutions. Using Puppet and Jenkins, a full environment is provided for the Data Analysts to use within about an hour.

Part of what I've been working on is building Vertica nodes whilst retaining the data. In addition, I am building in the logic that lets a node fail, recover and rejoin the cluster with next to no manual work.

Monitoring and detection are useful in this phase, so the database admins can double check the health of the data during and after the failure. Part of the solution includes automatically configuring the Vertica Management Console ready for use.

More progress is expected in the next month as I build auto scaling on top of this resiliency.


What does a DevOps engineer do?

Within a project team you require a number of skills usually found across a number of team members.

Developer - This is the programming and design skill required for the end product this project is sponsored to deliver. For a web application, this might be JavaScript, PHP, Paintshop etc. For a data analysis BI team, these skills will be more like R, Tableau or QlikView.

Testing - These skills are about ensuring the best experience for the end user or customer of the deliverables. Performance (the speed of the product), usability (if it actually works as designed) and load testing (how many users or how much data can be provided) are important measurements in this area.

Operations - Providing the IT bits that the project and application rely on to function and serve. These are concerned with the servers, networks and other dependencies.

As for what DevOps engineers do, that spans the developer and operations skills. The reason it is so specialist is the blend of these skills: someone who understands both programming and infrastructure.

My team and I spend the majority of our time writing and refining Puppet modules. These are definitions of what a server needs to do in order to provide a server, service or piece of infrastructure. They are programs in their own right.

We also spend time on writing tools to help the developers and testers move through the deployment processes. We don't want to slow the development or project down, so providing tools to enable the developers and testers to 'self serve' allows that flexibility.

We automate the project steps and provide tools, buttons and one-click actions to move the project along and help the developers write, test and deploy code as easily as possible.

