Big Data and housing part 1: What is Big Data?

By Jim Vine - on 12/05/2014

In part one of this short introductory series Jim Vine explains what is meant by Big Data and how this relates to the world of housing.

It seems like you cannot go far these days without bumping into someone claiming that they are using Big Data in some way. It has become a buzz phrase that has broken out of a few technical and scientific communities into other expert fields and now into the general consciousness.

HACT’s Housing Big Data project is the housing sector's first ever shared analytics Big Data project, which will bring together data from a number of housing providers to generate insights that no housing provider alone could gain, opening up new possibilities for serving tenants and communities more efficiently and effectively. In its initial wave the project is working with a select group of housing providers to get Housing Big Data off to a great start, and later in the year we will be having discussions about the possibility of opening participation up to a wider group.

As with any project that breaks new ground, we have been learning as we go, meeting novel challenges and creating innovative ways to resolve them. The story of how we created a data protection approach that keeps everyone’s data secure, keeps us all on the right side of the law, and that still gives the project access to the valuable data that will reveal insights is one that needs writing. But do not worry – I am not going to try and squeeze it into this blog post.

Over the coming months I will be writing more about the specifics of Housing Big Data, but as the project moves into its next phase I wanted to take this opportunity to introduce Big Data more generally, to start to give an idea of how it might benefit the housing world. Too often, people have fallen into the trap of talking about Big Data without really defining what they mean.

When most experts refer to Big Data, what they really mean is Hard Data: data that are in some way difficult to extract value from. One of the most common definitions is Gartner's 3 Vs definition, which says that Big Data are data where the Volume, Velocity or Variety is high enough to require new forms of processing to gain value from them.

Volume refers to the sheer quantity of data. Volume starts to be an issue when data get too large to be held and processed in the memory of a single machine. The complexity and difficulty of dealing with large quantities of data can grow to huge proportions when we consider areas like particle physics, with CERN’s Large Hadron Collider generating over a petabyte of collision data per month.
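To give a flavour of what "too large for memory" means in practice, here is a minimal Python sketch of one common workaround on a single machine: reading a file one row at a time and keeping only running totals, rather than loading everything at once. The column names (area, repair_cost) and the sample rows are invented for illustration, and this is not a description of how the Housing Big Data project processes its data.

```python
import csv
import io


def total_by_key(lines, key_field, value_field):
    """Sum value_field per key_field, reading one row at a time.

    Only the running totals are held in memory, not the whole file.
    """
    totals = {}
    for row in csv.DictReader(lines):
        key = row[key_field]
        totals[key] = totals.get(key, 0.0) + float(row[value_field])
    return totals


# Hypothetical sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "area,repair_cost\n"
    "North,120.0\n"
    "South,80.5\n"
    "North,30.0\n"
)
totals = total_by_key(sample, "area", "repair_cost")
```

This kind of streaming only goes so far: once the data or the calculation outgrow one machine entirely, the work has to be spread across several, which is where the new forms of processing come in.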

Velocity describes the rate at which data arrive in your system and the rate at which they need to be processed. It is particularly an issue where a system needs to react in real time.

Variety relates to the challenges that occur when the data in question are in varying formats, stored with different schemas, or come from different systems.

So when we talk about Big Data we are not necessarily referring to vast quantities of data. As anyone who has worked with housing IT systems will recognise, it is the last of these where Housing Big Data will find many of its challenges. Not only is there a range of systems in use within every housing provider (housing management systems, asset management, CRMs, specialised systems for various areas of the organisation…), there are also a number of suppliers of systems (too numerous to mention) without common data formats (and even incompatible systems from the same supplier in some cases). And even when two housing providers use the same system, it has often been so heavily customised that some of the fields are used in slightly (or completely) different ways.
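As a rough illustration of the Variety problem, the Python sketch below maps records from two imaginary systems onto one shared set of field names. Every name here (system_a, system_b, TenancyRef, rent_weekly_gbp and so on) is made up for the example; real housing systems will have their own fields, and the mapping the project actually uses is not described in this post.

```python
# Hypothetical field mappings: source field name -> common field name.
FIELD_MAPS = {
    "system_a": {
        "TenancyRef": "tenancy_id",
        "RentDue": "weekly_rent",
        "PCode": "postcode",
    },
    "system_b": {
        "tenancy_number": "tenancy_id",
        "rent_weekly_gbp": "weekly_rent",
        "post_code": "postcode",
    },
}


def to_common_schema(record, source):
    """Rename a record's fields to the common names and tidy the values."""
    mapping = FIELD_MAPS[source]
    common = {}
    for source_name, common_name in mapping.items():
        if source_name in record:
            common[common_name] = record[source_name]
    # Store rent as a number, whether it arrived as text or a number.
    if "weekly_rent" in common:
        common["weekly_rent"] = float(common["weekly_rent"])
    # Upper-case the postcode and collapse any repeated spaces.
    if "postcode" in common:
        common["postcode"] = " ".join(str(common["postcode"]).upper().split())
    return common


record_a = {"TenancyRef": "T001", "RentDue": "95.50", "PCode": "se1  9aa"}
record_b = {"tenancy_number": "T002", "rent_weekly_gbp": 102, "post_code": "N1 7GU"}

combined = [
    to_common_schema(record_a, "system_a"),
    to_common_schema(record_b, "system_b"),
]
```

Renaming fields is the easy part. The harder cases are the ones mentioned above, where the same field is used to mean different things in different organisations, and those need human judgement before any mapping can be written down.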

Of course, in building its capacity in Big Data to cope with the variety of today, the housing sector will also be opening doors to deal with greater volumes of data at higher velocity in the future.

In my next blog post I will talk a little more about the techniques available with Big Data, and the approaches that the Housing Big Data project is taking to deliver powerful new insights to the housing sector. In the meantime, do please feel free to get in touch if you would like to be considered for inclusion when we open the project up to wider participation: email

HACT’s Housing Big Data project has been generously supported by the Nominet Trust, the UK’s only dedicated Tech for Good funder that invests in the use of technology to transform the way we address social challenges.

