
Robin van het Hof

SAP NetWeaver Magician, Qualiture


Robin is a highly experienced (10+ years) and certified SAP NetWeaver and SAP Enterprise Portal lead architect / SAP HANA Cloud Platform / SAPUI5 developer with a strong focus on integration and business process streamlining, as well as over 15 years' experience in Java/J2EE.

Robin has extensive and in-depth knowledge of HANA Cloud Platform and Cloud Portal, as well as UI development using SAPUI5 and OpenUI5.

In 2014 he was nominated and selected as a new SAP Mentor. In this role he can not only help his clients, users, fellow developers and SAP Community members, but also help streamline SAP’s vision and roadmap to proactively improve or influence their products and services.

And last but certainly not least, he has been an accomplished and award-winning Web Designer / Developer and User Experience / Interaction designer for more than 15 years.

(NB: This blog first appeared on SCN)

At this year’s TechEd in Las Vegas, SAP introduced the highly anticipated SAP HANA Express Edition.

For the very first time, developers can now run SAP HANA on their own laptops using as little (*) as 16 GB RAM. It is optimised to run on laptops, desktops and mini-PCs, and is free to use up to a memory consumption of 32 GB. You can install it as a native binary on Linux, or as a pre-configured, ready-to-use VM image.

(*) 16 GB RAM is still not standard on every laptop, but RAM is getting cheaper, 32 GB laptops are becoming more widespread, and really, 16 GB to run SAP HANA on is truly an achievement!

I have successfully run the VM on a 2012 MacBook Pro with 16 GB. And much to my surprise, if you tweak both the laptop’s as well as the VM’s resources well, it runs pretty smoothly!

If I only did SAP HANA development, this 16 GB configuration would be perfectly adequate.
However, I also code Java and JavaScript/SAPUI5 for a living, and those development tools eat up a significant amount of resources too. And since my MacBook already has the maximum RAM available, I was looking for a separate solution to run SAP HANA Express.

Enter the Intel NUC “Skull Canyon”.

The device is literally as cool as its name implies. Mine has 32 GB RAM, a 250 GB SSD, and the latest Intel i7 processor. It has peripheral connectivity to whatever thingamajiggies you can imagine, and has a nice hack0rz-approved ‘skull’ embossed on its lid. It measures just 8.3″ x 4.6″ x 1.1″ (211mm x 116mm x 28mm).

I decided to install Windows 10 on it, so I could run Eclipse with the HANA Tools on it, and benefit from the ease-of-use of simply loading the VM instead of going through the hassle of installing HANA Express Edition by hand. I assigned 16 GB to the VM and assigned 4 cores, fired up the VM and finalised the recommended steps for security settings and installing the developer license.

To see whether it was all successful, I opened the HANA Cockpit, and was happy to see I had a working SAP HANA environment in the palm of my hand.

(Honestly, I get tears of pure joy in my eyes when I think about it: the sheer power of SAP HANA in a device that fits in your hand.)

After this short, blissful moment, I decided to see how it would handle lots of data and whether it could hold its own against the enterprisey HANA installations.

The first problem was where to get loads of data.

After some web scouring, I found the Bureau of Transportation Statistics website, part of the United States Department of Transportation. It has lots of data available in CSV format for aviation, marine, railway, highway, basically anything related to transport.
The Aviation section holds airline “On-Time Performance” data for every domestic US flight since 1987. At the very least, it would give me the big dataset I wanted.

I downloaded the performance data for the month of August (in October, the latest available data was for August) and it provided a zip with ~450K rows of data. Honestly, I wouldn’t have expected that many air movements in one month, but for my test this wasn’t enough data. So I downloaded the data for all eight available months of 2016, and ended up with a dataset of almost 3.8 million records.

Using Eclipse with SAP HANA Tools, I created a schema, and using the HANA import tools I imported the CSV airline performance data.
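
For those who prefer a SQL console over the import tools in Eclipse, a roughly equivalent sketch is shown below. It assumes a schema named BTS and a table named ONTIMEPERF (the names used in the join query later in this post). The column types, the trimmed-down column list and the file path /tmp/ontime_2016_08.csv are illustrative assumptions; the real BTS export contains many more columns than shown here.

```sql
-- Sketch only: column types and the file path are assumptions.
-- The real BTS file has many more columns than these five.
create schema BTS;

create column table BTS.ONTIMEPERF (
    FL_DATE           date,
    UNIQUE_CARRIER    nvarchar(10),
    ORIGIN_AIRPORT_ID nvarchar(10),
    DEST_AIRPORT_ID   nvarchar(10),
    ARR_DELAY         decimal(10, 2)   -- arrival delay in minutes
);

-- Server-side CSV import: the file must be readable by the HANA server,
-- and its columns must be in the same order as the table's columns.
import from csv file '/tmp/ontime_2016_08.csv'
into BTS.ONTIMEPERF
with record delimited by '\n'
     field delimited by ','
     optionally enclosed by '"'
     skip first 1 row;
```

Repeating the import statement per monthly file would build up the full dataset in the same table.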

The website also provided lookup data such as airlines and airports in CSV format, so I imported those as well. Again, I was surprised to find over 1,600 airlines in them, but since the data goes back to 1987, it also held airlines long gone. Remember Pan Am? It has 4 entries in the lookup table…

Matching the airlines with the performance data revealed this year only 310 distinct airlines performed domestic flights in the US, which seemed much more reasonable.
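
A query along these lines would do that matching. It is a sketch that assumes the table and column names from the join query later in this post (BTS.CARRIERS with a CODE column, BTS.ONTIMEPERF with a UNIQUE_CARRIER column); ACTIVE_AIRLINES is just an illustrative alias.

```sql
-- Count the carriers from the lookup table that actually appear
-- in the imported performance data.
select     count(distinct C.CODE) as ACTIVE_AIRLINES
from       BTS.CARRIERS   C
inner join BTS.ONTIMEPERF P
        on P.UNIQUE_CARRIER = C.CODE;
```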

Intrigued by the number of airlines and daily flights, I wanted to stress my dataset a bit further. The half-a-millisecond response for retrieving the COUNT was nice, but it had little real-world value.

Next, I decided to show the number of air movements per day:
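
A daily count like this boils down to a simple GROUP BY. The sketch below assumes the BTS.ONTIMEPERF table and FL_DATE column used in the join query further down; AIR_MOVEMENTS is an illustrative alias.

```sql
-- One row per flight date with the number of flights on that date.
select   FL_DATE,
         count(*) as AIR_MOVEMENTS
from     BTS.ONTIMEPERF
group by FL_DATE
order by FL_DATE;
```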

Holy cannelloni, on average between 11k-15k air movements per day!

Although the query is still a walk in the park for any database, it took just 17 milliseconds to derive that data from the 3.8 million records. Still impressive, but not enough.

I wanted to show the maximum delay per day, which airline was affected, and show the departure and arrival airports. This query was a bit more complex, using an INNER JOIN SELECT with a MAX aggregation, and a couple of LEFT JOINS for the airline and airport lookup tables:

select     P.FL_DATE, 
           P.ARR_DELAY, 
           C.DESCRIPTION as AIRLINE,
           O.DESCRIPTION as ORIGIN,
           D.DESCRIPTION as DEST
from       BTS.ONTIMEPERF P
inner join (
    select     FL_DATE, 
               max(ARR_DELAY) as ARR_DELAY  
    from       BTS.ONTIMEPERF
    group by   FL_DATE
) PMAX
        on P.FL_DATE = PMAX.FL_DATE
       and P.ARR_DELAY = PMAX.ARR_DELAY
left join BTS.CARRIERS C
        on P.UNIQUE_CARRIER = C.CODE  
left join BTS.AIRPORTS O
        on P.ORIGIN_AIRPORT_ID = O.CODE  
left join BTS.AIRPORTS D
        on P.DEST_AIRPORT_ID = D.CODE  
order by   P.FL_DATE

The result is the following:

Three things are obvious here:

1) On a daily basis, the delay for a flight can be more than 18 hours (ARR_DELAY is the delay in minutes for that flight)

2) If you want to travel on schedule, I’d recommend avoiding American Airlines

3) The query took just 5 milliseconds more than the previous query to complete. You could also say, it took just one twentieth of the duration of the blink of an eye to finish. Please note, I am no DBA as you could tell from the above SQL, I haven’t used any indices, and I haven’t optimised the HANA instance significantly yet.
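
As a side note on the query itself: the self-join on MAX(ARR_DELAY) can also be written with a window function. The sketch below is one such variant, assuming the same BTS tables and columns as the query above; DELAY_RANK and RANKED are names introduced here purely for illustration. It has not been timed on the NUC, so no performance claim is implied.

```sql
-- Rank flights within each day by arrival delay and keep the top rank.
select   FL_DATE, ARR_DELAY, AIRLINE, ORIGIN, DEST
from (
    select    P.FL_DATE,
              P.ARR_DELAY,
              C.DESCRIPTION as AIRLINE,
              O.DESCRIPTION as ORIGIN,
              D.DESCRIPTION as DEST,
              rank() over (partition by P.FL_DATE
                           order by P.ARR_DELAY desc) as DELAY_RANK
    from      BTS.ONTIMEPERF P
    left join BTS.CARRIERS C on P.UNIQUE_CARRIER    = C.CODE
    left join BTS.AIRPORTS O on P.ORIGIN_AIRPORT_ID = O.CODE
    left join BTS.AIRPORTS D on P.DEST_AIRPORT_ID   = D.CODE
    -- Rows without an arrival delay cannot be the daily maximum.
    where     P.ARR_DELAY is not null
) RANKED
where    DELAY_RANK = 1
order by FL_DATE;
```

Like the original, this returns more than one row for a day when several flights tie for the largest delay.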

I think it is safe to state the Intel NUC has true server capabilities and, running SAP HANA Express Edition in a VM with only 16 GB RAM assigned to it, has no visible degradation in performance.

If you want to try it out for yourself, go to http://go.sap.com/developer/topics/sap-hana-express.html and start developing!

Oh, and if you have installed SAP HANA Express Edition on a NUC (or laptop, desktop, etc.), do not hesitate to share your experiences in the comments.


8 thoughts on “Big Data on a Small NUC: SAP HANA Express Edition in 42 cubic inches”

  1. Dear Robin,

    I’m a newbie to SAP HANA and wanted to try this express edition. I only have 8 GB RAM in my old laptop ( 2 * 4GB 1600 DDR3 ) with Intel i5 & win 10.

    Is it a must that I need to upgrade to 16 GB?

    • Hi Senthil,

      You might try running the “Server Only” version, since it may already run on 8GB machines. I haven’t tried it, but it ‘should work’.

      The real fun of course is in the “Server + Apps” version, since this includes XSA. However, it requires 16GB…

  2. Hi Robin, thanks for sharing your story, which has rekindled my motivation to make SAP HXE/HXE2 work on my iMac (mid 2012, 16GB RAM, Intel i5, macOS Sierra) using the VM image.
    You wrote “… if you tweak both the laptop’s as well as the VM’s resources well, it runs pretty smoothly!..”. Can you share more details about what/how you tweaked to make it run smoothly?
    It has been a frustrating journey for me over the last weeks because most of the apps (xs apps) are listed with 0/1 instance (such as webide). Restarting an app (xs restart) takes ages and it is like playing Russian roulette waiting for the outcome: sometimes it works, sometimes it fails.
    I only run chrome with necessary tabs (the How-to tutorial) and Oracle Virtual Box with the VM to minimize RAM usage.
    Thanks for an answer.

    • Hi Sunny,

      I can’t really recall, but I did some analysis from HANA Studio (Configuration and Monitoring > Open Administration) to check memory usage / limits (I only focussed on memory consumption).

      As for the not started instances, these could be related to memory as well — however, you could also try to start them from HANA Studio (Systems view > Administration Console perspective > Configuration tab > daemon.ini)

      • Hi Robin,
        thanks for the quick reply.
        Here is what I have done to tweak:
        – Oracle Virtual Box: increased base memory from the initial, automatically assigned 13312 MB to the max. of 15237 MB. With any single MB more, the OK button is greyed out.
        – MacOs: closed all apps (Chrome, Antivirus, Dropbox, etc.) , i.e. only Oracle Virtual box
        Result: the same 🙁
        Your suggestion to start the instance from HANA Studio would likely make it worse because MacOs would need to allocate huge chunk of RAM to HANA Studio in addition to VM since the VM needs to be running to enable HANA Studio to access its services. Or is there a way only to run HANA Studio without firing up the VM?
        I started HANA Studio while the VM is running and HANA Studio didn’t even want to show up and I had to use Force Quit in application monitor to shut down my Mac.
