Talend and the Internet Of Things

This blog shows how we used Talend to synchronise the central heating settings using google calendar and save on energy costs.
The tutorial describes how to connect the thermostat to the internet and using google calendar build a Talend job to control the thermostat.

Connecting things to the internet is hot! It can be used on large scale, for example putting sensors all over the country and collect the measurements from these sensors in a Big Data environment. But recently I ran into this social project that I wanted to share.

For a local social activity center / indoor training centre I found out that the energy cost was quiet huge and I was asked to improve this. I noticed that the sports hall contained 2 gas heaters, which were regulated by a manual controlled thermostat.


After some analyses, I found that when the sports hall is used, the temperature is turned on to 18 till 22 degrees (depending on the type of sport). Most users turned down the temperature to +/- 15 degrees to prevent humidity after finishing their sports acitivity.
There are 3 imperfections in this method:
– Heating is allways started 15 minutes to late and also end 15 minutes to late
– During winter the heating is still running when outside temperature is less then 15 degrees.
– Users regularly forget to turn down the temperature.

After some research I found that there is no advanced thermostat that is compatible (as it is a 220v thermostat) and able to solve these 3 issues.

If it doesn’t exist, create!

Since the availability of 3D printers and low-energy devices like Raspberry and Arduino together with a lot of low-priced sensors you can practically build anything yourself!

Together with a electronical engineer we created a custom thermostat that could be programmed.

– Raspberry Pi, Display, Relais, Humidity / Temperature sensor, 3d printed housing

Total hardware cost: +/- €100

Using a 3d Printer to create a nice box to package the whole thing.

Tech IMG_20160109_120137

The software

To control the device a Python script was created. The script reads the schedule from a csv file to check if it should switch to a specific temperature. The csv file can be edited with an simple webinterface. Furthermore it checks that the humidity level, so it doesn’t pass a maximum level to prevent the floor getting too wet (potentially causing dangerous situations).
The heater will only switch on when either of these conditions are met:
– There is an activity and the temperature has not yet reaached the required level
– There is no activity but the humidity is higher than the maximum level.

The unit is now stand-alone working unit but a bit hard-manageable. Now this is where Talend comes in place. Wouldn’t it be great to have, for example, a Google Calendar to manage the schedule?

1. Create a public Google Calendar.
And add some tasks to it. Also add some people who would be able to program the Calendar.


Find the calender ID, so you can download the corresponding ICS file from Google. The URL would look like this:

2. Create a Talend job

Now we need to create a Talend job to retrieve the Google Calendar activities and convert it to the CSV format.

First we need to fetch the ICS file from Google using the public URL using a file fetch.


Now we can create an interpreter to get the information we need for the calendar:
– Startdate (time)
– Enddate (time)
– Description
– Recurrency

The extraction of these fields is a bit tricky as the ICS file is not very well structured.


For the extraction we used the following components.

For the Row Seperator use “BEGIN:VEVENT” and skip the first row as this is the calendar metadata. Make sure there is a field separator that will not appear in the file.


Now we have created one line per event.

Next we extract the delimited fields using as field seperator the carriage return / Linefeed (“\r\n”).


Now we have our data in fields. Create a random schema as we do not know which value is in which field at this point.

Add a unique ID per row as we will need this later on.


Not every record in the ICS file has the same structure, for example only recurrent entries contain the “RCURR” value. Using the tUnpivotRow we create key-value pairs for all values.


No we add a replicate for each value of the ICS file that we want to extract and add these 3 components for every field:
1. tFilterRow; to filter the value for the appointment (For Example FilterActiviteitRRULE.pivot_value.startsWith(“RRULE”))
2. tJavaRow; to get the corresponding value (output_row.ID=row8.newColumn.substring(row8.newColumn.indexOf(“:”)+1);
3. tHashOutput; to write the value to a hash

In the next subjob we will combine the hashes and create outputs for single and repetitive activities.


Use the “RRULE==null” To determine if the acitivity has a single occurance or is a repeating activity.
To prevent errors in output, we added some validations/corrections in the tMaps


The result of the job should look like this:

Pin it

Stefan is an experienced data management consultant with many years of experience in business intelligence and data integration. Recently his focus shifted towards Big Data and the Internet of Things.

Website: http://www.xsed.nl

Geef een reactie

Het e-mailadres wordt niet gepubliceerd. Verplichte velden zijn gemarkeerd met *