Semantic Pill 21




We are living in the era of 'Big Data.' Spatiotemporal data, whether captured through remote sensors (e.g., remote sensing imagery, Atmospheric Radiation Measurement (ARM) data) or large-scale simulations (e.g., climate data), has always been 'Big.' However, recent advances in instrumentation and computation are making spatiotemporal data even bigger, putting several constraints on data analytics capabilities. In addition, large-scale (spatiotemporal) data generated by social media outlets is proving to be highly useful in disaster mapping and national security applications. Spatial computation needs to be transformed to meet the challenges posed by big spatiotemporal data.


  • The big ones: the ESG (Earth System Grid) could be considered one of the world's major Big Data scientific portals;


  • Sampling, and representative sampling in particular, is pure statistics and a timeless, universal problem: how to sample a universe wisely to get the information we need. However, Big Data universes may present new challenges, especially with unstructured data that is relatively unknown, noisy, erratic and often unpredictable, as we find in social data: “The Pitfalls of using online and social data in Big Data analysis”:


In her draft paper, Big Data: Pitfalls, Methods and Concepts for an Emergent Field, UNC professor and Princeton CITP fellow Zeynep Tufekci (@zeynep) compares the methodological challenges of developing socially-based big data insights using Twitter to biological testing on Drosophila flies, better known as fruit flies. Drosophila flies are usually chosen because they’re relatively easy to use in lab settings, easy to breed, have rapid and “stereotypical” life cycles, and the adults are pretty small. The problem? They’re not necessarily representative of non-lab (read: real-life) scenarios. Tufekci posits that the dominance of Twitter as the “model organism” for social media in big data analyses similarly skews analysis.


Sampling was, is and will be fundamental. Now, with the “Big Data” movement, we have to be more careful about this problem than “before” (that is, a year ago!). The figure below depicts four ways of “zonal sampling”, each one coherent, but likely to give four different outcomes;
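To make the point about zonal sampling concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the synthetic field, the four zoning schemes ("rows", "columns", "quadrants", "rings") and the helper names are hypothetical and are not taken from the figure. It draws one random cell per zone and combines the draws weighted by zone size, which is one simple way to sample by zone.

```python
import random

N = 20  # side length of the synthetic grid (assumed)

def make_field(n):
    """Synthetic n x n field whose values rise towards one corner (made-up data)."""
    return [[(r * c) / (n - 1) ** 2 for c in range(n)] for r in range(n)]

def zonal_estimate(field, zone_of, rng):
    """Pick one random cell per zone and return the size-weighted mean of the picks."""
    zones = {}
    for r, row in enumerate(field):
        for c, value in enumerate(row):
            zones.setdefault(zone_of(r, c), []).append(value)
    total_cells = sum(len(cells) for cells in zones.values())
    return sum(len(cells) * rng.choice(cells) for cells in zones.values()) / total_cells

# Four hypothetical ways of cutting the same area into zones.
schemes = {
    "rows": lambda r, c: r // 5,
    "columns": lambda r, c: c // 5,
    "quadrants": lambda r, c: (r // 10, c // 10),
    "rings": lambda r, c: min(r, c, N - 1 - r, N - 1 - c) // 3,
}

field = make_field(N)
true_mean = sum(map(sum, field)) / N ** 2
estimates = {
    name: zonal_estimate(field, zone_of, random.Random(21))
    for name, zone_of in schemes.items()
}

print(f"true mean: {true_mean:.3f}")
for name, value in estimates.items():
    print(f"{name:>9}: {value:.3f}")
```

Each scheme is a defensible way of partitioning the area, yet with so few samples the estimates will generally disagree with one another and with the true mean, which is the caution the paragraph above raises.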


  • Roadway traffic control is a long-standing Big Data application, and integrated solutions are now proliferating, mainly to avoid congestion and, where possible, to keep drivers informed: see T-Systems, Big Data in Traffic and Big Data in the Automotive Industry:


As you know, cars can’t speak. If they could, they would be able to provide a wealth of information that would be invaluable to drivers, repair shops and automakers alike. To gain access to this data – and help the car talk – more and more vehicles are being fitted with sensors and connectivity solutions. According to a study by management consultants Oliver Wyman, 80 percent of all autos sold in 2016 will be connected. That would equate to approximately 210 million talking cars cruising round our streets. Compared to 45 million autos in 2011, that is a projected annual growth rate of over 36 percent. Connected cars could provide a steady stream of data on vehicle movements, condition, wear and tear of parts, and ambient conditions. Extracting meaning from this mass of mixed data is no easy task. The challenge is transmitting the information, analyzing it and redistributing it to the relevant recipients – all at high speed. It is a challenge that T-Systems can master.
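The growth figure quoted above can be checked with a short calculation. The sketch below assumes the quote means a compound annual growth rate over the five years from 2011 to 2016; the function name is ours, not from the study.

```python
def compound_annual_growth_rate(start, end, years):
    """Constant yearly growth rate that turns `start` into `end` after `years` years."""
    return (end / start) ** (1 / years) - 1

# Figures from the quoted passage: 45 million cars in 2011, about 210 million in 2016.
rate = compound_annual_growth_rate(45e6, 210e6, 2016 - 2011)
print(f"{rate:.1%}")
```

Under that assumption the result is a little above 36 percent per year, in line with the rate given in the quote.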


  • SaaS stands for Software as a Service. It has meaning both as a standalone software delivery model and as part of the Cloud Computing trilogy [SaaS, IaaS, PaaS];


 Source: Habitat Maps for the EU (MESH)