Friday, July 19, 2013

E-Infrastructure is a Global Trend

The Skolkovo Institute of Science and Technology (Skoltech) in Moscow, Russian Federation, and the National Association of Research and Educational e-Infrastructures (e-ARENA) have announced the first steps in their partnership to improve national and international e-infrastructure.

On June 26, Skoltech President Edward Crawley and e-ARENA Director-General Marat Biktimirov signed a Letter of Intent to collaborate in establishing permanent high-bandwidth networking between Skoltech and its national and international partners and collaborators.
Skoltech has recently launched its Center for Stem Cell Research, which includes a close long-term partnership with academic partners in the Netherlands, Russia and the USA. Researchers at the Center will rely on state-of-the-art e-infrastructure to advance research programs in the application of new genomics technologies towards the realization of personalized medicine. Skoltech and its international collaborators will need access to an exponentially growing amount of genomic data. This data must then be stored, transferred and analyzed, demanding a high level of network speed between Skoltech and the rest of the world. The Institute also expects to launch at least 14 more CREIs, each of which will include international and national collaborations targeting complex and data-rich scientific challenges.
Besides these research projects, Skoltech will develop opportunities for web-based classes and data-intensive collaborative experimentation and modeling, similar to such educational initiatives as MITx and edX. All of these educational and research initiatives will also benefit from improved network connectivity.

Skoltech has already launched a partnership with SURFnet to begin meeting the high-bandwidth needs of this first Center for Research, Education and Innovation (CREI). The new partnership with e-ARENA will extend networking options by leveraging the Russian research and education networks RASNet, RUNNet and RBNet, as well as international connections to the pan-European GEANT network and the advanced science network GLORIAD.
Skoltech Acting CIO, Professor Gabrielle Allen, said of the new cooperation, “Robust, world-class cyberinfrastructure is absolutely essential for modern data-intensive science which today takes place in a global setting. As a new institute, we are delighted to be collaborating with e-ARENA and leveraging their long and deep experience in networking to provide Skoltech researchers and educators with necessary e-infrastructure.”

The Network of Performance Facilities proposed in the previous post, Lessons Learned - Part 3, is one such example of a modern data-intensive scientific and research application. The Skolkovo Institute of Science and Technology, through one of its Centers for Research, Education and Innovation (CREIs), could become a potential international partner in the Consortium.

Wednesday, July 10, 2013

Old "New" Challenge for performance buildings, or Lessons Learned - Part 3

Monitoring data in a performance building proves to be a challenge for a reason that may seem unexpected: there is no clear understanding of what data to collect, at which points, how often to take the measurements and for how long to store the data.
One approach is the "bulldozer" approach: take as much data as possible, at as many locations as possible and as frequently as possible. At first glance this seems to be a bulletproof method, since you will never miss anything. In fact, the opposite is true. The amount of data collected quickly becomes overwhelming and unmanageable, and apart from the difficulty of retrieving the necessary piece of information, it presents another unexpected challenge. The value of storing the data lies in the ability to keep track of historical records, because only a relatively long period of time can be representative of the actual performance of any complex system, performance buildings included. Now, an attempt to estimate what it would take to create data storage for one such building using the "bulldozer" approach described above hits an obstacle: the data warehouse for keeping track of ALL data would cost over $400,000! No wonder it is decided, or rather happens automatically, that the vast amount of accumulated data is discarded after a relatively short period of time to make room for the new batch of data. But what about the analysis?
What if we need the data for more than one month? Typically we want to monitor performance for at least a year. How can we even be sure that what we need is there, when what is collected spills over?
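To get a feel for how quickly the "bulldozer" approach grows compared with a more selective one, here is a minimal back-of-the-envelope sketch in Python. Every number in it (point counts, sampling intervals, bytes per record) is a hypothetical assumption chosen for illustration, not a measurement from any real building, and it does not attempt to reproduce the cost figure mentioned above.

```python
# Rough storage estimate for building monitoring data.
# All inputs are illustrative assumptions, not figures from a real building.

SECONDS_PER_DAY = 24 * 60 * 60


def records_per_day(points: int, interval_s: int) -> int:
    """Readings collected per day for `points` sensors sampled every `interval_s` seconds."""
    return points * (SECONDS_PER_DAY // interval_s)


def storage_bytes(points: int, interval_s: int, days: int, bytes_per_record: int) -> int:
    """Raw storage needed to keep `days` of history (no compression, no indexes)."""
    return records_per_day(points, interval_s) * days * bytes_per_record


if __name__ == "__main__":
    # Hypothetical "bulldozer" setup: 10,000 points sampled every second.
    bulldozer = storage_bytes(points=10_000, interval_s=1, days=365, bytes_per_record=32)
    # Hypothetical selective setup: 500 points sampled every 15 minutes.
    selective = storage_bytes(points=500, interval_s=900, days=365, bytes_per_record=32)
    print(f"bulldozer: {bulldozer / 1e12:.2f} TB per year")
    print(f"selective: {selective / 1e9:.2f} GB per year")
```

The point of the comparison is only the order of magnitude: deciding what to measure and how often changes the storage needed for a full year of history far more than any choice of hardware does.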
Following up on, and consistent with, what I have discussed previously (see e.g. Lessons Learned - Part 2), there is a need for an agreed-upon hierarchy of data sets and a common format for the data being collected, stored and retrieved. With time it may, and probably will, evolve into an industry standard. But the work needs to be started, or we are going to face a hurdle not unlike, or even worse than, the Tower of Babel - not only being unable to speak one language, but not even understanding ourselves...
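As one possible illustration of what a shared hierarchy and common format could look like, here is a small Python sketch. The field names, the three hierarchy levels (building, system, point) and the JSON encoding are all my own hypothetical choices for the example, not an existing or proposed standard.

```python
# Sketch of a shared record format for monitoring data.
# Field names and hierarchy levels are hypothetical, not an existing standard.
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Reading:
    building: str    # top level of the hierarchy
    system: str      # e.g. "hvac" or "lighting"
    point: str       # individual measurement point
    timestamp: str   # ISO 8601 string, UTC
    value: float
    unit: str

    def path(self) -> str:
        """Hierarchical identifier, from building down to point."""
        return f"{self.building}/{self.system}/{self.point}"


def to_json(reading: Reading) -> str:
    """Serialize a reading so every tool stores and retrieves it the same way."""
    return json.dumps(asdict(reading), sort_keys=True)


def from_json(text: str) -> Reading:
    """Rebuild a reading from its serialized form."""
    return Reading(**json.loads(text))


if __name__ == "__main__":
    r = Reading("building-a", "hvac", "supply-air-temp",
                "2013-07-10T12:00:00Z", 18.5, "degC")
    print(r.path())
    print(to_json(r))
```

Whatever the actual fields turn out to be, the useful property is that a reading written by one party can be read back unchanged by another.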

In order to be able to communicate we need to speak one language.