13-17 May 2019
Daejeon, Republic of Korea
Europe/Vienna timezone

Design for the Distributed Data Locator Service for Multi-site Data Repositories

15 May 2019, 09:40
20m
Daejeon, Republic of Korea

Board: O/5-3
Oral (Plenary Session)
Database Techniques for Information Storage and Retrieval
Plenary Oral

Speaker

Hideya Nakanishi (National Institute for Fusion Science)

Description

In modern fusion experiments, remote data access is already widely used in both domestic and international research collaborations. The SNET mutual data exchange platform in Japan interconnects four fusion experimental sites, LHD, QUEST, GAMMA10, and TST-2, over distances of more than 1 000 km and lets remote collaborators access each site’s data seamlessly, as if they were on site. ITER also plans to replicate its full dataset to the REC, more than 10 000 km away, which will be equipped with massive computing resources to analyze the ITER physics data off-site. The analysis results may in turn be shared with international collaborators at other sites. In such multi-site data repository environments, a data locator service is essential for finding the nearest repository, from which users can retrieve the data most efficiently. Because latency exceeds 100 milliseconds for intercontinental network transactions, it is preferable to distribute not only the data repositories but also the locator servers across multiple sites worldwide. Since the data locations will be served by a relational database (RDB) such as PostgreSQL, real-time data synchronization between the locator RDBs will be necessary to provide a consistent service worldwide.
PostgreSQL 9.0 and later provide a mechanism for replicating a database from a single master to multiple slaves, called “streaming replication.” Since PostgreSQL 9.4, multi-master bi-directional replication (BDR) has also been available. To register the analysis results produced at each site, BDR of the data indexes will be necessary for worldwide distributed data indexing services.
From the viewpoint of data-consuming clients, the most important task is to find the data locator and repository site with which they can communicate fastest over the network. There are several ways to find a nearby site on the network. Typically, the DNS top-level domain indicates the country or region, but there are exceptions such as “iter.org.” GeoIP is another network service that reports where a site's IP address is located geographically; however, its precision is reported to be not necessarily high. It must also be noted that geographical distance does not necessarily correspond to network proximity.
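The limitation of the top-level-domain heuristic can be illustrated with a minimal Python sketch. The function name `country_hint` and the host name `repo.example.jp` are hypothetical and not taken from the study; the sketch only relies on the fact that two-letter top-level domains are country codes. Even when a hint is returned, it says nothing about where the server is actually hosted.

```python
from typing import Optional


def country_hint(hostname: str) -> Optional[str]:
    """Return the country-code TLD of a host name, or None if it has none.

    Hypothetical helper: only two-letter ASCII top-level domains are
    treated as country codes; generic ones such as "org" give no hint.
    """
    labels = hostname.rstrip(".").lower().split(".")
    tld = labels[-1]
    if len(tld) == 2 and tld.isascii() and tld.isalpha():
        return tld
    return None


if __name__ == "__main__":
    # "repo.example.jp" is a made-up host; "iter.org" is the exception
    # named in the abstract, for which the heuristic yields nothing.
    for host in ("repo.example.jp", "iter.org"):
        print(host, "->", country_hint(host))
```

For a host under a generic domain the function returns no hint at all, which is why a measurement-based approach is needed.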
Therefore, this study proposes using the ICMP echo reply (ping) or the TCP SYN+ACK response (tcpping) to measure the network distance to each data replication site and then determine which site the client can reach with the minimum latency. The latency measurement should be made once before requesting the data, and the values can be cached for some time on each client. The combined use of BDR between locator RDBs and site pre-selection by latency measurement has been tentatively implemented and verified. The test showed that the client can automatically choose the nearest data site and that a newly inserted record was replicated to the other locators within a definite delay. We conclude that this method is effective for multi-site data environments such as SNET in Japan and ITER with the REC.
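The site pre-selection described above might be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the class `NearestSiteSelector`, the function `tcp_connect_latency`, the 300-second cache lifetime, and the host names in the usage example are all hypothetical. The time taken by a TCP connect is used as an approximation of the SYN to SYN+ACK round trip, since the call returns once the handshake reply has arrived.

```python
import socket
import time
from typing import Callable, Iterable, Optional, Tuple


def tcp_connect_latency(host: str, port: int = 5432,
                        timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in seconds, or None if unreachable.

    Port 5432 is the PostgreSQL default; a real deployment may differ.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


class NearestSiteSelector:
    """Pick the lowest-latency site and remember the choice for `ttl` seconds."""

    def __init__(self, sites: Iterable[str],
                 measure: Callable[[str], Optional[float]] = tcp_connect_latency,
                 ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.sites = list(sites)
        self.measure = measure
        self.ttl = ttl  # assumed cache lifetime, not a value from the study
        self.clock = clock
        self._cached: Optional[Tuple[str, float]] = None  # (site, expiry time)

    def nearest(self) -> Optional[str]:
        now = self.clock()
        if self._cached is not None and now < self._cached[1]:
            return self._cached[0]  # reuse the stored result
        latencies = {site: self.measure(site) for site in self.sites}
        reachable = {s: t for s, t in latencies.items() if t is not None}
        if not reachable:
            return None  # no site answered; nothing is cached
        best = min(reachable, key=reachable.get)
        self._cached = (best, now + self.ttl)
        return best


if __name__ == "__main__":
    # Hypothetical locator hosts, for illustration only.
    selector = NearestSiteSelector(["locator-a.example", "locator-b.example"])
    print(selector.nearest())
```

Passing the measurement function and the clock as parameters keeps the selection logic testable without network access, and an ICMP-based measurement could be substituted for the TCP one in the same way.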

Primary author

Hideya Nakanishi (National Institute for Fusion Science)

Co-authors

Dr Kenjiro Yamanaka (National Institute for Informatics)
Shinsuke Tokunaga (National Institutes for Quantum and Radiological Science and Technology (QST))
Takahisa Ozeki (National Institutes for Quantum and Radiological Science and Technology (QST))
Yasutomo Ishii (National Institutes for Quantum and Radiological Science and Technology (QST))
Hideo Ohtsu (National Institutes for Quantum and Radiological Science and Technology (QST))
Yoshihiko Sugie (National Institutes for Quantum and Radiological Science and Technology (QST))
Noriyoshi Nakajima (National Institute for Fusion Science)
Masahiko Emoto (National Institute for Fusion Science)
Dr Takashi Yamamoto (National Institute for Fusion Science)
Mr Masaki Ohsuna (National Institute for Fusion Science)
Mr Tatsuki Ito (National Institute for Fusion Science)
Mr Setsuo Imazu (National Institute for Fusion Science)
Ms Miki Nonomura (National Institute for Fusion Science)
Mr Mitsuhiro Yokota (National Institute for Fusion Science)
Mr Hideki Ogawa (National Institute for Fusion Science)
Mr Hiroya Maeno (National Institute for Fusion Science)
Ms Miwa Aoyagi (National Institute for Fusion Science)
Mr Masanobu Yoshida (National Institute for Fusion Science)
Mr Tomoyuki Inoue (National Institute for Fusion Science)
Mr Osamu Nakamura (National Institute for Fusion Science)
Dr Shunji Abe (National Institute for Informatics)
Prof. Shigeo Urushidani (National Institute for Informatics)
