The Third International Conference on Networking and Computing (ICNC), held on December 5-7, 2012, in Okinawa, Japan, aimed to provide a timely forum for the exchange and discussion of the latest research findings in all aspects of networking and computing, including parallel and distributed systems, architectures, and applications.
In addition, four workshops were held in conjunction with ICNC: the 4th International Workshop on Parallel and Distributed Algorithms and Applications (PDAA), the 3rd International Workshop on Advances in Networking and Computing (WANC), the 2nd International Workshop on Challenges on Massively Parallel Processors (CMPP), and the 2nd International Workshop on Networking, Computing, Systems, and Software (NCSS).
The program committee encouraged the authors of selected papers, including those from the workshops, to submit full versions of their manuscripts to the International Journal of Networking and Computing (IJNC) after the conference. After a thorough review process with extensive discussions, six articles on various topics were selected for publication in this IJNC special issue on ICNC.
On behalf of ICNC, we would like to express our appreciation for the great efforts of the reviewers who reviewed the papers submitted to this special issue. Likewise, we thank all the authors for submitting their excellent manuscripts. We also express our sincere thanks to the editorial board of the International Journal of Networking and Computing, in particular to the Editor-in-Chief, Professor Koji Nakano. This special issue would not have been possible without his support.
Consider the problem of routing from a single source node to multiple target nodes along disjoint paths, with the additional condition that these paths be shortest. This problem is harder than standard one-to-many routing in that such paths do not always exist. Various necessary and sufficient conditions have been found that determine when such paths exist for some interconnection networks, and when these conditions hold, the problem of finding such paths can be reduced to that of finding a disjoint ordering of sets. Beyond its application to finding disjoint shortest paths in interconnection networks, the problem of finding a disjoint ordering of sets is an interesting combinatorial problem in its own right. We study the problem of finding a disjoint ordering of sets A1, A2, ..., Am, where Ai ⊆ A = {a1, a2, ..., an} and m ≤ n. We present an O(n³) algorithm for doing so, under certain conditions, improving on the previously known O(n⁴) algorithm and, consequently, on the corresponding one-to-many routing algorithms for finding disjoint shortest paths.
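The reduction above can be made concrete with a small brute-force sketch. The precise definition of a disjoint ordering is the one in the paper; here we assume the hypercube-routing interpretation, in which the prefix sets of an ordering correspond to the intermediate nodes of a path, so orderings are disjoint when no two of them share a prefix set. All function names and the exponential search are illustrative only; the paper's algorithm achieves O(n³) under its stated conditions.

```python
from itertools import permutations, product

def prefix_sets(ordering):
    """All prefixes of an ordering, viewed as sets (one per length >= 1)."""
    return {frozenset(ordering[:i]) for i in range(1, len(ordering) + 1)}

def is_disjoint_ordering(orderings):
    """Assumed definition: no two orderings may share a prefix set.
    In the hypercube view, a shared prefix set is a shared path node."""
    seen = set()
    for ordering in orderings:
        for p in prefix_sets(ordering):
            if p in seen:
                return False
            seen.add(p)
    return True

def find_disjoint_ordering(sets):
    """Exponential brute force over all orderings of each set;
    illustrative only, not the paper's polynomial algorithm."""
    for combo in product(*(permutations(s) for s in sets)):
        if is_disjoint_ordering(combo):
            return [list(o) for o in combo]
    return None
```

For example, `find_disjoint_ordering([{1, 2}, {2, 3}])` succeeds (the orderings [1, 2] and [2, 3] have prefix sets {1}, {1, 2} and {2}, {2, 3}), while `find_disjoint_ordering([{1}, {1}])` returns `None`, since both orderings would end at the same set {1}.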
Increased energy consumption in processors, caused by performance enhancement, has recently become a critical problem. Many current processors employ dynamic voltage and frequency scaling (DVFS), which dynamically lowers the supply voltage and clock frequency to reduce energy consumption. However, it is difficult to deliver fine-grain energy optimization using DVFS: since the voltage regulator takes a long time to scale the voltage and charging/discharging a power line incurs a large energy overhead, the useful interval of DVFS is limited to coarse grain. To optimize energy consumption at a fine-grain interval, we have proposed a variable stages pipeline (VSP) processor. VSP reduces energy consumption by dynamically varying the pipeline depth to a depth suited to the behavior of the running program. VSP can obtain finer-grained energy reduction than DVFS because pipeline scaling requires only a small overhead. In this paper, we fabricate a VSP processor chip in 180 nm technology and evaluate its energy consumption. We show that the fabricated VSP chip dynamically varies the pipeline depth while a program is running and reduces energy consumption at shorter intervals than DVFS. We also analyze how to optimize energy consumption according to system demand. Our analysis shows that VSP can adjust the energy consumption in the same manner for diverse program phases.
We propose a constant-time algorithm for approximating the weight of the maximum weight branching in the general graph model. A directed graph is called a branching if it is acyclic and each vertex has at most one incoming edge. An edge-weighted digraph G of average degree d, whose weights are real values in [0, 1], is given via oracle access, and we are allowed to ask for the degree and incoming edges of any vertex through the oracle. Then, with high probability, our algorithm estimates the weight of the maximum weight branching in G with an absolute error of at most εn with query complexity O(d/ε³), where n is the number of vertices. We also show a lower bound of Ω(d/ε²). Additionally, our algorithm can be modified to run with query complexity O(1/ε⁴) for unweighted digraphs, i.e., it runs in time independent of the input size even for dense digraphs with d = Ω(n). In contrast, we show that Ω(n) queries are required to approximate the weight of the minimum (or maximum) spanning arborescence of a weighted digraph.
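As a sketch of the object being approximated (not of the sublinear-time algorithm itself), the following Python checks the branching property exactly as defined above and computes the exact maximum weight branching by exhaustive search; all names and the exponential baseline are our illustrative choices.

```python
from itertools import combinations

def is_branching(n, edges):
    """A branching is acyclic and every vertex has at most one incoming
    edge. edges is a list of (u, v, w) triples for directed edges u -> v."""
    indeg = [0] * n
    out = [[] for _ in range(n)]
    for u, v, _ in edges:
        indeg[v] += 1
        out[u].append(v)
    if any(d > 1 for d in indeg):
        return False
    # Kahn's algorithm: the graph is acyclic iff every vertex is processed.
    deg, stack, seen = indeg[:], [v for v in range(n) if indeg[v] == 0], 0
    while stack:
        u = stack.pop()
        seen += 1
        for v in out[u]:
            deg[v] -= 1
            if deg[v] == 0:
                stack.append(v)
    return seen == n

def max_weight_branching(n, edges):
    """Exact but exponential baseline; the paper instead *estimates* this
    weight to within an absolute error of εn using few oracle queries."""
    best = 0.0
    for r in range(len(edges) + 1):
        for sub in combinations(edges, r):
            if is_branching(n, list(sub)):
                best = max(best, sum(w for _, _, w in sub))
    return best
```

On the 3-cycle with edges (0, 1, 0.5), (1, 2, 0.4), (2, 0, 0.3), any two edges form a branching but all three close a cycle, so the maximum weight branching has weight 0.9.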
Systems-on-Chip (SoC) architectures have been shifting from single-core to multi-core solutions, and they are at present evolving towards many-core ones. Network-on-Chip (NoC) is considered a promising interconnection scheme for many-core SoCs, since it offers better scalability than traditional bus-based interconnection. In this work, we have developed a fast simulator of NoC architectures using QEMU and SystemC. QEMU is an open-source CPU emulator widely used in many simulation platforms, such as the Android Emulator. In the proposed simulator, each CPU core is emulated by a QEMU instance, and the network part, including the NoC routers, is modeled in SystemC. The SystemC simulator and the QEMU instances are connected by TCP sockets on a host computer. Our simulator is fast because the QEMU instances run in parallel on a multi-core host computer, or even on multiple host computers. It is also highly retargetable, because QEMU provides a variety of CPU models and we use QEMU as is. In our experiments, the simulator successfully simulates a 108-core NoC in a practical time. We have also confirmed the scalability and retargetability of our NoC simulator.
With the rapid growth of GPS-enabled mobile devices, location-based online social network services have become very popular and allow their users to share life experiences together with location information. In this paper, we consider a method for recommending places to a user based on the spatial databases of location-based online social network services. We use a user-based collaborative filtering method to produce a set of recommended places. In the proposed method, we calculate the similarity of users’ check-in activities based not only on their positions but also on their semantics, such as “shopping”, “eating”, “drinking”, and so forth. We empirically evaluated our method on a real database and found that it outperforms naive collaborative filtering based on singular value decomposition in recommendation accuracy.
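The general shape of such a user-based scheme can be sketched as follows. The category-count profiles, the cosine similarity measure, and all identifiers are our illustrative choices, not necessarily the exact formulation in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(target, profiles, visits, top_k=2):
    """User-based collaborative filtering: score each candidate place by
    the similarity-weighted check-ins of other users, excluding places
    the target user has already visited."""
    scores = {}
    for user, prof in profiles.items():
        if user == target:
            continue
        sim = cosine(profiles[target], prof)
        for place in visits[user]:
            if place not in visits[target]:
                scores[place] = scores.get(place, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

For instance, a user whose check-in semantics closely match the target's (say, mostly “shopping” and “eating”) contributes with weight close to 1, so the places that user visited rise to the top of the target's recommendation list.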
The existing k-dominant skyline solutions are restricted to centralized query processors, limiting scalability and imposing a single point of failure. To overcome these problems, in this paper we propose computation and maintenance algorithms for spatial k-dominant skyline query processing in a large-scale distributed environment, where the underlying dataset is partitioned across geographically distant computing cores (personal computers) connected to a coordinator (server). Our techniques preserve the spatial k-dominant computation object itself in serialized form; this preservation is done on the client's core after a computational job completes successfully. When maintenance is required, the preserved data object is retrieved and reused for computation. This procedure eliminates the need for intermediate re-sending and re-computation of the k-dominant skyline during maintenance. Thus, we quantify the gain of transferring data consecutively to different cores so as to maximize the overall gain of the query while balancing the load across the cores fairly. An extensive performance study shows that the proposed algorithms are efficient and robust to different data distributions.
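For concreteness, the k-dominant skyline itself can be computed naively as below. We assume the standard definition (smaller values are better): a point p k-dominates q when p is no worse than q in at least k dimensions and strictly better in at least one of those; the distributed partitioning and maintenance machinery of the paper is not reflected in this centralized sketch.

```python
def k_dominates(p, q, k):
    """p k-dominates q (minimisation convention): p is no worse than q in
    at least k dimensions and strictly better in at least one of them."""
    le = sum(1 for a, b in zip(p, q) if a <= b)
    lt = sum(1 for a, b in zip(p, q) if a < b)
    # Any strict dimension is also a <= dimension, so when le >= k we can
    # always pick k no-worse dimensions that include a strict one.
    return le >= k and lt >= 1

def k_dominant_skyline(points, k):
    """All points not k-dominated by any other point (quadratic scan)."""
    return [p for p in points
            if not any(k_dominates(q, p, k) for q in points if q != p)]
```

With k equal to the full dimensionality this reduces to the ordinary skyline; smaller k prunes more aggressively, since a point needs to withstand domination on every k-subset of dimensions.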
The availability and utility of large numbers of Graphics Processing Units (GPUs) have enabled parallel computations using extensive multi-threading. Sequential access to global memory and contention at the size-limited shared memory have been the main impediments to fully exploiting potential performance in architectures with a massive number of GPUs. After an extensive study of data structures and a complexity analysis of various data access methodologies, we propose novel memory storage and retrieval techniques that enable parallel graph computations to overcome the above issues. More specifically, given a graph G = (V, E) and an integer k ≤ |V|, we provide both storage techniques and algorithms to count the number of: (a) connected subgraphs of size k; (b) k-cliques; and (c) k-independent sets, all of which can be exponential in number. Our storage techniques are based on creating a breadth-first search (BFS) tree and storing it, along with the non-tree edges, in a novel way. Our experiments solve the above problems using both naïve and advanced data structures on the CPU and the GPU. Speedup is achieved on the GPU, even with a brute-force approach, compared to the CPU implementations. By exploiting the properties of the BFS tree, the performance gain on the GPU increases further, ultimately outperforming the CPU by a factor of at least 5 for graphs that fit entirely in shared memory and by a factor of 10 for larger graphs stored in global memory. The counting problems mentioned above have many uses, including the analysis of social networks.
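The three counting problems share a straightforward CPU baseline that enumerates all C(|V|, k) vertex subsets; the paper's GPU implementations and BFS-tree storage accelerate exactly this kind of computation. The sketch below is our illustrative baseline, not the paper's data structure.

```python
from itertools import combinations

def count_patterns(vertices, edges, k):
    """Brute-force baseline: count k-cliques, k-independent sets and
    connected induced k-subgraphs by enumerating every k-vertex subset."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cliques = indep = connected = 0
    for sub in combinations(vertices, k):
        s = set(sub)
        pairs = list(combinations(sub, 2))
        if all(v in adj[u] for u, v in pairs):
            cliques += 1
        if all(v not in adj[u] for u, v in pairs):
            indep += 1
        # Connectivity of the induced subgraph via DFS from one vertex.
        stack, seen = [sub[0]], {sub[0]}
        while stack:
            u = stack.pop()
            for w in (adj[u] & s) - seen:
                seen.add(w)
                stack.append(w)
        if seen == s:
            connected += 1
    return cliques, indep, connected
```

On the 4-cycle, for example, every 3-vertex subset induces a connected path but no triangle and no independent set, so the counts for k = 3 are (0, 0, 4).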
In the aerospace industry, computational fluid dynamics (CFD) is used as a common design tool. Fast Aerodynamics Routines (FaSTAR) is one of the most recent CFD software packages, offering users various solvers and automatic generation of grid data. The problem with FaSTAR is that it is hard to execute on parallel machines because of its irregular and unpredictable data structures. Exploiting the advantages of reconfigurable hardware to make up for the inadequacies of existing high-performance computers has gradually become a trend. However, a single FPGA is not enough for the FaSTAR package because the whole module is very large. Instead of using a large number of chips, the partial reconfiguration capability available in recent FPGAs is explored for this application. The advection term computation module in FaSTAR is chosen as the target subroutine. We propose a reconfigurable flux calculation scheme that uses partial reconfiguration to save hardware resources and fit the design into a single FPGA. We developed a flux computation module, and three flux calculation schemes are implemented as reconfigurable modules. This implementation saves up to 42% of hardware resources and improves the configuration speed by a factor of 6.28. Performance evaluation also shows a 2.65-fold acceleration compared to an Intel Core 2 Duo at 2.4 GHz.