PROJECTS | mysite

LATEST PROJECTS

Project | 01

Project | 01 Refine Coarse-grained Independent Tasks for Tip Decomposition (VLDB 2021)

Tip Decomposition is a crucial analytic used to find dense subgraphs in bipartite graphs. Unfortunately, it is computationally very expensive and the existing algorithm provides little work per synchronization to parallelize. In this work, we innovate a novel two-step approach that first computes bounds on the decomposition values and then refines the bounds for exact computation. This approach provides ample parallelism and opportunity for optimizations. We can decompose the largest open bipartite graph in half an hour (state of the art takes >10 days). This is my second work in "Dealing with the dependencies" series (after hub labeling).

Paper: The arXiv paper can be found here.
Code: Interested people can get the code here.

Project | 02

Project | 02 Scalable Parallel Algorithms for Hub Labeling (accepted at VLDB 2019)

Hub Labeling is an indexing that enables an analyst to quickly find shortest paths/distance between any 2 vertices in a graph. However, it is very challenging and time-consuming to construct the Hub labels. In this work, we developed new parallel and distributed hub labeling algorithms. The highlight is our PLaNT algorithm whose embarrassingly parallel nature inspires highly scalable, communication avoiding and efficient construction of Hub Labeling. We could label certain graphs within few minutes that state-of-the-art couldn't do in a day.

Paper: The arXiv paper can be found here.
Code: Interested people can get the code here.

Project | 03

Project | 03 Coordinating Multiple Ad Campaigns on Social Networks
(CIKM 2019)

We often see sponsored ads on Facebook. When we like and share these ads, they get shown to our friends through us. In this work, we develop intuitive models and new approximation algorithms to guide the host company (social network provider) in selecting seed nodes for sponsored ads when multiple advertisers are competing to advertise on a social network. Our work also unifies algorithms previously developed for this problem, under a common umbrella of matroid constraints.

Paper: Conference version, detailed arXiv version.
Slides: Please click here to download.
Code: The code for this work can be found here.

Project | 04

Project | 04 Parallel Static and Streaming Induced Edge Sampling (ACM CF 2019)

Graphs can sometimes get too large to process. In such situations, sampling becomes useful to create smaller and manageable representative subgraphs. But what if sampling itself takes too much time?

We develop novel parallel algorithms to accelerate Edge Sampling and subgraph induction for both static and streaming graphs. Algorithms for streaming graphs are especially useful in today's big data era but challenging to design due to streaming constraints.

Paper: This work was presented in ACM Computing Frontiers'19 and the paper can be found here.
Slides: Please click here to download.
Code: Can be found here.

Project | 05

Project | 05 Partition-centric Graph Processing (USENIX ATC 2018, PPoPP 2019)

Processing large graphs generates very irregular memory accesses and fails to utilize caches effectively. Is it fundamental or is it a result of the way we write our parallel programs?

In this project, we develop a new Partition-centric abstraction for the graphs that keeps cacheable vertex subsets at the center of computation. It enables hordes of memory and cache optimizations guaranteeing high cache data reuse and sustained DRAM bandwidth. We also develop optimizations for work-efficiency, scalability and fast convergence.

Paper: USENIX ATC 2018, arXiv
Slides: Please click here to download.
Code: GPOP Framework, PageRank only with baselines

To see more or discuss possible work let's talk >>

Let's Talk