Barna Saha

Assistant Professor, University of California Berkeley

Department of IEOR

Simons Institute for the Theory of Computing, Affiliate Faculty

Email: barnas (at) berkeley (dot) edu

I am looking for postdocs to work on Theoretical Computer Science. Please contact me directly with your cv. I am also looking for highly motivated and self-driven Ph.D. students with strong foundation in Theoretical Computer Science. Please apply to our Ph.D. program.

Research Interests: I am a Theoretical Computer Scientist. I specialize in Algorithm Design & Analysis, Probabilistic Method and Large Scale Data Analytics.

Previous Affiliations: University of Massachusetts Amherst, Computer Science (Associate Professor with tenure), AT&T Shannon Research Laboratory and University of Maryland College Park.

Selected Awards.

Selected Publications

PUBLICATIONS

[By TCS tradition author names are in alphabetical order of lastnames, with few exceptions.]

Algorithms/TheoryData Analytics
Sublinear Algorithms for Gap Edit Distance, Elazar Goldenberg, Robert Krauthgamer, Barna Saha,
FOCS 2019 
Correlation Clustering with Same-Cluster Queries Bounded by Optimal Cost, Barna Saha, Sanjay Subhramaniam, ESA 2019, Track B.
Connectivity in Random Annulus Graphs and the Geometric Block Model, Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha, RANDOM 2019.Paper Matching with Local Fairness ConstraintsAri Kobren, Barna Saha, Andrew McCallum, KDD 2019. Oral Presentation.
Dynamic Set Cover: Improved Algorithms & Lower Bounds, Amir Abboud, Raghavendra Addanki, Fabrizio Grandoni, Debmalya Panigrahi , Barna Saha, STOC 2019The Geometric Block Model, Sainyam Galhotra, Arya Mazumdar, Soumyabrata Pal, Barna Saha, AAAI 2018. 
Shorter version in NIPS 2017, Workshop on Learning on Distributions, Functions, Graphs and Groups.
Min-Max Correlation Clustering via MultiCut, Saba Ahmadi, Samir Khuller, Barna Saha, IPCO 2019
Robust Entity Resolution using Random Graphs, Sainyam Galhotra, Donatella Firmani, Barna Saha, Divesh Srivastava, SIGMOD 2018.

The Most Reproducible Paper Award
Fast Space-Efficient Approximations of Language Edit Distance and RNA-Folding: An Amnesic Dynamic Programming Approach, Barna Saha, FOCS 2017.Robust Entity Resolution Using a CrowdOracle, Donatella Firmani, Sainyam Galhotra, Barna Saha, Divesh Srivastava, IEEE Data Eng. Bull., 2018.
Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultra-linear grammars are where Parsing Becomes Hard!, Rajesh Jayaram and Barna Saha, ICALP 2017. Extended AbstractA Theoretical Analysis of First Heuristics of Crowdsourced Entity Resolution, Arya Mazumdar, Barna Saha. AAAI 2017.
Query Complexity of Clustering with Side Information, Arya Mazumdar and Barna Saha, NIPS 2017.Online Entity Resolution with an Oracle, Donatella Firmani, Barna Saha, Divesh Srivastava, VLDB 2016.
Clustering with Noisy Queries, Arya Mazumdar and Barna Saha, NIPS 2017.TreeScope: Finding Structural Anomalies In Semi-Structured Data, Shanshan Ying, Flip Korn, Barna Saha and Divesh Srivastava, VLDB 2015.
Truly Sub-cubic Algorithms for Language Edit Distance and RNA Folding via Fast Bounded-Difference Min-Plus Product, Karl Bringmann, Fabrizio Grandoni, Barna Saha and Virginia Williams, FOCS 2016. Invited to special issue for selected best papers.Size-constrained Weighted Set Cover, Lukasz Golab, Flip Korn, Feng Li, Barna Saha and Divesh Srivastava. ICDE 2015.
Language Edit Distance & Maximum Likelihood Parsing of Stochastic Grammars: Faster Algorithms & Connection to Fundamental Graph Problems, Barna Saha, FOCS 2015.Discovering Conservation Rules, Lukasz Golab, Howard Karloff, Flip Korn, Barna Saha and Divesh Srivastava.  IEEE Transactions on Knowledge and Data Engineering (TKDE) 2014. Special Issue on the Best Papers of ICDE 2012.
New Approximation Results for Resource Replication Problems, Samir Khuller, Barna Saha and Kanthi Sarpatwar. Algorithmica, 2015. [Preliminary Version in APPROX 2012]Data Quality: The other Face of Big Data, Barna Saha and Divesh Srivastava. Tutorial, ICDE 2014.
The Dyck Language Edit Distance Problem in Near-linear Time, Barna Saha, FOCS 2014.Distributed Data Placement to Minimize Communication Costs via Graph Partitioning, Lukasz Golab, Marios Hadjieleftheriou, Howard Karloff and Barna Saha. SSDBM 2014.
Hierarchical Graph Partitioning,  Mohammadtaghi Hajiaghayi, Theodore Johnson, Reza Khani and Barna Saha. SPAA 2014.Firewall Placement on Cloud Data Centers, Manish Purohit Seungjoon Lee and Barna Saha. Poster, SOCC 2013.
Facility Location with Matroid or Knapsack Constraints , Ravishankar Krishnaswamy, Amit Kumar, Vishwanath Nagarajan, Yogish Sabharwal and Barna Saha. Mathematics of Operations Research, 2014.On Repairing Structural Problems in Semi-structured Data, Flip Korn, Barna Saha, Divesh Srivastava and Shanshan Ying. VLDB 2013.
A Constant Factor Approximation Algorithm for Fault Tolerant k-Median Problem, Mohammadtaghi Hajiaghayi, Wei Hu, Jian Li, Shi Li and Barna Saha.  SODA, 2014.Less is More, Selecting Sources Wisely for Integration, Xin Luna Dong, Barna Saha and Divesh Srivastava. VLDB 2013.
Renting a Cloud, Barna Saha, FSTTCS, 2013.Discovering Conservation Rules, Lukasz Golab, Howard Karloff, Flip Korn, Barna Saha and Divesh Srivastava. ICDE 2012. Among the selected best papers of ICDE 2012.
On Chebyshev Radius of a Set in Hamming Space and the Closest String Problem , Yury Polyanskiy, Arya Mazumdar and Barna Saha. ISIT 2013.Link Prediction for Annotation Graphs, Philip Anderson, Samir Khuller, Saket Navlakha, Louiqa Raschid, Barna Saha, Andreas Thor and Xiao-Ning Zhang. ISWC 2011.
New Approximation Results for Resource Replication Problems, Samir Khuller, Barna Saha and Kanthi Sarpatwar. APPROX 2012.A Unified Approach to Ranking in Probabilistic Databases, Jian Li, Barna Saha and Amol Deshpande. The VLDB Journal, 2011. Best Paper Issue of VLDB 2009.
Set Cover revisited: Hypergraph Cover with Hard Capacities,Barna Saha and Samir Khuller. ICALP-Track A 2012.

This paper settles an open question raised by Chuzhoy and Naor in [FOCS02].

Dense Subgraphs with Restrictions and Applications to Gene Annotation Graphs, Allison Hoch, Samir Khuller, Louiqa Raschid, Barna Saha and Xiao-Ning Zhang. RECOMB 2010.
The Matroid Median Problem, Ravishankar Krishnaswamy,  Vishwanath Nagarajan and Barna Saha. SODA 2011.

Merged with a paper by Amit Kumar and Yogish Sabharwal who obtained similar results for the case of partition matroid.

Schema Covering: A Step Towards Enabling Reusability in Information Integration, Barna Saha, Ioana Stanoi and Ken Clarkson. ICDE 2010.
New Constructive Aspects of the Lovasz Local Lemma,  Bernhard Haeupler,  Barna Saha and Aravind Srinivasan. Journal of the ACM (JACM), 2011. [Preliminary version in FOCS 2010]A Unified Approach to Ranking in Probabilistic Databases, Jian Li, Barna Saha and Amol Deshpande. VLDB 2009. [BEST PAPER AWARD]   Invited to the BEST PAPER ISSUE OF VLDBJ.
AdCell-Ad Allocation in Cellular Networks, Saeed Alaei, Mohammad Taghi Hajiaghayi, Vahid Liaghat, Dan Pei and Barna Saha. ESA 2011.

Here is a much polished version for the online algorithm. Link.

On Maximum Coverage in the Streaming Model & Application to Multi-topic Blog-Watch, Barna Saha and Lise Getoor. SDM 2009.
New Constructive Aspects of the Lovasz Local Lemma, Bernhard Haeupler, Barna Saha and Aravind Srinivasan. FOCS 2010.

This paper settles an outstanding open question related to resource allocation, known as the Santa Claus problem.

Group Proximity Measure for Recommending Groups in Online Social Networks., Barna Saha and Lise Getoor. ACM SIGKDD Workshop on Social Network Mining and Analysis (SNA-KDD), 2008.
Energy Efficient Scheduling via Partial Shutdown, Samir Khuller, Barna Saha and Jian Li. SODA 2010.Simplifying Information Integration: Object-Based Flow-of-Mappings Framework for Integration, Bogdan Alexe, Michael Gubanov, Mauricio A. Hernandez, Howard Ho, Jen-Wei Huang, Yannis Katsis, Lucian Popa, Barna Saha and Ioana Stanoi. Lecture Notes in Business Information Processing, Revised Selected Papers from  Second International VLDB Workshop, BIRTE 2008.
A New Approximation Technique for Resource-Allocation Problems, Barna Saha and Aravind Srinivasan. ITCS 2010, known as ICS at that time.Bidirectional Fuzzy-Regression Model for Road-lines Detection, Barna Saha, Arya Mazumdar and N R Pal. ICEIS 2006.
On Finding Dense Subgraphs, Samir Khuller and Barna Saha. ICALP 2009. Longer version.
On Estimating Path-Aggregates over Streaming Graphs, Sumit Ganguly and Barna Saha. ISAAC 2006.