Find the "largest" dense sub matrix in a large sparse matrix

Posted by BCS on Stack Overflow See other posts from Stack Overflow or by BCS
Published on 2009-08-01T19:59:32Z Indexed on 2010/05/09 4:28 UTC
Read the original article Hit count: 684

Given a large sparse matrix (say 10k+ by 1M+) I need to find a subset, not necessarily continuous, of the rows and columns that form a dense matrix (all non-zero elements). I want this sub matrix to be as large as possible (not the largest sum, but the largest number of elements) within some aspect ratio constraints.

Are there any known exact or aproxamate solutions to this problem?

A quick scan on Google seems to give a lot of close-but-not-exactly results. What terms should I be looking for?


edit: Just to clarify; the sub matrix need not be continuous. In fact the row and column order is completely arbitrary so adjacency is completely irrelevant.


A thought based on Chad Okere's idea

  1. Order the rows from largest count to smallest count (not necessary but might help perf)
  2. Select two rows that have a "large" overlap
  3. Add all other rows that won't reduce the overlap
  4. Record that set
  5. Add whatever row reduces the overlap by the least
  6. Repeat at #3 until the result gets to small
  7. Start over at #2 with a different starting pair
  8. Continue until you decide the result is good enough

© Stack Overflow or respective owner

Related posts about sparse-matrix

Related posts about dense-matrix