Table design for storing a large number of rows

Posted by hyperboreean on Stack Overflow
Published on 2010-05-10T10:12:13Z

I am trying to store some unique identifiers in a PostgreSQL database, along with the sites they have been seen on. I can't decide which of the following options would be fastest and easiest to maintain. The table has to provide the following information:

the unique identifier, which unfortunately is text
the sites on which that unique identifier has been seen

The amount of data the table would have to hold is rather large: there are around 22 million unique identifiers that I know of.

So I thought about the following designs for the table:

A single table with one row per identifier and site:

id - integer
identifier - text
seen_on_site - integer, foreign key to a sites table

This approach would require around 22 million rows multiplied by the number of sites.
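
To make the single-table design above concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` with an in-memory database purely as a runnable stand-in for PostgreSQL; the table name `sightings`, the helper names, and the sample data are assumptions for illustration, not part of the question.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (id INTEGER PRIMARY KEY, site TEXT NOT NULL UNIQUE)")
conn.execute(
    """
    CREATE TABLE sightings (
        id INTEGER PRIMARY KEY,
        identifier TEXT NOT NULL,
        seen_on_site INTEGER NOT NULL REFERENCES sites(id),
        UNIQUE (identifier, seen_on_site)  -- one row per (identifier, site) pair
    )
    """
)
conn.executemany("INSERT INTO sites (id, site) VALUES (?, ?)", [(1, "site-a"), (2, "site-b")])
conn.executemany(
    "INSERT INTO sightings (identifier, seen_on_site) VALUES (?, ?)",
    [("abc", 1), ("abc", 2), ("xyz", 1)],
)


def sites_for(identifier):
    """Return the names of the sites on which the identifier was seen."""
    rows = conn.execute(
        "SELECT s.site FROM sightings AS g "
        "JOIN sites AS s ON s.id = g.seen_on_site "
        "WHERE g.identifier = ? ORDER BY s.site",
        (identifier,),
    )
    return [row[0] for row in rows]


def max_rows(identifiers, sites):
    """Worst case: every identifier has been seen on every site."""
    return identifiers * sites
```

Note that the text identifier is stored again for every site it appears on. With 22 million identifiers and 10 sites, `max_rows(22_000_000, 10)` gives an upper bound of 220 million rows; the real count depends on how many sites each identifier actually shows up on.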

A single table with one boolean column per site:

id - integer
identifier - text
seen_on_site_1 - boolean
seen_on_site_2 - boolean
...
seen_on_site_n - boolean

Hopefully the number of sites won't go past 10. This would require only one row per unique identifier that I know of, that is around 20 million rows, but it would be hard to work with from an ORM perspective.
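
A rough sketch of the boolean-column design follows, again using in-memory `sqlite3` as a stand-in for PostgreSQL. The table name `identifiers_wide` and the helper `add_site_column` are hypothetical. It is meant to show one practical consequence of this layout: every new site requires a schema change.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; names below are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE identifiers_wide (
        id INTEGER PRIMARY KEY,
        identifier TEXT NOT NULL UNIQUE,
        seen_on_site_1 BOOLEAN NOT NULL DEFAULT 0,
        seen_on_site_2 BOOLEAN NOT NULL DEFAULT 0
    )
    """
)
conn.execute("INSERT INTO identifiers_wide (identifier, seen_on_site_1) VALUES ('abc', 1)")


def add_site_column(n):
    """Adding a site means altering the table, not inserting a row."""
    # int(n) ensures that only a number is interpolated into the column name.
    conn.execute(
        f"ALTER TABLE identifiers_wide "
        f"ADD COLUMN seen_on_site_{int(n)} BOOLEAN NOT NULL DEFAULT 0"
    )


add_site_column(3)

# Column names after the change, read from SQLite's table metadata.
columns = [row[1] for row in conn.execute("PRAGMA table_info(identifiers_wide)")]
```

Because the column set changes whenever a site is added, any ORM model mapped to this table would have to change with it, which is likely part of why this layout feels awkward from an ORM perspective.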

Three tables: one that would store only the unique identifiers, like:

id - integer
unique_identifier - text

one that would store only the sites, like:

id - integer
site - text

and one for the many-to-many relation, like:

id - integer
unique_id - integer (fk to the table storing identifiers)
site_id - integer (fk to the sites table)
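
The three-table layout above can be sketched as follows, once more with in-memory `sqlite3` standing in for PostgreSQL. The table names `identifiers` and `identifier_sites`, the `UNIQUE` constraints, and the helpers `record_sighting` and `sites_for` are assumptions added for illustration. `INSERT OR IGNORE` is SQLite syntax; in PostgreSQL 9.5 and later the comparable form is `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE identifiers (
        id INTEGER PRIMARY KEY,
        unique_identifier TEXT NOT NULL UNIQUE
    );
    CREATE TABLE sites (
        id INTEGER PRIMARY KEY,
        site TEXT NOT NULL UNIQUE
    );
    CREATE TABLE identifier_sites (
        id INTEGER PRIMARY KEY,
        unique_id INTEGER NOT NULL REFERENCES identifiers(id),
        site_id INTEGER NOT NULL REFERENCES sites(id),
        UNIQUE (unique_id, site_id)
    );
    """
)


def record_sighting(identifier, site):
    """Store the identifier and the site once each, then link them."""
    conn.execute("INSERT OR IGNORE INTO identifiers (unique_identifier) VALUES (?)", (identifier,))
    conn.execute("INSERT OR IGNORE INTO sites (site) VALUES (?)", (site,))
    conn.execute(
        """
        INSERT OR IGNORE INTO identifier_sites (unique_id, site_id)
        SELECT i.id, s.id
        FROM identifiers AS i, sites AS s
        WHERE i.unique_identifier = ? AND s.site = ?
        """,
        (identifier, site),
    )


def sites_for(identifier):
    """Return the names of the sites on which the identifier was seen."""
    rows = conn.execute(
        """
        SELECT s.site
        FROM identifiers AS i
        JOIN identifier_sites AS l ON l.unique_id = i.id
        JOIN sites AS s ON s.id = l.site_id
        WHERE i.unique_identifier = ?
        ORDER BY s.site
        """,
        (identifier,),
    )
    return [row[0] for row in rows]


record_sighting("abc", "site-a")
record_sighting("abc", "site-b")
record_sighting("abc", "site-a")  # repeated sighting, ignored by the UNIQUE constraint
```

In this layout each text identifier is stored exactly once, and the link table holds only two integers per sighting, so adding a site is an ordinary insert rather than a schema change.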

Another approach would be to have a separate table of unique identifiers for each site.
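
As a rough illustration of the table-per-site idea, the sketch below creates one table per site and answers "where was this identifier seen?" by querying all of them. It uses in-memory `sqlite3` as a stand-in for PostgreSQL; the `identifiers_site_<n>` naming scheme, the `site_ids` list, and the sample data are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; table names are illustrative.
conn = sqlite3.connect(":memory:")
site_ids = [1, 2]

for site_id in site_ids:
    # One table per site, holding only the identifiers seen on that site.
    conn.execute(f"CREATE TABLE identifiers_site_{int(site_id)} (identifier TEXT PRIMARY KEY)")

conn.execute("INSERT INTO identifiers_site_1 VALUES ('abc')")
conn.execute("INSERT INTO identifiers_site_2 VALUES ('abc')")
conn.execute("INSERT INTO identifiers_site_2 VALUES ('xyz')")


def sites_for(identifier):
    """Return the ids of the sites whose table contains the identifier."""
    # One SELECT per site table, combined with UNION ALL.
    parts = [
        f"SELECT {int(s)} AS site_id FROM identifiers_site_{int(s)} WHERE identifier = ?"
        for s in site_ids
    ]
    query = " UNION ALL ".join(parts) + " ORDER BY site_id"
    return [row[0] for row in conn.execute(query, [identifier] * len(site_ids))]
```

The query text has to be rebuilt whenever a site is added, and an identifier seen on several sites is stored as text in each of those tables.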

So, which one seems like the better approach in the long run?

database-design

postgresql
