How can I build a generic dataset-handling Perl library?

Posted by Pep. on Stack Overflow See other posts from Stack Overflow or by Pep.
Published on 2010-03-21T10:50:01Z Indexed on 2010/03/22 15:51 UTC
Read the original article Hit count: 270

Filed under:

perl

|

data

Hello,

I want to build a generic Perl module for handling and analysing biomedical character separated datasets and which can, most certain, be used on any kind of datasets that contain a mixture of categorical (A,B,C,..) and continuous (1.2,3,881..) and identifier (XXX1,XXX2...). The plan is to have people initialize the module and then use some arguments to point to the data file(s), the place were the analysis reports should be placed and the structure of the data.

By structure of data I mean which variable is in which place and its name/type. And this is where I need some enlightenment. I am baffled how to do this in a clean way. Obviously, having people create a simple schema file, be it XML or some other format would be the cleanest but maybe not all people enjoy doing something like this.

The solutions I can think of are:

Create a configuration file in XML or similar and with a prespecified format.
Pass the information during initialization of the module.
Use the first row of the data as headers and try to guess types (ouch)

Surely there must be a "canonical" way of doing this that is also usable and efficient.

Thanks p.

© Stack Overflow or respective owner

Related posts about perl

Munin on Centos 6 - missing perl MODULE_COMPAT_5.8.8

as seen on Server Fault - Search for 'Server Fault'
I'm trying to install Munin on a new VPS through yum install munin but I keep getting an error about a missing perl module: Requires: perl(:MODULE_COMPAT_5.8.8). This is the perl version currently installed: v5.10.1. I've searched all around and still haven't found a solution for this. Here's the… >>> More
Pain removing a perl rootkit

as seen on Server Fault - Search for 'Server Fault'
So, we host a geoservice webserver thing at the office. Someone apparently broke into this box (probably via ftp or ssh), and put some kind of irc-managed rootkit thing. Now I'm trying to clean the whole thing up, I found the process pid who tries to connect via irc, but i can't figure out who's… >>> More
How To Avoid a Perl script calling an Another Perl Script

as seen on Stack Overflow - Search for 'Stack Overflow'
Hello, i am calling a perl script client.pl from a main script to capture the output of client.pl in @output. is there anyway to avoid the use of these two files so i can use the output of client.pl in main.pl itself here is my code.... main.pl ======= my @output = readpipe("client.pl"); client… >>> More
Perl :how to sort dates in perl

as seen on Stack Overflow - Search for 'Stack Overflow'
Hi, How can I sort the dates in perl. my @dates = ( "02/11/2009" , "12/20/2001" , "11/21/2010" ) ; I have above dates in my array . How can I sort those dates... ? My date format is dd/mm/YYYY. >>> More
please suggest a perl book exclusively for perl programs

as seen on Stack Overflow - Search for 'Stack Overflow'
I want tha name of a perl book for only PERL PROGRAMS. The reason behind is I want to improve my programming skill in perl >>> More

Related posts about data

timetable in a jTable

as seen on Stack Overflow - Search for 'Stack Overflow'
I want to create a timetable in a jTable. For the top row it will display from monday to sunday and the left colume will display the time of the day with 2h interval e.g 1st colume (0000 - 0200), 2nd colume (0200 - 0400) .... And if i click a button the timing will change from 2h interval to 1h interval… >>> More
Reading data from an Entity Framework data model through a WCF Data Service

as seen on ASP.net Weblogs - Search for 'ASP.net Weblogs'
This is going to be the fourth post of a series of posts regarding ASP.Net and the Entity Framework and how we can use Entity Framework to access our datastore. You can find the first one here , the second one here and the third one here . I have a post regarding ASP.Net and EntityDataSource. You… >>> More
SQL SERVER – Advanced Data Quality Services with Melissa Data – Azure Data Market

as seen on SQL Authority - Search for 'SQL Authority'
There has been much fanfare over the new SQL Server 2012, and especially around its new companion product Data Quality Services (DQS). Among the many new features is the addition of this integrated knowledge-driven product that enables data stewards everywhere to profile, match, and cleanse data.… >>> More
Modifying a HTML page to fix several "bugs" add a function to next/previous on a option dropdown

as seen on Stack Overflow - Search for 'Stack Overflow'
SOF, I've got a few problems plaguing me at the moment and am wondering if anyone could assist me with them. I'm trying to get Next Class | Previous Class to act as buttons so that when Next Class is clicked it will go to the next item in the dropdown list and for previous it would go to back one… >>> More
Shrinking TCP Window Size to 0 on Cisco ASA

as seen on Server Fault - Search for 'Server Fault'
Having an issue with any large file transfer that crosses our Cisco ASA unit come to an eventual pause. Setup Test1: Server A, FileZilla Client <- 1GBPS - Cisco ASA <- 1 GBPS - Server B, FileZilla Server TCP Window size on large transfers will drop to 0 after around 30 seconds of a large… >>> More