Search Results

Search found 6 results on 1 pages for 'rizwank'.

Page 1/1 | 1 

  • Algorithm detect repeating/similiar strings in a corpus of data -- say email subjects, in Python

    - by RizwanK
    I'm downloading a long list of my email subject lines , with the intent of finding email lists that I was a member of years ago, and would want to purge them from my Gmail account (which is getting pretty slow.) I'm specifically thinking of newsletters that often come from the same address, and repeat the product/service/group's name in the subject. I'm aware that I could search/sort by the common occurrence of items from a particular email address (and I intend to), but I'd like to correlate that data with repeating subject lines.... Now, many subject lines would fail a string match, but "Google Friends : Our latest news" "Google Friends : What we're doing today" are more similar to each other than a random subject line, as is: "Virgin Airlines has a great sale today" "Take a flight with Virgin Airlines" So -- how can I start to automagically extract trends/examples of strings that may be more similar. Approaches I've considered and discarded ('because there must be some better way'): Extracting all the possible substrings and ordering them by how often they show up, and manually selecting relevant ones Stripping off the first word or two and then count the occurrence of each sub string Comparing Levenshtein distance between entries Some sort of string similarity index ... Most of these were rejected for massive inefficiency or likelyhood of a vast amount of manual intervention required. I guess I need some sort of fuzzy string matching..? In the end, I can think of kludgy ways of doing this, but I'm looking for something more generic so I've added to my set of tools rather than special casing for this data set. After this, I'd be matching the occurring of particular subject strings with 'From' addresses - I'm not sure if there's a good way of building a data structure that represents how likely/not two messages are part of the 'same email list' or by filtering all my email subjects/from addresses into pools of likely 'related' emails and not -- but that's a problem to solve after this one. Any guidance would be appreciated.

    Read the article

  • Ways of breaking down SQL transactional/call data into reports -- 'square data'?

    - by RizwanK
    I've got a large database of call-traffic information (although the question could be answered with any generic data set.) For instance, a row contains : call endpoint server (endpoint_name) call endpoint status (sip_disconnect_reason) call destination (destination) call completed (duration) [duration 0 is completed] call account group (account_group) It's pretty easy to run SQL reports against the data, i.e. select count(*), endpoint_name from calls where duration0 group by endpoint_name select count(*),destination from calls where blah group by destination I've been calling this filtering or breakdown reports (I get the number of calls per carrier, etc.). Add another breakdown, and you've got two breakdowns, a la select count(*), endpoint_name, sip_disconnect_reason from calls where duration=0 group by endpoint_name, sip_disconnect_reason Of course, if you keep adding breakdowns, you end up making super-large reports and slicing your data so thin that you can't extract any trends from it. So my question is this : Is there a name for this sort of method of report writing? (I've heard words like squares, slicing and breakdown reports applied to them) --- I'm looking for a Python/Reporting toolkit that I can use to make these easier to generate for my end users. aside : Are there other ways of representing transactional data that might be useful rather than the above method? Thanks,

    Read the article

  • Looking for an email template engine for end-users ...

    - by RizwanK
    We have a number of customers that we have to send monthly invoices too. Right now, I'm managing a codebase that does SQL queries against our customer database and billing database and places that data into emails - and sends it. I grow weary of maintaining this every time we want to include a new promotion or change our customer service phone numbers. So, I'm looking for a replacement to move more of this into the hands of those requesting the changes. In my ideal world, I need : A WYSIWYG (man, does anyone even say that anymore?) email editor that generates templates based upon the output from a Database Query. The ability to drag and drop various fields from the database query into the email template. Display of sample email results with the database query. Web application, preferably not requiring IIS. Involve as little code as possible for the end-user, but allow basic functionality (i.e. arrays/for loops) Either comes with it's own email delivery engine, or writes output in a way that I can easily write a Python script to deliver the email. Support for generic Database Connectors. (I need MSSQL and MySQL) F/OSS So ... can anyone suggest a project like this, or some tools that'd be useful for rolling my own? (My current alternative idea is using something like ERB or Tenjin, having them write the code, but not having live-preview for the editor would suck...)

    Read the article

  • Tools for managing code deployment/versioning for IIS / Windows enviroments

    - by RizwanK
    I've got a strong background in Linux and OSX, and just left a job where I was architecting systems based on those platforms. Now I've got a Windows Server running IIS that has a number of different websites that it hosts. Most of them are just a bunch of HTML, JS and Images, with some ASP for some customer tools. (Each website has a different set of customer tools, or they are the same tools, but with minor code changes between them.) I'm also adding a develop web server with the same code, but the 'bleeding edge' stuff. I need an effective way of managing changes and updates to the overall codebase (henceforth referring to both the images and the html and the asp, for all the sites). When a dev (or webmaster) checks in changes, I want it to show up automatically on the developer server, but should be manually pushed out to the live server. I'd be tempted to just make the websites SVN repositories, but I'd be concerned about the overhead of having the webdeveloper having to log into the server and trigger an SVN update via commandline/tortise (and heaven forbid, manage tags). Ideally I'd also manage IIS profile settings between the systems, but the major need is to be able to manage the process, and expose it to our ASP developer, and our webmaster, both of which are used to just FTPing up the files to the live site. So, any recommendations on tools (beyond some SVN hacking with BAT files + teaching the webmaster how to log into the server and do updates) or workflows that would help this out? I even considered an RPM type package (or some Windows equivalent, of course) to manage the live server, but that seems like a bit too much overhead. Thanks.

    Read the article

  • Looking for an email/report templating engine with database backend - for end-users ...

    - by RizwanK
    We have a number of customers that we have to send monthly invoices too. Right now, I'm managing a codebase that does SQL queries against our customer database and billing database and places that data into emails - and sends it. I grow weary of maintaining this every time we want to include a new promotion or change our customer service phone numbers. So, I'm looking for a replacement to move more of this into the hands of those requesting the changes. In my ideal world, I need : A WYSIWYG (man, does anyone even say that anymore?) email editor that generates templates based upon the output from a Database Query. The ability to drag and drop various fields from the database query into the email template. Display of sample email results with the database query. Web application, preferably not requiring IIS. Involve as little code as possible for the end-user, but allow basic functionality (i.e. arrays/for loops) Either comes with it's own email delivery engine, or writes output in a way that I can easily write a Python script to deliver the email. Support for generic Database Connectors. (I need MSSQL and MySQL) F/OSS So ... can anyone suggest a project like this, or some tools that'd be useful for rolling my own? (My current alternative idea is using something like ERB or Tenjin, having them write the code, but not having live-preview for the editor would suck...)

    Read the article

1