disappearedng - Developer IT

Customizable mail server - what are my options? [closed]

- by disappearedng

This question was originally on SO but it was closed since it is considered off topic. I am interested to build a mail service that allows you to incorporate custom logic in the your mail server. For example, user A can reply to [email protected] once and subsequent emails from user A to [email protected] will not go through until certain actions are taken. I am looking for something simple and customizable, preferably open-sourced. I am fluent in most modern languages. What email servers do you guys recommend for this? Mailgun looks promising, but are there any simpler options?

Read the article

Python Memory leak - Solved, but still puzzled

- by disappearedng

Dear everyone, I have successfully debugged my own memory leak problems. However, I have noticed some very strange occurence. for fid, fv in freqDic.iteritems(): outf.write(fid+"\t") #ID for i, term in enumerate(domain): #Vector tfidf = self.tf(term, fv) * self.idf( term, docFreqDic) if i == len(domain) - 1: outf.write("%f\n" % tfidf) else: outf.write("%f\t" % tfidf) outf.flush() print "Memory increased by", int(self.memory_mon.usage()) - startMemory outf.close() def tf(self, term, freqVector): total = freqVector[TOTAL] if total == 0: return 0 if term not in freqVector: ## When you don't have these lines memory leaks occurs return 0 ## return float(freqVector[term]) / freqVector[TOTAL] def idf(self, term, docFrequencyPerTerm): if term not in docFrequencyPerTerm: return 0 return math.log( float(docFrequencyPerTerm[TOTAL])/docFrequencyPerTerm[term]) Basically let me describe my problem: 1) I am doing tfidf calculations 2) I traced that the source of memory leaks is coming from defaultdict. 3) I am using the memory_mon from http://stackoverflow.com/questions/276052/how-to-get-current-cpu-and-ram-usage-in-python 4) The reason for my memory leaks is as follows: a) in self.tf, if the lines: if term not in freqVector: return 0 are not added that will cause the memory leak. (I verified this myself using memory_mon and noticed a sharp increase in memory that kept on increasing) The solution to my problem was 1) since fv is a defaultdict, any reference to it that are not found in fv will create an entry. Over a very large domain, this will cause memory leaks. I decided to use dict instead of default dict and the memory problem did go away. My only puzzle is: since fv is created in "for fid, fv in freqDic.iteritems():" shouldn't fv be destroyed at the end of every for loop? I tried putting gc.collect() at the end of the for loop but gc was not able to collect everything (returns 0). Yes, the hypothesis is right, but the memory should stay fairly consistent with ever for loop if for loops do destroy all temp variables. This is what it looks like with that two line in self.tf: Memory increased by 12 Memory increased by 948 Memory increased by 28 Memory increased by 36 Memory increased by 36 Memory increased by 32 Memory increased by 28 Memory increased by 32 Memory increased by 32 Memory increased by 32 Memory increased by 40 Memory increased by 32 Memory increased by 32 Memory increased by 28 and without the the two line: Memory increased by 1652 Memory increased by 3576 Memory increased by 4220 Memory increased by 5760 Memory increased by 7296 Memory increased by 8840 Memory increased by 10456 Memory increased by 12824 Memory increased by 13460 Memory increased by 15000 Memory increased by 17448 Memory increased by 18084 Memory increased by 19628 Memory increased by 22080 Memory increased by 22708 Memory increased by 24248 Memory increased by 26704 Memory increased by 27332 Memory increased by 28864 Memory increased by 30404 Memory increased by 32856 Memory increased by 33552 Memory increased by 35024 Memory increased by 36564 Memory increased by 39016 Memory increased by 39924 Memory increased by 42104 Memory increased by 42724 Memory increased by 44268 Memory increased by 46720 Memory increased by 47352 Memory increased by 48952 Memory increased by 50428 Memory increased by 51964 Memory increased by 53508 Memory increased by 55960 Memory increased by 56584 Memory increased by 58404 Memory increased by 59668 Memory increased by 61208 Memory increased by 62744 Memory increased by 64400 I look forward to your answer

Read the article

Scipy Negative Distance? What?

- by disappearedng

I have a input file which are all floating point numbers to 4 decimal place. i.e. 13359 0.0000 0.0000 0.0001 0.0001 0.0002` 0.0003 0.0007 ... (the first is the id). My class uses the loadVectorsFromFile method which multiplies it by 10000 and then int() these numbers. On top of that, I also loop through each vector to ensure that there are no negative values inside. However, when I perform _hclustering, I am continually seeing the error, "Linkage Z contains negative values". I seriously think this is a bug because: I checked my values, the values are no where small enough or big enough to approach the limits of the floating point numbers and the formula that I used to derive the values in the file uses absolute value (my input is DEFINITELY right). Can someone enligten me as to why I am seeing this weird error? What is going on that is causing this negative distance error? ===== def loadVectorsFromFile(self, limit, loc, assertAllPositive=True, inflate=True): """Inflate to prevent "negative" distance, we use 4 decimal points, so *10000 """ vectors = {} self.winfo("Each vector is set to have %d limit in length" % limit) with open( loc ) as inf: for line in filter(None, inf.read().split('\n')): l = line.split('\t') if limit: scores = map(float, l[1:limit+1]) else: scores = map(float, l[1:]) if inflate: vectors[ l[0]] = map( lambda x: int(x*10000), scores) #int might save space else: vectors[ l[0]] = scores if assertAllPositive: #Assert that it has no negative value for dirID, l in vectors.iteritems(): if reduce(operator.or_, map( lambda x: x < 0, l)): self.werror( "Vector %s has negative values!" % dirID) return vectors def main( self, inputDir, outputDir, limit=0, inFname="data.vectors.all", mappingFname='all.id.features.group.intermediate'): """ Loads vector from a file and start clustering INPUT vectors is { featureID: tfidfVector (list), } """ IDFeatureDic = loadIdFeatureGroupDicFromIntermediate( pjoin(self.configDir, mappingFname)) if not os.path.exists(outputDir): os.makedirs(outputDir) vectors = self.loadVectorsFromFile( limit, pjoin( inputDir, inFname)) for threshold in map( lambda x:float(x)/30, range(20,30)): clusters = self._hclustering(threshold, vectors) if clusters: outputLoc = pjoin(outputDir, "threshold.%s.result" % str(threshold)) with open(outputLoc, 'w') as outf: for clusterNo, cluster in clusters.iteritems(): outf.write('%s\n' % str(clusterNo)) for featureID in cluster: feature, group = IDFeatureDic[featureID] outline = "%s\t%s\n" % (feature, group) outf.write(outline.encode('utf-8')) outf.write("\n") else: continue def _hclustering(self, threshold, vectors): """function which you should call to vary the threshold vectors: { featureID: [ tfidf scores, tfidf score, .. ] """ clusters = defaultdict(list) if len(vectors) > 1: try: results = hierarchy.fclusterdata( vectors.values(), threshold, metric='cosine') except ValueError, e: self.werror("_hclustering: %s" % str(e)) return False for i, featureID in enumerate( vectors.keys()):

Read the article

facebook integration for sites

- by disappearedng

Hi everyone, AFAIK, the following are not possible for now: (Assuming we have user A which has allowed us all permissions:) Send a message to a A's friend get A's friends email Aside from the obvious "Like" button, what are the ways that an app can leverage facebook's graph API to increase traffic flow? I find the Like button to be more of a broadcasting mechanism. I was wondering what are the ways that I can specifically tell user B, whom is a friend of User A, that he/she should come check out my site. Many thanks

Developer IT