Search Results

Search found 11543 results on 462 pages for 'partition wise join'.

Page 373/462 | < Previous Page | 369 370 371 372 373 374 375 376 377 378 379 380 | Next Page >

finding long repeated substrings in a massive string

- by Will

I naively imagined that I could build a suffix trie where I keep a visit-count for each node, and then the deepest nodes with counts greater than one are the result set I'm looking for. I have a really really long string (hundreds of megabytes). I have about 1 GB of RAM. This is why building a suffix trie with counting data is too inefficient space-wise to work for me. To quote Wikipedia's Suffix tree: storing a string's suffix tree typically requires significantly more space than storing the string itself. The large amount of information in each edge and node makes the suffix tree very expensive, consuming about ten to twenty times the memory size of the source text in good implementations. The suffix array reduces this requirement to a factor of four, and researchers have continued to find smaller indexing structures. And that was wikipedia's comments on the tree, not trie. How can I find long repeated sequences in such a large amount of data, and in a reasonable amount of time (e.g. less than an hour on a modern desktop machine)? (Some wikipedia links to avoid people posting them as the 'answer': Algorithms on strings and especially Longest repeated substring problem ;-) )

Read the article
DTOs Collections mapping Problem

- by the_knight5000

I'm working now on a multi-tier project which has layers as following : DAL BLL GUI Layer and Shared DTOs between BLL and GUI layers. I'm facing a problem in mapping the Objects from DAO To DTO, No problem in the simple objects. The problem is in the Objects who have child collections of another objects. ex: Author Category --Categories --Authors the execution goes in an infinite loop of mapping and it get more complex when I want model Self-join tables ex: Safe Safe --TransferSafe(Collection<Safe>) --TransferSafe(Collection<Safe>) the execution goes in an infinite loop of mapping any suggestions about a good solution or a practical mapping pattern?

Read the article
Ruby - Is there a way to overwrite the __FILE__ variable?

- by Markus Orrelly

I'm doing some unit testing, and some of the code is checking to see if files exist based on the relative path of the currently-executing script by using the FILE variable. I'm doing something like this: if File.directory?(File.join(File.dirname(__FILE__),'..','..','directory')) blah blah blah ... else raise "Can't find directory" end I'm trying to find a way to make it fail in the unit tests without doing anything drastic. Being able to overwrite the __ FILE __ variable would be easiest, but as far as I can tell, it's impossible. Any tips?

Read the article
deleting unaccessed files using python

- by damon

My django app parses some files uploaded by the user.It is possible that the file uploaded by the user may remain in the server for a long time ,without it being parsed by the app.This can increase in size if a lot of users upload a lot of files. I need to delete those files not recently parsed by the app -say not accessed for last 24 hours.I tried like this import os import time dirname = MEDIA_ROOT+my_folder filenames = os.listdir(dirname) filenames = [os.path.join(dirname,filename) for filename in filenames] for filename in filenames: last_access = os.stat(filename).st_atime #secs since epoch rtime = time.asctime(time.localtime(last_access)) print filename+'----'+rtime This shows the last accessed times for each file..But I am not sure how I can test if the file access time was within the last 24 hours..Can somebody help me out?

Read the article
Concatenating string with number in Javascript

- by Sparky

I'm trying to create a simple calculator in Javascript. I have an array named expression chunk[0] = 12 chunk[1] = + (the "+" sign) chunk[1] = 5 I used a for loop to loop through the chunks (chunk[]) and join then into a single expression as follows:- equation = ""; // To make var equation a string for(i = 0; i <= length; i++) { equation = equation + expression[i]; alert(expression[i]); } alert(equation); alert(expression[i]) showed values 12, + and 5. But alert(equation) showed 125 (instead of "12+5"). I need the variable equation to be "12+5" so that I can later call eval(equation) and get the value of 12+5. What am I doing wrong here?

Read the article
index error:list out of range

- by kaushik

from string import Template from string import Formatter import pickle f=open("C:/begpython/text2.txt",'r') p='C:/begpython/text2.txt' f1=open("C:/begpython/text3.txt",'w') m=[] i=0 k='a' while k is not '': k=f.readline() mi=k.split(' ') m=m+[mi] i=i+1 print m[1] f1.write(str(m[3])) f1.write(str(m[4])) x=[] j=0 while j<i: k=j-1 l=j+1 if j==0 or j==i: j=j+1 else: xj=[] xj=xj+[j] xj=xj+[m[j][2]] xj=xj+[m[k][2]] xj=xj+[m[l][2]] xj=xj+[p] x=x+[xj] j=j+1 f1.write(','.join(x)) f.close() f1.close() It say line 33,xj=xj+m[l][2] has index error,list out of range please help thanks in advance

Read the article
How do I use a custom select statement in Hibernate using the HibernateDaoSupport class

- by Bill Leeper

I am trying to write a custom select statement in Hibernate using the getHibernateTemplate() method. I am having problems with the resulting mapping. Example Code: List<User> users = getHibernateTemplate().find("Select user, sysdate as latestPost from User as user"); for (User user : users) { assertNotNull(users.name); } The goal of the above line is to eventually have a join where I get the max(date) of posts made by the user. The problem I am having is that the resulting users list is not a list of User objects and I get a class cast exception. Hopefully that is enough code. It is a greatly simplified version of my problem and a combination of snippets from various parts of my application.

Read the article
Even lighter than SQLite

- by Richard Fabian

I've been looking for a C++ SQL library implementation that is simple to hook in like SQLite, but faster and smaller. My projects are in games development and there's definitely a cutoff point between needing to pass the ACID test and wanting some extreme performance. I'm willing to move away from SQL string style queries, allowing it to be code driven, but I haven't found anything out there that provides SQL like flexibility while also preferring performance over the ACID test. I don't want to go reinventing the wheel, and the idea of implementing an SQL library on my own is quite daunting, even if it's only going to be simple subset of all the calls you could make. I need the basic commands (SELECT, MODIFY, DELETE, INSERT, with JOIN, and WHERE), not data operations (like sorting, min, max, count) and don't need the database to be atomic, or even enforce consistency (I can use a real SQL service while I'm testing and debugging).

Read the article
executorservice to read data from database in chuncks and run process on them

- by TazMan

I'm trying to write a process that would read data from a database and upload it onto a cloud datastore. How can I decide the partition strategy of the data? I want to query the table in chunks and process each chunk in 10 threads. Each thread basically will send the data to an individual node on a 10 node cluster on the cloud.. Where in the below multi threading code will the dataquery to extract and send 10 concurrent requests for uploading data to cloud would be? public class Caller { public static void main(String[] args) { ExecutorService executor = Executors.newFixedThreadPool(10); for (int i = 0; i < 10; i++) { Runnable worker = new DomainCDCProcessor(i); executor.execute(worker); } executor.shutdown(); while (!executor.isTerminated()) { } System.out.println("Finished all threads"); } }

Read the article
linq with Include and criteria

- by JMarsch

How would I translate this into LINQ? Say I have A parent table (Say, customers), and child (addresses). I want to return all of the Parents who have addresses in California, and just the california address. (but I want to do it in LINQ and get an object graph of Entity objects) Here's the old fashioned way: SELECT c.blah, a.blah FROM Customer c INNER JOIN Address a on c.CustomerId = a.CustomerId where a.State = 'CA' The problem I'm having with LINQ is that i need an object graph of concrete Entity types (and it can't be lazy loaded. Here's what I've tried so far: // this one doesn't filter the addresses -- I get the right customers, but I get all of their addresses, and not just the CA address object. from c in Customer.Include(c = c.Addresses) where c.Addresses.Any(a = a.State == "CA") select c // this one seems to work, but the Addresses collection on Customers is always null from c in Customer.Include(c = c.Addresses) from a in c.Addresses where a.State == "CA" select c; Any ideas?

Read the article
In Python, how do I search a flat file for the closest match to a particular numeric value?

- by kaushik

have file data of format 3.343445 1 3.54564 1 4.345535 1 2.453454 1 and so on upto 1000 lines and i have number given such as a=2.44443 for the given file i need to find the row number of the numbers in file which is most close to the given number "a" how can i do this i am presently doing by loading whole file into list and comparing each element and finding the closest one any other better faster method? my code:i need to ru this for different file each time around 20000 times so want a fast method p=os.path.join("c:/begpython/wavnk/",str(str(str(save_a[1]).replace('phone','text'))+'.pm')) x=open(p , 'r') for i in range(6): x.readline() j=0 o=[] for line in x: oj=str(str(line).rstrip('\n')).split(' ') o=o+[oj] j=j+1 temp=long(1232332) end_time=save_a[4] for i in range((j-1)): diff=float(o[i][0])-float(end_time) if diff<0: diff=diff*(-1) if temp>diff: temp=diff pm_row=i

Read the article
Query useing two databases in SQL Report Builder

- by user912447

I am new to SQL Server Report Builder 2.0 and I need to compare two different databases in one query. Basically I need to check if values from one database table exist in a different database's table. I know I can add multiple Datasources to my report and access each one with Subreports, but each DataSet that I create can only have one query in it. So how can I go about using one query to access two databases? Or if there is another way to somehow join my results from multiple DataSets, that would work too. Also, the databases are on the same server.

Read the article
Get latest record from second table left joined to first table

- by codef0rmer

I have a candidate table say candidates having only id field and i left joined profiles table to it. Table profiles has 2 fields namely, candidate_id & name. e.g. Table candidates: id 1 2 and Table `profiles`: candidate_id name 1 Foobar 1 Foobar2 2 Foobar3 i want the latest name of a candidate in a single query which is given below: SELECT C.id, P.name FROM candidates C LEFT JOIN profiles P ON P.candidate_id = C.id GROUP BY C.id ORDER BY P.name; But this query returns: 1 Foobar 2 Foobar3 Instead of: 1 Foobar2 2 Foobar3

Read the article
Killing a subprocess including its children from python

- by user316664

Hi, I'm using the subprocess module on python 2.5 to spawn a java program (the selenium server, to be precise) as follows: import os import subprocess display = 0 log_file_path = "/tmp/selenium_log.txt" selenium_port = 4455 selenium_folder_path = "/wherever/selenium/lies" env = os.environ env["DISPLAY"] = ":%d.0" % display command = ["java", "-server", "-jar", 'selenium-server.jar', "-port %d" % selenium_port] log = open(log_file_path, 'a') comm = ' '.join(command) selenium_server_process = subprocess.Popen(comm, cwd=selenium_folder_path, stdout=log, stderr=log, env=env, shell=True) This process is supposed to get killed once the automated tests are finished. I'm using os.kill to do this: os.killpg(selenium_server_process.pid, signal.SIGTERM) selenium_server_process.wait() This does not work. The reason is that the shell subprocess spawns another processfor the java program, and the pid of that process is unknown to my python code. I've tried killing the process group with os.killpg, but that kills also the python process which runs this code in the first place. Setting shell to false, thus avoiding the java program to run inside a shell environment, is also out of the question, due to other reasons. Does anyone have an idea how I can kill the shell and any other processes generated by it? Cheers, Ulas

Read the article
Object for storing strings geted from prints

- by evg

class MyWriter: def __init__(self, stdout): self.stdout = stdout self.dumps = [] def write(self, text): self.stdout.write(smart_unicode(text).encode('cp1251')) self.dumps.append(text) def close(self): self.stdout.close() writer = MyWriter(sys.stdout) save = sys.stdout sys.stdout = writer I use self.dumps list to store geted data from prints. Is it exists more convinient object for storing string lines in memory? ideally i want dump it to one big string. I can get it like this "\n".join(self.dumps) from code above. Mb it's better to just concat strings - self.dumps += text ?

Read the article
Wrapping multiple images inside a <div> in jQuery

- by alekone

hello! I need to find all the images inside a div, and wrap a div around them. here's the code I come up with, but that's not working! any ideas? thanks! jQuery(function(){ my_selection = []; $('.post').each(function(i){ if ($(this).find('img').length > 1 ){ my_selection.push(['.post:eq(' + i + ')']); } }); $( my_selection.join(',')).wrapAll('<div class="wrapper"></div>'); });

Read the article
How to delete files with a Python script from a FTP server which are older than 7 days?

- by Tom

Hello I would like to write a Python script which allows me to delete files from a FTP Server after they have reached a certain age. I prepared the scipt below but it throws the error message: WindowsError: [Error 3] The system cannot find the path specified: '/test123/*.*' Do someone have an idea how to resolve this issue? Thank you in advance! import os, time from ftplib import FTP ftp = FTP('127.0.0.1') print "Automated FTP Maintainance" print 'Logging in.' ftp.login('admin', 'admin') # This is the directory that we want to go to directory = 'test123' print 'Changing to:' + directory ftp.cwd(directory) files = ftp.retrlines('LIST') print 'List of Files:' + files # ftp.remove('LIST') #------------------------------------------- now = time.time() for f in os.listdir(directory): if os.stat(f).st_mtime < now - 7 * 86400: if os.directory.isfile(f): os.remove(os.directory.join(directory, f)) #except: #exit ("Cannot delete files") #------------------------------------------- print 'Closing FTP connection' ftp.close()

Read the article
In a SQL XDL File, how do I read the waitresource attribute on process nodes which are deadlocking?

- by skimania

On SQL Server 2005, I'm getting a deadlock when updating two different keys in the same table. note from below that these two waitresources have the same beginning part, but different ending parts. waitresource="KEY: 6:72057594090487808 (d900ed5a6cc6)" and waitresource="KEY: 6:72057594090487808 (d900fb5261bb)" These two keys are locking, and I need to figure out why. The question: If the values in parenthesis are different, why are the first half of the key's the same? <deadlock-list> <deadlock victim="processffffffff8f5863e8"> <process-list> <process id="processaf02f8" taskpriority="0" logused="0" waitresource="KEY: 6:72057594090487808 (d900fb5261bb)" waittime="2281" ownerId="1370264705" transactionname="user_transaction" lasttranstarted="2010-06-17T00:35:25.483" XDES="0x69453a70" lockMode="U" schedulerid="3" kpid="7624" status="suspended" spid="339" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-06-17T00:35:25.483" lastbatchcompleted="2010-06-17T00:35:25.483" clientapp=".Net SqlClient Data Provider" hostname="RISKBBG_VM" hostpid="5848" loginname="RiskOpt" isolationlevel="read committed (2)" xactid="1370264705" currentdb="6" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056"> <executionStack> <frame procname="MKP_RISKDB.dbo.MarketDataCurrentRtUpload" line="14" stmtstart="840" stmtend="1220" sqlhandle="0x03000600005f9d24c8878f00849d00000100000000000000"> UPDATE c WITH (ROWLOCK) SET LastUpdate = t.LastUpdate, Value = t.Value, Source = t.Source FROM MarketDataCurrent c INNER JOIN #TEMPTABLE2 t ON c.MDID = t.mdid; -- Insert new MDID </frame> <frame procname="adhoc" line="1" sqlhandle="0x010006004a58132228bf8d73000000000000000000000000"> MarketDataCurrentBlbgRtUpload </frame> </executionStack> <inputbuf> MarketDataCurrentBlbgRtUpload </inputbuf> </process> <process id="processffffffff8f5863e8" taskpriority="0" logused="0" waitresource="KEY: 6:72057594090487808 (d900ed5a6cc6)" waittime="2281" ownerId="1370264646" transactionname="user_transaction" lasttranstarted="2010-06-17T00:35:25.450" XDES="0x1cb72be8" lockMode="U" schedulerid="5" kpid="1880" status="suspended" spid="287" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-06-17T00:35:25.450" lastbatchcompleted="2010-06-17T00:35:25.450" clientapp=".Net SqlClient Data Provider" hostname="RISKAPPS_VM" hostpid="1424" loginname="RiskOpt" isolationlevel="read committed (2)" xactid="1370264646" currentdb="6" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056"> <executionStack> <frame procname="MKP_RISKDB.dbo.MarketDataCurrent_BulkUpload" line="28" stmtstart="1062" stmtend="1720" sqlhandle="0x03000600a28e5e4ef4fd8e00849d00000100000000000000"> UPDATE c WITH (ROWLOCK) SET LastUpdate = getdate(), Value = t.Value, Source = @source FROM MarketDataCurrent c INNER JOIN #MDTUP t ON c.MDID = t.mdid WHERE c.lastUpdate < @updateTime and c.mdid not in (select mdid from MarketData where BloombergTicker is not null and PriceSource like 'Live.%') and c.value <> t.value </frame> <frame procname="adhoc" line="1" stmtstart="88" sqlhandle="0x01000600c1653d0598706ca7000000000000000000000000"> exec MarketDataCurrent_BulkUpload @clearBefore, @source </frame> <frame procname="unknown" line="1" sqlhandle="0x000000000000000000000000000000000000000000000000"> unknown </frame> </executionStack> <inputbuf> (@clearBefore datetime,@source nvarchar(10))exec MarketDataCurrent_BulkUpload @clearBefore, @source </inputbuf> </process> </process-list> <resource-list> <keylock hobtid="72057594090487808" dbid="6" objectname="MKP_RISKDB.dbo.MarketDataCurrent" indexname="PK_MarketDataCurrent" id="lock64ac7940" mode="U" associatedObjectId="72057594090487808"> <owner-list> <owner id="processffffffff8f5863e8" mode="U"/> </owner-list> <waiter-list> <waiter id="processaf02f8" mode="U" requestType="wait"/> </waiter-list> </keylock> <keylock hobtid="72057594090487808" dbid="6" objectname="MKP_RISKDB.dbo.MarketDataCurrent" indexname="PK_MarketDataCurrent" id="lockffffffffb8d2dd40" mode="U" associatedObjectId="72057594090487808"> <owner-list> <owner id="processaf02f8" mode="U"/> </owner-list> <waiter-list> <waiter id="processffffffff8f5863e8" mode="U" requestType="wait"/> </waiter-list> </keylock> </resource-list> </deadlock> </deadlock-list>

Read the article
iDevice for Dummies: Can a device be assigned multiple provisions (Personal/Enterprise)?

- by Alex

Hi guys, Is it possible to assign multiple provisions to an iDevice? To be honest, I'm not sure if I'm using the correct terminology, but basically, I'm developing an iPad app for a company and I've only been testing it in the simulator because I don't have a registration to the developer program and they haven't setup their enterprise registration yet either. And I'm sure you all know how limited the simulator is... I don't really care about the $99 it costs to join, but what I'm worried about is having my iDevices locked permanently to my personal registration and unable to switch back and forth to the enterprise registration. I'd appreciate it if someone can explain to me how the registrations work. And keep in mind, I'm a dummy. :) Thanks in advance!

Read the article
Is it possible to use os.walk over SSH?

- by LeoB

I'm new to Python so forgive me if this is basic, I've searched but can't find an answer. I'm trying to convert a Perl script into Python (3.x) which connects to a remote server and copies the files in a given directory to the local machine. Integrity of the transfer is paramount and there are several steps built-in to ensure a complete and accurate transfer. The first step is to get a complete listing of the files to be passed to rsync. The Perl script has the following lines to accomplish this: @dir_list = `ssh user@host 'find $remote_dir -type f -exec /bin/dirname {} \\;'`; @file_list = `ssh user@host 'find $remote_dir -type f -exec /bin/basename {} \\;'`; The two lists are then joined to create $full_list. Rather than open two separate ssh instances I'd like to open one and use os.walk to get the information using: for remdirname, remdirnames, remfilesnames in os.walk(remotedir): for remfilename in remfilesnames: remfulllist.append(os.path.join(remdirname, remfilename)) Thank you for any help you can provide.

Read the article
Best way to keep a .net client app updated with status of another application

- by rwmnau

I have a Windows service that's running all the time, and takes some action every 15 minutes. I also have a client WinForms app that displays some information about what the service is doing. I'd like the forms application to keep itself updated with a recent status, but I'm not sure if polling every second is a good move performance-wise. When it starts, my Windows Service opens a WCF named pipe to receive queries (from my client form) Every second, a timer on the winform sends a query to the pipe, and then displays the results. If the pipe isn't there, the form displays that the service isn't running. Is that the best way to do this? If my service opens the pipe when it starts, will it always stay open (until I close it or my service stops)? In addition to polling the service, maybe there's some way for the service to notify any watching applications of certain events, like starting and stopping processing? That way, I could poll less, since I'd presumably know about big events already, and would only be polling for progress. Anything that I'm missing?

Read the article
Rails find by *all* associated tags.id in

- by mark

Hi Say I have a model Taggable has_many tags, how may I find all taggables by their associated tag's taggable_id field? Taggable.find(:all, :joins => :tags, :conditions => {:tags => {:taggable_id => [1,2,3]}}) results in this: SELECT `taggables`.* FROM `taggables` INNER JOIN `tags` ON tags.taggable_id = taggables.id WHERE (`tag`.`taggable_id` IN (1,2,3)) The syntax is incredible but does not fit my needs in that the resulting sql returns any taggable that has any, some or all of the tags. How can I find taggables with related tags of field taggable_id valued 1, 2 and 3? Thanks for any advice. :)

Read the article
How to save to two tables using one SQLAlchemy model

- by Oatman

I have an SQLAlchemy ORM class, linked to MySQL, which works great at saving the data I need down to the underlying table. However, I would like to also save the identical data to a second archive table. Here's some psudocode to try and explain what I mean my_data = Data() #An ORM Class my_data.name = "foo" #This saves just to the 'data' table session.add(my_data) #This will save it to the identical 'backup_data' table my_data_archive = my_data my_data_archive.__tablename__ = 'backup_data' session.add(my_data_archive) #And commits them both session.commit() Just a heads up, I am not interested in mapping a class to a JOIN, as in: http://www.sqlalchemy.org/docs/05/mappers.html#mapping-a-class-against-multiple-tables

Read the article
Incorrect value for UNIQUE_CONSTRAINT_NAME in REFERENTIAL_CONSTRAINTS

- by van

I am listing all FK constraints for a given table using INFORMATION_SCHEMA set of views with the following query: SELECT X.UNIQUE_CONSTRAINT_NAME, "C".*, "X".* FROM "INFORMATION_SCHEMA"."KEY_COLUMN_USAGE" AS "C" INNER JOIN "INFORMATION_SCHEMA"."REFERENTIAL_CONSTRAINTS" AS "X" ON "C"."CONSTRAINT_NAME" = "X"."CONSTRAINT_NAME" AND "C"."TABLE_NAME" = 'MY_TABLE' AND "C"."TABLE_SCHEMA" = 'MY_SCHEMA' Everything works perfectly well, but for one particular constraint the value of UNIQUE_CONSTRAINT_NAME column is wrong, and I need it in order to find additional information from the referenced Column. Basically, for most of the rows the UNIQUE_CONSTRAINT_NAME contains the name of the unique constraint (or PK) in the referenced table, but for one particular FK it is the name of some other unique constraint. I dropped and re-created the FK - did not help. My assumption is that the meta-data is somehow screwed. Is there a way to rebuild the meta data so that the INFORMATION_SCHEMA views would actually show the correct data?

Read the article
Delete on a very deep tree

- by Kathoz

I am building a suffix trie (unfortunately, no time to properly implement a suffix tree) for a 10 character set. The strings I wish to parse are going to be rather long (up to 1M characters). The tree is constructed without any problems, however, I run into some when I try to free the memory after being done with it. In particularly, if I set up my constructor and destructor to be as such (where CNode.child is a pointer to an array of 10 pointers to other CNodes, and count is a simple unsigned int): CNode::CNode(){ count = 0; child = new CNode* [10]; memset(child, 0, sizeof(CNode*) * 10); } CNode::~CNode(){ for (int i=0; i<10; i++) delete child[i]; } I get a stack overflow when trying to delete the root node. I might be wrong, but I am fairly certain that this is due to too many destructor calls (each destructor calls up to 10 other destructors). I know this is suboptimal both space, and time-wise, however, this is supposed to be a quick-and-dirty solution to a the repeated substring problem. tl;dr: how would one go about freeing the memory occupied by a very deep tree? Thank you for your time.

Read the article

< Previous Page | 369 370 371 372 373 374 375 376 377 378 379 380 | Next Page >