python time - Page 752 - Developer IT

Which technology is best suited to store and query a huge readonly graph?

- by asmaier

I have a huge directed graph: It consists of 1.6 million nodes and 30 million edges. I want the users to be able to find all the shortest connections (including incoming and outgoing edges) between two nodes of the graph (via a web interface). At the moment I have stored the graph in a PostgreSQL database. But that solution is not very efficient and elegant, I basically need to store all the edges of the graph twice (see my question PostgreSQL: How to optimize my database for storing and querying a huge graph). It was suggested to me to use a GraphDB like neo4j or AllegroGraph. However the free version of AllegroGraph is limited to 50 million nodes and also has a very high-level API (RDF), which seems too powerful and complex for my problem. Neo4j on the other hand has only a very low level API (and the python interface is not mature yet). Both of them seem to be more suited for problems, where nodes and edges are frequently added or removed to a graph. For a simple search on a graph, these GraphDBs seem to be too complex. One idea I had would be to "misuse" a search engine like Lucene for the job, since I'm basically only searching connections in a graph. Another idea would be, to have a server process, storing the whole graph (500MB to 1GB) in memory. The clients could then query the server process and could transverse the graph very quickly, since the graph is stored in memory. Is there an easy possibility to write such a server (preferably in Python) using some existing framework? Which technology would you use to store and query such a huge readonly graph?

Read the article

Change Data Capture or Change Tracking - Same as Traditional Audit Trail Table?

- by HardCode

Before I delve into the abyss of Microsoft documentation any deeper, I'd like to know if someone experienced with Change Data Capture and Change Tracking know if one or both of these can be used to replace the traditional ... "Audit trail table copy of the 'real table' (all of the fields of the original table, plus date/time, user ID, and DML action field) inserted into by Triggers" ... setup for a database table audit trail, where the trigger populates the audit trail table (which is all manual work). The MSDN overview documentation explains at a high level what Change Data Capture and Change Tracking are, but it isn't clear enough to me, and doesn't state outright, that these tools can be used to replace the traditional audit trail tables we've made so often. Can someone with any experience using Change Data Capture and Change Tracking save me a lot of time, or confirm that I am spending time looking at the right tool? The critical part of our audit trail is capturing all changes to a table's fields (on INSERT, UPDATE, DELETE), when it happened, and who did it. These changes are commonly provided to an end user chronologically via an audit trail report. Which is another question ... Change Data Capture or Change Tracking is the solution, I'd assume that this data can be queried just like data from a normal table? EDIT: I need a permanent audit trail, irregardless of time. I see that Change Data Capture has to do with the transaction logs, so this sounds finite to me.

Read the article

CSS Parser - Insert mtimes

- by brad

What command line tool can I use to automatically insert mtimes into urls in my css files for the purposes of breaking the cache? /* before */ .example { background: url(example.jpg); } /* after */ .example { background: url(example.jpg?1271298451); } Also, I would like this tool to spit out the latest mtime as the css files mtime. (If the css file is still cached then the new urls will not get to the client.) In searching the web, I have found very few tools that can do this. I am even considering rolling my own, but have found very little in the way of css parsers that are actively maintained. A candidate should be: fast (I don't want to wait 30 seconds on deployment) command line accessible (something like "cat foo.css bar.css | cssmtime out.css") What I've found so Far yui compressor - initially I thought I would extend the yui compressor to do this, but found that it is implemented as a bunch of regex's and not a parser. csstidy - last release was in 2007 and development has been suspended, but does have an option for inserting mtimes (also written in php, something I have no experience in) cssutils - python sac implementation - seems to be actively maintained, but also seems like overkill for my needs. Also, written in python which I have experience with csspool - ruby sac implementation - I don't know much ruby, but would like to learn other sac implementations - There are several java implementations, and a c implementation neither of which I know much about What's your experience? Have you used any of these libraries? Was the experience positive? Would you recommend I go with them for my purposes?

Read the article

Jquery Internet Explorer 8 compatibility issue, does not load data unless history is deleted...

- by Scarface

Hey guys, I have a weird problem. I have an update system that refreshes data on a time interval. It works well in all browsers except internet explorer 8. The problem is that once it loads the data, it does not matter if the data updates further, it will not update the data visually until the internet history is cleared. I am not using any cookies server-side...Anyone ever encounter something like this? Here is my javascript, thanks for any assistance in advance function prepare(response) { var d = new Date(); count++; d.setTime(response.time*1000); var mytime = d.getHours()+':'+d.getMinutes()+':'+d.getSeconds(); var string = '<li class="shoutbox-list" id="list-'+count+'">' + '<a href="statistics.php?user='+response.user+'">'+response.user+'</a>' + ' '+mytime+' ' + ''+response.message+'' +'</li>'; return string; } function refresh() { $.getJSON(files+"shoutbox.php?action=view&time="+lastTime+"&topic_id="+topic_id, function(json) { if(json.length) { for(i=0; i < json.length; i++) { $('#daddy-shoutbox-list').prepend(prepare(json[i])); $('#list-' + count).fadeIn(1500); } var j = i-1; lastTime = json[j].time; } //alert(lastTime); }); timeoutID = setTimeout(refresh, 3000); } $(document).ready(function() { var options = { dataType: 'json', beforeSubmit: validate, success: function(response, status){ if (response.error=='success'){ success(response, status); } else { $.prompt(response.error); } } }; $('#daddy-shoutbox-form').ajaxForm(options); timeoutID = setTimeout(refresh, 100); });

Read the article

PHP Performance Metrics

- by bigstylee

I am currently developing a PHP MVC Framework for a personal project. While I am developing the framework I am interested to see any notable performance by implementing different techniques for optimization. I have implemented a crude BenchMark class that logs mircotime. The problem is I have no frame of reference for execution times. I am very near the beginnig of this project with a database connection and a few queries but no output (bar some debugging text and BenchMark log). I have a current execution time of 0.01917 seconds. I was expecting this to be lower but as I said before I have no frame of reference. I appreciate there are many variables to take into account when juding performance but I am hoping to find some sort of metric to a) techniques to measure performance for example requests per second and b) compare results for example; how a "moderately" sized PHP application on a "standard" webserver will perform. I appreciate "moderately" and "standard" are very subjective words so perhaps a table of known execution times for a particular application (eg StackOverFlow's executing time). What are other techniques of measuring performance are there other than execution time? When looking at MVC Framework Performance Comparisom it talks about Requests Per Second (RPS). How is this calculated? I am guessing with my current execution time of 0.01917 seconds can handle 52 RPS (= 1 / 0.01917 ). This seems to be significantly lower than that quoted on the graph especially when you consider my current limited funcitonality.

Read the article

Trouble using latex in Matplotlib / Scipy etc.

- by ajhall

I'm having some issues with my first attempts at using matplotlib and scipy to make some scatter plots of my data (too many variables, trying to see many things at once). Here's some code of mine that is working fairly well... import numpy from scipy import * import pylab from matplotlib import * import h5py FileID = h5py.File('3DiPVDplot1.mat','r') # (to view the contents of: list(FileID) ) group = FileID['/'] CurrentsArray = group['Currents'].value IvIIIarray = group['IvIII'].value PFarray = group['PF'].value growthTarray = group['growthT'].value fig = pylab.figure() ax = fig.add_subplot(111) cax = ax.scatter(IvIIIarray, growthTarray, PFarray, CurrentsArray, alpha=0.75) cbar = fig.colorbar(cax) ax.set_xlabel('Cu / III') ax.set_ylabel('Growth T') ax.grid(True) pylab.show() I tried to change the code to include latex fonts and interpreting, none of it seems to work for me, however. Here's an example attempt that didn't work: import numpy from scipy import * import pylab from matplotlib import * import h5py rc('text', usetex=True) rc('font', family='serif') FileID = h5py.File('3DiPVDplot1.mat','r') # (to view the contents of: list(FileID) ) group = FileID['/'] CurrentsArray = group['Currents'].value IvIIIarray = group['IvIII'].value PFarray = group['PF'].value growthTarray = group['growthT'].value fig = pylab.figure() ax = fig.add_subplot(111) cax = ax.scatter(IvIIIarray, growthTarray, PFarray, CurrentsArray, alpha=0.75) cbar = fig.colorbar(cax) ax.set_xlabel(r'Cu / III') ax.set_ylabel(r'Growth T') ax.grid(True) pylab.show() I'm using fink installed python26 with corresponding packages for scipy matplotlib etc. I've been using iPython and manual work instead of scripts in python. Since I'm completely new to python and scipy, I'm sure I'm making some stupid simple mistakes. Please enlighten me! I greatly appreciate the help!

Read the article

Functional way to get a matrix from text

- by Elazar Leibovich

I'm trying to solve some Google Code Jam problems, where an input matrix is typically given in this form: 2 3 #matrix dimensions 1 2 3 4 5 6 7 8 9 # all 3 elements in the first row 2 3 4 5 6 7 8 9 0 # each element is composed of three integers where each element of the matrix is composed of, say, three integers. So this example should be converted to #!scala Array( Array(A(1,2,3),A(4,5,6),A(7,8,9), Array(A(2,3,4),A(5,6,7),A(8,9,0), ) An imperative solution would be of the form #!python input = """2 3 1 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 0 """ lines = input.split('\n') print lines[0] m,n = (int(x) for x in lines[0].split()) array = [] row = [] A = [] for line in lines[1:]: for elt in line.split(): A.append(elt) if len(A)== 3: row.append(A) A = [] array.append(row) row = [] from pprint import pprint pprint(array) A functional solution I've thought of is #!scala def splitList[A](l:List[A],i:Int):List[List[A]] = { if (l.isEmpty) return List[List[A]]() val (head,tail) = l.splitAt(i) return head :: splitList(tail,i) } def readMatrix(src:Iterator[String]):Array[Array[TrafficLight]] = { val Array(x,y) = src.next.split(" +").map(_.trim.toInt) val mat = src.take(x).toList.map(_.split(" "). map(_.trim.toInt)). map(a => splitList(a.toList,3). map(b => TrafficLight(b(0),b(1),b(2)) ).toArray ).toArray return mat } But I really feel it's the wrong way to go because: I'm using the functional List structure for each line, and then convert it to an array. The whole code seems much less efficeint I find it longer less elegant and much less readable than the python solution. It is harder to which of the map functions operates on what, as they all use the same semantics. What is the right functional way to do that?

Read the article

Django and Google App Engine Helper not finding the ipaddr module.

- by Phil

I'm trying to get Django running on GAE using this tutorial. When I run python manage.py runserver I get the stacktrace below. I'm new to both django and python so I don't know what my next steps are (This is Ubuntu Jaunty btw). It seems django isn't finding the GAE module ipaddr which comes with SDK 1.3.1. How do I get django to find this module? /home/username/bin/google_appengine/google/appengine/api/datastore_file_stub.py:40: DeprecationWarning: the md5 module is deprecated; use hashlib instead import md5 /home/username/bin/google_appengine/google/appengine/api/memcache/__init__.py:31: DeprecationWarning: the sha module is deprecated; use the hashlib module instead import sha Traceback (most recent call last): File "manage.py", line 18, in <module> InstallAppengineHelperForDjango() File "/home/username/Development/GAE/myapp/appengine_django/__init__.py", line 543, in InstallAppengineHelperForDjango InstallDjangoModuleReplacements() File "/home/username/Development/GAE/myapp/appengine_django/__init__.py", line 260, in InstallDjangoModuleReplacements import django.db File "/home/username/Development/GAE/myapp/django/db/__init__.py", line 57, in <module> 'TIME_ZONE': settings.TIME_ZONE, File "/home/username/Development/GAE/myapp/appengine_django/db/base.py", line 117, in __init__ self._setup_stubs() File "/home/username/Development/GAE/myapp/appengine_django/db/base.py", line 128, in _setup_stubs from google.appengine.tools import dev_appserver_main File "/home/username/bin/google_appengine/google/appengine/tools/dev_appserver_main.py", line 82, in <module> from google.appengine.tools import appcfg File "/home/username/bin/google_appengine/google/appengine/tools/appcfg.py", line 53, in <module> from google.appengine.api import dosinfo File "/home/username/bin/google_appengine/google/appengine/api/dosinfo.py", line 25, in <module> import ipaddr ImportError: No module named ipaddr

Read the article

Neural Network settings for fast training

- by danpalmer

I am creating a tool for predicting the time and cost of software projects based on past data. The tool uses a neural network to do this and so far, the results are promising, but I think I can do a lot more optimisation just by changing the properties of the network. There don't seem to be any rules or even many best-practices when it comes to these settings so if anyone with experience could help me I would greatly appreciate it. The input data is made up of a series of integers that could go up as high as the user wants to go, but most will be under 100,000 I would have thought. Some will be as low as 1. They are details like number of people on a project and the cost of a project, as well as details about database entities and use cases. There are 10 inputs in total and 2 outputs (the time and cost). I am using Resilient Propagation to train the network. Currently it has: 10 input nodes, 1 hidden layer with 5 nodes and 2 output nodes. I am training to get under a 5% error rate. The algorithm must run on a webserver so I have put in a measure to stop training when it looks like it isn't going anywhere. This is set to 10,000 training iterations. Currently, when I try to train it with some data that is a bit varied, but well within the limits of what we expect users to put into it, it takes a long time to train, hitting the 10,000 iteration limit over and over again. This is the first time I have used a neural network and I don't really know what to expect. If you could give me some hints on what sort of settings I should be using for the network and for the iteration limit I would greatly appreciate it. Thank you!

Read the article

Why doesn't Perl file glob() work outside of a loop in scalar context?

- by Rob

According to the Perl documentation on file globbing, the <*> operator or glob() function, when used in a scalar context, should iterate through the list of files matching the specified pattern, returning the next file name each time it is called or undef when there are no more files. But, the iterating process only seems to work from within a loop. If it isn't in a loop, then it seems to start over immediately before all values have been read. From the Perl docs: In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted. http://perldoc.perl.org/functions/glob.html However, in scalar context the operator returns the next value each time it's called, or undef when the list has run out. http://perldoc.perl.org/perlop.html#I/O-Operators Example code: use warnings; use strict; my $filename; # in scalar context, <*> should return the next file name # each time it is called or undef when the list has run out $filename = <*>; print "$filename\n"; $filename = <*>; # doesn't work as documented, starts over and print "$filename\n"; # always returns the same file name $filename = <*>; print "$filename\n"; print "\n"; print "$filename\n" while $filename = <*>; # works in a loop, returns next file # each time it is called In a directory with 3 files...file1.txt, file2.txt, and file3.txt, the above code will output: file1.txt file1.txt file1.txt file1.txt file2.txt file3.txt Note: The actual perl script should be outside the test directory, or you will see the file name of the script in the output as well. Am I doing something wrong here, or is this how it is supposed to work?

Read the article

PHP database simulation

- by Emdiesse

I have a PHP script that works by calling items from a database based upon the time they were placed in there and it deletes them if they are older than 5 minutes. Basically, I want to now simulate what would happen if this database was being updated regularly. So I was considering sticking in some code that loads an XML file then goes through and parses that into the database based upon the time data located within a node of the xml data... but the problem there is I want it to continually loop through an enter this data so it'll never actually run the other processes So I was thinking of having another PHP script do that that could do this independantly of the php script that is going to display this data... In theory: I am looking to have a button that I can press and it will then run some php code to load up an XML file from a directory on my web server and then iterate though the data sending the data, to a database, based upon the time within a node in the PHP script and when the script was first called So back to my page that displayed the data... if I continually hit refresh it will contain different results each time because data is being added by the other process and this php script removes the older data when it is refreshed Any information on this? Is there a way I can silently, and safely, run a php script without it being loaded into a browser... like a thread!?

Read the article

Why do I have to unserialize this serialized value twice? (wordpress/bbpress maybe_serialize)

- by 55skidoo

What am I doing wrong here? I'm serializing a value, storing it in a database (table bb_meta), retrieving it... OK so far... but then I have to unserialize it twice. Shouldn't I be able to just unserialize once? This seems to work, but I'm wondering what point about serialization I'm missing here. //check database to see if user has ever visited before. $querystring = $bbdb->prepare( "SELECT `meta_value` FROM `$bbdb->meta` WHERE `object_type` = %s AND `object_id` = %s AND `meta_key` = %s LIMIT 1", $bbtype, $bb_this_thread, $bbuser ); $bb_last_visits = $bbdb->get_row($querystring, OBJECT); //if $bb_last_visits is empty, add time() as the metavalue using bb_update_meta if (empty($bb_last_visits)) { $first_visit = time(); echo 'serialized first visit: ' . $bb_this_visit_time_serialized = serialize(array($bb_this_thread => $first_visit)); bb_update_meta( $bb_this_thread, $bbuser, $bb_this_visit_time_serialized, $bbtype ); //add to database, bb_meta table echo '$bb_last_visits was empty. Setting first visit time as ' . $bb_this_visit_time_serialized . ' '; } else { //else, test by unserializing the data for use. echo 'last visit time already set: '; echo $bb_last_visits->meta_value; echo ' '; //fatal error - echo 'unserialized: ' . $bb_last_visits_unserialized = unserialize($bb_last_visits[0]->meta_value); echo ' '; echo 'unserialize: ' . $unserialized_visits = unserialize($bb_last_visits->meta_value); echo ' '; echo 'hmm, need to unserialize again??: '; echo $unserialized_unserialized_visits = unserialize($unserialized_visits); echo ' '; echo 'hey look, it\'s an array value I can finally use now. phew: ' . $unserialized_unserialized_visits[$bb_this_thread]; }

Read the article

Why does the page posts take so long?

- by Olle

Hi! I am having some problems with some page post backs that take a loooong time to execute. If I do a "appcmd list requests" I can get something like this: REQUEST "79000001800004e3" (url:POST /dir/file.aspx, time:87219 msec, client:xxx.xxx.xxx.xxx, stage:ExecuteRequestHandler, module:ManagedPipelineHandler) REQUEST "8600000080002f82" (url:POST /dir/file.aspx, time:61391 msec, client:xxx.xxx.xxx.xxx, stage:AcquireRequestState, module:Session) REQUEST "5e00010280000420" (url:POST /dir/file.aspx, time:21047 msec, client:xxx.xxx.xxx.xxx, stage:AcquireRequestState, module:Session) It's one particular file that causes the problem (dir/file.aspx in this case). It comes from the same IP-adress. And the first on is from ManagedPipelineHandler module and the two after that from Session module. I do not have any details about the web browser, or anything more about the client for that matter. I have looked for sql dead locks and did not find any. There are no long running sql queries at all. Do you have any idea of what can be the problem? Regards.

Read the article

Why do I have to unserialize this serialized value twice?

- by 55skidoo

What am I doing wrong here? I'm serializing a value, storing it in a database (table bb_meta), retrieving it... OK so far... but then I have to unserialize it twice. Shouldn't I be able to just unserialize once? This seems to work, but I'm wondering what point about serialization I'm missing here. //check database to see if user has ever visited before. $querystring = $bbdb->prepare( "SELECT `meta_value` FROM `$bbdb->meta` WHERE `object_type` = %s AND `object_id` = %s AND `meta_key` = %s LIMIT 1", $bbtype, $bb_this_thread, $bbuser ); $bb_last_visits = $bbdb->get_row($querystring, OBJECT); //if $bb_last_visits is empty, add time() as the metavalue using bb_update_meta if (empty($bb_last_visits)) { $first_visit = time(); echo 'serialized first visit: ' . $bb_this_visit_time_serialized = serialize(array($bb_this_thread => $first_visit)); bb_update_meta( $bb_this_thread, $bbuser, $bb_this_visit_time_serialized, $bbtype ); //add to database, bb_meta table echo '$bb_last_visits was empty. Setting first visit time as ' . $bb_this_visit_time_serialized . ' '; } else { //else, test by unserializing the data for use. echo 'last visit time already set: '; echo $bb_last_visits->meta_value; echo ' '; //fatal error - echo 'unserialized: ' . $bb_last_visits_unserialized = unserialize($bb_last_visits[0]->meta_value); echo ' '; echo 'unserialize: ' . $unserialized_visits = unserialize($bb_last_visits->meta_value); echo ' '; echo 'hmm, need to unserialize again??: '; echo $unserialized_unserialized_visits = unserialize($unserialized_visits); echo ' '; echo 'hey look, it\'s an array value I can finally use now. phew: ' . $unserialized_unserialized_visits[$bb_this_thread]; }

Read the article

Industry-style practices for increasing productivity in a small scientific environment

- by drachenfels

Hi, I work in a small, independent scientific lab in a university in the United States, and it has come to my notice that, compared with a lot of practices that are ostensibly followed in the industry, like daily checkout into a version control system, use of a single IDE/editor for all languages (like emacs), etc, we follow rather shoddy programming practices. So, I was thinking of getting together all my programs, scripts, etc, and building a streamlined environment to increase productivity. I'd like suggestions from people on Stack Overflow for the same. Here is my primary plan.: I use MATLAB, C and Python scripts, and I'd like to edit, compile them from a single editor, and ensure correct version control. (questions/things for which I'd like suggestions are in italics) 1] Install Cygwin, and get it to work well with Windows so I can use git or a similar version control system (is there a DVCS which can work directly from the windows CLI, so I can skip the Cygwin step?). 2] Set up emacs to work with C, Python, and MATLAB files, so I can edit and compile all three at once from a single editor (say, emacs) (I'm not very familiar with the emacs menu, but is there a way to set the path to the compiler for certain languages? I know I can Google this, but emacs documentation has proved very hard for me to read so far, so I'd appreciate it if someone told me in simple language) 3] Start checking in code at the end of each day or half-day so as to maintain a proper path of progress of my code (two questions), can you checkout files directly from emacs? is there a way to checkout LabVIEW files into a DVCS like git? Lastly, I'd like to apologize for the rather vague nature of the question, and hope I shall learn to ask better questions over time. I'd appreciate it if people gave their suggestions, though, and point to any resources which may help me learn.

Read the article

How Search Engine Bots Crawl Forums?

- by Waleed Eissa

If I have a forums site with a large number of threads, will the search engine bot crawl the whole site every time? Say I have over 1,000,000 threads in my site, will they get crawled every time the bot crawls my site? or how does it work? I want my website to be indexed but I don't want the bot to kill my website! In other words I don't want the bot to keep crawling the old threads again and again every time it crawls my website. Also, what about the pages crawled before? Will the bot request them every time it crawls my website to make sure they are still on the site? I'm asking this because I only link to the latest threads, i.e. there's a page that contains a list of all the latest threads, but I don't link to the older threads, they have to be explicitly requested by URL, e.g. http://www.mysite.com/showthread.aspx?threadid=7 , will this work to stop the bot from bringing my site down and consuming all my bandwidth? P.S. The site is still under development but I want to know in order to design the site so that search engine bots don't bring it down. Thanks

Read the article

Use LaTeX Listings to correctly detect and syntax highlight embedded code of a different language in

- by D W

I have scripts that have one-liners or sort scripts from other languages within them. How can I have LaTeX listings detect this and change the syntax formating language withing the script? This would be especially useful for awk withing bash I believe. Bash #!/bin/bash ... # usage message to catch bad input without invoking R ... # any bash pre-processing of input ... # etc echo "hello world" R --vanilla << EOF # Data on motor octane ratings for various gasoline blends x <- c(88.5,87.7,83.4,86.7,87.5,91.5,88.6,100.3, 95.6,93.3,94.7,91.1,91.0,94.2,87.5,89.9, 88.3,87.6,84.3,86.7,88.2,90.8,88.3,98.8, 94.2,92.7,93.2,91.0,90.3,93.4,88.5,90.1, 89.2,88.3,85.3,87.9,88.6,90.9,89.0,96.1, 93.3,91.8,92.3,90.4,90.1,93.0,88.7,89.9, 89.8,89.6,87.4,88.9,91.2,89.3,94.4,92.7, 91.8,91.6,90.4,91.1,92.6,89.8,90.6,91.1, 90.4,89.3,89.7,90.3,91.6,90.5,93.7,92.7, 92.2,92.2,91.2,91.0,92.2,90.0,90.7) x length(x) mean(x);var(x) stem(x) EOF perl -n -e ' @t = split(/\t/); %t2 = map { $_ => 1 } split(/,/,$t[1]); $t[1] = join(",",keys %t2); print join("\t",@t); ' knownGeneFromUCSC.txt awk -F'\t' '{ n = split($2, t, ","); _2 = x split(x, _) # use delete _ if supported for (i = 0; ++i <= n;) _[t[i]]++ || _2 = _2 ? _2 "," t[i] : t[i] $2 = _2 }-3' OFS='\t' infile Python #!/usr/local/bin/python print "Hello World" os.system(""" VAR=even; sed -i "s/$VAR/odd/" testfile; for i in `cat testfile` ; do echo $i; done; echo "now the tr command is removing the vowels"; cat testfile |tr 'aeiou' ' ' """)

Read the article

Merging k sorted linked lists - analysis

- by Kotti

Hi! I am thinking about different solutions for one problem. Assume we have K sorted linked lists and we are merging them into one. All these lists together have N elements. The well known solution is to use priority queue and pop / push first elements from every lists and I can understand why it takes O(N log K) time. But let's take a look at another approach. Suppose we have some MERGE_LISTS(LIST1, LIST2) procedure, that merges two sorted lists and it would take O(T1 + T2) time, where T1 and T2 stand for LIST1 and LIST2 sizes. What we do now generally means pairing these lists and merging them pair-by-pair (if the number is odd, last list, for example, could be ignored at first steps). This generally means we have to make the following "tree" of merge operations: N1, N2, N3... stand for LIST1, LIST2, LIST3 sizes O(N1 + N2) + O(N3 + N4) + O(N5 + N6) + ... O(N1 + N2 + N3 + N4) + O(N5 + N6 + N7 + N8) + ... O(N1 + N2 + N3 + N4 + .... + NK) It looks obvious that there will be log(K) of these rows, each of them implementing O(N) operations, so time for MERGE(LIST1, LIST2, ... , LISTK) operation would actually equal O(N log K). My friend told me (two days ago) it would take O(K N) time. So, the question is - did I f%ck up somewhere or is he actually wrong about this? And if I am right, why doesn't this 'divide&conquer' approach can't be used instead of priority queue approach?

Read the article

Slowdowns when reading from an urlconnection's inputstream (even with byte[] and buffers)

- by user342677

Ok so after spending two days trying to figure out the problem, and reading about dizillion articles, i finally decided to man up and ask to for some advice(my first time here). Now to the issue at hand - I am writing a program which will parse api data from a game, namely battle logs. There will be A LOT of entries in the database(20+ million) and so the parsing speed for each battle log page matters quite a bit. The pages to be parsed look like this: http://api.erepublik.com/v1/feeds/battle_logs/10000/0. (see source code if using chrome, it doesnt display the page right). It has 1000 hit entries, followed by a little battle info(lastpage will have <1000 obviously). On average, a page contains 175000 characters, UTF-8 encoding, xml format(v 1.0). Program will run locally on a good PC, memory is virtually unlimited(so that creating byte[250000] is quite ok). The format never changes, which is quite convenient. Now, I started off as usual: //global vars,class declaration skipped public WebObject(String url_string, int connection_timeout, int read_timeout, boolean redirects_allowed, String user_agent) throws java.net.MalformedURLException, java.io.IOException { // Open a URL connection java.net.URL url = new java.net.URL(url_string); java.net.URLConnection uconn = url.openConnection(); if (!(uconn instanceof java.net.HttpURLConnection)) { throw new java.lang.IllegalArgumentException("URL protocol must be HTTP"); } conn = (java.net.HttpURLConnection) uconn; conn.setConnectTimeout(connection_timeout); conn.setReadTimeout(read_timeout); conn.setInstanceFollowRedirects(redirects_allowed); conn.setRequestProperty("User-agent", user_agent); } public void executeConnection() throws IOException { try { is = conn.getInputStream(); //global var l = conn.getContentLength(); //global var } catch (Exception e) { //handling code skipped } } //getContentStream and getLength methods which just return'is' and 'l' are skipped Here is where the fun part began. I ran some profiling (using System.currentTimeMillis()) to find out what takes long ,and what doesnt. The call to this method takes only 200ms on avg public InputStream getWebPageAsStream(int battle_id, int page) throws Exception { String url = "http://api.erepublik.com/v1/feeds/battle_logs/" + battle_id + "/" + page; WebObject wobj = new WebObject(url, 10000, 10000, true, "Mozilla/5.0 " + "(Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 ( .NET CLR 3.5.30729)"); wobj.executeConnection(); l = wobj.getContentLength(); // global variable return wobj.getContentStream(); //returns 'is' stream } 200ms is quite expected from a network operation, and i am fine with it. BUT when i parse the inputStream in any way(read it into string/use java XML parser/read it into another ByteArrayStream) the process takes over 1000ms! for example, this code takes 1000ms IF i pass the stream i got('is') above from getContentStream() directly to this method: public static Document convertToXML(InputStream is) throws ParserConfigurationException, IOException, SAXException { DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); DocumentBuilder db = dbf.newDocumentBuilder(); Document doc = db.parse(is); doc.getDocumentElement().normalize(); return doc; } this code too, takes around 920ms IF the initial InputStream 'is' is passed in(dont read into the code itself - it just extracts the data i need by directly counting the characters, which can be done thanks to the rigid api feed format): public static parsedBattlePage convertBattleToXMLWithoutDOM(InputStream is) throws IOException { // Point A BufferedReader br = new BufferedReader(new InputStreamReader(is)); LinkedList ll = new LinkedList(); String str = br.readLine(); while (str != null) { ll.add(str); str = br.readLine(); } if (((String) ll.get(1)).indexOf("error") != -1) { return new parsedBattlePage(null, null, true, -1); } //Point B Iterator it = ll.iterator(); it.next(); it.next(); it.next(); it.next(); String[][] hits_arr = new String[1000][4]; String t_str = (String) it.next(); String tmp = null; int j = 0; for (int i = 0; t_str.indexOf("time") != -1; i++) { hits_arr[i][0] = t_str.substring(12, t_str.length() - 11); tmp = (String) it.next(); hits_arr[i][1] = tmp.substring(14, tmp.length() - 9); tmp = (String) it.next(); hits_arr[i][2] = tmp.substring(15, tmp.length() - 10); tmp = (String) it.next(); hits_arr[i][3] = tmp.substring(18, tmp.length() - 13); it.next(); it.next(); t_str = (String) it.next(); j++; } String[] b_info_arr = new String[9]; int[] space_nums = {13, 10, 13, 11, 11, 12, 5, 10, 13}; for (int i = 0; i < space_nums.length; i++) { tmp = (String) it.next(); b_info_arr[i] = tmp.substring(space_nums[i] + 4, tmp.length() - space_nums[i] - 1); } //Point C return new parsedBattlePage(hits_arr, b_info_arr, false, j); } I have tried replacing the default BufferedReader with BufferedReader br = new BufferedReader(new InputStreamReader(is), 250000); This didnt change much. My second try was to replace the code between A and B with: Iterator it = IOUtils.lineIterator(is, "UTF-8"); Same result, except this time A-B was 0ms, and B-C was 1000ms, so then every call to it.next() must have been consuming some significant time.(IOUtils is from apache-commons-io library). And here is the culprit - the time taken to parse the stream to string, be it by an iterator or BufferedReader in ALL cases was about 1000ms, while the rest of the code took 0ms(e.g. irrelevant). This means that parsing the stream to LinkedList, or iterating over it, for some reason was eating up a lot of my system resources. question was - why? Is it just the way java is made...no...thats just stupid, so I did another experiment. In my main method I added after the getWebPageAsStream(): //Point A ba = new byte[l]; // 'l' comes from wobj.getContentLength above bytesRead = is.read(ba); //'is' is our URLConnection original InputStream offset = bytesRead; while (bytesRead != -1) { bytesRead = is.read(ba, offset - 1, l - offset); offset += bytesRead; } //Point B InputStream is2 = new ByteArrayInputStream(ba); //Now just working with 'is2' - the "copied" stream The InputStream-byte[] conversion took again 1000ms - this is the way many ppl suggested to read an InputStream, and stil it is slow. And guess what - the 2 parser methods above (convertToXML() and convertBattlePagetoXMLWithoutDOM(), when passed 'is2' instead of 'is' took, in all 4 cases, under 50ms to complete. I read a suggestion that the stream waits for connection to close before unblocking, so i tried using HttpComponentsClient 4.0 (http://hc.apache.org/httpcomponents-client/index.html) instead, but the initial InputStream took just as long to parse. e.g. this code: public InputStream getWebPageAsStream2(int battle_id, int page) throws Exception { String url = "http://api.erepublik.com/v1/feeds/battle_logs/" + battle_id + "/" + page; HttpClient httpclient = new DefaultHttpClient(); HttpGet httpget = new HttpGet(url); HttpParams p = new BasicHttpParams(); HttpConnectionParams.setSocketBufferSize(p, 250000); HttpConnectionParams.setStaleCheckingEnabled(p, false); HttpConnectionParams.setConnectionTimeout(p, 5000); httpget.setParams(p); HttpResponse response = httpclient.execute(httpget); HttpEntity entity = response.getEntity(); l = (int) entity.getContentLength(); return entity.getContent(); } took even longer to process(50ms more for just the network) and the stream parsing times remained the same. Obviously it can be instantiated so as to not create HttpClient and properties every time(faster network time), but the stream issue wont be affected by that. So we come to the center problem - why does the initial URLConnection InputStream(or HttpClient InputStream) take so long to process, while any stream of same size and content created locally is orders of magnitude faster? I mean, the initial response is already somewhere in RAM, and I cant see any good reasong why it is processed so slowly compared to when a same stream is just created from a byte[]. Considering I have to parse million of entries and thousands of pages like that, a total processing time of almost 1.5s/page seems WAY WAY too long. Any ideas? P.S. Please ask in any more code is required - the only thing I do after parsing is make a PreparedStatement and put the entries into JavaDB in packs of 1000+, and the perfomance is ok ~ 200ms/1000entries, prb could be optimized with more cache but I didnt look into it much.

Read the article

VB .Net - Reflection: Reflected Method from a loaded Assembly executes before calling method. Why?

- by pu.griffin

When I am loading an Assembly dynamically, then calling a method from it, I appear to be getting the method from Assembly executing before the code in the method that is calling it. It does not appear to be executing in a Serial manner as I would expect. Can anyone shine some light on why this might be happening. Below is some code to illustrate what I am seeing, the code from the some.dll assembly calls a method named PerformLookup. For testing I put a similar MessageBox type output with "PerformLookup Time: " as the text. What I end up seeing is: First: "PerformLookup Time: 40:842" Second: "initIndex Time: 45:873" Imports System Imports System.Data Imports System.IO Imports Microsoft.VisualBasic.Strings Imports System.Reflection Public Class Class1 Public Function initIndex(indexTable as System.Collections.Hashtable) As System.Data.DataSet Dim writeCode As String MessageBox.Show("initIndex Time: " & Date.Now.Second.ToString() & ":" & Date.Now.Millisecond.ToString()) System.Threading.Thread.Sleep(5000) writeCode = RefreshList() End Function Public Function RefreshList() As String Dim asm As System.Reflection.Assembly Dim t As Type() Dim ty As Type Dim m As MethodInfo() Dim mm As MethodInfo Dim retString as String retString = "" Try asm = System.Reflection.Assembly.LoadFrom("C:\Program Files\some.dll") t = asm.GetTypes() ty = asm.GetType(t(28).FullName) 'known class location m = ty.GetMethods() mm = ty.GetMethod("PerformLookup") Dim o as Object o = Activator.CreateInstance(ty) Dim oo as Object() retString = mm.Invoke(o,Nothing).ToString() Catch Ex As Exception End Try return retString End Function End Class

Read the article

How to recalculate all-pairs shorthest paths on-line if nodes are getting removed?

- by Pavel Shved

Latest news about underground bombing made me curious about the following problem. Assume we have a weighted undirected graph, nodes of which are sometimes removed. The problem is to re-calculate shortest paths between all pairs of nodes fast after such removals. With a simple modification of Floyd-Warshall algorithm we can calculate shortest paths between all pairs. These paths may be stored in a table, where shortest[i][j] contains the index of the next node on the shortest path between i and j (or NULL value if there's no path). The algorithm requires O(n³) time to build the table, and eacho query shortest(i,j) takes O(1). Unfortunately, we should re-run this algorithm after each removal. The other alternative is to run graph search on each query. This way each removal takes zero time to update an auxiliary structure (because there's none), but each query takes O(E) time. What algorithm can be used to "balance" the query and update time for all-pairs shortest-paths problem when nodes of the graph are being removed?

Read the article

How can I set drawable to a ListView in android

- by sxingfeng

I am writing a app for android 1.5. I want to use a complex listview to display my data. I want to show a ImageView of a drawable object in my List item. I learned from a demo: ------> listData.put("Img", listData.put("Img", R.drawable.XXX)); listData.put("Time", "100"); listItems.add(listData); It can display correctly, however, I want to change Img at runtime, The image maybe generated at run-time, so I change the code as follow, but it falls. Can anyone help me ? many thanks! protected void onCreate(Bundle savedInstanceState) { // TODO Auto-generated method stub super.onCreate(savedInstanceState); setContentView(R.layout.item_list); itemListView = (ListView) findViewById(R.id.listview); ArrayList<HashMap<String, Object>> listItems = new ArrayList<HashMap<String, Object>>(); for(int i = 0;i <XXX.size(); ++i) { HashMap<String, Object> listData = new HashMap<String, Object>(); ---------> 1) listData.put("Img", new Drawable(XXX)); 2) listData.put("Time", "100"); 3) listItems.add(listData); } SimpleAdapter listItemAdapter = new SimpleAdapter(this, listItems, R.layout.listitem, new String[] { "Img", "Time"}, new int[] { R.id.listitem_img, R.id.listitem_time }); itemListView.setAdapter(listItemAdapter); listitem.xml <?xml version="1.0" encoding="utf-8"?> <LinearLayout android:layout_width="fill_parent" xmlns:android="http://schemas.android.com/apk/res/android" android:layout_height="wrap_content" android:paddingBottom="4dip" android:paddingLeft="12dip" android:paddingRight="12dip"> <ImageView android:paddingTop="12dip" android:layout_width="wrap_content" android:layout_height="wrap_content" android:id="@+id/listitem_img" /> <TextView android:layout_height="wrap_content" android:textSize="20dip" android:layout_width="wrap_content" android:id="@+id/listitem_time" /> </LinearLayout>

Read the article

Trying to use an Xslt for an xml in asp.net

- by Josemalive

Hello, i have the following xslt sheet: <?xml version="1.0" encoding="UTF-8" ?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:variable name="nhits" select="Answer[@nhits]"></xsl:variable> <xsl:output method="html" indent="yes"/> <xsl:template match="/"> <div> <xsl:call-template name="resultsnumbertemplate"/> </div> </xsl:template> <xsl:template name="resultsnumbertemplate"> <xsl:value-of select="$nhits"/> matches found </xsl:template> </xsl:stylesheet> And this is the xml that im trying to mix with the previous xslt: <Answer xmlns="exa:com.exalead.search.v10" context="n%3Dsl-ocu%26q%3Dlavadoras" last="9" estimated="false" nmatches="219" nslices="0" nhits="219" start="0"> <time> <Time interrupted="false" overall="32348" parse="0" spell="0" exec="1241" synthesis="15302" cats="14061" kwds="14061"> <sliceTimes>15272 </sliceTimes> </Time> </time> </Answer> Im using a xslcompiledtransform and that's working fine: XslCompiledTransform transformer = new XslCompiledTransform(); transformer.Load(HttpContext.Current.Server.MapPath("xslt\\" + requestvariables["xslsheet"].ToString())); transformer.Transform(xmlreader, null, writer); My problems comes when im trying to put into a variable the "nhits" attribute value placed on the Answer element, but i'm not rendering anything using my xslt sheet. Do you know what could be the cause? Could be the xmlns attribute in my xml file? Thanks in advance. Best Regards. Jose

Read the article

django inner redirects

- by Zayatzz

Hello I have one project that in my own development computer (uses mod_wsgi to serve the project) caused no problems. In live server (uses mod_fastcgi) it generates 500 though. my url conf is like this: # -*- coding: utf-8 -*- from django.conf.urls.defaults import * # Uncomment the next two lines to enable the admin: from django.contrib import admin admin.autodiscover() urlpatterns = patterns('', url(r'^admin/', include(admin.site.urls)), url(r'^', include('jalka.game.urls')), ) and # -*- coding: utf-8 -*- from django.conf.urls.defaults import * from django.contrib.auth import views as auth_views urlpatterns = patterns('jalka.game.views', url(r'^$', view = 'front', name = 'front',), url(r'^ennusta/(?P<game_id>\d+)/$', view = 'ennusta', name = 'ennusta',), url(r'^login/$', auth_views.login, {'template_name': 'game/login.html'}, name='auth_login'), url(r'^logout/$', auth_views.logout, {'template_name': 'game/logout.html'}, name='auth_logout'), url(r'^arvuta/$', view = 'arvuta', name = 'arvuta',), ) and .htaccess is like that: Options +FollowSymLinks RewriteEngine on RewriteOptions MaxRedirects=10 # RewriteCond %{HTTP_HOST} . RewriteCond %{HTTP_HOST} ^www\.domain\.com RewriteRule (.*) http://domain.com/$1 [R=301,L] AddHandler fastcgi-script .fcgi RewriteCond %{HTTP_HOST} ^jalka\.domain\.com$ [NC] RewriteCond %{REQUEST_FILENAME} !-f RewriteRule ^(.*) cgi-bin/fifa2010.fcgi/$1 [QSA,L] RewriteCond %{HTTP_HOST} ^subdomain\.otherdomain\.eu$ [NC] RewriteCond %{REQUEST_FILENAME} !-f RewriteRule ^(.*) cgi-bin/django.fcgi/$1 [QSA,L] Notice, that i have also other project set up with same .htaccess and that one is running just fine with more complex urls and views fifa2010.fcgi: #!/usr/local/bin/python # -*- coding: utf-8 -*- import sys, os DOMAIN = "domain.com" APPNAME = "jalka" PREFIX = "/www/apache/domains/www.%s" % (DOMAIN,) # Add a custom Python path. sys.path.insert(0, os.path.join(PREFIX, "htdocs/django/Django-1.2.1")) sys.path.insert(0, os.path.join(PREFIX, "htdocs")) sys.path.insert(0, os.path.join(PREFIX, "htdocs/jalka")) # Switch to the directory of your project. (Optional.) os.chdir(os.path.join(PREFIX, "htdocs", APPNAME)) # Set the DJANGO_SETTINGS_MODULE environment variable. os.environ['DJANGO_SETTINGS_MODULE'] = "%s.settings" % (APPNAME,) from django.core.servers.fastcgi import runfastcgi runfastcgi(method="threaded", daemonize="false") Alan

Read the article

Is 1/0 a legal Java expression?

- by polygenelubricants

The following compiles fine in my Eclipse: final int j = 1/0; // compiles fine!!! // throws ArithmeticException: / by zero at run-time Java prevents many "dumb code" from even compiling in the first place (e.g. "Five" instanceof Number doesn't compile!), so the fact this didn't even generate as much as a warning was very surprising to me. The intrigue deepens when you consider the fact that constant expressions are allowed to be optimized at compile time: public class Div0 { public static void main(String[] args) { final int i = 2+3; final int j = 1/0; final int k = 9/2; } } Compiled in Eclipse, the above snippet generates the following bytecode (javap -c Div0) Compiled from "Div0.java" public class Div0 extends java.lang.Object{ public Div0(); Code: 0: aload_0 1: invokespecial #8; //Method java/lang/Object."<init>":()V 4: return public static void main(java.lang.String[]); Code: 0: iconst_5 1: istore_1 // "i = 5;" 2: iconst_1 3: iconst_0 4: idiv 5: istore_2 // "j = 1/0;" 6: iconst_4 7: istore_3 // "k = 4;" 8: return } As you can see, the i and k assignments are optimized as compile-time constants, but the division by 0 (which must've been detectable at compile-time) is simply compiled as is. javac 1.6.0_17 behaves even more strangely, compiling silently but excising the assignments to i and k completely out of the bytecode (probably because it determined that they're not used anywhere) but leaving the 1/0 intact (since removing it would cause an entirely different program semantics). So the questions are: Is 1/0 actually a legal Java expression that should compile anytime anywhere? What does JLS say about it? If this is legal, is there a good reason for it? What good could this possibly serve?

Search Results

Search found 70655 results on 2827 pages for 'python time'.

Page 752/2827 | < Previous Page | 748 749 750 751 752 753 754 755 756 757 758 759 | Next Page >

- by asmaier

- by HardCode

- by brad

- by Scarface

- by bigstylee

- by ajhall

- by Elazar Leibovich

- by Phil

- by danpalmer

- by Rob

- by Emdiesse

- by 55skidoo

- by Olle

- by 55skidoo

- by drachenfels

- by Waleed Eissa

- by D W

- by Kotti

- by user342677

- by pu.griffin

- by Pavel Shved

- by sxingfeng

- by Josemalive

- by Zayatzz

- by polygenelubricants

< Previous Page | 748 749 750 751 752 753 754 755 756 757 758 759 | Next Page >