Search Results

Search found 9017 results on 361 pages for 'efficient storage'.

Page 247/361 | < Previous Page | 243 244 245 246 247 248 249 250 251 252 253 254  | Next Page >

  • Remove Duplicates from List of HashMap Entries

    - by HonorGod
    I have a List<HashMap<String,Object>> which represents a database where each list record is a database row. I have 10 columns in my database. There are several rows where the values of 2 particular columns are equals. I need to remove the duplicates from the list after the list is updated with all the rows from database. What is the efficient way? FYI - I am not able to do distinct while querying the database, because the GroupName is added at a later stage to the Map after the database is loaded. And since Id column is not primary key, once you add GroupName to the Map. You will have duplicates based on Id + GroupName combination! Hope my question makes sense. Let me know if we need more clarification.

    Read the article

  • Video learning for database design

    - by donpal
    I'm trying to learn good relational database design (using mysql and php if that makes any difference). I've already done some database work, so I'm not totally clueless, but I suspect that my solutions may not have adhered to best practices for efficient searching, optimization, etc. Can someone suggest a good set of videos on the topic? If you know something is superb or has really made a difference in your own learning, please post your suggestion. Prefer videos, but books (as long as they're not too huge) are ok too. But prefer videos. Thank you

    Read the article

  • C# Binary File Compare

    - by Simon Farrow
    I'm in a situation where I want to compare two binary files. One of them is already stored on the server with a pre-calculated Crc32 in the database from when I stored it originally. I know that if the Crc is different then the files are definitely different. However, if the Crc is the same I don't know that the files are. So what I'm looking for is a nice efficient way of comparing the two streams on from the posted file and one from the file system. I'm not an expert on streams but I'm well aware that I could easily shoot myself in the foot here as far as memory usage is concerned. Any help is greatly appreciated.

    Read the article

  • Gmail-like labelling system

    - by Dimitris
    Hi I am looking into a number of ways to implement a labelling system similar to the one in Gmail. Basically I have a Resource at the lowest level and I would like to provide a number of organisational groupings for that resource in the form of labels. If anyone has implemented something like that I would like to hear your views. My idea is to have within the Resource instance a List<Label>. I need to have an efficient mechanism in order to do very fast searches based on the labels or based on the resources. Thanks Dimitris

    Read the article

  • how does NTFS actually work with B-tree ?

    - by bakra
    To improve performance, NTFS directories use a special data management structure called a B-tree. "B-tree" concept here refers to a "tree of storage units" that hold the contents of an individual directory. What I don't understand is where on the disk is this tree stored? Its surely not created every-time we reboot...that would take lots of time. and since its a tree(dynamic Data structure) unlike arrays it will grow. so space needs to be allocated every-time it grows. so how is this "dynamic meta-data" stored ?

    Read the article

  • Coordinate geometry operations in images/discrete space

    - by avd
    I have images which have line segments, rays etc. I am representing these line segments using Bresenham algorithm (means whatever coordinates I get using this algorithm between two points). Now I want to do operations such as finding intersection point between two line segments, finding the projection of one vector onto other etc... The problem is I am not working in continuous space. The line segments are being approximated using Bresenham algorithm. So I want suggestions on what are the best and most efficient ways to do this? A link to C++ library or implementation would also be good enough. Please suggest some books which deal with such problems.

    Read the article

  • Memory Issues When DOM Parsing A Large XML File on Android Devices

    - by tonyc
    Hey awesome SO users, I have an Android application that parses an XML file for users and displays results in a much more mobile friendly format. The app works great for most users, but some users have lots and lots of data and the app crashes on them because it runs out of memory. Is there any way I have a DOM style XML parser quit parsing data after a certain amount of parsing? I only need the first 30 or so elements so it would make the application much more efficient. I'd like to use a SAX or pull parser instead, but the XML I'm parsing is not valid and I have no control over it. Unless anyone has some good SAX solutions that let me parse messy, invalid XML, I think DOM is the only way to go. Thanks for reading!

    Read the article

  • Handling inverse kinematics: animation blending or math?

    - by meds
    I've been working for the past four days on inverse kinematics for my game engine. I'm working on a game with a shoestring budget so when the idea of inverse kinematics came up I knew I had to make it such that the 3D models bones would be mathematically changed to appear to be stepping on objects. This is causing some serious problems with my animation, after it was technically implemented the animations started looking quite bad when the character was wlaking up inclines or steps even though mathematically the stepping was correct and was even smoothly interpolating. So I was wondering, is it actually possible to get a smooth efficient inverse kinematic system based exclusively on math where bones are changed or is this just a wild goose chase and I should either solve the inverse kinematics problem with animation blending or don't do it at all?

    Read the article

  • Universal iPhone/iPad Windows-based app with Core Data crashes on iPhone SDK 4 beta 3

    - by Tarfa
    Hi all. I installed iPhone OS 4.0 Beta 3. When I create a new Windows-based universal app with Core Data (File New Project Windows-based Application --- select Universal in drop down and check the "Use Core Data for storage" check box) the app launches fine into the iPhone simulator but crashes in the iPad simulator. The console message returned is: dyld: Symbol not found: _OBJC_CLASS_$_NSURL Referenced from: /Users/tarfa/Library/Application Support/iPhone Simulator/3.2/Applications/5BB644DC-9370-4894-8884-BAEBA64D9ED0/Universal.app/Universal Expected in: /Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator3.2.sdk/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation I'm stumped. Anyone else experiencing this problem?

    Read the article

  • Distributed datastore

    - by Julien Genestoux
    We're trying to add some kind of persistence in our app. The app generates about 250 entries per second. Each of these entries belong to one of 2M files. For each file, we want to keep the last 10 entries, so we can look them up later. The way our client application works : it gets a stream of all the data it fetches the right file (GET) it adds the new content it saves the file back (PUT) We're looking for an efficient way to store this data that can scale horizontally as the amount of data we're getting is doubling every few weeks. We initially looked at S3. It works fine, but becomes very expensive very fast ($1000 monthly just in PUT operations!) We then gave a shot at Riak. But it seems we can't get more than 60 write/sec on each node, which is very very slow. Any other solution out there?

    Read the article

  • Drawing Rounded Rectangle in DirectX/3D for 2D

    - by Jengerer
    I'm using Direct3D to draw 2D elements in a C++ application of mine, and it'd be neat if I could create rounded-rectangle GUI elements that were varying in size, but I'm not sure how to do that in the most efficient manner possible. I thought of the "easy" way which would be to have images of the four corners and then just place them in the proper positions, and fill in the rest, but varying radii for the rectangle corners would be a definite plus, and this method doesn't accommodate that feature well. Through my searches I've come across the terms Pixel Shader, Stencil Buffering, and HLSL, but I'm not sure whether these terms are relevant and which one to jump into if so. Thanks in advance, Jengerer

    Read the article

  • Cascading S3 Sink Tap not being deleted with SinkMode.REPLACE

    - by Eric Charles
    We are running Cascading with a Sink Tap being configured to store in Amazon S3 and were facing some FileAlreadyExistsException (see [1]). This was only from time to time (1 time on around 100) and was not reproducable. Digging into the Cascading codem, we discovered the Hfs.deleteResource() is called (among others) by the BaseFlow.deleteSinksIfNotUpdate(). Btw, we were quite intrigued with the silent NPE (with comment "hack to get around npe thrown when fs reaches root directory"). From there, we extended the Hfs tap with our own Tap to add more action in the deleteResource() method (see [2]) with a retry mechanism calling directly the getFileSystem(conf).delete. The retry mechanism seemed to bring improvement, but we are still sometimes facing failures (see example in [3]): it sounds like HDFS returns isDeleted=true, but asking directly after if the folder exists, we receive exists=true, which should not happen. Logs also shows randomly isDeleted true or false when the flow succeeds, which sounds like the returned value is irrelevant or not to be trusted. Can anybody bring his own S3 experience with such a behavior: "folder should be deleted, but it is not"? We suspect a S3 issue, but could it also be in Cascading or HDFS? We run on Hadoop Cloudera-cdh3u5 and Cascading 2.0.1-wip-dev. [1] org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory s3n://... already exists at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132) at com.twitter.elephantbird.mapred.output.DeprecatedOutputFormatWrapper.checkOutputSpecs(DeprecatedOutputFormatWrapper.java:75) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:923) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:882) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:882) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:856) at cascading.flow.hadoop.planner.HadoopFlowStepJob.internalNonBlockingStart(HadoopFlowStepJob.java:104) at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:174) at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:137) at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:122) at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:42) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.j [2] @Override public boolean deleteResource(JobConf conf) throws IOException { LOGGER.info("Deleting resource {}", getIdentifier()); boolean isDeleted = super.deleteResource(conf); LOGGER.info("Hfs Sink Tap isDeleted is {} for {}", isDeleted, getIdentifier()); Path path = new Path(getIdentifier()); int retryCount = 0; int cumulativeSleepTime = 0; int sleepTime = 1000; while (getFileSystem(conf).exists(path)) { LOGGER .info( "Resource {} still exists, it should not... - I will continue to wait patiently...", getIdentifier()); try { LOGGER.info("Now I will sleep " + sleepTime / 1000 + " seconds while trying to delete {} - attempt: {}", getIdentifier(), retryCount + 1); Thread.sleep(sleepTime); cumulativeSleepTime += sleepTime; sleepTime *= 2; } catch (InterruptedException e) { e.printStackTrace(); LOGGER .error( "Interrupted while sleeping trying to delete {} with message {}...", getIdentifier(), e.getMessage()); throw new RuntimeException(e); } if (retryCount == 0) { getFileSystem(conf).delete(getPath(), true); } retryCount++; if (cumulativeSleepTime > MAXIMUM_TIME_TO_WAIT_TO_DELETE_MS) { break; } } if (getFileSystem(conf).exists(path)) { LOGGER .error( "We didn't succeed to delete the resource {}. Throwing now a runtime exception.", getIdentifier()); throw new RuntimeException( "Although we waited to delete the resource for " + getIdentifier() + ' ' + retryCount + " iterations, it still exists - This must be an issue in the underlying storage system."); } return isDeleted; } [3] INFO [pool-2-thread-15] (BaseFlow.java:1287) - [...] at least one sink is marked for delete INFO [pool-2-thread-15] (BaseFlow.java:1287) - [...] sink oldest modified date: Wed Dec 31 23:59:59 UTC 1969 INFO [pool-2-thread-15] (HiveSinkTap.java:148) - Now I will sleep 1 seconds while trying to delete s3n://... - attempt: 1 INFO [pool-2-thread-15] (HiveSinkTap.java:130) - Deleting resource s3n://... INFO [pool-2-thread-15] (HiveSinkTap.java:133) - Hfs Sink Tap isDeleted is true for s3n://... ERROR [pool-2-thread-15] (HiveSinkTap.java:175) - We didn't succeed to delete the resource s3n://... Throwing now a runtime exception. WARN [pool-2-thread-15] (Cascade.java:706) - [...] flow failed: ... java.lang.RuntimeException: Although we waited to delete the resource for s3n://... 0 iterations, it still exists - This must be an issue in the underlying storage system. at com.qubit.hive.tap.HiveSinkTap.deleteResource(HiveSinkTap.java:179) at com.qubit.hive.tap.HiveSinkTap.deleteResource(HiveSinkTap.java:40) at cascading.flow.BaseFlow.deleteSinksIfNotUpdate(BaseFlow.java:971) at cascading.flow.BaseFlow.prepare(BaseFlow.java:733) at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:761) at cascading.cascade.Cascade$CascadeJob.call(Cascade.java:710) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619)

    Read the article

  • What do I use when a cron job isn't enough? (php)

    - by mike
    I'm trying to figure out the most efficient way to running a pretty hefty PHP task thousands of times a day. It needs to make an IMAP connection to Gmail, loop over the emails, save this info to the database and save images locally. Running this task every so often using a cron isn't that big of a deal, but I need to run it every minute and I know eventually the crons will start running on top of each other and cause memory issues. What is the next step up when you need to efficiently run a task multiple times a minute? I've been reading about beanstalk & pheanstalk and I'm not entirely sure if that will do what I need. Thoughts???

    Read the article

  • Finding the heaviest length-constrained path in a weighted Binary Tree

    - by Hristo
    UPDATE I worked out an algorithm that I think runs in O(n*k) running time. Below is the pseudo-code: routine heaviestKPath( T, k ) // create 2D matrix with n rows and k columns with each element = -8 // we make it size k+1 because the 0th column must be all 0s for a later // function to work properly and simplicity in our algorithm matrix = new array[ T.getVertexCount() ][ k + 1 ] (-8); // set all elements in the first column of this matrix = 0 matrix[ n ][ 0 ] = 0; // fill our matrix by traversing the tree traverseToFillMatrix( T.root, k ); // consider a path that would arc over a node globalMaxWeight = -8; findArcs( T.root, k ); return globalMaxWeight end routine // node = the current node; k = the path length; node.lc = node’s left child; // node.rc = node’s right child; node.idx = node’s index (row) in the matrix; // node.lc.wt/node.rc.wt = weight of the edge to left/right child; routine traverseToFillMatrix( node, k ) if (node == null) return; traverseToFillMatrix(node.lc, k ); // recurse left traverseToFillMatrix(node.rc, k ); // recurse right // in the case that a left/right child doesn’t exist, or both, // let’s assume the code is smart enough to handle these cases matrix[ node.idx ][ 1 ] = max( node.lc.wt, node.rc.wt ); for i = 2 to k { // max returns the heavier of the 2 paths matrix[node.idx][i] = max( matrix[node.lc.idx][i-1] + node.lc.wt, matrix[node.rc.idx][i-1] + node.rc.wt); } end routine // node = the current node, k = the path length routine findArcs( node, k ) if (node == null) return; nodeMax = matrix[node.idx][k]; longPath = path[node.idx][k]; i = 1; j = k-1; while ( i+j == k AND i < k ) { left = node.lc.wt + matrix[node.lc.idx][i-1]; right = node.rc.wt + matrix[node.rc.idx][j-1]; if ( left + right > nodeMax ) { nodeMax = left + right; } i++; j--; } // if this node’s max weight is larger than the global max weight, update if ( globalMaxWeight < nodeMax ) { globalMaxWeight = nodeMax; } findArcs( node.lc, k ); // recurse left findArcs( node.rc, k ); // recurse right end routine Let me know what you think. Feedback is welcome. I think have come up with two naive algorithms that find the heaviest length-constrained path in a weighted Binary Tree. Firstly, the description of the algorithm is as follows: given an n-vertex Binary Tree with weighted edges and some value k, find the heaviest path of length k. For both algorithms, I'll need a reference to all vertices so I'll just do a simple traversal of the Tree to have a reference to all vertices, with each vertex having a reference to its left, right, and parent nodes in the tree. Algorithm 1 For this algorithm, I'm basically planning on running DFS from each node in the Tree, with consideration to the fixed path length. In addition, since the path I'm looking for has the potential of going from left subtree to root to right subtree, I will have to consider 3 choices at each node. But this will result in a O(n*3^k) algorithm and I don't like that. Algorithm 2 I'm essentially thinking about using a modified version of Dijkstra's Algorithm in order to consider a fixed path length. Since I'm looking for heaviest and Dijkstra's Algorithm finds the lightest, I'm planning on negating all edge weights before starting the traversal. Actually... this doesn't make sense since I'd have to run Dijkstra's on each node and that doesn't seem very efficient much better than the above algorithm. So I guess my main questions are several. Firstly, do the algorithms I've described above solve the problem at hand? I'm not totally certain the Dijkstra's version will work as Dijkstra's is meant for positive edge values. Now, I am sure there exist more clever/efficient algorithms for this... what is a better algorithm? I've read about "Using spine decompositions to efficiently solve the length-constrained heaviest path problem for trees" but that is really complicated and I don't understand it at all. Are there other algorithms that tackle this problem, maybe not as efficiently as spine decomposition but easier to understand? Thanks.

    Read the article

  • setting up/installing/configuring nginx LEMP stack on fresh VPS server

    - by grant tailor
    I need some help in settingup/installing and configuring nginx LEMP stack on a fresh new VPS i have. The specs of the CentOS 5.7 VPS are 2GB DDR3 ECC RAM(4GB burst), 1 core 1.5Ghz(3Ghz burst) and 100GB RAID 10 storage, unmetered bandwidth @ 100Mpbs all for a whopping $25/month(unbeatable, yeah i know :) Anyways i have followed this LEMP(will also need MySQL and PHP) stack guide on linode http://library.linode.com/lemp-guides/centos-5 but basically what i want is to be able to host multiple website on this webserver after everything is setup. I am used to using DirectAdmin control panel on other server and want to have things setup so i can host multiple websites...mostly wordpress and drupal themes. Lets say 10 websites on this nginx web server. So can someone please help me on what i need to do to take "full" advantage of nginx power and performance, while been able to easily manage these multiple websites (wordpress and drupal themes)? Thanks.

    Read the article

  • Matrix Comparison algorithm

    - by SysAdmin
    If you have 2 Matrices of dimensions N*M. what is the best way to get the difference Rect? Example: 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 4 5 4 3 2 3 <---> 2 3 2 3 2 3 2 3 2 3 4 5 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 2 3 | | \ / Rect([2,2] , [3,4]) 4 5 4 4 5 2-> A (2 x 3 Matrix) The best I could think of is to scan from Top-Left hit the point where there is difference. Then scan from Bottom Right and hit the point where there is a difference. But In worst case, this is O(N*M). is there a better efficient algorithm?

    Read the article

  • How to save big "database-like" class in python

    - by Rafal
    Hi there, I'm doing a project with reasonalby big DataBase. It's not a probper DB file, but a class with format as follows: DataBase.Nodes.Data=[[] for i in range(1,1000)] f.e. this DataBase is all together something like few thousands rows. Fisrt question - is the way I'm doing efficient, or is it better to use SQL, or any other "proper" DB, which I've never used actually. And the main question - I'd like to save my DataBase class with all record, and then re-open it with Python in another session. Is that possible, what tool should I use? cPickle - it seems to be only for strings, any other? In matlab there's very useful functionality named save workspace - it saves all Your variables to a file that You can open at another session - this would be vary useful in python!

    Read the article

  • Displaying Google Calendar event data on FullCalendar

    - by aurealus
    I am using Google Calendar as a storage engine for a calendar system I am building, however, I am using a single Google user account with multiple calendars, i.e. each user on my system has their own calendar within the one user account. I'm able to create a calendar per user just fine, but I would like to have FullCalendar retrieve the events for display purposes, without manually getting the magic cookie url from Google Calendar settings. I would like to be able to retrieve it programmatically or 'proxy' the feed via an authenticated call to get event data that I'm doing in Django. $('#calendar').fullCalendar({ events: $.fullCalendar.gcalFeed( "http://www.google.com/calendar_url/" <-- or /my/event/feed/url ) });

    Read the article

  • Scala - Enumeration vs. Case-Classes

    - by tzofia
    I've created akka actor called LogActor. The LogActors's receive method handling messages from other actors and logging them to the specified log level. I can distinguish between the different levels in 2 ways. The first one: import LogLevel._ object LogLevel extends Enumeration { type LogLevel = Value val Error, Warning, Info, Debug = Value } case class LogMessage(level : LogLevel, msg : String) The second: (EDIT) abstract class LogMessage(msg : String) case class LogMessageError(msg : String) extends LogMessage(msg) case class LogMessageWarning(msg : String) extends LogMessage(msg) case class LogMessageInfo(msg : String) extends LogMessage(msg) case class LogMessageDebug(msg : String) extends LogMessage(msg) Which way is more efficient? does it take less time to match case class or to match enum value? (I read this question but there isn't any answer referring to the runtime issue)

    Read the article

  • Are there any programs to aid in the mass-editing of Visual SourceSafe checkin comments?

    - by Schnapple
    I know that in Visual SourceSafe you can go in and drill down to the history of an individual file and then drill down to an individual check-in and apply a comment to the check-in that way but that's tedious and time consuming - if you have a lot of files that were checked in at the same time and you want the same comment to apply to all of them this will take forever. I use the tool VSSReporter to generate reports of checkins and other stuff from VSS, but it cannot edit anything, only report on them. Are there any tools which will let you go back and retroactively apply comments to check-ins in an efficient and easy manner?

    Read the article

  • Initializing "new users" in Rails

    - by mathee
    I'm creating a Ruby on Rails application, and I'm trying to create/login/logout users. This is the schema for Users: create_table "users", :force => true do |t| t.string "first_name" t.string "last_name" t.text "reputation" t.integer "questions_asked" t.integer "answers_given" t.string "request" t.datetime "created_at" t.datetime "updated_at" t.string "email_hash" t.string "username" t.string "hashed_password" t.string "salt" end The user's personal information (username, first/last names, email) is populated through a POST. Other things such as questions_asked, reputation, etc. are set by the application, so should be initialized when we create new users. Right now, I'm just setting each of those manually in the create method for UsersController: def create @user = User.new(params[:user]) @user.reputation = 0 @user.questions_asked = 0 @user.answers_given = 0 @user.request = nil ... end Is there a more elegant/efficient way of doing this?

    Read the article

  • Chart Control Inside SSRS ReportViewer is Viewable From Localhost But Not Internet

    - by Daniel Coffman
    A project I own was just moved from an older server to a new one, and in the process of moving the web folder, re-deploying the SSRS reports, restoring the database, configuring IIS, etc... I have lost the ability to view the Microsoft Chart Controls that are embedded in the SSRS reports, that are then displayed by a Microsoft.ReportViewer. I could view them both locally and remotely (via the internet) on the old server. I can view them if I preview the SSRS report in Visual Studio. The report displays fine, only missing all the embedded charts. I can still view them locally through the web browser, just not from the internet. What am I missing? I tried giving permissions to the ChartImageHandler temp storage folder, but it didn't work. I'm getting the Javascript error: Error: ClientReport380ec8ca0c294a809e9986c1bef9db1c is undefined

    Read the article

  • I do not write tests. Am I stupid?

    - by Josh Stodola
    I've done a little bit of reading on unit testing and TDD, and I've never seriously considered writing tests to such a precise extent. Granted, I am not working on any projects that are ridiculously huge. If all I build are small apps, am I stupid for not writing tests? Edit: To clarify, when I say "small apps", I mean apps that are not going to control a persons life and/or their belongings. I generally build things that are supposed to make peoples lives easier and to make them more efficient.

    Read the article

  • How can I efficiently group a large list of URLs by their host name in Perl?

    - by jesper
    I have text file that contains over one million URLs. I have to process this file in order to assign URLs to groups, based on host address: { 'http://www.ex1.com' = ['http://www.ex1.com/...', 'http://www.ex1.com/...', ...], 'http://www.ex2.com' = ['http://www.ex2.com/...', 'http://www.ex2.com/...', ...] } My current basic solution takes about 600 MB of RAM to do this (size of file is about 300 MB). Could you provide some more efficient ways? My current solution simply reads line by line, extracts host address by regex and puts the url into a hash. EDIT Here is my implementation (I've cut off irrelevant things): while($line = <STDIN>) { chomp($line); $line =~ /(http:\/\/.+?)(\/|$)/i; $host = "$1"; push @{$urls{$host}}, $line; } store \%urls, 'out.hash';

    Read the article

  • Efficiency of checking for null varbinary(max) column ?

    - by Moe Sisko
    Using SQL Server 2008. Example table : CREATE table dbo.blobtest (id int primary key not null, name nvarchar(200) not null, data varbinary(max) null) Example query : select id, name, cast((case when data is null then 0 else 1 end) as bit) as DataExists from dbo.blobtest Now, the query needs to return a "DataExists" column, that returns 0 if the blob is null, else 1. This all works fine, but I'm wondering how efficient it is. i.e. does SQL server need to read in the whole blob to its memory, or is there some optimization so that it just does enough reads to figure out if the blob is null or not ? (FWIW, the sp_tableoption "large value types out of row" option is set to OFF for this example).

    Read the article

< Previous Page | 243 244 245 246 247 248 249 250 251 252 253 254  | Next Page >