Search Results

Search found 11409 results on 457 pages for 'large teams'.

  • HBase as a multimap

    - by Ibrahim
    Hi guys, I'm doing some large-scale text processing work and I'm trying to get started with Hadoop and HBase. One of the things I need to do is build a multimap of some stuff, which I later use to look things up and get all items with a certain key (in an M/R job). Would it be OK to use HBase and insert many rows with the same key and rely on versions/timestamps to achieve a multimap-like setup, or is this a bad idea? The multimap is built up in the reduce phase of a MapReduce task by the way, or at least in the way I've formulated it on paper. Thanks! If more information is needed, I'd be happy to provide it. Not sure whether this question is clear.

  • java servlet: generate zip file from BLOBs

    - by Zack
    I'm trying to zip a large number of pdf files (stored as BLOBs in the DB) and then return the zip as an attachment to the user. What's the best way to do this without running into memory issues? Another note: I actually need to merge some PDFs prior to adding them to the ZipOutputStream. Therefore, a couple PDFs will need to be stored in memory at a time. I assume it would be best to then store them as temporary files on the server before zipping them all?

  • Composite Primary and Cardinality

    - by srini.venigalla
    I have some questions on composite primary keys and the cardinality of the columns. I searched the web but did not find any definitive answer, so I am trying again.
    Context: large (50M - 500M rows) OLAP prep tables; not NoSQL, not columnar. MySQL and DB2.
    The questions are:
    1) Does the order of keys in a PK matter?
    2) If the cardinality of the columns varies heavily, which should be used first? For example, if I have CLIENT/CAMPAIGN/PROGRAM where CLIENT is highly cardinal, CAMPAIGN is moderate, and PROGRAM is almost like a bitmap index, what order is best?
    3) What order is best for a join, both when there is a WHERE clause and when there is no WHERE clause (for views)?
    Thanks in advance.

  • Efficient most common suffix algorithm?

    - by taw
    I have a few GBs worth of strings, and for every prefix I want to find the 10 most common suffixes. Is there an efficient algorithm for that? An obvious solution would be:
    1) Store a sorted list of <string, count> pairs.
    2) Identify by binary search the extent for the prefix we're searching.
    3) Find the 10 highest counts in this extent.
    4) Possibly precompute it for all short prefixes, so it never needs to look at a large portion of the data.
    I'm not sure if that would actually be efficient at all. Is there a better way I overlooked? Answers must be real time, but it can take as much preprocessing as necessary.
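
    The "obvious solution" in this question can be sketched roughly as follows. This is a minimal illustration in Python, not a tuned implementation: `build_index` and `top_suffixes` are hypothetical names, the data is assumed to fit in memory, and the "\U0010ffff" sentinel assumes no stored string continues the prefix with that code point. The precomputation for short prefixes is left out.

```python
import bisect
import heapq
from collections import Counter


def build_index(strings):
    """Preprocessing: sorted distinct strings plus a parallel list of counts."""
    counts = Counter(strings)
    keys = sorted(counts)
    return keys, [counts[key] for key in keys]


def top_suffixes(keys, counts, prefix, k=10):
    """Return up to k (suffix, count) pairs for strings starting with prefix."""
    # Binary search for the extent [lo, hi) of keys that start with the prefix.
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_left(keys, prefix + "\U0010ffff")
    # Pick the k highest counts inside that extent only.
    best = heapq.nlargest(k, range(lo, hi), key=lambda i: counts[i])
    return [(keys[i][len(prefix):], counts[i]) for i in best]


if __name__ == "__main__":
    keys, counts = build_index(["apple", "apple", "apply", "ape", "banana", "apple"])
    print(top_suffixes(keys, counts, "ap", k=2))
```

    The lookup costs two binary searches plus a scan of the matching extent, so the scan dominates for very short prefixes; that is where the precomputation mentioned in the question would matter.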

  • android java.lang.OutOfMemoryError

    - by xiangdream
    Hi all, when I download large data from a website, I get this error information:
        I/global (20094): Default buffer size used in BufferedInputStream constructor. It would be better to be explicit if an 8k buffer is required.
        D/dalvikvm(20094): GC freed 6153 objects / 3650840 bytes in 335ms
        I/dalvikvm-heap(20094): Forcing collection of SoftReferences for 3599051-byte allocation
        D/dalvikvm(20094): GC freed 320 objects / 11400 bytes in 144ms
        E/dalvikvm-heap(20094): Out of memory on a 3599051-byte allocation.
        I/dalvikvm(20094): "Thread-9" prio=5 tid=17 RUNNABLE
        I/dalvikvm(20094): | group="main" sCount=0 dsCount=0 s=0 obj=0x439b9480
        I/dalvikvm(20094): | sysTid=25762 nice=0 sched=0/0 handle=4065496
    Can anyone help me?

  • MySQL: Is it faster to use inserts and updates instead of insert on duplicate key update?

    - by Nir
    I have a cron job that updates a large number of rows in a database. Some of the rows are new and therefore inserted, and some are updates of existing ones and therefore updated. I use insert on duplicate key update for the whole data set and get it done in one call. But I actually know which rows are new and which are updated, so I can also do the inserts and updates separately. Will separating the inserts and updates have an advantage in terms of performance? What are the mechanics behind this? Thanks!

  • Any strategies for assessing the trade-off between CPU loss and memory gain from compression of data

    - by indiehacker
    Are very large TextProperties a burden? Should they be compressed? Say I have information stored in 2 attributes of type TextProperty in my datastore entities. The strings are always the same length of 65,000 characters and have lots of repeating integers, a sample appearing as follows:
        entity.pixel_idx = 0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,5,5,5,5,5,5,5,5,5,5,5,5....etc.
        entity.pixel_color = 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,...etc.
    So these could also be represented using much less storage by compressing, say, using only each integer and the length of its series ('0,8' for '0,0,0,0,0,0,0,0'), but then it takes time and CPU to compress and decompress. Any general ideas? Are there some tricks for testing different approaches to the problem?
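
    One way to test the trade-off described here is to time a round trip and compare sizes for each candidate encoding. The sketch below is only illustrative: `rle_encode`, `rle_decode` and `measure` are hypothetical names, the ";" separator between "value,length" pairs is an assumption, the sample data is made up, and zlib is included purely as a general-purpose baseline.

```python
import time
import zlib
from itertools import groupby


def rle_encode(csv_text):
    """Run-length encode '0,0,0,1,1' as '0,3;1,2' (value,run_length pairs)."""
    values = csv_text.split(",")
    return ";".join(f"{value},{len(list(run))}" for value, run in groupby(values))


def rle_decode(encoded):
    """Inverse of rle_encode."""
    values = []
    for pair in encoded.split(";"):
        value, length = pair.split(",")
        values.extend([value] * int(length))
    return ",".join(values)


def measure(label, encode, decode, text, repeats=100):
    """Print the packed size and average round-trip time; return the packed size."""
    start = time.perf_counter()
    for _ in range(repeats):
        packed = encode(text)
        assert decode(packed) == text  # the encoding must be lossless
    elapsed = (time.perf_counter() - start) / repeats
    print(f"{label}: {len(text)} -> {len(packed)}, {elapsed * 1e6:.0f} us per round trip")
    return len(packed)


if __name__ == "__main__":
    # Made-up sample in the same shape as the strings in the question.
    sample = ",".join(["0"] * 9 + ["1"] * 11 + ["5"] * 12)
    measure("rle", rle_encode, rle_decode, sample)
    measure("zlib", lambda s: zlib.compress(s.encode()),
            lambda b: zlib.decompress(b).decode(), sample)
```

    Running this against real 65,000-character values, rather than the toy sample, would show whether the CPU cost is worth the storage saved.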

  • rsync useful w/ encrypted files?

    - by barrycarter
    Is rsync efficient for transferring encrypted files? More specifically:
    1) I encrypt 'x' with my public key and call the result 'y'.
    2) I rsync 'y' to my backup server.
    3) 'x' changes slightly.
    4) I encrypt the modified 'x' and rsync the modified 'y' to my backup server.
    Is this efficient? I know a small change in 'x' yields a large change in 'y', but is the change localized? Or has 'y' changed so thoroughly that rsync is not much better than scp? I currently back up my "critical" files by tarring/bzipping them nightly, then encrypting the .tar.bz file and rsync'ing it to my backup server. Many of the individual files don't change, but, of course, the tar file changes if even one of the files changes. Is this efficient? Should I be encrypting and backing up each file individually? That way, unchanged files will take no time to rsync.

  • What is branched in a repository?

    - by Peter M
    OK, I hope that this will end up sounding like a reasonable question. From what I understand of Subversion, if you have a repo that contains multiple projects, then you can branch individual projects within that repo (see SVN Red Book - Using Branches). However, what I don't quite follow is what happens when you create a branch in one of the distributed systems (Git, Hg, Bazaar - I don't think it matters which one). Can you branch just a sub-directory of the repo, or when you create the branch are you branching the entire repo? This question is part of a larger one that I posted on Super User (choice and setup of version control) and has come about as I am trying to figure out how best to version control a large hierarchical layout of independent projects. It may be that for distributed systems what I would like to do is best handled by a sub-project mechanism of some sort, but again that is something I am not clear on, although I have heard the term mentioned in regard to Git.

  • Java multiple Images Uploader

    - by Padur
    Hello folks, I have a new requirement to develop a large-scale image uploader in a web application. I was able to do the same using Swing; it has several features like drag and drop, progress bar, remove file/files, modify, limit file size, verify file information, timer, verify at run time, and it's a very powerful tool for uploading images. I would like to do the same in a web-based app: the user selects 200 images, processes them and clicks upload, and it should start uploading. I'd like to know of any feasible frameworks or APIs that would help me do this faster and achieve the same kind of functionality. Please point me in the correct direction. -PD

  • TortoiseMerge: is there a way to hide deleted lines in the merge window

    - by baash05
    Hello, I am attempting to merge a large quantity of files. Several of these files are in conflict and TortoiseMerge is the tool of choice. When I view the code in the "merged" window it shows me all the code that was deleted and added, as well as ??? for code that is in conflict. I am not really sure why it shows me the deleted code, but I do know that I don't want to see it. Seeing lines of code that were removed in a window that is "the results" doesn't make any sense, and it makes it harder to read. Is there a way to hide the deleted code in that "results" window so only the text that is actually going to be in the file shows up? I have read the manual (cursorily) and I didn't see anything in there that indicated how to accomplish this (seemingly insignificant) task.

  • Unusual Subversion Folders Appeared After Update

    - by Mark Lansdown
    Hello Everyone, I have been using Subversion for about 2 years to manage a large C# project. On a recent Subversion update, a number of new folders were added to my source code folder: \conf \db \locks \hooks 35+ files were also added during the update, all appearing under the 4 new folders. I haven't changed any client (I use TortoiseSVN) or server software related to Subversion, so I'm puzzled why these folders and files were suddenly introduced. It also seems strange that files seemingly related to the internal workings of Subversion are now part of my source code repository. Can anyone shed some light on why this happened? Thanks in advance, Mark

  • Java Counting # of occurrences of a word in a string

    - by Doug
    I have a large text file I am reading from and I need to find out how many times some words come up, for example the word "the". I'm doing this line by line; each line is a string. I need to make sure that I only count legit "the"s; the "the" in "other" would not count. This means I know I need to use regular expressions in some way. What I have tried so far is this:
        numSpace += line.split("[^a-z]the[^a-z]").length;
    I realize the regular expression may not be correct at the moment, but I tried without it and just tried to find occurrences of the word "the", and I get wrong numbers too. I was under the impression this would split the string up into an array, and that the number of pieces in that array was how many times the word is in the string. Any ideas? I would be grateful.
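
    Two things make the split-based count in this question unreliable: a split yields pieces rather than matches (and Java's String.split also drops trailing empty pieces), and a pattern that requires a non-letter on each side misses a word at the very start or end of a line. A hedged sketch of the alternative, counting matches of a word-boundary pattern directly, is shown here in Python for illustration; `count_word` is a hypothetical name and the case-insensitive default is an assumption. Java's java.util.regex supports the same \b boundary.

```python
import re


def count_word(line, word, ignore_case=True):
    """Count whole-word occurrences of `word` in `line`."""
    flags = re.IGNORECASE if ignore_case else 0
    # \b matches at a word boundary, so "other" and "then" do not count as "the".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags)
    return len(pattern.findall(line))


if __name__ == "__main__":
    total = 0
    for line in ["The other day, the cat", "then saw the dog"]:
        total += count_word(line, "the")
    print(total)
```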

  • Best way to globally set every control's ValidationGroup property in ASP.NET?

    - by Steve Flook
    I have a User Control with form items that is re-used considerably throughout a large web application, and up until this point the validation summary upon an invalid form submission has been handled by the .aspx that consumes the User Control. Now I need to set the ValidationGroup property at runtime for each of my form item controls (textboxes, lists, validators, etc.). Rather than do it manually by setting each control, I'm interested in iterating through all the controls in the User Control, detecting whether each control has a ValidationGroup property, and setting its value that way. Something like this:
        For Each ctrl As System.Web.UI.Control In Me.Controls
            ' so now what is the proper way to detect if this control has the ValidationGroup property
        Next
    A code sample in VB.NET or C# works for me. Many thanks!

  • IE downloads and installs CAB dialog popup upon every page refresh

    I have a signed CAB on an aspx page and I am seeing the following inconsistent behavior. Any insights would be highly appreciated. On some machines, the CAB is downloaded and installed on every page refresh. On a few of those machines, the IE "install cab" dialog pops up on every page refresh, while on the others it pops up only once.
    Additional info:
    1) The CAB contains a .NET DLL.
    2) The CAB is rather large (around 30 MB), hence the recurring download behavior is a pain.
    3) Target browsers are IE6 and IE7, and the behavior is common to both!

  • Windows Azure local development environment speed

    - by Paperjam
    I've started porting an existing ASP.NET web app to Windows Azure and have noticed that the development process is really slow. Each time I make a change to my code and want to view it, I have to effectively redeploy it to the local dev cloud (using Start debugging (F5) or Start without debugging (Ctrl-F5)). The process itself takes over a minute, during which time Visual Studio is completely unresponsive. Am I doing something wrong or is that simply how things are when developing for Azure?
    My specs:
    1) Visual Studio 2008 9.0.30729.1 SP
    2) 5 projects running on .NET 3.5 SP1
    3) Azure SDK 1.1 (February 2010)
    4) Single instance of a single web role
    5) Dual-core AMD 64 machine with 8GB RAM, 64-bit Windows 7, fully patched
    6) The main project itself is quite large (3k files, ~200k lines) but compiles normally in 10-15 seconds

  • IE Print CSS and spanning page breaks

    - by DA
    I've been working on trying to fix an issue with print CSS and IE where things would disappear when printing in landscape mode. It appears the issue is that the element I'm trying to print (a large DIV with content inside it) spans two pages when put into landscape mode. What is happening is when the element spans two pages, the first page is blank, and the second page is printing what would normally be left over from the first page. I think it's related to contained floats:
        wrapper div
            floated div1
            floated div2
    If I set the two nested divs to float: none in the print CSS file, then IE will print them, albeit not in the layout we'd like. Before I spend another hour on this, anyone know what, specifically, is the issue here and if there's a known workaround?

  • On Solaris, what is the difference between cut and gcut?

    - by Chris J
    I recently came across this crazy script bug on one of my Solaris machines. I found that cut on Solaris skips lines from the files that it processes (or at least very large ones - 800 MB in my case).
        > cut -f 1 test.tsv | wc -l
        457030
        > gcut -f 1 test.tsv | wc -l
        840571
        > cut -f 1 test.tsv > temp_cut_1.txt
        > gcut -f 1 test.tsv > temp_gcut_1.txt
        > diff temp_cut_1.txt temp_gcut_1.txt | grep '[<]' | wc -l
        0
    My question is what the hell is going on with Solaris cut? My solution is updating my scripts to use gcut but... what the hell?

  • How can I track down these Firefox warning messages?

    - by Charles Anderson
    Since I upgraded to jQuery 1.4.4 I've been getting several new warning messages when I run my unit tests in Firefox 3.6.13. Here's a typical one:
        Warning: Unexpected token in attribute selector: '!'.
        Source File: http://localhost/unitTests/devunitTests.html
        Line: 0
    Or the even more useful:
        Warning: Selector expected.
        Source File: http://localhost/unitTests/ui/editors/iframe2.html?test=15
        Line: 0
    The web page renders nicely, and all my JavaScript code seems to be running okay too, so I'm reluctant to spend a potentially large amount of time chopping away at my code to track these messages down. However, can anyone suggest what's provoking the warnings?

  • How do I send data from Java to Flash locally?

    - by terence
    I have a website with a Java applet and a Flash application. I want the Java applet to send data to the Flash application locally. What's the best way to do this? The data I want to send are potentially large images (possibly up to 1MB in size). This means sending a base64 string to javascript and then to Flash would probably be too cumbersome. I don't want to have to contact external servers or anything; all of it should be possible locally and offline. Is there some easy way to just send this sort of data around? If I saved the file locally first, Flash wouldn't be able to access that, would it?

  • Drive-space-hungry NoSQL databases

    - by forum_inquisitor
    I've tested NoSQL databases like CouchDB, MongoDB and Cassandra and observed a tendency to consume a very large amount of drive space relative to the inserted key-value pairs. When comparing CouchDB with a schemaless MySQL database, CouchDB consumes much more drive space than MySQL. I know that key-value DBs version data by default, have long UUIDs and need key optimisation; the comparison was between about 15 million rows in MySQL and 1-5 million documents in the listed NoSQL DBs. My question is: is there any NoSQL database with good compaction / compression of data, so that I can have a NoSQL database with a size closer to 5GB than 50GB?

  • Performance problems when loading local JSON via <script> elements in IE8

    - by Jens Bannmann
    I have a web page with some JS scripts that needs to work locally, e.g. from hard disk or a CD-ROM. The scripts load JSON data from files by inserting <script> tags. This worked fine in IE6, but now in IE8 it takes an enormous amount of time: it went from "instantly" to 3-10 seconds. The main data file is 45KB large. How can I solve this? I would switch from <script> tags to another method of loading JSON (ideally involving the new native JSON parser), but it seems locally loaded content cannot access the XMLHttpRequest object. Any ideas?

  • Preg_match class name from PHP file

    - by talentedmrjones
    I have a script that recursively scans a directory, pulling class names out of PHP files and storing those class names in an array. This works nicely even through the rather large Zend Framework library folders. The issue is that classes that extend other classes are not being included in the array. Here is my current preg_match:
        if (preg_match("/class\s*(\w*)\s*\{/i",strip_comments(file_get_contents($file)),$matches)) $classes[] = $matches[1];
    I know that the last \s* is not right; there should be something there that can catch "{" or " extends Some_Other_Class {".
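
    One possible shape for the missing part of the pattern is an optional "extends ..." group (and, as an extra assumption, an optional "implements ..." group) between the class name and the opening brace. The sketch below tries this in Python for illustration; `CLASS_PATTERN` and `find_class_names` are hypothetical names, and the pattern only uses constructs that PCRE also supports, so it should carry over to preg_match_all with little change. It makes no attempt to strip comments or strings.

```python
import re

# Optional "extends Parent" and "implements A, B" parts before the brace.
CLASS_PATTERN = re.compile(
    r"\bclass\s+(\w+)"                      # class name (captured)
    r"(?:\s+extends\s+[\w\\]+)?"            # optional parent class
    r"(?:\s+implements\s+[\w\\\s,]+?)?"     # optional interface list
    r"\s*\{",
    re.IGNORECASE,
)


def find_class_names(source):
    """Return every declared class name found in a PHP source string."""
    return CLASS_PATTERN.findall(source)


if __name__ == "__main__":
    php_source = """
    class Foo {
    }
    class Bar extends Some_Other_Class {
    }
    final class Baz extends Base implements A, B
    {
    }
    """
    print(find_class_names(php_source))
```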

  • .NET framework 4 total application deployment size

    - by kzen
    After watching in horror as the .NET Framework 3.5 SP1 bloated to a whopping 231 MB, I was amazed to see that .NET Framework 4 Full (x86) is only 35 MB and the Client Profile just 29 MB. My question is whether .NET Framework 4 is in any way dependent on previous versions of the framework being installed on the client machine, or whether my users will have to download only 29 (or 35) MB if I develop a WinForms or WPF desktop application in VS 2010 targeting .NET Framework version 4.0. Edit: Wikipedia concurs with the answers: Some developers have expressed concerns about the large size of .NET framework runtime installers for end-users. The size is around 54 MB for .NET 3.0, 197 MB for .NET 3.5, and 250 MB for .NET 3.5 SP1 (while using web installer the typical download for Windows XP is around 50 MB, for Windows Vista - 20 MB). The size issue is partially solved with .NET 4 installer (x86 + x64) being 54 MB and not embedding full runtime installation packages for previous versions.

  • Make errors - can the gcc compiler warnings prevent a C file from being compiled into an object file

    - by Xolstice
    I'm trying to compile a wireless network card driver for my Linux box and I ran into a problem with the make command. During the compilation process I normally see warnings on some of the C files that are being compiled; despite the warnings, these files were still compiled to object files. When the make process comes to a file called rtmp_wext.c, however, the compiler generates a large number of warnings and then the whole make process stops and returns an exit status of error 1, i.e. make: *** [rtmp_wext.o] Error 1. Usually compilation only halts when I see an error in a C file. This is the first time it seems that compiler warnings are preventing a file from being turned into an object file; is this possible, or is something else the cause of the unsuccessful compilation?
